Modern C++ Features

Type inference in C++11

Sometimes your compiler can infer the type of variable you are declaring. Hence, a new feature introduced as a syntactical sugar to the C++ code, i.e the auto keyword. We can use the auto keyword in our variables declarations when their types can be inferred by the compiler in some other way.

For example, if you have function int add( int x , int y) that obviously return int type. When declaring a variable that holds the return value from the add function, we can use auto keyword, no need to explicitly write int.

int add(int x, int y)
{
    return x + y;
}
int main()
{
    auto sum = add(5, 6); // add() returns an int, so sum will be type int
    return 0;
}

Reference: The auto keyword

Range-based for loops

When using the standard template library (STL) containers (i.e data structures), we can iterate over the elements of the standard data structures using a simpler form of for loops.

The following for-constructs are equivalent:

std::vector<int> vec;

vec.push_back( 1 );
vec.push_back( 2 );

for( int i = 0 ; i < vec.size() ; ++i )
{
    vec[i]++;
}

for ( int& element : vec )
{
    element++; // increments the value in the vector
}

The following for-constructs are equivalent:

std::vector<int> vec;

vec.push_back( 1 );
vec.push_back( 2 );

for( int i = 0 ; i < vec.size() ; ++i )
{
    std::cout << vec[i] << std::endl; 
}

for (int element : vec )
{
    std::cout << element << std::endl;
}

References

Revisiting Associative Containers: Hash Tables

We learned earlier how to implement an associative container (i.e dictionary) using a BST. In this tutorial, you are introduced a new data structure that can serve as an associative container, i.e hash table. As associative container are associating keys with values, hash table are also working with the same concept.

In brief, hash table is an array with some additional function. Each element in the array consists of a key and value.

struct HashElement
{
    char key;
    int value;
};

How do we store elements?

hash
A small phone book as a hash table, Creative Commons: Jorge Stolfi, Hash Table | Wikipedia

Hash function is simply transforming element key value into an index to store that element in. A simple hash function would typically utilize the following function:

\[\text{ index } = \text{ key } \% \text{ ARRAY_SIZE }\]
Example: Hashing DNA characters

Consider the following table of DNA four bases along with their ascii code values:

Base Ascii Hash0: $ \text{ key } \% \text{ 3 } $ Hash1: $ \text{ key } \% \text{ 5 } $ Hash2: $ \text{ key } \% \text{ 13 }$
A 65 2 0 0
C 67 1 2 2
G 71 2 1 6
T 84 0 4 6

As you may have guessed, we will have a critical issue that mapping keys are not necessarily results in unique index for each key. For example when we used Hash0, it hashed A(65) and G(71) with same value. We name this problem in hashing world by collision. So, how do we store A and G then?

Add collision to your glossary.

Method 1: Chaining

hash
Hash collision resolved by separate chaining, Creative Commons: Jorge Stolfi, Hash Table | Wikipedia
  1. We hash the key to an index.
  2. All keys mapped to the same index are stored in the same list.
Element Structure
struct HashElement
{
    char key;
    int value;
};
Table Structure
struct HashChainingTable
{
    std::array< std::list< HashElement > , 100 > bucket;
};
Hash Function
int hash(HashChainingTable &table, char key)
{
    return key % table.bucket.size();
}
Creating Hash Table
HashChainingTable create()
{
    HashChainingTable newTable;
    return newTable;
}
Empty Function
bool isEmpty(HashChainingTable &table)
{
    for (int i = 0; i < table.bucket.size(); ++i)
        if (!table.bucket[i].empty())
            return false;
    return true;
}
Size Function
int size(HashChainingTable &table)
{
    int mSize = 0;
    for (int i = 0; i < table.bucket.size(); ++i)
        mSize += table.bucket[i].size();
    return mSize;
}
Find Function
bool find(HashChainingTable &table, char key)
{
    int index = hash(table, key);
    std::list<HashElement> &slot = table.bucket[index];
    for (HashElement &element : slot)
        if (element.key == key)
            return true;
    return false;
}
At Function
int &at(HashChainingTable &table, char key)
{
    int index = hash(table, key);
    std::list<HashElement> &slot = table.bucket[index];
    for (HashElement &element : slot)
        if (element.key == key)
            return element.value;

    std::cout << "Key not found!" << std::endl;
    exit(1);
}
Insert Function
void insert(HashChainingTable &table, char key, int value)
{
    if (!find(table, key))
    {
        HashElement newElement{key, value};
        int index = hash(table, key);
        std::list<HashElement> &slot = table.bucket[index];
        slot.push_back(newElement);
    }
}
Remove Function
void remove(HashChainingTable &table, char key)
{
    int index = hash(table, key);
    std::list<HashElement> &slot = table.bucket[index];
    for (auto it = slot.begin(); it != slot.end(); ++it)
        if (it->key == key)
            slot.erase(it);
}
Value Function
int &value(HashChainingTable &table, char key)
{
    if (!find(table, key))
    {
        insert(table, key , 0 );
    }
    return at(table, key);
}
Clear Function
void clear(HashChainingTable &table)
{
    for (auto &slot : table.bucket )
        slot.clear();
}
void printAll(HashChainingTable &table)
{
    for (auto &slot : table.bucket )
        for (auto &element : slot)
            std::cout << element.key << ":" << element.value << std::endl;
}

Method 2: Linear Probing

hash
Hash collision resolved by open addressing with linear probing, Creative Commons: Jorge Stolfi, Hash Table | Wikipedia
  1. We hash the key to an index.
  2. We check if that index is occupied or not by other element:
    1. if occupied, increment the index until you find available slot.
  3. Store.

Map (ADT) on Hash Table

#include "hash_chaining.hpp"
#include "helpers.hpp"

int main( int argc, char **argv )
{
    if( argc == 2 )
    {
        std::string dna = getFileFirstLine( argv[1] );

        auto cMap = hash::create();

        // Complete here!
        for( auto b : dna )
            hash::value( cMap , b )++;

        // Done here!
        hash::printAll( cMap );
    }

    return 0;
}

Clone the code:

git clone https://github.com/sbme-tutorials/sbe201-hashes.git

Final Project Criteria

Deliverables

First deliverable due 15 May

  • Complete implementation of the project (4 pts).
  • Private repository showing the contribution of each member (3 pt).
  • CMake script to manage the project build (0.5 pt).

Second deliverable 5 days post finals

  • Presentation: your classmates need to understand all the concepts (2 pt).
  • Documentation: Markdown + LaTex (2 pt).

Awards

The top four teams to be awarded a T-shirt :tshirt: to the most active member.

Projects with outstanding output would qualify their team members to join the department researchers in some research projects during the next year.

Programming Languages and Research Interests

As an engineer, you should be highly skilled in few programming languages and must also develop a research interest. You are highly recommended to develop research interests in few domains. Research interests makes your cover letter (i.e motivation letter, i.e statement of purpose) very attractive.

Discussion: Engineers vs. Programmers

Domestic Internships

Domestic internships are typically found and admitted through several channels.

  1. Network
  2. Job fairs and direct announcements.
  3. Wuzzuf search engine; use the appropriate keywords, for example: “internship web”, “internship machine learning”, “internship computer vision”, etc.

International Internships

Common Requirements

  • Recommendation Letters: all your teachers owe you a recommendation letter. Recommenders from diverse schools or places makes potential recommendation.
  • Motivation Letter: to reflect your strong reasons to apply for the internship. You should also convey your engineering skills and your research interests. General guidelines:
    • Paragraph 1: Introduce yourself and talk about some interesting milestones if you have.
    • Paragraph 2-3: Describe the general areas of research that interest you and why. Prove some interests by an overview of online courses, projects, or other internships in the field. Would be nice to add little philosophical views.
    • Paragraph 4: Tell us a little bit about yourself and your life experiences. Why do you feel you need that position?
    • The above elements are essential components in your letter. Try to avoid any elements to prevail on others, as much salt or no salt would make a bad recipe.
  • Grades: GPA (generally > 3.2)
  • Rank (Sometimes): top 5%
  • English Scores (Sometimes)

List of International Summer Schools

Domains: Biology, Biomedical, Medicine.

Some links may be broken or outdates. If so, search with the program name in Google engine.

Research Interests

A suggested list for research interests that you may dedicate this summer to learn about.

  • Artificial Intelligence and Machine Learning
  • Data Science and Big Data
  • High Performance Computing and Distributed Systems
  • Bioinformatics
  • Computational Neuroscience
  • Algorithms Design and Theoretical Computer Science
  • Security, Cryptography, and Blockchain
  • Computer Architecture, Integrated Circuits Manufacturing, MEMS, NEMS