TIW: STL #2

2016-10-20 7 minute read

This should be a relatively easy post. Things covered: auto, iterator, sort, next permutation, reverse and swap. If you know all of these you can skip.

Auto

auto is a keyword that saves you some typing. There is no new concept or algorithm behind it: the compiler deduces the variable's type from its initializer. You get some of the convenience of Python, where you never write types, while the variable stays statically typed.

auto x = 5;
auto p = make_pair(2, 4);
auto s = vector<int>(4);

When would you ever want to be lazy, as a hardworking programmer, you ask? Well it helps shorten the code sometimes, especially in the case of iterators.

Iterators

Since implementation is hidden in a black box for STL data structures, we have less control over how to mess with the data directly. That’s why we have iterators to let us peep into the black box. Iterators for each data structure are different from others, but they mostly do one thing: iterating through data points you put in the data structure. You always initialize an iterator from an instance of a data structure, for example the begin() function of sets and maps, which we have seen in the previous post.

Iterators act as pointers to elements, so you need to dereference to get the value. For sets and maps, they could be quite useful:

vector<int> v{1, 5, 8, 2, 9};
set<int> s;
map<int, int> m;
for (int i = 0; i < v.size(); i++) {
    s.insert(v[i]);
    m[v[i]] = i;
}
for (auto it = s.begin(); it != s.end(); it++)
    cout << *it << endl;
for (auto it = m.begin(); it != m.end(); it++)
    cout << it->first << " " << it->second << endl;

Had I not used auto, I would have had to write set<int>::iterator and map<int, int>::iterator. That makes the code look uglier, but it's a personal preference.
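To make the comparison concrete, here is a minimal sketch of the same loop written both ways. The helper names distinctSorted and distinctSortedAuto are made up for this illustration and are not part of the STL.

```cpp
#include <set>
#include <vector>
using namespace std;

// Hypothetical helper: distinct values of v in ascending order,
// spelling out the iterator type by hand.
vector<int> distinctSorted(const vector<int>& v) {
    set<int> s(v.begin(), v.end());
    vector<int> out;
    for (set<int>::iterator it = s.begin(); it != s.end(); ++it)
        out.push_back(*it);  // dereference to get the element
    return out;
}

// Same loop, letting the compiler deduce the iterator type.
vector<int> distinctSortedAuto(const vector<int>& v) {
    set<int> s(v.begin(), v.end());
    vector<int> out;
    for (auto it = s.begin(); it != s.end(); ++it)
        out.push_back(*it);
    return out;
}
```

Both functions compile to the same thing; auto only changes what you have to type.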

To explain a little bit, .begin() gives an iterator to the first item in the data structure. In the case of set and map, begin() points to the smallest element.

Then, .end() returns a special iterator marking the end of the data structure. It is not the last element; it is the iterator after the last element. Trying to dereference it will crash your program.

To print things in reverse order, just use rbegin() and rend() in place of begin() and end(), no pain at all. The ++ operator gives you the next iterator, and -- gives you the previous one. We will see subtleties later on with these two operators.
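A small sketch of both ideas, using a set. The function names descending and largest are hypothetical, and largest assumes the set is not empty.

```cpp
#include <set>
#include <vector>
using namespace std;

// Walks the set backwards: rbegin() starts at the largest element,
// and ++ on a reverse iterator moves towards the smallest.
vector<int> descending(const set<int>& s) {
    vector<int> out;
    for (auto it = s.rbegin(); it != s.rend(); ++it)
        out.push_back(*it);
    return out;
}

// end() itself must not be dereferenced, but stepping back once
// with -- lands on the last (largest) element.
// Assumption: s is not empty.
int largest(const set<int>& s) {
    auto it = s.end();
    --it;
    return *it;
}
```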

Another shorthand for the loop in order above:

for (auto it : m)
    cout << it.first << " " << it.second << endl;

Note that the “it” here is not actually an iterator, but a copy of the dereferenced key-value pair. Literally a pair<const int, int>.
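Because the loop variable is a copy, changing it does not change the map. If you do want to modify the values, take the element by reference with auto&. A minimal sketch, with made-up function names:

```cpp
#include <map>
using namespace std;

// kv is a copy of each pair, so the doubling is thrown away.
map<int, int> doubledByCopy(map<int, int> m) {
    for (auto kv : m)
        kv.second *= 2;
    return m;
}

// kv refers to the pair stored in the map, so the doubling sticks.
map<int, int> doubledByReference(map<int, int> m) {
    for (auto& kv : m)
        kv.second *= 2;
    return m;
}
```

The key (kv.first) is const either way; only the value can be changed in place.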

Sort

To sort a vector, just pass in the begin() and end() iterators. Obviously it modifies the vector you pass into it.

vector<int> v{1, 4, 5, 2, 6, 3};
sort(v.begin(), v.end()); // {1, 2, 3, 4, 5, 6}
sort(v.rbegin(), v.rend()); // {6, 5, 4, 3, 2, 1}

If you are a hardcore C fan, you can even pass in pointers into a C array, since iterators are like pointers anyway:

int v[] = {1, 4, 5, 2, 6, 3};
sort(v, v+6);

Strings are like vector, so you can sort a string too. For example, for the anagram problem last time, a really short piece of code could be:

sort(s.begin(), s.end());
sort(t.begin(), t.end());
return s == t;

This is going to be O(n log n), n being the total length of the two strings. So strictly speaking this is not the best solution ever (counting characters does it in O(n)), but hey it’s only three lines.
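Wrapped into a complete function, the three lines could look like the sketch below. The name isAnagram is an assumption; the strings are taken by value so the caller's copies are not reordered.

```cpp
#include <algorithm>
#include <string>
using namespace std;

// Two strings are anagrams exactly when their sorted forms are equal.
// s and t are passed by value, so sorting only touches local copies.
bool isAnagram(string s, string t) {
    sort(s.begin(), s.end());
    sort(t.begin(), t.end());
    return s == t;
}
```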

Next permutation

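This is a really handy way to enumerate all n! ways to permute a vector of size n. Let’s say for some reason you need to generate all 120 permutations of {1, 2, 3, 4, 5}; this is all you need to do (start from the sorted vector, because next_permutation returns false once it wraps around to the smallest permutation):

vector<int> v{1, 2, 3, 4, 5};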
do {
    for (int i = 0; i < v.size(); i++)
        cout << v[i] << " ";
    cout << endl;
} while (next_permutation(v.begin(), v.end()));

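It is good to know the algorithm behind this function. Scan from the back towards the front until you find an element that is smaller than the one right after it. Swap that element with the smallest element to its right that is still larger than it, then put everything after that position in ascending order (the tail is in descending order at that point, so reversing it is enough). If no such element exists, the vector is already the last permutation, and the function resets it to the first one and returns false.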
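Here is a hand-written sketch of that procedure for a vector<int>. It is for illustration only and is not the library's actual source; nextPerm and successor are made-up names. In real code, just call next_permutation.

```cpp
#include <algorithm>
#include <vector>
using namespace std;

// Illustrative re-implementation of the idea behind next_permutation.
bool nextPerm(vector<int>& v) {
    if (v.size() < 2) return false;
    // 1. From the back, find the first i with v[i] < v[i + 1].
    int i = (int)v.size() - 2;
    while (i >= 0 && v[i] >= v[i + 1]) --i;
    if (i < 0) {
        // Entirely descending: this was the last permutation.
        // Wrap around to the smallest one and report false.
        reverse(v.begin(), v.end());
        return false;
    }
    // 2. The tail after i is descending, so the first element from the
    //    back that is larger than v[i] is the smallest such element.
    int j = (int)v.size() - 1;
    while (v[j] <= v[i]) --j;
    swap(v[i], v[j]);
    // 3. The tail is still descending; reversing it sorts it ascending.
    reverse(v.begin() + i + 1, v.end());
    return true;
}

// Convenience wrapper: returns the permutation that follows v.
vector<int> successor(vector<int> v) {
    nextPerm(v);
    return v;
}
```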

One cool aspect about this function is that you can hack it so it gives you combinations instead of permutations. Let’s say you have 5 guys, and you want to choose 2 of them. Then you just feed the function with a special vector:

vector<int> v{0, 0, 0, 1, 1};
do {
    for (int i = 0; i < v.size(); i++)
        if (v[i] == 1)
            cout << i << " is chosen" << endl;
    cout << endl;
} while (next_permutation(v.begin(), v.end()));

This function is useful in interviews, but rarely applicable in real life. But when you actually need to enumerate, things will be so much easier with a bug-free built-in function.
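If you need the combinations as data rather than printed, the same mask trick can be wrapped in a function. This is a sketch; the name combinations is made up, and it assumes 0 <= k <= n.

```cpp
#include <algorithm>
#include <vector>
using namespace std;

// Returns every way to choose k indices out of 0..n-1.
// Assumption: 0 <= k <= n.
vector<vector<int>> combinations(int n, int k) {
    // n - k zeros followed by k ones: the smallest permutation of the mask.
    vector<int> mask(n, 0);
    fill(mask.end() - k, mask.end(), 1);

    vector<vector<int>> out;
    do {
        vector<int> chosen;
        for (int i = 0; i < n; i++)
            if (mask[i] == 1)
                chosen.push_back(i);  // index i is part of this combination
        out.push_back(chosen);
    } while (next_permutation(mask.begin(), mask.end()));
    return out;
}
```

Since next_permutation skips duplicate arrangements of equal elements, each combination shows up exactly once.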

Reverse

To reverse a vector, there’s a function aptly called reverse. Like sort, it modifies the original vector. If you don’t want to modify it, you can build a reversed copy by passing rbegin() and rend() to the range constructor.

vector<int> v{1, 2, 3, 4, 5};
string s = "abcde";
reverse(v.begin(), v.end()); // {5, 4, 3, 2, 1}
reverse(s.begin(), s.end()); // "edcba"
vector<int> u(v.rbegin(), v.rend()); // {1, 2, 3, 4, 5}
string t(s.rbegin(), s.rend()); // "abcde"

Please don’t write a for loop to do what you can do with built-in functions. You will be surprised how many fewer bugs you need to clean with shorter code and fewer loops.

Another trick that is occasionally useful: let’s say you want to reverse the 3rd to 6th character (start counting from 0th) of string s. Then you can:

string s = "12345678";
reverse(s.begin()+3, s.begin()+7);
cout << s << endl; // 12376548

Note that the second parameter always points one past the last element you want to reverse.
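If the one-past-the-end convention keeps tripping you up, a tiny wrapper that takes inclusive indices can help. A sketch with a made-up name, assuming first <= last < s.size():

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
using namespace std;

// Reverses the characters at positions first..last, both inclusive.
// Assumption: first <= last < s.size().
string reverseInclusive(string s, size_t first, size_t last) {
    // reverse() expects a half-open range, hence the + 1 on the end.
    reverse(s.begin() + first, s.begin() + last + 1);
    return s;
}
```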

Swap

There are two swap functions. The first is swap(x, y), which swaps two variables of the same type; the other is a member function of the containers, which is a way to rename your black boxes.

To swap two variables:

int x = 3, y = 6;
swap(x, y);
string s = "12345";
swap(s[2], s[4]);

Another is to swap the content of STL data structures.

s0.swap(s1);

But you can just write swap(s0, s1) instead: for the standard containers the free function is overloaded to call the member swap, so there is no performance difference between the two.

Move

Sometimes when you pass an object around, there are unnecessary deep copies that make your code less efficient. For example, in the following code, without move() there would be two copies of “test”, one stored in s and one stored in the vector. The point is, we don’t care what s becomes afterwards, so the vector may as well take over its contents. move() does exactly this: it lets the destination steal the original's internal buffer instead of copying it. The original is not destroyed; it is left in a valid but unspecified state (typically empty), so only assign to it or let it go out of scope afterwards. That way we only have one instance of the data, and we avoid the cost of the deep copy.

// Moving a string into the vector instead of copying it
vector<string> v;
string s = "test";
v.push_back(move(s));
// Setting v1 to v2 in O(1) time
vector<int> v1, v2{1, 2, 3};
v1 = move(v2);
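A typical place for this pattern is a loop that builds a string and then hands it to a container. The sketch below uses a made-up helper, makeLines, to show it.

```cpp
#include <string>
#include <utility>
#include <vector>
using namespace std;

// Hypothetical helper: builds "line 0", "line 1", ... and moves each
// string into the vector instead of copying it.
vector<string> makeLines(int n) {
    vector<string> lines;
    for (int i = 0; i < n; i++) {
        string s = "line " + to_string(i);
        lines.push_back(move(s));
        // s is still a valid string here, but its content is unspecified,
        // so it is not read again before it goes out of scope.
    }
    return lines;
}
```

For very short strings the saving is likely negligible, since many implementations store them inline; the benefit grows with the size of the object being moved.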

Ok, that should be enough for us to grab a problem. Here’s a medium level one:

Binary Tree Right Side View

TLDR: given a binary tree, output the rightmost element in each layer in a vector. The algorithm is to traverse the tree layer by layer, each time pushing in the value of the rightmost node:

vector<int> rightSideView(TreeNode* root) {
    vector<int> ans;
    if (!root)
        return ans;
    vector<TreeNode*> layer{root};
    while (!layer.empty()) {
        ans.push_back(layer.back()->val);
        vector<TreeNode*> next;
        for (auto it : layer)
            for (TreeNode* c : {it->left, it->right})
                if (c)
                    next.push_back(c);
        layer = move(next);
    }
    return ans;
}

Notice the use of for (auto x : y) in this context: x would be the actual element, so I did not dereference it before I used it as a TreeNode*. Also note the move at the end of the loop.
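To try the function outside an online judge, you need a node type. The sketch below assumes the usual LeetCode-style TreeNode (the post does not show its definition), repeats the function so the block compiles on its own, and adds a made-up demo function that builds a four-node tree on the stack.

```cpp
#include <utility>
#include <vector>
using namespace std;

// Assumed node layout; not shown in the post.
struct TreeNode {
    int val;
    TreeNode* left;
    TreeNode* right;
    TreeNode(int x) : val(x), left(nullptr), right(nullptr) {}
};

vector<int> rightSideView(TreeNode* root) {
    vector<int> ans;
    if (!root)
        return ans;
    vector<TreeNode*> layer{root};
    while (!layer.empty()) {
        ans.push_back(layer.back()->val);  // rightmost node of this layer
        vector<TreeNode*> next;
        for (auto it : layer)
            for (TreeNode* c : {it->left, it->right})
                if (c)
                    next.push_back(c);
        layer = move(next);  // reuse next's buffer instead of copying
    }
    return ans;
}

// Hypothetical demo: root 1 has children 2 and 3, and 2 has a right
// child 5. The nodes live on the stack, so nothing needs to be freed.
vector<int> demoRightSideView() {
    TreeNode n1(1), n2(2), n3(3), n5(5);
    n1.left = &n2;
    n1.right = &n3;
    n2.right = &n5;
    return rightSideView(&n1);
}
```

In the demo, 5 sits in the left subtree but is still the rightmost node of the bottom layer, which is why a layer-by-layer traversal is used rather than just following right pointers.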

Last small remark: in C++, the null pointer (NULL, or nullptr since C++11) is the only pointer value that converts to false, and 0 is the only integer value that converts to false. So you can write if (ptr) or if (!ptr) to test for null, and if (num) or if (!num) to test for zero.

Next, we’ll talk about upper_bound and lower_bound.