Euler Path: 8 Lines Solution

2017-01-05 7 minute read

The problem of Euler path marked a very fundamental moment in algorithm studies. When Euler posed the 7-bridge problem, there was no mathematical tool to solve it, hence he created the tool - graph theory. It sounds like how Newton invented calculus to solve his gravity problems and how Bernoulli invented calculus of variations to solve his brachistochrone problem. As I quote Sutherland,

“Well, I didn’t know it was hard.”

Problem statement

Say you have arrived in a new country with a bunch of islands and one way bridges between some pairs of islands. As a tourist you would like to visit all the bridges once and only once. Abstractly, an Euler path is a path that traverses all the edges once and only once. Is it always possible? If possible, how do we find such a path?

Existence of path

Unlike many other problems, we can determine the existence of the path without actually finding the path. There are 2 conditions for it. The first has to do with in and out degrees, and the second is about the connectivity of the graph.

For a directed graph, the in degree is the number of incoming edges to a vertex, and the out degree is the number of outgoing edges to a vertex. Easily, if we have an Euler path, then there will be a start and end vertex. The start vertex will have an out degree - in degree of 1, the end vertex will have in degree - out degree of 1. Every other vertex will have an equal number of out degree and in degree, because if you have a different number, you obviously cannot go through all of those edges in one path. The only exception is when the start and end vertices are the same, in which case all the vertices have the same number of in and out degrees.

The second condition is that all vertices have to be weakly connected, meaning that by treating all edges as undirected edges, there exists a path between every pair of vertices. The only exception is for the vertices that have no edges - those can just be removed from the graph, and the Euler path trivially does not include them.

Given the above two properties, you can prove there is an Euler path by the following steps. First it is easy to see that if you start walking from the start vertex (out - in = 1) and removing edges as you walk through them, you can only end up at the end vertex (in - out = 1). Then, the remaining graph is full of cycles that can be visited through some vertex in the path we already have, and you just have to merge the cycles and the path to get the Euler path.

It is easier to see the other direction of the proof: if there is an Euler path, both conditions have to be met.

Implementing existence of path

As mentioned above, we code up the two conditions and check whether we can find a valid starting position, otherwise return -1. Vertices are numbered 0 to n-1. The graph is stored as an adjacency list, meaning that adj[i] has all the neighbors (edges go from i to those). So the out degree of i is adj[i].size().

int start(vector<vector<int> >& adj) {
    // condition 1: in and out degrees
    vector<int> deg(adj.size());
    for (int i = 0; i < adj.size(); i++) {
        deg[i] += adj[i].size();
        for (int x : adj[i])
            deg[x]--;
    }
    int ans = -1;
    for (int i = 0; i < deg.size(); i++)
        if ((ans != -1 && deg[i] != 0 && deg[i] != -1)
            deg[i] > 1) return -1;
        else if (deg[i] == 1) ans = i;
    if (ans == -1) // start and end vertices are the same
        for (int i = 0; i < adj.size() && ans == -1; i++)
            if (!adj[i].empty())
                ans = i;
    if (ans == -1) // there is no edge at all
        return -1;
    // condition 2: connectivity
    vector<bool> vis(adj.size());
    vector<int> bfs{ans};
    vis[ans] = true;
    for (int i = 0; i < bfs.size(); i++)
        for (int x : adj[bfs[i]])
            if (!vis[x]) {
                bfs.push_back(x);
                vis[x] = true;
            }
    for (int i = 0; i < adj.size(); i++)
        if (!vis[i] && !adj[i].empty())
            return -1;
    return ans;
}

That is slightly more clumsy that I would have liked, but it should be clear. Basically just counting in and out degrees and running a BFS on the starting vertex.

Actually finding the path

As mentioned above in the sketch of proof, finding a path consists of the 3 steps:

Walk from the start vertex, removing edges as we use them, until there is nowhere to go. Then we have a path from start to end.
For the remaining edges, start at a vertex on the path and randomly walk until we go back to the same vertex. Then we have a cycle. Merge the cycle on the path. For example, if we had a path s->a->b->c->…->t and a cycle c->alpha->beta->c, we merge them to become s->a->b->c->alpha->beta->c->…->t.
Repeat step 2 until there is no remaining edge.

Well, that does not sound very easy to write nor very efficient to run, if we literally implemented the above. How many lines of code would that be?

The 8 lines solution

The answer is 8. Here’s the code:

void euler(vector<vector<int> >& adj, vector<int>& ans, int pos) {
    while (!adj[pos].empty()) {
        int next = adj[pos].back();
        adj[pos].pop_back();
        euler(adj, ans, next);
    }
    ans.push_back(pos);
}

This is a very beautiful solution using recursion, in my opinion. I was quite surprised when I first saw this. Where is everything? Where is getting the main path? Where is getting the cycles? Where are we merging them? The gist of this recursion is a post-order DFS, meaning that we visit the end of the graph first, and then backtrack. This comes from a very crucial observation: we can always be sure what could be at the end of the path, but not the front. It is important to know that the answer array is in reverse order of visit, i.e. we need to reverse it to get the Euler path. Let’s go through the stages of how the program works.

Path from start to end vertex. The first time we push back is when we run out of outgoing edges, which can only be the case of the end vertex, with in-out = 1. In all other vertices, the numbers are the same so if you can go in, you can definitely go out of that vertex. Hence the first push back occurs with the end vertex, and at that point the program execution stack has the entire path.
As we return from a recursive call of the function, we are essentially going back from the end to the start. If a vertex on the main path does not have any outgoing edge, we know we will visit it next, so we push it to the ans vector and return from the function.
But if a vertex on the path does have an outgoing edge, that means there is at least one cycle including this vertex. Then in the next iteration of the while loop, we will visit one of the outgoing edges and start another round of recursion. Again, this recursion must end on the same vertex because it is the only one with in-out = 1. This recursion gets you a cycle and by the time it returns, the cycle would have been pushed to the ans array already, finishing the merge operation.

By studying this code, there is one interesting point to note. That is, the while loop will only be executed 0 to 2 times in any recursive call. It will only be 0 at the end vertex, and on the main path with no branches, it will be 1. On the main path with branches, it will only be 2 but not more regardless of the out degree, because surely the recursive call for cycle needs to end on that vertex but it will not end until all outgoing edges are used up. Therefore to the program, multiple cycles on one vertex is just one big cycle. On the other vertices on cycles, it works the same way whether they have branches or not.

In case you want to see how to run it, here is the main function I wrote to test it:

int main() {
    int n, q;
    cin >> n >> q;
    vector<vector<int> > adj(n);
    for (int i = 0; i < q; i++) {
        int u, v;
        cin >> u >> v;
        adj[u].push_back(v);
    }
    int s = start(adj);
    if (s == -1) {
        cout << "no path" << endl;
        return 0;
    }
    vector<int> ans;
    euler(adj, ans, s);
    reverse(ans.begin(), ans.end());
    for (int x : ans)
        cout << x << endl;
    return 0;
}

That’s it - a simple problem with a simple solution. Leetcode has one Euler path problem, and the algorithm in this blog post comes from the discussion of that problem. Everything is pretty much the same for undirected graphs - you just have to use different data structures to store the edges. The proof is mostly the same with the first condition now about odd/even number of edges at each vertex, as there is no distinction between in and out degree.