DFS
Depth-first search is a graph traversal algorithm.
Approach
The basic idea of DFS is to visit every child of the visiting node before moving on to the next unvisited node.
- Every node besides one start out being unvisited (white).
- DFS begin by visiting a source node. Upon entering a node, DFS marks it as visiting (grey).
- DFS then visits all of its child nodes.
- After it has visited all children of a grey node, it marks the visiting parent node as visited (black).
- This continues until every node is black.
Implementation
DFS_visit(u) { time = time + 1 d[u] = time color[u] = 'grey' for each v in adj[u] { if color[u] == white DFS_visit(u) } f[u] = time++ color[u] = black }
Analysis
Since DFS is only called when a node is white, and a node is marked as grey immediately afterwards, DFS is called V times.
In each call, everything is completed in constant time, except the process of iterating the outgoing edges of that node. Over all iterations, that comes down to E times since that is the total number of edges.
Therefore, under the normal implementation of DFS, the runtime complexity is O(V + E).
Adjacency lists
Realistically, the only real change on the runtime would be using something other than adjacency lists to store the graph, in which case the time it takes to search for edges needs to be taken into account. For example, an adjacency matrix would take O(V^2), which is the time it takes to iterate through all the edges.
Variations and Applications
Edge classification and cycle detection
DFS can classify edges based on the color of the node it is visiting.
- If the node is white (unvisited), the edge is called a tree edge. After DFS ends, all the tree edges form a spanning tree stemming from the source node (if the graph is connected).
- If the node is grey (visiting), the edge is called a back edge. During a visit, any grey node is a predecessor of the current node, therefore edges to grey node goes "back" up. Notably, the presence of back edges indicate that there is a cycle.
- If the node is black (visited), the edge is either a forward edge or a cross edge. I have not seen uses for these.
Entry time and exit time
An addition to DFS keeps track of the entry time and exit time of nodes: that is, it keeps an ordering of when nodes are marked as grey/black. An index is used to keep track of this, incrementing every time a node's color change.
This is useful primarily for topological sort. A strategy for this problem involves DFS-ing the graph and sorting the nodes by exit times.