DFS: Difference between revisions

From Rice Wiki
No edit summary
 
(13 intermediate revisions by 2 users not shown)
Line 1: Line 1:
= DFS =
{{Infobox Algorithm|runtime=O(V + E)|space=O(V)|class=[[Graph Algorithms]]}}


Used for cycle detection
'''Depth-first search''' is a graph traversal algorithm.


== Approach ==
= Approach =
The basic idea of DFS is to visit ''every child'' of the visiting node before moving on to the next unvisited node.
 
# Every node besides one start out being unvisited (''white'').
# DFS begin by visiting a source node. Upon entering a node, DFS marks it as visiting (''grey'').
# DFS then visits all of its child nodes.
# After it has visited all children of a grey node, it marks the visiting parent node as visited (''black'').
# This continues until every node is black.
 
= Implementation =
<pre>
DFS_visit(u) {
    time = time + 1
    d[u] = time
    color[u] = 'grey'
    for each v in adj[u] {
        if color[u] == white
            DFS_visit(u)
    }
    f[u] = time++
    color[u] = black
}
</pre>
 
= Analysis =
Since DFS is only called when a node is white, and a node is marked as grey immediately afterwards, DFS is called ''V'' times.
 
In each call, everything is completed in constant time, except the process of iterating the outgoing edges of that node. Over all iterations, that comes down to ''E'' times since that is the total number of edges.
 
Therefore, under the normal implementation of DFS, the runtime complexity is O(V + E).
 
== Adjacency lists ==
Realistically, the only real change on the runtime would be using something other than adjacency lists to store the graph, in which case the time it takes to search for edges needs to be taken into account. For example, an adjacency matrix would take O(V^2), which is the time it takes to iterate through all the edges.
 
= Variations and Applications =
 
== Edge classification and cycle detection ==
DFS can classify edges based on the color of the node it is visiting.
 
* If the node is ''white'' (unvisited), the edge is called a '''tree edge'''. After DFS ends, all the tree edges form a spanning tree stemming from the source node (if the graph is connected).
* If the node is ''grey'' (visiting), the edge is called a '''back edge'''. During a visit, any grey node is a predecessor of the current node, therefore edges to grey node goes "back" up. Notably, the presence of back edges indicate that there is a '''cycle'''.
* If the node is ''black'' (visited), the edge is either a '''forward edge''' or a '''cross edge'''. I have not seen uses for these.
 
== Entry time and exit time ==
An addition to DFS keeps track of the '''entry time''' and '''exit time''' of nodes: that is, it keeps an ordering of when nodes are marked as grey/black. An index is used to keep track of this, incrementing every time a node's color change.
 
This is useful primarily for '''topological sort'''. A strategy for this problem involves DFS-ing the graph and sorting the nodes by exit times.
[[Category:Algorithms]]

Latest revision as of 06:35, 20 March 2024

Depth-first search is a graph traversal algorithm.

Approach

The basic idea of DFS is to visit every child of the visiting node before moving on to the next unvisited node.

  1. Every node besides one start out being unvisited (white).
  2. DFS begin by visiting a source node. Upon entering a node, DFS marks it as visiting (grey).
  3. DFS then visits all of its child nodes.
  4. After it has visited all children of a grey node, it marks the visiting parent node as visited (black).
  5. This continues until every node is black.

Implementation

 DFS_visit(u) {
     time = time + 1
     d[u] = time
     color[u] = 'grey'
     for each v in adj[u] {
         if color[u] == white
             DFS_visit(u)
     }
     f[u] = time++
     color[u] = black
 }

Analysis

Since DFS is only called when a node is white, and a node is marked as grey immediately afterwards, DFS is called V times.

In each call, everything is completed in constant time, except the process of iterating the outgoing edges of that node. Over all iterations, that comes down to E times since that is the total number of edges.

Therefore, under the normal implementation of DFS, the runtime complexity is O(V + E).

Adjacency lists

Realistically, the only real change on the runtime would be using something other than adjacency lists to store the graph, in which case the time it takes to search for edges needs to be taken into account. For example, an adjacency matrix would take O(V^2), which is the time it takes to iterate through all the edges.

Variations and Applications

Edge classification and cycle detection

DFS can classify edges based on the color of the node it is visiting.

  • If the node is white (unvisited), the edge is called a tree edge. After DFS ends, all the tree edges form a spanning tree stemming from the source node (if the graph is connected).
  • If the node is grey (visiting), the edge is called a back edge. During a visit, any grey node is a predecessor of the current node, therefore edges to grey node goes "back" up. Notably, the presence of back edges indicate that there is a cycle.
  • If the node is black (visited), the edge is either a forward edge or a cross edge. I have not seen uses for these.

Entry time and exit time

An addition to DFS keeps track of the entry time and exit time of nodes: that is, it keeps an ordering of when nodes are marked as grey/black. An index is used to keep track of this, incrementing every time a node's color change.

This is useful primarily for topological sort. A strategy for this problem involves DFS-ing the graph and sorting the nodes by exit times.