Advanced Graph Algorithms

Go beyond BFS and DFS. Master Flow Networks (Max-Flow Min-Cut), Strongly Connected Components (Tarjan’s), and Eulerian Paths to optimize flow & scale.

1. The Hook: Optimizing Flow

Imagine you are a senior infrastructure engineer tasked with routing traffic from the East Coast data centers to the West Coast. Standard algorithms like Dijkstra’s tell you the fastest route for a single packet (latency), but what if you need to transfer 50 Petabytes of backup data overnight? Now you don’t care how fast one packet gets there; you care about the maximum total bandwidth you can push through the entire network simultaneously (throughput).

This is where standard traversal algorithms fail and Advanced Graph Algorithms shine.

We will cover three powerful paradigms:

  1. Network Flow (Max-Flow Min-Cut): Finding the absolute maximum throughput of a system.
  2. Strongly Connected Components (SCC): Identifying highly cohesive clusters within directed systems (like finding circular dependencies in microservices).
  3. Eulerian Paths: Optimizing physical routing to touch every connection exactly once.

2. Network Flow: Max-Flow Min-Cut Theorem

Max-Flow is like water pipes. The total water coming out of the tap cannot exceed the capacity of the bottleneck in the system.

Key Insight (Max-Flow Min-Cut Theorem): The maximum amount of flow possible from a source to a sink is exactly equal to the capacity of the bottleneck cut that separates them. If you want to increase overall system throughput, upgrading any component other than the bottleneck is a waste of money.

Real-World Analogy: The Logistics Supply Chain

Think of a warehouse (Source) shipping goods to a retail store (Sink) through various intermediate distribution centers.

  • Vertices: Distribution centers.
  • Edges: Highways connecting them.
  • Capacity: The maximum number of trucks per day a highway can support.
  • Flow Conservation: For any distribution center (except Source and Sink), the number of trucks arriving must equal the number of trucks leaving. Input Flow = Output Flow.

Ford-Fulkerson Algorithm Overview

The Ford-Fulkerson method finds the Max-Flow by iteratively pushing flow along paths with available capacity (augmenting paths).

  1. Start with 0 flow on all edges.
  2. While there is an augmenting path from source to sink in the residual graph (the remaining capacity):
    • Find the bottleneck capacity on this path.
    • Add this flow to the total flow.
    • Update residual capacities (and allow “pushing back” flow along reverse edges).

Interactive Flow Network Visualization

Here is a simplified visualization showing how flow pushes through a bottleneck.

S A B C T Cap: 8 Cap: 4 Cap: 2 (Cut) Cap: 4 Cap: 6

Notice the edge A→C. Even though 8 units can reach A, only 2 can pass through to C.
This Min-Cut limits the entire system's throughput.


3. Strongly Connected Components (SCC)

In a Directed Graph, if you can travel from A -> B and also from B -> A, they are strongly connected. A Strongly Connected Component (SCC) is a maximal subset of vertices where every vertex is reachable from every other vertex.

Use Cases in System Design

  1. Cycle Detection in Build Systems: If microservice A depends on B, B depends on C, and C depends on A, you have a circular dependency (an SCC). You cannot deploy them independently.
  2. Social Networks: Finding “cliques” or highly interactive communities where everyone follows everyone else.

The Algorithms

There are two primary algorithms to find SCCs in $O(V+E)$ time:

1. Kosaraju’s Algorithm

Kosaraju’s algorithm is elegant but requires two DFS passes.

  • Step 1: Run DFS on the original graph, pushing nodes onto a stack based on their finish times (post-order).
  • Step 2: Reverse all edges in the graph (create the Transpose graph).
  • Step 3: Pop nodes from the stack and run DFS on the transposed graph. Each DFS traversal forms one SCC.

2. Tarjan’s Algorithm

Tarjan’s algorithm is more efficient in practice because it requires only one DFS pass. It maintains an id (discovery time) and a low-link value (the smallest node id reachable from that node, including via back-edges).

  • If a node’s low-link value equals its id after exploring all neighbors, it is the “root” of an SCC.

Implementation: Tarjan’s Algorithm

Java
Go
// Simplified Tarjan's Algorithm logic for finding SCCs
public void tarjan(int u) {
  stack.push(u);
  onStack[u] = true;
  ids[u] = low[u] = id++; // Assign current id and low-link value

  for (int v : adj.get(u)) {
    if (ids[v] == -1) {
      // Unvisited node: recurse
      tarjan(v);
      low[u] = Math.min(low[u], low[v]); // Propagate low-link back up
    } else if (onStack[v]) {
      // Visited node that is on the current DFS path (back-edge)
      low[u] = Math.min(low[u], ids[v]);
    }
  }

  // If node u is a root node of an SCC
  if (ids[u] == low[u]) {
    while (!stack.isEmpty()) {
      int node = stack.pop();
      onStack[node] = false;
      low[node] = ids[u]; // Normalize all nodes in this SCC to the same low-link
      if (node == u) break;
    }
    sccCount++;
  }
}
// Simplified Tarjan's Algorithm logic for finding SCCs
func tarjan(u int) {
  stack = append(stack, u)
  onStack[u] = true
  ids[u] = id
  low[u] = id
  id++

  for _, v := range adj[u] {
    if ids[v] == -1 {
      // Unvisited node: recurse
      tarjan(v)
      if low[v] < low[u] {
        low[u] = low[v] // Propagate low-link back up
      }
    } else if onStack[v] {
      // Visited node that is on the current DFS path (back-edge)
      if ids[v] < low[u] {
        low[u] = ids[v]
      }
    }
  }

  // If node u is a root node of an SCC
  if ids[u] == low[u] {
    for {
      node := stack[len(stack)-1]
      stack = stack[:len(stack)-1]
      onStack[node] = false
      low[node] = ids[u] // Normalize all nodes in this SCC
      if node == u {
        break
      }
    }
    sccCount++
  }
}

4. Eulerian and Hamiltonian Paths

While Max-Flow deals with capacity and SCCs deal with connectivity, Path algorithms deal with exhaustive physical routing.

Eulerian Path (Visit Every Edge)

An Eulerian Path visits every edge exactly once. Think of a snowplow that must clear every street in a neighborhood without wasting fuel driving down the same street twice.

Conditions for an Eulerian Path (Undirected Graph):

  1. All vertices with non-zero degree must belong to a single connected component.
  2. Exactly 0 or 2 vertices have an odd degree.
    • If 0: It is an Eulerian Circuit (starts and ends at the same node).
    • If 2: It is a path. The start and end nodes are the ones with odd degrees.

Time Complexity: $O(V+E)$ using Hierholzer’s Algorithm.

Hamiltonian Path (Visit Every Vertex)

A Hamiltonian Path visits every vertex exactly once. Think of the classic Traveling Salesperson Problem (TSP), where a delivery driver must visit every house exactly once.

Key Difference: Unlike Eulerian paths, finding a Hamiltonian Path is NP-Complete. There is no known efficient algorithm (worst-case $O(2^N)$ with dynamic programming or $O(N!)$ with naive backtracking).

| Feature | Eulerian Path | Hamiltonian Path | | :— | :— | :— | | Goal | Visit every Edge exactly once | Visit every Vertex exactly once | | Real-world example | Snowplow routing, Garbage collection | Delivery routing (TSP) | | Time Complexity | $O(V+E)$ (Hierholzer’s Algorithm) | NP-Complete (e.g., $O(N^2 2^N)$ DP) | | Condition Check | Check degrees (Odd count 0 or 2) | No simple mathematical condition |