15- Graphs¶

Overview¶

Previous chapters focused on structuring data to support algorithms. This chapter reverses that perspective: graphs show how the structure of data can drive the creation of new algorithms.

We explore three key graph algorithms:

Dijkstra’s Algorithm – shortest paths
Prim’s Algorithm – minimum-cost spanning trees
Kahn’s Algorithm – topological sort

Introducing Graphs¶

Definition: A graph is composed of a set of nodes and a set of edges.

Applications:

Social networks – people as nodes, friendships as edges
Transportation networks – cities as nodes, routes as edges
Computer networks – devices as nodes, connections as edges

Types of Graphs:

Undirected graphs: two-way relationships (e.g., friendships, two-way roads)
Directed graphs: one-way relationships (e.g., dependencies, one-way streets)
Weighted graphs: edges have a cost or value (e.g., distance, time, strength)

Example Use: Directed graphs can represent task dependencies—such as brewing coffee—where edges indicate the required order of operations.

Representing Graphs¶

Two common representations:

Adjacency List - Each node stores its list of neighbors. - Efficient for sparse graphs. - Example:

Node {
    String: name
    Integer: id
    Array of Edges: edges
}

Edge {
    Integer: from_node
    Integer: to_node
    Float: weight
}

Graph contains all nodes:

Graph {
    Integer: num_nodes
    Array of Nodes: nodes
}

Mirrors real-world relationships like social networks (localized connections).

Adjacency Matrix - A matrix with one row and column per node. - Cell (i, j) stores edge weight from node i to node j. - Zero indicates no edge. - Provides a global view of connections.

Searching Graphs¶

Graph search explores nodes one at a time using edges to guide traversal.

Search Strategies:

Depth-First Search (DFS): - Uses a stack. - Explores as far as possible before backtracking.
Breadth-First Search (BFS): - Uses a queue. - Explores neighbors first before going deeper.
Best-First Search: - Uses a priority ranking (e.g., heuristic scoring).

The graph’s structure directly influences exploration patterns.

Dijkstra’s Algorithm – Shortest Paths¶

Goal: Find the shortest path from one node to all others in a weighted graph (with non-negative edge weights).

Concept: - Maintain tentative distances to all nodes. - Repeatedly visit the closest unvisited node. - Update distances to its neighbors if a shorter path is found.

Pseudocode:

Dijkstras(Graph G, Integer from_node_index):
  Array distance = inf for each node
  Array last = -1 for each node
  Set unvisited = all node indices
  distance[from_node_index] = 0.0

  WHILE unvisited NOT empty:
      next_index = node in unvisited with minimal distance
      current = G.nodes[next_index]
      Remove next_index from unvisited

      FOR EACH edge IN current.edges:
          new_dist = distance[edge.from_node] + edge.weight
          IF new_dist < distance[edge.to_node]:
              distance[edge.to_node] = new_dist
              last[edge.to_node] = edge.from_node

Optimization: Use a min-heap (priority queue) to efficiently retrieve the smallest distance node.

Key Idea: Dijkstra’s uses graph structure to constrain movement—paths are limited by edges.

Prim’s Algorithm – Minimum Spanning Tree¶

Goal: Connect all nodes with the minimum total edge cost (for undirected graphs).

Concept: - Start from any node. - Repeatedly add the smallest edge that connects an unvisited node to the tree. - Continue until all nodes are connected.

Pseudocode:

Prims(Graph G):
  Array distance = inf for each node
  Array last = -1 for each node
  Set unvisited = all node indices
  Set mst_edges = empty

  WHILE unvisited NOT empty:
      next_id = node with minimal distance
      IF last[next_id] != -1:
          Add edge(last[next_id], next_id) to mst_edges
      Remove next_id from unvisited
      current = G.nodes[next_id]

      FOR EACH edge IN current.edges:
          IF edge.to_node in unvisited AND edge.weight < distance[edge.to_node]:
              distance[edge.to_node] = edge.weight
              last[edge.to_node] = current.id

  RETURN mst_edges

Notes: - Works similarly to Dijkstra’s algorithm but focuses on edge costs, not path lengths. - Can yield multiple valid MSTs with equal cost.

Kahn’s Algorithm – Topological Sort¶

Goal: Sort nodes in a Directed Acyclic Graph (DAG) according to dependency order.

Concept: - Nodes with no incoming edges come first. - Remove them, then update incoming edge counts for remaining nodes. - Repeat until all nodes are processed.

Pseudocode:

Kahns(Graph G):
  Array sorted = []
  Array count = 0 for each node
  Stack next = empty

  # Count incoming edges
  FOR EACH node IN G.nodes:
      FOR EACH edge IN node.edges:
          count[edge.to_node] += 1

  # Initialize stack with nodes having no incoming edges
  FOR EACH node IN G.nodes:
      IF count[node.id] == 0:
          next.Push(node)

  # Process nodes
  WHILE NOT next.IsEmpty():
      current = next.Pop()
      Append current to sorted
      FOR EACH edge IN current.edges:
          count[edge.to_node] -= 1
          IF count[edge.to_node] == 0:
              next.Push(G.nodes[edge.to_node])

  RETURN sorted

Notes: - Works only on acyclic graphs. - If sorted list size < number of nodes → cycle detected.

Why This Matters¶

Graphs model real-world systems: roads, networks, dependencies.
Algorithms leverage this structure for: - Shortest path (Dijkstra) - Minimum connections (Prim) - Ordered processing (Kahn)

Key Insight: The structure of data both defines the problem and shapes the algorithmic solution.

Key Terms¶

Node (Vertex): Individual data point.
Edge: Connection between nodes.
Directed Graph: Edges have direction.
Undirected Graph: Edges are bidirectional.
Weighted Edge: Edge includes a cost or value.
Adjacency List / Matrix: Two ways to represent graphs.
MST (Minimum Spanning Tree): Lowest-cost subset connecting all nodes.
DAG (Directed Acyclic Graph): Graph with no cycles.
Topological Sort: Ordering of nodes based on dependency.

Summary¶

Graphs describe relationships and constraints.
Dijkstra’s finds optimal routes.
Prim’s builds efficient connection networks.
Kahn’s enforces dependency ordering.
Together, they illustrate how data structure drives algorithm design.