15- Graphs¶
Overview¶
Previous chapters focused on structuring data to support algorithms. This chapter reverses that perspective: graphs show how the structure of data can drive the creation of new algorithms.
We explore three key graph algorithms:
Dijkstra’s Algorithm – shortest paths
Prim’s Algorithm – minimum-cost spanning trees
Kahn’s Algorithm – topological sort
Introducing Graphs¶
Definition: A graph is composed of a set of nodes and a set of edges.
Applications:
Social networks – people as nodes, friendships as edges
Transportation networks – cities as nodes, routes as edges
Computer networks – devices as nodes, connections as edges
Types of Graphs:
Undirected graphs: two-way relationships (e.g., friendships, two-way roads)
Directed graphs: one-way relationships (e.g., dependencies, one-way streets)
Weighted graphs: edges have a cost or value (e.g., distance, time, strength)
Example Use: Directed graphs can represent task dependencies—such as brewing coffee—where edges indicate the required order of operations.
Representing Graphs¶
Two common representations:
Adjacency List - Each node stores its list of neighbors. - Efficient for sparse graphs. - Example:
Node { String: name Integer: id Array of Edges: edges } Edge { Integer: from_node Integer: to_node Float: weight }
Graph contains all nodes:
Graph { Integer: num_nodes Array of Nodes: nodes }
Mirrors real-world relationships like social networks (localized connections).
Adjacency Matrix - A matrix with one row and column per node. - Cell (i, j) stores edge weight from node i to node j. - Zero indicates no edge. - Provides a global view of connections.
Searching Graphs¶
Graph search explores nodes one at a time using edges to guide traversal.
Search Strategies:
Depth-First Search (DFS): - Uses a stack. - Explores as far as possible before backtracking.
Breadth-First Search (BFS): - Uses a queue. - Explores neighbors first before going deeper.
Best-First Search: - Uses a priority ranking (e.g., heuristic scoring).
The graph’s structure directly influences exploration patterns.
Dijkstra’s Algorithm – Shortest Paths¶
Goal: Find the shortest path from one node to all others in a weighted graph (with non-negative edge weights).
Concept: - Maintain tentative distances to all nodes. - Repeatedly visit the closest unvisited node. - Update distances to its neighbors if a shorter path is found.
Pseudocode:
Dijkstras(Graph G, Integer from_node_index):
Array distance = inf for each node
Array last = -1 for each node
Set unvisited = all node indices
distance[from_node_index] = 0.0
WHILE unvisited NOT empty:
next_index = node in unvisited with minimal distance
current = G.nodes[next_index]
Remove next_index from unvisited
FOR EACH edge IN current.edges:
new_dist = distance[edge.from_node] + edge.weight
IF new_dist < distance[edge.to_node]:
distance[edge.to_node] = new_dist
last[edge.to_node] = edge.from_node
Optimization: Use a min-heap (priority queue) to efficiently retrieve the smallest distance node.
Key Idea: Dijkstra’s uses graph structure to constrain movement—paths are limited by edges.
Prim’s Algorithm – Minimum Spanning Tree¶
Goal: Connect all nodes with the minimum total edge cost (for undirected graphs).
Concept: - Start from any node. - Repeatedly add the smallest edge that connects an unvisited node to the tree. - Continue until all nodes are connected.
Pseudocode:
Prims(Graph G):
Array distance = inf for each node
Array last = -1 for each node
Set unvisited = all node indices
Set mst_edges = empty
WHILE unvisited NOT empty:
next_id = node with minimal distance
IF last[next_id] != -1:
Add edge(last[next_id], next_id) to mst_edges
Remove next_id from unvisited
current = G.nodes[next_id]
FOR EACH edge IN current.edges:
IF edge.to_node in unvisited AND edge.weight < distance[edge.to_node]:
distance[edge.to_node] = edge.weight
last[edge.to_node] = current.id
RETURN mst_edges
Notes: - Works similarly to Dijkstra’s algorithm but focuses on edge costs, not path lengths. - Can yield multiple valid MSTs with equal cost.
Kahn’s Algorithm – Topological Sort¶
Goal: Sort nodes in a Directed Acyclic Graph (DAG) according to dependency order.
Concept: - Nodes with no incoming edges come first. - Remove them, then update incoming edge counts for remaining nodes. - Repeat until all nodes are processed.
Pseudocode:
Kahns(Graph G):
Array sorted = []
Array count = 0 for each node
Stack next = empty
# Count incoming edges
FOR EACH node IN G.nodes:
FOR EACH edge IN node.edges:
count[edge.to_node] += 1
# Initialize stack with nodes having no incoming edges
FOR EACH node IN G.nodes:
IF count[node.id] == 0:
next.Push(node)
# Process nodes
WHILE NOT next.IsEmpty():
current = next.Pop()
Append current to sorted
FOR EACH edge IN current.edges:
count[edge.to_node] -= 1
IF count[edge.to_node] == 0:
next.Push(G.nodes[edge.to_node])
RETURN sorted
Notes: - Works only on acyclic graphs. - If sorted list size < number of nodes → cycle detected.
Why This Matters¶
Graphs model real-world systems: roads, networks, dependencies.
Algorithms leverage this structure for: - Shortest path (Dijkstra) - Minimum connections (Prim) - Ordered processing (Kahn)
Key Insight: The structure of data both defines the problem and shapes the algorithmic solution.
Key Terms¶
Node (Vertex): Individual data point.
Edge: Connection between nodes.
Directed Graph: Edges have direction.
Undirected Graph: Edges are bidirectional.
Weighted Edge: Edge includes a cost or value.
Adjacency List / Matrix: Two ways to represent graphs.
MST (Minimum Spanning Tree): Lowest-cost subset connecting all nodes.
DAG (Directed Acyclic Graph): Graph with no cycles.
Topological Sort: Ordering of nodes based on dependency.
Summary¶
Graphs describe relationships and constraints.
Dijkstra’s finds optimal routes.
Prim’s builds efficient connection networks.
Kahn’s enforces dependency ordering.
Together, they illustrate how data structure drives algorithm design.