=========================== 15- Graphs =========================== Overview ======== Previous chapters focused on structuring data to support algorithms. This chapter reverses that perspective: **graphs** show how the *structure of data* can drive the creation of new algorithms. We explore three key graph algorithms: - **Dijkstra’s Algorithm** – shortest paths - **Prim’s Algorithm** – minimum-cost spanning trees - **Kahn’s Algorithm** – topological sort ---------------------- Introducing Graphs ---------------------- **Definition:** A graph is composed of a **set of nodes** and a **set of edges**. **Applications:** - Social networks – people as nodes, friendships as edges - Transportation networks – cities as nodes, routes as edges - Computer networks – devices as nodes, connections as edges **Types of Graphs:** - **Undirected graphs:** two-way relationships (e.g., friendships, two-way roads) - **Directed graphs:** one-way relationships (e.g., dependencies, one-way streets) - **Weighted graphs:** edges have a cost or value (e.g., distance, time, strength) **Example Use:** Directed graphs can represent task dependencies—such as brewing coffee—where edges indicate the required order of operations. ------------------------ Representing Graphs ------------------------ Two common representations: 1. **Adjacency List** - Each node stores its list of neighbors. - Efficient for sparse graphs. - Example:: Node { String: name Integer: id Array of Edges: edges } Edge { Integer: from_node Integer: to_node Float: weight } - Graph contains all nodes:: Graph { Integer: num_nodes Array of Nodes: nodes } - Mirrors real-world relationships like social networks (localized connections). 2. **Adjacency Matrix** - A matrix with one row and column per node. - Cell (i, j) stores edge weight from node *i* to node *j*. - Zero indicates no edge. - Provides a *global view* of connections. ------------------------ Searching Graphs ------------------------ Graph search explores nodes one at a time using edges to guide traversal. **Search Strategies:** - **Depth-First Search (DFS):** - Uses a stack. - Explores as far as possible before backtracking. - **Breadth-First Search (BFS):** - Uses a queue. - Explores neighbors first before going deeper. - **Best-First Search:** - Uses a priority ranking (e.g., heuristic scoring). The graph’s structure directly influences exploration patterns. ---------------------------------------- Dijkstra’s Algorithm – Shortest Paths ---------------------------------------- **Goal:** Find the shortest path from one node to all others in a weighted graph (with non-negative edge weights). **Concept:** - Maintain tentative distances to all nodes. - Repeatedly visit the closest unvisited node. - Update distances to its neighbors if a shorter path is found. **Pseudocode:** .. code-block:: none Dijkstras(Graph G, Integer from_node_index): Array distance = inf for each node Array last = -1 for each node Set unvisited = all node indices distance[from_node_index] = 0.0 WHILE unvisited NOT empty: next_index = node in unvisited with minimal distance current = G.nodes[next_index] Remove next_index from unvisited FOR EACH edge IN current.edges: new_dist = distance[edge.from_node] + edge.weight IF new_dist < distance[edge.to_node]: distance[edge.to_node] = new_dist last[edge.to_node] = edge.from_node **Optimization:** Use a *min-heap (priority queue)* to efficiently retrieve the smallest distance node. **Key Idea:** Dijkstra’s uses graph structure to constrain movement—paths are limited by edges. ------------------------------------------ Prim’s Algorithm – Minimum Spanning Tree ------------------------------------------ **Goal:** Connect all nodes with the minimum total edge cost (for undirected graphs). **Concept:** - Start from any node. - Repeatedly add the smallest edge that connects an unvisited node to the tree. - Continue until all nodes are connected. **Pseudocode:** .. code-block:: none Prims(Graph G): Array distance = inf for each node Array last = -1 for each node Set unvisited = all node indices Set mst_edges = empty WHILE unvisited NOT empty: next_id = node with minimal distance IF last[next_id] != -1: Add edge(last[next_id], next_id) to mst_edges Remove next_id from unvisited current = G.nodes[next_id] FOR EACH edge IN current.edges: IF edge.to_node in unvisited AND edge.weight < distance[edge.to_node]: distance[edge.to_node] = edge.weight last[edge.to_node] = current.id RETURN mst_edges **Notes:** - Works similarly to Dijkstra’s algorithm but focuses on *edge costs*, not path lengths. - Can yield multiple valid MSTs with equal cost. ----------------------------------- Kahn’s Algorithm – Topological Sort ----------------------------------- **Goal:** Sort nodes in a Directed Acyclic Graph (DAG) according to dependency order. **Concept:** - Nodes with no incoming edges come first. - Remove them, then update incoming edge counts for remaining nodes. - Repeat until all nodes are processed. **Pseudocode:** .. code-block:: none Kahns(Graph G): Array sorted = [] Array count = 0 for each node Stack next = empty # Count incoming edges FOR EACH node IN G.nodes: FOR EACH edge IN node.edges: count[edge.to_node] += 1 # Initialize stack with nodes having no incoming edges FOR EACH node IN G.nodes: IF count[node.id] == 0: next.Push(node) # Process nodes WHILE NOT next.IsEmpty(): current = next.Pop() Append current to sorted FOR EACH edge IN current.edges: count[edge.to_node] -= 1 IF count[edge.to_node] == 0: next.Push(G.nodes[edge.to_node]) RETURN sorted **Notes:** - Works only on *acyclic* graphs. - If sorted list size < number of nodes → cycle detected. ----------------------- Why This Matters ----------------------- - Graphs model real-world systems: roads, networks, dependencies. - Algorithms leverage this structure for: - Shortest path (Dijkstra) - Minimum connections (Prim) - Ordered processing (Kahn) **Key Insight:** The structure of data both *defines the problem* and *shapes the algorithmic solution*. ----------------------- Key Terms ----------------------- - **Node (Vertex):** Individual data point. - **Edge:** Connection between nodes. - **Directed Graph:** Edges have direction. - **Undirected Graph:** Edges are bidirectional. - **Weighted Edge:** Edge includes a cost or value. - **Adjacency List / Matrix:** Two ways to represent graphs. - **MST (Minimum Spanning Tree):** Lowest-cost subset connecting all nodes. - **DAG (Directed Acyclic Graph):** Graph with no cycles. - **Topological Sort:** Ordering of nodes based on dependency. ----------------------- Summary ----------------------- - Graphs describe relationships and constraints. - Dijkstra’s finds optimal routes. - Prim’s builds efficient connection networks. - Kahn’s enforces dependency ordering. - Together, they illustrate how **data structure drives algorithm design**.