15- Graphs

Overview

Previous chapters focused on structuring data to support algorithms. This chapter reverses that perspective: graphs show how the structure of data can drive the creation of new algorithms.

We explore three key graph algorithms:

  • Dijkstra’s Algorithm – shortest paths

  • Prim’s Algorithm – minimum-cost spanning trees

  • Kahn’s Algorithm – topological sort

Introducing Graphs

Definition: A graph is composed of a set of nodes and a set of edges.

Applications:

  • Social networks – people as nodes, friendships as edges

  • Transportation networks – cities as nodes, routes as edges

  • Computer networks – devices as nodes, connections as edges

Types of Graphs:

  • Undirected graphs: two-way relationships (e.g., friendships, two-way roads)

  • Directed graphs: one-way relationships (e.g., dependencies, one-way streets)

  • Weighted graphs: edges have a cost or value (e.g., distance, time, strength)

Example Use: Directed graphs can represent task dependencies—such as brewing coffee—where edges indicate the required order of operations.

Representing Graphs

Two common representations:

  1. Adjacency List - Each node stores its list of neighbors. - Efficient for sparse graphs. - Example:

    Node {
        String: name
        Integer: id
        Array of Edges: edges
    }
    
    Edge {
        Integer: from_node
        Integer: to_node
        Float: weight
    }
    
    • Graph contains all nodes:

      Graph {
          Integer: num_nodes
          Array of Nodes: nodes
      }
      
    • Mirrors real-world relationships like social networks (localized connections).

  2. Adjacency Matrix - A matrix with one row and column per node. - Cell (i, j) stores edge weight from node i to node j. - Zero indicates no edge. - Provides a global view of connections.

Searching Graphs

Graph search explores nodes one at a time using edges to guide traversal.

Search Strategies:

  • Depth-First Search (DFS): - Uses a stack. - Explores as far as possible before backtracking.

  • Breadth-First Search (BFS): - Uses a queue. - Explores neighbors first before going deeper.

  • Best-First Search: - Uses a priority ranking (e.g., heuristic scoring).

The graph’s structure directly influences exploration patterns.

Dijkstra’s Algorithm – Shortest Paths

Goal: Find the shortest path from one node to all others in a weighted graph (with non-negative edge weights).

Concept: - Maintain tentative distances to all nodes. - Repeatedly visit the closest unvisited node. - Update distances to its neighbors if a shorter path is found.

Pseudocode:

Dijkstras(Graph G, Integer from_node_index):
  Array distance = inf for each node
  Array last = -1 for each node
  Set unvisited = all node indices
  distance[from_node_index] = 0.0

  WHILE unvisited NOT empty:
      next_index = node in unvisited with minimal distance
      current = G.nodes[next_index]
      Remove next_index from unvisited

      FOR EACH edge IN current.edges:
          new_dist = distance[edge.from_node] + edge.weight
          IF new_dist < distance[edge.to_node]:
              distance[edge.to_node] = new_dist
              last[edge.to_node] = edge.from_node

Optimization: Use a min-heap (priority queue) to efficiently retrieve the smallest distance node.

Key Idea: Dijkstra’s uses graph structure to constrain movement—paths are limited by edges.

Prim’s Algorithm – Minimum Spanning Tree

Goal: Connect all nodes with the minimum total edge cost (for undirected graphs).

Concept: - Start from any node. - Repeatedly add the smallest edge that connects an unvisited node to the tree. - Continue until all nodes are connected.

Pseudocode:

Prims(Graph G):
  Array distance = inf for each node
  Array last = -1 for each node
  Set unvisited = all node indices
  Set mst_edges = empty

  WHILE unvisited NOT empty:
      next_id = node with minimal distance
      IF last[next_id] != -1:
          Add edge(last[next_id], next_id) to mst_edges
      Remove next_id from unvisited
      current = G.nodes[next_id]

      FOR EACH edge IN current.edges:
          IF edge.to_node in unvisited AND edge.weight < distance[edge.to_node]:
              distance[edge.to_node] = edge.weight
              last[edge.to_node] = current.id

  RETURN mst_edges

Notes: - Works similarly to Dijkstra’s algorithm but focuses on edge costs, not path lengths. - Can yield multiple valid MSTs with equal cost.

Kahn’s Algorithm – Topological Sort

Goal: Sort nodes in a Directed Acyclic Graph (DAG) according to dependency order.

Concept: - Nodes with no incoming edges come first. - Remove them, then update incoming edge counts for remaining nodes. - Repeat until all nodes are processed.

Pseudocode:

Kahns(Graph G):
  Array sorted = []
  Array count = 0 for each node
  Stack next = empty

  # Count incoming edges
  FOR EACH node IN G.nodes:
      FOR EACH edge IN node.edges:
          count[edge.to_node] += 1

  # Initialize stack with nodes having no incoming edges
  FOR EACH node IN G.nodes:
      IF count[node.id] == 0:
          next.Push(node)

  # Process nodes
  WHILE NOT next.IsEmpty():
      current = next.Pop()
      Append current to sorted
      FOR EACH edge IN current.edges:
          count[edge.to_node] -= 1
          IF count[edge.to_node] == 0:
              next.Push(G.nodes[edge.to_node])

  RETURN sorted

Notes: - Works only on acyclic graphs. - If sorted list size < number of nodes → cycle detected.

Why This Matters

  • Graphs model real-world systems: roads, networks, dependencies.

  • Algorithms leverage this structure for: - Shortest path (Dijkstra) - Minimum connections (Prim) - Ordered processing (Kahn)

Key Insight: The structure of data both defines the problem and shapes the algorithmic solution.

Key Terms

  • Node (Vertex): Individual data point.

  • Edge: Connection between nodes.

  • Directed Graph: Edges have direction.

  • Undirected Graph: Edges are bidirectional.

  • Weighted Edge: Edge includes a cost or value.

  • Adjacency List / Matrix: Two ways to represent graphs.

  • MST (Minimum Spanning Tree): Lowest-cost subset connecting all nodes.

  • DAG (Directed Acyclic Graph): Graph with no cycles.

  • Topological Sort: Ordering of nodes based on dependency.

Summary

  • Graphs describe relationships and constraints.

  • Dijkstra’s finds optimal routes.

  • Prim’s builds efficient connection networks.

  • Kahn’s enforces dependency ordering.

  • Together, they illustrate how data structure drives algorithm design.