=========================== 14- Skip Lists =========================== Overview =========== Skip lists are a probabilistic data structure that extend linked lists with multiple layers to improve search, insertion, and deletion efficiency. They provide a simpler alternative to balanced trees while maintaining logarithmic average-case performance. Analogy ========== - Searching in a skip list is like flipping through a book in large chunks, then smaller chunks as you get closer to the target. - Another analogy: passing messages between buildings using flashlights—higher floors skip over lower buildings. Key Properties ================= - Proposed by William Pugh. - Probabilistic: node heights are randomly chosen. - Multi-level linked lists: each level contains a subset of nodes from the level below. - Supports efficient search, insertion, and deletion in expected O(log n) time. Randomized vs Deterministic =============================== - Deterministic structures always produce the same shape from the same data (e.g., BSTs, heaps, tries). - Randomized structures vary their structure probabilistically. - Randomization reduces worst-case scenarios and balances the average case. - Example: inserting nodes with random levels instead of always deterministic placement. Skip List Structure ========================= SkipList ------------ .. code-block:: python SkipList { Integer: top_level # highest level in use Integer: max_level # maximum allowable level SkipListNode: front # dummy node at front } SkipListNode ---------------- .. code-block:: python SkipListNode { Type: key Type: value Integer: height Array of SkipListNodes: next # pointers for each level [0, height] } - Higher-level nodes have fewer nodes but longer skips. - `front` node simplifies insertion/deletion by tracking top-level pointers. Searching ============= Algorithm ------------ .. code-block:: python SkipListSearch(SkipList: list, Type: target): level = list.top_level current = list.front while level >= 0: while current.next[level] != null and current.next[level].key < target: current = current.next[level] level -= 1 result = current.next[0] if result != null and result.key == target: return result.value else: return null - Start at top-left (highest level, front node). - Move right while next key < target. - Drop down a level if no more progress. - Repeat until bottom level reached. Insertion ============== Algorithm ------------- .. code-block:: python SkipListInsert(SkipList: list, Type: key, Type: value): level = list.top_level current = list.front last = array of size list.max_level + 1, initialized with list.front while level >= 0: while current.next[level] != null and current.next[level].key < key: current = current.next[level] last[level] = current level -= 1 result = current.next[0] if result != null and result.key == key: result.value = value return new_level = pick_random_level() if new_level > list.top_level: list.top_level = new_level new_node = SkipListNode(key, value, new_level) for j in range(new_level + 1): new_node.next[j] = last[j].next[j] last[j].next[j] = new_node - Use same traversal logic as search. - Track last node at each level (`last`) to update pointers. - Assign node a random height using probability `p` (commonly 0.5). - Insert node into each level up to its height. Deletion ============ Algorithm ------------- .. code-block:: python SkipListDelete(SkipList: list, Type: target): level = list.top_level current = list.front last = array of size list.max_level + 1, initialized with list.front while level >= 0: while current.next[level] != null and current.next[level].key < target: current = current.next[level] last[level] = current level -= 1 result = current.next[0] if result == null or result.key != target: return for j in range(result.height + 1): last[j].next[j] = result.next[j] result.next[j] = null if result.height == list.top_level: top = list.top_level while top > 0 and list.front.next[top] == null: top -= 1 list.top_level = top - Mirrors insertion logic for traversal. - Update pointers at each level to skip over the deleted node. - Adjust `top_level` if the tallest node was removed. Node Height Selection ========================== - All nodes start at level 0. - Randomly decide to promote node to next level with probability `p`. - Maximum height capped by `max_level`. - Expected distribution: half of level L nodes appear at level L+1 (if p = 0.5). Performance ================= - Average-case search, insert, delete: O(log n) - Worst-case (all nodes same height): O(n) - Expected behavior mirrors balanced binary search trees. Why Skip Lists Matter ========================= - Simpler than balanced search trees. - Randomized structure provides robust average-case performance. - Demonstrates the use of probabilistic structures for efficient algorithms.