================
9. Spatial Trees
================

Key Concepts
------------

- **Nearest-neighbor search** extends one-dimensional lookup to multiple dimensions.
- **Grids** are limited due to memory overhead and fixed partitioning.
- **Tree-based spatial partitioning** allows dynamic and adaptive division of space.

Uniform Quadtrees
-----------------

- A **quadtree** recursively partitions two-dimensional space into four equal quadrants.
- Each **node** represents a spatial region and may have up to four children.
- **Uniform** means all splits divide the space evenly at each level.

**Node Structure:**

::

    QuadTreeNode {
        Boolean: is_leaf
        Integer: num_points
        Float: x_min, x_max, y_min, y_max
        Matrix of QuadTreeNodes: children
        Array of Points: points
    }

**Point Structure:**

::

    Point {
        Float: x
        Float: y
    }

Design Considerations
---------------------

- Internal nodes contain metadata and pointers to child nodes.
- Leaf nodes store the actual data points.
- Spatial bounds can be stored or computed from the root and the path of splits.
- Adaptive partitioning avoids excessive subdivision and reduces memory use.

Building Uniform Quadtrees
--------------------------

Subdivision continues until one of the following criteria is met:

- Too few points remain to justify a split.
- The spatial bounds are too small for further division.
- The maximum tree depth is reached.

This creates a hierarchical, adaptive structure that represents spatial density efficiently.

Adding Points
-------------

**Insertion Algorithm Overview:**

1. Check if the point lies within the quadtree’s root bounds.
2. Traverse the tree to locate the appropriate region.
3. If at a leaf:

   - Add the point to the node.
   - Check if the splitting criteria are met.

4. If splitting is required:

   - Convert the leaf to an internal node.
   - Redistribute the stored points among the child quadrants.

Advantages
----------

- Efficient for spatial searches and nearest-neighbor queries.
- Dynamically adapts to data density.
- Reduces unnecessary memory use compared to fine-grained grids.

Comparison to k-d Trees
-----------------------

- **Quadtrees**: partition space uniformly into four quadrants (2D focus).
- **k-d Trees**: binary partitioning adaptable to any dimension.
- k-d trees are more flexible for high-dimensional data.

Removing Points
---------------

Removing a point from a quadtree is similar to insertion but more complex.

- The point is deleted from the **leaf node’s list**.
- The algorithm then traverses **upward**, removing splits that are no longer necessary.
- Nodes may be **collapsed** when they no longer meet the splitting criteria.

Floating-point comparisons require tolerance checks using the ``approx_equal`` helper function.

**Node Collapse Algorithm:**

::

    QuadTreeNodeCollapse(node):
        IF node.is_leaf:
            return node.points
        FOR each child:
            IF child != null:
                sub_pts = QuadTreeNodeCollapse(child)
                append sub_pts to node.points
                child = null
        node.is_leaf = True
        return node.points

**Deletion Process:**

1. Verify the point lies within the tree’s spatial bounds.
2. Search recursively for the point.
3. When found, remove it from the leaf node.
4. Update ``num_points`` and collapse nodes if needed.

**Key Operations:**

- **Recursive Search:** finds the node containing the target point.
- **Deletion:** removes the first matching point using approximate equality.
- **Collapse:** combines children into a single leaf node if they no longer justify subdivision.

Searching Uniform Quadtrees
---------------------------

The quadtree search begins at the **root node** and recursively explores regions that may contain a **closer neighbor** than the current candidate.

**Process Overview:**

1. **Start at the root** with a dummy candidate at infinite distance.
2. **Check node compatibility:** determine whether the node could contain a point closer than the current best.
3. **Internal node:** recursively search the child nodes.
4. **Leaf node:** compare the target’s distance to all contained points.
5. **Prune branches:** skip subtrees that cannot contain a closer point.

**Distance Calculation:**

For a point ``(x, y)`` and node bounds ``[x_min, x_max] × [y_min, y_max]``:

::

    if x < x_min          → x_dist = x_min - x
    if x_min ≤ x ≤ x_max  → x_dist = 0
    if x > x_max          → x_dist = x - x_max

    if y < y_min          → y_dist = y_min - y
    if y_min ≤ y ≤ y_max  → y_dist = 0
    if y > y_max          → y_dist = y - y_max

The **minimum distance** to the node is ``sqrt(x_dist² + y_dist²)``.

**Search Optimization:**

- **Prioritize quadrants** closest to the query point.
- **Prune quadrants** whose minimum possible distance exceeds the best distance found so far.
- **Skip empty nodes** (null children) automatically.

**Result:** the nearest-neighbor search efficiently narrows down candidates, exploring only nodes that could contain closer points and pruning irrelevant regions.

Nearest-Neighbor Search in Quadtrees
====================================

* Uses a recursive search similar to other tree-based structures.
* Employs a *best_dist* parameter to prune branches.

Helper Function: ``MinDist``
----------------------------

Computes the minimum distance from a target point to a node’s bounding box.

.. code-block:: none

    MinDist(node, x, y):
        x_dist = 0.0
        IF x < node.x_min:
            x_dist = node.x_min - x
        IF x > node.x_max:
            x_dist = x - node.x_max
        y_dist = 0.0
        IF y < node.y_min:
            y_dist = node.y_min - y
        IF y > node.y_max:
            y_dist = y - node.y_max
        return sqrt(x_dist*x_dist + y_dist*y_dist)

Nearest-Neighbor Search Algorithm
---------------------------------

.. code-block:: none

    QuadTreeNearestNeighbor(tree, x, y):
        return QuadTreeNodeNearestNeighbor(tree.root, x, y, Inf)

Recursive Search
----------------

1. **Pruning:** skip the node if ``MinDist >= best_dist``.
2. **Leaf check:** if the node is a leaf, compare all points and update ``best_dist``.
3. **Internal node:**

   * Compute child bins based on the x/y coordinates.
   * Recursively explore the closest child first.
   * Update ``best_candidate`` and ``best_dist`` as closer points are found.

Summary
-------

* Pruning reduces the search space.
* Recursive traversal ensures all potentially closer nodes are checked.
* Allows searches for points outside the tree bounds.

k-d Trees
=========

Overview
--------

* Spatial data structure combining ideas from **binary trees** and **quadtrees**.
* Splits the data along one dimension at each node.
* Each node divides space into **two children**.

Structure
---------

.. code-block:: none

    KDTreeNode {
        is_leaf
        num_dimensions
        num_points
        x_min[], x_max[]
        split_dim, split_val
        left, right
        points[][]
    }

    KDTree {
        num_dimensions
        root
    }

Advantages
----------

* Handles higher dimensions efficiently.
* Adapts splits to the data distribution.
* Reduced branching factor (binary).

Tighter Spatial Bounds
----------------------

* Improves pruning by tracking **bounding boxes of the actual points**.
* Bounding boxes (L, H) are computed per node:

.. code-block:: none

    ComputeBoundingBox(pts):
        num_points = length(pts)
        num_dims = length(pts[0])
        L, H = new arrays
        FOR each dimension d:
            L[d] = H[d] = pts[0][d]
            FOR each point:
                update L[d], H[d] with min/max
        return (L, H)

Building k-d Trees
==================

Wrapper Function
----------------

.. code-block:: none

    BuildKDTree(tree, pts):
        Check dimensional consistency
        IF pts not empty:
            tree.root = KDTreeNode()
            RecursiveBuildKDTree(tree.root, tree.num_dimensions, pts)
        ELSE:
            tree.root = null

Recursive Construction
----------------------

.. code-block:: none

    RecursiveBuildKDTree(node, num_dims, pts):
        node.num_points = length(pts)
        (node.x_min, node.x_max) = ComputeBoundingBox(pts)
        Compute the widest dimension
        IF stopping condition met:
            node.points = pts
            return
        node.split_dim = widest dimension
        node.split_val = midpoint of widest dimension
        Partition pts into left_pts, right_pts
        Recursively build left and right children

Stopping Conditions
-------------------

* Minimum number of points.
* Minimum width.
* Maximum recursion depth.

k-d Tree Operations
===================

Insertion
---------

* Choose the branch using ``split_dim`` and ``split_val``.
* Update the bounding boxes if the point extends the node’s bounds.

Deletion
--------

* Locate the point using the split rules.
* After removal, tighten the node bounds and collapse internal nodes if needed.

Search
------

* Prune nodes based on the minimum distance to their bounding box.
* Distance computation (D-dimensional):

.. code-block:: none

    KDTreeNodeMinDist(node, pt):
        dist_sum = 0
        FOR each dimension d:
            diff = 0
            IF pt[d] < node.x_min[d]:
                diff = node.x_min[d] - pt[d]
            IF pt[d] > node.x_max[d]:
                diff = pt[d] - node.x_max[d]
            dist_sum += diff^2
        return sqrt(dist_sum)

Why This Matters
================

* **Quadtrees** dynamically adapt to spatial density but scale poorly with dimension.
* **k-d trees** balance flexibility and efficiency by splitting along one dimension at a time.
* Both support powerful **nearest-neighbor searches**; k-d trees extend this to higher-dimensional data.
* They represent a key tradeoff between memory, complexity, and performance.
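The build and search pseudocode above can be combined into a short runnable sketch. The Python below is illustrative rather than canonical: names such as ``KDNode``, ``min_dist``, ``nearest``, and the ``max_leaf`` stopping threshold are assumptions introduced here, not part of the pseudocode. It builds a k-d tree with tight per-node bounding boxes, splits on the widest dimension at its midpoint, and prunes the nearest-neighbor search using the minimum box distance, as ``KDTreeNodeMinDist`` describes.

.. code-block:: python

    import math

    def min_dist(lo, hi, pt):
        """Minimum Euclidean distance from pt to the axis-aligned box [lo, hi]."""
        total = 0.0
        for d in range(len(pt)):
            diff = 0.0
            if pt[d] < lo[d]:
                diff = lo[d] - pt[d]
            elif pt[d] > hi[d]:
                diff = pt[d] - hi[d]
            total += diff * diff
        return math.sqrt(total)

    class KDNode:
        """k-d tree node with a tight per-node bounding box (lo, hi)."""
        def __init__(self, pts, max_leaf=4):
            dims = range(len(pts[0]))
            self.lo = [min(p[d] for p in pts) for d in dims]
            self.hi = [max(p[d] for p in pts) for d in dims]
            widths = [self.hi[d] - self.lo[d] for d in dims]
            self.split_dim = widths.index(max(widths))       # widest dimension
            self.split_val = (self.lo[self.split_dim] + self.hi[self.split_dim]) / 2.0
            if len(pts) <= max_leaf or max(widths) == 0.0:   # stopping condition
                self.points, self.left, self.right = pts, None, None
                return
            left = [p for p in pts if p[self.split_dim] <= self.split_val]
            right = [p for p in pts if p[self.split_dim] > self.split_val]
            if not left or not right:                        # degenerate split: keep as a leaf
                self.points, self.left, self.right = pts, None, None
                return
            self.points = None
            self.left = KDNode(left, max_leaf)
            self.right = KDNode(right, max_leaf)

    def nearest(node, pt, best=None, best_dist=float("inf")):
        """Recursive nearest-neighbor search with bounding-box pruning."""
        if node is None or min_dist(node.lo, node.hi, pt) >= best_dist:
            return best, best_dist                 # prune: box cannot hold a closer point
        if node.points is not None:                # leaf: scan the stored points
            for p in node.points:
                d = math.dist(p, pt)
                if d < best_dist:
                    best, best_dist = p, d
            return best, best_dist
        near, far = ((node.left, node.right) if pt[node.split_dim] <= node.split_val
                     else (node.right, node.left))
        best, best_dist = nearest(near, pt, best, best_dist)   # closest child first
        best, best_dist = nearest(far, pt, best, best_dist)
        return best, best_dist

    pts = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (5.0, 5.0), (6.0, 6.0), (9.0, 9.0)]
    root = KDNode(pts)
    print(nearest(root, (5.4, 5.2))[0])        # (5.0, 5.0)
    print(nearest(root, (10.0, 10.0))[0])      # (9.0, 9.0) -- query outside the tree bounds

Note that the second query lies outside the root’s bounding box, mirroring the point made earlier that these searches work for targets outside the tree bounds: ``min_dist`` simply returns a positive distance for the root, and traversal proceeds normally.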