9 - Spatial Trees¶
Key Concepts¶
Nearest-neighbor search extends one-dimensional lookup to multiple dimensions.
Grids are limited due to memory overhead and fixed partitioning.
Tree-based spatial partitioning allows dynamic and adaptive division of space.
Uniform Quadtrees¶
A quadtree recursively partitions two-dimensional space into four equal quadrants.
Each node represents a spatial region and may have up to four children.
Uniform means all splits divide the space evenly at each level.
Node Structure:
QuadTreeNode {
    Boolean: is_leaf
    Integer: num_points
    Float: x_min, x_max, y_min, y_max
    Matrix of QuadTreeNodes: children
    Array of Points: points
}
Point Structure:
Point {
    Float: x
    Float: y
}
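The two structures can be sketched as Python dataclasses. The 2×2 `children` matrix mirrors the "Matrix of QuadTreeNodes" field above; the field defaults are assumptions for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class Point:
    x: float
    y: float

@dataclass
class QuadTreeNode:
    # Spatial bounds of the region this node covers.
    x_min: float
    x_max: float
    y_min: float
    y_max: float
    is_leaf: bool = True
    num_points: int = 0
    # 2x2 matrix of children: children[i][j] covers one quadrant.
    children: list = field(default_factory=lambda: [[None, None], [None, None]])
    # Data points are stored only at leaf nodes.
    points: list = field(default_factory=list)

root = QuadTreeNode(0.0, 1.0, 0.0, 1.0)
```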
Design Considerations¶
Internal nodes contain metadata and pointers to child nodes.
Leaf nodes store actual data points.
Spatial bounds can be stored or computed from the root and path of splits.
Adaptive partitioning avoids excessive subdivision and reduces memory use.
Building Uniform Quadtrees¶
Subdivision continues until one of the following criteria is met:
- Too few points to justify a split.
- Spatial bounds are too small for further division.
- Maximum tree depth reached.
This creates a hierarchical, adaptive structure that represents spatial density efficiently.
Adding Points¶
Insertion Algorithm Overview:
Check if point lies within the quadtree’s root bounds.
Traverse the tree to locate the appropriate region.
If at a leaf:
- Add the point to the node.
- Check if the splitting criteria are met.
If splitting is required:
- Convert the leaf to an internal node.
- Redistribute the stored points among the child quadrants.
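The insertion steps above can be sketched in Python. The `Node` class, the `MAX_POINTS` split threshold, and the midpoint routing rule are assumptions for illustration; the depth and minimum-width stopping criteria from the build section are omitted for brevity:

```python
MAX_POINTS = 4  # split threshold (an assumption; the text leaves it unspecified)

class Node:
    def __init__(self, x_min, x_max, y_min, y_max):
        self.x_min, self.x_max = x_min, x_max
        self.y_min, self.y_max = y_min, y_max
        self.is_leaf = True
        self.children = [[None, None], [None, None]]
        self.points = []

def insert(node, x, y):
    # Step 1: ignore points outside this node's bounds.
    if not (node.x_min <= x <= node.x_max and node.y_min <= y <= node.y_max):
        return
    if node.is_leaf:
        node.points.append((x, y))
        if len(node.points) > MAX_POINTS:
            split(node)
        return
    # Step 2: route to the child quadrant containing (x, y).
    i = 1 if x > (node.x_min + node.x_max) / 2 else 0
    j = 1 if y > (node.y_min + node.y_max) / 2 else 0
    insert(node.children[i][j], x, y)

def split(node):
    # Convert the leaf into an internal node with four child quadrants.
    x_mid = (node.x_min + node.x_max) / 2
    y_mid = (node.y_min + node.y_max) / 2
    xb = [(node.x_min, x_mid), (x_mid, node.x_max)]
    yb = [(node.y_min, y_mid), (y_mid, node.y_max)]
    for i in range(2):
        for j in range(2):
            node.children[i][j] = Node(*xb[i], *yb[j])
    node.is_leaf = False
    # Redistribute the stored points among the children.
    old = node.points
    node.points = []
    for (px, py) in old:
        insert(node, px, py)

root = Node(0.0, 10.0, 0.0, 10.0)
for p in [(1, 1), (9, 1), (1, 9), (9, 9), (5, 5)]:
    insert(root, *p)
```

Inserting the fifth point pushes the leaf past the threshold, so the root splits and its points are redistributed among the quadrants.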
Advantages¶
Efficient for spatial searches and nearest-neighbor queries.
Dynamically adapts to data density.
Reduces unnecessary memory use compared to fine-grained grids.
Comparison to k-d Trees¶
Quadtrees: Partition space uniformly into four quadrants (2D focus).
k-d Trees: Binary partitioning adaptable to any dimension.
k-d trees are more flexible for high-dimensional data.
Removing Points¶
Removing a point from a quadtree is similar to insertion but more complex.
The point is deleted from the leaf node’s list.
The algorithm then traverses upward, removing unnecessary splits.
Nodes may be collapsed when they no longer meet the splitting criteria.
Floating-point comparisons require tolerance checks using the approx_equal helper function.
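A minimal version of such a helper, assuming an absolute-difference check (the name follows the text; the default threshold is an assumption):

```python
def approx_equal(a: float, b: float, threshold: float = 1e-9) -> bool:
    """Return True when two floats differ by less than the threshold."""
    return abs(a - b) < threshold
```

This guards against spurious mismatches such as `0.1 + 0.2 != 0.3` when deleting stored coordinates.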
Node Collapse Algorithm:
QuadTreeNodeCollapse(node):
    IF node.is_leaf:
        return node.points
    FOR each child:
        IF child != null:
            sub_pts = QuadTreeNodeCollapse(child)
            append sub_pts to node.points
            child = null
    node.is_leaf = True
    return node.points
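The collapse pseudocode transcribes directly into Python; the minimal `Node` class here is assumed for illustration:

```python
class Node:
    def __init__(self, points=None):
        self.is_leaf = True
        self.points = points or []
        self.children = [[None, None], [None, None]]

def collapse(node):
    """QuadTreeNodeCollapse: merge all points in the subtree back into node."""
    if node.is_leaf:
        return node.points
    for i in range(2):
        for j in range(2):
            child = node.children[i][j]
            if child is not None:
                node.points.extend(collapse(child))
                node.children[i][j] = None  # detach the emptied child
    node.is_leaf = True
    return node.points

# Build a tiny internal node with two leaf children, then collapse it.
root = Node()
root.is_leaf = False
root.children[0][0] = Node([(1.0, 1.0)])
root.children[1][1] = Node([(9.0, 9.0)])
pts = collapse(root)
```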
Deletion Process:
Verify the point lies within the tree’s spatial bounds.
Search recursively for the point.
When found, remove it from the leaf node.
Update num_points and collapse nodes if needed.
Key Operations:
Recursive Search: Finds the node containing the target point.
Deletion: Removes the first matching point using approximate equality.
Collapse: Combines children into a single leaf node if they no longer justify subdivision.
Searching Uniform QuadTrees¶
The quadtree search begins at the root node and recursively explores potential regions that may contain a closer neighbor than the current candidate.
Process Overview:
Start at the root with a dummy candidate of infinite distance.
Check node compatibility: determine whether the node could contain a point closer than the current best.
Internal Node: Recursively search child nodes.
Leaf Node: Compare target distance to all contained points.
Prune branches: Skip subtrees that cannot contain a closer point.
Distance Calculation:
For a point (x, y) and node bounds [xmin, xmax] × [ymin, ymax]:
if x < xmin → xdist = xmin - x
if xmin ≤ x ≤ xmax → xdist = 0
if x > xmax → xdist = x - xmax
if y < ymin → ydist = ymin - y
if ymin ≤ y ≤ ymax → ydist = 0
if y > ymax → ydist = y - ymax
The minimum distance to the node = sqrt(xdist² + ydist²).
Search Optimization:
Prioritize quadrants closest to the query point.
Prune quadrants where the minimum possible distance exceeds the best found distance.
Skip empty nodes (null children) automatically.
Result: The nearest-neighbor search efficiently narrows down candidates, exploring only nodes that could contain closer points and pruning irrelevant regions.
Nearest-Neighbor Search in Quadtrees¶
Uses recursive search similar to other tree-based structures.
Employs best_dist parameter to prune branches.
Helper Function: MinDist¶
Computes the minimum distance from a target point to a node’s bounding box.
MinDist(node, x, y):
    x_dist = 0.0
    IF x < node.x_min: x_dist = node.x_min - x
    IF x > node.x_max: x_dist = x - node.x_max
    y_dist = 0.0
    IF y < node.y_min: y_dist = node.y_min - y
    IF y > node.y_max: y_dist = y - node.y_max
    return sqrt(x_dist*x_dist + y_dist*y_dist)
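An executable Python version of MinDist; for self-containment it takes the bounds directly rather than a node object:

```python
import math

def min_dist(x_min, x_max, y_min, y_max, x, y):
    """Minimum distance from (x, y) to the box; zero when the point is inside."""
    # max() collapses the three cases per axis: left of, inside, or right of the box.
    x_dist = max(x_min - x, 0.0, x - x_max)
    y_dist = max(y_min - y, 0.0, y - y_max)
    return math.sqrt(x_dist * x_dist + y_dist * y_dist)
```

For example, a point at (4, 5) relative to the unit square is 3 units past x_max and 4 past y_max, giving a minimum distance of 5.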
Nearest-Neighbor Search Algorithm¶
QuadTreeNearestNeighbor(tree, x, y):
    return QuadTreeNodeNearestNeighbor(tree.root, x, y, Inf)
Recursive Search¶
Pruning: Skip a node if MinDist >= best_dist.
Leaf Check: If the node is a leaf, compare all its points and update best_dist.
Internal Node:
- Compute child bins based on the x/y coordinates.
- Recursively explore the closest child first.
- Update best_candidate and best_dist as closer points are found.
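The whole pruned search can be sketched in Python. The `Node` layout (a flat child list rather than 2×2 bins) and the sort-by-distance child ordering are simplifications for illustration:

```python
import math

class Node:
    def __init__(self, x_min, x_max, y_min, y_max, points=None, children=None):
        self.x_min, self.x_max = x_min, x_max
        self.y_min, self.y_max = y_min, y_max
        self.points = points or []      # only used at leaves
        self.children = children or []  # empty list => leaf

def min_dist(node, x, y):
    dx = max(node.x_min - x, 0.0, x - node.x_max)
    dy = max(node.y_min - y, 0.0, y - node.y_max)
    return math.hypot(dx, dy)

def nearest(node, x, y, best=None, best_dist=math.inf):
    # Prune: this box cannot hold anything closer than the current best.
    if min_dist(node, x, y) >= best_dist:
        return best, best_dist
    if not node.children:
        # Leaf: scan stored points and update the candidate.
        for (px, py) in node.points:
            d = math.hypot(px - x, py - y)
            if d < best_dist:
                best, best_dist = (px, py), d
        return best, best_dist
    # Internal node: visit the closest child first so later ones can be pruned.
    for child in sorted(node.children, key=lambda c: min_dist(c, x, y)):
        best, best_dist = nearest(child, x, y, best, best_dist)
    return best, best_dist

left = Node(0, 5, 0, 10, points=[(1, 1), (4, 8)])
right = Node(5, 10, 0, 10, points=[(6, 2), (9, 9)])
root = Node(0, 10, 0, 10, children=[left, right])
```

Because the dummy candidate starts at infinite distance, queries outside the tree's bounds still work: the search simply descends toward the closest region.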
Summary¶
Pruning reduces search space.
Recursive traversal ensures all potentially closer nodes are checked.
Allows searches for points outside tree bounds.
k-d Trees¶
Overview¶
A spatial data structure that combines ideas from binary search trees and quadtrees.
Splits data along one dimension at each node.
Each node divides space into two children.
Structure¶
KDTreeNode {
    is_leaf
    num_dimensions
    num_points
    x_min[], x_max[]
    split_dim, split_val
    left, right
    points[][]
}

KDTree {
    num_dimensions
    root
}
Advantages¶
Handles higher dimensions efficiently.
Adaptable splits based on data distribution.
Reduced branching factor (binary).
Tighter Spatial Bounds¶
Improves pruning by tracking bounding boxes of actual points.
Bounding boxes (L, H) are computed per node:
ComputeBoundingBox(pts):
    num_points = length(pts)
    num_dims = length(pts[0])
    L, H = new arrays
    FOR each dimension d:
        L[d] = H[d] = pts[0][d]
        FOR each point:
            update L[d], H[d] with min/max
    return (L, H)
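In Python the same computation can be written with built-in min/max; representing points as tuples is an assumption for illustration:

```python
def compute_bounding_box(pts):
    """Per-dimension minimum (L) and maximum (H) over a non-empty point list."""
    num_dims = len(pts[0])
    L = [min(p[d] for p in pts) for d in range(num_dims)]
    H = [max(p[d] for p in pts) for d in range(num_dims)]
    return L, H
```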
Building k-d Trees¶
Wrapper Function¶
BuildKDTree(tree, pts):
    Check dimensional consistency
    IF pts not empty:
        tree.root = KDTreeNode()
        RecursiveBuildKDTree(tree.root, tree.num_dimensions, pts)
    ELSE:
        tree.root = null
Recursive Construction¶
RecursiveBuildKDTree(node, num_dims, pts):
    node.num_points = length(pts)
    (node.x_min, node.x_max) = ComputeBoundingBox(pts)
    Compute the widest dimension
    IF stopping condition met:
        node.points = pts
        return
    node.split_dim = widest dimension
    node.split_val = midpoint of widest dimension
    Partition pts into left_pts, right_pts
    Recursively build left and right children
Stopping Conditions¶
Minimum number of points.
Minimum width.
Maximum recursion depth.
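The recursive construction above can be sketched in Python. The `KDNode` class, the `MIN_POINTS` threshold, and the zero-width check are assumptions standing in for the stopping conditions; the maximum-depth check is omitted for brevity:

```python
MIN_POINTS = 2  # minimum leaf size (an assumption)

class KDNode:
    def __init__(self):
        self.is_leaf = True
        self.num_points = 0
        self.x_min = self.x_max = None
        self.split_dim = self.split_val = None
        self.left = self.right = None
        self.points = []

def build(node, pts):
    num_dims = len(pts[0])
    node.num_points = len(pts)
    # Bounding box of the actual points (tighter bounds improve pruning).
    node.x_min = [min(p[d] for p in pts) for d in range(num_dims)]
    node.x_max = [max(p[d] for p in pts) for d in range(num_dims)]
    widths = [node.x_max[d] - node.x_min[d] for d in range(num_dims)]
    d = widths.index(max(widths))  # widest dimension
    if len(pts) <= MIN_POINTS or widths[d] == 0.0:
        node.points = pts          # stop: store points at this leaf
        return
    node.split_dim = d
    node.split_val = (node.x_min[d] + node.x_max[d]) / 2.0  # midpoint split
    left_pts = [p for p in pts if p[d] <= node.split_val]
    right_pts = [p for p in pts if p[d] > node.split_val]
    node.is_leaf = False
    node.left, node.right = KDNode(), KDNode()
    build(node.left, left_pts)
    build(node.right, right_pts)

root = KDNode()
build(root, [(0, 0), (10, 0), (2, 9), (8, 7), (5, 5)])
```

Splitting at the midpoint of the widest dimension keeps both halves non-empty whenever the width is positive, so the recursion always terminates.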
k-d Tree Operations¶
Insertion¶
Choose the branch using split_dim and split_val.
Update bounding boxes if the point extends the node's bounds.
Deletion¶
Locate point using split rules.
After removal, tighten node bounds and collapse internal nodes if needed.
Search¶
Prune nodes based on minimum distance to bounding box.
Distance computation (D-dimensional):
KDTreeNodeMinDist(node, pt):
    dist_sum = 0
    FOR each dimension d:
        diff = 0
        IF pt[d] < node.x_min[d]: diff = node.x_min[d] - pt[d]
        IF pt[d] > node.x_max[d]: diff = pt[d] - node.x_max[d]
        dist_sum += diff^2
    return sqrt(dist_sum)
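An executable D-dimensional version of this computation; passing the bounds as lists rather than a node object is an adaptation for self-containment:

```python
import math

def kd_min_dist(x_min, x_max, pt):
    """Minimum distance from pt to the axis-aligned box [x_min, x_max] per dimension."""
    dist_sum = 0.0
    for d in range(len(pt)):
        # max() collapses the three cases: below, inside, or above the box.
        diff = max(x_min[d] - pt[d], 0.0, pt[d] - x_max[d])
        dist_sum += diff * diff
    return math.sqrt(dist_sum)
```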
Why This Matters¶
Quadtrees dynamically adapt to spatial density but scale poorly with dimensions.
k-d Trees balance flexibility and efficiency by splitting along one dimension.
Support powerful nearest-neighbor searches in high-dimensional data.
Represent a key tradeoff between memory, complexity, and performance.