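Multidimensional Data Search Trees: Range Tree

An introduction to multidimensional data search trees, mainly the range tree, with examples to aid understanding and accompanying C+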

Query types

Exact-match query
Range query
Nearest-neighbor query

Range tree

1D Range Search (Range [x,x'])

Data Structure 1: Sorted array

Search for x and x' in the sorted array A by binary search: $O(\log n)$
Output all points between them: $O(k)$, where k is the number of reported points
Total: $O(k + \log n)$
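The three steps above can be sketched in C++ as follows. This is a minimal illustration rather than a reference implementation; the function name `rangeSearchSorted` and the parameter name `xPrime` are made up for this example, and the input array is assumed to be sorted in ascending order already.

```cpp
#include <algorithm>
#include <vector>

// Report all values of the sorted array `a` that lie in [x, xPrime].
// Assumption: `a` is sorted in ascending order.
std::vector<int> rangeSearchSorted(const std::vector<int>& a, int x, int xPrime) {
    if (x > xPrime) return {};                               // empty query range
    // Binary search for the left end: first element >= x, O(log n).
    auto first = std::lower_bound(a.begin(), a.end(), x);
    // Binary search for the right end: first element > xPrime, O(log n).
    auto last = std::upper_bound(first, a.end(), xPrime);
    // Copy the k elements in between, O(k).
    return std::vector<int>(first, last);
}
```

For the array {1, 3, 4, 7, 9, 12}, the query [3, 9] returns {3, 4, 7, 9}, and the query [5, 6] returns an empty vector.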

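However, this structure does not carry over to higher-dimensional data: a single sorted order can serve only one coordinate, and the time/space cost becomes far too high.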

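Data Structure 2: Binary search tree (BST)

Prepare an array to hold the results.
Recursively search the tree, skipping subtrees that cannot contain values in [l, r].

// In-order traversal restricted to [l, r].
// `results` is assumed to be a List<Integer> field of the enclosing class.
private void rsearch(TreeNode root, int l, int r) 
{
    if (root == null) 
        return;

    // Smaller values live in the left subtree; visit it only if it can reach l.
    if (root.val > l) 
        rsearch(root.left, l, r);

    // Report the current node if it lies inside the range.
    if (root.val >= l && root.val <= r) 
        results.add(root.val);

    // Larger values live in the right subtree; visit it only if it can reach r.
    if (root.val < r) 
        rsearch(root.right, l, r);
}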

Data Structure 3: BST with data stored in leaves

Perform binary search twice, once with x and once with x'.
Suppose the binary searches end at leaves l and l'.
The points in [x, x'] are the ones stored between l and l' plus, possibly, the points stored in l and l' themselves.

#include <vector>
using namespace std;

// BST node. Leaves store the actual data; internal nodes store only a split value.
// Convention (matching findLeaf below): the left subtree holds values < value,
// the right subtree holds values >= value.
struct Node {
    int value;           // leaf: stored data; internal node: split value
    Node* left;          // left child
    Node* right;         // right child
    bool isLeaf;         // true if this node is a leaf
    Node(int val, bool isLeaf) : value(val), left(nullptr), right(nullptr), isLeaf(isLeaf) {}
};

// Binary search for x: returns the leaf at which the search ends.
Node* findLeaf(Node* root, int x) {
    Node* current = root;
    while (!current->isLeaf) {        // descend until a leaf is reached
        if (x < current->value) {
            current = current->left;  // continue in the left subtree
        } else {
            current = current->right; // continue in the right subtree
        }
    }
    return current;                   // the leaf where the search ends
}

// Helper: collect, in ascending order, the leaves whose values lie in [x, x_prime].
// Subtrees that cannot contain such a value are skipped, so the traversal only
// follows the search paths of x and x_prime (the paths to l and l') and the
// subtrees between them, which costs O(k + log n) on a balanced tree.
void collectInRange(Node* node, int x, int x_prime, vector<int>& out) {
    if (node == nullptr) return;
    if (node->isLeaf) {
        // Boundary leaves l and l' may or may not be inside the range, so test every leaf.
        if (node->value >= x && node->value <= x_prime) {
            out.push_back(node->value);
        }
        return;
    }
    if (x < node->value) collectInRange(node->left, x, x_prime, out);        // left holds values < split
    if (x_prime >= node->value) collectInRange(node->right, x, x_prime, out); // right holds values >= split
}

// Range query: all data in [x, x'], in ascending order (duplicates are kept).
vector<int> rangeSearch(Node* root, int x, int x_prime) {
    vector<int> result;
    if (root == nullptr || x > x_prime) return result;
    collectInRange(root, x, x_prime, result);
    return result;
}

Data structure	Query time
Sorted array	$O(k + \log n)$
BST (balanced)	$O(k + \log n)$
BST with data stored in leaves (balanced)	$O(k + \log n)$

2D Range Search

2D range tree

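Given many points in a 2D plane, each with an x and a y coordinate, we want to report all points inside the region $[x_1,x_2] \times [y_1,y_2]$. The usual approach is to preprocess the points into a tree and then answer queries by searching the point sets stored in the tree. The range tree is a common structure for this kind of orthogonal range searching: build a tree on the points by X coordinate, then, for every node, build a second tree by Y coordinate over the points in that node's subtree.
A 2D range tree is layered: it consists of two levels of balanced binary search trees (for example AVL trees or red-black trees) together with auxiliary structures.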

Tree Build

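Build the X-tree from the points in the plane according to their x coordinates.
Then build a Y-tree for the subtree of every node. For example, the subtree of the root contains 4 points, so its Y-tree contains all 4 points.
The subtrees of the root's two children each contain two points, so each of their Y-trees contains two points.
The X-tree together with all of these Y-trees forms the range tree.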

if t is a node of x-tree:
    t.val:cut value
    t.left,t.right:child
    t.ytree:y-tree
        
if t is a leaf of x-tree:
    t.pt:point
    t.ytree:a y-tree with single point

BuildTree(S) // S: point set
1. if |S| = 1, return leaf t where
    t.pt is the single point of S and t.ytree is a y-tree containing only that point
    
2. let x be the median of the X coordinates of all points in S

3. let L (R) be the subset of S whose X coordinates are no greater than (greater than) x

4. return node t where
    1. t.val = x
    2. t.left = BuildTree(L)
    3. t.right = BuildTree(R)
    4. t.ytree = MergeYTree(t.left.ytree, t.right.ytree)

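The point set is first split recursively at the median x coordinate, which builds a balanced binary search tree (the main tree); each node records the split value x and recursively creates its left and right subtrees.
At the same time, each node's ytree is produced by merging the ytrees of its two children, giving a structure that holds all points of that subtree ordered by y (for example a balanced BST or a sorted array) and supports efficient range queries in the y direction.
In the end the main tree handles the x-range and the ytree of each selected node handles the y-range, which makes 2D queries fast.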

Space complexity: $O(n \log n)$; construction time: $O(n \log n)$
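Both bounds can be justified with a short argument. The sketch below assumes that the points are presorted by x (so the median split takes linear time) and that MergeYTree merges two y-sorted lists in linear time, as it does when the y-trees are stored as sorted arrays; with y-trees stored as balanced BSTs the merge step needs more care. Here $T(n)$ denotes the construction time and $S(n)$ the total size of all y-trees, both symbols introduced only for this derivation.

$$
\begin{aligned}
T(n) &= 2\,T(n/2) + O(n), \qquad T(1) = O(1) \\
     &\Rightarrow\; T(n) = O(n \log n), \\
S(n) &= \sum_{i=0}^{\log_2 n} \bigl(\text{total size of the y-trees on level } i\bigr)
      = \sum_{i=0}^{\log_2 n} O(n) = O(n \log n).
\end{aligned}
$$

The second line holds because every point appears exactly once in the y-trees of each level of the main tree, namely in the y-tree of its ancestor on that level.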

Tree Query

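Query(t, rX, rY)
// rX, rY: query ranges in X and Y
// t.range: the interval of X coordinates covered by the subtree of t
    
1. if t is a leaf
    1. if t.pt is inside rX × rY, return t.pt
    2. else return NULL
2. if t.range is inside rX
    return QueryY(t.ytree, rY)
3. else if t.range intersects rX
    return Query(t.left, rX, rY) ∪ Query(t.right, rX, rY)
4. else return NULL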
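Step 1: Recursively narrow the x-range
- The current node is empty, or the x-range [xmin, xmax] covered by its subtree does not intersect [x1, x2]: return 0.
- The x-range of the subtree lies entirely inside [x1, x2]:
  - Every point of the subtree satisfies the x condition, so only the y condition remains.
  - Count the entries of the node's y_list with y ∈ [y1, y2] (Step 2) and return; no further recursion is needed.
- The x-range of the subtree only partially overlaps [x1, x2]:
  - Test the node's own point against both ranges.
  - Recursively query both the left and the right subtree, since either may contain points with x in [x1, x2].
Step 2: Count the y-range of the current node
- Binary-search the y_list of the current node:
  - lower_bound(y1): position of the first y coordinate ≥ y1.
  - upper_bound(y2): position of the first y coordinate > y2.
  - The number of elements in [lower, upper) is the number of matching points.
Step 3: Combine the results
- Add the count for the current node to the recursive results of the left and right subtrees to obtain the final result.

Code

#include <algorithm>
#include <iostream>
#include <vector>

struct Point {
    double x, y;
    Point(double x = 0, double y = 0) : x(x), y(y) {}
};

struct Node;
Node* buildTree(const std::vector<Point>& points); // forward declaration, used by the constructor

struct Node {
    Point pt;                    // the median point, stored at this node
    double xmin, xmax;           // x-range covered by this subtree
    std::vector<double> y_list;  // y coordinates of all points in this subtree, sorted
    Node* left;
    Node* right;

    // Expects a non-empty point set.
    Node(const std::vector<Point>& points) {
        // Sort the points by x coordinate
        std::vector<Point> sorted = points;
        std::sort(sorted.begin(), sorted.end(), [](const Point& a, const Point& b) {
            return a.x < b.x;
        });

        // Use the middle point as the split point
        size_t mid = sorted.size() / 2;
        pt = sorted[mid];
        xmin = sorted.front().x;
        xmax = sorted.back().x;

        // Collect and sort all y coordinates of the subtree.
        // (Sorting here keeps the code short; merging the children's lists
        // instead would avoid the extra log factor in construction time.)
        for (const auto& p : sorted) {
            y_list.push_back(p.y);
        }
        std::sort(y_list.begin(), y_list.end());

        // Recursively build the left and right subtrees
        left = buildTree(std::vector<Point>(sorted.begin(), sorted.begin() + mid));
        right = buildTree(std::vector<Point>(sorted.begin() + mid + 1, sorted.end()));
    }

    ~Node() {
        delete left;
        delete right;
    }
};

// Build function: returns nullptr for an empty point set
Node* buildTree(const std::vector<Point>& points) {
    if (points.empty()) return nullptr;
    return new Node(points);
}

// Query function: number of points with x in [x1, x2] and y in [y1, y2]
int queryRange(const Node* root, double x1, double x2, double y1, double y2) {
    // Empty subtree, or its x-range does not intersect [x1, x2]
    if (!root || x2 < root->xmin || x1 > root->xmax) return 0;

    // The whole subtree lies inside [x1, x2]: count by y with two binary searches
    if (x1 <= root->xmin && root->xmax <= x2) {
        auto lower = std::lower_bound(root->y_list.begin(), root->y_list.end(), y1);
        auto upper = std::upper_bound(root->y_list.begin(), root->y_list.end(), y2);
        return static_cast<int>(upper - lower);
    }

    // Partial overlap: test this node's own point, then recurse into both subtrees
    int count = 0;
    if (x1 <= root->pt.x && root->pt.x <= x2 && y1 <= root->pt.y && root->pt.y <= y2) {
        ++count;
    }
    count += queryRange(root->left, x1, x2, y1, y2);
    count += queryRange(root->right, x1, x2, y1, y2);
    return count;
}

// Example usage
int main() {
    std::vector<Point> points = {
        Point(1, 2),
        Point(2, 1),
        Point(3, 4),
        Point(4, 5),
        Point(5, 6)
    };

    Node* root = buildTree(points);

    // Count the points with x in [2,4] and y in [1,5]
    int result = queryRange(root, 2, 4, 1, 5);
    std::cout << "Number of points found: " << result << std::endl; // prints 3: (2,1), (3,4), (4,5)

    delete root; // free the tree
    return 0;
}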
