2.3节后习题 - 高级应用

一、算法实现题

1. 最长异或路径

给定一棵 n 个点的带权树，结点下标从 1 开始到 n。寻找树中找两个结点，求最长的异或路径。（注：异或路径指的是指两个结点之间唯一路径上的所有边权的异或。）

完整解法

cpp

#include <vector>
#include <algorithm>
using namespace std;

const int MAX_BIT = 30; // 处理30位整数

struct TreeNode {
    int to;
    int weight;
    TreeNode(int t, int w) : to(t), weight(w) {}
};

struct BinaryTrieNode {
    BinaryTrieNode* children[2];
    BinaryTrieNode() : children{nullptr, nullptr} {}
};

class BinaryTrie {
private:
    BinaryTrieNode* root;
public:
    BinaryTrie() : root(new BinaryTrieNode()) {}

    void insert(int num) {
        BinaryTrieNode* node = root;
        for (int i = MAX_BIT; i >= 0; i--) {
            int bit = (num >> i) & 1;
            if (!node->children[bit]) {
                node->children[bit] = new BinaryTrieNode();
            }
            node = node->children[bit];
        }
    }

    int queryMaxXor(int num) {
        BinaryTrieNode* node = root;
        int max_xor = 0;
        for (int i = MAX_BIT; i >= 0; i--) {
            int bit = (num >> i) & 1;
            int target = 1 - bit; // 贪心选择相反的位
            if (node->children[target]) {
                max_xor |= (1 << i);
                node = node->children[target];
            } else {
                node = node->children[bit];
            }
        }
        return max_xor;
    }
};

void dfs(int u, int parent, int current_xor, 
         vector<vector<TreeNode>>& adj, vector<int>& xor_path) {
    xor_path[u] = current_xor;
    for (auto& node : adj[u]) {
        if (node.to != parent) {
            dfs(node.to, u, current_xor ^ node.weight, adj, xor_path);
        }
    }
}

int main() {
    int n;
    scanf("%d", &n);
    vector<vector<TreeNode>> adj(n + 1);
    
    // 读取树的边
    for (int i = 0; i < n - 1; ++i) {
        int u, v, w;
        scanf("%d %d %d", &u, &v, &w);
        adj[u].emplace_back(v, w);
        adj[v].emplace_back(u, w);
    }

    // 计算从根节点到每个节点的异或值
    vector<int> xor_path(n + 1, 0);
    dfs(1, -1, 0, adj, xor_path);

    // 使用字典树求最大异或值
    BinaryTrie trie;
    int max_xor = 0;
    trie.insert(xor_path[1]);
    
    for (int i = 2; i <= n; ++i) {
        max_xor = max(max_xor, trie.queryMaxXor(xor_path[i]));
        trie.insert(xor_path[i]);
    }
    
    printf("%d\n", max_xor);
    return 0;
}

算法思路：

构建树结构：使用邻接表存储树的边和权重
DFS计算异或路径：从根节点出发，计算每个节点到根的异或值
二进制字典树优化查询：利用字典树高效计算两个数的最大异或值

关键技巧：

将树上路径问题转换为根到各节点的路径问题
利用异或的性质：path(u,v) = path(root,u) ⊕ path(root,v)
字典树贪心策略：每位都选择与当前位相反的分支

2. 字典树高级实现

给定一个字典树和一系列单词，编写一个程序来将这些单词插入到字典树中。假设字典树中的每个节点存储一个字符，并且每个节点有一个布尔值 isEnd 来标记单词的结尾。

通用字典树实现

cpp

#include <string>
#include <unordered_map>
using namespace std;

class Trie {
private:
    struct TrieNode {
        char ch;
        bool isEnd;
        unordered_map<char, TrieNode*> children;
        TrieNode(char c) : ch(c), isEnd(false) {}
    };
    TrieNode* root;

public:
    Trie() : root(new TrieNode('\0')) {} // 根节点存储空字符

    void insert(string s) {
        TrieNode* current = root;
        for (char c : s) {
            // 检查字符对应的子节点是否存在
            if (current->children.find(c) == current->children.end()) {
                // 创建新子节点并链接
                current->children[c] = new TrieNode(c);
            }
            current = current->children[c];
        }
        current->isEnd = true; // 标记单词结尾
    }
    
    bool search(string word) {
        TrieNode* current = root;
        for (char c : word) {
            if (current->children.find(c) == current->children.end()) {
                return false;
            }
            current = current->children[c];
        }
        return current->isEnd;
    }
    
    bool startsWith(string prefix) {
        TrieNode* current = root;
        for (char c : prefix) {
            if (current->children.find(c) == current->children.end()) {
                return false;
            }
            current = current->children[c];
        }
        return true;
    }
};

实现特点：

节点结构：显式存储字符 ch，通过哈希表管理子节点
动态扩展：支持任意字符集，不限于英文字母
完整功能：包含插入、搜索、前缀匹配功能

时间复杂度：O(L)，其中 L 为单词长度 空间复杂度：O(ALPHABET_SIZE × N × M)，N为单词数，M为平均长度

二、综合分析题

3. 数据结构选择

对于需要支持快速前缀匹配的应用场景，以下哪种数据结构最适合？

A哈希表

B字典树(Trie)

C二叉搜索树

D堆

4. 算法复杂度分析

分析字典树插入操作的时间和空间复杂度，并与其他数据结构比较。

复杂度分析

字典树复杂度：

时间复杂度：O(L)，L为字符串长度
空间复杂度：O(ALPHABET_SIZE × N × M)

与其他结构对比：

数据结构	插入时间	查找时间	前缀匹配	空间效率
字典树	O(L)	O(L)	O(P)	较低
哈希表	O(L)	O(L)	O(N×L)	较高
二叉搜索树	O(L×logN)	O(L×logN)	O(N×L)	中等

字典树优势：

前缀匹配时间复杂度最优
支持模糊匹配和自动补全
插入和查找时间稳定

字典树劣势：

空间开销大，特别是稀疏字典
缓存局部性较差
实现相对复杂

5. 实际应用场景

请列举字典树在实际开发中的应用场景，并说明其优势。

应用场景

1. 搜索引擎自动补全

场景：用户输入搜索关键词时的实时提示
优势：快速前缀匹配，支持增量搜索

2. 拼写检查器

场景：文本编辑器的拼写纠错功能
优势：快速判断单词是否存在，支持相似词匹配

3. IP路由表

场景：网络路由器的地址匹配
优势：最长前缀匹配，路由查找效率高

4. 电话号码匹配

场景：电话系统的号码验证和路由
优势：前缀匹配，支持区号识别

5. 词典和翻译系统

场景：在线词典的词汇查找
优势：支持模糊匹配和词汇联想

6. 代码补全

场景：IDE的智能代码提示
优势：变量名和函数名的快速匹配

三、设计题

6. 优化字典树

设计一个内存优化的字典树，减少空间占用同时保持查询效率。

优化方案

1. 压缩字典树 (Compressed Trie)

cpp

struct CompressedTrieNode {
    string edgeLabel;  // 边上的字符串
    bool isEnd;
    unordered_map<char, CompressedTrieNode*> children;
    
    CompressedTrieNode(string label = "") : 
        edgeLabel(label), isEnd(false) {}
};

2. 数组存储优化

cpp

// 对于已知字符集，使用数组替代哈希表
struct ArrayTrieNode {
    bool isEnd;
    ArrayTrieNode* children[26];  // 仅支持小写字母
    
    ArrayTrieNode() : isEnd(false) {
        memset(children, 0, sizeof(children));
    }
};

3. 双数组Trie (Double Array Trie)

使用两个数组 base[] 和 check[] 存储状态转移
大幅减少内存占用，提高缓存局部性

优化效果对比：

压缩字典树：减少单链节点，节省30-50%空间
数组存储：提高访问速度，减少指针开销
双数组Trie：内存占用最小，但实现复杂度高

知识点总结

高级数据结构

字典树 (Trie)：前缀匹配的专用结构
二进制字典树：位运算优化，异或问题的利器
压缩字典树：空间优化的高级技巧

算法设计思想

分治策略：树形DP和路径分解
贪心算法：二进制字典树的位选择策略
数据结构转换：问题建模的重要技巧

优化技术

时间优化：合理选择数据结构
空间优化：压缩存储和数组替代
实现优化：缓存友好的内存布局

实际应用

网络路由和IP匹配
搜索引擎和自动补全
编译器和代码分析
生物信息学序列匹配

这些高级应用展示了树形数据结构在解决复杂问题中的重要作用，掌握这些技巧对算法竞赛和实际开发都有很大帮助。

2.3节后习题 - 高级应用 ​

一、算法实现题 ​

1. 最长异或路径 ​

2. 字典树高级实现 ​

二、综合分析题 ​

3. 数据结构选择 ​

4. 算法复杂度分析 ​

5. 实际应用场景 ​

三、设计题 ​

6. 优化字典树 ​

知识点总结 ​

高级数据结构 ​

算法设计思想 ​

优化技术 ​

实际应用 ​