字典树 (Trie)

🧮 字典树 (Trie)

难度 / Difficulty: 中等 (Medium)
分类 / Category: 字符串数据结构 (String Data Structure)
时间复杂度 / Time Complexity: O(m)（插入/查询，m为字符串长度）
空间复杂度 / Space Complexity: O(Σm·L)（字符数 × 平均长度）

📖 算法简介 / Introduction

字典树（Trie，源自 Retrieval 一词）是一种树形数据结构，用于高效存储和检索字符串集合。字典树的核心思想是通过共享前缀来节省存储空间，特别擅长处理前缀相关的查询操作。

Trie (pronounced “try”) is a tree-based data structure used to efficiently store and retrieve a dynamic set of strings. The core idea is to share prefixes among all strings, making it especially efficient for prefix-based queries.

💡 算法原理 / Principle

字典树的基本结构：

节点结构：每个节点包含子节点映射（通常用数组或哈希表）和标记位（是否是一个完整单词）
共享前缀：所有字符串的公共前缀被合并为同一条路径
插入过程：从根开始，沿路径遍历或创建子节点，直到字符串结束
查询过程：从根开始，沿着字符路径向下查找，找到返回 True，未找到返回 False

核心思想：Use a tree where each node represents a single character, and paths from root to leaves represent complete words. Common prefixes are stored only once, dramatically reducing storage for large string collections.

🔑 关键概念 / Key Concepts

概念	说明
前缀共享 (Prefix Sharing)	相同前缀的字符串共享节点，节省空间
词尾标记 (End of Word)	标记哪些节点代表完整单词的结尾
路径压缩	可选优化，合并只有一个子节点的链
字符映射	数组（固定26）或哈希表（动态）实现
前缀计数	在每个节点记录经过的单词数量

📝 代码实现 / Implementation

class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_end = False
        self.word_count = 0  # 以此节点结尾的单词数

class Trie:
    def __init__(self):
        self.root = TrieNode()
    
    def insert(self, word):
        node = self.root
        for char in word:
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]
            node.word_count += 1
        node.is_end = True
    
    def search(self, word):
        node = self._find_node(word)
        return node is not None and node.is_end
    
    def startsWith(self, prefix):
        return self._find_node(prefix) is not None
    
    def _find_node(self, prefix):
        node = self.root
        for char in prefix:
            if char not in node.children:
                return None
            node = node.children[char]
        return node
    
    def prefix_count(self, prefix):
        node = self._find_node(prefix)
        return node.word_count if node else 0
    
    def delete(self, word):
        def _delete(node, word, depth):
            if depth == len(word):
                if not node.is_end:
                    return False
                node.is_end = False
                return len(node.children) == 0
            
            char = word[depth]
            if char not in node.children:
                return False
            
            should_delete_child = _delete(node.children[char], word, depth + 1)
            if should_delete_child:
                del node.children[char]
                return len(node.children) == 0 and not node.is_end
            return False
        
        _delete(self.root, word, 0)

# 示例
trie = Trie()
words = ["apple", "app", "application", "apply", "apt", "bat", "ball"]
for w in words:
    trie.insert(w)

print(trie.search("app"))          # True
print(trie.search("application")) # True
print(trie.startsWith("ap"))      # True
print(trie.startsWith("ba"))      # True
print(trie.prefix_count("ap"))    # 4

✨ 示例 / Example

插入 ["apple", "app", "bat", "bad"] 后的 Trie 结构：

root
├── a
│   └── p
│       ├── p  (is_end=True, word="app")
│       │   └── l
│       │       └── e (is_end=True, word="apple")
│       └── (new branch for "aplication" if added)
├── b
│   ├── a
│   │   ├── t  (is_end=True, word="bat")
│   │   └── d  (is_end=True, word="bad")

🎯 应用场景 / Applications

前缀搜索 — 搜索引擎自动补全、输入法联想
IP路由 — Longest Prefix Match (LPM)
字符串统计 — 词频统计、前缀计数
拼写检查 — 快速查找词典中是否存在
电话号码簿 — 按姓名前缀快速检索
DNA序列匹配 — 生物信息学中的模式匹配

📊 复杂度分析 / Complexity Analysis

操作	时间复杂度	空间复杂度
插入	O(m)	O(m) 最坏
查找	O(m)	O(1)
前缀查询	O(m)	O(1)
删除	O(m)	O(m) 释放

m 为字符串长度

🏠 实战例题 / Practice Problems

*本文由 AI 自动生成

Generated by AI*