Basic Data Structures 1 - Trie (Prefix Tree)


Intro

A trie is like an N-ary tree: each node can have up to N children, one per character of the alphabet


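Properties

  • The root node holds no character; every other node holds exactly one character
  • Concatenating the characters along the path from the root to a node gives the string that node represents
  • All children of any given node hold distinct characters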
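
The three properties above can be seen in a tiny sketch. It uses nested dicts rather than a node class, and the "$" end-of-word key and the helper names build_trie and collect_words are assumptions made for illustration only, not part of the template that follows.

```python
def build_trie(words):
    """Build a nested-dict trie; the key "$" (an arbitrary marker) flags a word end."""
    root = {}  # the root holds no character of its own
    for word in words:
        node = root
        for ch in word:
            # dict keys are unique, so siblings always hold distinct characters
            node = node.setdefault(ch, {})
        node["$"] = True
    return root


def collect_words(node, prefix=""):
    """Yield every stored word by joining the characters along each root-to-node path."""
    if "$" in node:
        yield prefix
    for ch, child in node.items():
        if ch != "$":
            yield from collect_words(child, prefix + ch)


trie = build_trie(["tea", "ten", "to", "in", "inn"])
print(sorted(trie))                  # ['i', 't']
print(sorted(collect_words(trie)))   # ['in', 'inn', 'tea', 'ten', 'to']
```

Words that share a prefix ("tea", "ten", "to") share the same path from the root, which is why the root has only two children here.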

Template

class TrieNode:
    def __init__(self):
        self.children = [None] * 26  # list of TrieNode
        self.is_word = False  # terminates the word


class Trie:
    def __init__(self):
        self.root = TrieNode()
    
    def insert(self, word):
        node = self.root
        for c in word:
            if not node.children[ord(c) - ord('a')]:
                node.children[ord(c) - ord('a')] = TrieNode()
            node = node.children[ord(c) - ord('a')]
        node.is_word = True  # mark the end of the word
    
    def search(self, word):
        node = self.root
        for c in word:
            if not node.children[ord(c) - ord('a')]:
                return False
            node = node.children[ord(c) - ord('a')]
        return node.is_word
    
    def startsWith(self, prefix):
        node = self.root
        for c in prefix:
            if not node.children[ord(c) - ord('a')]:
                return False
            node = node.children[ord(c) - ord('a')]
        return True
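
The template above indexes children with ord(c) - ord('a'), so it only works for lowercase a-z. As a hedged alternative, here is a minimal sketch of the same interface with dict children, which accepts any character. The names DictTrieNode, DictTrie and _walk are hypothetical and exist only for this example.

```python
class DictTrieNode:
    def __init__(self):
        self.children = {}    # char -> DictTrieNode
        self.is_word = False  # terminates the word


class DictTrie:
    def __init__(self):
        self.root = DictTrieNode()

    def insert(self, word):
        node = self.root
        for c in word:
            if c not in node.children:
                node.children[c] = DictTrieNode()
            node = node.children[c]
        node.is_word = True

    def _walk(self, s):
        """Follow s from the root; return the final node, or None if the path breaks."""
        node = self.root
        for c in s:
            node = node.children.get(c)
            if node is None:
                return None
        return node

    def search(self, word):
        node = self._walk(word)
        return node is not None and node.is_word

    def startsWith(self, prefix):
        return self._walk(prefix) is not None


trie = DictTrie()
trie.insert("apple")
print(trie.search("apple"))    # True
print(trie.search("app"))      # False: "app" is only a prefix
print(trie.startsWith("app"))  # True
trie.insert("App-1")           # uppercase and symbols are fine with dict children
print(trie.search("App-1"))    # True
```

The fixed-size list is usually faster for a known small alphabet; the dict trades a little speed for flexibility and lower memory on sparse nodes.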

211. Design Add and Search Words Data Structure (Medium)


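Solution:

  • Search recursively with DFS: when word[idx] == '.', the wildcard matches any letter, so recurse into every existing child with idx + 1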

Code:

class TrieNode:
    def __init__(self):
        self.children = [None] * 26  # list of TrieNode
        self.is_word = False  # terminates the word


class WordDictionary:
    
    def __init__(self):
        self.root = TrieNode()
    
    def addWord(self, word: str) -> None:
        node = self.root
        for c in word:
            if not node.children[ord(c) - ord('a')]:
                node.children[ord(c) - ord('a')] = TrieNode()
            node = node.children[ord(c) - ord('a')]
        node.is_word = True  # mark the end of the word
    
    def search(self, word: str) -> bool:
        def searchHelper(node, word, idx):
            if idx == len(word):
                return node.is_word
            c = word[idx]
            if c != '.':
                child = node.children[ord(c) - ord('a')]
                return child is not None and searchHelper(child, word, idx + 1)
            for child in node.children:
                if child and searchHelper(child, word, idx + 1):
                    return True
            return False
        
        return searchHelper(self.root, word, 0)
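
For comparison, the same wildcard matching can be done level by level instead of recursively. This is a different technique from the DFS above (a BFS-style frontier of candidate nodes), sketched under the assumption that words contain only lowercase letters. The nested-dict trie, the "$" end marker and the function names are illustrative, not part of the original solution.

```python
def add_word(root, word):
    """Insert word into a nested-dict trie; "$" marks the end of a word."""
    node = root
    for ch in word:
        node = node.setdefault(ch, {})
    node["$"] = True


def search(root, pattern):
    """Match pattern against the trie, where '.' stands for any single letter."""
    frontier = [root]  # all trie nodes reachable after consuming the pattern so far
    for ch in pattern:
        next_frontier = []
        for node in frontier:
            if ch == ".":
                # wildcard: every real child is a candidate
                next_frontier.extend(child for key, child in node.items() if key != "$")
            elif ch in node:
                next_frontier.append(node[ch])
        if not next_frontier:
            return False
        frontier = next_frontier
    return any("$" in node for node in frontier)


root = {}
for w in ("bad", "dad", "mad"):
    add_word(root, w)

print(search(root, "pad"))  # False
print(search(root, "bad"))  # True
print(search(root, ".ad"))  # True
print(search(root, "b.."))  # True
```

The frontier version avoids deep recursion, while the DFS can stop as soon as one branch succeeds.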


212. Word Search II (Hard)


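Solution: Trie + DFS

  • dfs(i, j, path): with the current path fixed, start from (i, j) and expand in all four directions by brute force, collecting whichever entries of words are found (in the code this is backtrace(trie, r, c), with path kept in an outer list)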

Code:

from typing import List


class Trie:
    def __init__(self):
        self.child = [None for _ in range(26)]
        self.is_word = False
    
    def insert(self, word: str) -> None:
        rt = self
        for c in word:
            ID = ord(c) - ord('a')
            if rt.child[ID] is None:
                rt.child[ID] = Trie()
            rt = rt.child[ID]
        rt.is_word = True


class Solution:
    def findWords(self, board: List[List[str]], words: List[str]) -> List[str]:
        # Backtracking
        def backtrace(trie, r: int, c: int) -> None:
            # trie: current node in the trie; r, c: coordinates on the board
            ID = ord(board[r][c]) - ord('a')
            if trie.child[ID] is None:
                return  # no word continues with this letter: prune

            path.append(board[r][c])        # borrow: extend the current path
            visited[r][c] = True            # mark the cell as in use

            child_trie = trie.child[ID]
            if child_trie.is_word:
                res_set.add(''.join(path))

            for nr, nc in [(r-1, c), (r+1, c), (r, c-1), (r, c+1)]:
                if 0 <= nr < Row and 0 <= nc < Col and not visited[nr][nc]:
                    backtrace(child_trie, nr, nc)

            path.pop()                      # give back: undo the borrow when backtracking
            visited[r][c] = False           # unmark the cell

        Row = len(board)
        Col = len(board[0])

        T = Trie()
        for word in words:
            T.insert(word)

        res_set = set()                     # a set, so each word is reported once
        path = []                                                   # characters on the current path
        visited = [[False for _ in range(Col)] for _ in range(Row)] # cells on the current path
        for r in range(Row):
            for c in range(Col):
                backtrace(T, r, c)
        return list(res_set)
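
To try the idea on a concrete board, here is a compact, self-contained sketch of the same Trie + DFS approach. It differs from the code above in two ways that are my own assumptions for brevity: the whole word is stored at its end node under a hypothetical "$" key (so no path list is needed), and visited cells are marked in place with "#" instead of a separate visited matrix. It assumes the board contains only lowercase letters.

```python
def find_words(board, words):
    # Build a nested-dict trie; the end node of each word stores the word itself.
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$"] = word

    rows, cols = len(board), len(board[0])
    found = set()

    def dfs(node, r, c):
        ch = board[r][c]
        child = node.get(ch)
        if child is None:
            return  # no word continues with this letter
        if "$" in child:
            found.add(child["$"])
        board[r][c] = "#"  # mark the cell as used on the current path
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and board[nr][nc] != "#":
                dfs(child, nr, nc)
        board[r][c] = ch  # restore the cell when backtracking

    for r in range(rows):
        for c in range(cols):
            dfs(root, r, c)
    return sorted(found)


board = [
    ["o", "a", "a", "n"],
    ["e", "t", "a", "e"],
    ["i", "h", "k", "r"],
    ["i", "f", "l", "v"],
]
print(find_words(board, ["oath", "pea", "eat", "rain"]))  # ['eat', 'oath']
```

Marking in place saves memory but temporarily mutates the input board, so the separate visited matrix is the safer choice when the board must stay read-only.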


Bitwise Trie

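  • A bitwise trie stores the binary form of each number in nums, one bit per level, from the most significant bit down
  • Each node has 2 branches, one for bit 0 and one for bit 1
  • A bitwise trie makes it easy to see where the binary forms of two numbers diverge: they share a path while their bits agree and split at the first bit that differs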
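
As a small illustration of the points above, the sketch below inserts a few numbers into a nested-dict bitwise trie using a 5-bit width. The helper names bit_path and insert and the choice of 5 bits are assumptions made to keep the output readable.

```python
def bit_path(num, width):
    """Bits of num from most significant to least significant, padded to width."""
    return [(num >> i) & 1 for i in range(width - 1, -1, -1)]


def insert(root, num, width):
    """Walk or create one trie level per bit; each node has at most keys 0 and 1."""
    node = root
    for bit in bit_path(num, width):
        node = node.setdefault(bit, {})


root = {}
for n in (3, 10, 5, 25):
    insert(root, n, width=5)

print(bit_path(3, 5))    # [0, 0, 0, 1, 1]
print(bit_path(5, 5))    # [0, 0, 1, 0, 1]
print(bit_path(25, 5))   # [1, 1, 0, 0, 1]
print(sorted(root[0][0]))  # [0, 1]: 3 and 5 share the prefix 00, then diverge
```

The level at which two numbers diverge is the highest set bit of their XOR, which is what the next problem exploits.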


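421. Maximum XOR of Two Numbers in an Array (Medium)

Solution:

  • First insert every number into the bitwise trie: O(N), since each number takes a fixed 32 steps
  • Then for each number nums[i], walk down from the most significant bit and take the opposite bit whenever that branch exists; this yields the largest XOR that nums[i] can form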

Code:

from typing import List


class Solution:
    def findMaximumXOR(self, nums: List[int]) -> int:
        if not nums:
            return 0
        # Build the bitwise trie (prefix tree)
        root = {}
        for num in nums:
            cur = root
            for i in range(31, -1, -1):
                cur_bit = (num >> i) & 1
                cur.setdefault(cur_bit, {})
                cur = cur[cur_bit]

        res = 0
        # Walk bit by bit, preferring the opposite bit, to find the maximum
        for num in nums:
            cur = root
            cur_max = 0
            for i in range(31, -1, -1):
                cur_bit = (num >> i) & 1
                if cur_bit ^ 1 in cur:
                    cur_max += (1 << i)
                    cur = cur[cur_bit ^ 1]
                else:
                    cur = cur[cur_bit]
            res = max(res, cur_max)
        return res
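
A quick way to gain confidence in the greedy walk is to compare it with an O(N^2) brute force on small inputs. The sketch below restates the trie approach as a standalone function for that purpose. The names max_xor_trie and max_xor_brute and the width parameter are hypothetical, and it assumes non-negative integers that fit in width bits.

```python
from itertools import combinations


def max_xor_trie(nums, width=32):
    """Greedy maximum XOR using a nested-dict bitwise trie."""
    root = {}
    for num in nums:
        node = root
        for i in range(width - 1, -1, -1):
            node = node.setdefault((num >> i) & 1, {})

    best = 0
    for num in nums:
        node, cur = root, 0
        for i in range(width - 1, -1, -1):
            bit = (num >> i) & 1
            if (bit ^ 1) in node:
                cur |= 1 << i        # the opposite bit exists: this XOR bit becomes 1
                node = node[bit ^ 1]
            else:
                node = node[bit]     # forced to follow the same bit: XOR bit stays 0
        best = max(best, cur)
    return best


def max_xor_brute(nums):
    """Reference answer: try every pair."""
    return max((a ^ b for a, b in combinations(nums, 2)), default=0)


nums = [3, 10, 5, 25, 2, 8]
print(max_xor_trie(nums))   # 28, from 5 ^ 25
print(max_xor_brute(nums))  # 28
```

Every root-to-leaf path has exactly width levels, so the else branch can always follow the number's own bit without a missing-key error.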


❤️ 472. Concatenated Words (Hard)


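Solution: Trie + DFS

  • First put the words into a trie, then check for each word whether it can be split into several words inside the trie
    • Only the short words need to go into the trie, because a longer concatenated word is itself made up of these shorter words that cannot be split further
    • dfs(word, idx): walk the word recursively, looking in the trie for a valid split of word[idx:]; whenever an end-of-word marker is reached, try to split there
      • word[idx:] has a valid split = some position i ends a word (word[idx:i+1] is in the trie) and word[i+1:] also has a valid split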

Code:

from functools import lru_cache
from typing import List


class TrieNode:
    def __init__(self):
        self.children = [None] * 26  # list of TrieNode
        self.is_word = False  # terminates the word
    
    def insert(self, word):
        root = self
        for char in word:
            index = ord(char) - ord('a')
            if root.children[index] is None:
                root.children[index] = TrieNode()
            root = root.children[index]
        root.is_word = True


class Solution:
    def __init__(self):
        self.root = TrieNode()
    
    def findAllConcatenatedWordsInADict(self, words: List[str]) -> List[str]:
        @lru_cache(None)
        def dfs(word, idx) -> bool:
            if idx == len(word):
                return True
            node = self.root
            for i in range(idx, len(word)):
                order = ord(word[i]) - ord('a')
                if not node.children[order]:
                    return False
                else:
                    node = node.children[order]
                    if node.is_word and dfs(word, i + 1):
                        return True
            return False
        
        words.sort(key=lambda x: len(x))
        res = []
        for word in words:
            if not word:
                continue
            elif dfs(word, 0):
                res.append(word)
            else:
                self.root.insert(word)
        return res
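
The recurrence behind dfs can also be checked without a trie. The sketch below swaps the trie for a plain set of shorter words and fills a bottom-up table instead of recursing, so it is a different implementation of the same idea, not the solution above. The names can_split and find_concatenated are hypothetical, and the input is assumed to contain no duplicate words.

```python
def can_split(word, pieces):
    """True if word can be written as one or more strings from pieces, back to back."""
    n = len(word)
    ok = [False] * (n + 1)
    ok[n] = True  # the empty suffix is trivially split
    for idx in range(n - 1, -1, -1):
        # word[idx:] splits if some word[idx:i+1] is a piece and word[i+1:] splits
        ok[idx] = any(word[idx:i + 1] in pieces and ok[i + 1] for i in range(idx, n))
    return ok[0]


def find_concatenated(words):
    pieces, result = set(), []
    for word in sorted(words, key=len):  # shorter words are processed first
        if not word:
            continue
        if can_split(word, pieces):
            result.append(word)          # built from shorter words: report it
        else:
            pieces.add(word)             # cannot be split: keep it as a building block
    return result


words = ["cat", "cats", "catsdogcats", "dog", "dogcatsdog",
         "hippopotamuses", "rat", "ratcatdogcat"]
print(find_concatenated(words))
# ['dogcatsdog', 'catsdogcats', 'ratcatdogcat']
```

Because a word is tested before it is added to pieces, it can never match as a single piece, so any successful split uses at least two shorter words. The trie version gains its speed by abandoning a prefix as soon as no stored word continues it, whereas this sketch slices and hashes every candidate substring.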

