A Brief Look at Dynamic Routing: Prefix Trees and Regular Expression Matching

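Routing is a concept you cannot avoid in web development: both back ends and front ends rely on routes to bind handlers. In practice, route matching usually means dynamic route matching, where a single rule matches a whole class of URLs rather than one fixed path. For example, /detail/:id matches /detail/1, /detail/2, and any other URL of that shape.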

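There are many ways to implement dynamic routing, and they differ in the rules they support and in performance. Front-end routers are mostly based on regular-expression matching, and libraries such as react-router and vue-router have relied on path-to-regexp to parse routes. On the back end, @koa/router depends on path-to-regexp as well.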

Besides regex matching, another common approach is the prefix tree, which is what the routing module of the Gin framework uses.

Next, we implement a working router with each of the two approaches. The router supports two features:

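Parameter matching: for example, /detail/:id matches /detail/1 and /detail/2, and the value bound to id is available as params
Wildcard *: for example, /home/* matches any route whose path has /home as its prefix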

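Only the code needed to show the principle is implemented, so edge cases are ignored. Each approach can be implemented in many ways, and this article shows just one of them.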

Prefix Tree (Trie)

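A trie, also called a prefix tree or dictionary tree, is a rooted tree in which all descendants of a node share the prefix that leads to that node. For example, a single prefix tree can represent the following routes: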

/home
/detail/1
/about/user/center
/about/product
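
To make the shared prefixes concrete, here is a minimal sketch that nests those four routes segment by segment. The names PrefixTree and buildPrefixTree are illustrative assumptions and are not part of the router built later in this article.

```typescript
// Hypothetical helper type: each key is one path segment, each value its subtree.
type PrefixTree = { [segment: string]: PrefixTree }

const buildPrefixTree = (routes: string[]): PrefixTree => {
  const tree: PrefixTree = {}
  for (const route of routes) {
    let node = tree
    // '/about/user/center' -> ['about', 'user', 'center']
    for (const segment of route.split('/').slice(1)) {
      // Create the child only once, so routes with a common prefix share it
      if (!Object.prototype.hasOwnProperty.call(node, segment)) {
        node[segment] = {}
      }
      node = node[segment]
    }
  }
  return tree
}

const routeTree = buildPrefixTree(['/home', '/detail/1', '/about/user/center', '/about/product'])
console.log(JSON.stringify(routeTree))
// {"home":{},"detail":{"1":{}},"about":{"user":{"center":{}},"product":{}}}
```

The two /about routes end up under a single about node, which is the property the router below relies on.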

The program is expected to register routes like this:

router.get('/home', () => {
  console.log('【home】');
});
router.post('/detail/:id',  (ctx: { params: Record<string, string> }) => {
  console.log('【/detail/:id】', ctx.params);
});

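First define a router class with two properties: roots holds the root node of each method's tree, and handlers holds the functions to run when a route is matched.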

type THandleFunc = (ctx: { params: Record<string, string> }) => void
type TMethods = 'get' | 'post'

class TrieRouter {
  roots: { [key in TMethods]?: TrieNode } = {}
  handlers: Record<string, THandleFunc> = {}
}

TrieNode represents a route node:

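class TrieNode {
  /** The registered route rule, e.g. /detail/:id (set only on end nodes) */
  pathRule: string = ''
  /** The path segment this node matches, i.e. one part of the full path */
  part: string = ''
  children: TrieNode[] = []
  /** Whether a registered route ends at this node */
  isEnd: boolean = false
}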

The prefix tree needs two methods: insert, which inserts (registers) a route, and search, which looks up (matches) a route.

const oneSymbol = ':'
const muitipleSymbol = '*'
const pathSplitSymbol = '/'

class TrieNode {
  // code omitted...
  /**
   * Insert a route into the tree
   * @param pathRule the route rule, e.g. /detail/:id
   */
  insert(this: TrieNode, pathRule: string) {
    const fullParts = pathRule.split(pathSplitSymbol).slice(1)
    const len = fullParts.length;
    let part = ''
    let child: TrieNode | null
    let root = this
    for (let i = 0; i < len; i++) {
      part = fullParts[i]
      child = root.findChild(part)
      if (!child) {
        child = new TrieNode()
        if (part[0] === oneSymbol) {
          child.part = oneSymbol
        } else if (part[0] === muitipleSymbol) {
          child.part = muitipleSymbol
        } else {
          child.part = part
        }
        root.children.push(child)
      } else if (child.part === muitipleSymbol) {
        // a wildcard here already matches everything, so there is nothing more to insert
        return
      }
      root = child
    }
    if (!root.pathRule) {
      root.pathRule = pathRule
    }
    root.isEnd = true
  }
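  /**
   * Find the single child that stores this segment. The segment is normalized
   * the same way insert stores it (':id' -> ':', '*' -> '*'), so a literal
   * segment is never filed under an existing parameter or wildcard node.
   */
  findChild(partPath: string) {
    const key = partPath[0] === oneSymbol ? oneSymbol : partPath[0] === muitipleSymbol ? muitipleSymbol : partPath
    return this.children.find(c => c.part === key) || null
  }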
}

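The comments cover the details, so they are not repeated here. The method builds the prefix tree level by level from the route rule. A given rule has exactly one path through the tree, so at each level we find (or create) one node and then descend into it.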

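class TrieNode {
  // code omitted...
  search(fullParts: string[]): TrieNode | null {
    // A route ends here and either the path is used up or this is a wildcard,
    // which accepts whatever is left of the path
    if (this.isEnd && (fullParts.length === 0 || this.part === muitipleSymbol)) {
      return this
    }
    // Path used up without reaching an end node: this branch does not match
    if (fullParts.length === 0) {
      return null
    }
    const children = this.findChildren(fullParts[0])
    const childrenLen = children.length
    let child: TrieNode | null = null
    // Try every candidate in turn and return the first branch that matches to the end
    for (let i = 0; i < childrenLen; i++) {
      child = children[i].search(fullParts.slice(1))
      if (child) {
        return child
      }
    }
    return null
  }
  /** Find all children that can match partPath: literal, parameter, or wildcard */
  findChildren(partPath: string) {
    return this.children.filter(c => c.part === partPath || c.part === oneSymbol || c.part === muitipleSymbol)
  }
}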

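Searching walks the tree built above level by level, looking for a route that satisfies the path. Because matching goes by prefix, several children may match at a given level, so the search has to descend into each candidate in turn until one of them matches all the way to the end.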
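
The need to try several candidates can be seen in isolation with a minimal sketch. It is a stripped-down stand-in for the TrieNode class, assuming only literal and :parameter segments; MiniNode, buildMiniTrie, and findRule are hypothetical names introduced for this illustration.

```typescript
// Hypothetical miniature trie: children are keyed by segment, ':' stands for any parameter.
type MiniNode = { children: Map<string, MiniNode>; rule?: string }

const buildMiniTrie = (rules: string[]): MiniNode => {
  const root: MiniNode = { children: new Map() }
  for (const rule of rules) {
    let node = root
    for (const segment of rule.split('/').slice(1)) {
      const key = segment[0] === ':' ? ':' : segment
      let child = node.children.get(key)
      if (!child) {
        child = { children: new Map() }
        node.children.set(key, child)
      }
      node = child
    }
    node.rule = rule
  }
  return root
}

const findRule = (node: MiniNode, segments: string[]): string | null => {
  if (segments.length === 0) return node.rule ?? null
  const [head, ...rest] = segments
  // Two candidates per level: the literal child first, then the parameter child
  for (const key of [head, ':']) {
    const child = node.children.get(key)
    const hit = child ? findRule(child, rest) : null
    if (hit) return hit
  }
  return null
}

const miniTrie = buildMiniTrie(['/home/about', '/:status/info'])
// 'home' matches the literal branch first, but that branch has no 'info' child,
// so the search backs up and succeeds through the ':' branch instead.
console.log(findRule(miniTrie, ['home', 'info'])) // /:status/info
```

Stopping at the first matching child would miss this route, which is why search iterates over every candidate returned by findChildren.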

That is essentially all there is to the tree nodes. Next, let's look at how the router drives them from the outside.

const methods: TMethods[] = ['get', 'post']
const getPathKey = (method: TMethods, fullPath: string) => {
  return method + fullPath
}

methods.forEach(method => {
  TrieRouter.prototype[method] = function(pathRule: string, handleFunc: THandleFunc) {
    if (!this.roots[method]) {
      this.roots[method] = new TrieNode()
    }
    this.roots[method]!.insert(pathRule)
    this.handlers[getPathKey(method, pathRule)] = handleFunc
  }
})

When router.get('/xx', xxx) or router.post('/xx', xxx) is called, the router runs its registration method, which groups routes by method and builds the prefix tree for that method.

Route matching is done by calling the match method:

class TrieRouter {
  roots: { [key in TMethods]?: TrieNode } = {}
  handlers: Record<string, THandleFunc> = {}
  match(method: TMethods, fullPath: string) {
    const root = this.roots[method];
    if (!root) {
      console.warn("unmatched method: " + method)
      return
    }
    const fullParts = fullPath.split(pathSplitSymbol).slice(1)
    const active = root.search(fullParts)
    if (!active) {
      console.warn('unmatched route:' + fullPath)
      return
    }
    if (active) {
      const activeFullParts = active.pathRule.split(pathSplitSymbol).slice(1)
      const params: Record<string, string> = {}
      activeFullParts.forEach((p, index) => {
        if (p[0] === oneSymbol) {
          params[p.slice(1)] = fullParts[index]
        }
      })
      this.handlers[getPathKey(method, active.pathRule)]({ params })
    }
  }
}

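Once the matching route node is found, the params are read from the rule segments that start with :. For example, the rule /detail/:id matches /detail/1, so :id corresponds to 1 and params = { id: '1' }.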
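
The extraction step can also be read on its own. The sketch below pulls it out of match into a standalone function; extractParams is a hypothetical name, and the code assumes the rule and the path have already been confirmed to match.

```typescript
// Hypothetical standalone version of the params step inside match.
const extractParams = (pathRule: string, fullPath: string): Record<string, string> => {
  const ruleParts = pathRule.split('/').slice(1)
  const pathParts = fullPath.split('/').slice(1)
  const params: Record<string, string> = {}
  ruleParts.forEach((part, index) => {
    // ':id' in the rule takes the value found at the same position in the path
    if (part[0] === ':') {
      params[part.slice(1)] = pathParts[index]
    }
  })
  return params
}

console.log(extractParams('/detail/:id', '/detail/1')) // { id: '1' }
console.log(extractParams('/detail/:userID/:id', '/detail/2/3')) // { userID: '2', id: '3' }
```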

Let's test it:

const router = new TrieRouter()

router.get('/home/about', ctx => {
  console.log('【home】/home/about', ctx.params)
})
router.get('/:status/info', ctx => {
  console.log('【/:status/info】', ctx.params)
})
router.get('/home/*', function (ctx) {
  console.log('【*】', ctx.params);
});
router.post('/detail/:id', function (ctx) {
  console.log('【/detail/:id】', ctx.params);
});
router.post('/detail/:userID/:id', function (ctx) {
  console.log('【/detail/:userID/:id】', ctx.params);
});

// Simulate route matching
router.match('get', '/home/about') // 【home】/home/about {}
router.match('get', '/pending/info') // 【/:status/info】 {status: "pending"}
router.match('post', '/detail/1') // 【/detail/:id】 {id: "1"}
router.match('post', '/detail/2/3') // 【/detail/:userID/:id】 {userID: "2", id: "3"}
router.match('post', '/anypath') // unmatched route:/anypath

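The advantage of the prefix tree is that lookup time and space grow roughly linearly with the number of segments in the path being matched, rather than with the number of registered routes. That makes it efficient and a good fit for routing.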

Regular Expression Matching (Reg)

Regex matching needs no tree structure, and it can do without a dedicated node class for managing routes.

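class RegRouter {
  // One flat list of route records per method; each record keeps its compiled regexp
  roots: { [key in TMethods]?: { path: string; regexp: RegExp, handler: THandleFunc }[] } = {}
  match(method: TMethods, fullPath: string) {
    const methodRoot = this.roots[method]
    if (!methodRoot) {
      console.warn('unmatched method: ' + method)
      return
    }
    // Rules are tried in registration order; the first one that matches wins
    const root = methodRoot.find(r => r.regexp.test(fullPath))
    if (!root) {
      console.warn("no matched path: " + fullPath)
      return
    }
    // groups is undefined when the rule has no named parameters, so fall back to {}
    root.handler({ params: { ...fullPath.match(root.regexp)?.groups } })
  }
}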

Routes are grouped by method and all rules are stored in a flat list, so the structure of the route information is easy to follow.

const methods: TMethods[] = ['get', 'post']
methods.forEach(method => {
  RegRouter.prototype[method] = function(fullPath: string, handleFunc: THandleFunc) {
    if (!this.roots[method]) {
      this.roots[method] = []
    }
    this.roots[method].push({ path: fullPath, regexp: getRegexp(fullPath), handler: handleFunc });
  }
})

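The key piece is the regular expression regexp: every route uses this field to decide whether it matches. Most routing libraries rely on path-to-regexp to build it, but since this is only a simple example we do not pull it in and instead assemble the regex by hand.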

const getRegexp = (fullPath: string) => {
  const parts = fullPath.split('/').slice(1)
  const len = parts.length
  let str = '^'
  for (let i = 0; i < len; i++) {
    if (parts[i][0] === ':') {
      str += '(?:\\/(?<' + parts[i].slice(1) + '>[^\\/#\\?]+?))'
    } else if (parts[i][0] === '*') {
      str += '\\/.*'
    } else {
      str += '\\/' + parts[i]
    }
  }
  return new RegExp(str + '[\\/#\\?]?$')
}

For example, for the rule /detail/:userID/:id, getRegexp produces the regexp /^\/detail(?:\/(?<userID>[^\/#\?]+?))(?:\/(?<id>[^\/#\?]+?))[\/#\?]?$/. Matching /detail/1/2 against it captures the values of the userID and id parameters automatically.
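
A short check of that pattern, written out by hand rather than through getRegexp so that the snippet stands alone. It assumes a runtime and compile target with named capture groups (ES2018 or later); detailRegexp is a name introduced only for this example.

```typescript
// The same pattern as above, built from a string the way getRegexp does.
const detailRegexp = new RegExp(
  '^\\/detail(?:\\/(?<userID>[^\\/#\\?]+?))(?:\\/(?<id>[^\\/#\\?]+?))[\\/#\\?]?$'
)

const detailMatch = detailRegexp.exec('/detail/1/2')
// Named groups become the route params: userID is '1', id is '2'
console.log(detailMatch?.groups?.userID, detailMatch?.groups?.id) // 1 2

console.log(detailRegexp.test('/detail/1')) // false: the second parameter is missing
console.log(detailRegexp.test('/detail/1/2/')) // true: one trailing slash is tolerated
```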

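The regex here is really just one way of doing the comparison. You could equally store the rules as they are and walk the incoming path segment by segment, comparing each segment against the rule until a complete match is found.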
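
A minimal sketch of that regex-free alternative follows. It is not the implementation used in this article: matchBySegments and SegmentMatch are hypothetical names, rules are tried in the order given, and the wildcard is assumed to require at least one remaining segment.

```typescript
// Hypothetical result type: the rule that matched plus the captured params.
type SegmentMatch = { rule: string; params: Record<string, string> }

const matchBySegments = (rules: string[], fullPath: string): SegmentMatch | null => {
  const pathParts = fullPath.split('/').slice(1)
  for (const rule of rules) {
    const ruleParts = rule.split('/').slice(1)
    const params: Record<string, string> = {}
    let ok = true
    for (let i = 0; i < ruleParts.length && ok; i++) {
      const part = ruleParts[i]
      if (part[0] === '*') {
        // Wildcard: accept as soon as at least one path segment is left
        if (i < pathParts.length) return { rule, params }
        ok = false
      } else if (i >= pathParts.length) {
        ok = false // the path is shorter than the rule
      } else if (part[0] === ':') {
        params[part.slice(1)] = pathParts[i] // parameter: capture the segment
      } else if (part !== pathParts[i]) {
        ok = false // literal segments must be identical
      }
    }
    // Without a wildcard, rule and path must have the same number of segments
    if (ok && ruleParts.length === pathParts.length) return { rule, params }
  }
  return null
}

const sampleRules = ['/home/about', '/detail/:id', '/home/*']
console.log(matchBySegments(sampleRules, '/detail/1')) // rule '/detail/:id' with params { id: '1' }
console.log(matchBySegments(sampleRules, '/home/x/y')) // rule '/home/*' with empty params
console.log(matchBySegments(sampleRules, '/nope')) // null
```

Like the regex version, this still scans the rules one by one, so it changes how a single rule is compared but not the overall cost of the flat list.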

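After registration, roots ends up as one flat array per method, with each entry holding the path, regexp, and handler of a rule.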

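Because the routes are stored as a flat, growing list and every rule goes through the same full matching process (in the worst case every rule is tried), performance is lower than with a prefix tree, and the gap widens as the number of rules grows. A front-end project rarely has many routes, though, and matching runs on the user's device, so the difference is negligible. The approach is also very flexible and can support more features and rules, which makes it a good fit for front-end routing.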