这是我参与8月更文挑战的第6天，活动详情查看：8月更文挑战

前言

在看这一块代码的时候我选择从测试用例入手：

describe('compiler: integration tests', () => {
  const source = `
    <div id="foo" :class="bar.baz">
    {{ world.burn() }}
    <div v-if="ok">yes</div>
    <template v-else>no</template>
    <div v-for="(value, index) in list"><span>{{ value + index }}</span></div>
    </div>
  `.trim()

  test('function mode', () => {
    const result = compile(source, {
      sourceMap: true,
      filename: `foo.vue`
    })
  })
})

这里的调用 compile 函数就是编译的总入口。

在看模板编译这里需要了解的前置知识：

Vue 的编译分为三个阶段，分别是 parse、transform 和 condegen。parse 阶段负责将模板字符串解析为抽象语法树 AST; transform 阶段是对 AST 进行转换，也就是在 AST 上增加一些标记，方便在 condegen 阶段更快生成可执行代码，也可以理解成对 AST 的优化; condegen 阶段就是根据 AST 生成对应的 render 函数字符串。

接下来开始进入 compile 方法。

解析

解析是整个模板编译的第一步，对应的伪代码如下：

const ast = baseParse(template, options)

接下来可以看一下 baseParse 中的代码

export function baseParse(
 content: string,
 options: ParserOptions = {}
): RootNode {
  const context = createParserContext(content, options)
  const start = getCursor(context)
  return createRoot(
    parseChildren(context, TextModes.DATA, []),
    getSelection(context, start)
  )
}

这里先创建了一个解析的上下文得到 context; 后续调用了 getCursor 方法，这个方法其实就是从 context 中取出了 column、line、offset 三个值返回一个对象赋值给了 start; 最后将 createRoot 的返回值 return 了出去，从这里可以知道解析的核心逻辑是在 createRoot 方法中。

因为调用 createRoot 的时候将 parseChildren 和 getSelection 两个方法的返回值传了进去，所以需要先看一下这两个方法。

createRoot 的参数一: parseChildren 方法的返回值

通过这个方法名称也可以看出来这个方法是用来解析子节点的。

看完整个方法之后可以得出的结论是

解析标签是通过正则来匹配各个标签以及这个标签是开始标签还是结束标签。
在匹配的过程中会将当前匹配的开始标签放到缓存中，如果当前标签有自己子节点的话会先去解析子节点并将子节点的开始标签也 push 到缓存中，只有当子节点解析完成之后，才会将子节点的开始标签从缓存中取出来，这个时候父节点就知道子节点解析完成了，然后开始解析父节点的后半部分。
通过第 2 点也可以知道在解析的过程中是遵循了深度优先的原则。

函数的返回值就是解析完成后的子节点组成的 AST。

createRoot 的参数而: getSelection 方法的返回值

这个方法是用来获取当前节点的在源代码中的位置信息，最终的返回值就是这些位置信息。

return {
  start,
  end,
  source: context.originalSource.slice(start.offset, end.offset)
}

知道了 parseChildren 和 getSelection 两个方法的返回值后就可以去看 createRoot 方法了。

createRoot 方法

export function createRoot(
  children: TemplateChildNode[],
  loc = locStub
): RootNode {
  return {
    type: NodeTypes.ROOT,
    children,
    helpers: [],
    components: [],
    directives: [],
    hoists: [],
    imports: [],
    cached: 0,
    temps: 0,
    codegenNode: undefined,
    loc
  }
}

这个方法很简单，就是创建了一个根节点的 AST 并把 type 设置为了 ROOT，children 就是 parseChildren 方法的返回值，也就是所有子节点的 AST。

当执行完 createRoot 方法之后，模板对应的 AST 就全部生成了。

优化

优化 AST 阶段对应的是 transform 方法。

还是先来看 transform 的调用

transform(
  ast,
  extend({}, options, {
    prefixIdentifiers,
    nodeTransforms: [
      ...nodeTransforms,
      ...(options.nodeTransforms || []) // user transforms
    ],
    directiveTransforms: extend(
      {},
      directiveTransforms,
      options.directiveTransforms || {} // user transforms
    )
  })
)

这里需要说明的参数有

prefixIdentifiers: 这个参数决定了代码生成方式。代码生成方式有两种:
- function 模式，特点是使用 const { helpers... } = Vue 的方式来引入帮助函数。向外导出使用 return 返回整个 render() 函数。
- module 模式，特点是使用 es6 模块来导入导出函数，也就是使用 import 和 export。
nodeTransforms 和 directiveTransforms 这两个参数分别是针对 node 和 directive 的单独优化。这两个参数都是从 getBaseTransformPreset 方法中得到的，在这个方法中可以看到这两个参数分别对应的是什么
- nodeTransforms 对应的是
```
transformOnce,
transformIf,
transformFor,
...[transformFilter],
...[    trackVForSlotScopes,    transformExpression],
transformSlotOutlet,
transformElement,
trackSlotScopes,
transformText
```
- directiveTransforms 对应的是
```
on: transformOn,
bind: transformBind,
model: transformModel
```
这就可以看出在各种节点和指令都会有专门的优化方法。

看完这些参数就可以去看 transform 方法了。

export function transform(root: RootNode, options: TransformOptions) {
    const context = createTransformContext(root, options)

    traverseNode(root, context)

    if (options.hoistStatic) {
        hoistStatic(root, context)
    }

    if (!options.ssr) {
        createRootCodegen(root, context)
    }
    // finalize meta information
    root.helpers = [...context.helpers.keys()]
    root.components = [...context.components]
    root.directives = [...context.directives]
    root.imports = context.imports
    root.hoists = context.hoists
    root.temps = context.temps
    root.cached = context.cached
}

第一步就是创建了一个上下文，里边初始化了一些属性及方法，具体的内容根据后边用到的再去看。

接下来就是调用了 traverseNode、hoistStatic、createRooCodegen 三个方法; 后边又将 context 中的属性赋值到了 root Vnode 上。

下面来分别看看那三个方法。

traverseNode

export function traverseNode(
 node: RootNode | TemplateChildNode,
 context: TransformContext
) {
   // ... 
}

在这个方法里可以分为三部分来看

第一部分

伪代码如下：

// apply transform plugins
const { nodeTransforms } = context
const exitFns = []
for (let i = 0; i < nodeTransforms.length; i++) {
    const onExit = nodeTransforms[i](node, context)
    if (onExit) {
        if (isArray(onExit)) {
            exitFns.push(...onExit)
        } else {
            exitFns.push(onExit)
        }
    }
}

代码中从 context 中获取到了 nodeTransforms , 这个属性是一开始调用 transform 方法的时候传进来的，里边包含了对各种类型节点的 transform 方法。

所以这段代码的作用就是把当前 node 传入每个 nodeTransform 来处理当前 node，然后将每个 nodeTransform 的返回结果赋值给 onExit，再将 onExit push 到 exitFns 中。

通过 onExit 可以判断每个 nodeTransform 的返回值还是一个函数。

产生的问题

每个 nodeTransform 都干了什么？

这里用 transformElement 来举例，因为 transformElement 方法是一个所有 AST Element 都会被执行的一个方法。

在函数的最后可以看到会给传入的 node 中的 codegenNode 属性赋值

export const transformElement: NodeTransform = (node, context) => {
  return function postTransformElement() {
    // ...
    node.codegenNode = createVNodeCall(
      context,
      vnodeTag,
      vnodeProps,
      vnodeChildren,
      vnodePatchFlag,
      vnodeDynamicProps,
      vnodeDirectives,
      !!shouldUseBlock,
      false /* disableTracking */,
      node.loc
    )
  }
}

内容就是方法 createVNodeCall 的返回值，createVNodeCall 方法如下

export function createVNodeCall(
  context: TransformContext | null,
  tag: VNodeCall['tag'],
  props?: VNodeCall['props'],
  children?: VNodeCall['children'],
  patchFlag?: VNodeCall['patchFlag'],
  dynamicProps?: VNodeCall['dynamicProps'],
  directives?: VNodeCall['directives'],
  isBlock: VNodeCall['isBlock'] = false,
  disableTracking: VNodeCall['disableTracking'] = false,
  loc = locStub
): VNodeCall {
  if (context) {
    if (isBlock) {
      context.helper(OPEN_BLOCK)
      context.helper(CREATE_BLOCK)
    } else {
      context.helper(CREATE_VNODE)
    }
    if (directives) {
      context.helper(WITH_DIRECTIVES)
    }
  }

  return {
    type: NodeTypes.VNODE_CALL,
    tag,
    props,
    children,
    patchFlag,
    dynamicProps,
    directives,
    isBlock,
    disableTracking,
    loc
  }
}

可以看到函数中调用了 context.helper 方法，其实这个方法就是把调用 helper 方法时传入的参数添加到 context.helpers 中

context.helpers 由 symbol 类型组成的一个 set 数据结构，每一个 symbol 值就是当前节点创建、渲染等处理函数名称。

除了调用 context.helper 方法就是 return 了一个 VNode。

看到这里就可以发现 transformElement 方法中对 node.codegenNode 属性的赋值知识赋了一个 VNode。但是 transform 方法不应该是针对各种节点类型来做优化的吗？这个优化在哪里呢？这里的优化其实是利用了 patchFlag 这个属性。那这个 patchFlag 是怎么促成优化的呢？

其实是在 diff 过程中会根据 patchFlag 进行优化。patchFlag 的所有值如下:

  // 标识具有动态 textContent（子级快速路径）的元素
  TEXT = 1, // 1

  // 标识具有动态类绑定的元素
  CLASS = 1 << 1, // 2

  // 标识具有动态 style 的节点
  STYLE = 1 << 2, // 4

  /**
   * Indicates an element that has non-class/style dynamic props.
   * Can also be on a component that has any dynamic props (includes
   * class/style). 
   * when this flag is present, the vnode also has a dynamicProps
   * array that contains the keys of the props that may change so the runtime
   * can diff them faster (without having to worry about removed props)
   */
  PROPS = 1 << 3, // 8

  /**
   * Indicates an element with props with dynamic keys. When keys change, a full
   * diff is always needed to remove the old key. This flag is mutually
   * exclusive with CLASS, STYLE and PROPS.
   */
  FULL_PROPS = 1 << 4, // 16

  // 标识具有事件监听的节点
  HYDRATE_EVENTS = 1 << 5, // 32

  // 标识一个不会改变子节点顺序的 fragment 
  STABLE_FRAGMENT = 1 << 6, // 64

  // 标识有 key 的 fragment 或者部分有 key 的子节点
  KEYED_FRAGMENT = 1 << 7, // 128

  // 标识一个子节点没有 key 的 framgent
  UNKEYED_FRAGMENT = 1 << 8, // 256

  // 标识只需要非 props 比较的节点
  NEED_PATCH = 1 << 9, // 512

  // 标识一个动态 slots 的组件
  DYNAMIC_SLOTS = 1 << 10, // 1024

  DEV_ROOT_FRAGMENT = 1 << 11,

  // 标识一个静态节点，静态节点从来不会更新，在优化的时候可以跳过
  HOISTED = -1,

  // 指示在 diff 过程应该要退出优化模式，不是 render 函数生成的一些元素，例如 renderSlot
  BAIL = -2

总体来看 patchFlag 分为两大类

值大于 0 时则表示对应的元素在 diff 过程中是可以优化生成或更新的

值小于 0 时则表示当前元素是可以跳过的

现在就可以回答 nodeTransform 方法都做了什么，在这个过程中是给所有节点加上了 patchFlag 属性，以便在 diff 过程中进行优化。

第二部分

switch (node.type) {
  case NodeTypes.COMMENT:
    if (!context.ssr) {
      // inject import for the Comment symbol, which is needed for creating
      // comment nodes with `createVNode`
      context.helper(CREATE_COMMENT)
    }
    break

  case NodeTypes.INTERPOLATION:
    // no need to traverse, but we need to inject toString helper
    if (!context.ssr) {
      context.helper(TO_DISPLAY_STRING)
    }
    break

    // for container types, further traverse downwards
  case NodeTypes.IF:
    for (let i = 0; i < node.branches.length; i++) {
      traverseNode(node.branches[i], context)
    }
    break
  case NodeTypes.IF_BRANCH:
  case NodeTypes.FOR:
  case NodeTypes.ELEMENT:
  case NodeTypes.ROOT:
    traverseChildren(node, context)
    break
}

在这里调用了三个方法

context.helper 在第一部分中提到过，这里来看一下他的具体实现
```
helper(name) {
  const count = context.helpers.get(name) || 0
  context.helpers.set(name, count + 1)
  return name
}
```
这里就是将其传入的 name set 到了 helpers 中，key 是 name，value 是 name 在 helpers 中出现的次数。
如果 node.type 为 NodeTypes.IF 类型则遍历了一个 if 的其他分支然后递归调用了 traverseNode

如果 node.type 为 NodeTypes.IF_BRANCH、NodeTypes.FOR、NodeTypes.ELEMENT、NodeTypes.ROOT 类型则调用 traverseChildren 方法

export function traverseChildren(
  parent: ParentNode,
  context: TransformContext
) {
  let i = 0
  const nodeRemoved = () => {
    i--
  }
  for (; i < parent.children.length; i++) {
    const child = parent.children[i]
    if (isString(child)) continue
    context.parent = parent
    context.childIndex = i
    context.onNodeRemoved = nodeRemoved
    traverseNode(child, context)
  }
}

通过代码可以看到，在 traverseChildren 方法中也没有过多的操作，只是遍历了当前节点的子节点，然后每一个子节点都去调用 traverseNode 方法。

在这一部分主要就是给 node 对应的 context 中的 helpers 属性添加了每个节点对应的处理方法。这里 helpers 中的值什么时候会用到呢？是在生成可执行代码阶段根据这些方法名会将其对应的方法引入。

第三部分

// exit transforms
context.currentNode = node
let i = exitFns.length
while (i--) {
  exitFns[i]()
}

在这里就是将第一部分的调用结果分别再调用一遍。看了第一部分之后可以知道 exitFns 里存放的都是方法。

总结

通过 traverseNode 方法可以得到是他会对每个 node 上的 codegenNode 属性进行赋值，在 codegenNode 中会有一个 patchFlag 值。这里会在 diff 的时候用到。

在 node 对应的 context 中还会有一个 helpers 属性，这里存放了每个节点应该什么样的处理的方法名。这里的会在代码生成阶段用到，会将这里的所有方法引入。

hoistStatic

这里就是 vue3 相对于 vue2 的另外一个优化点: 静态提升。

什么是静态提升

静态提升就是将静态节点提升到 render 函数外边生成，后续 render 函数有变化的时候就可以直接引用这个已经生成的函数，并不需要再次创建。

这个可以通过生成的渲染函数来看就会很明显

Template:

<div>
  <div>123</div>
</div>

Render:

import { createElementVNode as _createElementVNode, openBlock as _openBlock, createElementBlock as _createElementBlock } from "vue"

const _hoisted_1 = /*#__PURE__*/_createElementVNode("div", null, "123", -1 /* HOISTED */)
const _hoisted_2 = [
  _hoisted_1
]

export function render(_ctx, _cache, $props, $setup, $data, $options) {
  return (_openBlock(), _createElementBlock("div", null, _hoisted_2))
}

通过这个 render 函数就可以看到，上边的 div 以及其中的值都是静态的，一成不变的值，开启静态提升之后就会将创建静态节点的方法提升到 render 函数外边。

hoistStatic 方法

export function hoistStatic(root: RootNode, context: TransformContext) {
  walk(
    root,
    context,
    // Root node is unfortunately non-hoistable due to potential parent
    // fallthrough attributes.
    isSingleElementRoot(root, root.children[0])
  )
}

在这个方法中是直接调用了 walk 方法，在调用 walk 方法之前调用了 isSingleElementRoot 方法，这个方法就是判断了是不是根节点，因为根节点是不可被提升的。

在 walk 方法主要是对当前节点的子节点的操作

更改子节点 child 的 codegenNode 中的 patchFlag 值

(child.codegenNode as VNodeCall).patchFlag = PatchFlags.HOISTED + (__DEV__ ? ` /* HOISTED */` : ``)

将 child 的 codegenNode push 到 context.hoists 中，这一步是调用了 context 中的 hoist 方法

context.hoist(child.codegenNode!)

hoist 方法如下

hoist(exp) {
  context.hoists.push(exp)
  const identifier = createSimpleExpression(
    `_hoisted_${context.hoists.length}`,
    false,
    exp.loc,
    ConstantTypes.CAN_HOIST
  )
  identifier.hoisted = exp
  return identifier
}

这个方法的一开始就是进行 push 操作，后续又调用 createSimpleExpression 方法创建了一个标识符 identifier。createSimpleExpression 方法内部就是将传入的值加上一个 type 组成了一个对象 return 了出来

export function createSimpleExpression(
  content: SimpleExpressionNode['content'],
  isStatic: SimpleExpressionNode['isStatic'],
  loc: SourceLocation = locStub,
  constType: ConstantTypes = ConstantTypes.NOT_CONSTANT
): SimpleExpressionNode {
  return {
    type: NodeTypes.SIMPLE_EXPRESSION,
    loc,
    content,
    isStatic,
    constType: isStatic ? ConstantTypes.CAN_STRINGIFY : constType
  }
}

hoist 方法将这个 identifier return 了出去，也就产生了 walk 方法的第三个操作。

将 hoist 方法的返回值赋值给了 child.codegenNode
```
child.codegenNode = context.hoist(child.codegenNode!)
```

如果当前子节点是一个动态的，但是他的 props 有可能是可以被提升的，这个时候也会调用 hoist 方法将 child 的 props 添加到 context.hoists 中，然后将 hoist 方法的返回值赋值给 codegenNode.props。

后边就是如果 child 的 type 是 element、for、if、类型的话就遍历他们的子节点或分支然后递归调用 walk 方法。

总结

在这个方法中处理了所有静态节点，将所有的静态节点的 codegenNode 都 push 到根节点的上下文中的 hoists 数组中。

对当前静态节点的 codegenNode 进行赋值。

问题

为什么将所有的静态节点的 codegenNode 都 push 到根节点的上下文中的 hoists 数组中？

这应该也就属于静态提升的其中一步，这样的话可以最快的找到所有的静态节点并将其对应的 render 函数创建出来，也就不需要再去遍历每一个节点去寻找静态节点。

createRootCodegen

前边那些都是对子节点的处理，现在要创建出根节点的 codegen。

伪代码如下

function createRootCodegen(root: RootNode, context: TransformContext) {
  const { helper, removeHelper } = context
  const { children } = root
  if (children.length === 1) {
    const child = children[0]
    // if the single child is an element, turn it into a block.
    if (isSingleElementRoot(root, child) && child.codegenNode) {
      const codegenNode = child.codegenNode
      root.codegenNode = codegenNode
    } else {
      // - single <slot/>, IfNode, ForNode: already blocks.
      // - single text node: always patched.
      // root codegen falls through via genNode()
      root.codegenNode = child
    }
  } else if (children.length > 1) {
    // root has multiple nodes - return a fragment block.
    let patchFlag = PatchFlags.STABLE_FRAGMENT
    let patchFlagText = PatchFlagNames[PatchFlags.STABLE_FRAGMENT]

    root.codegenNode = createVNodeCall(
      context,
      helper(FRAGMENT),
      undefined,
      root.children,
      patchFlag + (__DEV__ ? ` /* ${patchFlagText} */` : ``),
      undefined,
      undefined,
      true
    )
  } else {
    // no children = noop. codegen will return null.
  }
}

其中处理了只有一个自己子节点和更多子节点。

如果只有一个子节点则判断子节点中是否有 codegenNode 属性，如果有则 root.codegenNode 就是子节点的 codegenNode，反之 root.codegenNode 就是子节点本身。

如果有更多子节点则调用前边提到的 createVNodeCall 方法来生成 root.codegenNode。

生成可执行代码

这一部分是通过 generate 方法来生成的。主要也可以分为 5 个部分

创建上下文

const context = createCodegenContext(ast, options)

根据 prefixIdentifiers 字段决定代码生成采用 module 还是 function 的形式，这两种方式在上边已经说过。
根据不同的引入方式引入 helpers 中的方法
```
if (!__BROWSER__ && mode === 'module') {
  genModulePreamble(ast, preambleContext, genScopeId, isSetupInlined)
} else {
  genFunctionPreamble(ast, preambleContext)
}
```
genModulePreamble 和 genFunctionPreamble 方法都是用来引入 helpers 中的方法的，只不过是引入方式的不同。
向 context.code 中添加、追加可执行的代码字符串

这里就是频繁的调用 push、indent、newline、deindent 等方法来对代码字符串中的格式进行优化

将最终的结果 return，在最终的数据里 code 就是可执行代码字符串，ast 就是抽象语法树。

{
  ast,
  code: context.code,
  preamble: isSetupInlined ? preambleContext.code : ``,
  // SourceMapGenerator does have toJSON() method but it's not in the types
  map: context.map ? (context.map as any).toJSON() : undefined
}

到这一步，所有的模板解析过程就结束了。

Vue3 模板编译源码学习

前言

解析