Vue Source Speed-Read | Chapter 4: A Deep Dive into Template Compilation. What Exactly Is the AST That Every Interview Asks About?

4. Template parsing in Vue: template -> AST

1. compile

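In the previous section, we used setupStatefulComponent to process the instance: it added the proxy property to the instance and ran the setup function with the corresponding arguments. Finally, handleSetupResult was executed, which compiles the template: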

Component.render = compile(template);

The compile function here is compileToFunction:

function compileToFunction(template, options = {}) {
    const { code } = baseCompile(template, options);
    const render = new Function("Vue", code)(runtimeDom);
    return render;
}
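
To make the `new Function("Vue", code)(runtimeDom)` step more concrete, here is a minimal sketch. The `code` string and the `fakeRuntime` object are hypothetical stand-ins, not what baseCompile actually generates. The only point is how a string of code becomes a render function that can see the runtime helpers.

```javascript
// Hypothetical stand-in for runtimeDom: an object exposing runtime helpers.
const fakeRuntime = {
    toDisplayString: (value) => String(value),
};

// Hypothetical generated code. It reads helpers from the `Vue` parameter
// and returns the render function.
const code = `
    const { toDisplayString } = Vue;
    return function render(ctx) {
        return "msg is " + toDisplayString(ctx.msg);
    };
`;

// new Function("Vue", code) builds a function whose only parameter is `Vue`
// and whose body is `code`; calling it returns the render function.
const render = new Function("Vue", code)(fakeRuntime);

console.log(render({ msg: "hello" })); // msg is hello
```

Because the body is evaluated with `new Function`, the generated code cannot see local variables of compileToFunction. Everything it needs has to arrive through the `Vue` parameter.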

2. baseCompile

baseCompile first converts the template into an AST with baseParse, then passes that AST through transform and generate.

function baseCompile(template, options) {
    const ast = baseParse(template);
    transform(ast, Object.assign(options, {
        nodeTransforms: [transformElement, transformText, transformExpression],
    }));
    return generate(ast);
}

2.1 baseParse

function baseParse(content) {
    const context = createParserContext(content);
    return createRoot(parseChildren(context, []));
}

createParserContext only wraps the input: it stores the template string in the source property of an object.

function createParserContext(content) {
    return {
        source: content,
    };
}

2.2 parseChildren

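This is the key function of template parsing. It parses the template using a loop, recursion, and regular-expression matching.

The regex /[a-z]/i used here matches a single letter, upper or lower case.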

function parseChildren(context, ancestors) {
    const nodes = []; // parsed nodes are collected here
    while (!isEnd(context, ancestors)) { // loop until this level is finished
        let node;
        const s = context.source; // e.g. '<p>{{msg}}</p>'
        if (startsWith(s, "{{")) { // parse interpolation
            node = parseInterpolation(context);
        }
        else if (s[0] === "<") { // parse a tag
            if (s[1] === "/") {
                if (/[a-z]/i.test(s[2])) {
                    parseTag(context, 1);
                    continue;
                }
            }
            else if (/[a-z]/i.test(s[1])) {
                node = parseElement(context, ancestors);
            }
        }
        if (!node) {
            node = parseText(context);
        }
        nodes.push(node);
    }
    return nodes;
}
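
The branch selection in parseChildren can be looked at in isolation. The sketch below uses a hypothetical helper, `whichBranch`, that is not part of the source. It repeats the same conditions and reports which parser would handle the start of a given string.

```javascript
// Hypothetical helper: mirrors the branch conditions of parseChildren and
// names the parser that would handle the beginning of `source`.
function whichBranch(source) {
    if (source.startsWith("{{")) return "parseInterpolation";
    if (source[0] === "<") {
        if (source[1] === "/") {
            if (/[a-z]/i.test(source[2])) return "parseTag (closing tag)";
        } else if (/[a-z]/i.test(source[1])) {
            return "parseElement";
        }
    }
    return "parseText"; // fallback when no other branch produced a node
}

console.log(whichBranch("{{msg}}</p>"));    // parseInterpolation
console.log(whichBranch("<p>{{msg}}</p>")); // parseElement
console.log(whichBranch("</p>"));           // parseTag (closing tag)
console.log(whichBranch("111{{msg}}</p>")); // parseText
```

In the real loop, a closing tag that matches an open ancestor never reaches these branches, because isEnd returns true first and the loop exits.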

2.2.1 isEnd

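Determines whether parsing of the current level is finished.

If the remaining template starts with </ (the start of a closing tag), ancestors is traversed from the end; if the template satisfies startsWithEndTagOpen for any ancestor, parsing at this level is finished.

If the remaining string is empty, parsing is also finished.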

function isEnd(context, ancestors) {
    const s = context.source;
    if (context.source.startsWith("</")) {
        for (let i = ancestors.length - 1; i >= 0; --i) {
            if (startsWithEndTagOpen(s, ancestors[i].tag)) {
                return true;
            }
        }
    }
    return !context.source;
}
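
A few calls show how isEnd behaves for different inputs. This is a self-contained sketch: it repeats isEnd and startsWithEndTagOpen, replaces the startsWith helper with the built-in String.prototype.startsWith (an assumption about what that helper does), and uses hand-written ancestors arrays.

```javascript
function startsWithEndTagOpen(source, tag) {
    return source.startsWith("</") &&
        source.slice(2, 2 + tag.length).toLowerCase() === tag.toLowerCase();
}

function isEnd(context, ancestors) {
    const s = context.source;
    if (s.startsWith("</")) {
        // Walk from the innermost open element outwards.
        for (let i = ancestors.length - 1; i >= 0; --i) {
            if (startsWithEndTagOpen(s, ancestors[i].tag)) return true;
        }
    }
    return !s; // an empty source also ends parsing
}

// Pretend we are inside <div><p> ...
const ancestors = [{ tag: "div" }, { tag: "p" }];

console.log(isEnd({ source: "</p></div>" }, ancestors));  // true: closes the innermost <p>
console.log(isEnd({ source: "</div>" }, ancestors));      // true: closes an outer ancestor
console.log(isEnd({ source: "</span>" }, ancestors));     // false: no <span> is open
console.log(isEnd({ source: "{{msg}}</p>" }, ancestors)); // false: content remains
console.log(isEnd({ source: "" }, []));                   // true: nothing left to parse
```

The second case is why every ancestor is checked, not only the last one: if the innermost closing tag is missing, the loop still stops, and parseElement can then report the missing tag.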

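2.2.2 startsWithEndTagOpen

Checks whether context.source (the remaining template) starts with '</' immediately followed by the tag name passed as the second argument. The comparison is case-insensitive.

In other words, it checks whether source begins with the closing tag of the given element.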

function startsWithEndTagOpen(source, tag) {
    return (startsWith(source, "</") &&
        source.slice(2, 2 + tag.length).toLowerCase() === tag.toLowerCase());
}
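
The calls below illustrate what this check accepts. One thing worth noticing is that only a prefix is compared, so a longer tag name that begins with the same letters also passes. `startsWithEndTagOpenStrict` is a hypothetical stricter variant, added here purely for comparison; it is not part of the code in this chapter.

```javascript
function startsWithEndTagOpen(source, tag) {
    return source.startsWith("</") &&
        source.slice(2, 2 + tag.length).toLowerCase() === tag.toLowerCase();
}

console.log(startsWithEndTagOpen("</p>", "p"));   // true
console.log(startsWithEndTagOpen("</P>", "p"));   // true: case-insensitive
console.log(startsWithEndTagOpen("</div>", "p")); // false
console.log(startsWithEndTagOpen("</pre>", "p")); // true: only the prefix "p" is compared

// Hypothetical stricter variant: the character after the tag name must be
// whitespace, "/" or ">" (a missing character is treated as ">").
function startsWithEndTagOpenStrict(source, tag) {
    return startsWithEndTagOpen(source, tag) &&
        /[\t\r\n\f />]/.test(source[2 + tag.length] || ">");
}

console.log(startsWithEndTagOpenStrict("</pre>", "p")); // false
console.log(startsWithEndTagOpenStrict("</p>", "p"));   // true
```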

2.2.3 parseElement

Parses an element that begins with an opening tag such as <tagName>.

function parseElement(context, ancestors) {
    const element = parseTag(context, 0); // parse the opening tag
    ancestors.push(element);
    const children = parseChildren(context, ancestors); // recursively parse the children (see the return value of parseInterpolation)
    ancestors.pop();
    if (startsWithEndTagOpen(context.source, element.tag)) {
        parseTag(context, 1);
    }
    else {
        throw new Error(`Missing closing tag: ${element.tag}`);
    }
    element.children = children;
    return element;
}

2.2.4 parseTag

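The regex /^<\/?([a-z][^\r\n\t\f />]*)/i matches the start of an HTML tag:

^ matches the start of the string.
<\/? matches < followed by an optional / (present for closing tags).
([a-z][^\r\n\t\f />]*) captures the tag name, which consists of:
- [a-z]: one letter (case-insensitive because of the i flag).
- [^\r\n\t\f />]*: any number of characters other than carriage return, line feed, tab, form feed, space, / or >.
The i flag makes the whole pattern case-insensitive.

For example (because of ^, the regex only matches a tag at the very start of the string):

Matching <p>{{msg}}</p> with this regex gives ['<p', 'p']

Matching <div>{{msg}}</div> with this regex gives ['<div', 'div']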
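
These results can be reproduced directly with RegExp.prototype.exec. In the sketch below, the variable name `tagRE` and the extra sample strings are illustrative choices, not taken from the source.

```javascript
const tagRE = /^<\/?([a-z][^\r\n\t\f />]*)/i;

// Opening tag: index 0 is the whole match, index 1 is the captured tag name.
const open = tagRE.exec("<p>{{msg}}</p>");
console.log(open[0], open[1]); // <p p

// Closing tag: the optional "/" is part of the match but not of the capture.
const close = tagRE.exec("</div>");
console.log(close[0], close[1]); // </div div

// The name stops at the first whitespace, "/" or ">".
const custom = tagRE.exec('<my-comp :a="1">');
console.log(custom[1]); // my-comp

// Because of ^, a tag that is not at the very start does not match.
console.log(tagRE.exec("text <p>")); // null
```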

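type indicates whether an opening or a closing tag is being processed: 0 means an opening tag, 1 means a closing tag.

For an opening tag, the return value is { type: 4, tag: <matched tag name>, tagType: 0 }

For a closing tag, nothing is returned (undefined).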

function parseTag(context, type) {
    const match = /^<\/?([a-z][^\r\n\t\f />]*)/i.exec(context.source);
    const tag = match[1]; // 'p'
    advanceBy(context, match[0].length);  // >{{msg}}</p>
    advanceBy(context, 1); // {{msg}}</p>
    if (type === 1)
        return;
    let tagType = 0;
    return {
        type: 4,
        tag,
        tagType,
    };
}
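
The following sketch runs parseTag on a few inputs to show how the source is consumed. It repeats advanceBy and parseTag so that it is self-contained. The last case is an observation about this simplified version: the single advanceBy(context, 1) assumes that > follows the tag name directly, so attributes are not handled.

```javascript
function advanceBy(context, numberOfCharacters) {
    context.source = context.source.slice(numberOfCharacters);
}

function parseTag(context, type) {
    const match = /^<\/?([a-z][^\r\n\t\f />]*)/i.exec(context.source);
    const tag = match[1];
    advanceBy(context, match[0].length); // skip "<p" or "</p"
    advanceBy(context, 1);               // skip one more character, assumed to be ">"
    if (type === 1) return;              // closing tag: nothing to return
    return { type: 4, tag, tagType: 0 };
}

const opening = { source: "<p>{{msg}}</p>" };
console.log(parseTag(opening, 0)); // { type: 4, tag: 'p', tagType: 0 }
console.log(opening.source);       // {{msg}}</p>

const closing = { source: "</p>rest" };
console.log(parseTag(closing, 1)); // undefined
console.log(closing.source);       // rest

// With an attribute, the character after the tag name is a space, not ">",
// so the attribute text is left in the source.
const withAttr = { source: '<p class="a">x</p>' };
parseTag(withAttr, 0);
console.log(withAttr.source);      // class="a">x</p>
```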

2.2.5 advanceBy

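advanceBy uses slice to cut the consumed characters off the front of context.source and stores the remainder back.

<p>{{msg}}</p> --> advanceBy(context, 2) --> >{{msg}}</p>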

function advanceBy(context, numberOfCharacters) {
    // keep only the part of the string after the first numberOfCharacters characters
    context.source = context.source.slice(numberOfCharacters);
}

2.2.6 parseInterpolation

Handles interpolation syntax and extracts the expression inside the delimiters.

function parseInterpolation(context)  { //{{msg}}</p>
    const openDelimiter = "{{";
    const closeDelimiter = "}}";
    // index at which the closing }} starts
    const closeIndex = context.source.indexOf(closeDelimiter, openDelimiter.length);
    advanceBy(context, 2); //msg}}</p>
    // length of the content between the delimiters
    const rawContentLength = closeIndex - openDelimiter.length;
    // raw content between {{ and }}
    const rawContent = context.source.slice(0, rawContentLength);
    //msg
    const preTrimContent = parseTextData(context, rawContent.length);
    // trim surrounding whitespace
    const content = preTrimContent.trim();
    // skip past the closing delimiter }}
    advanceBy(context, closeDelimiter.length);
    /* {
    type:2,
    content: {
    type:3,
    content:'msg'}
    }
    */
    return {
        type: 2,
        content: {
            type: 3,
            content,
        },
    };
}
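
The index arithmetic is easier to follow on a concrete input. The sketch below is a condensed, self-contained copy of parseInterpolation with its two helpers, run on `{{ msg }}</p>`. The variable names are shortened for illustration.

```javascript
function advanceBy(context, numberOfCharacters) {
    context.source = context.source.slice(numberOfCharacters);
}

function parseTextData(context, length) {
    const rawText = context.source.slice(0, length);
    advanceBy(context, length);
    return rawText;
}

function parseInterpolation(context) {
    const open = "{{";
    const close = "}}";
    // For "{{ msg }}</p>" the closing delimiter starts at index 7.
    const closeIndex = context.source.indexOf(close, open.length);
    advanceBy(context, open.length);                   // source: " msg }}</p>"
    const rawContentLength = closeIndex - open.length; // 7 - 2 = 5
    const content = parseTextData(context, rawContentLength).trim(); // " msg " -> "msg"
    advanceBy(context, close.length);                  // source: "</p>"
    return { type: 2, content: { type: 3, content } };
}

const ctx = { source: "{{ msg }}</p>" };
const node = parseInterpolation(ctx);
console.log(JSON.stringify(node)); // {"type":2,"content":{"type":3,"content":"msg"}}
console.log(ctx.source);           // </p>
```

closeIndex is computed before the opening delimiter is consumed, which is why openDelimiter.length has to be subtracted to get the content length. A template with no closing }} makes indexOf return -1, and this version does not guard against that case.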

2.2.7 parseTextData

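Returns the first length characters of context.source as text and advances past them. When called from parseInterpolation, this leaves the source positioned at the closing delimiter.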

function parseTextData(context, length) {
    const rawText = context.source.slice(0, length); // e.g. 'msg'
    advanceBy(context, length); // source is now e.g. '}}</p>'
    return rawText;
}

2.2.8 parseText

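When an element contains text that does not start with interpolation syntax, for example <p>111{{msg}}</p>, the second (recursive) pass through parseChildren does not enter parseInterpolation. It falls through to parseText, which handles the text content.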

function parseText(context) {
    const endTokens = ["<", "{{"]; //111{{msg}}</p>
    let endIndex = context.source.length;
    for (let i = 0; i < endTokens.length; i++) {
        const index = context.source.indexOf(endTokens[i]);
        if (index !== -1 && endIndex > index) {
            endIndex = index;
        }
    }
    const content = parseTextData(context, endIndex); 
    return {
        type: 0,
        content,   // {type:0,content:'111'}, after which parseChildren continues
    };
}
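
A short run shows where parseText stops. This sketch is self-contained and inlines what parseTextData does (slice, then advance); the sample inputs are illustrative.

```javascript
function parseText(context) {
    const endTokens = ["<", "{{"];
    let endIndex = context.source.length;
    // The text ends at whichever end token appears first.
    for (const token of endTokens) {
        const index = context.source.indexOf(token);
        if (index !== -1 && index < endIndex) endIndex = index;
    }
    const content = context.source.slice(0, endIndex);
    context.source = context.source.slice(endIndex);
    return { type: 0, content };
}

const a = { source: "111{{msg}}</p>" };
console.log(parseText(a), a.source); // { type: 0, content: '111' } {{msg}}</p>

const b = { source: "hello</p>" };
console.log(parseText(b), b.source); // { type: 0, content: 'hello' } </p>

const c = { source: "plain" };
console.log(parseText(c).content, JSON.stringify(c.source)); // plain ""
```

One caveat about this simplified parser: if the source itself starts with < but is not a valid tag (for example `< 1`), endIndex is 0 and nothing is consumed, so the loop in parseChildren would not make progress on such input.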

2.2.9 Flowchart

Using <p>{{msg}}</p> as the example:

flowchart LR
    main[parseChildren]
    id1{isEnd}
    id2[parseElement]
    id3[parseTag]
    id4[parseInterpolation]
    main --> id1 --> id2 --> id3
    id3--"{{msg}}< /P >"--> id2
    id2--"{{msg}}< /P >" --> main
    main--recursion --> id4
    id4 --"{type:2,content: {type:3,content:'msg'}}" --> main
    main-- return -->id2

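1. Enter parseChildren. isEnd is false, so execution continues. Because the source starts with a tag, parseElement is entered (source: <p>{{msg}}</p>).
2. parseElement first calls parseTag, which parses the tag and returns {type: 4,tag:'p',tagType:0}. The source is advanced to the inside of the tag ({{msg}}</p>).
3. After the tag is parsed, its children are parsed recursively, so parseChildren is entered again. The source is now {{msg}}</p>, and ancestors contains the object returned in step 2.
4. Because the source now starts with {{, parseChildren takes the interpolation branch and calls parseInterpolation, which parses the interpolation inside the tag and returns {type:2,content: {type:3,content:'msg'}} (remaining source: </p>).
5. The node returned by parseInterpolation is pushed into nodes and isEnd is evaluated again. It is now true, so nodes is returned to parseElement, where it becomes children and is assigned to the children property of the tag object from step 2.

Final result: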

const element = {
    type: 4,
    tag:'p',
    tagType:0,
    children:[
        {
            type:2,
            content: {
                type:3,
                content:'msg'
            }
        }
    ]
}

2.3 createRoot

Wraps the parsed children in a root AST node. How this node is processed is covered in the next section on AST handling.

function createRoot(children) {
    return {
        type: 1,
        children,
        helpers: [],
    };
}
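
Putting the pieces together, the sketch below is a condensed, self-contained copy of the functions from this chapter, so the whole parse can be run end to end. It assumes the startsWith helper behaves like String.prototype.startsWith, merges a few intermediate variables, and uses a sample template chosen for illustration. It is a simplified reading of the code above, not the full Vue parser.

```javascript
function advanceBy(context, n) {
    context.source = context.source.slice(n);
}

function parseTextData(context, length) {
    const rawText = context.source.slice(0, length);
    advanceBy(context, length);
    return rawText;
}

function startsWithEndTagOpen(source, tag) {
    return source.startsWith("</") &&
        source.slice(2, 2 + tag.length).toLowerCase() === tag.toLowerCase();
}

function isEnd(context, ancestors) {
    const s = context.source;
    if (s.startsWith("</")) {
        for (let i = ancestors.length - 1; i >= 0; --i) {
            if (startsWithEndTagOpen(s, ancestors[i].tag)) return true;
        }
    }
    return !s;
}

function parseTag(context, type) {
    const match = /^<\/?([a-z][^\r\n\t\f />]*)/i.exec(context.source);
    advanceBy(context, match[0].length + 1); // tag name plus the ">" assumed to follow
    if (type === 1) return;
    return { type: 4, tag: match[1], tagType: 0 };
}

function parseInterpolation(context) {
    const closeIndex = context.source.indexOf("}}", 2);
    advanceBy(context, 2);
    const content = parseTextData(context, closeIndex - 2).trim();
    advanceBy(context, 2);
    return { type: 2, content: { type: 3, content } };
}

function parseText(context) {
    let endIndex = context.source.length;
    for (const token of ["<", "{{"]) {
        const index = context.source.indexOf(token);
        if (index !== -1 && index < endIndex) endIndex = index;
    }
    return { type: 0, content: parseTextData(context, endIndex) };
}

function parseElement(context, ancestors) {
    const element = parseTag(context, 0);
    ancestors.push(element);
    element.children = parseChildren(context, ancestors);
    ancestors.pop();
    if (!startsWithEndTagOpen(context.source, element.tag)) {
        throw new Error(`Missing closing tag: ${element.tag}`);
    }
    parseTag(context, 1);
    return element;
}

function parseChildren(context, ancestors) {
    const nodes = [];
    while (!isEnd(context, ancestors)) {
        const s = context.source;
        let node;
        if (s.startsWith("{{")) {
            node = parseInterpolation(context);
        } else if (s[0] === "<") {
            if (s[1] === "/") {
                if (/[a-z]/i.test(s[2])) {
                    parseTag(context, 1);
                    continue;
                }
            } else if (/[a-z]/i.test(s[1])) {
                node = parseElement(context, ancestors);
            }
        }
        nodes.push(node || parseText(context));
    }
    return nodes;
}

function baseParse(content) {
    const context = { source: content };
    return { type: 1, children: parseChildren(context, []), helpers: [] };
}

// Root (type 1) -> <div> (type 4) -> text "hi ", interpolation msg, <p> with interpolation msg
const ast = baseParse("<div>hi {{ msg }}<p>{{msg}}</p></div>");
console.log(JSON.stringify(ast, null, 2));
```

Note that children is always an array, even when an element has a single child, because parseChildren returns the nodes array.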