An Automation Powerhouse: Abstract Syntax Trees (AST)

Introduction to AST

Esprima: a syntax tree parser


Esprima: ECMAScript parsing infrastructure

  • Install: npm i esprima
  • Usage
var esprima = require('esprima');
var s = esprima.parseScript("var a = 0;");
console.log(JSON.stringify(s, null, 4)); // prints the AST of "var a = 0;"
{
    "type": "Program",
    "body": [{
            "type": "VariableDeclaration",
            "declarations": [{
                    "type": "VariableDeclarator",
                    "id": {"type": "Identifier","name": "a"},
                    "init": {"type": "Literal","value": 0,"raw": "0"}
                }
            ],
            "kind": "var"
        }
    ],
    "sourceType": "script"
}
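
To make the shape of this tree more concrete, here is a minimal sketch of a depth-first traversal. It does not call Esprima at all: the `ast` object is simply the output above pasted in as a literal, and `walk` is a hypothetical helper (not part of Esprima) that treats any object with a string `type` property as a node.

```javascript
// AST copied from the output above (the result of parsing "var a = 0;").
var ast = {
    type: "Program",
    body: [{
        type: "VariableDeclaration",
        declarations: [{
            type: "VariableDeclarator",
            id: { type: "Identifier", name: "a" },
            init: { type: "Literal", value: 0, raw: "0" }
        }],
        kind: "var"
    }],
    sourceType: "script"
};

// Hypothetical helper: depth-first walk that calls visit() on every node.
// A "node" here is any object with a string `type` property.
function walk(node, visit) {
    if (!node || typeof node.type !== "string") return;
    visit(node);
    Object.keys(node).forEach(function (key) {
        var child = node[key];
        if (Array.isArray(child)) {
            child.forEach(function (c) { walk(c, visit); });
        } else if (child && typeof child === "object") {
            walk(child, visit);
        }
    });
}

var types = [];
walk(ast, function (node) { types.push(node.type); });
console.log(types.join(" > "));
// Program > VariableDeclaration > VariableDeclarator > Identifier > Literal
```

Walking by "any object with a `type`" keeps the sketch short; a production tool would more likely rely on a dedicated traversal library with a per-node-type list of child keys.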

esprima.parseScript function definition

export function parseScript(
input: string, 
config?: ParseOptions, 
delegate?: (node: ESTree.Node, meta: any) => void): Program;

config=esprima.ParseOptions

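Name      Type     Default  Description
jsx       Boolean  false    Support JSX syntax
range     Boolean  false    Annotate each node with its index-based location
loc       Boolean  false    Annotate each node with its column-based and line-based location
tolerant  Boolean  false    Tolerate a few cases of syntax errors
tokens    Boolean  false    Collect every token
comment   Boolean  false    Collect every line and block comment
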
var s = esprima.parseScript(`
/**
 * 求和
 * @param {*} a 变量a
 * @param {*} b 变量b
 * @returns 返回两个变量的和
 */
function add(a,b){return a+b;}
`,{comment:true,range:true,loc:true});// keep comments, node ranges, and line/column data
console.log(JSON.stringify(s));
{
    "type": "Program",
    "body": [{
        "type": "FunctionDeclaration",
        "id": {
            "type": "Identifier",
            "name": "add",
            "range": [85, 88],
            "loc": {"start": {"line": 8,"column": 9},"end": {"line": 8,"column": 12}}
        },
        "params": [
            {"type": "Identifier","name": "a","range": [89, 90]}, 
            {"type": "Identifier","name": "b","range": [91, 92]}
        ],
        "body": {"//… …"},"generator": false,"expression": false, "async": false,"range":[76, 106]
        }
    ],
    "sourceType": "script",
    "range": [76, 106],
    "comments": [{
        "type": "Block",
        "value": "*\n * 求和\n * @param {*} a 变量a\n * @param {*} b 变量b\n * @returns 返回两个变量的和\n ",
        "range": [1, 75]
        }
    ]
}

delegate=(node: Node, meta: any) => void

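esprima.parseScript("var a = 0;", null, (node, meta) => {
    // Flag every declaration that uses the var keyword
    if (node.type === "VariableDeclaration" && node.kind === "var")
        console.log("Variable", node.declarations[0].id.name, "is declared with var; let is recommended");
});
> Variable a is declared with var; let is recommended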

esprima.parseModule
Function definition

export function parseModule(
input: string, 
config?: ParseOptions, 
delegate?: (node: ESTree.Node, meta: any) => void): Program;

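Unlike parseScript, the Program returned by parseModule has a body whose items are of type ModuleItem. Compared with StatementListItem (the list items for declarations and statements), ModuleItem adds two types that only modules use: import declarations and export declarations. These two are used less often, so in most cases only StatementListItem needs attention.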

type StatementListItem = Declaration | Statement;
type ModuleItem = ImportDeclaration | ExportDeclaration | StatementListItem;

Common Syntax structures

Declarations include class declarations, function declarations, and variable declarations.

Enumerating Statement

type Statement = BlockStatement | BreakStatement | ContinueStatement |
    DebuggerStatement | DoWhileStatement | EmptyStatement |
    ExpressionStatement | ForStatement | ForInStatement |
    ForOfStatement | FunctionDeclaration | IfStatement |
    LabeledStatement | ReturnStatement | SwitchStatement |
    ThrowStatement | TryStatement | VariableDeclaration |
    WhileStatement | WithStatement;

Statements include: block, break, continue, debugger, do while, empty statement, expression statement, for, for in, for of, function, if, labeled statement, return, switch, throw, try, var, while, and with.


Escodegen: code generation

  • Install: npm i escodegen
  • Generate
var esprima = require('esprima');
var escodegen = require('escodegen');
var s = esprima.parseScript("var a = 0;");
console.log(escodegen.generate(s));//var a = 0;

Esprima can also be used as a tokenizer on its own. The following script reads source code from standard input and highlights every identifier in cyan:

const esprima = require('esprima');
const readline = require('readline');

const CYAN = '\x1b[36m';
const RESET = '\x1b[0m';
let source = '';

readline.createInterface({ input: process.stdin, terminal: false })
.on('line', line => { source += line + '\n' })
.on('close', () => {
    // Tokenize with range info so each token carries its [start, end) offsets.
    const tokens = esprima.tokenize(source, { range: true });
    const ids = tokens.filter(x => x.type === 'Identifier');
    // Process from last to first so earlier offsets stay valid after each replacement.
    const markers = ids.sort((a, b) => { return b.range[0] - a.range[0] });
    markers.forEach(t => {
        const id = CYAN + t.value + RESET;
        const start = t.range[0];
        const end = t.range[1];
        source = source.slice(0, start) + id + source.slice(end);
    });
    console.log(source);
});

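Use cases

  • Code analysis tools
  • Advanced code templates
  • Code content replacement
  • Custom expression evaluation
  • Automated test case generation
  • Simple jsmini programs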

Further topics

  • AST libraries for other languages
  • NLP (natural language processing): syntactic parsing