An Automation Power Tool: The Abstract Syntax Tree (AST)

Introduction to AST

Esprima, a syntax tree parser

Esprima: ECMAScript parsing infrastructure

Install: npm i esprima
Usage

var esprima = require('esprima');
var s = esprima.parseScript("var a = 0;");
console.log(JSON.stringify(s, null, 4)); // prints the AST of "var a = 0;"

{
    "type": "Program",
    "body": [{
            "type": "VariableDeclaration",
            "declarations": [{
                    "type": "VariableDeclarator",
                    "id": {"type": "Identifier","name": "a"},
                    "init": {"type": "Literal","value": 0,"raw": "0"}
                }
            ],
            "kind": "var"
        }
    ],
    "sourceType": "script"
}
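
To get something useful out of a tree like the one above, a program usually walks it recursively. The following is a minimal sketch and not part of Esprima: `walk` and `collectIdentifiers` are hypothetical helper names, and the AST is written out by hand (copied from the output above) so that the snippet runs without any dependency.

```javascript
// Hand-written copy of the AST printed above for `var a = 0;`.
const ast = {
    type: 'Program',
    body: [{
        type: 'VariableDeclaration',
        declarations: [{
            type: 'VariableDeclarator',
            id: { type: 'Identifier', name: 'a' },
            init: { type: 'Literal', value: 0, raw: '0' }
        }],
        kind: 'var'
    }],
    sourceType: 'script'
};

// Hypothetical helper: call visit(node) for every object that has a string `type`,
// then descend into all of its properties (arrays included).
function walk(node, visit) {
    if (node === null || typeof node !== 'object') return;
    if (typeof node.type === 'string') visit(node);
    for (const key of Object.keys(node)) {
        const child = node[key];
        if (Array.isArray(child)) child.forEach(c => walk(c, visit));
        else walk(child, visit);
    }
}

// Hypothetical helper: names of all Identifier nodes, in visiting order.
function collectIdentifiers(root) {
    const names = [];
    walk(root, n => { if (n.type === 'Identifier') names.push(n.name); });
    return names;
}

const visitedTypes = [];
walk(ast, n => visitedTypes.push(n.type));
console.log(visitedTypes.join(' > '));
console.log(collectIdentifiers(ast));
```

The walker relies only on the convention that every node is an object with a string `type`, so it should work on any ESTree-shaped tree without listing the node types in advance.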

esprima.parseScript function definition

export function parseScript(
input: string, 
config?: ParseOptions, 
delegate?: (node: ESTree.Node, meta: any) => void): Program;

config=esprima.ParseOptions

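Name	Type	Default	Description
jsx	Boolean	false	Support JSX syntax
range	Boolean	false	Annotate each node with its index-based location
loc	Boolean	false	Annotate each node with its column and line-based location
tolerant	Boolean	false	Tolerate a few cases of syntax errors
tokens	Boolean	false	Collect every token
comment	Boolean	false	Collect every line and block comment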

var s = esprima.parseScript(`
/**
 * Sum
 * @param {*} a variable a
 * @param {*} b variable b
 * @returns the sum of the two variables
 */
function add(a,b){return a+b;}
`,{comment:true,range:true,loc:true}); // keep comments, node ranges, and line/column data
console.log(JSON.stringify(s));

{
    "type": "Program",
    "body": [{
        "type": "FunctionDeclaration",
        "id": {
            "type": "Identifier",
            "name": "add",
            "range": [120, 123],
            "loc": {"start": {"line": 8,"column": 9},"end": {"line": 8,"column": 12}}
        },
        "params": [
            {"type": "Identifier","name": "a","range": [124, 125]},
            {"type": "Identifier","name": "b","range": [126, 127]}
        ],
        "body": {"//… …"},"generator": false,"expression": false, "async": false,"range":[111, 141]
        }
    ],
    "sourceType": "script",
    "range": [111, 141],
    "comments": [{
        "type": "Block",
        "value": "*\n * Sum\n * @param {*} a variable a\n * @param {*} b variable b\n * @returns the sum of the two variables\n ",
        "range": [1, 110]
        }
    ]
}

delegate=(node: Node, meta: any) => void

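// The delegate is called once for every node the parser creates.
esprima.parseScript("var a = 0;", {}, (node, meta) => {
    if (node.type === "VariableDeclaration" && node.kind === "var")
        console.log("Variable", node.declarations[0].id.name, "is declared with var; let is recommended");
});

> Variable a is declared with var; let is recommended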

esprima.parseModule
Function definition

export function parseModule(
input: string, 
config?: ParseOptions, 
delegate?: (node: ESTree.Node, meta: any) => void): Program;

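Unlike parseScript, parseModule returns a Program whose body items are of type ModuleItem. ModuleItem extends StatementListItem (declarations and statements) with import and export declarations, which only modules can contain. Those two types come up less often, so the rest of this article concentrates on StatementListItem.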

type StatementListItem = Declaration | Statement;
type ModuleItem = ImportDeclaration | ExportDeclaration | StatementListItem;
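
The difference between the two unions can be sketched without a parser. The snippet below is an illustration under stated assumptions: `splitModuleBody` is a hypothetical helper, the sample `body` is hand-written in ESTree shape with most fields omitted, and the module-only set uses the concrete node names `ImportDeclaration`, `ExportAllDeclaration`, `ExportDefaultDeclaration`, and `ExportNamedDeclaration` (the last three being the forms an `ExportDeclaration` can take).

```javascript
// Node types that can only appear in a module body.
const MODULE_ONLY_TYPES = new Set([
    'ImportDeclaration',
    'ExportAllDeclaration',
    'ExportDefaultDeclaration',
    'ExportNamedDeclaration'
]);

// Hypothetical helper: separate module-only items from StatementListItems.
function splitModuleBody(body) {
    const moduleItems = [];
    const statementListItems = [];
    for (const node of body) {
        if (MODULE_ONLY_TYPES.has(node.type)) moduleItems.push(node);
        else statementListItems.push(node);
    }
    return { moduleItems, statementListItems };
}

// Hand-written, heavily trimmed body for:
//   import fs from 'fs'; var a = 0; export default a;
const body = [
    { type: 'ImportDeclaration' },
    { type: 'VariableDeclaration', kind: 'var' },
    { type: 'ExportDefaultDeclaration' }
];

const parts = splitModuleBody(body);
console.log(parts.moduleItems.map(n => n.type));
console.log(parts.statementListItems.map(n => n.type));
```

Everything that lands in `statementListItems` could also appear in a script body, which is why most analysis code only has to handle that part.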

Common Syntax structures

Declarations include class declarations, function declarations, and variable declarations.

The Statement union type

type Statement = BlockStatement | BreakStatement | ContinueStatement |
    DebuggerStatement | DoWhileStatement | EmptyStatement |
    ExpressionStatement | ForStatement | ForInStatement |
    ForOfStatement | FunctionDeclaration | IfStatement |
    LabeledStatement | ReturnStatement | SwitchStatement |
    ThrowStatement | TryStatement | VariableDeclaration |
    WhileStatement | WithStatement;

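Statements include: block, break, continue, debugger, do while, empty statement, expression statement, for, for in, for of, function declaration, if, labeled statement, return, switch, throw, try, variable declaration, while, and with.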
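
Because every node carries its kind in `type`, a tool can dispatch on that string. A minimal sketch of the idea follows; `describe` and the `HANDLERS` table are hypothetical names, only a few of the statement types listed above are covered, and the sample body is hand-written rather than produced by a parser.

```javascript
// Hypothetical lookup table: one handler per statement type (only a subset is listed).
const HANDLERS = new Map([
    ['VariableDeclaration', n =>
        `${n.kind} declaration of ${n.declarations.map(d => d.id.name).join(', ')}`],
    ['FunctionDeclaration', n =>
        `function ${n.id.name} with ${n.params.length} parameter(s)`],
    ['DebuggerStatement', () => 'debugger statement']
]);

// Hypothetical helper: describe a statement, or report that its type is not handled.
function describe(node) {
    const handler = HANDLERS.get(node.type);
    return handler ? handler(node) : `unhandled node type ${node.type}`;
}

// Hand-written, trimmed body for:
//   var a = 0; function add(a, b) {} debugger; ;
const statements = [
    {
        type: 'VariableDeclaration',
        kind: 'var',
        declarations: [{
            type: 'VariableDeclarator',
            id: { type: 'Identifier', name: 'a' },
            init: { type: 'Literal', value: 0, raw: '0' }
        }]
    },
    {
        type: 'FunctionDeclaration',
        id: { type: 'Identifier', name: 'add' },
        params: [{ type: 'Identifier', name: 'a' }, { type: 'Identifier', name: 'b' }],
        body: { type: 'BlockStatement', body: [] }
    },
    { type: 'DebuggerStatement' },
    { type: 'EmptyStatement' }
];

const descriptions = statements.map(describe);
descriptions.forEach(d => console.log(d));
```

A table keyed by type keeps each handler small and makes it obvious which statement types a tool does not support yet.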

Escodegen code generation

Install: npm i escodegen
Generate

var esprima = require('esprima');
var escodegen = require('escodegen');
var s = esprima.parseScript("var a = 0;");
console.log(escodegen.generate(s));//var a = 0;

// Highlight every identifier in JavaScript source read from stdin, using esprima.tokenize.
const esprima = require('esprima');
const readline = require('readline');

const CYAN = '\x1b[36m';
const RESET = '\x1b[0m';
let source = '';

readline.createInterface({ input: process.stdin, terminal: false })
.on('line', line => { source += line + '\n'; })
.on('close', () => {
    const tokens = esprima.tokenize(source, { range: true });
    const ids = tokens.filter(x => x.type === 'Identifier');
    // Work from the last identifier to the first so earlier offsets stay valid.
    const markers = ids.sort((a, b) => b.range[0] - a.range[0]);
    markers.forEach(t => {
        const id = CYAN + t.value + RESET;
        const [start, end] = t.range;
        source = source.slice(0, start) + id + source.slice(end);
    });
    console.log(source);
});

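Use cases

Code analysis tools
Advanced code templates
Code content replacement
Custom expression evaluation
Automated test case generation
Simple jsmini programs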
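
As an illustration of the custom expression evaluation use case, here is a minimal sketch of an evaluator over ESTree-shaped expression nodes. `evaluate` is a hypothetical helper; it handles only `Literal`, `Identifier`, `UnaryExpression` with `-`, and four arithmetic `BinaryExpression` operators. The tree for `a + b * 2` is hand-written so that the snippet has no dependencies; in practice it would come from a parser such as Esprima.

```javascript
// Hypothetical helper: evaluate a small subset of expression nodes against `scope`.
function evaluate(node, scope) {
    switch (node.type) {
        case 'Literal':
            return node.value;
        case 'Identifier':
            if (!Object.prototype.hasOwnProperty.call(scope, node.name)) {
                throw new ReferenceError(`${node.name} is not defined`);
            }
            return scope[node.name];
        case 'UnaryExpression':
            if (node.operator === '-') return -evaluate(node.argument, scope);
            throw new SyntaxError(`Unsupported unary operator ${node.operator}`);
        case 'BinaryExpression': {
            const left = evaluate(node.left, scope);
            const right = evaluate(node.right, scope);
            switch (node.operator) {
                case '+': return left + right;
                case '-': return left - right;
                case '*': return left * right;
                case '/': return left / right;
                default:
                    throw new SyntaxError(`Unsupported binary operator ${node.operator}`);
            }
        }
        default:
            throw new SyntaxError(`Unsupported node type ${node.type}`);
    }
}

// Hand-written tree for the expression `a + b * 2`.
const expr = {
    type: 'BinaryExpression',
    operator: '+',
    left: { type: 'Identifier', name: 'a' },
    right: {
        type: 'BinaryExpression',
        operator: '*',
        left: { type: 'Identifier', name: 'b' },
        right: { type: 'Literal', value: 2, raw: '2' }
    }
};

const result = evaluate(expr, { a: 1, b: 3 });
console.log(result); // 7
```

Rejecting every node type that is not explicitly handled is a deliberate choice here: unlike `eval`, the evaluator can only do what its `switch` allows.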

Further reading

AST libraries for other languages
Natural language processing (NLP): syntactic parsing