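Template Literals
Background
1. In V8, source code goes through lexical analysis (the scanner) and syntax analysis (the parser and pre-parser) to produce an AST. After syntax analysis, the interpreter's bytecode generator (BytecodeGenerator) turns the AST into bytecode.
2. Lexical analysis is part of V8's compilation phase. In the source, the Initialize method of the Scanner class starts scanning the first token.
3. Lexical analysis yields tokens, and the parser performs syntax analysis on those tokens to build the AST.
4. BytecodeGenerator converts the AST into bytecode.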
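Lexical analysis: scanner
1. During lexical analysis, the scanner reads the JavaScript source one token at a time (method ScanSingleToken). When the current character, a backtick, maps to TEMPLATE_SPAN in ScanSingleToken, the scanner calls ScanTemplateSpan(). The corresponding code: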
V8_INLINE Token::Value Scanner::ScanSingleToken() {
  Token::Value token;
  do {
    next().location.beg_pos = source_pos();
    if (V8_LIKELY(static_cast<unsigned>(c0_) <= kMaxAscii)) {
      token = one_char_tokens[c0_];
      switch (token) {
        case Token::LPAREN:
        // ... omitted ...
        case Token::TEMPLATE_SPAN:
          Advance();
          return ScanTemplateSpan();
        // ... omitted ...
        default:
          UNREACHABLE();
      }
    }
    // Continue scanning for tokens as long as we're just skipping whitespace.
  } while (token == Token::WHITESPACE);
  return token;
}
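The lookup `token = one_char_tokens[c0_]` is a table-driven dispatch: the character code indexes a precomputed table, and the resulting token decides which scanning routine runs next. The JavaScript sketch below is a hypothetical miniature of that idea, not V8's code. The table contents and the names `ONE_CHAR_TOKENS`, `classify`, and `firstToken` are assumptions made for illustration.

```javascript
// Hypothetical miniature of a per-character token table (ASCII only).
// The real V8 table covers many more characters; this is only a sketch.
const ONE_CHAR_TOKENS = new Array(128).fill("ILLEGAL");
for (const [ch, token] of [
  ["(", "LPAREN"],
  [")", "RPAREN"],
  ["{", "LBRACE"],
  ["}", "RBRACE"],
  [" ", "WHITESPACE"],
  ["\t", "WHITESPACE"],
  ["\n", "WHITESPACE"],
  ["`", "TEMPLATE_SPAN"], // a backtick opens a template literal
]) {
  ONE_CHAR_TOKENS[ch.charCodeAt(0)] = token;
}

// Map one character to a token name via the table.
function classify(ch) {
  const code = ch.charCodeAt(0);
  return code < 128 ? ONE_CHAR_TOKENS[code] : "NON_ASCII";
}

// Mirrors the do/while loop: keep scanning while only whitespace is seen.
function firstToken(source) {
  let pos = 0;
  let token;
  do {
    if (pos >= source.length) return { token: "EOS", pos };
    token = classify(source[pos]);
    pos += 1;
  } while (token === "WHITESPACE");
  return { token, pos: pos - 1 };
}

console.log(firstToken("  `hello`")); // token TEMPLATE_SPAN at position 2
```

In this sketch the token is only reported. In V8 the `TEMPLATE_SPAN` case hands control to `ScanTemplateSpan()`, as the excerpt above shows.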
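ScanTemplateSpan scans character by character and produces one of two tokens: TEMPLATE_SPAN (starts with ` or } and ends with ${) or TEMPLATE_TAIL (starts with ` or } and ends with `). It records the positions of the characters in the template literal and also handles special characters such as backslash escapes and line terminators.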
Token::Value Scanner::ScanTemplateSpan() {
  // When scanning a TemplateSpan, we are looking for the following construct:
  // TEMPLATE_SPAN ::
  // ` LiteralChars* ${
  // | } LiteralChars* ${
  //
  // TEMPLATE_TAIL ::
  // ` LiteralChars* `
  // | } LiteralChar* `
  //
  // A TEMPLATE_SPAN should always be followed by an Expression, while a
  // TEMPLATE_TAIL terminates a TemplateLiteral and does not need to be
  // followed by an Expression.
  // These scoped helpers save and restore the original error state, so that we
  // can specially treat invalid escape sequences in templates (which are
  // handled by the parser).
  ErrorState scanner_error_state(&scanner_error_, &scanner_error_location_);
  ErrorState octal_error_state(&octal_message_, &octal_pos_);
  Token::Value result = Token::TEMPLATE_SPAN;
  next().literal_chars.Start();
  next().raw_literal_chars.Start();
  const bool capture_raw = true;
  while (true) {
    base::uc32 c = c0_;
    if (c == '`') {
      Advance();  // Consume '`'
      result = Token::TEMPLATE_TAIL;
      break;
    } else if (c == '$' && Peek() == '{') {
      Advance();  // Consume '$'
      Advance();  // Consume '{'
      break;
    } else if (c == '\\') {
      Advance();  // Consume '\\'
      DCHECK(!unibrow::IsLineTerminator(kEndOfInput));
      if (capture_raw) AddRawLiteralChar('\\');
      if (unibrow::IsLineTerminator(c0_)) {
        // The TV of LineContinuation :: \ LineTerminatorSequence is the empty
        // code unit sequence.
        base::uc32 lastChar = c0_;
        Advance();
        if (lastChar == '\r') {
          // Also skip \n.
          if (c0_ == '\n') Advance();
          lastChar = '\n';
        }
        if (capture_raw) AddRawLiteralChar(lastChar);
      } else {
        bool success = ScanEscape<capture_raw>();
        USE(success);
        DCHECK_EQ(!success, has_error());
        // For templates, invalid escape sequence checking is handled in the
        // parser.
        scanner_error_state.MoveErrorTo(next_);
        octal_error_state.MoveErrorTo(next_);
      }
    } else if (c == kEndOfInput) {
      // Unterminated template literal
      break;
    } else {
      Advance();  // Consume c.
      // The TRV of LineTerminatorSequence :: <CR> is the CV 0x000A.
      // The TRV of LineTerminatorSequence :: <CR><LF> is the sequence
      // consisting of the CV 0x000A.
      if (c == '\r') {
        if (c0_ == '\n') Advance();  // Consume '\n'
        c = '\n';
      }
      if (capture_raw) AddRawLiteralChar(c);
      AddLiteralChar(c);
    }
  }
  next().location.end_pos = source_pos();
  next().token = result;
  return result;
}
The above is how the tokens of a template literal are obtained.
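To make the token sequence concrete, here is a simplified JavaScript model of the scanning loop. It is a sketch under stated assumptions, not a port of V8: the function names `scanTemplateSpan` and `scanTemplate` are hypothetical, only a few escapes (`\n`, `\t`, `\r`) are cooked, only CR and LF count as line terminators, and substitutions are skipped by counting braces rather than being parsed as expressions.

```javascript
// Escapes this toy knows how to cook; every other escaped char is kept as-is.
const SIMPLE_ESCAPES = new Map([["n", "\n"], ["t", "\t"], ["r", "\r"]]);

// Scan from `start` up to the next "${" (TEMPLATE_SPAN) or "`" (TEMPLATE_TAIL).
function scanTemplateSpan(source, start) {
  let i = start;
  let cooked = "";
  let raw = "";
  while (i < source.length) {
    let c = source[i];
    if (c === "`") return { token: "TEMPLATE_TAIL", cooked, raw, end: i + 1 };
    if (c === "$" && source[i + 1] === "{") {
      return { token: "TEMPLATE_SPAN", cooked, raw, end: i + 2 };
    }
    if (c === "\\" && i + 1 < source.length) {
      raw += "\\";
      const next = source[i + 1];
      i += 2;
      if (next === "\r" || next === "\n") {
        // Line continuation: nothing is cooked, raw records a single LF.
        if (next === "\r" && source[i] === "\n") i += 1;
        raw += "\n";
      } else {
        raw += next;
        cooked += SIMPLE_ESCAPES.has(next) ? SIMPLE_ESCAPES.get(next) : next;
      }
      continue;
    }
    i += 1;
    if (c === "\r") {
      if (source[i] === "\n") i += 1; // CR LF collapses to one LF
      c = "\n";
    }
    raw += c;
    cooked += c;
  }
  throw new SyntaxError("unterminated template literal");
}

// Split a whole template literal into tokens plus the substitution source text.
function scanTemplate(source) {
  if (source[0] !== "`") throw new SyntaxError("expected opening backtick");
  const tokens = [];
  const expressions = [];
  let i = 1;
  while (true) {
    const span = scanTemplateSpan(source, i);
    tokens.push({ token: span.token, cooked: span.cooked, raw: span.raw });
    i = span.end;
    if (span.token === "TEMPLATE_TAIL") return { tokens, expressions };
    let depth = 1;
    let expr = "";
    while (depth > 0) {
      if (i >= source.length) throw new SyntaxError("unterminated substitution");
      const c = source[i++];
      if (c === "{") depth += 1;
      if (c === "}") depth -= 1;
      if (depth > 0) expr += c;
    }
    expressions.push(expr);
  }
}

console.log(scanTemplate("`foo${a}bar${b}baz`"));
```

For the input above the model yields two TEMPLATE_SPAN tokens (`foo`, `bar`) followed by one TEMPLATE_TAIL (`baz`), with `a` and `b` as the substitutions. In V8 the parser, not the scanner, consumes the expression between `${` and `}`.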
Syntax analysis: parser
2. The parser. The Parser class uses overloading; the methods related to template literals are as follows:
Parser::TemplateLiteralState Parser::OpenTemplateLiteral(int pos) {
  return zone()->New<TemplateLiteral>(zone(), pos);
}
void Parser::AddTemplateSpan(TemplateLiteralState* state, bool should_cook,
                             bool tail) {
  int end = scanner()->location().end_pos - (tail ? 1 : 2);
  const AstRawString* raw = scanner()->CurrentRawSymbol(ast_value_factory());
  if (should_cook) {
    const AstRawString* cooked = scanner()->CurrentSymbol(ast_value_factory());
    (*state)->AddTemplateSpan(cooked, raw, end, zone());
  } else {
    (*state)->AddTemplateSpan(nullptr, raw, end, zone());
  }
}
void Parser::AddTemplateExpression(TemplateLiteralState* state,
                                   Expression* expression) {
  (*state)->AddExpression(expression, zone());
}
Expression* Parser::CloseTemplateLiteral(TemplateLiteralState* state, int start,
                                         Expression* tag) {
  TemplateLiteral* lit = *state;
  int pos = lit->position();
  const ZonePtrList<const AstRawString>* cooked_strings = lit->cooked();
  const ZonePtrList<const AstRawString>* raw_strings = lit->raw();
  const ZonePtrList<Expression>* expressions = lit->expressions();
  DCHECK_EQ(cooked_strings->length(), raw_strings->length());
  DCHECK_EQ(cooked_strings->length(), expressions->length() + 1);
  if (!tag) {
    if (cooked_strings->length() == 1) {
      return factory()->NewStringLiteral(cooked_strings->first(), pos);
    }
    return factory()->NewTemplateLiteral(cooked_strings, expressions, pos);
  } else {
    // GetTemplateObject
    Expression* template_object =
        factory()->NewGetTemplateObject(cooked_strings, raw_strings, pos);
    // Call TagFn
    ScopedPtrList<Expression> call_args(pointer_buffer());
    call_args.Add(template_object);
    call_args.AddAll(expressions->ToConstVector());
    return factory()->NewTaggedTemplate(tag, call_args, pos);
  }
}
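The effect of CloseTemplateLiteral can be observed from JavaScript itself. The sketch below uses a hypothetical tag function named `tag` to show what a tagged call receives: the template object (cooked strings with a `raw` property) followed by the substitution values, with one more string part than substitutions, matching the DCHECK above. The reading that an `undefined` cooked entry corresponds to `should_cook` being false is an assumption about how the two relate; the JavaScript behaviour shown is standard.

```javascript
// A tag function receives the template object first, then the substitutions.
function tag(strings, ...values) {
  return { strings, values };
}

const a = 1;
const b = 2;
const result = tag`x${a}y${b}z`;

console.log(result.strings);     // cooked parts: x, y, z
console.log(result.strings.raw); // raw parts: x, y, z
console.log(result.values);      // substitution values: 1, 2

// There is always exactly one more string part than there are substitutions.
console.log(result.strings.length === result.values.length + 1); // true

// In a tagged template an invalid escape leaves the cooked entry undefined,
// while the raw text is still available.
const bad = tag`\unicode`;
console.log(bad.strings[0]);     // undefined
console.log(bad.strings.raw[0]); // \unicode

// The template object is created once per call site and then reused.
function site() {
  return tag`p${0}q`.strings;
}
console.log(site() === site()); // true

// An untagged template without substitutions is just a string.
console.log(typeof `plain`); // string
```

The last line matches the branch that returns NewStringLiteral when there is a single cooked string and no tag.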
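Bytecode generation: BytecodeGenerator
AstVisitor is the class that visits the abstract syntax tree (it is designed around the Visitor design pattern). It handles TemplateLiteral nodes through an overloaded visit method, VisitTemplateLiteral, shown below, which emits the bytecode for the node. Two variables deserve attention: substitutions, the array of placeholder expressions, that is, the variables inside the TemplateLiteral; and string_parts, the parts of the TemplateLiteral that need no substitution.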
void BytecodeGenerator::VisitTemplateLiteral(TemplateLiteral* expr) {
  const ZonePtrList<const AstRawString>& parts = *expr->string_parts();
  const ZonePtrList<Expression>& substitutions = *expr->substitutions();
  // Template strings with no substitutions are turned into StringLiterals.
  DCHECK_GT(substitutions.length(), 0);
  DCHECK_EQ(parts.length(), substitutions.length() + 1);
  // Generate string concatenation
  // TODO(caitp): Don't generate feedback slot if it's not used --- introduce
  // a simple, concise, reusable mechanism to lazily create reusable slots.
  FeedbackSlot slot = feedback_spec()->AddBinaryOpICSlot();
  Register last_part = register_allocator()->NewRegister();
  bool last_part_valid = false;
  builder()->SetExpressionPosition(expr);
  for (int i = 0; i < substitutions.length(); ++i) {
    if (i != 0) {
      builder()->StoreAccumulatorInRegister(last_part);
      last_part_valid = true;
    }
    if (!parts[i]->IsEmpty()) {
      builder()->LoadLiteral(parts[i]);
      if (last_part_valid) {
        builder()->BinaryOperation(Token::ADD, last_part, feedback_index(slot));
      }
      builder()->StoreAccumulatorInRegister(last_part);
      last_part_valid = true;
    }
    TypeHint type_hint = VisitForAccumulatorValue(substitutions[i]);
    if (type_hint != TypeHint::kString) {
      builder()->ToString();
    }
    if (last_part_valid) {
      builder()->BinaryOperation(Token::ADD, last_part, feedback_index(slot));
    }
    last_part_valid = false;
  }
  if (!parts.last()->IsEmpty()) {
    builder()->StoreAccumulatorInRegister(last_part);
    builder()->LoadLiteral(parts.last());
    builder()->BinaryOperation(Token::ADD, last_part, feedback_index(slot));
  }
}
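The loop in VisitTemplateLiteral is easier to follow when it is replayed on a small accumulator machine. The JavaScript sketch below mirrors the control flow of the C++ above, but everything else is an assumption made for illustration: the function names `emitTemplate` and `run`, the instruction names, the single register `r0`, and the use of `String()` as a stand-in for the ToString operation. Feedback slots and type hints are left out, so this toy always emits ToString.

```javascript
// Toy re-creation of the loop in VisitTemplateLiteral. The instruction names
// are simplified stand-ins, not V8's real bytecode format.
function emitTemplate(parts, substitutionCount) {
  if (substitutionCount <= 0 || parts.length !== substitutionCount + 1) {
    throw new RangeError("need n > 0 substitutions and n + 1 string parts");
  }
  const code = [];
  let lastPartValid = false;
  for (let i = 0; i < substitutionCount; ++i) {
    if (i !== 0) {
      code.push({ op: "Star" }); // save the accumulator into register r0
      lastPartValid = true;
    }
    if (parts[i] !== "") {
      code.push({ op: "LdaConstant", value: parts[i] });
      if (lastPartValid) code.push({ op: "Add" });
      code.push({ op: "Star" });
      lastPartValid = true;
    }
    code.push({ op: "Eval", index: i }); // stands in for visiting the expression
    code.push({ op: "ToString" });
    if (lastPartValid) code.push({ op: "Add" });
    lastPartValid = false;
  }
  const last = parts[parts.length - 1];
  if (last !== "") {
    code.push({ op: "Star" });
    code.push({ op: "LdaConstant", value: last });
    code.push({ op: "Add" });
  }
  return code;
}

// Execute the toy instructions with one accumulator and one register.
function run(code, values) {
  let acc;
  let r0;
  for (const ins of code) {
    switch (ins.op) {
      case "LdaConstant": acc = ins.value; break;
      case "Eval": acc = values[ins.index]; break;
      case "ToString": acc = String(acc); break; // approximation of ToString
      case "Star": r0 = acc; break;
      case "Add": acc = r0 + acc; break; // register + accumulator
      default: throw new Error(`unknown op ${ins.op}`);
    }
  }
  return acc;
}

// Models `foo${a}bar${b}baz${1}` with a = 1 and b = 2.
const code = emitTemplate(["foo", "bar", "baz", ""], 3);
console.log(run(code, [1, 2, 1])); // foo1bar2baz1
```

Empty string parts produce no load at all, which is why `${a}${b}string` needs fewer instructions than `foo${a}bar${b}baz${1}` in this model.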
Test case
At this point V8's work is done. The following code can also be found in V8's test cases (the test folder in the V8 source tree).
TEST(TemplateLiterals) {
  InitializedIgnitionHandleScope scope;
  BytecodeExpectationsPrinter printer(CcTest::isolate());
  const char* snippets[] = {
      "var a = 1;\n"
      "var b = 2;\n"
      "return `${a}${b}string`;\n",
      "var a = 1;\n"
      "var b = 2;\n"
      "return `string${a}${b}`;\n",
      "var a = 1;\n"
      "var b = 2;\n"
      "return `${a}string${b}`;\n",
      "var a = 1;\n"
      "var b = 2;\n"
      "return `foo${a}bar${b}baz${1}`;\n",
      "var a = 1;\n"
      "var b = 2;\n"
      "return `${a}string` + `string${b}`;\n",
      "var a = 1;\n"
      "var b = 2;\n"
      "function foo(a, b) { };\n"
      "return `string${foo(a, b)}${a}${b}`;\n",
  };
  CHECK(CompareTexts(BuildActual(printer, snippets),
                     LoadGolden("TemplateLiterals.golden")));
}
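The V8 test compares generated bytecode against a golden file, so it does not show what the snippets evaluate to. As a complement, the JavaScript sketch below runs the same six snippet bodies through `new Function` and collects their return values. Wrapping each body in a function is an assumption made here so that the top-level `return` is legal; it is not how the V8 test harness is stated to work.

```javascript
// The six snippet bodies from TEST(TemplateLiterals), as plain strings.
const snippets = [
  "var a = 1;\nvar b = 2;\nreturn `${a}${b}string`;\n",
  "var a = 1;\nvar b = 2;\nreturn `string${a}${b}`;\n",
  "var a = 1;\nvar b = 2;\nreturn `${a}string${b}`;\n",
  "var a = 1;\nvar b = 2;\nreturn `foo${a}bar${b}baz${1}`;\n",
  "var a = 1;\nvar b = 2;\nreturn `${a}string` + `string${b}`;\n",
  "var a = 1;\nvar b = 2;\nfunction foo(a, b) { };\nreturn `string${foo(a, b)}${a}${b}`;\n",
];

// Each body becomes the body of a fresh function, which is then called.
const results = snippets.map((body) => new Function(body)());

for (const value of results) {
  console.log(value);
}
// 12string
// string12
// 1string2
// foo1bar2baz1
// 1stringstring2
// stringundefined12
```

The last snippet yields `stringundefined12` because `foo` has an empty body and returns `undefined`, which the template converts to the string `undefined`.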
Bytecode generation function: BytecodeGenerator produces the final bytecode in GenerateBytecodeBody.