V8: the code behind template literals


Template literals

Background

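1. Inside V8, source code goes through lexical analysis (the scanner) and syntax analysis (the parser plus the pre-parser) to produce an AST. After syntax analysis, the bytecode generator (BytecodeGenerator) walks the AST and emits bytecode, which the Ignition interpreter then executes.

2. Lexical analysis (the scanner) is part of V8's compilation phase. In the source, the Initialize method of the Scanner class starts the scan of the first token.

3. Lexical analysis yields tokens; the parser then performs syntax analysis on those tokens and produces the AST.

4. BytecodeGenerator turns the AST into bytecode.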

Lexical analysis: the scanner

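1. Scanner. During lexical analysis the scanner reads the JavaScript source one token at a time (method ScanSingleToken). When the one-character table lookup in ScanSingleToken yields TEMPLATE_SPAN, which is the entry for the backtick character, it calls ScanTemplateSpan(). The relevant code: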

V8_INLINE Token::Value Scanner::ScanSingleToken() {
  Token::Value token;
  do {
    next().location.beg_pos = source_pos();

    if (V8_LIKELY(static_cast<unsigned>(c0_) <= kMaxAscii)) {
      token = one_char_tokens[c0_];

      switch (token) {
        case Token::LPAREN:
        // ... other cases omitted ...
        case Token::TEMPLATE_SPAN:
          Advance();
          return ScanTemplateSpan();
        // ... other cases omitted ...
        default:
          UNREACHABLE();
      }
    }
    // Continue scanning for tokens as long as we're just skipping whitespace.
  } while (token == Token::WHITESPACE);

  return token;
}

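ScanTemplateSpan scans character by character and produces either a TEMPLATE_TAIL (text that starts with ` or } and ends with `) or a TEMPLATE_SPAN (text that starts with ` or } and ends with ${). It records where the span ends in the source, collects both the cooked and the raw characters, and handles special cases such as backslash escapes, line continuations, and \r or \r\n line terminators.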

Token::Value Scanner::ScanTemplateSpan() {
  // When scanning a TemplateSpan, we are looking for the following construct:
  // TEMPLATE_SPAN ::
  //     ` LiteralChars* ${
  //   | } LiteralChars* ${
  //
  // TEMPLATE_TAIL ::
  //     ` LiteralChars* `
  //   | } LiteralChar* `
  //
  // A TEMPLATE_SPAN should always be followed by an Expression, while a
  // TEMPLATE_TAIL terminates a TemplateLiteral and does not need to be
  // followed by an Expression.

  // These scoped helpers save and restore the original error state, so that we
  // can specially treat invalid escape sequences in templates (which are
  // handled by the parser).
  ErrorState scanner_error_state(&scanner_error_, &scanner_error_location_);
  ErrorState octal_error_state(&octal_message_, &octal_pos_);

  Token::Value result = Token::TEMPLATE_SPAN;
  next().literal_chars.Start();
  next().raw_literal_chars.Start();
  const bool capture_raw = true;
  while (true) {
    base::uc32 c = c0_;
    if (c == '`') {
      Advance();  // Consume '`'
      result = Token::TEMPLATE_TAIL;
      break;
    } else if (c == '$' && Peek() == '{') {
      Advance();  // Consume '$'
      Advance();  // Consume '{'
      break;
    } else if (c == '\\') {
      Advance();  // Consume '\\'
      DCHECK(!unibrow::IsLineTerminator(kEndOfInput));
      if (capture_raw) AddRawLiteralChar('\\');
      if (unibrow::IsLineTerminator(c0_)) {
        // The TV of LineContinuation :: \ LineTerminatorSequence is the empty
        // code unit sequence.
        base::uc32 lastChar = c0_;
        Advance();
        if (lastChar == '\r') {
          // Also skip \n.
          if (c0_ == '\n') Advance();
          lastChar = '\n';
        }
        if (capture_raw) AddRawLiteralChar(lastChar);
      } else {
        bool success = ScanEscape<capture_raw>();
        USE(success);
        DCHECK_EQ(!success, has_error());
        // For templates, invalid escape sequence checking is handled in the
        // parser.
        scanner_error_state.MoveErrorTo(next_);
        octal_error_state.MoveErrorTo(next_);
      }
    } else if (c == kEndOfInput) {
      // Unterminated template literal
      break;
    } else {
      Advance();  // Consume c.
      // The TRV of LineTerminatorSequence :: <CR> is the CV 0x000A.
      // The TRV of LineTerminatorSequence :: <CR><LF> is the sequence
      // consisting of the CV 0x000A.
      if (c == '\r') {
        if (c0_ == '\n') Advance();  // Consume '\n'
        c = '\n';
      }
      if (capture_raw) AddRawLiteralChar(c);
      AddLiteralChar(c);
    }
  }
  next().location.end_pos = source_pos();
  next().token = result;

  return result;
}

Those are the steps that produce the tokens of a template literal.
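
To make the loop easier to follow outside of V8, here is a minimal standalone sketch of the same idea. It is not V8's code: TemplateToken, ScannedSpan, and ScanTemplateSpanSketch are hypothetical names, the input is a plain std::string instead of V8's character stream, and only a few single-character escapes are cooked (V8's ScanEscape handles many more, and defers invalid escapes to the parser).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical, simplified stand-ins for V8's token kinds.
enum class TemplateToken { kSpan, kTail, kUnterminated };

struct ScannedSpan {
  TemplateToken token;
  std::string cooked;  // escape sequences interpreted
  std::string raw;     // source text, with CR and CRLF normalized to LF
  std::size_t end;     // index just past the last consumed character
};

// Scans one span starting at `pos`, which must point just past the opening
// '`' or just past the '}' that closed the previous substitution.
ScannedSpan ScanTemplateSpanSketch(const std::string& src, std::size_t pos) {
  assert(pos <= src.size());
  ScannedSpan out{TemplateToken::kUnterminated, "", "", pos};
  std::size_t i = pos;
  while (i < src.size()) {
    char c = src[i];
    if (c == '`') {  // closing backtick: this span is a TEMPLATE_TAIL
      out.token = TemplateToken::kTail;
      ++i;
      break;
    }
    if (c == '$' && i + 1 < src.size() && src[i + 1] == '{') {
      out.token = TemplateToken::kSpan;  // an expression follows
      i += 2;
      break;
    }
    if (c == '\\' && i + 1 < src.size()) {
      char e = src[i + 1];
      i += 2;
      out.raw += '\\';
      if (e == '\r' || e == '\n') {
        // Line continuation: adds nothing to the cooked string.
        if (e == '\r' && i < src.size() && src[i] == '\n') ++i;
        out.raw += '\n';
      } else {
        out.raw += e;
        // Simplification: only \n and \t are translated; any other
        // escaped character is copied through unchanged.
        out.cooked += (e == 'n') ? '\n' : (e == 't') ? '\t' : e;
      }
      continue;
    }
    ++i;  // ordinary character
    if (c == '\r') {
      if (i < src.size() && src[i] == '\n') ++i;
      c = '\n';
    }
    out.raw += c;
    out.cooked += c;
  }
  out.end = i;
  return out;
}
```

Calling the sketch once per span (first just past the backtick, then just past each closing brace) gives the same alternation of TEMPLATE_SPAN and TEMPLATE_TAIL that the parser consumes.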

Syntax analysis: the parser

2. Parser. The Parser class implements a set of hooks that the shared parsing logic calls into. The ones related to template literals are:

Parser::TemplateLiteralState Parser::OpenTemplateLiteral(int pos) {
  return zone()->New<TemplateLiteral>(zone(), pos);
}

void Parser::AddTemplateSpan(TemplateLiteralState* state, bool should_cook,
                             bool tail) {
  int end = scanner()->location().end_pos - (tail ? 1 : 2);
  const AstRawString* raw = scanner()->CurrentRawSymbol(ast_value_factory());
  if (should_cook) {
    const AstRawString* cooked = scanner()->CurrentSymbol(ast_value_factory());
    (*state)->AddTemplateSpan(cooked, raw, end, zone());
  } else {
    (*state)->AddTemplateSpan(nullptr, raw, end, zone());
  }
}

void Parser::AddTemplateExpression(TemplateLiteralState* state,
                                   Expression* expression) {
  (*state)->AddExpression(expression, zone());
}

Expression* Parser::CloseTemplateLiteral(TemplateLiteralState* state, int start,
                                         Expression* tag) {
  TemplateLiteral* lit = *state;
  int pos = lit->position();
  const ZonePtrList<const AstRawString>* cooked_strings = lit->cooked();
  const ZonePtrList<const AstRawString>* raw_strings = lit->raw();
  const ZonePtrList<Expression>* expressions = lit->expressions();
  DCHECK_EQ(cooked_strings->length(), raw_strings->length());
  DCHECK_EQ(cooked_strings->length(), expressions->length() + 1);

  if (!tag) {
    if (cooked_strings->length() == 1) {
      return factory()->NewStringLiteral(cooked_strings->first(), pos);
    }
    return factory()->NewTemplateLiteral(cooked_strings, expressions, pos);
  } else {
    // GetTemplateObject
    Expression* template_object =
        factory()->NewGetTemplateObject(cooked_strings, raw_strings, pos);

    // Call TagFn
    ScopedPtrList<Expression> call_args(pointer_buffer());
    call_args.Add(template_object);
    call_args.AddAll(expressions->ToConstVector());
    return factory()->NewTaggedTemplate(tag, call_args, pos);
  }
}
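
The decision made in CloseTemplateLiteral can be sketched in isolation. The following is a simplified model under stated assumptions, not V8's implementation: TemplateStateSketch, NodeKind, AddSpan, AddExpression, and CloseTemplate are hypothetical names, expressions are kept as plain source strings, and the result is just an enum naming the kind of AST node that would be created.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the parser's template literal state.
struct TemplateStateSketch {
  std::vector<std::optional<std::string>> cooked;  // nullopt: not cooked
  std::vector<std::string> raw;
  std::vector<std::string> expressions;  // source text, for illustration
};

enum class NodeKind { kStringLiteral, kTemplateLiteral, kTaggedTemplate };

// Mirrors AddTemplateSpan: the raw string is always stored, the cooked
// string only when should_cook is true.
void AddSpan(TemplateStateSketch& state, const std::string& cooked,
             const std::string& raw, bool should_cook) {
  if (should_cook) {
    state.cooked.push_back(cooked);
  } else {
    state.cooked.push_back(std::nullopt);
  }
  state.raw.push_back(raw);
}

// Mirrors AddTemplateExpression.
void AddExpression(TemplateStateSketch& state, const std::string& expr) {
  state.expressions.push_back(expr);
}

// Mirrors the branching in CloseTemplateLiteral.
NodeKind CloseTemplate(const TemplateStateSketch& state, bool has_tag) {
  // Same invariants as the DCHECKs: one more string than expressions.
  assert(state.cooked.size() == state.raw.size());
  assert(state.cooked.size() == state.expressions.size() + 1);
  if (has_tag) return NodeKind::kTaggedTemplate;
  // No substitutions at all: the template is just a string literal.
  if (state.cooked.size() == 1) return NodeKind::kStringLiteral;
  return NodeKind::kTemplateLiteral;
}
```

In this model an untagged template with no substitutions collapses to a string literal, which matches the comment in the bytecode generator that templates without substitutions never reach VisitTemplateLiteral.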

Bytecode generation: BytecodeGenerator

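BytecodeGenerator is an AstVisitor (the AST visitor class, built on the Visitor design pattern) with one Visit method per node type. TemplateLiteral nodes are handled by VisitTemplateLiteral, shown below, which emits the bytecode for the string concatenation. Two variables deserve attention: substitutions holds the expressions inside the ${...} placeholders of the TemplateLiteral, and parts (taken from string_parts) holds the static string pieces that need no substitution.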

void BytecodeGenerator::VisitTemplateLiteral(TemplateLiteral* expr) {
  const ZonePtrList<const AstRawString>& parts = *expr->string_parts();
  const ZonePtrList<Expression>& substitutions = *expr->substitutions();
  // Template strings with no substitutions are turned into StringLiterals.
  DCHECK_GT(substitutions.length(), 0);
  DCHECK_EQ(parts.length(), substitutions.length() + 1);

  // Generate string concatenation
  // TODO(caitp): Don't generate feedback slot if it's not used --- introduce
  // a simple, concise, reusable mechanism to lazily create reusable slots.
  FeedbackSlot slot = feedback_spec()->AddBinaryOpICSlot();
  Register last_part = register_allocator()->NewRegister();
  bool last_part_valid = false;

  builder()->SetExpressionPosition(expr);
  for (int i = 0; i < substitutions.length(); ++i) {
    if (i != 0) {
      builder()->StoreAccumulatorInRegister(last_part);
      last_part_valid = true;
    }

    if (!parts[i]->IsEmpty()) {
      builder()->LoadLiteral(parts[i]);
      if (last_part_valid) {
        builder()->BinaryOperation(Token::ADD, last_part, feedback_index(slot));
      }
      builder()->StoreAccumulatorInRegister(last_part);
      last_part_valid = true;
    }

    TypeHint type_hint = VisitForAccumulatorValue(substitutions[i]);
    if (type_hint != TypeHint::kString) {
      builder()->ToString();
    }
    if (last_part_valid) {
      builder()->BinaryOperation(Token::ADD, last_part, feedback_index(slot));
    }
    last_part_valid = false;
  }

  if (!parts.last()->IsEmpty()) {
    builder()->StoreAccumulatorInRegister(last_part);
    builder()->LoadLiteral(parts.last());
    builder()->BinaryOperation(Token::ADD, last_part, feedback_index(slot));
  }
}
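
The control flow of VisitTemplateLiteral is easier to trace with a toy builder that records instruction names instead of emitting real bytecode. The sketch below is an illustration under assumptions: LowerTemplateSketch is a hypothetical function, the instruction strings are only loosely modeled on Ignition mnemonics, "Eval" stands in for visiting the substitution expression, a single register r0 plays the role of last_part, and ToString is always emitted (V8 skips it when the type hint says the value is already a string).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Records pseudo-instructions in the order the real visitor would emit
// them. "Add r0" means: accumulator = r0 + accumulator.
std::vector<std::string> LowerTemplateSketch(
    const std::vector<std::string>& parts,
    const std::vector<std::string>& substitutions) {
  // Same invariants as the DCHECKs in VisitTemplateLiteral.
  assert(!substitutions.empty());
  assert(parts.size() == substitutions.size() + 1);

  std::vector<std::string> ops;
  bool last_part_valid = false;
  for (std::size_t i = 0; i < substitutions.size(); ++i) {
    if (i != 0) {
      // Save the string built so far before loading anything new.
      ops.push_back("Star r0");
      last_part_valid = true;
    }
    if (!parts[i].empty()) {
      ops.push_back("LdaConstant \"" + parts[i] + "\"");
      if (last_part_valid) ops.push_back("Add r0");
      ops.push_back("Star r0");
      last_part_valid = true;
    }
    ops.push_back("Eval " + substitutions[i]);
    ops.push_back("ToString");
    if (last_part_valid) ops.push_back("Add r0");
    last_part_valid = false;
  }
  if (!parts.back().empty()) {
    ops.push_back("Star r0");
    ops.push_back("LdaConstant \"" + parts.back() + "\"");
    ops.push_back("Add r0");
  }
  return ops;
}
```

For parts {"", "", "string"} and substitutions {"a", "b"}, which corresponds to `${a}${b}string`, the sketch records nine pseudo-instructions: Eval a, ToString, Star r0, Eval b, ToString, Add r0, Star r0, LdaConstant "string", Add r0. Empty string parts produce no load at all, which is why the leading and middle pieces vanish from the sequence.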


Test cases

That completes V8's work for template literals. The following code can also be found among V8's test cases (the test folder of the V8 source tree).

TEST(TemplateLiterals) {
  InitializedIgnitionHandleScope scope;
  BytecodeExpectationsPrinter printer(CcTest::isolate());

  const char* snippets[] = {
      "var a = 1;\n"
      "var b = 2;\n"
      "return `${a}${b}string`;\n",

      "var a = 1;\n"
      "var b = 2;\n"
      "return `string${a}${b}`;\n",

      "var a = 1;\n"
      "var b = 2;\n"
      "return `${a}string${b}`;\n",

      "var a = 1;\n"
      "var b = 2;\n"
      "return `foo${a}bar${b}baz${1}`;\n",

      "var a = 1;\n"
      "var b = 2;\n"
      "return `${a}string` + `string${b}`;\n",

      "var a = 1;\n"
      "var b = 2;\n"
      "function foo(a, b) { };\n"
      "return `string${foo(a, b)}${a}${b}`;\n",
  };

  CHECK(CompareTexts(BuildActual(printer, snippets),
                     LoadGolden("TemplateLiterals.golden")));
}
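
As a reminder of what these snippets evaluate to at runtime, the concatenation can be modeled as interleaving the static parts with the already-stringified substitution values. This is a minimal sketch with a hypothetical helper name (EvaluateTemplateSketch); it models the language semantics only and has nothing to do with how the golden file comparison works.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Interleaves static string parts with substitution values that have
// already been converted to strings.
std::string EvaluateTemplateSketch(const std::vector<std::string>& parts,
                                   const std::vector<std::string>& values) {
  // A template always has one more static part than substitutions.
  assert(parts.size() == values.size() + 1);
  std::string result = parts[0];
  for (std::size_t i = 0; i < values.size(); ++i) {
    result += values[i];
    result += parts[i + 1];
  }
  return result;
}
```

With a = 1 and b = 2, the snippet `foo${a}bar${b}baz${1}` has parts {"foo", "bar", "baz", ""} and values {"1", "2", "1"}, so the sketch returns "foo1bar2baz1".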

Bytecode-generating function: the final bytecode is produced by BytecodeGenerator, in GenerateBytecodeBody.
