How JavaScript works: inside the V8 engine + 5 tips on how to write optimized code

A couple of weeks ago we started a series aimed at digging deeper into JavaScript and how it actually works: we thought that by knowing the building blocks of JavaScript and how they come to play together you’ll be able to write better code and apps.

The first post of the series focused on providing an overview of the engine, the runtime and the call stack. This second post dives into the internals of Google’s V8 JavaScript engine. We’ll also provide a few quick tips on how to write better JavaScript code: best practices our development team at SessionStack follows when building the product.

Overview

A JavaScript engine is a program or an interpreter that executes JavaScript code. It can be implemented as a standard interpreter or as a just-in-time compiler that compiles JavaScript to bytecode in some form.

This is a list of popular projects that are implementing a JavaScript engine:

V8 — open source, developed by Google, written in C++
Rhino — managed by the Mozilla Foundation, open source, developed entirely in Java
SpiderMonkey — the first JavaScript engine, which back in the days powered Netscape Navigator, and today powers Firefox
JavaScriptCore — open source, marketed as Nitro and developed by Apple for Safari
KJS — KDE’s engine originally developed by Harri Porten for the KDE project’s Konqueror web browser
Chakra (JScript9) — Internet Explorer
Chakra (JavaScript) — Microsoft Edge
Nashorn — open source as part of OpenJDK, written by Oracle Java Languages and Tool Group
JerryScript — a lightweight engine for the Internet of Things

Why was the V8 Engine created?

The V8 engine, built by Google, is open source and written in C++. It is used inside Google Chrome. Unlike the rest of the engines, however, V8 is also used for the popular Node.js runtime.

V8 was first designed to increase the performance of JavaScript execution inside web browsers. To gain speed, V8 translates JavaScript code into more efficient machine code instead of using an interpreter. It compiles JavaScript into machine code at execution time by implementing a JIT (Just-In-Time) compiler, as many modern JavaScript engines such as SpiderMonkey or Rhino (Mozilla) do. The main difference is that V8, in the original pipeline described in the next section, doesn’t produce bytecode or any intermediate code.

V8 used to have two compilers

Before version 5.9 of V8 came out (released in 2017), the engine used two compilers:

full-codegen — a simple and very fast compiler that produced simple and relatively slow machine code.
Crankshaft — a more complex (Just-In-Time) optimizing compiler that produced highly-optimized code.

The V8 Engine also uses several threads internally:

The main thread does what you would expect: fetch your code, compile it and then execute it
There’s also a separate thread for compiling, so that the main thread can keep executing while the former is optimizing the code
A Profiler thread that will tell the runtime on which methods we spend a lot of time so that Crankshaft can optimize them
A few threads to handle Garbage Collector sweeps

When first executing JavaScript code, V8 uses full-codegen, which translates the parsed JavaScript directly into machine code without any transformation. This allows it to start executing machine code very quickly. Note that V8 does not use an intermediate bytecode representation here, which removes the need for an interpreter.

When your code has run for some time, the profiler thread has gathered enough data to tell which method should be optimized.

Next, Crankshaft optimizations begin in another thread. It translates the JavaScript abstract syntax tree to a high-level static single-assignment (SSA) representation called Hydrogen and tries to optimize that Hydrogen graph. Most optimizations are done at this level.

Inlining

The first optimization is inlining as much code as possible in advance. Inlining is the process of replacing a call site (the line of code where the function is called) with the body of the called function. This simple step allows following optimizations to be more meaningful.
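
To make the idea concrete, here is a hand-written sketch of what inlining amounts to. The function names are made up for illustration, and the second function is only an approximation of what an optimizing compiler would produce internally; V8 performs this on its own representation, not on your source.

```javascript
// A small helper that is called from a hot loop.
function square(n) {
  return n * n;
}

// Before inlining: every iteration pays for a call to square().
function sumOfSquares(values) {
  let total = 0;
  for (const v of values) {
    total += square(v); // call site
  }
  return total;
}

// After inlining (written by hand here): the call site has been
// replaced with the body of square(), so later optimizations can
// work on the loop as a whole.
function sumOfSquaresInlined(values) {
  let total = 0;
  for (const v of values) {
    total += v * v;
  }
  return total;
}

console.log(sumOfSquares([1, 2, 3]), sumOfSquaresInlined([1, 2, 3]));
```

Both functions compute the same result; the only difference is that the second one no longer contains a call.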

Hidden class

JavaScript is a prototype-based language: there are no classes and objects are created using a cloning process. JavaScript is also a dynamic programming language which means that properties can be easily added or removed from an object after its instantiation.

Most JavaScript interpreters use dictionary-like structures (hash function based) to store the location of object property values in the memory. This structure makes retrieving the value of a property in JavaScript more computationally expensive than it would be in a non-dynamic programming language like Java or C#. In Java, all of the object properties are determined by a fixed object layout before compilation and cannot be dynamically added or removed at runtime (well, C# has the dynamic type which is another topic). As a result, the values of properties (or pointers to those properties) can be stored as a continuous buffer in the memory with a fixed-offset between each. The length of an offset can easily be determined based on the property type, whereas this is not possible in JavaScript where a property type can change during runtime.

Since using dictionaries to find the location of object properties in the memory is very inefficient, V8 uses a different method instead: hidden classes. Hidden classes work similarly to the fixed object layouts (classes) used in languages like Java, except they are created at runtime. Now, let’s see what they actually look like:

function Point(x, y) {
    this.x = x;
    this.y = y;
}
var p1 = new Point(1, 2);

Once the “new Point(1, 2)” invocation happens, V8 will create a hidden class called “C0”. No properties have been defined for Point yet, so “C0” is empty.

Once the first statement “this.x = x” is executed (inside the “Point” function), V8 will create a second hidden class called “C1” that is based on “C0”. “C1” describes the location in the memory (relative to the object pointer) where the property x can be found. In this case, “x” is stored at offset 0, which means that when viewing a point object in the memory as a continuous buffer, the first offset will correspond to property “x”. V8 will also update “C0” with a “class transition” which states that if a property “x” is added to a point object, the hidden class should switch from “C0” to “C1”. The hidden class for the point object is now “C1”.

Every time a new property is added to an object, the old hidden class is updated with a transition path to the new hidden class. Hidden class transitions are important because they allow hidden classes to be shared among objects that are created the same way. If two objects share a hidden class and the same property is added to both of them, transitions will ensure that both objects receive the same new hidden class and all the optimized code that comes with it.

This process is repeated when the statement “this.y = y” is executed (again, inside the Point function, after the “this.x = x” statement).

A new hidden class called “C2” is created, a class transition is added to “C1” stating that if a property “y” is added to a Point object (that already contains property “x”) then the hidden class should change to “C2”, and the point object’s hidden class is updated to “C2”.

Hidden class transitions are dependent on the order in which properties are added to an object. Take a look at the code snippet below:

function Point(x, y) {
    this.x = x;
    this.y = y;
}
var p1 = new Point(1, 2);
p1.a = 5;
p1.b = 6;
var p2 = new Point(3, 4);
p2.b = 7;
p2.a = 8;

Now, you would assume that for both p1 and p2 the same hidden classes and transitions would be used. Well, not really. For “p1”, first the property “a” will be added and then the property “b”. For “p2”, however, first “b” is being assigned, followed by “a”. Thus, “p1” and “p2” end up with different hidden classes as a result of the different transition paths. In such cases, it’s much better to initialize dynamic properties in the same order so that the hidden classes can be reused.
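
The transition mechanism described above can be sketched as a small toy model. This is not V8's implementation (real hidden classes are internal C++ structures); the `HiddenClass` class and the `shapeOf` helper are hypothetical names used only to show why property order matters.

```javascript
// Toy model only: a hidden class maps property names to offsets and
// remembers which hidden class to move to when a property is added.
class HiddenClass {
  constructor(offsets = new Map()) {
    this.offsets = offsets;       // property name -> offset
    this.transitions = new Map(); // property name -> next HiddenClass
  }

  // Follow the existing transition for `name`, or create it once.
  addProperty(name) {
    if (!this.transitions.has(name)) {
      const nextOffsets = new Map(this.offsets);
      nextOffsets.set(name, this.offsets.size);
      this.transitions.set(name, new HiddenClass(nextOffsets));
    }
    return this.transitions.get(name);
  }
}

const C0 = new HiddenClass(); // the empty hidden class

// Returns the hidden class reached by adding properties in this order.
function shapeOf(propertyOrder) {
  return propertyOrder.reduce((hc, name) => hc.addProperty(name), C0);
}

const p1Shape = shapeOf(["x", "y", "a", "b"]);
const p2Shape = shapeOf(["x", "y", "b", "a"]);
const p3Shape = shapeOf(["x", "y", "a", "b"]);

console.log(p1Shape === p3Shape); // true: same order, shared hidden class
console.log(p1Shape === p2Shape); // false: different order, different class
console.log(p1Shape.offsets.get("a"), p2Shape.offsets.get("a")); // 2 3
```

In this model, `p1` and `p2` share the path through “x” and “y” and only diverge when “a” and “b” are added in a different order, which is the situation the snippet above describes.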

Inline caching

V8 takes advantage of another technique for optimizing dynamically typed languages called inline caching. Inline caching relies on the observation that repeated calls to the same method tend to occur on the same type of object. An in-depth explanation of inline caching can be found here.

We’re going to touch upon the general concept of inline caching (in case you don’t have the time to go through the in-depth explanation above).

So how does it work? V8 maintains a cache of the type of objects that were passed as a parameter in recent method calls and uses this information to make an assumption about the type of object that will be passed as a parameter in the future. If V8 is able to make a good assumption about the type of object that will be passed to a method, it can bypass the process of figuring out how to access the object’s properties, and instead, use the stored information from previous lookups to the object’s hidden class.

So how are the concepts of hidden classes and inline caching related? Whenever a method is called on a specific object, the V8 engine has to perform a lookup to the hidden class of that object in order to determine the offset for accessing a specific property. After two successful calls of the same method to the same hidden class, V8 omits the hidden class lookup and simply adds the offset of the property to the object pointer itself. For all future calls of that method, the V8 engine assumes that the hidden class hasn’t changed, and jumps directly into the memory address for a specific property using the offsets stored from previous lookups. This greatly increases execution speed.
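
As a rough illustration of that fast path, here is a toy one-entry cache for a property load. It is a sketch under simplifying assumptions: `shape` stands in for the hidden class, `slots` for the object's property storage, and the cache is filled after the first lookup rather than after two calls. None of these names come from V8.

```javascript
// Toy model only: a "shape" plays the role of a hidden class.
function makeShape(names) {
  return { offsets: new Map(names.map((name, index) => [name, index])) };
}

// Creates a property-load site with a single cached (shape, offset) pair.
function makeLoadSite(name) {
  let cachedShape = null;
  let cachedOffset = -1;
  const stats = { hits: 0, misses: 0 };

  function load(obj) {
    if (obj.shape === cachedShape) {
      stats.hits++; // fast path: reuse the stored offset
      return obj.slots[cachedOffset];
    }
    stats.misses++; // slow path: full lookup, then remember the result
    cachedShape = obj.shape;
    cachedOffset = obj.shape.offsets.get(name);
    return obj.slots[cachedOffset];
  }

  return { load, stats };
}

const pointShape = makeShape(["x", "y"]);
const otherShape = makeShape(["y", "x"]); // same properties, other order
const loadX = makeLoadSite("x");

const a = { shape: pointShape, slots: [1, 2] };
const b = { shape: pointShape, slots: [3, 4] };
const c = { shape: otherShape, slots: [5, 6] }; // y = 5, x = 6

console.log(loadX.load(a), loadX.load(b), loadX.load(c)); // 1 3 6
console.log(loadX.stats.hits, loadX.stats.misses); // 1 2
```

Objects `a` and `b` share a shape, so the second load hits the cache; `c` has the same properties in a different order, so the cached offset cannot be reused and the load takes the slow path again.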

Inline caching is also the reason why it’s so important that objects of the same type share hidden classes. If you create two objects of the same type and with different hidden classes (as we did in the example earlier), V8 won’t be able to use inline caching because even though the two objects are of the same type, their corresponding hidden classes assign different offsets to their properties.

The two objects are basically the same but the “a” and “b” properties were created in different order.

Compilation to machine code

Once the Hydrogen graph is optimized, Crankshaft lowers it to a lower-level representation called Lithium. Most of the Lithium implementation is architecture-specific. Register allocation happens at this level.

In the end, Lithium is compiled into machine code. Then something else happens called OSR: on-stack replacement. Before we started compiling and optimizing an obviously long-running method, we were likely running it. V8 is not going to forget what it just slowly executed to start again with the optimized version. Instead, it will transform all the context we have (stack, registers) so that we can switch to the optimized version in the middle of the execution. This is a very complex task, having in mind that among other optimizations, V8 has inlined the code initially. V8 is not the only engine capable of doing it.

Safeguards called deoptimization perform the opposite transformation and revert to the non-optimized code when an assumption the engine made no longer holds.

Garbage collection

For garbage collection, V8 uses a traditional generational approach of mark-and-sweep to clean the old generation. The marking phase is supposed to stop the JavaScript execution. In order to control GC costs and make the execution more stable, V8 uses incremental marking: instead of walking the whole heap, trying to mark every possible object, it only walks part of the heap, then resumes normal execution. The next GC stop will continue from where the previous heap walk has stopped. This allows for very short pauses during the normal execution. As mentioned before, the sweep phase is handled by separate threads.
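
Incremental marking can be sketched with a worklist and a per-step budget. The example below is a toy: the heap is a static map of made-up object ids, and `createIncrementalMarker` is a hypothetical helper. A real collector also has to cope with the program changing the heap between steps, which this sketch ignores.

```javascript
// Toy heap: each object id maps to the ids it references.
const heap = new Map([
  ["root", ["a", "b"]],
  ["a", ["c"]],
  ["b", []],
  ["c", []],
  ["garbage", ["c"]], // not reachable from "root"
]);

function createIncrementalMarker(heapObjects, rootId) {
  const marked = new Set([rootId]);
  const worklist = [rootId];

  // Processes at most `budget` objects, then returns so that normal
  // execution can resume. Returns true once marking is complete.
  function step(budget) {
    let processed = 0;
    while (worklist.length > 0 && processed < budget) {
      const id = worklist.pop();
      for (const ref of heapObjects.get(id)) {
        if (!marked.has(ref)) {
          marked.add(ref);
          worklist.push(ref);
        }
      }
      processed++;
    }
    return worklist.length === 0;
  }

  return { step, marked };
}

const marker = createIncrementalMarker(heap, "root");
let pauses = 0;
while (!marker.step(2)) {
  pauses++; // normal JavaScript execution would resume here
}

// Anything left unmarked is what a sweep phase would reclaim.
const garbage = [...heap.keys()].filter((id) => !marker.marked.has(id));
console.log(garbage); // [ 'garbage' ]
```

Each call to `step` picks up where the previous one stopped, which is what keeps the individual pauses short.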

Ignition and TurboFan

With the release of V8 5.9 earlier in 2017, a new execution pipeline was introduced. This new pipeline achieves even bigger performance improvements and significant memory savings in real-world JavaScript applications.

The new execution pipeline is built on top of Ignition, V8’s interpreter, and TurboFan, V8’s newest optimizing compiler.

You can check out the blog post from the V8 team about the topic here.

Since version 5.9 of V8 came out, full-codegen and Crankshaft (the technologies that had served V8 since 2010) are no longer used for JavaScript execution, because with them the V8 team struggled to keep pace with new JavaScript language features and the optimizations those features need.

This means that overall V8 will have much simpler and more maintainable architecture going forward.

Improvements on Web and Node.js benchmarks

These improvements are just the start. The new Ignition and TurboFan pipeline paves the way for further optimizations that will boost JavaScript performance and shrink V8’s footprint in both Chrome and Node.js in the coming years.

Finally, here are some tips and tricks on how to write well-optimized, better JavaScript. You can easily derive these from the content above, but here’s a summary for your convenience:

How to write optimized JavaScript

Order of object properties: always instantiate your object properties in the same order so that hidden classes, and subsequently optimized code, can be shared.
Dynamic properties: adding properties to an object after instantiation will force a hidden class change and slow down any methods that were optimized for the previous hidden class. Instead, assign all of an object’s properties in its constructor.
Methods: code that executes the same method repeatedly will run faster than code that executes many different methods only once (due to inline caching).
Arrays: avoid sparse arrays where keys are not incremental numbers. Sparse arrays which don’t have every element inside them are a hash table. Elements in such arrays are more expensive to access. Also, try to avoid pre-allocating large arrays. It’s better to grow as you go. Finally, don’t delete elements in arrays. It makes the keys sparse.
Tagged values: V8 represents objects and numbers with 32 bits. It uses one bit as a flag to tell whether a value is an object (flag = 1) or an integer (flag = 0); such an integer is called an SMI (SMall Integer) because only 31 bits are left for its value. If a numeric value needs more than 31 bits, V8 boxes it: the number is turned into a double and a new object is created to hold it. Try to use 31-bit signed numbers whenever possible to avoid the expensive boxing operation into a JS object.
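
A few of these tips can be shown in a short sketch. The `Point` class, the `label` property and the `fitsIn31Bits` helper are illustrative names only, and the range check simply encodes the 31-bit signed range mentioned above; it does not query the engine.

```javascript
// Property order and dynamic properties: assign everything in the
// constructor, always in the same order, even if a value comes later.
class Point {
  constructor(x, y) {
    this.x = x;
    this.y = y;
    this.label = null; // reserved now instead of being added later
  }
}

// Arrays: grow a dense array as you go...
const dense = [];
for (let i = 0; i < 5; i++) {
  dense.push(i * 2);
}

// ...rather than creating holes.
const sparse = [];
sparse[0] = 0;
sparse[1000] = 1; // leaves 999 holes between the two elements

// Tagged values: a plain range check for 31-bit signed integers.
function fitsIn31Bits(n) {
  return Number.isInteger(n) && n >= -(2 ** 30) && n <= 2 ** 30 - 1;
}

console.log(Object.keys(new Point(1, 2))); // [ 'x', 'y', 'label' ]
console.log(dense.length, Object.keys(sparse).length, sparse.length); // 5 2 1001
console.log(fitsIn31Bits(2 ** 30 - 1), fitsIn31Bits(2 ** 30), fitsIn31Bits(1.5)); // true false false
```

The sparse array reports a length of 1001 while holding only two elements, which is the kind of layout the Arrays tip warns about.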

We at SessionStack try to follow these best practices in writing highly optimized JavaScript code. The reason is that once you integrate SessionStack into your production web app, it starts recording everything: all DOM changes, user interactions, JavaScript exceptions, stack traces, failed network requests, and debug messages. With SessionStack, you can replay issues in your web apps as videos and see everything that happened to your user. And all of this has to happen with no performance impact for your web app.
