The history of the web
- ~1993, telnet, NCSA Mosaic;
- 1994, Mosaic team splits into Netscape and SPRY;
- 1995, joined Microsoft and the IE team;
- browser wars;
- 1996, IE3 vs Netscape Navigator, CSS;
- 1997, IE4, Trident engine, DHTML ("suddenly anything was dynamically programmable"), Active Desktop in Windows 98;
- 1999, Ajax, IE5;
- 2001, IE6 and the death of the web;
- 2001-2004, mashups, Mozilla (Firefox), Safari, WebKit;
- 2004-2006, IE7, back from the dead;
- 2007-, IE continues to underinvest in the platform;
- 2007-2010, the rise of the mobile revolution;
- Web 2.0 isn't a set of technologies - it's caring about your user experience;
- 2015, the PWA era;
Anatomy of the browser 101
-
4s principles: speed, security, simplicity, stability;
-
chrome 1.0, communication on IPC:
- Browser(ui, networking, storage);
- Plugin(NPAPI), one for each;
- Renderer(Webkit), sandbox, one for each;
-
why multi-process? (even though multiple processes obviously add some performance overhead in some cases)
- cannot write perfect code;
- security: untrusted web content shouldn't be able to use exploits to access the file system;
- speed: misbehaving tabs (e.g. slow JS) should not impact other tabs or the browser;
- stability: crashes should only affect the tab and not other tabs or the browser;
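The stability argument can be sketched with a toy parent that launches each "tab" as its own OS process. This is only a Python illustration of the idea, not Chrome code; `run_tab` and the two tab scripts are hypothetical names made up for the sketch.

```python
import subprocess
import sys


def run_tab(script: str) -> int:
    """Run one toy 'tab' in its own OS process and return its exit code."""
    completed = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,  # keep the child's output out of the parent's
    )
    return completed.returncode


# Hypothetical tab workloads: the first aborts, the second finishes normally.
crashing_tab = "import os; os.abort()"
healthy_tab = "print('page rendered')"

exit_codes = [run_tab(crashing_tab), run_tab(healthy_tab)]

# The parent (the "browser") is still alive and can report on each tab.
statuses = ["ok" if code == 0 else "crashed" for code in exit_codes]
print(statuses)
```

The crash is contained in the child; the parent only sees a non-zero exit code, which is also the price paid: one extra process per tab.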
-
IPC
- inter-process communication;
- message passing between processes, with shared memory for large data;
- usually between processes of different privileges (a less-privileged one must not be able to take over a more-privileged one), so needs security review;
- usually asynchronous, for performance reasons;
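Asynchronous message passing between two processes can be sketched with Python's standard `asyncio` subprocess support. The JSON-lines wire format, `CHILD_SOURCE`, and `call_child` are assumptions made for this sketch; Chrome's real IPC uses its own message format.

```python
import asyncio
import json
import sys

# Source of a hypothetical child process: reply to each request in upper case.
CHILD_SOURCE = r"""
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    reply = {"id": request["id"], "result": request["payload"].upper()}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
"""


async def call_child(payload: str) -> dict:
    """Send one message to a child process and await its reply."""
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", CHILD_SOURCE,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
    )
    request = {"id": 1, "payload": payload}
    proc.stdin.write((json.dumps(request) + "\n").encode())
    await proc.stdin.drain()
    # While this await is pending, the event loop is free to run other tasks.
    line = await proc.stdout.readline()
    proc.stdin.close()  # EOF lets the child leave its read loop and exit
    await proc.wait()
    return json.loads(line)


reply = asyncio.run(call_child("hello"))
print(reply)
```

The sender never blocks a thread waiting for the reply, which is the property the notes attribute to asynchronous IPC.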
-
sandboxing
- untrusted data can be used to exploit bugs and run arbitrary code;
- as we moved from simple HTML pages to web apps, more code to exploit (the engine grew more complex, and some attack vectors can reach the operating system);
- run untrusted web content in a locked-down process where it doesn't have access to the file system or OS calls;
- exact mechanism to lock down a process is platform-specific;
- multiple levels of sandboxing depending on process type;
-
render process
- where data from the web is handled, e.g. parsing, layout, executing JS, decoding;
- completely sandboxed to prevent bugs from gaining access to user data and/or installing malware;
-
plugin process
- ran Flash, Java, Reader, etc.;
- had to be unsandboxed, since plugins were third-party code written assuming full access to the system;
-
browser process
- central coordinator;
- owns browser state such as profile data, setting etc...;
- draws UI;
- handles networking;
- cannot trust renderer;
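"Cannot trust renderer" means the browser process validates every request against its own records instead of believing what the renderer claims. A minimal sketch, assuming a hypothetical file-read request and a per-renderer grant table (neither is Chrome's actual API):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadFileRequest:
    """Hypothetical message a renderer sends to the browser process."""
    renderer_id: int
    path: str


class Browser:
    def __init__(self) -> None:
        # Paths the user explicitly handed to each renderer (assumed bookkeeping).
        self._granted: dict[int, set[str]] = {}

    def grant(self, renderer_id: int, path: str) -> None:
        self._granted.setdefault(renderer_id, set()).add(path)

    def allows(self, request: ReadFileRequest) -> bool:
        """Allow a read only if this renderer was granted this exact path."""
        return request.path in self._granted.get(request.renderer_id, set())


browser = Browser()
browser.grant(1, "/home/user/photo.png")
```

A compromised renderer can send any message it likes, so the decision is based only on state the browser itself recorded.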
-
threads
- in child processes, for the most part a main thread and an IPC thread;
- many threads in the browser process;
- main browser threads:
- UI (main thread): where most of the browser logic lives;
- IO: non-blocking IO, e.g. networking and also IPC;
- files
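The split between a UI thread and an IO thread can be sketched as two task queues: work is posted to the IO thread, and the result is posted back as a task for the UI thread. The queue names and the fake `fetch` are assumptions for this sketch, not Chrome's task-posting API.

```python
import queue
import threading

ui_tasks: queue.Queue = queue.Queue()  # tasks to run on the "UI" thread
io_tasks: queue.Queue = queue.Queue()  # tasks to run on the "IO" thread


def io_thread_main() -> None:
    """Run posted tasks until a None sentinel arrives."""
    while True:
        task = io_tasks.get()
        if task is None:
            break
        task()


def fetch(url: str, on_done) -> None:
    # Runs on the IO thread. A real browser would do non-blocking network IO here.
    body = f"<html>{url}</html>"
    # Never touch UI state from the IO thread: post the reply back instead.
    ui_tasks.put(lambda: on_done(body))


results = []
worker = threading.Thread(target=io_thread_main, name="IO")
worker.start()

io_tasks.put(lambda: fetch("https://example.com", results.append))
ui_tasks.get(timeout=5)()  # the "UI" thread picks up and runs the reply task

io_tasks.put(None)  # shut the IO thread down
worker.join()
```

Only the UI thread ever appends to `results`, so no lock is needed around it.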
-
chrome today
- more processes: GPU (partial sandbox, communicates directly with the renderer and plugin processes), utility, extension;
- plugins are sandboxed (Pepper plugins);
- renderer uses Blink;
-
GPU process
- machines with powerful GPUs were becoming widespread;
- web platform features like WebGL meant that we'd have to make expensive GPU readback to render a page;
- large project to offload compositing and scrolling to GPU;
- separate process for stability and security (GPU drivers can be buggy, hence the partial sandbox);
-
utility process
- as the browser gained more features, new class of untrusted data that was not specific to a tab (e.g. installing an extension, processing JSON);
- runs code on behalf of browser in a sandbox;
- short-lived (terminates itself);
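The utility-process idea can be sketched as a short-lived child that handles untrusted input and reports back only an exit code. This Python sketch does not apply any real sandbox; `VALIDATOR_SOURCE` and `is_valid_manifest` are hypothetical names, and the "manifest must be an object with a name" rule is an assumption chosen for illustration.

```python
import subprocess
import sys

# Source of the hypothetical helper: parse untrusted JSON, report via exit code.
VALIDATOR_SOURCE = r"""
import json, sys
try:
    manifest = json.loads(sys.stdin.read())
except ValueError:
    sys.exit(1)
sys.exit(0 if isinstance(manifest, dict) and "name" in manifest else 1)
"""


def is_valid_manifest(untrusted_text: str) -> bool:
    """Parse untrusted JSON in a throwaway process; trust only its exit code."""
    completed = subprocess.run(
        [sys.executable, "-c", VALIDATOR_SOURCE],
        input=untrusted_text,
        capture_output=True,
        text=True,
        timeout=10,
    )
    # Any crash inside the helper also shows up as a non-zero exit code.
    return completed.returncode == 0
```

The helper exits as soon as it has answered, so a parser bug triggered by hostile input dies with the child instead of with the browser.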
-
extension process
- speed & stability: did not want badly written extension code to adversely affect pages;
- security: wanted extensions to have limited access to browser, page, and system;
- simplicity: extensions installed and uninstalled without restart;
-
pepper plugin
- many exploits coming through plugin code we didn't control (Flash & PDF);
- multi-year effort to create a new plugin API that's sandboxed;
- ported Flash;
- wrote new PDF plugin based on PDF code we licensed(PDFium);
-
chrome/content split
- separated the product(Chrome) from the platform (sandboxed multi-process browser);
- src/chrome split into src/chrome and src/content;
- src/chrome for UI and browser features such as bookmarks, password manager, autofill;
- src/content for code related to multi-process, sandbox, and web platform;
- new products built on top of content, e.g. Chromecast, Home, Electron;
-
componentization
- iOS forces apps to use Safari's web engine;
- need to share browser feature code (e.g. password manager, autofill, sync) with a platform that doesn't use Blink;
- split feature code from src/chrome into src/components, and move 'content' specific code to isolated subdirs;
-
site isolation
- defends against malicious embedded sites: different parts of a tab can run in different processes, depending on where they come from;
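The grouping step behind site isolation can be sketched as: compute a key for every frame in a tab and give each distinct key its own process. The sketch below uses scheme plus host name as that key, which is a simplification assumed here; `site_of` and `assign_processes` are hypothetical helpers.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def site_of(url: str) -> str:
    """Simplified site key: scheme plus host name (an assumption of this sketch)."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.hostname}"


def assign_processes(frame_urls: list[str]) -> dict[str, list[str]]:
    """Group the frames of one tab so that each site gets its own process."""
    groups: dict[str, list[str]] = defaultdict(list)
    for url in frame_urls:
        groups[site_of(url)].append(url)
    return dict(groups)


frames = [
    "https://news.example/article",
    "https://ads.test/banner",
    "https://news.example/comments",
]
processes = assign_processes(frames)
```

Here the embedded `ads.test` frame lands in a different process from the two `news.example` frames, even though all three are drawn in the same tab.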
-
mojo
- new IPC mechanism;
- IDL based;
- allow generation of bindings for different languages or even for same language but different types;
- hides detail of which process the caller and callee run in;
-
mojo primitives
- message pipes (bidirectional, cheap);
- shared buffers;
- data pipes (shared memory with notifications);
- process/thread agnostic: avoid different code paths depending on location of sender/receiver;
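A bidirectional message pipe has two endpoints, and either end can send or receive. As a rough analogy only, Python's standard `multiprocessing.Pipe` behaves this way; it is not Mojo, and the message contents below are made up for the sketch.

```python
from multiprocessing import Pipe

# Pipe() is duplex by default: both endpoints can send and receive.
end_a, end_b = Pipe()

end_a.send({"type": "ping"})
request = end_b.recv()

# The same pipe carries the reply in the other direction.
end_b.send({"type": "pong", "in_reply_to": request["type"]})
reply = end_a.recv()

end_a.close()
end_b.close()
```

Both endpoints are used from one thread here to keep the sketch short; the same two calls work unchanged when the endpoints are handed to different threads or processes.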
-
feature chrome-servicification
- upgrade architecture;
- microservice architecture: well-defined, reusable, decoupled, layered system;
-
directory layout
Anatomy of the browser 201
-
Chrome is a browser, and also a library that other applications can use to build a browser; chrome embeds content;
-
the browser process is the parent of all other processes;
-
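multiple profiles: user switching, incognito mode (creates an off-the-record profile that writes nothing to disk, but still reads the original profile for features such as saved passwords), guest mode (creates a blank, throwaway profile that neither writes to disk nor reads any previous settings);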
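The difference between the three profile kinds can be sketched as a question of where reads and writes go. The classes below are hypothetical stand-ins (a dict plays the role of on-disk storage); they are not Chrome's profile classes.

```python
class Profile:
    """Regular profile: reads and writes go to 'disk' (a dict in this sketch)."""

    def __init__(self, stored=None) -> None:
        self.disk = dict(stored or {})

    def get(self, key):
        return self.disk.get(key)

    def set(self, key, value) -> None:
        self.disk[key] = value


class OffTheRecordProfile:
    """Incognito: reads fall through to the original, writes stay in memory."""

    def __init__(self, original: Profile) -> None:
        self._original = original
        self._memory = {}

    def get(self, key):
        if key in self._memory:
            return self._memory[key]
        return self._original.get(key)

    def set(self, key, value) -> None:
        self._memory[key] = value  # never reaches original.disk


class GuestProfile:
    """Guest: starts blank, reads nothing from any profile, writes only to memory."""

    def __init__(self) -> None:
        self._memory = {}

    def get(self, key):
        return self._memory.get(key)

    def set(self, key, value) -> None:
        self._memory[key] = value


original = Profile({"password:example.com": "hunter2"})
incognito = OffTheRecordProfile(original)
guest = GuestProfile()

incognito.set("history", ["https://example.com"])
```

Dropping the `incognito` or `guest` object discards everything written to it, while the original profile's storage is left exactly as it was.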
-
BrowserContext (one window): destroyed together with the BOM; key service;
-
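WebContents (one tab, but can also exist independently of a tab): observer pattern (observe events), delegate pattern (calls into chrome to implement some interface functionality);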
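The two patterns differ in direction and cardinality: many observers are told that something happened, while a single delegate is asked to decide or do something on the embedder's behalf. A minimal sketch with illustrative Python names (`ToyWebContents`, `did_finish_load`, `should_open_new_tab` are stand-ins, not the real //content interfaces):

```python
class TabObserver:
    """Observer: gets notified about events, cannot change the outcome."""

    def did_finish_load(self, url: str) -> None:
        pass


class TabDelegate:
    """Delegate: the embedder supplies the behavior; the default refuses."""

    def should_open_new_tab(self, url: str) -> bool:
        return False


class ToyWebContents:
    def __init__(self, delegate: TabDelegate) -> None:
        self._delegate = delegate  # exactly one delegate
        self._observers: list[TabObserver] = []  # any number of observers

    def add_observer(self, observer: TabObserver) -> None:
        self._observers.append(observer)

    def finish_load(self, url: str) -> None:
        for observer in self._observers:
            observer.did_finish_load(url)

    def request_new_tab(self, url: str) -> bool:
        # The content layer does not know the answer; it asks the embedder.
        return self._delegate.should_open_new_tab(url)


class HistoryRecorder(TabObserver):
    def __init__(self) -> None:
        self.visited: list[str] = []

    def did_finish_load(self, url: str) -> None:
        self.visited.append(url)


class HttpsOnlyDelegate(TabDelegate):
    def should_open_new_tab(self, url: str) -> bool:
        return url.startswith("https://")


contents = ToyWebContents(HttpsOnlyDelegate())
recorder = HistoryRecorder()
contents.add_observer(recorder)
contents.finish_load("https://example.com")
```

Swapping in a different delegate changes what the tab is allowed to do without touching `ToyWebContents`, which mirrors how an embedder customizes the content layer.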
-
multi-process framework;
-
//chrome instantiates some of the interfaces in //content/public;
-
manages cross-origin embeds within a page;
-
storage partitioning;