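Preface

The previous two sections gave a systematic introduction to two core webpack concepts: plugins and loaders.

This section ties the two together. Starting from scratch, we hand-write a program built around a plugin system, ending up with a mini imitation of webpack.

Target requirements:

Bundle js modules
Build a plugin system that accepts plugins written by developers
Build a loader system that accepts loaders written by developers

With plugins and loaders in place, developers can write their own extensions that step into the intermediate stages of the build. Beyond basic js compilation, mini-webpack can then also handle images and css.

Following the execution flow, we first build mini-webpack's plugin system so that it can accept plugins.

Once the plugin system is in place, html templates, css and images can all be brought into the project through custom plugins and loaders. Finally, all dependent files are compiled, bundled and written to the dist directory (the source code is attached at the end).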

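Main program architecture

Create a configuration file named webpack.config.js in the project root (code below).

The entry is index.js under src/js, and the bundled js is written to dist/bundle.js.

webpack.config.js also configures compilation of css and images, each handled by its own loader.

The plugins section configures a plugin that handles html. mini-webpack ends up generating an index.html in the dist directory and inserts the path of the bundled js into it.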

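// webpack.config.js
const path = require("path");
// The require paths below are assumptions; point them at wherever these files live in your project.
const babelLoader = require("./loaders/babelLoader");
const htmlLoader = require("./loaders/htmlLoader");
const fileLoader = require("./loaders/fileLoader");
const HtmlWebpackPlugin = require("./plugins/html-webpack-plugin");

module.exports = {
  entry:path.join(__dirname,"./src/js/index.js"),
  output:{
     path:path.join(__dirname,"/dist"),
     filename:"bundle.js"
  },
  module:{
    rules:[{
       test:/\.js$/, // anchored with $ so that names like data.json do not match
       use:[babelLoader]
    },{
      test:/\.css$/,
      use:[htmlLoader]
    },
    {
      test:/\.(jpg|png|gif)$/,
      use:[{
         loader:fileLoader,
         options:{
            outputPath:"./image"
         }
      }]
    }
   ]
  },
  plugins:[
     new HtmlWebpackPlugin({
        template:path.join(__dirname,"./src/index.html"),
        filename:"index.html"
     })
  ]
}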

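With the configuration written, create mini-webpack's entry file, mini-webpack.js, in the project (code below).

The entry file is very simple: it passes the webpack.config.js options to the Compiler class to create a compiler object, and calling the compiler's run method starts the build.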

// mini-webpack.js
const options = require("../webpack.config");
const Compiler = require("./Compiler");

const compiler = new Compiler(options);

compiler.run(); // start the build

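Compiler implementation

The compiler is the engine that holds mini-webpack together. Internally it uses tapable to define four key lifecycle hooks, listed below (tapable was covered in the previous section, so it is not repeated here).

initialize: fired when initialization is complete
compile: fired before compilation starts
emit: fired before the bundled files are written
done: fired when the build is complete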
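
To make the hook mechanism concrete, here is a minimal, dependency-free stand-in for the tap/call pair that the compiler relies on. The class name MiniSyncHook is hypothetical and exists only for this illustration; tapable's real SyncHook does considerably more, so treat this as a sketch of the idea rather than of the library.

```javascript
// MiniSyncHook: a hypothetical, dependency-free stand-in for tapable's SyncHook.
class MiniSyncHook {
  constructor() {
    this.taps = []; // registered listeners, in registration order
  }
  // Register a listener under a name (the name is only useful for debugging).
  tap(name, fn) {
    this.taps.push({ name, fn });
  }
  // Invoke every listener synchronously with the same arguments.
  call(...args) {
    for (const { fn } of this.taps) fn(...args);
  }
}

// The four lifecycle hooks described above.
const hooks = {
  initialize: new MiniSyncHook(),
  compile: new MiniSyncHook(),
  emit: new MiniSyncHook(),
  done: new MiniSyncHook(),
};

// Tap a listener into each hook that records when it fires.
const fired = [];
for (const name of Object.keys(hooks)) {
  hooks[name].tap("logger", () => fired.push(name));
}

// Fire the hooks in the order a build would fire them.
hooks.initialize.call();
hooks.compile.call();
hooks.emit.call();
hooks.done.call();

console.log(fired.join(" -> "));
```

Listeners run in the order they were tapped, which is why the order of entries in the plugins array can matter when two plugins tap the same hook.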

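The constructor in Compiler.js defines the four lifecycle hooks and then calls bindHook to register the plugins.

The developer's webpack.config.js has a plugins field dedicated to declaring plugins. The bindHook function below takes each item from that field and hooks it into mini-webpack.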

// Compiler.js
const { SyncHook } = require('tapable');

class Compiler {
  
  constructor(options){
      this.options= options;
      this.hooks = {
        initialize:new SyncHook(["arg"]),
        compile:new SyncHook(["arg"]),
        emit:new SyncHook(["arg"]),
        done:new SyncHook(["arg"])
      }
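      this.outPutDir = this.options.output.path; // output directory
      this.bindHook();
  }
  
 /**
   * Register every plugin, then fire the initialize hook
   */
  bindHook(){
    const { plugins = [] } =  this.options; // tolerate a config without a plugins field
    plugins.forEach((plugin)=>{ // attach each plugin
       plugin.apply(this);
    })
    this.hooks.initialize.call(this.options); // fire the initialize hook: initialization is done
  }

  // ...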

}

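Once the compiler instance exists, calling its run method starts the build (code below).

The run method is short, but it covers almost the whole build. In execution order:

run first fires the compile hook, so any plugin tapped into it runs just before compilation starts
A complilation object is then created. It is the one that does the real work: it compiles the code through its buildModule function
When buildModule finishes, it wraps everything to be emitted into a single data object, which run stores in this.assets
With this.assets in hand, generating files is easy. The this.emit function uses the file APIs provided by nodejs to create the files and directories described by this.assets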

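// Compiler.js
const Complilation = require("./Complilation"); // assumed path to the Complilation class

class Compiler {
    
   // ...
 
   async run(){
    
    this.hooks.compile.call(this.options); // fired before the build starts
    
    this.complilation = new Complilation(this.options,this); // create the build object
    
    this.assets = await this.complilation.buildModule(); // run the compilation
     
    this.hooks.emit.call(this.assets); // fired before the assets are written to disk

    this.emit(); // write the bundled files

    this.hooks.done.call(); // the build is complete

  }

}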

The flow of run shows that the key step is complilation.buildModule() returning this.assets: the final files are generated from this.assets.

So what does this.assets look like? Something like this:

this.assets = {
    'bundle.js': 'const name = "hello world";console.log(name);'
}

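The structure of this.assets is easy to understand: each key is a file name and each value is that file's content.

By that logic, the structure above produces a bundle.js file in the dist directory, filled with the code on the right-hand side.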
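
The article does not show the body of the compiler's emit method, so the following is only a sketch of how an assets map could be turned into files with the Node.js fs module. The helper name writeAssets and the sample asset names are hypothetical, and the output goes to a temporary directory rather than a real dist folder.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// writeAssets is a hypothetical helper, not the article's actual emit method.
// It writes every entry of an assets map into outDir, creating folders as needed.
function writeAssets(assets, outDir) {
  for (const [name, content] of Object.entries(assets)) {
    const target = path.join(outDir, name);
    fs.mkdirSync(path.dirname(target), { recursive: true });
    fs.writeFileSync(target, content); // content may be a string or a Buffer
  }
}

// Usage: write into a fresh temporary directory.
const outDir = fs.mkdtempSync(path.join(os.tmpdir(), "mini-webpack-"));
writeAssets(
  {
    "bundle.js": 'const name = "hello world";console.log(name);',
    "image/logo.bin": Buffer.from([1, 2, 3]), // binary content in a nested folder
  },
  outDir
);
console.log(fs.readdirSync(outDir).sort());
```

Because keys may contain a folder (as the image asset does here), the parent directory is created before each write.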

By the same token, if this.assets looks like the following, the dist directory gets two files, bundle.js and index.html, and index.html contains a script tag pointing at bundle.js.

this.assets = {
    'bundle.js': 'const name = "hello world";console.log(name);',
    'index.html': '<!DOCTYPE html><html lang="en"><head>\n' +
            '  <meta charset="UTF-8">\n' +
            '  <meta http-equiv="X-UA-Compatible" content="IE=edge">\n' +
            '  <meta name="viewport" content="width=device-width, initial-scale=1.0">\n' +
            '  <title>Document</title>\n' +
            '</head>\n' +
            '<body>\n' +
            '    <div id="root">\n' +
            '        hello world\n' +
            '    </div>\n' +
            '\n' +
            '<script src="./bundle.js"></script></body></html>',
}

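In short, whoever controls the contents of this.assets decides what files the build produces. That makes js, css, html and images (an image can be stored as binary data) all fair game for bundling.

Building on this, let's write a plugin that makes mini-webpack generate an index.html after bundling and inject the js path into it.

The configuration file imports the HtmlWebpackPlugin plugin and creates an instance with the new keyword.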

// webpack.config.js
module.exports = {
   // ...
    plugins:[
     new HtmlWebpackPlugin({
        template:path.join(__dirname,"./src/index.html"), // html page template
        filename:"index.html" // output file name
     })
  ]
}

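The code of HtmlWebpackPlugin is shown below. It is simply an exported class.

The class has one core method, apply. As described above, the compiler attaches a plugin by calling its apply method.

Inside apply, the plugin taps the compiler's emit hook, which, as mentioned earlier, fires just before the bundled files are written.

At that point complilation.buildModule() has already finished and returned, so this.assets is available.

Since apply can reach this.assets, it reads the html template named in the options, uses the jsdom api to insert the script path into the html, and stores the result on compiler.assets. That completes the job.

Thanks to this one plugin, this.assets gains an index.html property, and the corresponding html file shows up in the output.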

// html-webpack-plugin.js
const path = require("path");
const fs = require("fs");
const { JSDOM } = require("jsdom");

class HtmlWebpackPlugin {

  constructor(options){
     this.template = options.template;
     this.filename = options.filename || "index.html";
  }

  
  apply(compiler){

    compiler.hooks.emit.tap("insertHtml",()=>{

      const { filename } = compiler.options.output;
       
      const js_url = `./${filename}`;

      const code = fs.readFileSync(this.template).toString();

      const dom = new JSDOM(code);

      const body = dom.window.document.querySelector("body");

      body.innerHTML = body.innerHTML + `<script src="${js_url}"></script>`;

      compiler.assets[this.filename] = dom.serialize(); 

    })

  }

}

module.exports = HtmlWebpackPlugin;
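
The same apply-and-tap pattern works for any plugin. As a second, smaller illustration, the sketch below adds a text file listing every asset. FileListPlugin and the stub compiler object are hypothetical: the stub exposes only an assets map and an emit hook, which is all this plugin touches.

```javascript
// FileListPlugin is a hypothetical plugin written for illustration.
// On emit, it adds a text file that lists every asset produced so far.
class FileListPlugin {
  constructor(options = {}) {
    this.filename = options.filename || "filelist.txt";
  }
  apply(compiler) {
    compiler.hooks.emit.tap("FileListPlugin", () => {
      const names = Object.keys(compiler.assets).sort();
      compiler.assets[this.filename] = names.join("\n");
    });
  }
}

// A stub compiler with just enough surface (assets + an emit hook) to run the plugin.
const listeners = [];
const compiler = {
  assets: { "bundle.js": "console.log(1);", "index.html": "<html></html>" },
  hooks: {
    emit: {
      tap: (name, fn) => listeners.push(fn),
      call: (...args) => listeners.forEach((fn) => fn(...args)),
    },
  },
};

new FileListPlugin().apply(compiler);
compiler.hooks.emit.call(compiler.assets);
console.log(compiler.assets["filelist.txt"]);
```

Since handlers run in registration order, a plugin like this only sees assets added by handlers that were tapped before it.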

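Complilation implementation

The complilation instance does the real work. It calls buildModule to compile and build the code and finally returns this.assets.

When complilation.buildModule runs, it first reads the entry file path from the webpack.config.js options.

Its first task is to look up, in the rules of webpack.config.js, the loader responsible for files ending in .js and run it on the js file below.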

// index.js

const { add } =  require("./other");
require("../css/global.css");
const img_url = require("../img/1.png"); 

console.log(add(1,1));

The most common requirement for js files is converting es6 syntax to es5, and complilation achieves this by calling a loader.

Here is how that loader converts es6 to es5 (code below).

// webpack.config.js

module.exports = {
   module:{
    rules:[{
       test:/\.js$/,
       use:[babelLoader] // every js file goes through babelLoader
    }]
   }  
}

-----------------------------------------------------

// babelLoader.js

const parser = require("@babel/parser");
const { transformFromAstSync } = require("@babel/core");


// Convert es6 syntax to es5
module.exports = function(content){
  const ast = parser.parse(content,{
    sourceType:"module"
  })
  const { code } = transformFromAstSync(ast,null,{
    presets: ["@babel/preset-env"]
  })
  return code;
}

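babelLoader is an exported function, and content is the source code passed in.

The function first parses the source into an ast with babel's tooling, then applies the presets to turn that ast into es5 code, which it returns.

After babelLoader has run, the source of the entry file index.js looks roughly like this (simplified).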

// index.js

var _require = require("./other.js")
require("../css/global.css");
var img_url = require("../img/1.png"); 

console.log(_require.add(1,1));

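As you can see, every const keyword has become var, so the code really has been converted to es5.

Dropping this code straight into a browser would still fail, though: the browser has no idea what require is, and it cannot magically pull in css and images for you.

In its second step, the complilation instance scans index.js for require calls. Files ending in .js are skipped for now.

For any other extension, it looks up the matching loader in webpack.config.js. After the css and image loaders have done their work, the code above looks like the sample below.

require("../css/global.css") is replaced with an immediately invoked function. When the browser eventually runs it, the styles from global.css are appended to the document's head.

The image require("../img/1.png") is replaced with an image path in which the file name has become a hash.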

var _require = require("./other.js")

(function(){
    var tag = document.createElement("STYLE"); // create a style tag  
    tag.innerHTML = "body { color : red;}";  // the contents of global.css   
    var head = document.getElementsByTagName("Head")[0]; // find the head tag   
    head.appendChild(tag); 
 })();
 
var img_url = "image/0e3c014db8b376a43cbf7cca8291a036a357b2937f4b6dfb03864d0ea2c9bf11.png";

console.log(_require.add(1,1));
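
The article does not show how a file is matched against the rules, so here is one way it could work, as a sketch only. applyLoaders and the two toy loaders are hypothetical names. The context fields filename and query mirror what fileLoader reads from this later in the article, and running the use list from last to first is an assumption borrowed from webpack's convention.

```javascript
// applyLoaders is a hypothetical helper showing how rules could be matched.
// Each rule has a `test` regex and a `use` list of loader functions or
// { loader, options } objects, mirroring the config shape used in this article.
function applyLoaders(rules, filename, content) {
  let result = content;
  for (const rule of rules) {
    if (!rule.test.test(filename)) continue; // this rule does not apply to the file
    // Assumption: like webpack, run the loaders in `use` from last to first.
    for (const entry of [...rule.use].reverse()) {
      const loader = typeof entry === "function" ? entry : entry.loader;
      const query = typeof entry === "function" ? {} : entry.options || {};
      // Loaders receive the current content and can read their options from this.query.
      result = loader.call({ filename, query }, result);
    }
  }
  return result;
}

// Two toy loaders standing in for real ones.
const upperCaseLoader = (source) => source.toUpperCase();
const wrapLoader = function (source) {
  return `${this.query.prefix}${source}`;
};

const rules = [
  { test: /\.txt$/, use: [{ loader: wrapLoader, options: { prefix: ">> " } }, upperCaseLoader] },
];

console.log(applyLoaders(rules, "note.txt", "hello")); // ">> HELLO"
console.log(applyLoaders(rules, "note.md", "hello")); // "hello" (no rule matches)
```

Calling each loader with an explicit this is what lets a loader such as fileLoader read its options and file metadata without extra parameters.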

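Both the css and the image handling above can be done with loaders. Let's look at how each one works.

htmlLoader handles css files (despite its name). It returns a string containing an immediately invoked function.

That function creates a style tag, puts the css content into it, and appends it to the document's head.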

// htmlLoader.js
module.exports = function(content){
  return `(function(){
    var tag = document.createElement("STYLE"); // create a style tag
    tag.innerHTML = ${JSON.stringify(content)};
    var head = document.getElementsByTagName("Head")[0]; // find the head tag
    head.appendChild(tag);
  })()`;
}

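fileLoader can handle images or fonts. It derives a hashed file name from the file's content and taps a handler into the compiler's emit hook.

Just before mini-webpack writes its output files, the emit hook fires.

The handler adds the image to this.assets, while the loader itself returns the image's relative path. As a result the image is also written to the dist directory.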

// fileLoader.js
const path = require("path");
const sha256 = require("sha256");

module.exports = function(){

  const ext = path.extname(this.filename); // extension of the image file

  const hash = sha256(this.raw); // this.raw holds the file's binary data; hash it to get the image's new name

  const outputPath = this.query.outputPath || "./"; // output directory configured by the user

  const img_relative = path.join(outputPath,`${hash}${ext}`); // relative path of the image after bundling

  this.context.hooks.emit.tap("imgResolve",()=>{
      this.context.assets[img_relative] = this.raw; // add the image to the assets so it is written to dist
  })

  return JSON.stringify(img_relative.replace(/\\/g,"/")); // return the relative path as a string literal
}
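
The loader above relies on the sha256 package from npm. As an alternative sketch, the same content-based naming can be done with the crypto module built into Node.js. The helper name hashedName is hypothetical, and using path.posix here is a design choice of this sketch, not something the article's loader does.

```javascript
const crypto = require("crypto");
const path = require("path");

// hashedName is a hypothetical helper: it derives an output path from file content.
function hashedName(raw, filename, outputPath = "./") {
  const hash = crypto.createHash("sha256").update(raw).digest("hex");
  const ext = path.extname(filename);
  // Use POSIX joins so the result is a URL-friendly path on every platform.
  return path.posix.join(outputPath, `${hash}${ext}`);
}

const a = hashedName(Buffer.from("same bytes"), "1.png", "./image");
const b = hashedName(Buffer.from("same bytes"), "copy.png", "./image");
const c = hashedName(Buffer.from("other bytes"), "1.png", "./image");

console.log(a === b); // identical content yields an identical name
console.log(a === c); // different content yields a different name
```

Naming by content means two copies of the same image collapse into one output file, and a changed image gets a new name, which is convenient for browser caching.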

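The complilation instance scans every require call in the entry file index.js so that css and images can be transformed. How is that scan implemented?

Babel's tooling makes it straightforward: parse the code into an ast, then walk the tree looking for nodes shaped like a require call. Whenever a css file or an image turns up, it can be handed on for further processing.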

const traverse = require("@babel/traverse").default;

 // ...
 
traverse(ast, {
        CallExpression(path) {
          if (path.node.callee.type === "Identifier" && path.node.callee.name === "require"){
               // ... reached once for every require call found
          } 
        },
});

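Once the complilation instance has finished the tasks above, the entry file's code looks like this.

This code has dealt with js, css and images, yet it still cannot run in a browser: the browser does not recognise require and cannot pull in the other js files on its own.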

var _require = require("./other.js");

(function(){
    var tag = document.createElement("STYLE"); // create a style tag  
    tag.innerHTML = "body { color : red;}";  // the contents of global.css   
    var head = document.getElementsByTagName("Head")[0]; // find the head tag   
    head.appendChild(tag); 
 })();
 
var img_url = "image/0e3c014db8b376a43cbf7cca8291a036a357b2937f4b6dfb03864d0ea2c9bf11.png";

console.log(_require.add(1,1));

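Dependency analysis

In its third step, the complilation instance handles require calls that pull in js. Suppose the project's source looks like this.

There are three files: index.js, other.js and three.js. index.js uses the add method exported by other.js, and other.js uses the multiple method from three.js.

How do we merge these three pieces of code so they can run in a browser?

// index.js, the entry file (already processed by the loaders)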

var _require = require("./other.js");

(function(){
    var tag = document.createElement("STYLE"); // create a style tag  
    tag.innerHTML = "body { color : red;}";  // the contents of global.css   
    var head = document.getElementsByTagName("Head")[0]; // find the head tag   
    head.appendChild(tag); 
 })();
 
var img_url = "image/0e3c014db8b376a43cbf7cca8291a036a357b2937f4b6dfb03864d0ea2c9bf11.png";

console.log(_require.add(1,1));

----------------------

// other.js

const { multiple } =  require("./three");

exports.add = (a,b)=>{
  return multiple(a+b);
}

---------------------

// three.js

exports.multiple = (total)=>{
  return total * 10;
}

If the three pieces of code above were turned into the following form, wouldn't that merge them?

 // bundle.js
  
 var entry = "./src/js/index.js"; // entry path
 var deps = {      // dependency graph
    "./src/js/index.js":"var _require = require(\"./other.js\");(function(){var tag = document.createElement(\"STYLE\"); ...... ",
    "./other.js":"var _require = require(\"./three.js\");exports.add = (a,b)=>{ return _require.multiple(a+b);}",
    "./three.js":"exports.multiple = (total)=>{ return total * 10;}"
 }; // this semicolon is required: without it the parenthesis below would be parsed as a call on the object literal
 
 (function(entry,modules){
      function require(pathname){
        
        var module = {
            exports:{}
        }
        
        ;(function(require,module,exports){
          const code = modules[pathname];
          try{
            eval(code);
          }catch(error){
              console.log(error);
          }
        })(require,module,module.exports);
        
        return module.exports;
        
      }
      require(entry); // run the entry file
 })(entry,deps)

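deps is a plain object: each key is a file name and each value is that file's code after the loaders have processed it.

When this code runs in a browser, the require function first loads the entry file ./src/js/index.js: it looks up the code for that path in the dependency graph and executes it with eval.

If eval(code) hits require("./other.js"), the require function is called again, recursively, until every dependency has been executed.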
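
One property of the runtime above is that a module is executed again every time it is required. The sketch below is a variation, not the article's code: it adds a module cache and swaps eval for new Function. The function name runBundle and the three tiny modules are hypothetical and exist only to make the example runnable in Node.js.

```javascript
// A sketch of the bundle runtime with a module cache added (an assumption:
// the runtime shown in the article re-executes a module on every require).
function runBundle(entry, modules) {
  const cache = {};
  function require(pathname) {
    if (cache[pathname]) return cache[pathname].exports; // already loaded
    if (!(pathname in modules)) throw new Error(`Module not found: ${pathname}`);
    // Cache before executing, so a circular require gets the partial exports.
    const module = (cache[pathname] = { exports: {} });
    // new Function gives the code its own scope with require/module/exports.
    const fn = new Function("require", "module", "exports", modules[pathname]);
    fn(require, module, module.exports);
    return module.exports;
  }
  return require(entry);
}

const modules = {
  "./index.js":
    'var a = require("./other.js"); var b = require("./other.js");' +
    "module.exports = { sum: a.add(1, 1), same: a === b };",
  "./other.js":
    'var t = require("./three.js");' +
    "exports.add = function (a, b) { return t.multiple(a + b); };",
  "./three.js": "exports.multiple = function (total) { return total * 10; };",
};

const result = runBundle("./index.js", modules);
console.log(result.sum); // 20
console.log(result.same); // true: both requires returned the cached exports
```

With the cache in place, requiring the same path twice yields the same exports object, which is what code with module-level state usually expects.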

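So the whole build comes down to compiling the code of every file in the project into bundle.js. The hardest part is producing the dependency graph: how do we turn the code in the individual files into the deps structure shown above?

Babel helps once more (code below). analyseLib is the function that analyses dependencies. It is first called with the source code and file name of the entry file.

The source is parsed into an ast, and by walking the tree the function collects every js file the entry depends on into the deps array.

It then loops over deps and calls analyseLib recursively. When the recursion finishes, every dependency and its code has been stored in this.modules.

With this.modules (the dependency graph) available, assembling the code of bundle.js in the format shown above becomes very simple.

The last job of the complilation instance is to add a bundle.js property with that code to this.assets and return this.assets to the compiler as its build result. That completes mini-webpack's build.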

const path = require("path");
const parser = require("@babel/parser");
const traverse = require("@babel/traverse").default;

class Complilation {

  // ...

  modules = {}; // dependency graph

  // Dependency analysis
  // code is the source code, filename is the file name
  analyseLib(code,filename){
  
      const ext = path.extname(filename); // file extension
      
      if(ext !== ".js"){ // only js files need dependency analysis
          return;  
      }

      if(this.modules[filename] !== undefined){ // already analysed; also guards against circular requires
          return;
      }
 
      const ast = parser.parse(code,{  // parse the source into an ast
        sourceType:"module"
      })
      
      const deps = [];
  
      traverse(ast, { // walk the ast looking for require calls
        CallExpression(path) {
          if (path.node.callee.type === "Identifier" && path.node.callee.name === "require") {
             deps.push(path.node.arguments[0].value);
          }
        },
      });
  
      this.modules[filename] = code; 
      
      for(let i = 0; i < deps.length ; i++){
        const dep = deps[i]; 
        this.analyseLib(this.getCode(dep),dep); // fetch the dependency's code and recurse
      }  
   
   }
   
   // ...

}
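
The recursion can also be tried out without babel or real files. The sketch below makes two loudly stated simplifications: it finds require calls with a regular expression instead of walking an ast, and it reads from a hypothetical in-memory files object. Unlike the code above, it keys the graph by resolved path, so the same relative specifier used in two different folders does not collide. buildGraph and every file name here are illustrative only.

```javascript
const path = require("path");

// Hypothetical in-memory "file system" so the sketch needs no files on disk.
const files = {
  "src/index.js": 'var o = require("./other.js"); console.log(o.add(1, 1));',
  "src/other.js": 'var t = require("./lib/three.js"); exports.add = function (a, b) { return t.multiple(a + b); };',
  "src/lib/three.js": 'require("../other.js"); exports.multiple = function (n) { return n * 10; };',
};

// A regex scan is a simplification; the article uses @babel/traverse on the ast.
const REQUIRE_RE = /require\(\s*["']([^"']+)["']\s*\)/g;

function buildGraph(entry, readFile) {
  const graph = {};
  (function visit(id) {
    if (graph[id]) return; // already analysed: this also stops circular requires
    const code = readFile(id);
    const deps = {};
    graph[id] = { code, deps };
    for (const match of code.matchAll(REQUIRE_RE)) {
      // Resolve the specifier relative to the file that requires it.
      const resolved = path.posix.join(path.posix.dirname(id), match[1]);
      deps[match[1]] = resolved;
      visit(resolved);
    }
  })(entry);
  return graph;
}

const graph = buildGraph("src/index.js", (id) => files[id]);
console.log(Object.keys(graph));
console.log(graph["src/lib/three.js"].deps);
```

Here three.js requires other.js back, and the early return on an already visited id is what keeps the traversal from looping forever.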
