All About egg-cluster


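Background

If you have read the egg documentation, you probably remember its mention of "built-in multi-process management".

JavaScript in Node runs on a single thread, so a single Node process cannot make full use of a multi-core server. The usual remedies are a process manager such as PM2, which runs several processes that share one port, or Node's built-in cluster module.

This article looks at egg-cluster.

Before that, a quick note on npm run dev in an egg project: it runs the run method of egg-bin, which eventually calls the startCluster method exposed by egg-cluster.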

require(options.framework).startCluster(options);

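What is egg-cluster?

egg-cluster is one of egg's built-in core modules. It addresses the fact that a single process can only run on one CPU core: it implements a multi-process mode on top of Node's cluster module, and the number of Worker processes is usually set according to the number of CPU cores on the server, so that multi-core resources are put to better use.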
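To make the idea concrete, here is a minimal sketch built directly on Node's cluster module. It is not egg-cluster's code: the names defaultWorkerCount and start, the port 7001, and the RUN_CLUSTER_DEMO environment switch are all assumptions made for this example.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// cluster.isPrimary was added in Node 16; older versions only expose isMaster.
const isPrimary = cluster.isPrimary !== undefined ? cluster.isPrimary : cluster.isMaster;

// One worker per CPU core, and never fewer than one.
function defaultWorkerCount() {
  return Math.max(1, os.cpus().length);
}

function start(port) {
  if (isPrimary) {
    // The primary process only forks workers and watches them.
    for (let i = 0; i < defaultWorkerCount(); i++) cluster.fork();
    cluster.on('exit', (worker, code, signal) => {
      console.log(`worker ${worker.process.pid} exited (${signal || code})`);
    });
  } else {
    // Every worker listens on the same port; cluster shares it between them.
    http
      .createServer((req, res) => res.end(`handled by pid ${process.pid}\n`))
      .listen(port);
  }
}

// Opt-in switch so that loading this file does not fork anything by accident.
if (process.env.RUN_CLUSTER_DEMO) start(7001);
```

Running it with RUN_CLUSTER_DEMO=1 set should start one HTTP worker per core on the same port, which is the behaviour egg-cluster wraps with its own Master, Agent and Worker roles.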

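About the egg multi-process model

Before looking at the model, a word on IPC, that is, inter-process communication.

The diagram below is the process model from the official documentation. The cluster IPC channel exists only between the Master and the Worker/Agent processes. Worker and Agent processes have no direct channel to each other, so their messages are forwarded by the Master.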

[Figure: egg process model, from the official documentation]

The framework starts up in the following order:
[Figure: framework startup sequence]

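  1. After the Master starts, it first forks the Agent process
  2. Once the Agent has initialized successfully, it notifies the Master over the IPC channel
  3. The Master then forks multiple App Workers
  4. Each App Worker notifies the Master once it has initialized successfully
  5. After all processes have initialized successfully, the Master notifies the Agent and the Workers that the application has started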
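The five steps can be simulated inside a single process with an EventEmitter. This is only a sketch of the ordering: FakeMaster and its log array are hypothetical names, and real egg-cluster uses separate processes and IPC messages rather than direct emit calls.

```javascript
const EventEmitter = require('events');

// Simulates the five startup steps in one process.
class FakeMaster extends EventEmitter {
  constructor(workerCount) {
    super();
    this.workerCount = workerCount;
    this.started = 0;
    this.log = [];
    // Step 3: fork the app workers only once the agent has reported in.
    this.once('agent-start', () => this.forkAppWorkers());
    // Step 4: count app-start notifications until every worker is up.
    this.on('app-start', () => {
      this.started += 1;
      this.log.push(`app-start ${this.started}/${this.workerCount}`);
      if (this.started === this.workerCount) {
        // Step 5: tell everyone that the application is ready.
        this.log.push('egg-ready');
        this.emit('egg-ready');
      }
    });
  }

  start() {
    // Step 1: fork the agent first.
    this.log.push('fork agent');
    // Step 2: the agent reports success (process.send in the real thing).
    this.log.push('agent-start');
    this.emit('agent-start');
  }

  forkAppWorkers() {
    this.log.push(`fork ${this.workerCount} app workers`);
    for (let i = 0; i < this.workerCount; i++) this.emit('app-start');
  }
}

const master = new FakeMaster(2);
master.start();
console.log(master.log.join(' -> '));
```

The point of the sketch is the dependency chain: workers are never forked before the agent reports in, and egg-ready is announced only after the last worker has reported.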

[Figure]

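When an application starts, the Master acts as the main process: it starts the Agent as a "secretary" process that helps the Workers with shared chores (logging and the like), and starts the Worker processes that run the actual business code.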

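Next, let's look at the actual implementation.

egg-cluster startup flow in the source

Starting from the entry file index.js, this is where the startCluster API (egg.startCluster) is exposed; its main job is to start the Master.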

exports.startCluster = function(options, callback) {
  new Master(options).ready(callback);
};

Next, we start from the Master.

class Master extends EventEmitter {
  constructor(options) {
    super();
    // initialize options such as https, port, etc.
    this.options = parseOptions(options);
    // worker manager: setAgent/deleteAgent/setWorker/getWorker, etc.
    this.workerManager = new Manager();
    // messenger instance, which implements the communication rules
    this.messenger = new Messenger(this);
    // get-ready module
    ready.mixin(this);
    // detect the environment
    this.isProduction = isProduction();
    ...
    // runs once the Master is ready
    this.ready(() => {
      this.isStarted = true;
      // send egg-ready to every process
      const action = "egg-ready";
      this.messenger.send({
        action,
        to: "parent",
        data: { port: this[REAL_PORT], address: this[APP_ADDRESS], protocol: this[PROTOCOL] },
      });
      this.messenger.send({ action, to: "app", data: this.options });
      this.messenger.send({ action, to: "agent", data: this.options });

      // start checking the status of the agent and the workers
      if (this.isProduction) {
        this.workerManager.startCheck();
      }
    });
    // register event listeners
    this.on("agent-exit", this.onAgentExit.bind(this));
    this.on("agent-start", this.onAgentStart.bind(this));
    this.on("app-exit", this.onAppExit.bind(this));
    this.on("app-start", this.onAppStart.bind(this));
    this.on("reload-worker", this.onReload.bind(this));
    this.once("agent-start", this.forkAppWorkers.bind(this));
    this.on("realport", ({ port, protocol }) => {
      if (port) this[REAL_PORT] = port;
      if (protocol) this[PROTOCOL] = protocol;
    });
    process.once("SIGINT", this.onSignal.bind(this, "SIGINT"));
    process.once("SIGQUIT", this.onSignal.bind(this, "SIGQUIT"));
    process.once("SIGTERM", this.onSignal.bind(this, "SIGTERM"));
    process.once("exit", this.onExit.bind(this));

    ...
    
    // detect the port, then fork the agent
    this.detectPorts().then(() => {
      this.forkAgentWorker();
    });
    ...

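As the code above shows, the constructor first does the preparation: it parses the options, creates the Manager instance (worker bookkeeping: setAgent/deleteAgent/setWorker/getWorker), creates the Messenger instance (for communication among master/agent/worker) and registers event listeners (agent-exit/agent-start, etc.). The most important part is the last step: detect the port, then fork an agent. Let's see what forkAgentWorker does once detectPorts has finished.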

forkAgentWorker() {
    this.agentStartTime = Date.now();
    const args = [JSON.stringify(this.options)];
    const opt = {};
    if (process.platform === "win32") opt.windowsHide = true;
    // add debug execArgv
    const debugPort = process.env.EGG_AGENT_DEBUG_PORT || 5800;
    if (this.options.isDebug)
      opt.execArgv = process.execArgv.concat([
        `--${
          semver.gte(process.version, "8.0.0") ? "inspect" : "debug"
        }-port=${debugPort}`,
      ]);
    const agentWorker = childprocess.fork(this.getAgentWorkerFile(), args, opt);
    agentWorker.status = "starting";
    agentWorker.id = ++this.agentWorkerIndex;
    this.workerManager.setAgent(agentWorker)
    // send debug message
    if (this.options.isDebug) {
      this.messenger.send({
        to: "parent",
        from: "agent",
        action: "debug",
        data: {
          debugPort,
          pid: agentWorker.pid,
        },
      });
    }
    // forward the agent's messages to the messenger
    agentWorker.on("message", (msg) => {
      if (typeof msg === "string") {
        msg = {
          action: msg,
          data: msg,
        };
      }
      msg.from = "agent";
      this.messenger.send(msg);
    });
    // listen for agent errors
    agentWorker.on("error", (err) => {
      err.name = "AgentWorkerError";
      err.id = agentWorker.id;
      err.pid = agentWorker.pid;
      this.logger.error(err);
    });
    // listen for the agent exiting and forward it as an agent-exit message
    agentWorker.once("exit", (code, signal) => {
      this.messenger.send({
        action: "agent-exit",
        data: {
          code,
          signal,
        },
        to: "master",
        from: "agent",
      });
    });
  }

That may look like a lot of code, but the key line is the one below, which uses child_process to start the agent:

const agentWorker = childprocess.fork(this.getAgentWorkerFile(), args, opt);

Next, look up this.getAgentWorkerFile:

getAgentWorkerFile() {
    return path.join(__dirname, "agent_worker.js");
}

Digging further into agent_worker.js, we finally find it: as shown below, it calls process.send to send the "agent-start" notification to the master.

agent.ready(err => {
  // don't send started message to master when start error
  if (err) return;
  agent.removeListener('error', startErrorHandler);
  process.send({ action: 'agent-start', to: 'master' });
});

Back in the Master, this means this.onAgentStart() and this.forkAppWorkers() run next:

// register event listeners
    this.on("agent-exit", this.onAgentExit.bind(this));
    this.on("agent-start", this.onAgentStart.bind(this));
    this.on("app-exit", this.onAppExit.bind(this));
    this.on("app-start", this.onAppStart.bind(this));
    this.on("reload-worker", this.onReload.bind(this));
    this.once("agent-start", this.forkAppWorkers.bind(this));

First, the code of this.onAgentStart(), which mainly sends notifications:

onAgentStart() {
    this.agentWorker.status = "started";
    // Send egg-ready when agent is started after launched
    if (this.isAllAppWorkerStarted) {
      this.messenger.send({
        action: "egg-ready",
        to: "agent",
        data: this.options,
      });
    }
    this.messenger.send({
      action: "egg-pids",
      to: "app",
      data: [this.agentWorker.pid],
    });
    // should send current worker pids when agent restart
    if (this.isStarted) {
      this.messenger.send({
        action: "egg-pids",
        to: "agent",
        data: this.workerManager.getListeningWorkerIds(),
      });
    }
    this.messenger.send({
      action: "agent-start",
      to: "app",
    });
  }

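Next comes this.forkAppWorkers(). cfork? If that name is unfamiliar: its npm description lists the three points below, which amount to starting workers, restarting them, and error logging. forkAppWorkers() mainly uses cfork to start the workers, which later triggers the corresponding app-start.
  1. Easy fork with worker file path
  2. Handle worker restart, even it was exit unexpected.
  3. Auto error log process uncaughtException event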

forkAppWorkers() {
    this.appStartTime = Date.now();
    this.isAllAppWorkerStarted = false;
    this.startSuccessCount = 0;
    const args = [JSON.stringify(this.options)];
    // start the workers
    cfork({
      exec: this.getAppWorkerFile(),
      args,
      silent: false,
      count: this.options.workers,
      // don't refork in local env
      refork: this.isProduction,
      windowsHide: process.platform === "win32",
    });

    let debugPort = process.debugPort;
    // the master listens for fork
    cluster.on("fork", (worker) => {
      worker.disableRefork = true;
      this.workerManager.setWorker(worker);
      worker.on("message", (msg) => {
        if (typeof msg === "string") {
          msg = {
            action: msg,
            data: msg,
          };
        }
        msg.from = "app";
        this.messenger.send(msg);
      });
    });
    // listen for disconnect
    cluster.on("disconnect", (worker) => {
      this.logger.info(
        "[master] app_worker#%s:%s disconnect, suicide: %s, state: %s, current workers: %j",
        worker.id,
        worker.process.pid,
        worker.exitedAfterDisconnect,
        worker.state,
        Object.keys(cluster.workers)
      );
    });
    // listen for worker exit
    cluster.on("exit", (worker, code, signal) => {
      this.messenger.send({
        action: "app-exit",
        data: {
          workerPid: worker.process.pid,
          code,
          signal,
        },
        to: "master",
        from: "app",
      });
    });
    // listen for worker listening
    cluster.on("listening", (worker, address) => {
      this.messenger.send({
        action: "app-start",
        data: {
          workerPid: worker.process.pid,
          address,
        },
        to: "master",
        from: "app",
      });
    });
  }
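Both the agent handler in forkAgentWorker and the worker handler in forkAppWorkers normalize an incoming message the same way before handing it to the messenger: a bare string becomes an object, and the sender is stamped onto it. The helper below is a hypothetical extraction of that step for illustration. normalizeMessage is not a function in egg-cluster, and unlike the original handlers it copies the message instead of mutating it.

```javascript
// Hypothetical helper: the normalization both message handlers perform.
function normalizeMessage(msg, from) {
  // A bare string such as 'ping' becomes { action: 'ping', data: 'ping' }.
  const normalized = typeof msg === 'string' ? { action: msg, data: msg } : { ...msg };
  // Stamp the sender so the messenger knows where the message came from.
  normalized.from = from;
  return normalized;
}

console.log(normalizeMessage('ping', 'app'));
console.log(normalizeMessage({ action: 'agent-start', to: 'master' }, 'agent'));
```

With this shape every message reaching the messenger carries an action and a from field, so routing does not need to special-case string messages.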

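Once app-start fires, we are back at the listeners registered in the Master's constructor, so this.onAppStart() runs next. As shown below, the master mainly sends messages to the agent (egg-pids) and to the app workers (egg-ready), and finally triggers ready.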

this.on("app-start", this.onAppStart.bind(this));

onAppStart(data) {
    const worker = this.workerManager.getWorker(data.workerPid);
    const address = data.address;

    // worker should listen stickyWorkerPort when sticky mode
    if (this.options.sticky) {
      if (String(address.port) !== String(this.options.stickyWorkerPort)) {
        return;
      }
      // worker should listen REALPORT when not sticky mode
    } else if (
      !isUnixSock(address) &&
      String(address.port) !== String(this[REAL_PORT])
    ) {
      return;
    }

    // send message to agent with alive workers
    this.messenger.send({
      action: "egg-pids",
      to: "agent",
      data: this.workerManager.getListeningWorkerIds(),
    });

    ...
    ...

    // Send egg-ready when app is started after launched
    if (this.isAllAppWorkerStarted) {
      this.messenger.send({
        action: "egg-ready",
        to: "app",
        data: this.options,
      });
    }
    ...
    ...
    if (this.options.sticky) {
      this.startMasterSocketServer((err) => {
        if (err) return this.ready(err);
        this.ready(true);
      });
    } else {
      this.ready(true);
    }
  }

The ready callback is shown again below: the master sends "egg-ready" to parent, app and agent.

   this.ready(() => {
      this.isStarted = true;
      // send egg-ready to every process
      const action = "egg-ready";
      this.messenger.send({
        action,
        to: "parent",
        data: { port: this[REAL_PORT], address: this[APP_ADDRESS], protocol: this[PROTOCOL] },
      });
      this.messenger.send({ action, to: "app", data: this.options });
      this.messenger.send({ action, to: "agent", data: this.options });

      // start checking the status of the agent and the workers
      if (this.isProduction) {
        this.workerManager.startCheck();
      }
   });
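The Master calls ready in two ways: this.ready(fn) registers a callback, and this.ready(true) fires it. That behaviour comes from the get-ready mixin. The sketch below is a simplified, synchronous stand-in written for this article, not the library's implementation; mixinReady and fakeMaster are hypothetical names, and error handling (as in this.ready(err)) is left out.

```javascript
// A hypothetical, simplified stand-in for the get-ready mixin.
function mixinReady(target) {
  let isReady = false;
  const callbacks = [];
  target.ready = function ready(flagOrFn) {
    if (typeof flagOrFn === 'function') {
      // Register a callback; run it right away if we are already ready.
      if (isReady) flagOrFn();
      else callbacks.push(flagOrFn);
      return;
    }
    if (flagOrFn === true && !isReady) {
      isReady = true;
      // Flush the pending callbacks in registration order.
      while (callbacks.length) callbacks.shift()();
    }
  };
}

const fakeMaster = { events: [] };
mixinReady(fakeMaster);
fakeMaster.ready(() => fakeMaster.events.push('send egg-ready'));
fakeMaster.events.push('workers listening');
fakeMaster.ready(true);
fakeMaster.ready(() => fakeMaster.events.push('late callback'));
console.log(fakeMaster.events);
```

This is why the callback registered in the constructor does not run until onAppStart finally calls this.ready(true).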

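That is roughly the end of the startup flow. As a final addition, here is the process communication diagram from the comments of the messenger module in the source.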

 * master messenger,provide communication between parent, master, agent and app.
 *
 *             ┌────────┐
 *             │ parent │
 *            /└────────┘\
 *           /     |      \
 *          /  ┌────────┐  \
 *         /   │ master │   \
 *        /    └────────┘    \
 *       /     /         \    \
 *     ┌───────┐         ┌───────┐
 *     │ agent │ ------- │  app  │
 *     └───────┘         └───────┘
 *
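The routing idea in this diagram can be sketched as a small in-process model. Everything here is an assumption for illustration: FakeMessenger, its inbox arrays and the refresh-cache action are hypothetical, and the real messenger delivers over IPC channels rather than into arrays.

```javascript
const EventEmitter = require('events');

// Hypothetical in-process model of a master-side router. Each "process"
// is just an array that collects the messages delivered to it.
class FakeMessenger {
  constructor(master) {
    this.master = master;
    this.inbox = { parent: [], agent: [], app: [] };
  }

  send(msg) {
    // Messages without an explicit sender are treated as coming from master.
    const data = { from: 'master', ...msg };
    if (data.to === 'master') {
      // Messages for the master become ordinary EventEmitter events.
      this.master.emit(data.action, data.data);
    } else if (Array.isArray(this.inbox[data.to])) {
      this.inbox[data.to].push(data);
    } else {
      throw new Error(`unknown target: ${data.to}`);
    }
  }
}

const hub = new EventEmitter();
const messenger = new FakeMessenger(hub);
const seen = [];
hub.on('agent-start', () => seen.push('master got agent-start'));

// agent -> master
messenger.send({ action: 'agent-start', to: 'master', from: 'agent' });
// agent -> app, relayed by the master because no direct channel exists
messenger.send({ action: 'refresh-cache', to: 'app', from: 'agent', data: { key: 'k' } });
// master -> agent, with the sender filled in by default
messenger.send({ action: 'egg-ready', to: 'agent' });
console.log(seen, messenger.inbox.app, messenger.inbox.agent);
```

Keeping all routing in one send method is what lets agent and app exchange messages even though neither has a channel to the other.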

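Summary

After reading this, you can see that Master extends EventEmitter and uses the publish/subscribe pattern to collect messages, with send doing the forwarding. You can also see that the agent is started with child_process.fork, while the workers are started through cluster.

A knowledge map of the topics around egg-cluster:

[Figure: egg-cluster knowledge map]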

References

eggjs.org/zh-cn/core/…
zhuanlan.zhihu.com/p/29374045#
www.yuque.com/jianzhen/ib…