Design Principles of Node Status Synchronization in a Nacos Cluster

Preface

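  • I have recently been studying the source code and execution flow of the open-source Nacos framework.
  • This post briefly covers how Nacos synchronizes node status under the AP cluster architecture.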

How node status synchronization is designed in a Nacos cluster

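  • Nacos cluster nodes exchange status mainly through the init method of the ServerListManager class, which registers scheduled tasks for node status communication.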
@Component("serverListManager")
public class ServerListManager extends MemberChangeListener {
    ......
    @PostConstruct
    public void init() {
        // Cluster node status synchronization task
        GlobalExecutor.registerServerStatusReporter(new ServerStatusReporter(), 2000);
        GlobalExecutor.registerServerInfoUpdater(new ServerInfoUpdater());
    }
    ......
}
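
registerServerStatusReporter takes a Runnable and a delay, which suggests a one-shot schedule rather than a fixed-rate one, so a task that should repeat has to register itself again after each run. The following is a minimal sketch of that pattern using the JDK's ScheduledExecutorService. The class SelfReschedulingTask and its runTimes helper are hypothetical names for illustration, not Nacos code, and the run counter exists only so the demo terminates.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration: a task that schedules its own next run.
class SelfReschedulingTask implements Runnable {

    private final ScheduledExecutorService executor;
    private final long periodMillis;
    private final CountDownLatch remaining;

    SelfReschedulingTask(ScheduledExecutorService executor, long periodMillis, CountDownLatch remaining) {
        this.executor = executor;
        this.periodMillis = periodMillis;
        this.remaining = remaining;
    }

    @Override
    public void run() {
        try {
            // The real work (for example, reporting status) would go here.
            remaining.countDown();
        } finally {
            // Re-register in finally so that a failed run does not stop the loop.
            if (remaining.getCount() > 0) {
                executor.schedule(this, periodMillis, TimeUnit.MILLISECONDS);
            }
        }
    }

    // Runs the task the given number of times; returns true if all runs finished within 5 seconds.
    static boolean runTimes(int times, long initialDelayMillis, long periodMillis) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch remaining = new CountDownLatch(times);
        try {
            executor.schedule(new SelfReschedulingTask(executor, periodMillis, remaining),
                    initialDelayMillis, TimeUnit.MILLISECONDS);
            return remaining.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("completed: " + runTimes(3, 20, 20));
    }
}
```

One consequence of this design is that the period can change between runs, because each run reads the delay it wants for the next one.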
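  • The ServerStatusReporter task first fetches every Nacos node in the cluster, then iterates over them and sends its own status to the others. It skips itself, and it also skips any node whose metadata carries a version (1.3.0 onwards), since those nodes report status through the newer API.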
private class ServerStatusReporter implements Runnable {

    @Override
    public void run() {
        try {

            if (EnvUtil.getPort() <= 0) {
                return;
            }

            int weight = Runtime.getRuntime().availableProcessors() / 2;
            if (weight <= 0) {
                weight = 1;
            }

            long curTime = System.currentTimeMillis();
            String status = LOCALHOST_SITE + "#" + EnvUtil.getLocalAddress() + "#" + curTime + "#" + weight
                    + "\r\n";

            // Fetch all nodes
            List<Member> allServers = getServers();

            if (!contains(EnvUtil.getLocalAddress())) {
                Loggers.SRV_LOG.error("local ip is not in serverlist, ip: {}, serverlist: {}",
                        EnvUtil.getLocalAddress(), allServers);
                return;
            }

            // In cluster mode
            if (allServers.size() > 0 && !EnvUtil.getLocalAddress()
                    .contains(IPUtil.localHostIP())) {
                for (Member server : allServers) {
                    // Skip the local node
                    if (Objects.equals(server.getAddress(), EnvUtil.getLocalAddress())) {
                        continue;
                    }

                    // This metadata information exists from 1.3.0 onwards "version"
                    if (server.getExtendVal(MemberMetaDataConstants.VERSION) != null) {
                        Loggers.SRV_LOG
                                .debug("[SERVER-STATUS] target {} has extend val {} = {}, use new api report status",
                                        server.getAddress(), MemberMetaDataConstants.VERSION,
                                        server.getExtendVal(MemberMetaDataConstants.VERSION));
                        continue;
                    }
                    
                    Message msg = new Message();
                    msg.setData(status);
                    
                    // Send the status message to the other node
                    synchronizer.send(server.getAddress(), msg);
                }
            }
        } catch (Exception e) {
            Loggers.SRV_LOG.error("[SERVER-STATUS] Exception while sending server status", e);
        } finally {
            GlobalExecutor
                    .registerServerStatusReporter(this, switchDomain.getServerStatusSynchronizationPeriodMillis());
        }

    }
}
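
The status payload built above is a single line of four `#`-separated fields (site, address, timestamp, weight) terminated by `\r\n`, with the weight derived from half the available processors and floored at 1. A minimal sketch of formatting and parsing such a line follows. The StatusLine class and its methods are hypothetical helpers rather than Nacos APIs, the site value `site-a` is a placeholder, and the parser assumes the address itself contains no `#`.

```java
// Hypothetical illustration of the "site#address#timestamp#weight\r\n" payload.
class StatusLine {

    final String site;
    final String address;
    final long timestamp;
    final int weight;

    StatusLine(String site, String address, long timestamp, int weight) {
        this.site = site;
        this.address = address;
        this.timestamp = timestamp;
        this.weight = weight;
    }

    // Same rule as the reporter: half the processors, never less than 1.
    static int weightFor(int processors) {
        return Math.max(1, processors / 2);
    }

    // Builds the line in the same field order as the reporter.
    static String format(String site, String address, long timestamp, int weight) {
        return site + "#" + address + "#" + timestamp + "#" + weight + "\r\n";
    }

    // Parses one status line; rejects anything that does not have exactly four fields.
    static StatusLine parse(String raw) {
        String[] parts = raw.trim().split("#");
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected 4 fields but got " + parts.length);
        }
        return new StatusLine(parts[0], parts[1], Long.parseLong(parts[2]), Integer.parseInt(parts[3]));
    }

    public static void main(String[] args) {
        String line = format("site-a", "192.168.1.10:8848", System.currentTimeMillis(),
                weightFor(Runtime.getRuntime().availableProcessors()));
        StatusLine parsed = parse(line);
        System.out.println(parsed.address + " weight=" + parsed.weight);
    }
}
```

Carrying the timestamp in the payload lets a receiver judge how fresh a report is without relying on when the request happened to arrive.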
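  • The HTTP request is sent by ServerStatusSynchronizer.send(), which issues an asynchronous GET to the target node's /operator/server/status endpoint and logs a warning if the call fails.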
public class ServerStatusSynchronizer implements Synchronizer {
    
    @Override
    public void send(final String serverIP, Message msg) {
        if (StringUtils.isEmpty(serverIP)) {
            return;
        }
        
        final Map<String, String> params = new HashMap<String, String>(2);
        
        params.put("serverStatus", msg.getData());
        
        String url = "http://" + serverIP + ":" + EnvUtil.getPort() + EnvUtil.getContextPath()
                + UtilsAndCommons.NACOS_NAMING_CONTEXT + "/operator/server/status";
        
        if (IPUtil.containsPort(serverIP)) {
            url = "http://" + serverIP + EnvUtil.getContextPath() + UtilsAndCommons.NACOS_NAMING_CONTEXT
                    + "/operator/server/status";
        }
        
        try {
            HttpClient.asyncHttpGet(url, null, params, new Callback<String>() {
                @Override
                public void onReceive(RestResult<String> result) {
                    if (!result.ok()) {
                        Loggers.SRV_LOG.warn("[STATUS-SYNCHRONIZE] failed to request serverStatus, remote server: {}",
                                serverIP);
                    }
                }
    
                @Override
                public void onError(Throwable throwable) {
                    Loggers.SRV_LOG.warn("[STATUS-SYNCHRONIZE] failed to request serverStatus, remote server: {}", serverIP, throwable);
                }
    
                @Override
                public void onCancel() {
        
                }
            });
        } catch (Exception e) {
            Loggers.SRV_LOG.warn("[STATUS-SYNCHRONIZE] failed to request serverStatus, remote server: {}", serverIP, e);
        }
    }
}
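
The send() method builds the target URL in two steps: it first appends the local port, then overwrites the result when the server address already carries a port. The same decision can be written as a single expression, as in the sketch below. StatusUrlBuilder is a hypothetical helper, the `/nacos` context path and `/v1/ns` naming context used in the example are assumed values, and the `contains(":")` check is a simplification that only suits IPv4 addresses and host names.

```java
// Hypothetical illustration of how the status endpoint URL is assembled.
class StatusUrlBuilder {

    static final String STATUS_PATH = "/operator/server/status";

    // Uses the port embedded in serverIp if present, otherwise falls back to defaultPort.
    static String build(String serverIp, int defaultPort, String contextPath, String namingContext) {
        // Simplified port detection: an IPv6 literal would need different handling.
        String hostAndPort = serverIp.contains(":") ? serverIp : serverIp + ":" + defaultPort;
        return "http://" + hostAndPort + contextPath + namingContext + STATUS_PATH;
    }

    public static void main(String[] args) {
        // Context path and naming context are assumed values for the demo.
        System.out.println(build("10.0.0.2", 8848, "/nacos", "/v1/ns"));
        System.out.println(build("10.0.0.2:8849", 8848, "/nacos", "/v1/ns"));
    }
}
```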

Finally

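  • If a Nacos node goes down, the other nodes in the cluster detect it and update that node's status. This keeps node selection accurate when cluster nodes and clients establish heartbeat connections.
  • Related post: Design Principles of Heartbeat Health Checks in a Nacos Cluster
  • Learn humbly, improve together -_-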