Nacos: Service Registration (Server Side)

Service registration

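To register a service, nacos-server does four things:

  1. Updates the local registry
  2. Starts the heartbeat-check thread for the service provider
  3. Notifies subscribed service consumers of the change
  4. When not running with MODE=standalone, synchronizes the data across the cluster for consistency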

I. Updating the local registry

Entry point: com.alibaba.nacos.naming.controllers.InstanceController#register, which delegates to ServiceManager#registerInstance:

public void registerInstance(String namespaceId, String serviceName, Instance instance) throws NacosException {
        createEmptyService(namespaceId, serviceName, instance.isEphemeral()); // @1
        Service service = getService(namespaceId, serviceName);
        if (service == null) {
            throw new NacosException(NacosException.INVALID_PARAM,
                "service not found, namespace: " + namespaceId + ", service: " + serviceName);
        }
        addInstance(namespaceId, serviceName, instance.isEphemeral(), instance); // @2
}

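Code @1: creates the Service and puts it into serviceMap (the registry). instance.isEphemeral() tells whether the instance is ephemeral. As described in an earlier article, Distro is the consistency protocol for ephemeral data; non-ephemeral data uses the raft consistency protocol instead.

Code @2: adds the Instance and synchronizes it to the other nodes in the cluster.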

com.alibaba.nacos.naming.core.ServiceManager#createServiceIfAbsent

public void createServiceIfAbsent(String namespaceId, String serviceName, boolean local, Cluster cluster) throws NacosException {
        Service service = getService(namespaceId, serviceName);
        if (service == null) {
            service = new Service();
            service.setName(serviceName);
            service.setNamespaceId(namespaceId);
            service.setGroupName(NamingUtils.getGroupName(serviceName));
            service.setLastModifiedMillis(System.currentTimeMillis());
            service.recalculateChecksum();
            if (cluster != null) {
                cluster.setService(service);
                service.getClusterMap().put(cluster.getName(), cluster);
            }
            service.validate();
            if (local) {
                putServiceAndInit(service); // @1
            } else {
                addOrReplaceService(service); // @2
            }
        }
}

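Code @1: local == true means ephemeral data. putServiceAndInit does several things:

  • Puts the service into the in-memory registry
  • Starts the heartbeat-check task ClientBeatCheckTask for this service provider (covered at the end of this article)
  • Adds listeners (that is, establishes the subscription relationship between the clients that fetch the service list and the service provider); the notifications are executed by the com.alibaba.nacos.naming.consistency.ephemeral.distro.DistroConsistencyServiceImpl.Notifier thread

Code @2: the data synchronization is carried out by the consistency protocol class com.alibaba.nacos.naming.consistency.ConsistencyService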
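
To make the "create if absent, then put into the registry" step concrete, here is a minimal sketch. It assumes the registry is a two-level map keyed by namespace and then by service name; RegistrySketch and ServiceRecord are hypothetical names for illustration and are not the Nacos classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical, simplified stand-in for a server-side registry (not Nacos code). */
class RegistrySketch {

    /** Minimal service record; the real Service holds clusters, checksums, and more. */
    static final class ServiceRecord {
        final String namespaceId;
        final String name;

        ServiceRecord(String namespaceId, String name) {
            this.namespaceId = namespaceId;
            this.name = name;
        }
    }

    // Assumed layout: namespaceId -> (serviceName -> service record)
    private final Map<String, Map<String, ServiceRecord>> serviceMap = new ConcurrentHashMap<>();

    /** Returns the existing record, or atomically creates and stores a new one. */
    ServiceRecord createIfAbsent(String namespaceId, String serviceName) {
        return serviceMap
            .computeIfAbsent(namespaceId, ns -> new ConcurrentHashMap<>())
            .computeIfAbsent(serviceName, n -> new ServiceRecord(namespaceId, n));
    }

    /** Returns the record, or null when the namespace or service is unknown. */
    ServiceRecord get(String namespaceId, String serviceName) {
        Map<String, ServiceRecord> services = serviceMap.get(namespaceId);
        return services == null ? null : services.get(serviceName);
    }
}
```

Using computeIfAbsent keeps the check and the insert atomic, so two concurrent registrations of the same service end up sharing one record.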

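II. Notifying consumers subscribed to the service

Notifier is a task thread: it iterates over every listener and calls listener.onChange, after which listener.updateIPs performs the update and the notification.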

 public void updateIPs(Collection<Instance> instances, boolean ephemeral) {
        Map<String, List<Instance>> ipMap = new HashMap<>(clusterMap.size());
        for (String clusterName : clusterMap.keySet()) {
            ipMap.put(clusterName, new ArrayList<>());
        }
        for (Instance instance : instances) {
            try {
                if (instance == null) {
                    Loggers.SRV_LOG.error("[NACOS-DOM] received malformed ip: null");
                    continue;
                }
                if (StringUtils.isEmpty(instance.getClusterName())) {
                    instance.setClusterName(UtilsAndCommons.DEFAULT_CLUSTER_NAME);
                }
                if (!clusterMap.containsKey(instance.getClusterName())) {
                    Loggers.SRV_LOG.warn("cluster: {} not found, ip: {}, will create new cluster with default configuration.",
                        instance.getClusterName(), instance.toJSON());
                    Cluster cluster = new Cluster(instance.getClusterName(), this);
                    cluster.init();
                    getClusterMap().put(instance.getClusterName(), cluster);
                }

                List<Instance> clusterIPs = ipMap.get(instance.getClusterName());
                if (clusterIPs == null) {
                    clusterIPs = new LinkedList<>();
                    ipMap.put(instance.getClusterName(), clusterIPs);
                }
                clusterIPs.add(instance);
            } catch (Exception e) {
                Loggers.SRV_LOG.error("[NACOS-DOM] failed to process ip: " + instance, e);
            }
        }
        for (Map.Entry<String, List<Instance>> entry : ipMap.entrySet()) { //@1 
            List<Instance> entryIPs = entry.getValue();
            clusterMap.get(entry.getKey()).updateIPs(entryIPs, ephemeral);
        }
        setLastModifiedMillis(System.currentTimeMillis());
        getPushService().serviceChanged(this); // @2 publish the event
        StringBuilder stringBuilder = new StringBuilder();

        for (Instance instance : allIPs()) {
            stringBuilder.append(instance.toIPAddr()).append("_").append(instance.isHealthy()).append(",");
        }

        Loggers.EVT_LOG.info("[IP-UPDATED] namespace: {}, service: {}, ips: {}",
            getNamespaceId(), getName(), stringBuilder.toString());

    }

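Code @1: this loop, together with the large block above it, updates clusterMap.

Code @2: builds a ServiceChangeEvent and publishes it to the applicationContext, where com.alibaba.nacos.naming.push.PushService#onApplicationEvent consumes it. Inside onApplicationEvent, a thread fetches the PushClient entries (the service subscription relationships) from clientMap and pushes the notification to the client side over UDP. How the client receives it will be analyzed in a later article.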
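
The listener mechanism described in this section can be sketched roughly as follows. This is only an illustration of the pattern (a queue of changed keys, drained by a worker that calls each listener registered for that key); NotifierSketch and its methods are hypothetical names, and the real Notifier runs in its own thread instead of being drained by hand.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Hypothetical sketch of a key-change notifier (not the Nacos Notifier class). */
class NotifierSketch {
    // Keys whose data changed and whose listeners still need to be called
    private final Queue<String> changedKeys = new ConcurrentLinkedQueue<>();
    // key -> listeners interested in that key
    private final Map<String, List<Consumer<String>>> listeners = new ConcurrentHashMap<>();

    /** Registers a listener for one key. */
    void listen(String key, Consumer<String> listener) {
        listeners.computeIfAbsent(key, k -> new CopyOnWriteArrayList<>()).add(listener);
    }

    /** Records that a key changed; listeners are called later by processPending(). */
    void addTask(String key) {
        changedKeys.add(key);
    }

    /** Drains the queue and calls every listener of each changed key; returns the call count. */
    int processPending() {
        int notified = 0;
        String key;
        while ((key = changedKeys.poll()) != null) {
            List<Consumer<String>> interested =
                listeners.getOrDefault(key, Collections.<Consumer<String>>emptyList());
            for (Consumer<String> listener : interested) {
                listener.accept(key);
                notified++;
            }
        }
        return notified;
    }
}
```

Decoupling addTask from processPending means the thread that writes the registry never blocks on slow listeners.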

III. Client liveness check: ClientBeatCheckTask

@Override
  public void run() {
      try {
          if (!getDistroMapper().responsible(service.getName())) {
              return;
          }
          List<Instance> instances = service.allIPs(true);
          for (Instance instance : instances) {   
              if (System.currentTimeMillis() - instance.getLastBeat() > instance.getInstanceHeartBeatTimeOut()) { // @1
                  if (!instance.isMarked()) {
                      if (instance.isHealthy()) {
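                          instance.setHealthy(false);
                          Loggers.EVT_LOG.info("{POS} {IP-DISABLED} valid: {}:{}@{}@{}, region: {}, msg: client timeout after {}, last beat: {}",
                              instance.getIp(), instance.getPort(), instance.getClusterName(), service.getName(),
                              UtilsAndCommons.LOCALHOST_SITE, instance.getInstanceHeartBeatTimeOut(), instance.getLastBeat());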
                          getPushService().serviceChanged(service);
                          SpringContext.getAppContext().publishEvent(new InstanceHeartbeatTimeoutEvent(this, instance));
                      }
                  }
              }
          }
          if (!getGlobalConfig().isExpireInstance()) {
              return;
          }
          // then remove obsolete instances:
          for (Instance instance : instances) {

              if (instance.isMarked()) {
                  continue;
              }
              if (System.currentTimeMillis() - instance.getLastBeat() > instance.getIpDeleteTimeout()) { // @2
                  // delete instance
                  deleteIP(instance);
              }
          }
      } catch (Exception e) {
          Loggers.SRV_LOG.warn("Exception while processing client beat time out.", e);
      }
  }

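Code @1: an instance that has not sent a heartbeat for more than 15s (the heartbeat timeout) has healthy set to false.

Code @2: an instance that has not sent a heartbeat for more than 30s (the delete timeout) is removed.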
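
The two thresholds can be summarized in a small decision function. This is a simplified sketch: it assumes fixed timeouts of 15s and 30s, ignores the isMarked and isExpireInstance checks, and folds the task's two passes into one result. BeatCheckSketch and Action are hypothetical names.

```java
/** Hypothetical sketch of the heartbeat timeout decision (not Nacos code). */
class BeatCheckSketch {
    enum Action { KEEP, MARK_UNHEALTHY, DELETE }

    // Assumed timeouts, matching the 15s and 30s mentioned in the text
    static final long HEARTBEAT_TIMEOUT_MS = 15_000L;
    static final long IP_DELETE_TIMEOUT_MS = 30_000L;

    /** Decides what to do with an instance given the current time and its last heartbeat. */
    static Action check(long nowMillis, long lastBeatMillis) {
        long silence = nowMillis - lastBeatMillis;
        if (silence > IP_DELETE_TIMEOUT_MS) {
            return Action.DELETE;          // silent for too long: remove the instance
        }
        if (silence > HEARTBEAT_TIMEOUT_MS) {
            return Action.MARK_UNHEALTHY;  // late heartbeat: keep it but flag it unhealthy
        }
        return Action.KEEP;
    }
}
```

Passing the current time as a parameter keeps the function deterministic, which makes the boundary cases (exactly 15s, exactly 30s) easy to check.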

IV. Data synchronization: ConsistencyService

In a nacos cluster, com.alibaba.nacos.naming.consistency.ConsistencyService#put is what keeps data consistent across the distributed nodes.

1. DistroConsistencyServiceImpl (distro protocol)

@Override
    public void put(String key, Record value) throws NacosException {
        onPut(key, value); // @1
        taskDispatcher.addTask(key); // @2
    }

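Code @1: onPut does much the same as putServiceAndInit: it adds the data to the in-memory registry and notifies the listeners.

Code @2: puts the key onto TaskScheduler.queue; the TaskDispatcher.TaskScheduler#run thread takes the key off the queue, builds the sync payload, and distributes the registration data to the other server nodes in the cluster.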
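
The "apply locally first, replicate asynchronously afterwards" shape of this put can be sketched as below. It is a minimal illustration under assumptions: values are plain strings, and a caller pulls batches by hand where the real code uses a scheduler thread and HTTP calls to peers. DistroPutSketch and nextSyncBatch are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch of a write-locally-then-replicate put (not Nacos code). */
class DistroPutSketch {
    private final Map<String, String> localStore = new ConcurrentHashMap<>();
    private final BlockingQueue<String> pendingKeys = new LinkedBlockingQueue<>();

    /** Step @1: update local data. Step @2: queue the key for later replication. */
    void put(String key, String value) {
        localStore.put(key, value);
        pendingKeys.offer(key);
    }

    /** Builds the next payload to send to peers from at most maxKeys queued keys. */
    Map<String, String> nextSyncBatch(int maxKeys) {
        List<String> keys = new ArrayList<>();
        pendingKeys.drainTo(keys, maxKeys);
        Map<String, String> batch = new LinkedHashMap<>();
        for (String key : keys) {
            String value = localStore.get(key);
            if (value != null) {
                batch.put(key, value); // always ship the latest local value for the key
            }
        }
        return batch;
    }
}
```

Only the key is queued, so if a key is written several times before the batch is built, the peers receive the most recent value.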

The corresponding receiver on the other peers is com.alibaba.nacos.naming.controllers.DistroController#onSyncDatum

@RequestMapping(value = "/datum", method = RequestMethod.PUT)
    public String onSyncDatum(HttpServletRequest request, HttpServletResponse response) throws Exception {
        String entity = IOUtils.toString(request.getInputStream(), "UTF-8");
        Map<String, Datum<Instances>> dataMap =
            serializer.deserializeMap(entity.getBytes(), Instances.class);

        for (Map.Entry<String, Datum<Instances>> entry : dataMap.entrySet()) {
            if (KeyBuilder.matchEphemeralInstanceListKey(entry.getKey())) {
                String namespaceId = KeyBuilder.getNamespace(entry.getKey());
                String serviceName = KeyBuilder.getServiceName(entry.getKey());
                if (!serviceManager.containService(namespaceId, serviceName)
                    && switchDomain.isDefaultInstanceEphemeral()) {
                    serviceManager.createEmptyService(namespaceId, serviceName, true);
                }
                consistencyService.onPut(entry.getKey(), entry.getValue().value); // @1
            }
        }
        return "ok";
    }

Code @1: this should look familiar by now; it is the onPut method again.

2. RaftConsistencyServiceImpl (raft protocol)

The put method calls com.alibaba.nacos.naming.consistency.persistent.raft.RaftCore#signalPublish

public void signalPublish(String key, Record value) throws Exception {
       if (!isLeader()) { // @1
           JSONObject params = new JSONObject();
           params.put("key", key);
           params.put("value", value);
           Map<String, String> parameters = new HashMap<>(1);
           parameters.put("key", key);
           raftProxy.proxyPostLarge(getLeader().ip, API_PUB, params.toJSONString(), parameters);
           return;
       }
       try {
           OPERATE_LOCK.lock();
           long start = System.currentTimeMillis();
           final Datum datum = new Datum();
           datum.key = key;
           datum.value = value;
           if (getDatum(key) == null) {
               datum.timestamp.set(1L);
           } else {
               datum.timestamp.set(getDatum(key).timestamp.incrementAndGet());
           }

           JSONObject json = new JSONObject();
           json.put("datum", datum);
           json.put("source", peers.local());
           onPublish(datum, peers.local()); //@2

           final String content = JSON.toJSONString(json);
           // Broadcast to all nodes; success only requires a majority (majorityCount = peers/2+1)
           // Compare jraft: commitAt / isGrant, commit on majority
           //
           /***
            * 1. jraft sendEntries() sends in parallel
            * peers.majorityCount() = quorum
            */
           final CountDownLatch latch = new CountDownLatch(peers.majorityCount()); // @3
           for (final String server : peers.allServersIncludeMyself()) {
               // No request is needed for the local node itself
               if (isLeader(server)) {
                   latch.countDown();
                   continue;
               }
               final String url = buildURL(server, API_ON_PUB);
               HttpClient.asyncHttpPostLarge(url, Arrays.asList("key=" + key), content, new AsyncCompletionHandler<Integer>() {
                   @Override
                   public Integer onCompleted(Response response) throws Exception {
                       if (response.getStatusCode() != HttpURLConnection.HTTP_OK) {
                           Loggers.RAFT.warn("[RAFT] failed to publish data to peer, datumId={}, peer={}, http code={}",
                               datum.key, server, response.getStatusCode());
                           return 1;
                       }
                       latch.countDown();
                       return 0;
                   }

                   @Override
                   public STATE onContentWriteCompleted() {
                       return STATE.CONTINUE;
                   }
               });

           }
           if (!latch.await(UtilsAndCommons.RAFT_PUBLISH_TIMEOUT, TimeUnit.MILLISECONDS)) { //@4
               // only majority servers return success , we can consider this update success
               Loggers.RAFT.error("data publish failed, caused failed to notify majority, key={}", key);
               throw new IllegalStateException("data publish failed, caused failed to notify majority, key=" + key);
           }

           long end = System.currentTimeMillis();
           Loggers.RAFT.info("signalPublish cost {} ms, key: {}", (end - start), key);
       } finally {
           OPERATE_LOCK.unlock();
       }
   }

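Code @1: if this node is not the leader, the request is forwarded to the leader.

Code @2: completes the local store.

Code @3: iterates over all peers and sends the sync request. A CountDownLatch counts the acknowledgements; once a majority (peers/2+1, counting the leader itself) has acknowledged, the data is considered consistently synchronized.

Code @4: the CountDownLatch waits here for up to 5s.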
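
The majority-acknowledgement pattern from @3 and @4 can be isolated in a small, self-contained sketch. It assumes followers are identified by strings and that sending is modelled by a predicate returning true on success; MajorityAckSketch, publish, and sendToFollower are hypothetical names, and a thread pool stands in for the asynchronous HTTP client.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

/** Hypothetical sketch of waiting for a majority of acknowledgements (not Nacos code). */
class MajorityAckSketch {

    /** Quorum size for a cluster of the given size. */
    static int majorityCount(int clusterSize) {
        return clusterSize / 2 + 1;
    }

    /** Returns true if a majority (leader included) acknowledged within the timeout. */
    static boolean publish(List<String> followers, Predicate<String> sendToFollower, long timeoutMs) {
        int clusterSize = followers.size() + 1;            // followers plus the leader
        CountDownLatch latch = new CountDownLatch(majorityCount(clusterSize));
        latch.countDown();                                 // the leader counts as one acknowledgement
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, followers.size()));
        try {
            for (String follower : followers) {
                pool.execute(() -> {
                    try {
                        if (sendToFollower.test(follower)) {
                            latch.countDown();             // extra countDown calls past zero are no-ops
                        }
                    } catch (RuntimeException e) {
                        // a failed send is simply not counted as an acknowledgement
                    }
                });
            }
            return latch.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

With two followers the quorum is 2 of 3, so a single follower acknowledgement plus the leader's own is enough; slow or failed followers do not block the caller beyond the timeout.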

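The follower-side receiving code is not reproduced here; it ends up in the same onPublish method seen at @2. PS: sofajraft is also worth a look, as it likewise implements data consistency on top of the raft protocol.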

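Summary

This article analyzed how nacos-server registers a service and how the two modes synchronize data in a cluster. The design ideas behind the raft protocol are well worth studying, and the asynchronous programming style can be borrowed for your own projects.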