Nacos 2: A Complete Walkthrough of the Service Registration and Discovery Flow and Source Code


Original article on Yuque (renders better): www.yuque.com/di_long/byy…

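Overview

Why read the Nacos source code?

Business value

  • Registry: the most central component of a microservice system, and Nacos is one of the best implementations
  • Config center: another core microservice component
  • Moving traditional applications to the cloud: the cloud has large advantages in both cost and maintenance

Technical value

  • High performance:
    • Nacos 2 (8 cores, 16 GB, three nodes): register/deregister instance TPS reaches 26,000; query instance TPS reaches 30,000
    • Nacos 1 (16 cores, 32 GB, three nodes): register/query instance TPS reaches 13,000
  • How a registry is implemented
  • How it integrates with various ecosystems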

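Glossary

  • Namespace: logically isolates service groups and services; the default is the public namespace (public)
  • Group: classifies multiple services together; for example, a user group may contain the user, points, and membership services
  • Service, Cluster, Instance: correspond to the three-level hierarchical storage model
  • Registration (registrant, registry table): acts in the service-provider role and registers its own service instance information with Nacos; the registrant is the service provider; the registry table is the data structure that stores registered services (ConcurrentMap<Service, Set<String>>)
  • Subscription (subscriber, subscription table): acts in the service-consumer role and pulls the services it depends on from Nacos; the subscriber is the service consumer; the subscription table is the data structure that stores subscribing clients (ConcurrentMap<Service, Set<String>>)
  • Delay task execute engine (NacosDelayTaskExecuteEngine): runs tasks on a schedule, generally implemented with ScheduledExecutorService
  • Execute task execute engine (NacosExecuteTaskExecuteEngine): executes tasks, similar to ThreadPoolExecutor. To reduce thread switching, Nacos generally implements it with Thread + BlockingQueue.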
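
To make the "Thread + BlockingQueue" idea in the last glossary entry concrete, here is a minimal sketch of a single worker thread draining one queue. It is not the Nacos implementation: the class QueueWorkerSketch and its methods submit and runInOrder are hypothetical names made up for this illustration, and only JDK classes are used.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical single-thread task worker: one dedicated thread drains one queue.
final class QueueWorkerSketch implements AutoCloseable {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final Thread worker;
    private volatile boolean closed;

    QueueWorkerSketch(String name) {
        worker = new Thread(this::loop, name);
        worker.setDaemon(true);
        worker.start();
    }

    // Producers only enqueue; they never run the task themselves.
    boolean submit(Runnable task) {
        return !closed && queue.offer(task);
    }

    private void loop() {
        while (!closed) {
            try {
                queue.take().run(); // blocks until a task is available
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return; // close() interrupts a blocked take()
            } catch (RuntimeException e) {
                // A failing task must not kill the worker thread.
            }
        }
    }

    @Override
    public void close() {
        closed = true;
        worker.interrupt();
    }

    // Pushes n tasks through one worker and returns the order in which they ran.
    static List<Integer> runInOrder(int n) {
        List<Integer> seen = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(n);
        try (QueueWorkerSketch w = new QueueWorkerSketch("sketch-worker")) {
            for (int i = 0; i < n; i++) {
                final int id = i;
                w.submit(() -> { seen.add(id); done.countDown(); });
            }
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }
}
```

Because every task for a given worker runs on the same thread, tasks execute in submission order and need no locking among themselves, which is the trade-off this design makes against a general-purpose thread pool.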

Environment setup

Server

Download the source: github.com/alibaba/nac…
Check out 2.1.2: git checkout 2.1.2
Start from source: compile the gRPC-related classes first, then run

Client

Set up a SpringCloudAlibaba environment; see the sample code (from a Heima course)
Pay attention to version selection; see: SpringCloudAlibaba component versions

<properties>
    <spring-cloud.version>Hoxton.SR12</spring-cloud.version>
    <spring-cloud-alibaba.version>2.2.8.RELEASE</spring-cloud-alibaba.version>
</properties>

<dependencyManagement>
    <dependencies>
        <!-- SpringCloud -->
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-dependencies</artifactId>
            <version>${spring-cloud.version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
        <!-- SpringCloudAlibaba parent module -->
        <dependency>
            <groupId>com.alibaba.cloud</groupId>
            <artifactId>spring-cloud-alibaba-dependencies</artifactId>
            <version>${spring-cloud-alibaba.version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

Service registration: flow and source code

Flowchart

Client source code

Entry class: NacosServiceRegistryAutoConfiguration

@AutoConfigureAfter({ AutoServiceRegistrationConfiguration.class, AutoServiceRegistrationAutoConfiguration.class, NacosDiscoveryAutoConfiguration.class })
public class NacosServiceRegistryAutoConfiguration {

	@Bean
	@ConditionalOnBean(AutoServiceRegistrationProperties.class)
	public NacosAutoServiceRegistration nacosAutoServiceRegistration(
			NacosServiceRegistry registry,
			AutoServiceRegistrationProperties autoServiceRegistrationProperties,
			NacosRegistration registration) {
        // Create the NacosAutoServiceRegistration bean
		return new NacosAutoServiceRegistration(registry,
				autoServiceRegistrationProperties, registration);
	}

}

public class NacosAutoServiceRegistration extends AbstractAutoServiceRegistration<Registration> {
	public NacosAutoServiceRegistration(ServiceRegistry<Registration> serviceRegistry,
			AutoServiceRegistrationProperties autoServiceRegistrationProperties,
			NacosRegistration registration) {
        // Note: the parent class's serviceRegistry is a NacosServiceRegistry object
		super(serviceRegistry, autoServiceRegistrationProperties);
		this.registration = registration;
	}
    
    protected void register() {
		//[] super.register();
        // Calls NacosServiceRegistry#register
        this.serviceRegistry.register(this.registration); // getRegistration() returns this.registration
	}
}

public class NacosServiceRegistry implements ServiceRegistry<Registration> {
	
    @Override
	public void register(Registration registration) {

		NacosNamingService namingService = namingService();
		String serviceId = registration.getServiceId();
		String group = nacosDiscoveryProperties.getGroup();
		Instance instance = getNacosInstanceFromRegistration(registration);
    	//[] namingService.registerInstance(serviceName=serviceId, groupName=group, instance);
        //[] clientProxy.registerService(serviceName, groupName, instance); // clientProxy is of type NamingClientProxyDelegate
        //[] getExecuteClientProxy(instance).registerService(serviceName, groupName, instance);
        // Ephemeral instances use gRPC; persistent instances use HTTP
        NamingClientProxy clientProxy = instance.isEphemeral() ? grpcClientProxy : httpClientProxy;
    	//[] clientProxy.registerService(serviceName, groupName, instance);
        
        //[] redoService.cacheInstanceForRedo(serviceName, groupName, instance);
        // Cache the redo data; RedoScheduledTask (run periodically) uses redoData to re-register, and it also ends up calling NamingGrpcClientProxy#doRegisterService
        InstanceRedoData redoData = InstanceRedoData.build(serviceName, groupName, instance);
        synchronized (registeredInstances) {
            registeredInstances.put(key, redoData);
        }
        //[] NamingGrpcClientProxy#doRegisterService(serviceName, groupName, instance);
        // Build the instance request
        InstanceRequest request = new InstanceRequest(namespaceId, serviceName, groupName, NamingRemoteConstants.REGISTER_INSTANCE, instance);
        //[] NamingGrpcClientProxy#requestToServer(request, Response.class);
        // Send the RPC request ====>>>> what happens next depends on how the server handles it
        Response response = requestTimeout < 0 ? rpcClient.request(request) : rpcClient.request(request, requestTimeout);
        
        //[] redoService.instanceRegistered(serviceName, groupName);	
        // Mark the redoData as registered
        InstanceRedoData redoData = registeredInstances.get(key);
        if (null != redoData) {
            redoData.setRegistered(true);
        }
	}
}
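
The "cache redo data, send the request, then mark it registered" steps in the client code above can be modelled in isolation. The sketch below is a simplified, hypothetical model rather than the Nacos classes: RedoSketch, RedoData, onDisconnect and keysNeedingRedo are names invented here, and resetting the flag on disconnect is an assumption about when a periodic redo task would have work to do.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified model of the redo bookkeeping.
final class RedoSketch {
    static final class RedoData {
        final String instance;
        volatile boolean registered;

        RedoData(String instance) {
            this.instance = instance;
        }
    }

    private final Map<String, RedoData> registeredInstances = new ConcurrentHashMap<>();

    // Step 1: remember what we intend to register before sending anything.
    void cacheForRedo(String key, String instance) {
        registeredInstances.put(key, new RedoData(instance));
    }

    // Step 2: called after the server has acknowledged the request.
    void markRegistered(String key) {
        RedoData data = registeredInstances.get(key);
        if (data != null) {
            data.registered = true;
        }
    }

    // Assumption: a dropped connection invalidates every earlier registration.
    void onDisconnect() {
        registeredInstances.values().forEach(d -> d.registered = false);
    }

    // What a periodic redo task would pick up and send again.
    List<String> keysNeedingRedo() {
        List<String> keys = new ArrayList<>();
        registeredInstances.forEach((k, d) -> {
            if (!d.registered) {
                keys.add(k);
            }
        });
        Collections.sort(keys);
        return keys;
    }
}
```

Caching before sending means a request that fails halfway is still visible to the redo task, so the client converges to the registered state without the caller retrying by hand.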

Server source code

Request-handling entry method: InstanceRequestHandler#handle

@Component
public class InstanceRequestHandler extends RequestHandler<InstanceRequest, InstanceResponse> {
    private final EphemeralClientOperationServiceImpl clientOperationService;

    public InstanceResponse handle(InstanceRequest request, RequestMeta meta) throws NacosException {
        Service service = Service
                .newService(request.getNamespace(), request.getGroupName(), request.getServiceName(), true);
        switch (request.getType()) {
            case NamingRemoteConstants.REGISTER_INSTANCE:
                //[] return registerInstance(service, request, meta);
                
                //[] clientOperationService.registerInstance(service, instance=request.getInstance(), clientId=meta.getConnectionId());
                // The first call returns the passed-in service (the same object); if it already exists, the cached one is returned
                Service singleton = ServiceManager.getInstance().getSingleton(service);
                Client client = clientManager.getClient(clientId);
                // Build the instance info stored on the server and record it on the Client
                InstancePublishInfo instanceInfo = getPublishInfo(instance);
                client.addServiceInstance(singleton, instanceInfo);
                
                // Publish the service-registered event; see "Maintaining service registration and subscription data" below
                NotifyCenter.publishEvent(new ClientOperationEvent.ClientRegisterServiceEvent(singleton, clientId));
                // Publish the metadata event; see "Metadata management source code" below
                NotifyCenter.publishEvent(new MetadataEvent.InstanceMetadataEvent(singleton, instanceInfo.getMetadataId(), false));
                // Publish RegisterInstanceTraceEvent
                NotifyCenter.publishEvent(new RegisterInstanceTraceEvent());
                // Return the registration result
                return new InstanceResponse(NamingRemoteConstants.REGISTER_INSTANCE);
            case NamingRemoteConstants.DE_REGISTER_INSTANCE:
                return deregisterInstance(service, request, meta);
            default:
                throw new NacosException(NacosException.INVALID_PARAM,
                        String.format("Unsupported request type %s", request.getType()));
        }
    }
}
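
The getSingleton behaviour noted in the handler above (the first call returns the object passed in, later calls return the cached one) is the usual putIfAbsent canonicalization pattern. A minimal sketch follows, assuming a key type whose equality is based on namespace, group and name; SingletonRepoSketch and ServiceKey are hypothetical stand-ins, not the Nacos classes.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical canonicalizing repository: equal keys always map to one shared object.
final class SingletonRepoSketch {
    // Stand-in for the server-side Service; equality ignores object identity.
    static final class ServiceKey {
        final String namespace;
        final String group;
        final String name;

        ServiceKey(String namespace, String group, String name) {
            this.namespace = namespace;
            this.group = group;
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ServiceKey)) {
                return false;
            }
            ServiceKey other = (ServiceKey) o;
            return namespace.equals(other.namespace) && group.equals(other.group) && name.equals(other.name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(namespace, group, name);
        }
    }

    private final ConcurrentHashMap<ServiceKey, ServiceKey> repo = new ConcurrentHashMap<>();

    // First caller wins: its object becomes the singleton for all equal keys.
    ServiceKey getSingleton(ServiceKey candidate) {
        ServiceKey existing = repo.putIfAbsent(candidate, candidate);
        return existing != null ? existing : candidate;
    }
}
```

Every request builds a fresh key object, so canonicalizing it first lets the rest of the server hold references to one shared instance per service.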


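Note: only the code relevant to the flow is shown here; read the source for more detail. For what happens after the events are published, see the section "A complete walkthrough of the service registration and discovery source code".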

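Service discovery: flow and source code

Spring Cloud service discovery requires implementing two methods, DiscoveryClient#getInstances(String serviceName) and DiscoveryClient#getServices. Both flows involve the client and the server, so they are described with two separate flowcharts.

Flowchart: fetching the service list

Flowchart: fetching the service instance list

Sequence diagram: fetching the service list

Sequence diagram: fetching the service instance list

Client source code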

Entry class: NacosDiscoveryClientConfiguration
The key code is as follows:

@Configuration(proxyBeanMethods = false)
@ConditionalOnDiscoveryEnabled
@ConditionalOnBlockingDiscoveryEnabled
@ConditionalOnNacosDiscoveryEnabled
@AutoConfigureBefore({ SimpleDiscoveryClientAutoConfiguration.class, CommonsClientAutoConfiguration.class })
@AutoConfigureAfter(NacosDiscoveryAutoConfiguration.class)
public class NacosDiscoveryClientConfiguration {

	@Bean
	public DiscoveryClient nacosDiscoveryClient(
			NacosServiceDiscovery nacosServiceDiscovery) {
		return new NacosDiscoveryClient(nacosServiceDiscovery);
	}

}

// Implements Spring Cloud's DiscoveryClient interface; the key methods are getInstances and getServices, both delegated to NacosServiceDiscovery
public class NacosDiscoveryClient implements DiscoveryClient {

	private NacosServiceDiscovery serviceDiscovery;

	public NacosDiscoveryClient(NacosServiceDiscovery nacosServiceDiscovery) {
		this.serviceDiscovery = nacosServiceDiscovery;
	}

	public List<String> getServices() {
		return serviceDiscovery.getServices();
	}
    
    public List<ServiceInstance> getInstances(String serviceId) {
		return serviceDiscovery.getInstances(serviceId);
	}

}
public class NacosServiceDiscovery {
	private NacosDiscoveryProperties discoveryProperties;
	private NacosServiceManager nacosServiceManager;

	public NacosServiceDiscovery(NacosDiscoveryProperties discoveryProperties,
			NacosServiceManager nacosServiceManager) {
		this.discoveryProperties = discoveryProperties;
		this.nacosServiceManager = nacosServiceManager;
	}

    // Returns all service names in the configured group
	public List<String> getServices() throws NacosException {
		String group = discoveryProperties.getGroup();
        //[] namingService()
        // Ends up calling NacosFactory#createNamingService
        NacosNamingService namingService = nacosServiceManager.getNamingService(discoveryProperties.getNacosProperties());
		//[] ListView<String> services = namingService.getServicesOfServer(1, Integer.MAX_VALUE, group); 
        //[] NacosNamingService#getServicesOfServer
        //[] clientProxy.getServiceList(pageNo, pageSize, groupName, selector)
        //[] NamingClientProxyDelegate#getServiceList
        //[] grpcClientProxy.getServiceList(pageNo, pageSize, groupName, selector);
        ServiceListRequest request = new ServiceListRequest(namespaceId, groupName, pageNo, pageSize);
        //[] ServiceListResponse response = requestToServer(request, ServiceListResponse.class);
        // Send the service list request (ServiceListRequest); the server takes over from here
        Response response = requestTimeout < 0 ? rpcClient.request(request) : rpcClient.request(request, requestTimeout);
        ListView<String> result = new ListView<String>();
        result.setCount(response.getCount());
        result.setData(response.getServiceNames());
		return result.getData(); // result is the ListView that the original call assigns to services
	}
	}

    // Returns the instances of the given service in the configured group
    public List<ServiceInstance> getInstances(String serviceId) throws NacosException {
		String group = discoveryProperties.getGroup();
        // Ends up calling NacosFactory#createNamingService
        NacosNamingService namingService = nacosServiceManager.getNamingService(discoveryProperties.getNacosProperties());
        
		//[] List<Instance> instances = namingService.selectInstances(serviceId, group, true);
        //[] NacosNamingService#selectInstances(serviceName, groupName, List<String> clusters, boolean healthy, boolean subscribe=true)
        ServiceInfo serviceInfo;
        String clusterString = StringUtils.join(clusters, ",");
        if (subscribe) { // subscribing is the default
            // Read the ServiceInfo from the cache
            serviceInfo = serviceInfoHolder.getServiceInfo(serviceName, groupName, clusterString);
            if (null == serviceInfo) {
                //[] serviceInfo = (NamingClientProxyDelegate)clientProxy.subscribe(serviceName, groupName, clusterString);
                String serviceNameWithGroup = NamingUtils.getGroupedName(serviceName, groupName);
                String serviceKey = ServiceInfo.getKey(serviceNameWithGroup, clusters);

                // Periodically sync serviceInfo from the server by scheduling an UpdateTask (see below)
                //[] serviceInfoUpdateService.scheduleUpdateIfAbsent(serviceName, groupName, clusters);
                String serviceKey = ServiceInfo.getKey(NamingUtils.getGroupedName(serviceName, groupName), clusters);
                UpdateTask task = new UpdateTask(serviceName, groupName, clusters);
                executor.schedule(task, DEFAULT_DELAY, TimeUnit.MILLISECONDS);
                
                
                ServiceInfo result = serviceInfoHolder.getServiceInfoMap().get(serviceKey);
                if (null == result || !isSubscribed(serviceName, groupName, clusters)) {
                    //[] result = grpcClientProxy.subscribe(serviceName, groupName, clusters);
                    // Cache the redo data; RedoScheduledTask (scheduled by NamingGrpcRedoService) uses redoData to re-subscribe, and it also ends up calling NamingGrpcClientProxy#doSubscribe
                    redoService.cacheSubscriberForRedo(serviceName, groupName, clusters);
                    //[] result = doSubscribe(serviceName, groupName, clusters);
                    // Send the subscribe request over gRPC
                    SubscribeServiceRequest request = new SubscribeServiceRequest(namespaceId, groupName, serviceName, clusters, true);
                    SubscribeServiceResponse response = requestToServer(request, SubscribeServiceResponse.class);
                    
                    //[] redoService.subscriberRegistered(serviceName, groupName, clusters);
                    SubscriberRedoData redoData = subscribes.get(key);
                    if (null != redoData) {
                        // Mark the subscription data as subscribed
                        redoData.setRegistered(true);
                    }
                    result = response.getServiceInfo();
                }
                //[] serviceInfoHolder.processServiceInfo(result);
                
                // Update the service-count metric for monitoring
                MetricsMonitor.getServiceInfoMapSizeMonitor().set(serviceInfoMap.size());
                if (changed) {
                    // Publish InstancesChangeEvent; it is handled in InstancesChangeNotifier
                    NotifyCenter.publishEvent(new InstancesChangeEvent(serviceInfo.getName(), serviceInfo.getGroupName(),
                            serviceInfo.getClusters(), serviceInfo.getHosts()));
                    // Write the serviceInfo to the local disk cache
                    DiskCache.write(serviceInfo, cacheDir);
                }
                serviceInfo = result;
            }
        } else {
            // Ends up sending a ServiceQueryRequest over gRPC
            //[] serviceInfo = (NamingClientProxyDelegate)clientProxy.queryInstancesOfService(serviceName, groupName, clusterString, 0, false);
            //[] grpcClientProxy.queryInstancesOfService(serviceName, groupName, clusters, udpPort, healthyOnly);
            ServiceQueryRequest request = new ServiceQueryRequest(namespaceId, serviceName, groupName);
            request.setCluster(clusters);
            request.setHealthyOnly(healthyOnly);
            request.setUdpPort(udpPort);
            QueryServiceResponse response = requestToServer(request, QueryServiceResponse.class);
            serviceInfo = response.getServiceInfo();
        }
        //[] List<Instance> instances = selectInstances(serviceInfo, healthy);
        
        List<Instance> instances = serviceInfo.getHosts();
        Iterator<Instance> iterator = instances.iterator();
        while (iterator.hasNext()) {
            Instance instance = iterator.next();
            // Keep only instances that match the requested health state, are enabled, and have weight > 0
            if (healthy != instance.isHealthy() || !instance.isEnabled() || instance.getWeight() <= 0) {
                iterator.remove();
            }
        }

        //[] return hostToServiceInstanceList(instances, serviceId);
        List<ServiceInstance> result = new ArrayList<>(instances.size());
		for (Instance instance : instances) {
            // Build a NacosServiceInstance
			ServiceInstance serviceInstance = hostToServiceInstance(instance, serviceId);
			if (serviceInstance != null) {
				result.add(serviceInstance);
			}
		}
        return result;
	}
    
}
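
The filtering step near the end of getInstances above can be tried on its own. The sketch below mirrors the same condition (requested health state, enabled, weight greater than 0) on a hypothetical Inst type; unlike the walkthrough code it builds a new list instead of removing from the cached one in place, which is a choice made for this illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone version of the instance filter.
final class InstanceFilterSketch {
    static final class Inst {
        final String ip;
        final boolean healthy;
        final boolean enabled;
        final double weight;

        Inst(String ip, boolean healthy, boolean enabled, double weight) {
            this.ip = ip;
            this.healthy = healthy;
            this.enabled = enabled;
            this.weight = weight;
        }
    }

    // Keeps instances whose health flag equals the requested one,
    // that are enabled, and whose weight is strictly positive.
    static List<Inst> select(List<Inst> hosts, boolean healthy) {
        List<Inst> kept = new ArrayList<>();
        for (Inst each : hosts) {
            if (healthy == each.healthy && each.enabled && each.weight > 0) {
                kept.add(each);
            }
        }
        return kept;
    }
}
```

Note that passing healthy = false selects the unhealthy instances rather than all of them, because the condition compares the flag for equality.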

public class UpdateTask implements Runnable {
    public void run() {
        long delayTime = 1000L;
        
        try {
            ServiceInfo serviceObj = serviceInfoHolder.getServiceInfoMap().get(serviceKey);
            if (serviceObj == null) {
                // Send a ServiceQueryRequest to the server over gRPC
                serviceObj = namingClientProxy.queryInstancesOfService(serviceName, groupName, clusters, 0, false);
                //[] serviceInfoHolder.processServiceInfo(serviceObj);
                
                // Update the service-count metric for monitoring
                MetricsMonitor.getServiceInfoMapSizeMonitor().set(serviceInfoMap.size());
                if (changed) {
                    // Publish InstancesChangeEvent
                    NotifyCenter.publishEvent(new InstancesChangeEvent(serviceInfo.getName(), serviceInfo.getGroupName(),
                            serviceInfo.getClusters(), serviceInfo.getHosts()));
                    // Write the serviceInfo to the local disk cache
                    DiskCache.write(serviceInfo, cacheDir);
                }
                lastRefTime = serviceObj.getLastRefTime();
                return;
            }
            
            if (serviceObj.getLastRefTime() <= lastRefTime) {
                // Send a ServiceQueryRequest to the server over gRPC
                serviceObj = namingClientProxy.queryInstancesOfService(serviceName, groupName, clusters, 0, false);
                //[] serviceInfoHolder.processServiceInfo(serviceObj);
                
                // Update the service-count metric for monitoring
                MetricsMonitor.getServiceInfoMapSizeMonitor().set(serviceInfoMap.size());
                if (changed) {
                    // Publish InstancesChangeEvent; it is handled in InstancesChangeNotifier
                    NotifyCenter.publishEvent(new InstancesChangeEvent(serviceInfo.getName(), serviceInfo.getGroupName(),
                            serviceInfo.getClusters(), serviceInfo.getHosts()));
                    // Write the serviceInfo to the local disk cache
                    DiskCache.write(serviceInfo, cacheDir);
                }
            }
            lastRefTime = serviceObj.getLastRefTime();
            if (CollectionUtils.isEmpty(serviceObj.getHosts())) {
                incFailCount(); // count the failure (capped at 6)
                return;
            }
            delayTime = serviceObj.getCacheMillis() * DEFAULT_UPDATE_CACHE_TIME_MULTIPLE;
            
            resetFailCount(); // reset the failure count to 0
        } catch (Throwable e) {
            incFailCount(); // count the failure (capped at 6)
        } finally {
            if (!isCancel) {
                // Note: the delay is capped at 60s and grows with the failure count (more failures, longer delay)
                executor.schedule(this, Math.min(delayTime << failCount, DEFAULT_DELAY * 60),
                        TimeUnit.MILLISECONDS);
            }
        }
    }
    
}
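
The rescheduling expression at the end of UpdateTask#run is a capped exponential backoff. The sketch below isolates that one expression so the numbers can be checked; UpdateBackoffSketch and nextDelayMillis are hypothetical names, and DEFAULT_DELAY = 1000 ms is an assumption inferred from the "capped at 60s" comment next to DEFAULT_DELAY * 60.

```java
// Hypothetical helper isolating the UpdateTask rescheduling arithmetic.
final class UpdateBackoffSketch {
    // Assumption: 1000 ms, so that DEFAULT_DELAY * 60 is the 60 s cap.
    static final long DEFAULT_DELAY = 1000L;

    // Doubles the base delay once per recorded failure, never exceeding 60 s.
    static long nextDelayMillis(long delayTime, int failCount) {
        return Math.min(delayTime << failCount, DEFAULT_DELAY * 60);
    }

    public static void main(String[] args) {
        // Prints the delay for the initial 1000 ms base and 0..6 failures.
        for (int failCount = 0; failCount <= 6; failCount++) {
            System.out.println(failCount + " failures -> " + nextDelayMillis(1000L, failCount) + " ms");
        }
    }
}
```

With a 1000 ms base the delays are 1000, 2000, 4000, 8000, 16000 and 32000 ms for 0 to 5 failures, and the sixth failure would give 64000 ms, which the cap reduces to 60000 ms.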

Client source code: handling InstancesChangeEvent

Handler class: InstancesChangeNotifier

public class InstancesChangeNotifier extends Subscriber<InstancesChangeEvent> {
    private final Map<String, ConcurrentHashSet<EventListener>> listenerMap = new ConcurrentHashMap<String, ConcurrentHashSet<EventListener>>();
    @Override
    public void onEvent(InstancesChangeEvent event) {
        String key = ServiceInfo.getKey(NamingUtils.getGroupedName(event.getServiceName(), event.getGroupName()), event.getClusters());
        ConcurrentHashSet<EventListener> eventListeners = listenerMap.get(key);
        if (CollectionUtils.isEmpty(eventListeners)) {
            return;
        }
        for (final EventListener listener : eventListeners) {
            //[] final com.alibaba.nacos.api.naming.listener.Event namingEvent = transferToNamingEvent(event);
            final com.alibaba.nacos.api.naming.listener.Event namingEvent = new NamingEvent(event.getServiceName(), event.getGroupName(),
                event.getClusters(), event.getHosts());
            // Ends up invoking listener.onEvent(namingEvent); the only effective EventListener I found is the one created in NacosWatch#start (see below)
            if (listener instanceof AbstractEventListener && ((AbstractEventListener) listener).getExecutor() != null) {
                ((AbstractEventListener) listener).getExecutor().execute(() -> listener.onEvent(namingEvent));
            } else {
                listener.onEvent(namingEvent);
            }
        }
    }
}

public class NacosWatch implements ApplicationEventPublisherAware, SmartLifecycle, DisposableBean {
    private Map<String, EventListener> listenerMap = new ConcurrentHashMap(16);
    private final AtomicBoolean running = new AtomicBoolean(false);

    public void start() {
        if (this.running.compareAndSet(false, true)) {
            EventListener eventListener = (EventListener)this.listenerMap.computeIfAbsent(this.buildKey(), (event) -> {
                return new EventListener() {
                    public void onEvent(Event event) {
                        if (event instanceof NamingEvent) {
                            List instances = ((NamingEvent)event).getInstances();
                            
                            //[] Optional instanceOptional = NacosWatch.this.selectCurrentInstance(instances);
                            
                            // Pick the first instance matching the IP and port as the current instance
                            Optional instanceOptional = instances.stream().filter((instance) -> {
                                return NacosWatch.this.properties.getIp().equals(instance.getIp()) && NacosWatch.this.properties.getPort() == instance.getPort();
                            }).findFirst();
                                
                            instanceOptional.ifPresent((currentInstance) -> {
                                //[] NacosWatch.this.resetIfNeeded(currentInstance);
                                // Reset the metadata in properties if it differs
                                if (!NacosWatch.this.properties.getMetadata().equals(currentInstance.getMetadata())) {
                                    NacosWatch.this.properties.setMetadata(currentInstance.getMetadata());
                                }
                            });
                        }
                    }
                };
            });
        }
    }
}
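
The dispatch rule in InstancesChangeNotifier#onEvent above (run the callback on the listener's own executor when it has one, otherwise run it inline on the notifying thread) can be sketched independently of Nacos. ListenerDispatchSketch, Listener and publish are hypothetical names, and events are plain strings here to keep the example short.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.function.Consumer;

// Hypothetical per-key listener registry with "executor or inline" dispatch.
final class ListenerDispatchSketch {
    static final class Listener {
        final Executor executor; // may be null: run on the publishing thread
        final Consumer<String> callback;

        Listener(Executor executor, Consumer<String> callback) {
            this.executor = executor;
            this.callback = callback;
        }
    }

    private final Map<String, Set<Listener>> listenerMap = new ConcurrentHashMap<>();

    void register(String key, Listener listener) {
        listenerMap.computeIfAbsent(key, k -> ConcurrentHashMap.newKeySet()).add(listener);
    }

    // Returns how many listeners were notified for the key.
    int publish(String key, String event) {
        Set<Listener> listeners = listenerMap.get(key);
        if (listeners == null || listeners.isEmpty()) {
            return 0;
        }
        for (Listener each : listeners) {
            if (each.executor != null) {
                each.executor.execute(() -> each.callback.accept(event));
            } else {
                each.callback.accept(event);
            }
        }
        return listeners.size();
    }
}
```

Letting a listener bring its own executor keeps a slow callback from blocking the thread that delivers events to every other listener.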

Server source code

The client code above sends three requests: ServiceListRequest, SubscribeServiceRequest, and ServiceQueryRequest. On the server they are handled by ServiceListRequestHandler, SubscribeServiceRequestHandler, and ServiceQueryRequestHandler respectively.

Server source code: ServiceListRequestHandler

@Component
public class ServiceListRequestHandler extends RequestHandler<ServiceListRequest, ServiceListResponse> {
    private final ConcurrentHashMap<Service, Service> singletonRepository; // located in ServiceManager
    private final ConcurrentHashMap<String, Set<Service>> namespaceSingletonMaps; // located in ServiceManager
    
	@Override
    @Secured(action = ActionTypes.READ)
    public ServiceListResponse handle(ServiceListRequest request, RequestMeta meta) throws NacosException {
        // Get the Services in the namespace
        //[] Collection<Service> serviceSet = ServiceManager.getInstance().getSingletons(request.getNamespace());
        Collection<Service> serviceSet = namespaceSingletonMaps.getOrDefault(namespace, new HashSet<>(1));

        // Build the response
        ServiceListResponse result = ServiceListResponse.buildSuccessResponse(0, new LinkedList<>());
        if (!serviceSet.isEmpty()) {
            //[] Collection<String> serviceNameSet = selectServiceWithGroupName(serviceSet, groupName = request.getGroupName());
            Collection<String> serviceNameSet = new HashSet<>(serviceSet.size());
            for (Service each : serviceSet) {
                if (Objects.equals(groupName, each.getGroup())) {
                    // Add the grouped service name, in the form groupA@@serviceA
                    serviceNameSet.add(each.getGroupedServiceName());
                }
            }
            // TODO select service by selector
            // Trim serviceNameSet to the requested page
            List<String> serviceNameList = ServiceUtil.pageServiceName(request.getPageNo(), request.getPageSize(), serviceNameSet);
            result.setCount(serviceNameSet.size());
            result.setServiceNames(serviceNameList);
        }
        return result;
    }
}
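
The two steps in the handler above, filtering by group and then cutting out one page, can be sketched as follows. This is not ServiceUtil.pageServiceName itself: ServicePageSketch, namesInGroup and page are hypothetical helpers, and the 1-based page number is an assumption based on the client calling getServicesOfServer(1, Integer.MAX_VALUE, group).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical helpers for "filter by group, then page".
final class ServicePageSketch {
    // Each entry is {group, serviceName}; output uses the group@@service form.
    static List<String> namesInGroup(String group, List<String[]> services) {
        List<String> names = new ArrayList<>();
        for (String[] each : services) {
            if (Objects.equals(group, each[0])) {
                names.add(each[0] + "@@" + each[1]);
            }
        }
        return names;
    }

    // Assumption: pageNo is 1-based. Out-of-range pages yield an empty list.
    static List<String> page(int pageNo, int pageSize, List<String> names) {
        if (pageNo < 1 || pageSize < 1) {
            return new ArrayList<>();
        }
        // long arithmetic: pageSize may be Integer.MAX_VALUE.
        long start = (long) (pageNo - 1) * pageSize;
        if (start >= names.size()) {
            return new ArrayList<>();
        }
        long end = Math.min(start + pageSize, names.size());
        return new ArrayList<>(names.subList((int) start, (int) end));
    }
}
```

The total count reported to the client is the size of the filtered list before paging, which matches the handler setting the count from serviceNameSet and the names from the paged list.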

Server source code: SubscribeServiceRequestHandler

@Component
public class SubscribeServiceRequestHandler extends RequestHandler<SubscribeServiceRequest, SubscribeServiceResponse> {
    
    private final ServiceStorage serviceStorage;
    private final NamingMetadataManager metadataManager;
    private final EphemeralClientOperationServiceImpl clientOperationService;
    
    @Override
    @Secured(action = ActionTypes.READ)
    public SubscribeServiceResponse handle(SubscribeServiceRequest request, RequestMeta meta) throws NacosException {
        String namespaceId = request.getNamespace();
        String serviceName = request.getServiceName();
        String groupName = request.getGroupName();
        String app = request.getHeader("app", "unknown");
        String groupedServiceName = NamingUtils.getGroupedName(serviceName, groupName);
        Service service = Service.newService(namespaceId, groupName, serviceName, true);
        Subscriber subscriber = new Subscriber(meta.getClientIp(), meta.getClientVersion(), app, meta.getClientIp(),
                namespaceId, groupedServiceName, 0, request.getClusters());

        //[] ServiceInfo serviceInfo = ServiceUtil.selectInstancesWithHealthyProtection(serviceStorage.getData(service), metadataManager.getServiceMetadata(service).orElse(null), subscriber.getCluster(), false, true, subscriber.getIp());
        
        // [Core, explained separately below] Get the serviceInfo from the cache (or build it on first access); cluster info is not included
        ServiceInfo result = serviceStorage.getData(service);
        //[] ServiceMetadata serviceMetadata = metadataManager.getServiceMetadata(service).orElse(null);
        
        // Get the ServiceMetadata from memory (a map)
        ServiceMetadata serviceMetadata = Optional.ofNullable(serviceMetadataMap.get(service)).orElse(null);
        // Keep only the healthy instances, subject to the protection mechanism (a lot of code but simple logic; not covered here)
        ServiceInfo serviceInfo = ServiceUtil.selectInstancesWithHealthyProtection(result, serviceMetadata, subscriber.getCluster(), false, true, subscriber.getIp());
        
        
        if (request.isSubscribe()) {
            //[] clientOperationService.subscribeService(service, subscriber, meta.getConnectionId());
            //[] Service singleton = ServiceManager.getInstance().getSingletonIfExist(service).orElse(service);
            Service singleton = Optional.ofNullable(singletonRepository.get(service)).orElse(service);
            
            ConnectionBasedClient client = (ConnectionBasedClient) clientManager.getClient(clientId); // looked up from the map in ConnectionBasedClientManager
            //[] client.addServiceSubscriber(singleton, subscriber);
            if (null == subscribers.put(service, subscriber)) {
                MetricsMonitor.incrementSubscribeCount(); // increment the subscription count
            }
            client.setLastUpdatedTime();
            // Publish the client-subscribed-service event
            NotifyCenter.publishEvent(new ClientOperationEvent.ClientSubscribeServiceEvent(singleton, clientId));

            // Publish SubscribeServiceTraceEvent
            NotifyCenter.publishEvent(new SubscribeServiceTraceEvent(System.currentTimeMillis(),
                    meta.getClientIp(), service.getNamespace(), service.getGroup(), service.getName()));
        } else {
            //[] clientOperationService.unsubscribeService(service, subscriber, meta.getConnectionId());

            //[] Service singleton = ServiceManager.getInstance().getSingletonIfExist(service).orElse(service);
            Service singleton = Optional.ofNullable(singletonRepository.get(service)).orElse(service);
            
            ConnectionBasedClient client = (ConnectionBasedClient) clientManager.getClient(clientId); // looked up from the map in ConnectionBasedClientManager
            //[] client.removeServiceSubscriber(service=singleton);
            if (null != subscribers.remove(service)) {
                MetricsMonitor.decrementSubscribeCount(); // decrement the subscription count
            }
            client.setLastUpdatedTime();
            // Publish the client-unsubscribed-service event
            NotifyCenter.publishEvent(new ClientOperationEvent.ClientUnsubscribeServiceEvent(singleton, clientId));
            
            // Publish UnsubscribeServiceTraceEvent
            NotifyCenter.publishEvent(new UnsubscribeServiceTraceEvent(System.currentTimeMillis(),
                    meta.getClientIp(), service.getNamespace(), service.getGroup(), service.getName()));
        }
        return new SubscribeServiceResponse(ResponseCode.SUCCESS.getCode(), "success", serviceInfo);
    }
}

Server source code: ServiceQueryRequestHandler

@Component
public class ServiceQueryRequestHandler extends RequestHandler<ServiceQueryRequest, QueryServiceResponse> {
    private final ServiceStorage serviceStorage;
    private final NamingMetadataManager metadataManager;
    
    public ServiceQueryRequestHandler(ServiceStorage serviceStorage, NamingMetadataManager metadataManager) {
        this.serviceStorage = serviceStorage;
        this.metadataManager = metadataManager;
    }
    
    @Override
    @Secured(action = ActionTypes.READ)
    public QueryServiceResponse handle(ServiceQueryRequest request, RequestMeta meta) throws NacosException {
        String namespaceId = request.getNamespace();
        String groupName = request.getGroupName();
        String serviceName = request.getServiceName();
        Service service = Service.newService(namespaceId, groupName, serviceName);
        String cluster = null == request.getCluster() ? "" : request.getCluster();
        boolean healthyOnly = request.isHealthyOnly();

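        // [Core, explained separately below] Get the serviceInfo from the cache (or build it on first access); cluster info is not included
        ServiceInfo result = serviceStorage.getData(service);
        //[] ServiceMetadata serviceMetadata = metadataManager.getServiceMetadata(service).orElse(null);
        
        // Get the ServiceMetadata from memory (a map)
        ServiceMetadata serviceMetadata = Optional.ofNullable(serviceMetadataMap.get(service)).orElse(null);
        // Select the healthy instances, subject to the protection mechanism (a lot of code but simple logic; not covered here)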
        result = ServiceUtil.selectInstancesWithHealthyProtection(result, serviceMetadata, cluster, healthyOnly, true,
                meta.getClientIp());
        
        return QueryServiceResponse.buildSuccessResponse(result);
    }

}

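For event handling, see "A complete walkthrough of service registration and discovery event handling". Next we look at the implementation of serviceStorage.getData(service).

Server source code: ServiceStorage#getData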

@Component
public class ServiceStorage {
    private final ConcurrentMap<Service, ServiceInfo> serviceDataIndexes;
    private final ConcurrentMap<Service, Set<String>> serviceClusterIndex;

    // Located in ServiceManager
    private final ConcurrentHashMap<Service, Service> singletonRepository;
    private final ConcurrentHashMap<String, Set<Service>> namespaceSingletonMaps;

    private final ClientServiceIndexesManager serviceIndexesManager;
    private final ClientManager clientManager;
    private final SwitchDomain switchDomain;
    private final NamingMetadataManager metadataManager;
    
    // Read from the cache, or walk the data to build the ServiceInfo
	public ServiceInfo getData(Service service) {
        if (serviceDataIndexes.containsKey(service)) {
            return serviceDataIndexes.get(service);
        }
        //[] getPushData(service)
        // Create an empty object with some default values
        ServiceInfo result = emptyServiceInfo(service);
        //[] Service singleton = ServiceManager.getInstance().getSingleton(service);
        singletonRepository.putIfAbsent(service, service);
        Service singleton = singletonRepository.get(service);
        namespaceSingletonMaps.computeIfAbsent(singleton.getNamespace(), (namespace) -> new ConcurrentHashSet<>());
        namespaceSingletonMaps.get(singleton.getNamespace()).add(singleton);
        // Get all instances of the service; getAllInstancesFromIndex is covered below (it is long)
        List<Instance> instances = getAllInstancesFromIndex(singleton);
        result.setHosts(instances);
        serviceDataIndexes.put(singleton, result);

        return result;
    }

    private List<Instance> getAllInstancesFromIndex(Service service) {
        Set<Instance> result = new HashSet<>();
        Set<String> clusters = new HashSet<>();
        // Get the client IDs of all registrants
        //[] Collection<String> clientIds = serviceIndexesManager.getAllClientsRegisteredService(service);
        Collection<String> clientIds = publisherIndexes.containsKey(service) ? publisherIndexes.get(service) : new ConcurrentHashSet<>();
        
        for (String each : clientIds) {
            // Get the instance registration info from the registry table
            Optional<InstancePublishInfo> instancePublishInfo = getInstanceInfo(each, service);
            if (instancePublishInfo.isPresent()) { // present
                InstancePublishInfo publishInfo = instancePublishInfo.get();
                //If it is a BatchInstancePublishInfo type, it will be processed manually and added to the instance list
                if (publishInfo instanceof BatchInstancePublishInfo) {
                    BatchInstancePublishInfo batchInstancePublishInfo = (BatchInstancePublishInfo) publishInfo;
                    //[] List<Instance> batchInstance = parseBatchInstance(service, batchInstancePublishInfo, clusters);
                    
                    List<Instance> resultInstanceList = new ArrayList<>();
                    // Get the InstancePublishInfo entries contained in the BatchInstancePublishInfo
                    List<InstancePublishInfo> instancePublishInfos = batchInstancePublishInfo.getInstancePublishInfos();
                    for (InstancePublishInfo batchItem : instancePublishInfos) {
                        // Convert InstancePublishInfo to Instance
                        Instance instance = parseInstance(service, batchItem);
                        
                        resultInstanceList.add(instance); // add the instance
                        clusters.add(instance.getClusterName()); // add the cluster
                    }
                    List<Instance> batchInstance = resultInstanceList;
                    
                    result.addAll(batchInstance);
                } else {
                    // Convert InstancePublishInfo to Instance
                    Instance instance = parseInstance(service, instancePublishInfo.get());
                    result.add(instance); // add the instance
                    clusters.add(instance.getClusterName());  // add the cluster
                }
            }
        }
        // Cache the clusters of the service
        serviceClusterIndex.put(service, clusters);
        return new LinkedList<>(result);
    }

    // In passing: the code that returns the clusters
    public Set<String> getClusters(Service service) {
        return serviceClusterIndex.getOrDefault(service, new HashSet<>());
    }
}

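Source code for service registration and discovery event handling

These core features are implemented with an event-driven architecture. Service registration publishes ServiceEvent.ServiceChangedEvent, ClientEvent.ClientChangedEvent (published by the ClientOperationEvent.ClientRegisterServiceEvent event), MetadataEvent.InstanceMetadataEvent, and RegisterInstanceTraceEvent. Service discovery publishes ClientOperationEvent.ClientSubscribeServiceEvent and SubscribeServiceTraceEvent.

From here on, events are referred to by their nested class name only, e.g. ServiceEvent.ServiceChangedEvent is written as ServiceChangedEvent.

Maintaining service registration and subscription data

Core function: maintain (add/remove) the in-memory service registration and subscription data, and pass on the corresponding client events
Source location: ClientServiceIndexesManager#onEvent
Handles two kinds of events: ClientEvent.ClientDisconnectEvent and ClientOperationEvent. ClientOperationEvent covers four events: ClientRegisterServiceEvent, ClientDeregisterServiceEvent, ClientSubscribeServiceEvent, and ClientUnsubscribeServiceEvent
Key code: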

public class ClientServiceIndexesManager extends SmartSubscriber {
    private final ConcurrentMap<Service, Set<String>> publisherIndexes = new ConcurrentHashMap<>();
    
    private final ConcurrentMap<Service, Set<String>> subscriberIndexes = new ConcurrentHashMap<>();
    
    @Override
    public void onEvent(Event event) {
        if (event instanceof ClientEvent.ClientDisconnectEvent) {
            //[] handleClientDisconnect((ClientEvent.ClientDisconnectEvent) event);
            ClientEvent.ClientDisconnectEvent disconnectEvent = (ClientEvent.ClientDisconnectEvent) event;
            Client client = disconnectEvent.getClient();
            String clientId = client.getClientId();
            for (Service each : client.getAllSubscribeService()) { // every service the client subscribes to
                //[] removeSubscriberIndexes(each, client.getClientId());
                // Remove the client from the subscription table
                subscriberIndexes.get(each).remove(clientId);
                if (subscriberIndexes.get(each).isEmpty()) {
                    // No subscribers left: drop the service from the subscription table
                    subscriberIndexes.remove(each);
                }
            }
            // Reason for deregistering the instances
            DeregisterInstanceReason reason = disconnectEvent.isNative()
                    ? DeregisterInstanceReason.NATIVE_DISCONNECTED : DeregisterInstanceReason.SYNCED_DISCONNECTED;
            long currentTimeMillis = System.currentTimeMillis();
            for (Service each : client.getAllPublishedService()) { // every service the client publishes
                //[] removePublisherIndexes(each, client.getClientId());
                publisherIndexes.computeIfPresent(each, (s, ids) -> {
                    ids.remove(clientId); // remove the client from the registry
                    // Publish ServiceChangedEvent, see the service push section
                    NotifyCenter.publishEvent(new ServiceEvent.ServiceChangedEvent(s, true));
                    // When no client publishes the service any more, the service is removed from the registry.
                    // Note: a null return value from computeIfPresent removes the key.
                    return ids.isEmpty() ? null : ids;
                });
                InstancePublishInfo instance = client.getInstancePublishInfo(each);
                NotifyCenter.publishEvent(new DeregisterInstanceTraceEvent(currentTimeMillis,
                        "", false, reason, each.getNamespace(), each.getGroup(), each.getName(),
                        instance.getIp(), instance.getPort()));
            }
        } else if (event instanceof ClientOperationEvent) {
            //[] handleClientOperation((ClientOperationEvent) event);
            ClientOperationEvent operationEvent = (ClientOperationEvent) event;
            Service service = operationEvent.getService();
            String clientId = operationEvent.getClientId();
            if (event instanceof ClientOperationEvent.ClientRegisterServiceEvent) {
                //[] addPublisherIndexes(service, clientId);
                // Add the client that registered the service, and publish a service changed event
                publisherIndexes.computeIfAbsent(service, key -> new ConcurrentHashSet<>());
                publisherIndexes.get(service).add(clientId);
                NotifyCenter.publishEvent(new ServiceEvent.ServiceChangedEvent(service, true));
            } else if (event instanceof ClientOperationEvent.ClientDeregisterServiceEvent) {
                //[] removePublisherIndexes(service, clientId);
                // Remove the client that registered the service, and publish a service changed event
                publisherIndexes.computeIfPresent(service, (s, ids) -> {
                    ids.remove(clientId);
                    NotifyCenter.publishEvent(new ServiceEvent.ServiceChangedEvent(service, true));
                    return ids.isEmpty() ? null : ids;
                });
            } else if (event instanceof ClientOperationEvent.ClientSubscribeServiceEvent) {
                //[] addSubscriberIndexes(service, clientId);
                // Add the subscribing client; publish a service subscribed event only if it is a new subscriber
                subscriberIndexes.computeIfAbsent(service, key -> new ConcurrentHashSet<>());
                if (subscriberIndexes.get(service).add(clientId)) {
                    NotifyCenter.publishEvent(new ServiceEvent.ServiceSubscribedEvent(service, clientId));
                }
            } else if (event instanceof ClientOperationEvent.ClientUnsubscribeServiceEvent) {
                //[] removeSubscriberIndexes(service, clientId);
                // Remove the subscribing client; if no client subscribes to the service any more, remove the service
                subscriberIndexes.get(service).remove(clientId);
                if (subscriberIndexes.get(service).isEmpty()) {
                    subscriberIndexes.remove(service);
                }
            }
        }
    }
}
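
The add and remove paths above rely on two ConcurrentMap idioms: computeIfAbsent creates the client set lazily, and computeIfPresent returning null drops a service once its last client is gone. A minimal sketch of that pattern follows. The class name PublisherIndexSketch and the use of plain strings for services are assumptions made for illustration, and ConcurrentHashMap.newKeySet() stands in for Nacos's ConcurrentHashSet.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical, simplified stand-in for the publisher index; services are plain strings here.
class PublisherIndexSketch {
    private final ConcurrentMap<String, Set<String>> publisherIndexes = new ConcurrentHashMap<>();

    // Register a client for a service; returns true if the client was not registered before.
    boolean add(String service, String clientId) {
        return publisherIndexes
                .computeIfAbsent(service, key -> ConcurrentHashMap.newKeySet())
                .add(clientId);
    }

    // Remove a client; the service key disappears once its last client is gone.
    void remove(String service, String clientId) {
        publisherIndexes.computeIfPresent(service, (key, ids) -> {
            ids.remove(clientId);
            return ids.isEmpty() ? null : ids; // returning null removes the mapping
        });
    }

    boolean containsService(String service) {
        return publisherIndexes.containsKey(service);
    }

    Set<String> clientsOf(String service) {
        return publisherIndexes.getOrDefault(service, Collections.emptySet());
    }

    public static void main(String[] args) {
        PublisherIndexSketch index = new PublisherIndexSketch();
        index.add("order-service", "client-1");
        index.add("order-service", "client-2");
        index.remove("order-service", "client-1");
        System.out.println(index.clientsOf("order-service"));
        index.remove("order-service", "client-2");
        System.out.println(index.containsService("order-service"));
    }
}
```

Doing the removal and the emptiness check inside one computeIfPresent call keeps them atomic per key, which a separate get, remove, isEmpty sequence does not guarantee.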

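Service push

Core function: push service change data to the subscribed clients
Source location: NamingSubscriberServiceV2Impl#onEvent
Handles two events: ServiceChangedEvent and ServiceSubscribedEvent

Sequence diagram

To keep the call chain short, the return flow is omitted and the call chains of the two events are merged

Server-side source code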

public class NamingSubscriberServiceV2Impl extends SmartSubscriber implements NamingSubscriberService {
	
    public NamingSubscriberServiceV2Impl(ClientManagerDelegate clientManager,
            ClientServiceIndexesManager indexesManager, ServiceStorage serviceStorage,
            NamingMetadataManager metadataManager, PushExecutorDelegate pushExecutor, UpgradeJudgement upgradeJudgement,
            SwitchDomain switchDomain) {
        this.clientManager = clientManager;
        this.indexesManager = indexesManager;
        this.upgradeJudgement = upgradeJudgement;
        this.delayTaskEngine = new PushDelayTaskExecuteEngine(clientManager, indexesManager, serviceStorage,
                metadataManager, pushExecutor, switchDomain);
        NotifyCenter.registerSubscriber(this, NamingEventPublisherFactory.getInstance());
    }

    public void onEvent(Event event) {
        // gRPC must be in use
        if (!upgradeJudgement.isUseGrpcFeatures()) {
            return;
        }
        if (event instanceof ServiceEvent.ServiceChangedEvent) {
            // On a service change, push to all subscribed clients
            ServiceEvent.ServiceChangedEvent serviceChangedEvent = (ServiceEvent.ServiceChangedEvent) event;
            Service service = serviceChangedEvent.getService();
            // Note: this PushDelayTask names no target client, so it is pushed to every subscriber
            PushDelayTask task = new PushDelayTask(service, PushConfig.getInstance().getPushTaskDelay());
            // Add the delay task to the task engine; it ends up in PushDelayTaskProcessor.process (engine source: see the delay task scheduling engine section below)
            delayTaskEngine.addTask(service, task);
        } else if (event instanceof ServiceEvent.ServiceSubscribedEvent) {
            // When a client's subscription changes, push only to that client
            ServiceEvent.ServiceSubscribedEvent subscribedEvent = (ServiceEvent.ServiceSubscribedEvent) event;
            Service service = subscribedEvent.getService();
            // Note: this PushDelayTask names the client that triggered the event
            PushDelayTask task = new PushDelayTask(service, PushConfig.getInstance().getPushTaskDelay(), subscribedEvent.getClientId());
            // Add the delay task to the task engine; it ends up in PushDelayTaskProcessor.process (engine source: see the delay task scheduling engine section below)
            delayTaskEngine.addTask(service, task);
        }
    }
}
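
Both branches enqueue their task under the same key, the service. If the engine merges tasks that share a key while they wait out the delay (an assumption here, since the engine source is not shown in this section), a "push to all" task and a targeted task for the same service have to be combined somehow. The sketch below shows one plausible merge rule. PushTaskSketch and its methods are hypothetical names, not the Nacos PushDelayTask API.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical simplification: a task either pushes to all subscribers or to a set of client IDs.
final class PushTaskSketch {
    private final boolean pushToAll;
    private final Set<String> targetClients;

    private PushTaskSketch(boolean pushToAll, Set<String> targetClients) {
        this.pushToAll = pushToAll;
        this.targetClients = targetClients;
    }

    static PushTaskSketch toAll() {
        return new PushTaskSketch(true, new HashSet<>());
    }

    static PushTaskSketch toClient(String clientId) {
        Set<String> targets = new HashSet<>();
        targets.add(clientId);
        return new PushTaskSketch(false, targets);
    }

    // Assumed merge rule: "all" absorbs everything, otherwise the target sets are unioned.
    PushTaskSketch merge(PushTaskSketch other) {
        if (pushToAll || other.pushToAll) {
            return toAll();
        }
        Set<String> merged = new HashSet<>(targetClients);
        merged.addAll(other.targetClients);
        return new PushTaskSketch(false, merged);
    }

    boolean isPushToAll() {
        return pushToAll;
    }

    Set<String> getTargetClients() {
        return targetClients;
    }

    public static void main(String[] args) {
        PushTaskSketch merged = toClient("client-1").merge(toClient("client-2"));
        System.out.println(merged.isPushToAll() + " " + merged.getTargetClients().size());
        System.out.println(merged.merge(toAll()).isPushToAll());
    }
}
```

With a rule like this, several changes to one service inside the delay window cost a single push per subscriber.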

private static class PushDelayTaskProcessor implements NacosTaskProcessor {
    
    @Override
    public boolean process(NacosTask task) {
        PushDelayTask pushDelayTask = (PushDelayTask) task;
        Service service = pushDelayTask.getService();
        PushExecuteTask executeTask = new PushExecuteTask(service, executeEngine, pushDelayTask);
        // Dispatch the execute task; underneath this is a NacosExecuteTaskExecuteEngine (source: see the task execute engine section below)
        NamingExecuteTaskDispatcher.getInstance().dispatchAndExecuteTask(service, executeTask);
        return true;
    }
}

public class PushExecuteTask extends AbstractExecuteTask implements NacosTask, Runnable {
	public void run() {
        try {
            //[] PushDataWrapper wrapper = generatePushData();
            ServiceInfo serviceInfo = delayTaskEngine.getServiceStorage().getPushData(service);
            ServiceMetadata serviceMetadata = delayTaskEngine.getMetadataManager().getServiceMetadata(service).orElse(null);
            PushDataWrapper wrapper = new PushDataWrapper(serviceMetadata, serviceInfo);
            
            ClientManager clientManager = delayTaskEngine.getClientManager();
            //[] Collection<String> clientIds = getTargetClientIds();
            // All subscribed clients, or only the targeted ones
            Collection<String> clientIds = delayTask.isPushToAll() ? delayTaskEngine.getIndexesManager().getAllClientsSubscribeService(service): delayTask.getTargetClients();
            for (String each : clientIds) {
                Client client = clientManager.getClient(each);
                Subscriber subscriber = client.getSubscriber(service);
                // Request callback: handles success, failure, and timeout
                ServicePushCallback callback = new ServicePushCallback(each, subscriber, wrapper.getOriginalData(), delayTask.isPushToAll());
                // The call that actually runs is PushExecutorRpcImpl#doPushWithCallback
                delayTaskEngine.getPushExecutor().doPushWithCallback(each, subscriber, wrapper, callback);
            }
        } catch (Exception e) {
            Loggers.PUSH.error("Push task for service" + service.getGroupedServiceName() + " execute failed ", e);
            delayTaskEngine.addTask(service, new PushDelayTask(service, 1000L));
        }
    }
}

public class PushExecutorRpcImpl implements PushExecutor {
	
    private final RpcPushService pushService;
    
    public void doPushWithCallback(String clientId, Subscriber subscriber, PushDataWrapper data,
            NamingPushCallback callBack) {
        
        //[] ServiceInfo actualServiceInfo = getServiceInfo(data, subscriber);
        // Select the usable instances
        ServiceInfo actualServiceInfo = ServiceUtil.selectInstancesWithHealthyProtection(data.getOriginalData(), data.getServiceMetadata(), false, true, subscriber);
        callBack.setActualServiceInfo(actualServiceInfo);
        // Build the RPC request
        NotifySubscriberRequest request = NotifySubscriberRequest.buildNotifySubscriberRequest(actualServiceInfo);
        //[] pushService.pushWithCallback(clientId, request, callBack, GlobalExecutor.getCallbackExecutor());
        // The lines below are the inlined body of that call, so connectionId, requestCallBack and executor are its variables
        Connection connection = connectionManager.getConnection(connectionId);
        if (connection != null) {
            try {
                // Asynchronous request
                connection.asyncRequest(request, new AbstractRequestCallBack(requestCallBack.getTimeout()) {
                    
                    @Override
                    public Executor getExecutor() {
                        return executor;
                    }
                    
                    @Override
                    public void onResponse(Response response) {
                        if (response.isSuccess()) {
                            requestCallBack.onSuccess();
                        } else {
                            requestCallBack.onFail(new NacosException(response.getErrorCode(), response.getMessage()));
                        }
                    }
                    
                    @Override
                    public void onException(Throwable e) {
                        requestCallBack.onFail(e);
                    }
                });
            } catch (ConnectionAlreadyClosedException e) {
                // The connection is already closed: unregister it
                connectionManager.unregister(connectionId);
                requestCallBack.onSuccess();
            } catch (Exception e) {
                requestCallBack.onFail(e);
            }
        } else {
            requestCallBack.onSuccess();
        }
    }
    
}

Client-side source code

public class NamingPushRequestHandler implements ServerRequestHandler {
    
    private final ServiceInfoHolder serviceInfoHolder;
    
    public NamingPushRequestHandler(ServiceInfoHolder serviceInfoHolder) {
        this.serviceInfoHolder = serviceInfoHolder;
    }
    
    @Override
    public Response requestReply(Request request) {
        if (request instanceof NotifySubscriberRequest) {
            NotifySubscriberRequest notifyResponse = (NotifySubscriberRequest) request;
            //[] serviceInfoHolder.processServiceInfo(notifyResponse.getServiceInfo());
            ServiceInfo serviceInfo = notifyResponse.getServiceInfo();
            
            // Update the cached serviceInfo
            String serviceKey = serviceInfo.getKey();
            ServiceInfo oldService = serviceInfoMap.get(serviceInfo.getKey());
            
            serviceInfoMap.put(serviceInfo.getKey(), serviceInfo);
            boolean changed = isChangedServiceInfo(oldService, serviceInfo);
            if (StringUtils.isBlank(serviceInfo.getJsonFromServer())) {
                serviceInfo.setJsonFromServer(JacksonUtils.toJson(serviceInfo));
            }
            MetricsMonitor.getServiceInfoMapSizeMonitor().set(serviceInfoMap.size());
            if (changed) {
                // **** Publish the event
                NotifyCenter.publishEvent(new InstancesChangeEvent(notifierEventScope, serviceInfo.getName(), serviceInfo.getGroupName(),
                        serviceInfo.getClusters(), serviceInfo.getHosts()));
                // **** Write to the local file
                DiskCache.write(serviceInfo, cacheDir);
            }
            return new NotifySubscriberResponse();
        }
        return null;
    }
}
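
The handler above always stores the pushed ServiceInfo but only publishes an event and writes the disk cache when the data actually changed. The sketch below isolates that "store always, notify on change" step. ServiceInfoCacheSketch is a hypothetical class, hosts are reduced to a set of "ip:port" strings, and the comparison is plain set equality, which is an assumption and simpler than the real isChangedServiceInfo.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical client-side cache: service key -> set of "ip:port" hosts.
class ServiceInfoCacheSketch {
    private final Map<String, Set<String>> hostsByServiceKey = new ConcurrentHashMap<>();
    private final List<String> publishedEvents = new ArrayList<>();

    // Store the pushed snapshot; record an event only when the host set differs from the old one.
    boolean process(String serviceKey, Set<String> hosts) {
        Set<String> snapshot = new HashSet<>(hosts);
        Set<String> old = hostsByServiceKey.put(serviceKey, snapshot);
        boolean changed = !snapshot.equals(old); // a missing old entry counts as a change
        if (changed) {
            publishedEvents.add(serviceKey);
        }
        return changed;
    }

    List<String> publishedEvents() {
        return publishedEvents;
    }

    public static void main(String[] args) {
        ServiceInfoCacheSketch cache = new ServiceInfoCacheSketch();
        Set<String> hosts = new HashSet<>();
        hosts.add("10.0.0.1:8080");
        System.out.println(cache.process("DEFAULT_GROUP@@order", hosts));
        System.out.println(cache.process("DEFAULT_GROUP@@order", hosts));
    }
}
```

Skipping the event for identical pushes matters because a full push is also sent right after a client subscribes, when the client may already hold the same data.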

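Cluster data synchronization

Core function: synchronize the current node's data to the other cluster members. The data is a ClientSyncData object containing clientId, attributes, namespaces, groupNames, serviceNames, and instancePublishInfos, and it is sent by calling the other nodes' remote interface (a DistroDataRequest)
Source entry: DistroClientDataProcessor#onEvent
Handles three events: ClientVerifyFailedEvent, ClientDisconnectEvent, and ClientChangedEvent

Sequence diagram

To keep the call chain short, the return flow is omitted

Event handler class: DistroClientDataProcessor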

public class DistroClientDataProcessor extends SmartSubscriber implements DistroDataStorage, DistroDataProcessor {

    public static final String TYPE = "Nacos:Naming:v2:ClientData";
    
    public void onEvent(Event event) {
        // Only effective in cluster mode
        if (EnvUtil.getStandaloneMode()) {
            return;
        }
    
        // gRPC must be in use
        if (!upgradeJudgement.isUseGrpcFeatures()) {
            return;
        }
    
        Client client = ((ClientEvent) event).getClient();
        // The client must be ephemeral, and the current server must be the one responsible for it
        if (null == client || !client.isEphemeral() || !clientManager.isResponsibleClient(client)) {
            return;
        }
        // Handle the event
        if (event instanceof ClientEvent.ClientVerifyFailedEvent) {
            // Related code: the DistroVerifyExecuteTask and DistroVerifyCallbackWrapper classes
            //[] syncToVerifyFailedServer((ClientEvent.ClientVerifyFailedEvent) event);
            DistroKey distroKey = new DistroKey(client.getClientId(), TYPE);
            // Re-sync the data that failed verification
            //[] distroProtocol.syncToTarget(distroKey, DataOperation.ADD, event.getTargetServer(), 0L);
            // The lines below are the inlined body of syncToTarget, so targetServer, action and delay are its parameters
            DistroKey distroKeyWithTarget = new DistroKey(distroKey.getResourceKey(), distroKey.getResourceType(), targetServer);
            // Add a DistroDelayTask; the DistroDelayTaskExecuteEngine schedules it and finally calls DistroDelayTaskProcessor#process (see the code below)
            DistroDelayTask distroDelayTask = new DistroDelayTask(distroKeyWithTarget, action, delay);
            distroTaskEngineHolder.getDelayTaskExecuteEngine().addTask(distroKeyWithTarget, distroDelayTask);
        } else {
            //[] syncToAllServer((ClientEvent) event);
            if (event instanceof ClientEvent.ClientDisconnectEvent) {
                DistroKey distroKey = new DistroKey(client.getClientId(), TYPE);
                //[] distroProtocol.sync(distroKey, DataOperation.DELETE);
                //[] sync(distroKey, action, DistroConfig.getInstance().getSyncDelayMillis());
                for (Member each : memberManager.allMembersWithoutSelf()) { // the other Nacos nodes in the cluster
                    // Same as syncToTarget in the ClientVerifyFailedEvent branch
                    //[] syncToTarget(distroKey, action, each.getAddress(), delay); 
                    DistroKey distroKeyWithTarget = new DistroKey(distroKey.getResourceKey(), distroKey.getResourceType(), targetServer);
                    // Add a DistroDelayTask; the DistroDelayTaskExecuteEngine schedules it and finally calls DistroDelayTaskProcessor#process (see the code below)
                    DistroDelayTask distroDelayTask = new DistroDelayTask(distroKeyWithTarget, action, delay);
                    distroTaskEngineHolder.getDelayTaskExecuteEngine().addTask(distroKeyWithTarget, distroDelayTask);
                }
            } else if (event instanceof ClientEvent.ClientChangedEvent) {
                DistroKey distroKey = new DistroKey(client.getClientId(), TYPE);
                // Same handling as ClientDisconnectEvent, except the operation is CHANGE instead of DELETE
                distroProtocol.sync(distroKey, DataOperation.CHANGE);
            }
        }
    }
}
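
The sync path above fans one client change out into one delay task per remote member, each keyed by resource key plus target server. The sketch below illustrates that fan-out with a plain map as the pending-task table. DistroFanOutSketch, the string-based key format, and the "latest action wins" rule for tasks with the same key are all assumptions for illustration, not the Distro implementation.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical pending-task table: key = clientId + "@" + targetServer, value = requested action.
class DistroFanOutSketch {
    private final Map<String, String> pendingTasks = new LinkedHashMap<>();
    private final String self;
    private final List<String> members;

    DistroFanOutSketch(String self, List<String> members) {
        this.self = self;
        this.members = members;
    }

    // Queue one task per member, skipping the local node.
    void sync(String clientId, String action) {
        for (String member : members) {
            if (!member.equals(self)) {
                syncToTarget(clientId, action, member);
            }
        }
    }

    // Assumed rule: a later task with the same key replaces the earlier one.
    void syncToTarget(String clientId, String action, String targetServer) {
        pendingTasks.put(clientId + "@" + targetServer, action);
    }

    Map<String, String> pendingTasks() {
        return pendingTasks;
    }

    public static void main(String[] args) {
        DistroFanOutSketch distro = new DistroFanOutSketch("node-a",
                Arrays.asList("node-a", "node-b", "node-c"));
        distro.sync("client-1", "CHANGE");
        distro.sync("client-1", "DELETE");
        System.out.println(distro.pendingTasks());
    }
}
```

Including the target server in the key is what lets a single failed peer be retried on its own, as the ClientVerifyFailedEvent branch does with syncToTarget.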

//[] DistroDelayTaskProcessor#process 
public class DistroDelayTaskProcessor implements NacosTaskProcessor {
    
    public boolean process(NacosTask task) {
        if (!(task instanceof DistroDelayTask)) {
            return true;
        }
        DistroDelayTask distroDelayTask = (DistroDelayTask) task;
        DistroKey distroKey = distroDelayTask.getDistroKey();
        switch (distroDelayTask.getAction()) {
            case DELETE: // a client removal is submitted as a DELETE operation
                // Add a DistroSyncDeleteTask execute task, run by the DistroExecuteTaskExecuteEngine
                DistroSyncDeleteTask syncDeleteTask = new DistroSyncDeleteTask(distroKey, distroComponentHolder);
                //[] distroTaskEngineHolder.getExecuteWorkersManager().addTask(distroKey, syncDeleteTask);
                // Inlined body of addTask(tag, task), with tag = distroKey and task = syncDeleteTask
                NacosTaskProcessor processor = getProcessor(distroKey); // no processor is registered for this key
                if (null != processor) {
                    processor.process(syncDeleteTask);
                    return true;
                }
                TaskExecuteWorker worker = getWorker(distroKey);
                // The worker runs DistroSyncDeleteTask#doExecute or #doExecuteWithCallback; both call DistroTransportAgent#syncData (covered below)
                worker.process(syncDeleteTask); 
                return true;
            case CHANGE:
            case ADD:
                // Same logic as DELETE
                DistroSyncChangeTask syncChangeTask = new DistroSyncChangeTask(distroKey, distroComponentHolder);
                distroTaskEngineHolder.getExecuteWorkersManager().addTask(distroKey, syncChangeTask);
                return true;
            default:
                return false;
        }
    }
}
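
The getWorker(...) step hands the task to one of several workers, and the introduction of this article notes that Nacos tends to build such workers from a Thread plus a BlockingQueue. The sketch below shows that shape: tasks with the same key always land on the same single-threaded worker, so they run in submission order. KeyedWorkersSketch and the hash-modulo selection are assumptions for illustration, not the TaskExecuteWorker source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical keyed worker pool: one thread and one queue per worker.
class KeyedWorkersSketch {
    private final List<BlockingQueue<Runnable>> queues = new ArrayList<>();
    private final List<Thread> threads = new ArrayList<>();

    KeyedWorkersSketch(int workerCount) {
        for (int i = 0; i < workerCount; i++) {
            BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
            Thread thread = new Thread(() -> {
                try {
                    while (true) {
                        queue.take().run(); // block until a task arrives, then run it
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // interrupted: stop the worker
                }
            }, "sketch-worker-" + i);
            thread.setDaemon(true);
            thread.start();
            queues.add(queue);
            threads.add(thread);
        }
    }

    // Mask the sign bit so negative hash codes still map to a valid index.
    int workerIndex(Object key) {
        return (key.hashCode() & Integer.MAX_VALUE) % queues.size();
    }

    void addTask(Object key, Runnable task) {
        queues.get(workerIndex(key)).add(task);
    }

    void shutdown() {
        threads.forEach(Thread::interrupt);
    }

    public static void main(String[] args) throws InterruptedException {
        KeyedWorkersSketch workers = new KeyedWorkersSketch(4);
        CountDownLatch done = new CountDownLatch(3);
        for (int i = 0; i < 3; i++) {
            final int n = i;
            workers.addTask("client-1", () -> {
                System.out.println("task " + n + " on " + Thread.currentThread().getName());
                done.countDown();
            });
        }
        done.await();
        workers.shutdown();
    }
}
```

Pinning a key to one worker avoids locking between tasks for the same client, at the cost of uneven load when a few keys are very busy.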

DistroTransportAgent#syncData

DistroTransportAgent has two implementations: DistroClientTransportAgent and DistroHttpAgent (which does not support callbacks). Nacos 2 uses DistroClientTransportAgent, not the HTTP one

 public class DistroClientTransportAgent implements DistroTransportAgent {
    @Override
    public boolean syncData(DistroData data, String targetServer) {
        if (isNoExistTarget(targetServer)) {
            return true;
        }
        DistroDataRequest request = new DistroDataRequest(data, data.getType());
        Member member = memberManager.find(targetServer);
        try {
            // Send the DistroDataRequest synchronously
            Response response = clusterRpcClientProxy.sendRequest(member, request);
            return checkResponse(response);
        } catch (NacosException e) {
            Loggers.DISTRO.error("[DISTRO-FAILED] Sync distro data failed! key: {}", data.getDistroKey(), e);
        }
        return false;
    }
    
    @Override
    public void syncData(DistroData data, String targetServer, DistroCallback callback) {
        if (isNoExistTarget(targetServer)) {
            callback.onSuccess();
            return;
        }
        DistroDataRequest request = new DistroDataRequest(data, data.getType());
        Member member = memberManager.find(targetServer);
        try {
            // Send the DistroDataRequest asynchronously
            clusterRpcClientProxy.asyncRequest(member, request, new DistroRpcCallbackWrapper(callback, member));
        } catch (NacosException nacosException) {
            callback.onFailed(nacosException);
        }
    }
 }

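Next, let's look at how the receiving server handles the DistroDataRequest

Handling the DistroDataRequest

Source entry: DistroDataRequestHandler#handle, which handles the following operations (key code below):

  • Change operations (ADD/CHANGE/DELETE): maintain the registry data, then publish events (ClientRegisterServiceEvent, ClientDeregisterServiceEvent, ClientDisconnectEvent)
  • Fetch snapshot data (SNAPSHOT)
  • Handle verification data (VERIFY)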
public class DistroDataRequestHandler extends RequestHandler<DistroDataRequest, DistroDataResponse> {
    
    public DistroDataResponse handle(DistroDataRequest request, RequestMeta meta) throws NacosException {
        try {
            switch (request.getDataOperation()) {
                case VERIFY:
                    // []return handleVerify(request.getDistroData(), meta);
                    DistroDataResponse result = new DistroDataResponse();
                    if (!distroProtocol.onVerify(distroData, meta.getClientIp())) {
                        result.setErrorInfo(ResponseCode.FAIL.getCode(), "[DISTRO-FAILED] distro data verify failed");
                    }
                    return result;
                case SNAPSHOT: // fetch the baseline snapshot data
                    //[] return handleSnapshot();
                    DistroDataResponse result = new DistroDataResponse();
                    //[] DistroData distroData = distroProtocol.onSnapshot(DistroClientDataProcessor.TYPE);
                    DistroDataStorage distroDataStorage = distroComponentHolder.findDataStorage(type);
                    if (null == distroDataStorage) {
                        distroData = new DistroData(new DistroKey("snapshot", type), new byte[0]);
                    } else {
                        //[] distroDataStorage.getDatumSnapshot();
                        ClientSyncDatumSnapshot snapshot = new ClientSyncDatumSnapshot();
                        byte[] data = ApplicationUtils.getBean(Serializer.class).serialize(snapshot);
                        distroData = new DistroData(new DistroKey(DataOperation.SNAPSHOT.name(), TYPE), data);
                    }
                    result.setDistroData(distroData);
                    return result;
                case ADD:
                case CHANGE:
                case DELETE:
                    // []return handleSyncData(request.getDistroData());
                    DistroDataResponse result = new DistroDataResponse();
                    String resourceType = distroData.getDistroKey().getResourceType();
                    // [] distroProtocol.onReceive(distroData); its result is tracked in the onReceive flag
                    boolean onReceive = false;
                    // For this resource type, dataProcessor is the DistroClientDataProcessor
                    DistroDataProcessor dataProcessor = distroComponentHolder.findDataProcessor(resourceType); 
                    // []return dataProcessor.processData(distroData);
                    switch (distroData.getType()) {
                        case ADD:
                        case CHANGE:
                            // Deserialize content (a byte[]) into ClientSyncData
                            ClientSyncData clientSyncData = ApplicationUtils.getBean(Serializer.class)
                                    .deserialize(distroData.getContent(), ClientSyncData.class);
                            // [] handlerClientSyncData(clientSyncData);
                            // Get the client, creating it (an IpPortBasedClient) when it does not exist yet
                            clientManager.syncClientConnected(clientSyncData.getClientId(), clientSyncData.getAttributes()); 
                            Client client = clientManager.getClient(clientSyncData.getClientId()); 
                            // [] upgradeClient(client, clientSyncData);
                            Set<Service> syncedService = new HashSet<>();
    
                            // Batch handling follows the same logic as single-instance handling
                            // processBatchInstanceDistroData(syncedService, client, clientSyncData);
                            
                            for (int i = 0; i < clientSyncData.namespaces.size(); i++) {
                                // (elided) build the singleton Service and its instancePublishInfo for index i
                                syncedService.add(singleton);
                                // Add the registry entry and publish ClientRegisterServiceEvent
                                client.addServiceInstance(singleton, instancePublishInfo);
                                NotifyCenter.publishEvent(
                                        new ClientOperationEvent.ClientRegisterServiceEvent(singleton, client.getClientId()));
                            }
                            for (Service each : client.getAllPublishedService()) { // every service the client has published
                                // A service missing from the synced data is removed from the registry, publishing ClientDeregisterServiceEvent
                                if (!syncedService.contains(each)) {
                                    client.removeServiceInstance(each);
                                    NotifyCenter.publishEvent(
                                            new ClientOperationEvent.ClientDeregisterServiceEvent(each, client.getClientId()));
                                }
                            }
                            onReceive = true;
                            break;
                        case DELETE:
                            String deleteClientId = distroData.getDistroKey().getResourceKey();
                            Loggers.DISTRO.info("[Client-Delete] Received distro client sync data {}", deleteClientId);
                            //[] clientManager.clientDisconnected(deleteClientId);
                            // Remove the client and publish ClientDisconnectEvent
                            IpPortBasedClient removedClient = clients.remove(deleteClientId);
                            NotifyCenter.publishEvent(new ClientEvent.ClientDisconnectEvent(removedClient, isResponsibleClient(removedClient)));
                            onReceive = true;
                            break;
                        default:
                            onReceive = false;
                            break;
                    }
                    if (!onReceive) { 
                        result.setErrorCode(ResponseCode.FAIL.getCode());
                        result.setMessage("[DISTRO-FAILED] distro data handle failed");
                    }
                    return result;
                case QUERY:
                    // []return handleQueryData(request.getDistroData());
                    DistroDataResponse result = new DistroDataResponse();
                    DistroKey distroKey = distroData.getDistroKey();
                    DistroData queryData = distroProtocol.onQuery(distroKey);
                    result.setDistroData(queryData);
                    return result;
                default:
                    return new DistroDataResponse();
            }
        } catch (Exception e) {
            DistroDataResponse result = new DistroDataResponse();
            result.setErrorCode(ResponseCode.FAIL.getCode());
            result.setMessage("handle distro request with exception");
            return result;
        }
    }
}

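Service health check: ConnectionManager

Basic logic: use the connection limit rule (ConnectionLimitRule) and the expiry time (20s) to work out which connections are outdated, then probe them (by sending an asynchronous RPC request to the client)

Flow chart

Source entry: ConnectionManager#start (executed at application startup)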

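This function covers both health checking and connection reset

[Initialization] Compute the outdated connections and the clients to expel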

int totalCount = connections.size();
Loggers.REMOTE_DIGEST.info("Connection check task start");
MetricsMonitor.getLongConnectionMonitor().set(totalCount);
Set<Map.Entry<String, Connection>> entries = connections.entrySet();
int currentSdkClientCount = currentSdkClientCount();
boolean isLoaderClient = loadClient >= 0;
int currentMaxClient = isLoaderClient ? loadClient : connectionLimitRule.countLimit;
int expelCount = currentMaxClient < 0 ? 0 : Math.max(currentSdkClientCount - currentMaxClient, 0);
List<String> expelClient = new LinkedList<>();

Map<String, AtomicInteger> expelForIp = new HashMap<>(16);
//1. calculate expel count  of ip.
for (Map.Entry<String, Connection> entry : entries) {

    Connection client = entry.getValue();
    String appName = client.getMetaInfo().getAppName();
    String clientIp = client.getMetaInfo().getClientIp();
    if (client.getMetaInfo().isSdkSource() && !expelForIp.containsKey(clientIp)) {
        //get limit for current ip.
        int countLimitOfIp = connectionLimitRule.getCountLimitOfIp(clientIp);
        if (countLimitOfIp < 0) {
            int countLimitOfApp = connectionLimitRule.getCountLimitOfApp(appName);
            countLimitOfIp = countLimitOfApp < 0 ? countLimitOfIp : countLimitOfApp;
        }
        if (countLimitOfIp < 0) {
            countLimitOfIp = connectionLimitRule.getCountLimitPerClientIpDefault();
        }

        if (countLimitOfIp >= 0 && connectionForClientIp.containsKey(clientIp)) {
            AtomicInteger currentCountIp = connectionForClientIp.get(clientIp);
            if (currentCountIp != null && currentCountIp.get() > countLimitOfIp) {
                expelForIp.put(clientIp, new AtomicInteger(currentCountIp.get() - countLimitOfIp));
            }
        }
    }
}
Set<String> outDatedConnections = new HashSet<>();
long now = System.currentTimeMillis();
//2.get expel connection for ip limit.
for (Map.Entry<String, Connection> entry : entries) {
    Connection client = entry.getValue();
    String clientIp = client.getMetaInfo().getClientIp();
    AtomicInteger integer = expelForIp.get(clientIp);
    if (integer != null && integer.intValue() > 0) {
        integer.decrementAndGet();
        expelClient.add(client.getMetaInfo().getConnectionId());
        expelCount--;
    } else if (now - client.getMetaInfo().getLastActiveTime() >= KEEP_ALIVE_TIME) {
        outDatedConnections.add(client.getMetaInfo().getConnectionId());
    }

}
 //3. if total count is still over limit.
if (expelCount > 0) {
    for (Map.Entry<String, Connection> entry : entries) {
        Connection client = entry.getValue();
        if (!expelForIp.containsKey(client.getMetaInfo().clientIp) && client.getMetaInfo()
                .isSdkSource() && expelCount > 0) {
            expelClient.add(client.getMetaInfo().getConnectionId());
            expelCount--;
            outDatedConnections.remove(client.getMetaInfo().getConnectionId());
        }
    }
}
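
The limit arithmetic in the listing above is easy to lose among the loops, so here it is in isolation. The per-IP limit is resolved in the order IP rule, app rule, default, with a negative value meaning "no limit", and the number of connections to expel is the excess over the limit. ExpelCountSketch and its method names are hypothetical. The formulas mirror the listing.

```java
// Hypothetical helper that isolates the limit arithmetic from the connection check task.
final class ExpelCountSketch {

    // Resolve the effective limit: IP rule first, then app rule, then the default.
    // A negative result means no limit applies.
    static int resolveLimit(int limitOfIp, int limitOfApp, int defaultLimit) {
        int limit = limitOfIp;
        if (limit < 0) {
            limit = limitOfApp;
        }
        if (limit < 0) {
            limit = defaultLimit;
        }
        return limit;
    }

    // Connections to expel: the excess over the limit, never negative, zero when unlimited.
    static int expelCount(int currentCount, int limit) {
        return limit < 0 ? 0 : Math.max(currentCount - limit, 0);
    }

    public static void main(String[] args) {
        int limit = resolveLimit(-1, 5, 10); // no IP rule, app rule of 5
        System.out.println(limit + " " + expelCount(8, limit));
    }
}
```

The same expelCount formula is used twice in the listing: once against the global client limit and once per client IP.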

[Function] Reset the connections of the expelled clients

Send a ConnectResetRequest, and unregister any connection that turns out to be already closed

String serverIp = null;
String serverPort = null;
if (StringUtils.isNotBlank(redirectAddress) && redirectAddress.contains(Constants.COLON)) {
    String[] split = redirectAddress.split(Constants.COLON);
    serverIp = split[0];
    serverPort = split[1];
}

for (String expelledClientId : expelClient) {
    try {
        Connection connection = getConnection(expelledClientId);
        if (connection != null) {
            ConnectResetRequest connectResetRequest = new ConnectResetRequest();
            connectResetRequest.setServerIp(serverIp);
            connectResetRequest.setServerPort(serverPort);
            connection.asyncRequest(connectResetRequest, null); // fire-and-forget: send the request to the client and ignore the response
        }

    } catch (ConnectionAlreadyClosedException e) {
        unregister(expelledClientId); // unregister the connection that is already closed
    } catch (Exception e) {
        Loggers.REMOTE_DIGEST.error("Error occurs when expel connection, expelledClientId:{}", expelledClientId, e);
    }
}

Client-side handling logic

class ConnectResetRequestHandler implements ServerRequestHandler {
    // Declared in RpcClient
    private final BlockingQueue<ReconnectContext> reconnectionSignal = new ArrayBlockingQueue<>(1);
    
    @Override
    public Response requestReply(Request request) {
        
        if (request instanceof ConnectResetRequest) {
            synchronized (RpcClient.this) {
                if (isRunning()) {
                    ConnectResetRequest connectResetRequest = (ConnectResetRequest) request;
                    if (StringUtils.isNotBlank(connectResetRequest.getServerIp())) {
                        // Parse the recommended server address
                        ServerInfo serverInfo = resolveServerInfo(connectResetRequest.getServerIp() + Constants.COLON + connectResetRequest.getServerPort());
                        //[] switchServerAsync(recommendServerInfo=serverInfo, onRequestFail=false);
                        // Enqueue a reconnect signal; the task that consumes it is shown below
                        reconnectionSignal.offer(new ReconnectContext(serverInfo, false));
                    } else {
                        //[] switchServerAsync();
                        switchServerAsync(null, false); // same enqueue logic as above, without a recommended server
                    }
                }
            }
            return new ConnectResetResponse();
        }
        return null;
    }
}

// In the source this is an anonymous inner class created in com.alibaba.nacos.common.remote.client.RpcClient#start
class ConnectResetTask implements Runnable{
    while (true) {
        // Dequeue a reconnect signal, waiting at most one keep-alive interval
        ReconnectContext reconnectContext = reconnectionSignal.poll(rpcClientConfig.connectionKeepAlive(), TimeUnit.MILLISECONDS);
        if (reconnectContext == null) {
            // No signal arrived: check how long the connection has been idle
            if (System.currentTimeMillis() - lastActiveTimeStamp >= rpcClientConfig.connectionKeepAlive()) {
                // Send a HealthCheckRequest to the server; on success the server simply returns a Response (see HealthCheckRequestHandler in the server source)
                boolean isHealthy = healthCheck();
                if (!isHealthy) { // the health check failed
                    RpcClientStatus rpcClientStatus = RpcClient.this.rpcClientStatus.get();

                    if (RpcClientStatus.SHUTDOWN.equals(rpcClientStatus)) {
                        // The client has been shut down, so stop the reconnect task
                        break;
                    }
                    // Mark the client status as UNHEALTHY
                    boolean statusFLowSuccess = RpcClient.this.rpcClientStatus.compareAndSet(rpcClientStatus, RpcClientStatus.UNHEALTHY);
                    if (statusFLowSuccess) {
                        reconnectContext = new ReconnectContext(null, false);
                    } else {
                        continue;
                    }
                    
                } else {
                    // Still alive: refresh the last-active timestamp
                    lastActiveTimeStamp = System.currentTimeMillis();
                    continue;
                }
            } else {
                continue;
            }
            
        }
        
        if (reconnectContext.serverInfo != null) {
            
            boolean serverExist = false;
            for (String server : getServerListFactory().getServerList()) { // iterate over the configured server list
                ServerInfo serverInfo = resolveServerInfo(server); // parse the entry into a ServerInfo
                if (serverInfo.getServerIp().equals(reconnectContext.serverInfo.getServerIp())) {
                    serverExist = true;
                    // Take the port from the server-list entry whose IP matches the recommended server
                    reconnectContext.serverInfo.serverPort = serverInfo.serverPort;
                    break;
                }
            }
            if (!serverExist) {
                // The recommended server is not in the server list, so clear serverInfo
                reconnectContext.serverInfo = null;
            }
        }
        // Reconnect
        reconnect(reconnectContext.serverInfo, reconnectContext.onRequestFail);
      
    }
}
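
The queue-as-signal pattern used by `reconnectionSignal` can be illustrated in isolation. The following is a minimal sketch, not the RpcClient code: the class `ReconnectSignalSketch` and its methods are hypothetical, and a plain `String` stands in for `ReconnectContext`. It shows the two properties the task above relies on: a capacity-1 queue drops duplicate signals, and a timed `poll` doubles as the keep-alive timer.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of a capacity-1 signal queue; a String stands in for ReconnectContext.
final class ReconnectSignalSketch {

    // Capacity 1: while one reconnect request is pending, further offers are rejected.
    private final BlockingQueue<String> signal = new ArrayBlockingQueue<>(1);

    /** Producer side: returns false when a reconnect request is already queued. */
    boolean requestReconnect(String targetServer) {
        return signal.offer(targetServer);
    }

    /** Consumer side: wait up to keepAliveMillis; null means no signal arrived, so run a health check. */
    String awaitSignal(long keepAliveMillis) {
        try {
            return signal.poll(keepAliveMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return null;
        }
    }
}
```

A consumer loop would call `awaitSignal` repeatedly: a non-null result triggers a reconnect, while a null result is the cue to check idle time and probe the server.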

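Connection reset sequence diagram

[Feature] Health detection

The server sends a ClientDetectionRequest to probe each idle connection, then unregisters the connections that do not answer.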

if (CollectionUtils.isNotEmpty(outDatedConnections)) {
    Set<String> successConnections = new HashSet<>();
    final CountDownLatch latch = new CountDownLatch(outDatedConnections.size());
    for (String outDateConnectionId : outDatedConnections) {
        try {
            Connection connection = getConnection(outDateConnectionId);
            if (connection != null) {
                ClientDetectionRequest clientDetectionRequest = new ClientDetectionRequest();
                // Send asynchronously; a live client answers right away with a ClientDetectionResponse
                connection.asyncRequest(clientDetectionRequest, new RequestCallBack() {
                    @Override
                    public Executor getExecutor() {
                        return null;
                    }

                    @Override
                    public long getTimeout() {
                        return 1000L; // 1s timeout
                    }

                    @Override
                    public void onResponse(Response response) {
                        latch.countDown();
                        if (response != null && response.isSuccess()) {
                            connection.freshActiveTime();
                            successConnections.add(outDateConnectionId); // record the connection as alive
                        }
                    }

                    @Override
                    public void onException(Throwable e) {
                        latch.countDown();
                    }
                });

            } else {
                latch.countDown();
            }

        } catch (ConnectionAlreadyClosedException e) {
            latch.countDown();
        } catch (Exception e) {
            latch.countDown();
        }
    }

    latch.await(3000L, TimeUnit.MILLISECONDS);

    for (String outDateConnectionId : outDatedConnections) {
        if (!successConnections.contains(outDateConnectionId)) {
            // Unregister the connections that did not answer the probe
            unregister(outDateConnectionId);
        }
    }
}
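
The probe-then-wait structure above (fire all probes asynchronously, count down a latch on every outcome, wait with a bound, then treat everything not confirmed as dead) can be sketched on its own. This is a hedged illustration rather than the Nacos implementation: `DetectionSketch`, `findDead`, and the `Predicate` probe are hypothetical stand-ins for the connection manager and `ClientDetectionRequest`. Unlike the callback above, the sketch records a success before counting down, so the waiting thread cannot see the latch released before the result is stored.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

// Hypothetical sketch: probe ids in parallel and report the ones that did not confirm liveness in time.
final class DetectionSketch {

    static Set<String> findDead(Set<String> ids, Predicate<String> probe, long waitMillis) {
        Set<String> alive = ConcurrentHashMap.newKeySet();
        CountDownLatch latch = new CountDownLatch(ids.size());
        ExecutorService pool = Executors.newCachedThreadPool();
        for (String id : ids) {
            pool.execute(() -> {
                try {
                    if (probe.test(id)) {
                        alive.add(id); // record success first
                    }
                } finally {
                    latch.countDown(); // always count down, whether the probe succeeded, failed, or threw
                }
            });
        }
        try {
            latch.await(waitMillis, TimeUnit.MILLISECONDS); // bounded wait, like the 3000 ms above
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow();
        }
        Set<String> dead = new HashSet<>(ids);
        dead.removeAll(alive); // anything not confirmed alive is treated as dead
        return dead;
    }
}
```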

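Key unregister code (unregister)

Every registered connection listener is notified of the disconnect, and ConnectionBasedClientManager then publishes a ClientDisconnectEvent.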

public synchronized void unregister(String connectionId) {
    Connection remove = this.connections.remove(connectionId);
    if (remove != null) {
        remove.close(); // close the connection
        // [1] clientConnectionEventListenerRegistry.notifyClientDisConnected(remove);
        // Notify every registered ClientConnectionEventListener of the disconnect. The one that matters here is ConnectionBasedClientManager (see [3]); the other two (ConfigConnectionEventListener and RpcAckCallbackInitorOrCleaner) only clear caches
        for (ClientConnectionEventListener clientConnectionEventListener : clientConnectionEventListeners) {
            // [2] clientConnectionEventListener.clientDisConnected(remove);
            // [3] ConnectionBasedClientManager#clientDisConnected
            String clientId = remove.getMetaInfo().getConnectionId();
            ConnectionBasedClient client = clients.remove(clientId);
            client.release();
            NotifyCenter.publishEvent(new ClientEvent.ClientDisconnectEvent(client, isResponsibleClient(client)));
        }
    }
}

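ClientDisconnectEvent is handled in two places:

  • Remove the client's data from the registry and subscription tables: the source is ClientServiceIndexesManager#onEvent, walked through in the service push section
  • Sync the data to the cluster: the source is DistroClientDataProcessor#onEvent, walked through in the cluster data sync section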

Metadata management: NamingMetadataManager

Purpose: tracks (adds/removes) expired instance and service metadata
Source entry point: NamingMetadataManager#onEvent
Events handled: InstanceMetadataEvent, ServiceMetadataEvent, ClientDisconnectEvent

public class NamingMetadataManager extends SmartSubscriber {
    
    private final Set<ExpiredMetadataInfo> expiredMetadataInfos;
    private ConcurrentMap<Service, ServiceMetadata> serviceMetadataMap;
    private ConcurrentMap<Service, ConcurrentMap<String, InstanceMetadata>> instanceMetadataMap;

    public NamingMetadataManager() {
        serviceMetadataMap = new ConcurrentHashMap<>(1 << 10);
        instanceMetadataMap = new ConcurrentHashMap<>(1 << 10);
        expiredMetadataInfos = new ConcurrentHashSet<>();
        NotifyCenter.registerSubscriber(this, NamingEventPublisherFactory.getInstance());
    }
    
    public void onEvent(Event event) {
        if (event instanceof MetadataEvent.InstanceMetadataEvent) {
            //[] handleInstanceMetadataEvent((MetadataEvent.InstanceMetadataEvent) event);
            // If metadata exists for this instance
            if (containInstanceMetadata(event.getService(), event.getMetadataId())) {
                //[] updateExpiredInfo(event.isExpired(),ExpiredMetadataInfo.newExpiredInstanceMetadata(event.getService(), event.getMetadataId()));
                // Add or remove the expired-instance-metadata marker
                if (expired) {
                    expiredMetadataInfos.add(expiredMetadataInfo);
                } else {
                    expiredMetadataInfos.remove(expiredMetadataInfo);
                }
            }
        } else if (event instanceof MetadataEvent.ServiceMetadataEvent) {
            // Same idea as InstanceMetadataEvent: add or remove the expired-service-metadata marker
            handleServiceMetadataEvent((MetadataEvent.ServiceMetadataEvent) event);
        } else {
            // Same idea as InstanceMetadataEvent: add or remove expired-instance-metadata markers
            handleClientDisconnectEvent((ClientEvent.ClientDisconnectEvent) event);
        }
    }
}
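
The add-or-remove bookkeeping that `updateExpiredInfo` performs can be shown with a few lines. This is a simplified sketch under stated assumptions: `ExpiredMetadataSketch` is a hypothetical class, a `String` key stands in for `ExpiredMetadataInfo`, and `ConcurrentHashMap.newKeySet()` replaces Nacos's `ConcurrentHashSet`.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a String key stands in for ExpiredMetadataInfo.
final class ExpiredMetadataSketch {

    private final Set<String> expired = ConcurrentHashMap.newKeySet();

    /** Mark the key as expired, or clear the mark when the metadata is in use again. */
    void update(boolean isExpired, String metadataKey) {
        if (isExpired) {
            expired.add(metadataKey);
        } else {
            expired.remove(metadataKey);
        }
    }

    boolean isExpired(String metadataKey) {
        return expired.contains(metadataKey);
    }
}
```

Keeping only markers in the set, rather than deleting metadata immediately, leaves room for an instance to come back before a later cleanup pass removes its metadata.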

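Core utility classes: source walkthrough

Event-driven design

nacos2 makes heavy use of an event-driven architecture.

NotifyCenter

Service registration and discovery publish events through NotifyCenter#publishEvent, so let's start with its core flow and implementation.

Sequence diagram

Publishing events: NotifyCenter#publishEvent

Basic logic: enqueue the event; if the enqueue fails, handle the event directly (deliver it to the subscribers).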

private final Map<String, EventPublisher> publisherMap = new ConcurrentHashMap<>(16);

static{
    // Create and start the shared event publisher
    INSTANCE.sharePublisher = new DefaultSharePublisher();
    //[] INSTANCE.sharePublisher.init(type=SlowEvent.class, bufferSize=shareBufferSize=1024);
    setDaemon(true);
    setName("nacos.publisher-" + type.getName());
    this.eventType = type;
    this.queueMaxSize = bufferSize;
    this.queue = new ArrayBlockingQueue<>(bufferSize);
    start(); // start the thread
}

public static boolean publishEvent(final Event event) {
    try {
        //[] return publishEvent(eventType=event.getClass(), event);
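        // SlowEvents share a single event sequence; sequence() is always 0
        if (ClassUtils.isAssignableFrom(SlowEvent.class, eventType)) {
            //[] return INSTANCE.sharePublisher.publish(event);
            // Put the event straight onto the queue
            boolean success = this.queue.offer(event); // DefaultPublisher#publish
            if (!success) {
                // Enqueue failed: handle the event directly by delivering it to the subscribers (not the main path, covered below)
                receiveEvent(event);
                return true;
            }
            return true;
        }
        // Similar to Class.getName(), but for arrays and inner classes it yields the name as written in source code
        final String topic = ClassUtils.getCanonicalName(eventType);
        // Look up the publisher for this event type; publishers are registered through NotifyCenter#registerToPublisher
        EventPublisher publisher = INSTANCE.publisherMap.get(topic);
        if (publisher != null) {
            // Three implementations: NamingEventPublisher#publish (covered below), TraceEventPublisher#publish (enqueue only, a failed enqueue is skipped), DefaultPublisher#publish (already covered above)
            //[] return publisher.publish(event);
            boolean success = this.queue.offer(event);
            if (!success) {
                // Enqueue failed: handle the event directly by delivering it to the subscribers (not the main path, covered below)
                handleEvent(event); // NamingEventPublisher#handleEvent
            }
            return true;
        }
        // No publisher is registered; for a plugin event (such as NamingTraceEvent) it is acceptable not to publish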
        if (event.isPluginEvent()) {
            return true;
        }
        return false;
    } catch (Throwable ex) {
        return false;
    }
}
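
The "enqueue, otherwise handle inline" rule that `publishEvent` follows can be isolated in a short sketch. The names below (`OfferOrHandleSketch`, `handledInline`) are hypothetical, and a `String` stands in for `Event`; the only behaviour taken from the text above is that a failed `offer` falls back to handling the event on the publishing thread instead of dropping it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch of the offer-or-handle-inline fallback; a String stands in for Event.
final class OfferOrHandleSketch {

    private final BlockingQueue<String> queue;
    private final List<String> handledInline = new ArrayList<>();

    OfferOrHandleSketch(int bufferSize) {
        this.queue = new ArrayBlockingQueue<>(bufferSize);
    }

    /** Never blocks and never drops: a full queue means the caller's thread handles the event. */
    boolean publish(String event) {
        if (!queue.offer(event)) {
            handledInline.add(event); // stand-in for receiveEvent(event) / handleEvent(event)
        }
        return true;
    }

    int queued() {
        return queue.size();
    }

    List<String> handledInline() {
        return handledInline;
    }
}
```

The trade-off is that under sustained overload the publishing thread pays the delivery cost itself, which acts as natural back-pressure while keeping events from being lost.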

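Next, let's look at how events are dispatched and executed. Three classes are involved: DefaultSharePublisher, NamingEventPublisher, and TraceEventPublisher.

Event dispatch source: DefaultSharePublisher

Start with DefaultPublisher, the parent class of DefaultSharePublisher.
DefaultPublisher extends Thread, is started by an explicit call to init, and declares an internal queue.

// A thread: once started through init() it runs quietly in the background
public class DefaultPublisher extends Thread implements EventPublisher {
    // Holds the subscribers
    protected final ConcurrentHashSet<Subscriber> subscribers = new ConcurrentHashSet<>();
    // Called explicitly by the owner
    public void init(Class<? extends Event> type, int bufferSize) {
        setDaemon(true); // daemon thread: it does not keep the JVM alive when the application exits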
        setName("nacos.publisher-" + type.getName());
        this.eventType = type;
        this.queueMaxSize = bufferSize;
        this.queue = new ArrayBlockingQueue<>(bufferSize);
        start();
    }

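    // Overrides Thread#run: executes once the thread has been started and gets CPU time
    @Override
    public void run() {
        //[] openEventHandler();
        int waitTimes = 60;
        // To avoid losing events, wait up to 60 seconds for the first
        // subscriber to register before starting to consume the queue
        while (!shutdown && !hasSubscriber() && waitTimes > 0) {
            ThreadUtils.sleep(1000L);
            waitTimes--;
        }

        while (!shutdown) {
            // Block until an event is available
            final Event event = queue.take();
            // Handle the event (explained separately; publishEvent above uses it too)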
            receiveEvent(event); 
            UPDATER.compareAndSet(this, lastEventSequence, Math.max(lastEventSequence, event.sequence()));
        }
    }

    // Receive the event and notify the subscribers
    void receiveEvent(Event event) {
        final long currentEventSequence = event.sequence();

        for (Subscriber subscriber : subscribers) {
            //[] notifySubscriber(subscriber, event);
            // Wrap onEvent (e.g. NamingSubscriberServiceV2Impl#onEvent) in a Runnable and run it on the subscriber's Executor
            final Runnable job = () -> subscriber.onEvent(event);
            final Executor executor = subscriber.executor();

            if (executor != null) {
                executor.execute(job);
            } else {
                job.run(); // no executor provided: run the job synchronously on the current thread
            }
        }
    }
}

That is the key code of DefaultPublisher. DefaultSharePublisher only extends the subscriber management; its dispatch logic is identical to DefaultPublisher.
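
To make the thread-plus-queue shape concrete, here is a minimal, runnable sketch in the spirit of DefaultPublisher. It is not the Nacos class: `MiniPublisherSketch` and its `demo` helper are hypothetical, `String` stands in for `Event`, `Consumer<String>` stands in for `Subscriber`, and the 60-second wait for the first subscriber is omitted for brevity.

```java
import java.util.Set;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical miniature of a publisher thread that drains a queue and notifies subscribers.
final class MiniPublisherSketch extends Thread {

    private final BlockingQueue<String> queue = new ArrayBlockingQueue<>(16);
    private final Set<Consumer<String>> subscribers = ConcurrentHashMap.newKeySet();
    private volatile boolean shutdown;

    MiniPublisherSketch() {
        setDaemon(true); // does not keep the JVM alive
        setName("sketch.publisher");
    }

    void addSubscriber(Consumer<String> subscriber) {
        subscribers.add(subscriber);
    }

    boolean publish(String event) {
        return queue.offer(event);
    }

    void shutdown() {
        shutdown = true;
        interrupt(); // wake the thread if it is blocked in take()
    }

    @Override
    public void run() {
        while (!shutdown) {
            try {
                String event = queue.take(); // block until an event arrives
                for (Consumer<String> subscriber : subscribers) {
                    subscriber.accept(event); // synchronous delivery, like the "no executor" branch
                }
            } catch (InterruptedException e) {
                return; // interrupted by shutdown()
            }
        }
    }

    /** Publishes one event and reports whether the subscriber received it within two seconds. */
    static boolean demo() {
        MiniPublisherSketch publisher = new MiniPublisherSketch();
        CountDownLatch delivered = new CountDownLatch(1);
        publisher.addSubscriber(e -> {
            if ("hello".equals(e)) {
                delivered.countDown();
            }
        });
        publisher.start();
        publisher.publish("hello");
        try {
            return delivered.await(2, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            publisher.shutdown();
        }
    }
}
```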

Event dispatch source: NamingEventPublisher

// A thread: once started through init() it runs quietly in the background
public class NamingEventPublisher extends Thread implements ShardedEventPublisher {
    // Holds the subscribers, keyed by event type
    private final Map<Class<? extends Event>, Set<Subscriber<? extends Event>>> subscribes = new ConcurrentHashMap<>();

    // Called explicitly by the owner
    public void init(Class<? extends Event> type, int bufferSize) {
        this.queueMaxSize = bufferSize;
        this.queue = new ArrayBlockingQueue<>(bufferSize);
        this.publisherName = type.getSimpleName();
        super.setName(THREAD_NAME + this.publisherName);
        super.setDaemon(true);
        super.start();
        initialized = true;
    }

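    // Overrides Thread#run: executes once the thread has been started and gets CPU time
    @Override
    public void run() {
        try {
            //[] waitSubscriberForInit();
            // To avoid losing events, wait for the first subscriber to register before consuming the queue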
            for (int waitTimes = DEFAULT_WAIT_TIME; waitTimes > 0; waitTimes--) {
                if (shutdown || !subscribes.isEmpty()) {
                    break;
                }
                ThreadUtils.sleep(1000L);
            }
            //[] handleEvents();
            while (!shutdown) {
                final Event event = queue.take();
                //[] handleEvent(event);
                Set<Subscriber<? extends Event>> subscribers = subscribes.get(event.getClass());
                for (Subscriber subscriber : subscribers) {
                    //[] notifySubscriber(subscriber, event);
                    // Wrap onEvent (e.g. NamingSubscriberServiceV2Impl#onEvent) in a Runnable and run it on the subscriber's Executor
                    final Runnable job = () -> subscriber.onEvent(event);
                    final Executor executor = subscriber.executor();

                    if (executor != null) {
                        executor.execute(job);
                    } else {
                        job.run(); // no executor provided: run the job synchronously on the current thread
                    }
                }
            }
        } catch (Exception e) {
            Loggers.EVT_LOG.error("Naming Event Publisher {}, stop to handle event due to unexpected exception: ",
                    this.publisherName, e);
        }
    }

}

That is the key code of NamingEventPublisher. Its logic is essentially the same as DefaultPublisher; the main difference is that subscribers are looked up by event type.
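
The per-type lookup can be sketched as a map from event class to subscriber set. `TypedSubscribersSketch` is a hypothetical class and `Consumer<Object>` stands in for `Subscriber`. The sketch also returns early when no subscriber is registered for the event's class, which is an assumption added here so that the lookup cannot fail on a missing entry.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical sketch: subscribers are stored per event class and looked up by event.getClass().
final class TypedSubscribersSketch {

    private final Map<Class<?>, Set<Consumer<Object>>> subscribes = new ConcurrentHashMap<>();

    void addSubscriber(Class<?> eventType, Consumer<Object> subscriber) {
        subscribes.computeIfAbsent(eventType, k -> ConcurrentHashMap.newKeySet()).add(subscriber);
    }

    /** Delivers the event to the subscribers of its exact class and returns how many were notified. */
    int dispatch(Object event) {
        Set<Consumer<Object>> targets = subscribes.get(event.getClass());
        if (targets == null) {
            return 0; // assumption: events without subscribers are simply skipped
        }
        for (Consumer<Object> subscriber : targets) {
            subscriber.accept(event);
        }
        return targets.size();
    }
}
```

Because the key is the exact class, one publisher thread can serve several event types while each subscriber only sees the types it registered for.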

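Event dispatch: TraceEventPublisher

The core logic is identical to NamingEventPublisher, so it is not expanded here.

Subscriber registration and removal

Subscribers are registered through EventPublisher#addSubscriber and removed through EventPublisher#removeSubscriber.
Registration happens when the service starts, removal when it stops.
Because this is a SpringBoot application, the entry points are hard to find; debugging and tracing backwards is the recommended approach. Some of the entry points: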

// com.alibaba.nacos.auth.config.AuthConfigs#AuthConfigs
public AuthConfigs() {
    NotifyCenter.registerSubscriber(this);
}
// com.alibaba.nacos.naming.core.DistroMapper#init
@PostConstruct
public void init() {
    NotifyCenter.registerSubscriber(this);
}
//com.alibaba.nacos.core.distributed.ProtocolManager#ProtocolManager
public ProtocolManager(ServerMemberManager memberManager) {
    NotifyCenter.registerSubscriber(this);
}
//com.alibaba.nacos.plugin.auth.impl.token.impl.JwtTokenManager#JwtTokenManager
public JwtTokenManager() {
    NotifyCenter.registerSubscriber(this);
}
//com.alibaba.nacos.core.config.AbstractDynamicConfig#AbstractDynamicConfig
protected AbstractDynamicConfig(String configName) {
    NotifyCenter.registerSubscriber(this);
}
//com.alibaba.nacos.naming.core.v2.upgrade.UpgradeStates#init
@PostConstruct
private void init() throws IOException {
    NotifyCenter.registerSubscriber(this);
}
    

Delay task execute engine

NacosDelayTaskExecuteEngine

Goal: explain why the delay engine ultimately calls NacosTaskProcessor#process
How it works: a single-threaded scheduled job that processes every pending task on each run
Extensions: PushDelayTaskExecuteEngine (naming push tasks), DistroDelayTaskExecuteEngine, DoubleWriteDelayTaskEngine.

public class NacosDelayTaskExecuteEngine extends AbstractNacosTaskExecuteEngine<AbstractDelayTask> {
    
    private final ConcurrentHashMap<Object, NacosTaskProcessor> taskProcessors = new ConcurrentHashMap<>();
    
    private final ScheduledExecutorService processingExecutor;
    protected final ConcurrentHashMap<Object, AbstractDelayTask> tasks;
    
    public NacosDelayTaskExecuteEngine(String name, int initCapacity, Logger logger, long processInterval) {
        super(logger);
        tasks = new ConcurrentHashMap<>(initCapacity);
        processingExecutor = ExecutorFactory.newSingleScheduledExecutorService(new NameThreadFactory(name));
        // A single thread runs ProcessRunnable#run on a fixed delay
        processingExecutor.scheduleWithFixedDelay(new ProcessRunnable(), processInterval, processInterval, TimeUnit.MILLISECONDS);
    }
    
    private class ProcessRunnable implements Runnable {
        @Override
        public void run() {
            try {
                //[] processTasks();
                Collection<Object> keys = getAllTaskKeys(); // all keys currently in tasks
                for (Object taskKey : keys) {
                    AbstractDelayTask task = removeTask(taskKey); // fetch and remove the task
                    // Look up the NacosTaskProcessor for this key in taskProcessors
                    NacosTaskProcessor processor = getProcessor(taskKey);
                    try {
                        // ***** call NacosTaskProcessor#process
                        if (!processor.process(task)) {
                            //[] retryFailedTask(taskKey, task);
                            task.setLastProcessTime(System.currentTimeMillis());
                            // Processing failed: add the task back so it is retried
                            addTask(taskKey, task);
                        }
                    } catch (Throwable e) {
                        getEngineLog().error("Nacos task execute error ", e);
                        //[] retryFailedTask(taskKey, task);
                        task.setLastProcessTime(System.currentTimeMillis());
                        // Processing failed: add the task back so it is retried
                        addTask(taskKey, task);
                    }
                }
            } catch (Throwable e) {
                getEngineLog().error(e.toString(), e);
            }
        }
    }
}
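
The "remove, process, re-add on failure" cycle can be tried out with a stripped-down engine. This is a sketch under assumptions, not the Nacos class: `DelayEngineSketch` and its nested `Processor` interface are hypothetical, `String` stands in for `AbstractDelayTask`, and a single processor replaces the per-key `taskProcessors` lookup. `processTasks()` is exposed so one pass can be run deterministically; `start` shows how the same pass would be scheduled with `scheduleWithFixedDelay`.

```java
import java.util.ArrayList;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical miniature of a delay task engine; a String stands in for AbstractDelayTask.
final class DelayEngineSketch {

    interface Processor {
        boolean process(String task);
    }

    private final ConcurrentHashMap<String, String> tasks = new ConcurrentHashMap<>();
    private final Processor processor;

    DelayEngineSketch(Processor processor) {
        this.processor = processor;
    }

    void addTask(String key, String task) {
        tasks.put(key, task); // a newer task for the same key replaces the older one in this sketch
    }

    int size() {
        return tasks.size();
    }

    /** One pass over all pending tasks; failed tasks are put back for the next pass. */
    void processTasks() {
        for (String key : new ArrayList<>(tasks.keySet())) {
            String task = tasks.remove(key);
            if (task == null) {
                continue; // removed concurrently
            }
            boolean ok;
            try {
                ok = processor.process(task);
            } catch (RuntimeException e) {
                ok = false; // treat an exception like a failed run
            }
            if (!ok) {
                tasks.put(key, task); // retry on the next pass
            }
        }
    }

    /** Runs processTasks() on a single thread with a fixed delay between runs. */
    ScheduledExecutorService start(long intervalMillis) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        executor.scheduleWithFixedDelay(this::processTasks, intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
        return executor;
    }
}
```

Keying tasks by a map means that repeated changes to the same key between two passes collapse into one pending entry, which is what makes the fixed-delay pass act as a debounce.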

Execute task execute engine

NacosExecuteTaskExecuteEngine

How it works: holds a set of task workers (executeWorkers) whose size is a multiple of the CPU count; each worker owns one blocking queue and one dedicated thread, which reduces thread switching. Note that the worker thread is created with setDaemon(false) in the code below, so it is not a daemon thread.
Extension: DistroExecuteTaskExecuteEngine (identical to NacosExecuteTaskExecuteEngine)

public class NacosExecuteTaskExecuteEngine extends AbstractNacosTaskExecuteEngine<AbstractExecuteTask> {
    
    private final TaskExecuteWorker[] executeWorkers;
    private final ConcurrentHashMap<Object, NacosTaskProcessor> taskProcessors = new ConcurrentHashMap<>();

    public NacosExecuteTaskExecuteEngine(String name, Logger logger, int dispatchWorkerCount) {
        super(logger);
        executeWorkers = new TaskExecuteWorker[dispatchWorkerCount];  // defaults to twice the CPU count
        for (int mod = 0; mod < dispatchWorkerCount; ++mod) {
            // TaskExecuteWorker is one thread plus one queue, see below
            executeWorkers[mod] = new TaskExecuteWorker(name, mod, dispatchWorkerCount, getEngineLog());
        }
    }

    public final class TaskExecuteWorker implements NacosTaskProcessor {

        // Task queue
        private final BlockingQueue<Runnable> queue;
        
        public TaskExecuteWorker(final String name, final int mod, final int total, final Logger logger) {
            this.name = name + "_" + mod + "%" + total;
            this.queue = new ArrayBlockingQueue<>(QUEUE_CAPACITY);
            this.closed = new AtomicBoolean(false);
            // A single dedicated thread executes the tasks (non-daemon, see setDaemon(false) below)
            new InnerWorker(name).start(); 
        }

        private class InnerWorker extends Thread {
            InnerWorker(String name) {
                setDaemon(false);
                setName(name);
            }
            
            @Override
            public void run() {
                while (!closed.get()) {
                    try {
                        Runnable task = queue.take(); // block until a task is queued
                        // Run task#run
                        task.run();
                    } catch (Throwable e) {
                        log.error("[TASK-FAILED] " + e.toString(), e);
                    }
                }
            }
        }

        // The method the nacos task engine calls: it simply enqueues the task
        public boolean process(NacosTask task) {
            if (task instanceof AbstractExecuteTask) {
                //[] putTask((Runnable) task); 
                try {
                    // Add the task to the queue
                    queue.put((Runnable) task);
                } catch (InterruptedException ire) {
                    log.error(ire.toString(), ire);
                }
            }
            return true;
        }
        
    }

}
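
A small runnable sketch of the "one queue plus one thread per worker" idea follows. It is an illustration under assumptions, not the Nacos engine: `WorkerPoolSketch` is a hypothetical class, and choosing the worker by `(hashCode & Integer.MAX_VALUE) % workerCount` is assumed here as one common way to route a key, since the routing code is not shown above. The property it demonstrates is that tasks submitted under the same key land on the same single-threaded worker and therefore run in submission order.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical miniature of a worker array where each worker is one queue plus one thread.
final class WorkerPoolSketch {

    private final BlockingQueue<Runnable>[] queues;

    @SuppressWarnings("unchecked")
    WorkerPoolSketch(int workerCount) {
        queues = new BlockingQueue[workerCount];
        for (int i = 0; i < workerCount; i++) {
            BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(1024);
            queues[i] = queue;
            Thread worker = new Thread(() -> {
                while (true) {
                    try {
                        queue.take().run(); // block for the next task, then run it
                    } catch (InterruptedException e) {
                        return;
                    } catch (RuntimeException e) {
                        // keep the worker alive when a task fails
                    }
                }
            }, "sketch-worker-" + i);
            worker.setDaemon(true); // daemon here only so the demo does not block JVM exit
            worker.start();
        }
    }

    /** Assumed routing rule: a non-negative hash modulo the worker count. */
    static int indexFor(Object key, int workerCount) {
        return (key.hashCode() & Integer.MAX_VALUE) % workerCount;
    }

    boolean submit(Object key, Runnable task) {
        return queues[indexFor(key, queues.length)].offer(task);
    }

    /** Submits three tasks under one key and returns the order in which they ran. */
    static List<Integer> demo() {
        WorkerPoolSketch pool = new WorkerPoolSketch(4);
        List<Integer> order = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(3);
        for (int i = 1; i <= 3; i++) {
            final int n = i;
            pool.submit("service-a", () -> {
                order.add(n);
                done.countDown();
            });
        }
        try {
            done.await(2, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return order;
    }
}
```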