Nacos Series (2): How the Nacos Registry Works


1. Architecture diagram from the official Nacos site

Basic architecture and concepts

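  • Service: A service is one or a group of software functions (for example, retrieving specific information or executing a set of operations) intended to be reused by different clients for different purposes (for example, through cross-process network calls). Nacos supports mainstream service ecosystems such as Kubernetes Service, gRPC|Dubbo RPC Service, and Spring Cloud RESTful Service.

  • Service Registry: The service registry is a database of services, their instances, and their metadata. Service instances register with the registry on startup and deregister on shutdown. Clients of services and routers query the registry to find the available instances of a service. The registry may call a service instance's health check API to verify that it can handle requests.

  • Service Metadata: The data that describes a service, including service endpoints, service labels, service version, instance weight, routing rules, and security policies.

  • Service Provider: The application that provides reusable and callable services.

  • Service Consumer: The application that initiates calls to a service.

  • Configuration: During system development, parameters and variables that need to change are usually separated from the code and managed independently as standalone configuration files. The goal is to let static system artifacts or deliverables (such as WAR and JAR packages) adapt better to the actual physical runtime environment. Configuration management is usually part of system deployment and is carried out by system administrators or operations staff. Changing configuration is one of the effective ways to adjust a system's runtime behavior.

  • Configuration Management: In a data center, all configuration-related activities, including editing, storage, distribution, change management, version history management, and change auditing, are collectively called configuration management.

  • Naming Service: Manages the mapping from the "names" of all objects and entities in a distributed system to their associated metadata, for example ServiceName -> Endpoints Info, Distributed Lock Name -> Lock Owner/Status Info, DNS Domain Name -> IP List. Service discovery and DNS are the two major scenarios for a naming service.

  • Configuration Service: The service provider that supplies dynamic configuration or metadata, together with configuration management, while a service or application is running.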

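2. How the registry works

Service registration mainly involves the following:

  • A service instance registers itself in the service registry through a listener once the container has finished starting, and deregisters on shutdown
  • A service consumer queries the service registry to obtain the available instances
  • The service registry calls the health check API of each service instance to verify that it is healthy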
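
The three responsibilities above can be sketched with a toy in-memory registry. This is only an illustration of the idea, not Nacos code: the class `SimpleRegistry`, its nested `Instance` type, and all method names are made up for this example, and the health flag is flipped by hand instead of by a real health check.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Toy in-memory registry; every name here is illustrative, not a Nacos API. */
class SimpleRegistry {

    /** One provider instance: its address plus a health flag kept by the registry. */
    static final class Instance {
        final String ip;
        final int port;
        volatile boolean healthy = true;

        Instance(String ip, int port) {
            this.ip = ip;
            this.port = port;
        }
    }

    // service name -> registered instances
    private final Map<String, List<Instance>> table = new ConcurrentHashMap<>();

    /** Called by a provider once its container has started. */
    Instance register(String service, String ip, int port) {
        Instance instance = new Instance(ip, port);
        table.computeIfAbsent(service, k -> new CopyOnWriteArrayList<>()).add(instance);
        return instance;
    }

    /** Called by a provider on shutdown. */
    void deregister(String service, Instance instance) {
        List<Instance> list = table.get(service);
        if (list != null) {
            list.remove(instance);
        }
    }

    /** Called by a consumer; only instances currently marked healthy are returned. */
    List<Instance> lookup(String service) {
        List<Instance> result = new ArrayList<>();
        for (Instance i : table.getOrDefault(service, Collections.<Instance>emptyList())) {
            if (i.healthy) {
                result.add(i);
            }
        }
        return result;
    }
}
```

In a real registry the `healthy` flag would be driven by heartbeats or active probes rather than set directly.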

2.1 Diagram of Nacos service registration and discovery

[Figure: Nacos service registration and discovery]

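3. Reading the Nacos registry source code

The Nacos registry consists mainly of three parts:

  • Service registration
  • Retrieving service addresses
  • Detecting changes to service addresses

3.1 When does Spring Cloud complete service registration?

The spring-cloud-commons module contains the interface org.springframework.cloud.client.serviceregistry.ServiceRegistry, which defines Spring Cloud's standard contract for service registration. Every component that integrates with Spring Cloud to provide service registration implements this interface.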

package org.springframework.cloud.client.serviceregistry;

/**
 * Contract to register and deregister instances with a Service Registry.
 *
 * @param <R> registration meta data
 * @author Spencer Gibb
 * @since 1.2.0
 */
public interface ServiceRegistry<R extends Registration> {
	void register(R registration);
	void deregister(R registration);
	void close();
	void setStatus(R registration, String status);
	<T> T getStatus(R registration);

...

  • How Spring Cloud integrates Nacos: META-INF/spring.factories in the spring-cloud-commons module contains the auto-configuration entries

# AutoConfiguration

org.springframework.boot.autoconfigure.EnableAutoConfiguration=\
org.springframework.cloud.client.CommonsClientAutoConfiguration,\
org.springframework.cloud.client.ReactiveCommonsClientAutoConfiguration,\
...
org.springframework.cloud.client.serviceregistry.AutoServiceRegistrationAutoConfiguration

Among these, AutoServiceRegistrationAutoConfiguration is the configuration class related to service registration.

@Configuration(proxyBeanMethods = false)
@Import(AutoServiceRegistrationConfiguration.class)
@ConditionalOnProperty(value = "spring.cloud.service-registry.auto-registration.enabled",
		matchIfMissing = true)
public class AutoServiceRegistrationAutoConfiguration {

	@Autowired(required = false)
	private AutoServiceRegistration autoServiceRegistration;

	@Autowired
	private AutoServiceRegistrationProperties properties;
	// Initialization method
	@PostConstruct
	protected void init() {
		if (this.autoServiceRegistration == null && this.properties.isFailFast()) {
			throw new IllegalStateException("Auto Service Registration has "
					+ "been requested, but there is no AutoServiceRegistration bean");
		}
	}

}

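The AutoServiceRegistrationAutoConfiguration class injects an AutoServiceRegistration instance. The abstract class AbstractAutoServiceRegistration implements that interface, and NacosAutoServiceRegistration in turn extends AbstractAutoServiceRegistration. The actual logic lives in the ApplicationListener event mechanism: once the web server has finished initializing (WebServerInitializedEvent), this.bind(event) is called.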

//  package org.springframework.cloud.client.serviceregistry;
	@Override
	@SuppressWarnings("deprecation")
	public void onApplicationEvent(WebServerInitializedEvent event) {
		bind(event);
	}

	@Deprecated
	public void bind(WebServerInitializedEvent event) {
		ApplicationContext context = event.getApplicationContext();
		if (context instanceof ConfigurableWebServerApplicationContext) {
			if ("management".equals(((ConfigurableWebServerApplicationContext) context)
					.getServerNamespace())) {
				return;
			}
		}
		this.port.compareAndSet(0, event.getWebServer().getPort());
		this.start();
	}
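
The bind logic above boils down to three steps: ignore the management context, record the port only once, then start registration. A minimal sketch without Spring might look like the following. The class name, the `registerCalls` counter, and the `running` guard are assumptions made for illustration; the real `start()` method is not shown in this article.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Spring-free sketch of the bind/start flow; all names are illustrative. */
class AutoRegistrationSketch {

    private final AtomicInteger port = new AtomicInteger(0);
    // Assumed guard so that repeated events do not register twice.
    private final AtomicBoolean running = new AtomicBoolean(false);
    // Counts how often registration would have been triggered.
    int registerCalls = 0;

    /** Stand-in for the callback fired once a web server is initialized. */
    void onWebServerReady(String serverNamespace, int actualPort) {
        if ("management".equals(serverNamespace)) {
            return; // the management server must not trigger registration
        }
        // Only the first port is recorded; later events leave it unchanged.
        port.compareAndSet(0, actualPort);
        start();
    }

    void start() {
        if (running.compareAndSet(false, true)) {
            registerCalls++; // placeholder for serviceRegistry.register(registration)
        }
    }

    int getPort() {
        return port.get();
    }
}
```

The `compareAndSet(0, ...)` call is what keeps the registered port stable: a port of 0 means "not yet known", and once it is filled in, subsequent events cannot overwrite it.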

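Registration then reaches the core class NacosServiceRegistry, which calls the Nacos client's registerInstance method shown below; this is the entry point for registration and heartbeats:

  • beatReactor.addBeatInfo creates the heartbeat information used for health checks. The Nacos server must make sure that a service instance is healthy, and with the heartbeat mechanism the client sends health information to the server
  • serverProxy.registerService performs the registration by sending the service information to the server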
    @Override
    public void registerInstance(String serviceName, String groupName, Instance instance) throws NacosException {

        if (instance.isEphemeral()) {
            BeatInfo beatInfo = new BeatInfo();
            beatInfo.setServiceName(NamingUtils.getGroupedName(serviceName, groupName));
            beatInfo.setIp(instance.getIp());
            beatInfo.setPort(instance.getPort());
            beatInfo.setCluster(instance.getClusterName());
            beatInfo.setWeight(instance.getWeight());
            beatInfo.setMetadata(instance.getMetadata());
            beatInfo.setScheduled(false);
            beatInfo.setPeriod(instance.getInstanceHeartBeatInterval());

            // Heartbeat check
            beatReactor.addBeatInfo(NamingUtils.getGroupedName(serviceName, groupName), beatInfo);
        }

        // Register the instance
        serverProxy.registerService(NamingUtils.getGroupedName(serviceName, groupName), groupName, instance);
    }

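Heartbeat mechanism: beatInfo.setPeriod sets the heartbeat interval, and addBeatInfo uses it to schedule the heartbeat task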

    public void addBeatInfo(String serviceName, BeatInfo beatInfo) {
        NAMING_LOGGER.info("[BEAT] adding beat: {} to beat map.", beatInfo);
        String key = buildKey(serviceName, beatInfo.getIp(), beatInfo.getPort());
        BeatInfo existBeat = null;
        //fix #1733
        if ((existBeat = dom2Beat.remove(key)) != null) {
            existBeat.setStopped(true);
        }
        dom2Beat.put(key, beatInfo);
        executorService.schedule(new BeatTask(beatInfo), beatInfo.getPeriod(), TimeUnit.MILLISECONDS);
        MetricsMonitor.getDom2BeatSizeMonitor().set(dom2Beat.size());
    }

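The client uses schedule to send heartbeat packets to the server at a fixed interval, and then starts a new thread to keep monitoring the server's response. The server keeps updating the service's status based on the client's heartbeat packets.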
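
To make the scheduling pattern concrete, here is a small sketch of a heartbeat that reschedules itself after every beat. It assumes the real beat task works in a similar self-rescheduling way, which this article does not show. The class `HeartbeatSketch`, the `beatsSent` counter, and `awaitBeats` are hypothetical, and the HTTP call to the server is replaced by a counter increment.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Self-rescheduling heartbeat sketch; names are illustrative, not Nacos APIs. */
class HeartbeatSketch {

    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "beat-sender");
                t.setDaemon(true); // do not keep the JVM alive just for heartbeats
                return t;
            });
    final AtomicInteger beatsSent = new AtomicInteger();
    private volatile boolean stopped = false;

    /** Schedules the first beat, mirroring executorService.schedule(...) in addBeatInfo. */
    void start(long periodMillis) {
        executor.schedule(() -> beat(periodMillis), periodMillis, TimeUnit.MILLISECONDS);
    }

    private void beat(long periodMillis) {
        if (stopped) {
            return; // a stopped beat neither sends nor reschedules
        }
        beatsSent.incrementAndGet(); // placeholder for the HTTP heartbeat request
        executor.schedule(() -> beat(periodMillis), periodMillis, TimeUnit.MILLISECONDS);
    }

    void stop() {
        stopped = true;
        executor.shutdownNow();
    }

    /** Waits until at least n beats were sent or the timeout elapses. */
    boolean awaitBeats(int n, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (beatsSent.get() < n) {
            if (System.nanoTime() > deadline) {
                return false;
            }
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

Rescheduling from inside the task, instead of using a fixed-rate schedule, makes it easy to change the interval between beats or to stop a single instance's heartbeat through its `stopped` flag.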

3.3 Reading the Nacos registration source code

Nacos supports service registration through both an SDK and an Open API.

curl -X POST 'http://127.0.0.1:8848/nacos/v1/ns/instance?serviceName=nacos.naming.serviceName&ip=127.0.0.1&port=8080'

Registration through the SDK also makes HTTP calls internally.

void registerInstance(String serviceName, String ip, int port) throws NacosException;
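
Since both paths end up as an HTTP request, the call can be sketched with the JDK's own HTTP client (Java 11+). The endpoint and parameters are the ones from the curl command in this section; the class `OpenApiRegisterSketch` and the method `buildRegisterRequest` are hypothetical helpers, not part of the Nacos SDK.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Builds the Open API registration request; helper names are illustrative. */
class OpenApiRegisterSketch {

    static HttpRequest buildRegisterRequest(String serverAddr, String serviceName,
                                            String ip, int port) {
        // Query parameters match the curl example: serviceName, ip, port.
        String query = "serviceName=" + URLEncoder.encode(serviceName, StandardCharsets.UTF_8)
                + "&ip=" + URLEncoder.encode(ip, StandardCharsets.UTF_8)
                + "&port=" + port;
        URI uri = URI.create("http://" + serverAddr + "/nacos/v1/ns/instance?" + query);
        return HttpRequest.newBuilder(uri)
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }
}
```

To actually send it against a running server, the request can be passed to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`.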

The Nacos server exposes this endpoint at /v1/ns/instance; the code is in InstanceController in the nacos-naming module.

    @CanDistro
    @PostMapping
    @Secured(
        parser = NamingResourceParser.class,
        action = ActionTypes.WRITE
    )
    public String register(HttpServletRequest request) throws Exception {
        String serviceName = WebUtils.required(request, "serviceName");
        String namespaceId = WebUtils.optional(request, "namespaceId", "public");
        Instance instance = this.parseInstance(request);
        this.serviceManager.registerInstance(namespaceId, serviceName, instance);
        return "ok";
    }

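registerInstance is the core of registration. Its main logic is:

  • Create an empty service and initialize serviceMap
  • getService: look up the service object in serviceMap by namespaceId and serviceName
  • Call addInstance to add the instance to the service object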
    public void registerInstance(String namespaceId, String serviceName, Instance instance) throws NacosException {

        createEmptyService(namespaceId, serviceName, instance.isEphemeral());

        Service service = getService(namespaceId, serviceName);

        if (service == null) {
            throw new NacosException(NacosException.INVALID_PARAM,
                "service not found, namespace: " + namespaceId + ", service: " + serviceName);
        }

        addInstance(namespaceId, serviceName, instance.isEphemeral(), instance);
    }

createEmptyService creates the empty service; its core method is createServiceIfAbsent:

  • Look up the Service in the cache by namespaceId and serviceName
  • If no Service exists, create one and store it in the cache
    public void createServiceIfAbsent(String namespaceId, String serviceName, boolean local, Cluster cluster) throws NacosException {
        Service service = getService(namespaceId, serviceName);
        if (service == null) {

            Loggers.SRV_LOG.info("creating empty service {}:{}", namespaceId, serviceName);
            service = new Service();
            service.setName(serviceName);
            service.setNamespaceId(namespaceId);
            service.setGroupName(NamingUtils.getGroupName(serviceName));
            // now validate the service. if failed, exception will be thrown
            service.setLastModifiedMillis(System.currentTimeMillis());
            service.recalculateChecksum();
            if (cluster != null) {
                cluster.setService(service);
                service.getClusterMap().put(cluster.getName(), cluster);
            }
            service.validate();

            putServiceAndInit(service);
            if (!local) {
                addOrReplaceService(service);
            }
        }
    }

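putServiceAndInit involves slightly more complex logic:

  • putService(service) caches the service in memory
  • service.init() sets up the heartbeat check mechanism
  • consistencyService.listen registers listeners for data consistency (I have not fully understood this part yet)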
    private void putServiceAndInit(Service service) throws NacosException {
        putService(service);
        service.init();
        consistencyService.listen(KeyBuilder.buildInstanceListKey(service.getNamespaceId(), service.getName(), true), service);
        consistencyService.listen(KeyBuilder.buildInstanceListKey(service.getNamespaceId(), service.getName(), false), service);
        Loggers.SRV_LOG.info("[NEW-SERVICE] {}", service.toJSON());
    }
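
The "get from cache, create if absent" steps can be illustrated with a two-level concurrent map keyed by namespace and then by service name. This is a simplified sketch under the assumption that the cache is shaped this way. The class `ServiceMapSketch` and its nested `Service` type are made up for this example, and the real code additionally initializes heartbeat checks and consistency listeners.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Two-level service cache sketch; names and structure are illustrative. */
class ServiceMapSketch {

    static final class Service {
        final String namespaceId;
        final String name;

        Service(String namespaceId, String name) {
            this.namespaceId = namespaceId;
            this.name = name;
        }
    }

    // namespaceId -> (serviceName -> Service)
    private final Map<String, Map<String, Service>> serviceMap = new ConcurrentHashMap<>();

    /** Returns the cached service, or null if it has not been created yet. */
    Service getService(String namespaceId, String serviceName) {
        Map<String, Service> byName = serviceMap.get(namespaceId);
        return byName == null ? null : byName.get(serviceName);
    }

    /** Creates the service only if it is missing; concurrent callers get the same object. */
    Service createServiceIfAbsent(String namespaceId, String serviceName) {
        return serviceMap
                .computeIfAbsent(namespaceId, ns -> new ConcurrentHashMap<>())
                .computeIfAbsent(serviceName, n -> new Service(namespaceId, n));
    }
}
```

Using `computeIfAbsent` keeps the check and the insert atomic, so two instances of the same service registering at the same moment cannot create two different Service objects.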


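To summarize the service registration flow:

  • The Nacos client sends a service registration request through the SDK (which uses HTTP internally) or the Open API
  • After receiving the request, the Nacos server does the following: 1. builds a Service object and stores it in a ConcurrentHashMap; 2. uses a scheduled task to set up heartbeat checks for the current service; 3. synchronizes the service data based on a data consistency protocol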

3.4 Querying service provider addresses

Fetch the service through the Open API: /v1/ns/instance/list?serviceName=serviceName

    @GetMapping({"/list"})
    @Secured(
        parser = NamingResourceParser.class,
        action = ActionTypes.READ
    )
    public ObjectNode list(HttpServletRequest request) throws Exception {
        String namespaceId = WebUtils.optional(request, "namespaceId", "public");
        String serviceName = WebUtils.required(request, "serviceName");
        String agent = WebUtils.getUserAgent(request);
        String clusters = WebUtils.optional(request, "clusters", "");
        String clientIP = WebUtils.optional(request, "clientIP", "");
        Integer udpPort = Integer.parseInt(WebUtils.optional(request, "udpPort", "0"));
        String env = WebUtils.optional(request, "env", "");
        boolean isCheck = Boolean.parseBoolean(WebUtils.optional(request, "isCheck", "false"));
        String app = WebUtils.optional(request, "app", "");
        String tenant = WebUtils.optional(request, "tid", "");
        boolean healthyOnly = Boolean.parseBoolean(WebUtils.optional(request, "healthyOnly", "false"));
        return this.doSrvIPXT(namespaceId, serviceName, agent, clusters, clientIP, udpPort, env, isCheck, app, tenant, healthyOnly);
    }

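doSrvIPXT retrieves the service list:

  • Get the Service instance by namespaceId and serviceName
  • Call srvIPs on the Service instance to get the instance information of all service providers
  • Iterate over the instances, assemble the JSON result, and return it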
  public ObjectNode doSrvIPXT(String namespaceId, String serviceName, String agent, String clusters, String clientIP, int udpPort, String env, boolean isCheck, String app, String tid, boolean healthyOnly) throws Exception {
        ClientInfo clientInfo = new ClientInfo(agent);
        ObjectNode result = JacksonUtils.createEmptyJsonNode();
        Service service = this.serviceManager.getService(namespaceId, serviceName);
        
            this.checkIfDisabled(service);
            long cacheMillis = this.switchDomain.getDefaultCacheMillis();
            List<Instance> srvedIPs = service.srvIPs(Arrays.asList(StringUtils.split(clusters, ",")));
            if (service.getSelector() != null && StringUtils.isNotBlank(clientIP)) {
                srvedIPs = service.getSelector().select(clientIP, srvedIPs);
            }

            if (CollectionUtils.isEmpty(srvedIPs)) {
                if (Loggers.SRV_LOG.isDebugEnabled()) {
                    Loggers.SRV_LOG.debug("no instance to serve for service: {}", serviceName);
                }

                if (clientInfo.type == ClientType.JAVA && clientInfo.version.compareTo(VersionUtil.parseVersion("1.0.0")) >= 0) {
                    result.put("dom", serviceName);
                } else {
                    result.put("dom", NamingUtils.getServiceName(serviceName));
                }

                result.put("hosts", JacksonUtils.createEmptyArrayNode());
                result.put("name", serviceName);
                result.put("cacheMillis", cacheMillis);
                result.put("lastRefTime", System.currentTimeMillis());
                result.put("checksum", service.getChecksum());
                result.put("useSpecifiedURL", false);
                result.put("clusters", clusters);
                result.put("env", env);
                result.put("metadata", JacksonUtils.transferToJsonNode(service.getMetadata()));
                return result;
            } else {
                Map<Boolean, List<Instance>> ipMap = new HashMap(2);
                ipMap.put(Boolean.TRUE, new ArrayList());
                ipMap.put(Boolean.FALSE, new ArrayList());
                Iterator var19 = srvedIPs.iterator();

                while(var19.hasNext()) {
                    Instance ip = (Instance)var19.next();
                    ((List)ipMap.get(ip.isHealthy())).add(ip);
                }

                if (isCheck) {
                    result.put("reachProtectThreshold", false);
                }

                double threshold = (double)service.getProtectThreshold();
                if ((double)((float)((List)ipMap.get(Boolean.TRUE)).size() / (float)srvedIPs.size()) <= threshold) {
                    Loggers.SRV_LOG.warn("protect threshold reached, return all ips, service: {}", serviceName);
                    if (isCheck) {
                        result.put("reachProtectThreshold", true);
                    }

                    ((List)ipMap.get(Boolean.TRUE)).addAll((Collection)ipMap.get(Boolean.FALSE));
                    ((List)ipMap.get(Boolean.FALSE)).clear();
                }

                if (isCheck) {
                    result.put("protectThreshold", service.getProtectThreshold());
                    result.put("reachLocalSiteCallThreshold", false);
                    return JacksonUtils.createEmptyJsonNode();
                } else {
                    ArrayNode hosts = JacksonUtils.createEmptyArrayNode();
                    Iterator var22 = ipMap.entrySet().iterator();

                    label114:
                    while(true) {
                        Entry entry;
                        List ips;
                        do {
                            if (!var22.hasNext()) {
                                result.replace("hosts", hosts);
                                if (clientInfo.type == ClientType.JAVA && clientInfo.version.compareTo(VersionUtil.parseVersion("1.0.0")) >= 0) {
                                    result.put("dom", serviceName);
                                } else {
                                    result.put("dom", NamingUtils.getServiceName(serviceName));
                                }

                                result.put("name", serviceName);
                                result.put("cacheMillis", cacheMillis);
                                result.put("lastRefTime", System.currentTimeMillis());
                                result.put("checksum", service.getChecksum());
                                result.put("useSpecifiedURL", false);
                                result.put("clusters", clusters);
                                result.put("env", env);
                                result.replace("metadata", JacksonUtils.transferToJsonNode(service.getMetadata()));
                                return result;
                            }

                            entry = (Entry)var22.next();
                            ips = (List)entry.getValue();
                        } while(healthyOnly && !(Boolean)entry.getKey());

                        Iterator var25 = ips.iterator();

                        while(true) {
                            Instance instance;
                            do {
                                if (!var25.hasNext()) {
                                    continue label114;
                                }

                                instance = (Instance)var25.next();
                            } while(!instance.isEnabled());

                            ObjectNode ipObj = JacksonUtils.createEmptyJsonNode();
                            ipObj.put("ip", instance.getIp());
                            ipObj.put("port", instance.getPort());
                            ipObj.put("valid", (Boolean)entry.getKey());
                            ipObj.put("healthy", (Boolean)entry.getKey());
                            ipObj.put("marked", instance.isMarked());
                            ipObj.put("instanceId", instance.getInstanceId());
                            ipObj.put("metadata", JacksonUtils.transferToJsonNode(instance.getMetadata()));
                            ipObj.put("enabled", instance.isEnabled());
                            ipObj.put("weight", instance.getWeight());
                            ipObj.put("clusterName", instance.getClusterName());
                            if (clientInfo.type == ClientType.JAVA && clientInfo.version.compareTo(VersionUtil.parseVersion("1.0.0")) >= 0) {
                                ipObj.put("serviceName", instance.getServiceName());
                            } else {
                                ipObj.put("serviceName", NamingUtils.getServiceName(instance.getServiceName()));
                            }

                            ipObj.put("ephemeral", instance.isEphemeral());
                            hosts.add(ipObj);
                        }
                    }
                }
            }
        }
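
One part of doSrvIPXT that is easy to miss in the decompiled code is the protect threshold: when the share of healthy instances drops to the threshold or below, all instances are returned so that the few healthy ones are not overwhelmed. The sketch below isolates that rule for the healthyOnly case. The class `ProtectThresholdSketch`, its `Instance` type, and the `select` method are illustrative names, not Nacos APIs.

```java
import java.util.ArrayList;
import java.util.List;

/** Isolated sketch of the protect-threshold rule; names are illustrative. */
class ProtectThresholdSketch {

    static final class Instance {
        final String ip;
        final boolean healthy;

        Instance(String ip, boolean healthy) {
            this.ip = ip;
            this.healthy = healthy;
        }
    }

    /**
     * Returns only healthy instances, unless the healthy ratio is at or below
     * the threshold, in which case every instance is returned.
     */
    static List<Instance> select(List<Instance> all, float protectThreshold) {
        if (all.isEmpty()) {
            return new ArrayList<>(); // nothing to serve, and avoids dividing by zero
        }
        List<Instance> healthy = new ArrayList<>();
        for (Instance i : all) {
            if (i.healthy) {
                healthy.add(i);
            }
        }
        float ratio = (float) healthy.size() / (float) all.size();
        if (ratio <= protectThreshold) {
            return new ArrayList<>(all); // threshold reached: fall back to all instances
        }
        return healthy;
    }
}
```

For example, with one healthy instance out of three and a threshold of 0.5, the ratio of about 0.33 is below the threshold, so all three instances are returned.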

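3.5 Dynamic awareness of service addresses in Nacos

A service consumer needs to obtain the provider address list and also watch for abnormal changes in the service. The Nacos client implements dynamic service updates through the HostReactor class.

  • After the client subscribes to events, HostReactor starts an updateTask thread that sends a pull request every 10 s to fetch the latest address list from the server
  • The server maintains the heartbeat mechanism; once a service provider becomes abnormal, it sends a push message to the Nacos client
  • After the service consumer receives the message, processServiceJSON in HostReactor parses it and updates the local service address list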
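
The pull and push paths above both end in the same step: replace the local address list and detect whether it changed. A minimal sketch of that step, plus a periodic pull, is shown below. The class `HostCacheSketch`, its methods, and the `fetcher` supplier are hypothetical stand-ins; the real HostReactor parses JSON from the server and notifies subscribers.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Local address cache with change detection; names are illustrative. */
class HostCacheSketch {

    // service name -> sorted set of "ip:port" addresses
    private final Map<String, Set<String>> cache = new ConcurrentHashMap<>();

    /** Applies a pulled or pushed address list; returns true if the list changed. */
    synchronized boolean processUpdate(String serviceName, Collection<String> addresses) {
        Set<String> fresh = new TreeSet<>(addresses);
        Set<String> old = cache.put(serviceName, fresh);
        return !fresh.equals(old); // also true on the first update, when old is null
    }

    Set<String> getAddresses(String serviceName) {
        return cache.getOrDefault(serviceName, Collections.<String>emptySet());
    }

    /** Schedules the periodic pull; fetcher stands in for the HTTP query to the server. */
    ScheduledFuture<?> startPulling(ScheduledExecutorService executor, String serviceName,
                                    Supplier<Collection<String>> fetcher, long periodSeconds) {
        return executor.scheduleWithFixedDelay(
                () -> processUpdate(serviceName, fetcher.get()),
                0, periodSeconds, TimeUnit.SECONDS);
    }
}
```

A push message from the server would call `processUpdate` directly, so consumers see a provider failure sooner than the next scheduled pull.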

[Figure: dynamic service awareness]