Spring Cloud Eureka Source Code Analysis


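Overview

This article analyzes how Eureka works and walks through its main flows from the perspective of the Eureka source code. It opens with demo configurations for a clustered server and for a client.

Eureka Timeline

Notice that parts of Spring Cloud Netflix have stopped receiving updates

In short, Eureka 2.x is no longer being developed, while Eureka 1.x has entered maintenance mode.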

spring.io/blog/2019/0…

Spring Cloud Hoxton release notes

The headline feature of the Hoxton release is support for reactive programming.

spring.io/blog/2019/1…

Cluster Mode Configuration

Versions:

spring-boot 2.4.2

spring-cloud 2020.0.1

Server configuration

  1. pom.xml

    <dependencies>
      <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-netflix-eureka-server</artifactId>
      </dependency>
    </dependencies>  
    
    <dependencyManagement>
      <dependencies>
        <!--spring boot 2.4.2 -->
        <dependency>
          <groupId>org.springframework.boot</groupId>
          <artifactId>spring-boot-dependencies</artifactId>
          <version>${spring-boot.version}</version>
          <type>pom</type>
          <scope>import</scope>
        </dependency>
        <!--spring cloud 2020.0.1 -->
        <dependency>
          <groupId>org.springframework.cloud</groupId>
          <artifactId>spring-cloud-dependencies</artifactId>
          <version>${spring-cloud.version}</version>
          <type>pom</type>
          <scope>import</scope>
        </dependency>
      </dependencies>
    </dependencyManagement>   
    
  2. application.yml

    server:
      port: 3001
    eureka:
      server:
        enable-self-preservation: false # disable self-preservation mode
        eviction-interval-timer-in-ms: 4000 # interval between eviction runs, in milliseconds
      instance:
        hostname: eureka3000
      client:
        register-with-eureka: false # do not register this server with itself as a client
        fetch-registry: false # no need to fetch registry information from the server
        service-url:
          defaultZone: http://eureka3001.com:3001/eureka,http://eureka3001.com:3002/eureka,http://eureka3001.com:3003/eureka
    
  3. Main class

    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.cloud.netflix.eureka.server.EnableEurekaServer;
    
    @SpringBootApplication
    @EnableEurekaServer
    public class EurekaServerApplication {
    
        public static void main(String[] args) {
            SpringApplication.run(EurekaServerApplication.class, args);
        }
    }
    

Client configuration

  1. pom.xml

    <dependencies>
      <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-netflix-eureka-client</artifactId>
      </dependency>
    
      <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
      </dependency>
    </dependencies>
    
  2. application.yml

    server:
      port: 6000
    
    eureka:
      client:
        serviceUrl:
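          # Note: this key must be camelCase; written with hyphens, the setting is not picked up
          defaultZone: http://127.0.0.1:3000/eureka # registration address exposed by the Eureka server
      instance:
        instance-id: power-1 # unique ID under which this instance registers with the Eureka server
        prefer-ip-address: true # whether to show the IP address
        leaseRenewalIntervalInSeconds: 10 # how often the client sends a heartbeat to the Eureka server to show it is still alive; default 30 (this and the next setting are in seconds)
        leaseExpirationDurationInSeconds: 30 # how long the Eureka server waits after the last heartbeat before it may remove this instance; default 90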
    
    spring:
      application:
        name: service-client
    
  3. Main class

    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    
    @SpringBootApplication
    public class EurekaClientApplication {
    
        public static void main(String[] args) {
            SpringApplication.run(EurekaClientApplication.class, args);
        }
    }
    

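Spring Boot custom starter

Eureka Server Source Code

Eureka does not use Spring MVC as its web framework; it is built on Jersey. The difference from Spring MVC is that incoming requests are dispatched through a Filter.

I. Service Registration

Registration data is stored in a new ConcurrentHashMap<String, Map<String, Lease<InstanceInfo>>>(). The outer String key is the service name, the inner String key is the instance ID, Lease is the lease wrapper, and InstanceInfo holds the information about the service instance.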

// Service registration
// AbstractInstanceRegistry
// registrant: the registration data sent with this request
public void register(InstanceInfo registrant, int leaseDuration, boolean isReplication) {
        read.lock();
        try {
            // registry: where all registration data is stored
            // gMap: the group of instances looked up by service name
            Map<String, Lease<InstanceInfo>> gMap = registry.get(registrant.getAppName());
            REGISTER.increment(isReplication);
            if (gMap == null) {
                final ConcurrentHashMap<String, Lease<InstanceInfo>> gNewMap = new ConcurrentHashMap<String, Lease<InstanceInfo>>();
                gMap = registry.putIfAbsent(registrant.getAppName(), gNewMap);
                if (gMap == null) {
                    gMap = gNewMap;
                }
            }
            // the lease that already exists for this instance, if any
            Lease<InstanceInfo> existingLease = gMap.get(registrant.getId());
            // Retain the last dirty timestamp without overwriting it, if there is already a lease
            if (existingLease != null && (existingLease.getHolder() != null)) {
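                // last dirty timestamp of the instance that is already registered
                Long existingLastDirtyTimestamp = existingLease.getHolder().getLastDirtyTimestamp();
                // last dirty timestamp of the instance sent with this request
                Long registrationLastDirtyTimestamp = registrant.getLastDirtyTimestamp();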
                Long registrationLastDirtyTimestamp = registrant.getLastDirtyTimestamp();
                logger.debug("Existing lease found (existing={}, provided={}", existingLastDirtyTimestamp, registrationLastDirtyTimestamp);

                // this is a > instead of a >= because if the timestamps are equal, we still take the remote transmitted
                // InstanceInfo instead of the server local copy.
                // the copy with the newer timestamp wins; the existing one is kept only if it is strictly newer
                if (existingLastDirtyTimestamp > registrationLastDirtyTimestamp) {
                    logger.warn("There is an existing lease and the existing lease's dirty timestamp {} is greater" +
                            " than the one that is being registered {}", existingLastDirtyTimestamp, registrationLastDirtyTimestamp);
                    logger.warn("Using the existing instanceInfo instead of the new instanceInfo as the registrant");
                    // getHolder() returns the actual instance object
                    registrant = existingLease.getHolder();
                }
            } else {
                // The lease does not exist and hence it is a new registration
                synchronized (lock) {
                    if (this.expectedNumberOfClientsSendingRenews > 0) {
                        // Since the client wants to register it, increase the number of clients sending renews
                        this.expectedNumberOfClientsSendingRenews = this.expectedNumberOfClientsSendingRenews + 1;
                        // a new client was added, so recompute the self-preservation threshold
                        updateRenewsPerMinThreshold();
                    }
                }
                logger.debug("No previous lease information found; it is new registration");
            }
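            // wrap the instance in a Lease
            Lease<InstanceInfo> lease = new Lease<InstanceInfo>(registrant, leaseDuration);
            if (existingLease != null) {
                // carry over the service-up timestamp from the existing lease
                lease.setServiceUpTimestamp(existingLease.getServiceUpTimestamp());
            }
            // store the lease: this is the actual registration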
            gMap.put(registrant.getId(), lease);
            
            recentRegisteredQueue.add(new Pair<Long, String>(
                    System.currentTimeMillis(),
                    registrant.getAppName() + "(" + registrant.getId() + ")"));
            // This is where the initial state transfer of overridden status happens
            if (!InstanceStatus.UNKNOWN.equals(registrant.getOverriddenStatus())) {
                logger.debug("Found overridden status {} for instance {}. Checking to see if needs to be add to the "
                                + "overrides", registrant.getOverriddenStatus(), registrant.getId());
                if (!overriddenInstanceStatusMap.containsKey(registrant.getId())) {
                    logger.info("Not found overridden id {} and hence adding it", registrant.getId());
                    overriddenInstanceStatusMap.put(registrant.getId(), registrant.getOverriddenStatus());
                }
            }
            InstanceStatus overriddenStatusFromMap = overriddenInstanceStatusMap.get(registrant.getId());
            if (overriddenStatusFromMap != null) {
                logger.info("Storing overridden status {} from map", overriddenStatusFromMap);
                registrant.setOverriddenStatus(overriddenStatusFromMap);
            }

            // Set the status based on the overridden status rules
            InstanceStatus overriddenInstanceStatus = getOverriddenInstanceStatus(registrant, existingLease, isReplication);
            registrant.setStatusWithoutDirty(overriddenInstanceStatus);

            // If the lease is registered with UP status, set lease service up timestamp
            if (InstanceStatus.UP.equals(registrant.getStatus())) {
                lease.serviceUp();
            }
            registrant.setActionType(ActionType.ADDED);
            recentlyChangedQueue.add(new RecentlyChangedItem(lease));
            registrant.setLastUpdatedTimestamp();
            invalidateCache(registrant.getAppName(), registrant.getVIPAddress(), registrant.getSecureVipAddress());
            logger.info("Registered instance {}/{} with status {} (replication={})",
                    registrant.getAppName(), registrant.getId(), registrant.getStatus(), isReplication);
        } finally {
            read.unlock();
        }
    }
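
To make the storage layout and the dirty-timestamp rule easier to see, here is a minimal sketch in plain Java. It is not Eureka's code: RegistrySketch and Instance are hypothetical names, and leases, locking, overridden status, and cache invalidation are all left out. It only shows the nested map and the rule that the existing entry is kept when its lastDirtyTimestamp is strictly greater than the incoming one.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified model of the two-level registry map (not Eureka's code).
class RegistrySketch {

    // Stand-in for InstanceInfo with only the fields this illustration needs.
    static final class Instance {
        final String appName;
        final String id;
        final long lastDirtyTimestamp;

        Instance(String appName, String id, long lastDirtyTimestamp) {
            this.appName = appName;
            this.id = id;
            this.lastDirtyTimestamp = lastDirtyTimestamp;
        }
    }

    // Outer key: application name. Inner key: instance ID.
    private final ConcurrentHashMap<String, Map<String, Instance>> registry = new ConcurrentHashMap<>();

    // Stores the registrant and returns the copy that was actually kept.
    Instance register(Instance registrant) {
        Map<String, Instance> group =
                registry.computeIfAbsent(registrant.appName, k -> new ConcurrentHashMap<>());
        Instance existing = group.get(registrant.id);
        Instance kept = registrant;
        // Strictly greater: on equal timestamps the incoming copy replaces the stored one.
        if (existing != null && existing.lastDirtyTimestamp > registrant.lastDirtyTimestamp) {
            kept = existing;
        }
        group.put(kept.id, kept);
        return kept;
    }

    int instanceCount(String appName) {
        Map<String, Instance> group = registry.get(appName);
        return group == null ? 0 : group.size();
    }

    public static void main(String[] args) {
        RegistrySketch registry = new RegistrySketch();
        registry.register(new Instance("SERVICE-CLIENT", "power-1", 200L));
        // An older copy of the same instance arrives afterwards.
        Instance kept = registry.register(new Instance("SERVICE-CLIENT", "power-1", 100L));
        System.out.println("kept timestamp: " + kept.lastDirtyTimestamp);
        System.out.println("instances: " + registry.instanceCount("SERVICE-CLIENT"));
    }
}
```

Registering the same instance ID twice never creates a second entry; only the stored copy changes.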

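II. Service Eviction

A timer task periodically checks whether each lease has expired by comparing the current time with the expiry time derived from lastUpdateTimestamp.

Expiry check: current system time > lastUpdateTimestamp + 30 s (the lease expiration duration)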
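
The expiry check above can be sketched as follows. This is a simplified model under stated assumptions, not Eureka's Lease class: LeaseSketch is a hypothetical name, the clock is passed in explicitly so the behaviour is easy to follow, and the 30 s duration mirrors leaseExpirationDurationInSeconds from the client configuration.

```java
// Hypothetical, simplified lease: tracks only the last renewal time and the lease duration.
class LeaseSketch {
    private final long durationMs;
    private long lastUpdateTimestamp;

    LeaseSketch(long durationSeconds, long nowMs) {
        this.durationMs = durationSeconds * 1000L;
        this.lastUpdateTimestamp = nowMs;
    }

    // A heartbeat moves the last-update timestamp forward.
    void renew(long nowMs) {
        lastUpdateTimestamp = nowMs;
    }

    // Expired when: now > lastUpdateTimestamp + duration (+ optional compensation time).
    boolean isExpired(long nowMs, long additionalLeaseMs) {
        return nowMs > lastUpdateTimestamp + durationMs + additionalLeaseMs;
    }

    public static void main(String[] args) {
        LeaseSketch lease = new LeaseSketch(30, 0L);
        System.out.println(lease.isExpired(30_000L, 0L)); // false: exactly at the limit
        System.out.println(lease.isExpired(30_001L, 0L)); // true: one millisecond past it
        lease.renew(25_000L);
        System.out.println(lease.isExpired(30_001L, 0L)); // false: the heartbeat extended the lease
    }
}
```

The additionalLeaseMs parameter plays the same role as the argument of evict: extra grace time added on top of the lease duration.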

  1. Evicting instances

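    // Runs on the eviction timer (eviction-interval-timer-in-ms) and iterates over all leases; at most 15% of the registry is evicted per run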
    public void evict(long additionalLeaseMs) {
      logger.debug("Running the evict task");
    
      if (!isLeaseExpirationEnabled()) {
        logger.debug("DS: lease expiration is currently disabled.");
        return;
      }
    
      // We collect first all expired items, to evict them in random order. For large eviction sets,
      // if we do not that, we might wipe out whole apps before self preservation kicks in. By randomizing it,
      // the impact should be evenly distributed across all applications.
      // iterate over all leases and collect the expired ones into a list
      List<Lease<InstanceInfo>> expiredLeases = new ArrayList<>();
      for (Entry<String, Map<String, Lease<InstanceInfo>>> groupEntry : registry.entrySet()) {
        Map<String, Lease<InstanceInfo>> leaseMap = groupEntry.getValue();
        if (leaseMap != null) {
          for (Entry<String, Lease<InstanceInfo>> leaseEntry : leaseMap.entrySet()) {
            Lease<InstanceInfo> lease = leaseEntry.getValue();
            // check whether the lease has expired
            if (lease.isExpired(additionalLeaseMs) && lease.getHolder() != null) {
              expiredLeases.add(lease);
            }
          }
        }
      }
    
      // To compensate for GC pauses or drifting local time, we need to use current registry size as a base for
      // triggering self-preservation. Without that we would wipe out full registry.
      // if too many leases would be evicted, evict only part of them in this run
      int registrySize = (int) getLocalRegistrySize();
      // serverConfig.getRenewalPercentThreshold() defaults to 0.85 (85%)
      int registrySizeThreshold = (int) (registrySize * serverConfig.getRenewalPercentThreshold());
      int evictionLimit = registrySize - registrySizeThreshold;
    
      int toEvict = Math.min(expiredLeases.size(), evictionLimit);
      if (toEvict > 0) {
        logger.info("Evicting {} items (expired={}, evictionLimit={})", toEvict, expiredLeases.size(), evictionLimit);
        // evict in random order, using a shuffle-style algorithm
        Random random = new Random(System.currentTimeMillis());
        for (int i = 0; i < toEvict; i++) {
          // Pick a random item (Knuth shuffle algorithm)
          int next = i + random.nextInt(expiredLeases.size() - i);
          Collections.swap(expiredLeases, i, next);
          Lease<InstanceInfo> lease = expiredLeases.get(i);
    
          String appName = lease.getHolder().getAppName();
          String id = lease.getHolder().getId();
          EXPIRED.increment();
          logger.warn("DS: Registry: expired lease for {}/{}", appName, id);
          internalCancel(appName, id, false);
        }
      }
    }
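
The two ideas in evict, a cap on how many leases may go in one run and a random choice among the expired ones, can be isolated in a small sketch. EvictionSketch and its method names are hypothetical and only mirror the arithmetic and the partial Knuth shuffle shown above; nothing here touches a real registry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper mirroring two ideas from evict(): a per-run cap and a random pick.
class EvictionSketch {

    // Upper bound on evictions in one run: registry size minus the share that must be kept.
    static int evictionLimit(int registrySize, double renewalPercentThreshold) {
        int registrySizeThreshold = (int) (registrySize * renewalPercentThreshold);
        return registrySize - registrySizeThreshold;
    }

    // Picks up to `toEvict` random elements by running only the first steps of a Knuth shuffle.
    static <T> List<T> pickRandom(List<T> expired, int toEvict, Random random) {
        List<T> copy = new ArrayList<>(expired);
        int n = Math.min(toEvict, copy.size());
        for (int i = 0; i < n; i++) {
            // Swap a random element from the not-yet-chosen tail into position i.
            int next = i + random.nextInt(copy.size() - i);
            Collections.swap(copy, i, next);
        }
        return new ArrayList<>(copy.subList(0, n));
    }

    public static void main(String[] args) {
        int limit = evictionLimit(100, 0.85);
        List<String> expired = Arrays.asList("a", "b", "c", "d", "e");
        int toEvict = Math.min(expired.size(), limit);
        System.out.println("limit=" + limit + ", evicting " + pickRandom(expired, toEvict, new Random()));
    }
}
```

With 100 registered instances and the 0.85 threshold, at most 15 leases are evicted per run, however many have expired.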
    

III. Heartbeat

A heartbeat updates the lease's last-update timestamp.

// InstanceResource
@PUT
public Response renewLease(
  @HeaderParam(PeerEurekaNode.HEADER_REPLICATION) String isReplication,
  @QueryParam("overriddenstatus") String overriddenStatus,
  @QueryParam("status") String status,
  @QueryParam("lastDirtyTimestamp") String lastDirtyTimestamp) {
  boolean isFromReplicaNode = "true".equals(isReplication);
  boolean isSuccess = registry.renew(app.getName(), id, isFromReplicaNode);

  // Not found in the registry, immediately ask for a register
  if (!isSuccess) {
    logger.warn("Not Found (Renew): {} - {}", app.getName(), id);
    return Response.status(Status.NOT_FOUND).build();
  }
  // Check if we need to sync based on dirty time stamp, the client
  // instance might have changed some value
  Response response;
  if (lastDirtyTimestamp != null && serverConfig.shouldSyncWhenTimestampDiffers()) {
    response = this.validateDirtyTimestamp(Long.valueOf(lastDirtyTimestamp), isFromReplicaNode);
    // Store the overridden status since the validation found out the node that replicates wins
    if (response.getStatus() == Response.Status.NOT_FOUND.getStatusCode()
        && (overriddenStatus != null)
        && !(InstanceStatus.UNKNOWN.name().equals(overriddenStatus))
        && isFromReplicaNode) {
      registry.storeOverriddenStatusIfRequired(app.getAppName(), id, InstanceStatus.valueOf(overriddenStatus));
    }
  } else {
    response = Response.ok().build();
  }
  logger.debug("Found (Renew): {} - {}; reply status={}", app.getName(), id, response.getStatus());
  return response;
}
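
The comment "Not found in the registry, immediately ask for a register" describes a small protocol: a heartbeat for an unknown lease is answered with 404, and the client is expected to register again. A minimal in-memory sketch of that exchange follows. HeartbeatSketch is a hypothetical class that plays both the server and the client role, and the client-side reaction shown here is an assumption drawn from that comment rather than a copy of the client code.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the heartbeat exchange; only the status codes are borrowed from HTTP.
class HeartbeatSketch {
    static final int OK = 200;
    static final int NOT_FOUND = 404;

    // Server-side state: the instance IDs that currently hold a lease.
    private final Set<String> leases = new HashSet<>();

    // Server side: renewing an unknown lease is answered with 404.
    int renew(String instanceId) {
        return leases.contains(instanceId) ? OK : NOT_FOUND;
    }

    // Server side: registration creates the lease.
    void register(String instanceId) {
        leases.add(instanceId);
    }

    // Client side (assumed behaviour): on 404, register again.
    // Returns true if this heartbeat had to fall back to a registration.
    boolean heartbeat(String instanceId) {
        if (renew(instanceId) == NOT_FOUND) {
            register(instanceId);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        HeartbeatSketch eureka = new HeartbeatSketch();
        System.out.println(eureka.heartbeat("power-1")); // true: lease unknown, re-registered
        System.out.println(eureka.heartbeat("power-1")); // false: plain renewal
    }
}
```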

IV. Cluster Replication

  1. Cluster synchronization is handled by PeerAwareInstanceRegistryImpl.

  2. When a client node registers with a server node, the receiving server node forwards the client's request to the other server nodes.

    // cluster replication
    private void replicateToPeers(Action action, String appName, String id,
                                  InstanceInfo info /* optional */,
                                  InstanceStatus newStatus /* optional */, boolean isReplication) {
      Stopwatch tracer = action.getTimer().start();
      try {
        if (isReplication) {
          numberOfReplicationsLastMin.increment();
        }
        // If it is a replication already, do not replicate again as this will create a poison replication
        // Stop if there are no peer nodes (peerEurekaNodes) or the request is itself a replication (isReplication); this prevents an endless registration loop
        if (peerEurekaNodes == Collections.EMPTY_LIST || isReplication) {
          return;
        }
    		
        // replicate to every peer node
        for (final PeerEurekaNode node : peerEurekaNodes.getPeerEurekaNodes()) {
          // If the url represents this host, do not replicate to yourself.
          if (peerEurekaNodes.isThisMyUrl(node.getServiceUrl())) { // skip the current node
            continue;
          }
          replicateInstanceActionsToPeers(action, appName, id, info, newStatus, node);
        }
      } finally {
        tracer.stop();
      }
    }
    
  3. Handling replication events: replicateInstanceActionsToPeers

    // handles one replication event for one peer
    private void replicateInstanceActionsToPeers(Action action, String appName,
                                                 String id, InstanceInfo info, InstanceStatus newStatus,
                                                 PeerEurekaNode node) {
      try {
        InstanceInfo infoFromRegistry;
        CurrentRequestVersion.set(Version.V2);
        switch (action) {
          case Cancel:    // cancel
            node.cancel(appName, id);
            break;
          case Heartbeat: // heartbeat
            InstanceStatus overriddenStatus = overriddenInstanceStatusMap.get(id);
            infoFromRegistry = getInstanceByAppAndId(appName, id, false);
            node.heartbeat(appName, id, infoFromRegistry, overriddenStatus, false);
            break;
          case Register:  // register
            node.register(info);
            break;
          case StatusUpdate:
            infoFromRegistry = getInstanceByAppAndId(appName, id, false);
            node.statusUpdate(appName, id, newStatus, infoFromRegistry);
            break;
          case DeleteStatusOverride:
            infoFromRegistry = getInstanceByAppAndId(appName, id, false);
            node.deleteStatusOverride(appName, id, infoFromRegistry);
            break;
        }
      } catch (Throwable t) {
        logger.error("Cannot replicate information to {} for action {}", node.getServiceUrl(), action.name(), t);
      } finally {
        CurrentRequestVersion.remove();
      }
    }
    
  4. Registry synchronization at startup

    public int syncUp() {
      // Copy entire entry from neighboring DS node
      int count = 0;
    
      for (int i = 0; ((i < serverConfig.getRegistrySyncRetries()) && (count == 0)); i++) {
        if (i > 0) {
          try {
            Thread.sleep(serverConfig.getRegistrySyncRetryWaitMs());
          } catch (InterruptedException e) {
            logger.warn("Interrupted during registry transfer..");
            break;
          }
        }
        Applications apps = eurekaClient.getApplications();
        for (Application app : apps.getRegisteredApplications()) {
          for (InstanceInfo instance : app.getInstances()) {
            try {
              if (isRegisterable(instance)) {
                register(instance, instance.getLeaseInfo().getDurationInSecs(), true);
                count++;
              }
            } catch (Throwable t) {
              logger.error("During DS init copy", t);
            }
          }
        }
      }
      return count;
    }
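
The role of the isReplication flag is easiest to see in a toy model. The sketch below is an assumption-laden illustration, not Eureka's implementation: PeerNodeSketch is a hypothetical class, peers call each other directly instead of over HTTP, and only the register action is modelled. It shows why a replicated request must not be forwarded again.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory model of peer replication; no HTTP is involved.
class PeerNodeSketch {
    final String name;
    final Set<String> instances = new HashSet<>();
    final List<PeerNodeSketch> peers = new ArrayList<>();
    int registerCalls = 0;

    PeerNodeSketch(String name) {
        this.name = name;
    }

    void register(String instanceId, boolean isReplication) {
        registerCalls++;
        instances.add(instanceId);
        // A replicated request is never forwarded again; otherwise peers would echo it forever.
        if (isReplication) {
            return;
        }
        for (PeerNodeSketch peer : peers) {
            if (peer == this) {
                continue; // do not replicate to yourself
            }
            peer.register(instanceId, true);
        }
    }

    public static void main(String[] args) {
        PeerNodeSketch a = new PeerNodeSketch("a");
        PeerNodeSketch b = new PeerNodeSketch("b");
        PeerNodeSketch c = new PeerNodeSketch("c");
        // Every node knows the full peer list, including itself.
        for (PeerNodeSketch node : Arrays.asList(a, b, c)) {
            node.peers.addAll(Arrays.asList(a, b, c));
        }
        a.register("power-1", false); // a client registers with node a
        for (PeerNodeSketch node : Arrays.asList(a, b, c)) {
            System.out.println(node.name + " knows power-1: " + node.instances.contains("power-1")
                    + ", register calls: " + node.registerCalls);
        }
    }
}
```

Each node ends up with the instance after exactly one register call, which is the property the isReplication check protects.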
    

V. Self-Preservation

  1. Whether self-preservation mode is enabled
// Configuration switch: isSelfPreservationModeEnabled
public boolean isSelfPreservationModeEnabled() {
  return serverConfig.shouldEnableSelfPreservation();
}
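  2. Trigger condition: when a large share of heartbeats stops arriving within a short time, which looks like a mass outage, Eureka enters self-preservation mode. The threshold for "a large share" is 85% of the expected renewals.
// Threshold that triggers self-preservation: numberOfRenewsPerMinThreshold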
public boolean isLeaseExpirationEnabled() {
  if (!isSelfPreservationModeEnabled()) {
    // The self preservation mode is disabled, hence allowing the instances to expire.
    return true;
  }
  return numberOfRenewsPerMinThreshold > 0 && getNumOfRenewsInLastMin() > numberOfRenewsPerMinThreshold;
}
  3. Self-preservation bookkeeping during service registration
// The lease does not exist and hence it is a new registration
synchronized (lock) {
  if (this.expectedNumberOfClientsSendingRenews > 0) {
    // Since the client wants to register it, increase the number of clients sending renews
    this.expectedNumberOfClientsSendingRenews = this.expectedNumberOfClientsSendingRenews + 1;
    updateRenewsPerMinThreshold();
  }
}

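Updating the self-preservation threshold: updateRenewsPerMinThreshold

// getExpectedClientRenewalIntervalSeconds defaults to 30 s, i.e. two heartbeats per minute per client
// Formula: expected number of clients (all registered instances) * heartbeats per minute (60 s / expected renewal interval, 30 s by default) * renewal percent threshold (85% by default)
// In other words, self-preservation is triggered once about 15% of the expected heartbeats are missing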
protected void updateRenewsPerMinThreshold() {
  this.numberOfRenewsPerMinThreshold = (int) (this.expectedNumberOfClientsSendingRenews
                                              * (60.0 / serverConfig.getExpectedClientRenewalIntervalSeconds())
                                              * serverConfig.getRenewalPercentThreshold());
}

Periodic recalculation of the self-preservation threshold

private void scheduleRenewalThresholdUpdateTask() {
  timer.schedule(new TimerTask() {
    @Override
    public void run() {
      updateRenewalThreshold();
    }
  }, serverConfig.getRenewalThresholdUpdateIntervalMs(),
                 serverConfig.getRenewalThresholdUpdateIntervalMs());
}

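Summary: for self-preservation to work properly, the number of heartbeats each client actually sends per minute must match what the server is configured to expect.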
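
A worked example helps to see both the formula and why the intervals have to match. The sketch below is a plain re-statement of the arithmetic with hypothetical names (RenewalThresholdSketch and its two static methods); the numbers are illustrative assumptions: 10 registered clients, a server expecting one heartbeat every 30 s, and clients that actually send one every 10 s as in the client configuration at the top.

```java
// Hypothetical helper reproducing the threshold arithmetic with plain parameters.
class RenewalThresholdSketch {

    // Same expression as updateRenewsPerMinThreshold, with the config values passed in.
    static int renewsPerMinThreshold(int expectedClients, int expectedIntervalSeconds, double percentThreshold) {
        return (int) (expectedClients * (60.0 / expectedIntervalSeconds) * percentThreshold);
    }

    // Same decision as isLeaseExpirationEnabled: eviction is allowed only while enough heartbeats arrive.
    static boolean leaseExpirationEnabled(boolean selfPreservationEnabled, int renewsInLastMin, int threshold) {
        if (!selfPreservationEnabled) {
            return true;
        }
        return threshold > 0 && renewsInLastMin > threshold;
    }

    public static void main(String[] args) {
        int threshold = renewsPerMinThreshold(10, 30, 0.85);
        System.out.println("threshold = " + threshold);

        // Intervals match (30 s): 8 of 10 clients alive send 16 heartbeats per minute.
        System.out.println("matching, 8 alive: eviction enabled = " + leaseExpirationEnabled(true, 8 * 2, threshold));

        // Intervals differ (clients send every 10 s): only 3 of 10 alive still send 18 per minute.
        System.out.println("mismatch, 3 alive: eviction enabled = " + leaseExpirationEnabled(true, 3 * 6, threshold));
    }
}
```

In the mismatched case the server keeps evicting even though 70% of the clients are gone, because three fast clients alone exceed a threshold that was computed for a 30 s interval.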

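VI. Caching

Eureka uses three layers of cache:

  1. Read-only cache: ConcurrentHashMap
  2. Read-write cache: Guava
  3. Real data: ConcurrentHashMap

Purpose: serving reads from the caches instead of operating directly on the real data reduces lock contention and improves throughput. The trade-off is that data can be inconsistent; strong consistency is not provided.

  1. Service discovery: fetching the registry information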
@GET
public Response getContainers(@PathParam("version") String version,
                              @HeaderParam(HEADER_ACCEPT) String acceptHeader,
                              @HeaderParam(HEADER_ACCEPT_ENCODING) String acceptEncoding,
                              @HeaderParam(EurekaAccept.HTTP_X_EUREKA_ACCEPT) String eurekaAccept,
                              @Context UriInfo uriInfo,
                              @Nullable @QueryParam("regions") String regionsStr) {

  boolean isRemoteRegionRequested = null != regionsStr && !regionsStr.isEmpty();
  String[] regions = null;
  if (!isRemoteRegionRequested) {
    EurekaMonitors.GET_ALL.increment();
  } else {
    regions = regionsStr.toLowerCase().split(",");
    Arrays.sort(regions); // So we don't have different caches for same regions queried in different order.
    EurekaMonitors.GET_ALL_WITH_REMOTE_REGIONS.increment();
  }

  // Check if the server allows the access to the registry. The server can
  // restrict access if it is not
  // ready to serve traffic depending on various reasons.
  if (!registry.shouldAllowAccess(isRemoteRegionRequested)) {
    return Response.status(Status.FORBIDDEN).build();
  }
  CurrentRequestVersion.set(Version.toEnum(version));
  KeyType keyType = Key.KeyType.JSON;
  String returnMediaType = MediaType.APPLICATION_JSON;
  if (acceptHeader == null || !acceptHeader.contains(HEADER_JSON_VALUE)) {
    keyType = Key.KeyType.XML;
    returnMediaType = MediaType.APPLICATION_XML;
  }

  Key cacheKey = new Key(Key.EntityType.Application,
                         ResponseCacheImpl.ALL_APPS,
                         keyType, CurrentRequestVersion.get(), EurekaAccept.fromString(eurekaAccept), regions
                        );

  Response response;
  if (acceptEncoding != null && acceptEncoding.contains(HEADER_GZIP_VALUE)) {
    // responseCache is the cache object
    response = Response.ok(responseCache.getGZIP(cacheKey))
      .header(HEADER_CONTENT_ENCODING, HEADER_GZIP_VALUE)
      .header(HEADER_CONTENT_TYPE, returnMediaType)
      .build();
  } else {
    response = Response.ok(responseCache.get(cacheKey))
      .build();
  }
  CurrentRequestVersion.remove();
  return response;
}

The lookup ultimately calls the getValue method.

// ResponseCacheImpl
Value getValue(final Key key, boolean useReadOnlyCache) {
  Value payload = null;
  try {
    if (useReadOnlyCache) {
      // 1. read-only cache
      final Value currentPayload = readOnlyCacheMap.get(key);
      if (currentPayload != null) {
        payload = currentPayload;
      } else {
        // 2. read-write cache (Guava)
        payload = readWriteCacheMap.get(key);
        readOnlyCacheMap.put(key, payload);
      }
    } else {
      payload = readWriteCacheMap.get(key);
    }
  } catch (Throwable t) {
    logger.error("Cannot get value for key : {}", key, t);
  }
  return payload;
}

Loading from the real data

private Value generatePayload(Key key) {
  Stopwatch tracer = null;
  try {
    String payload;
    switch (key.getEntityType()) {
      case Application:
        boolean isRemoteRegionRequested = key.hasRegions();

        if (ALL_APPS.equals(key.getName())) {
          if (isRemoteRegionRequested) {
            tracer = serializeAllAppsWithRemoteRegionTimer.start();
            payload = getPayLoad(key, registry.getApplicationsFromMultipleRegions(key.getRegions()));
          } else {
            tracer = serializeAllAppsTimer.start();
            payload = getPayLoad(key, registry.getApplications());
          }
        } else if (ALL_APPS_DELTA.equals(key.getName())) {
          if (isRemoteRegionRequested) {
            tracer = serializeDeltaAppsWithRemoteRegionTimer.start();
            versionDeltaWithRegions.incrementAndGet();
            versionDeltaWithRegionsLegacy.incrementAndGet();
            payload = getPayLoad(key,
                                 registry.getApplicationDeltasFromMultipleRegions(key.getRegions()));
          } else {
            tracer = serializeDeltaAppsTimer.start();
            versionDelta.incrementAndGet();
            versionDeltaLegacy.incrementAndGet();
            payload = getPayLoad(key, registry.getApplicationDeltas());
          }
        } else {
          tracer = serializeOneApptimer.start();
          payload = getPayLoad(key, registry.getApplication(key.getName()));
        }
        break;
      case VIP:
      case SVIP:
        tracer = serializeViptimer.start();
        payload = getPayLoad(key, getApplicationsForVip(key, registry));
        break;
      default:
        logger.error("Unidentified entity type: {} found in the cache key.", key.getEntityType());
        payload = "";
        break;
    }
    return new Value(payload);
  } finally {
    if (tracer != null) {
      tracer.stop();
    }
  }
}

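Cache read path, in short:

  1. A read goes to the read-only cache first.
  2. If the read-only cache has no entry, the read goes to the read-write cache.
  3. If the read-write cache has no entry either, its loader logic runs and builds the value from the real data.

When do the caches change?

  1. The read-only cache is refreshed only by a timer task, which syncs it from the read-write cache every 30 seconds.
  2. If an entry is missing from the read-only cache but the read-write cache can provide it, the entry is copied into the read-only cache.
  3. Total delay: 30 s read-only cache timer + 30 s client cache + Ribbon (1 s) = 61 s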
// ResponseCacheImpl

Value getValue(final Key key, boolean useReadOnlyCache) {
  Value payload = null;
  try {
    if (useReadOnlyCache) {
      final Value currentPayload = readOnlyCacheMap.get(key);
      if (currentPayload != null) {
        payload = currentPayload;
      } else {
        // on a read-only miss, the value from the read-write cache is copied into the read-only cache
        payload = readWriteCacheMap.get(key);
        readOnlyCacheMap.put(key, payload);
      }
    } else {
      payload = readWriteCacheMap.get(key);
    }
  } catch (Throwable t) {
    logger.error("Cannot get value for key : {}", key, t);
  }
  return payload;
}

// Read-only cache refresh: syncs from the read-write cache every 30 s
private TimerTask getCacheUpdateTask() {
  return new TimerTask() {
    @Override
    public void run() {
      logger.debug("Updating the client cache from response cache");
      for (Key key : readOnlyCacheMap.keySet()) {
        if (logger.isDebugEnabled()) {
          logger.debug("Updating the client cache from response cache for key : {} {} {} {}",
                       key.getEntityType(), key.getName(), key.getVersion(), key.getType());
        }
        try {
          CurrentRequestVersion.set(key.getVersion());
          Value cacheValue = readWriteCacheMap.get(key);
          Value currentCacheValue = readOnlyCacheMap.get(key);
          if (cacheValue != currentCacheValue) {
            readOnlyCacheMap.put(key, cacheValue);
          }
        } catch (Throwable th) {
          logger.error("Error while updating the client cache from response cache for key {}", key.toStringCompact(), th);
        } finally {
          CurrentRequestVersion.remove();
        }
      }
    }
  };
}
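
The read path and the periodic refresh can be modelled together in a few lines, which also makes the staleness window visible. This is a sketch under stated assumptions, not ResponseCacheImpl: LayeredCacheSketch is a hypothetical class, a ConcurrentHashMap plus a loader function stands in for the Guava read-write cache, and syncReadOnly is called by hand instead of by a timer.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical three-layer lookup: read-only map, read-write map, loader over the real data.
class LayeredCacheSketch {
    private final ConcurrentHashMap<String, String> readOnly = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, String> readWrite = new ConcurrentHashMap<>();
    private final Function<String, String> loader; // builds a payload from the real data; must not return null
    int loads = 0; // how often the real data had to be consulted

    LayeredCacheSketch(Function<String, String> loader) {
        this.loader = loader;
    }

    private String fromReadWrite(String key) {
        return readWrite.computeIfAbsent(key, k -> {
            loads++;
            return loader.apply(k);
        });
    }

    // Read path: read-only first, then read-write, which loads from the real data on a miss.
    String get(String key) {
        String payload = readOnly.get(key);
        if (payload == null) {
            payload = fromReadWrite(key);
            readOnly.put(key, payload);
        }
        return payload;
    }

    // A registry change drops the read-write entry; the read-only copy stays as it is.
    void invalidate(String key) {
        readWrite.remove(key);
    }

    // What the periodic task does: copy read-write values over differing read-only values.
    void syncReadOnly() {
        for (String key : readOnly.keySet()) {
            String fresh = fromReadWrite(key);
            if (!fresh.equals(readOnly.get(key))) {
                readOnly.put(key, fresh);
            }
        }
    }

    public static void main(String[] args) {
        int[] version = {1};
        LayeredCacheSketch cache = new LayeredCacheSketch(key -> key + "-v" + version[0]);
        System.out.println(cache.get("apps"));
        version[0] = 2;          // the real data changes
        cache.invalidate("apps");
        System.out.println(cache.get("apps")); // still the old payload from the read-only layer
        cache.syncReadOnly();
        System.out.println(cache.get("apps")); // refreshed after the sync
    }
}
```

Between invalidate and the next sync, readers keep seeing the old payload; that gap is the first 30 s term in the delay estimate above.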

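Service discovery locks out registration (as well as update and cancel) operations: discovery takes the write lock, while registration, update, and cancel take the read lock.

This design aims to keep what readers see as accurate as possible.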
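
This lock usage looks inverted at first, so a small sketch may help. It relies only on java.util.concurrent: RegistryLockSketch is a hypothetical class, and the String-to-String map is a stand-in for the real registry. Mutations share the read lock because the map itself is thread-safe, while taking a snapshot holds the write lock so that no mutation can run in the middle of it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration of the "inverted" read/write lock usage described above.
class RegistryLockSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<String, String> registry = new ConcurrentHashMap<>();

    // Mutations take the shared read lock: they may run concurrently with each other.
    void register(String instanceId, String status) {
        lock.readLock().lock();
        try {
            registry.put(instanceId, status);
        } finally {
            lock.readLock().unlock();
        }
    }

    void cancel(String instanceId) {
        lock.readLock().lock();
        try {
            registry.remove(instanceId);
        } finally {
            lock.readLock().unlock();
        }
    }

    // A snapshot takes the exclusive write lock: it waits for running mutations and blocks new ones.
    Map<String, String> snapshot() {
        lock.writeLock().lock();
        try {
            return new HashMap<>(registry);
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        RegistryLockSketch registry = new RegistryLockSketch();
        registry.register("power-1", "UP");
        registry.register("power-2", "UP");
        Map<String, String> view = registry.snapshot();
        registry.cancel("power-2");
        System.out.println("snapshot: " + view.size() + ", current: " + registry.snapshot().size());
    }
}
```

The cost of this choice is that a snapshot briefly stalls all registrations; the benefit is that the reader gets a view no concurrent mutation can alter while it is being built.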

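Eureka Client Source Code

Endpoints the Eureka Client calls to keep service information up to date

  1. Initial service registration
  2. Sending heartbeats
  3. Fetching the service list: full fetch
  4. Fetching the service list: delta fetch

I. Initialization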
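
For orientation, the four operations above correspond to four REST calls against the server. The sketch below simply spells them out as method-plus-path strings. EurekaEndpointsSketch is a hypothetical helper, and the /eureka prefix is an assumption taken from the service URLs in the configuration at the top; treat the listing as a reading aid rather than as the client's actual request-building code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that lists the REST calls behind the four client operations.
class EurekaEndpointsSketch {

    static Map<String, String> endpoints(String appName, String instanceId) {
        Map<String, String> calls = new LinkedHashMap<>();
        calls.put("register", "POST /eureka/apps/" + appName);
        calls.put("heartbeat", "PUT /eureka/apps/" + appName + "/" + instanceId);
        calls.put("fullFetch", "GET /eureka/apps/");
        calls.put("deltaFetch", "GET /eureka/apps/delta");
        return calls;
    }

    public static void main(String[] args) {
        // App name and instance ID are taken from the client configuration shown earlier.
        endpoints("SERVICE-CLIENT", "power-1")
                .forEach((operation, call) -> System.out.println(operation + ": " + call));
    }
}
```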

// DiscoveryClient
DiscoveryClient(ApplicationInfoManager applicationInfoManager, EurekaClientConfig config, AbstractDiscoveryClientOptionalArgs args,
                Provider<BackupRegistry> backupRegistryProvider, EndpointRandomizer endpointRandomizer) {
  if (args != null) {
    this.healthCheckHandlerProvider = args.healthCheckHandlerProvider;
    this.healthCheckCallbackProvider = args.healthCheckCallbackProvider;
    this.eventListeners.addAll(args.getEventListeners());
    this.preRegistrationHandler = args.preRegistrationHandler;
  } else {
    this.healthCheckCallbackProvider = null;
    this.healthCheckHandlerProvider = null;
    this.preRegistrationHandler = null;
  }

  this.applicationInfoManager = applicationInfoManager;
  InstanceInfo myInfo = applicationInfoManager.getInfo();

  clientConfig = config;
  staticClientConfig = clientConfig;
  transportConfig = config.getTransportConfig();
  instanceInfo = myInfo;
  if (myInfo != null) {
    appPathIdentifier = instanceInfo.getAppName() + "/" + instanceInfo.getId();
  } else {
    logger.warn("Setting instanceInfo to a passed in null value");
  }

  this.backupRegistryProvider = backupRegistryProvider;
  this.endpointRandomizer = endpointRandomizer;
  this.urlRandomizer = new EndpointUtils.InstanceInfoBasedUrlRandomizer(instanceInfo);
  localRegionApps.set(new Applications());

  fetchRegistryGeneration = new AtomicLong(0);

  remoteRegionsToFetch = new AtomicReference<String>(clientConfig.fetchRegistryForRemoteRegions());
  remoteRegionsRef = new AtomicReference<>(remoteRegionsToFetch.get() == null ? null : remoteRegionsToFetch.get().split(","));

  if (config.shouldFetchRegistry()) {
    this.registryStalenessMonitor = new ThresholdLevelsMetric(this, METRIC_REGISTRY_PREFIX + "lastUpdateSec_", new long[]{15L, 30L, 60L, 120L, 240L, 480L});
  } else {
    this.registryStalenessMonitor = ThresholdLevelsMetric.NO_OP_METRIC;
  }

  if (config.shouldRegisterWithEureka()) {
    this.heartbeatStalenessMonitor = new ThresholdLevelsMetric(this, METRIC_REGISTRATION_PREFIX + "lastHeartbeatSec_", new long[]{15L, 30L, 60L, 120L, 240L, 480L});
  } else {
    this.heartbeatStalenessMonitor = ThresholdLevelsMetric.NO_OP_METRIC;
  }

  logger.info("Initializing Eureka in region {}", clientConfig.getRegion());

  if (!config.shouldRegisterWithEureka() && !config.shouldFetchRegistry()) {
    logger.info("Client configured to neither register nor query for data.");
    scheduler = null;
    heartbeatExecutor = null;
    cacheRefreshExecutor = null;
    eurekaTransport = null;
    instanceRegionChecker = new InstanceRegionChecker(new PropertyBasedAzToRegionMapper(config), clientConfig.getRegion());

    // This is a bit of hack to allow for existing code using DiscoveryManager.getInstance()
    // to work with DI'd DiscoveryClient
    DiscoveryManager.getInstance().setDiscoveryClient(this);
    DiscoveryManager.getInstance().setEurekaClientConfig(config);

    initTimestampMs = System.currentTimeMillis();
    initRegistrySize = this.getApplications().size();
    registrySize = initRegistrySize;
    logger.info("Discovery Client initialized at timestamp {} with initial instances count: {}",
                initTimestampMs, initRegistrySize);

    return;  // no need to setup up an network tasks and we are done
  }

  try {
    // default size of 2 - 1 each for heartbeat and cacheRefresh
    // scheduler for the periodic tasks
    scheduler = Executors.newScheduledThreadPool(2,
                                                 new ThreadFactoryBuilder()
                                                 .setNameFormat("DiscoveryClient-%d")
                                                 .setDaemon(true)
                                                 .build());

    // thread pool for heartbeats
    heartbeatExecutor = new ThreadPoolExecutor(
      1, clientConfig.getHeartbeatExecutorThreadPoolSize(), 0, TimeUnit.SECONDS,
      new SynchronousQueue<Runnable>(),
      new ThreadFactoryBuilder()
      .setNameFormat("DiscoveryClient-HeartbeatExecutor-%d")
      .setDaemon(true)
      .build()
    );  // use direct handoff

    // thread pool for cache refresh
    cacheRefreshExecutor = new ThreadPoolExecutor(
      1, clientConfig.getCacheRefreshExecutorThreadPoolSize(), 0, TimeUnit.SECONDS,
      new SynchronousQueue<Runnable>(),
      new ThreadFactoryBuilder()
      .setNameFormat("DiscoveryClient-CacheRefreshExecutor-%d")
      .setDaemon(true)
      .build()
    );  // use direct handoff

    eurekaTransport = new EurekaTransport();
    scheduleServerEndpointTask(eurekaTransport, args);

    AzToRegionMapper azToRegionMapper;
    if (clientConfig.shouldUseDnsForFetchingServiceUrls()) {
      azToRegionMapper = new DNSBasedAzToRegionMapper(clientConfig);
    } else {
      azToRegionMapper = new PropertyBasedAzToRegionMapper(clientConfig);
    }
    if (null != remoteRegionsToFetch.get()) {
      azToRegionMapper.setRegionsToFetch(remoteRegionsToFetch.get().split(","));
    }
    instanceRegionChecker = new InstanceRegionChecker(azToRegionMapper, clientConfig.getRegion());
  } catch (Throwable e) {
    throw new RuntimeException("Failed to initialize DiscoveryClient!", e);
  }
  
  // shouldFetchRegistry: when false, service discovery is off; the client does not pull the list of other services and acts only as a provider
  if (clientConfig.shouldFetchRegistry()) {
    try {
      // service discovery
      // 1. first fetch the registry from Eureka
      // 2. then register with Eureka
      boolean primaryFetchRegistryResult = fetchRegistry(false);
      if (!primaryFetchRegistryResult) {
        logger.info("Initial registry fetch from primary servers failed");
      }
      boolean backupFetchRegistryResult = true;
      // if the registry cannot be fetched from the server, fall back to the backup registry
      if (!primaryFetchRegistryResult && !fetchRegistryFromBackup()) {
        backupFetchRegistryResult = false;
        logger.info("Initial registry fetch from backup servers failed");
      }
      if (!primaryFetchRegistryResult && !backupFetchRegistryResult && clientConfig.shouldEnforceFetchRegistryAtInit()) {
        throw new IllegalStateException("Fetch registry error at startup. Initial fetch failed.");
      }
    } catch (Throwable th) {
      logger.error("Fetch registry error at startup: {}", th.getMessage());
      throw new IllegalStateException(th);
    }
  }

  // call and execute the pre registration handler before all background tasks (inc registration) is started
  if (this.preRegistrationHandler != null) {
    this.preRegistrationHandler.beforeRegistration();
  }
  
  // shouldRegisterWithEureka: if false, this client is only a consumer and does not register itself
  if (clientConfig.shouldRegisterWithEureka() && clientConfig.shouldEnforceRegistrationAtInit()) {
    try {
      // Service registration
      if (!register() ) {
        throw new IllegalStateException("Registration error at startup. Invalid server response.");
      }
    } catch (Throwable th) {
      logger.error("Registration error at startup: {}", th.getMessage());
      throw new IllegalStateException(th);
    }
  }

  // finally, init the schedule tasks (e.g. cluster resolvers, heartbeat, instanceInfo replicator, fetch
  initScheduledTasks();

  try {
    Monitors.registerObject(this);
  } catch (Throwable e) {
    logger.warn("Cannot register timers", e);
  }

  // This is a bit of hack to allow for existing code using DiscoveryManager.getInstance()
  // to work with DI'd DiscoveryClient
  DiscoveryManager.getInstance().setDiscoveryClient(this);
  DiscoveryManager.getInstance().setEurekaClientConfig(config);

  initTimestampMs = System.currentTimeMillis();
  initRegistrySize = this.getApplications().size();
  registrySize = initRegistrySize;
  logger.info("Discovery Client initialized at timestamp {} with initial instances count: {}",
              initTimestampMs, initRegistrySize);
}
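
The heartbeat and cache-refresh executors above pair a `SynchronousQueue` with a bounded maximum pool size, which the source comments call "direct handoff". The following is a minimal sketch of what that combination implies, using only JDK classes; the class name `DirectHandoffDemo` and the pool sizes are illustrative and not part of Eureka.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative only: shows the behaviour of a pool built like the ones above.
final class DirectHandoffDemo {

    /** Submits maxThreads blocking tasks plus one more; returns true if the extra task is rejected. */
    static boolean extraTaskIsRejected(int maxThreads) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, maxThreads, 0, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>()); // no buffering: hand off to a thread or reject
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocking = () -> {
            try {
                release.await(); // keep the worker busy until the check is done
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        try {
            for (int i = 0; i < maxThreads; i++) {
                pool.execute(blocking); // each task is handed to its own thread
            }
            try {
                pool.execute(blocking); // all threads busy and nothing can be queued
                return false;
            } catch (RejectedExecutionException expected) {
                return true;
            }
        } finally {
            release.countDown(); // let the blocked workers finish
            pool.shutdown();
        }
    }
}
```

Because nothing is ever queued, a slow heartbeat or refresh cannot pile up behind other tasks; once every thread is busy, the default rejection policy throws instead.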

I. Service registration

// Service registration
// DiscoveryClient
boolean register() throws Throwable {
  logger.info(PREFIX + "{}: registering service...", appPathIdentifier);
  EurekaHttpResponse<Void> httpResponse;
  try {
    httpResponse = eurekaTransport.registrationClient.register(instanceInfo);
  } catch (Exception e) {
    logger.warn(PREFIX + "{} - registration failed {}", appPathIdentifier, e.getMessage(), e);
    throw e;
  }
  if (logger.isInfoEnabled()) {
    logger.info(PREFIX + "{} - registration status: {}", appPathIdentifier, httpResponse.getStatusCode());
  }
  return httpResponse.getStatusCode() == Status.NO_CONTENT.getStatusCode();
}

// AbstractJerseyEurekaHttpClient
// Essentially a Jersey client call to the Eureka Server registration endpoint
@Override
public EurekaHttpResponse<Void> register(InstanceInfo info) {
  String urlPath = "apps/" + info.getAppName();
  ClientResponse response = null;
  try {
    Builder resourceBuilder = jerseyClient.resource(serviceUrl).path(urlPath).getRequestBuilder();
    addExtraHeaders(resourceBuilder);
    response = resourceBuilder
      .header("Accept-Encoding", "gzip")
      .type(MediaType.APPLICATION_JSON_TYPE)
      .accept(MediaType.APPLICATION_JSON)
      .post(ClientResponse.class, info); // info: this client's instance data
    return anEurekaHttpResponse(response.getStatus()).headers(headersOf(response)).build();
  } finally {
    if (logger.isDebugEnabled()) {
      logger.debug("Jersey HTTP POST {}/{} with instance {}; statusCode={}", serviceUrl, urlPath, info.getId(),
                   response == null ? "N/A" : response.getStatus());
    }
    if (response != null) {
      response.close();
    }
  }
}

// On the server side the request is handled by ApplicationResource#addInstance
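
To make the shape of the HTTP call concrete, here is a rough sketch that builds an equivalent request with the JDK's `java.net.http` API (Java 11+) instead of Jersey. It only constructs the request and does not send it. The class name, the example base URL used in the note below and the idea of passing a pre-serialized JSON body are assumptions for illustration; the real client serializes the `InstanceInfo` object itself.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Illustrative only: mirrors the POST {serviceUrl}apps/{appName} call shown above.
final class RegisterRequestSketch {

    /** serviceUrl is assumed to end with '/', e.g. "http://localhost:3001/eureka/". */
    static HttpRequest build(String serviceUrl, String appName, String instanceJson) {
        return HttpRequest.newBuilder(URI.create(serviceUrl + "apps/" + appName))
                .header("Accept", "application/json")
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(instanceJson))
                .build();
    }

    /** register() above treats 204 (NO_CONTENT) as success. */
    static boolean isRegistered(int statusCode) {
        return statusCode == 204;
    }
}
```

For example, `RegisterRequestSketch.build("http://localhost:3001/eureka/", "DEMO", "{}")` targets `http://localhost:3001/eureka/apps/DEMO`.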

II. Heartbeat (lease renewal)

// DiscoveryClient
boolean renew() {
  EurekaHttpResponse<InstanceInfo> httpResponse;
  try {
    httpResponse = eurekaTransport.registrationClient.sendHeartBeat(instanceInfo.getAppName(), instanceInfo.getId(), instanceInfo, null);
    logger.debug(PREFIX + "{} - Heartbeat status: {}", appPathIdentifier, httpResponse.getStatusCode());
    // Check whether the server answered NOT_FOUND (404)
    if (httpResponse.getStatusCode() == Status.NOT_FOUND.getStatusCode()) {
      REREGISTER_COUNTER.increment();
      logger.info(PREFIX + "{} - Re-registering apps/{}", appPathIdentifier, instanceInfo.getAppName());
      long timestamp = instanceInfo.setIsDirtyWithTime();
      boolean success = register();
      if (success) {
        instanceInfo.unsetIsDirty(timestamp);
      }
      return success;
    }
    return httpResponse.getStatusCode() == Status.OK.getStatusCode();
  } catch (Throwable e) {
    logger.error(PREFIX + "{} - was unable to send heartbeat!", appPathIdentifier, e);
    return false;
  }
}
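
The renew logic boils down to: send a heartbeat, and if the server no longer knows the instance (404), register again. The sketch below replays that flow against a hypothetical in-memory `FakeServer`; none of these names exist in Eureka, and the status codes are the ones the method above checks.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative only: heartbeat with re-registration on 404.
final class RenewSketch {

    /** Hypothetical stand-in for the Eureka server's lease table. */
    static final class FakeServer {
        private final Set<String> leases = new HashSet<>();

        int heartbeat(String instanceId) {
            return leases.contains(instanceId) ? 200 : 404;
        }

        int register(String instanceId) {
            leases.add(instanceId);
            return 204;
        }

        void evict(String instanceId) {
            leases.remove(instanceId);
        }
    }

    /** Returns true if the lease is alive after this call. */
    static boolean renew(FakeServer server, String instanceId) {
        int status = server.heartbeat(instanceId);
        if (status == 404) {
            // The server has no lease for us (never registered, or evicted): register again.
            return server.register(instanceId) == 204;
        }
        return status == 200;
    }
}
```

This is why an instance that was evicted during a network hiccup reappears in the registry on its next successful heartbeat.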

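III. Service discovery

  1. Full fetch: pull all registration information from Eureka.
  2. Delta fetch: pull the changes Eureka recorded in the last three minutes and, if there are updates, apply them to the local cache.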
private boolean fetchRegistry(boolean forceFullRegistryFetch) {
  Stopwatch tracer = FETCH_REGISTRY_TIMER.start();

  try {
    // If the delta is disabled or if it is the first time, get all
    // applications
    // Get the registry currently cached on the client
    Applications applications = getApplications();

    if (clientConfig.shouldDisableDelta()  // configured to always do a full fetch
        || (!Strings.isNullOrEmpty(clientConfig.getRegistryRefreshSingleVipAddress())) // the client is only interested in a single VIP address
        || forceFullRegistryFetch // full fetch forced by the caller
        || (applications == null) // nothing cached yet: the first fetch is a full fetch
        || (applications.getRegisteredApplications().size() == 0)
        || (applications.getVersion() == -1)) //Client application does not have latest library supporting delta
    {
      logger.info("Disable delta property : {}", clientConfig.shouldDisableDelta());
      logger.info("Single vip registry refresh property : {}", clientConfig.getRegistryRefreshSingleVipAddress());
      logger.info("Force full registry fetch : {}", forceFullRegistryFetch);
      logger.info("Application is null : {}", (applications == null));
      logger.info("Registered Applications size is zero : {}",
                  (applications.getRegisteredApplications().size() == 0));
      logger.info("Application version is -1: {}", (applications.getVersion() == -1));
      getAndStoreFullRegistry();       // full fetch
    } else {
      getAndUpdateDelta(applications); // delta fetch
    }
    applications.setAppsHashCode(applications.getReconcileHashCode());
    logTotalInstances();
  } catch (Throwable e) {
    logger.info(PREFIX + "{} - was unable to refresh its cache! This periodic background refresh will be retried in {} seconds. status = {} stacktrace = {}",
                appPathIdentifier, clientConfig.getRegistryFetchIntervalSeconds(), e.getMessage(), ExceptionUtils.getStackTrace(e));
    return false;
  } finally {
    if (tracer != null) {
      tracer.stop();
    }
  }

  // Notify about cache refresh before updating the instance remote status
  onCacheRefreshed();

  // Update remote status based on refreshed data held in the cache
  updateInstanceRemoteStatus();

  // registry was fetched successfully, so return true
  return true;
}
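
The condition that chooses between a full and a delta fetch can be read as a single predicate. Below is a minimal restatement with plain parameters in place of `clientConfig` and `Applications`; the class and parameter names are made up for illustration.

```java
// Illustrative only: the full-vs-delta decision from fetchRegistry as a pure function.
final class FetchModeSketch {

    /**
     * @param cachedAppCount number of applications in the local cache, or null if there is no cache yet
     * @param cachedVersion  version of the local cache; -1 means delta is not supported
     */
    static boolean needsFullFetch(boolean deltaDisabled,
                                  String singleVipAddress,
                                  boolean forceFull,
                                  Integer cachedAppCount,
                                  long cachedVersion) {
        return deltaDisabled
                || (singleVipAddress != null && !singleVipAddress.isEmpty())
                || forceFull
                || cachedAppCount == null   // first fetch: nothing cached
                || cachedAppCount == 0      // cache exists but is empty
                || cachedVersion == -1;
    }
}
```

In the common steady state (delta enabled, no single VIP, a non-empty cache) every clause is false and the client takes the delta path.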

Delta fetch. To guard against concurrent updates, the delta is applied under a lock.

// Delta fetch
private void getAndUpdateDelta(Applications applications) throws Throwable {
  long currentUpdateGeneration = fetchRegistryGeneration.get();

  Applications delta = null;
  EurekaHttpResponse<Applications> httpResponse = eurekaTransport.queryClient.getDelta(remoteRegionsRef.get());
  if (httpResponse.getStatusCode() == Status.OK.getStatusCode()) {
    delta = httpResponse.getEntity();
  }

  if (delta == null) {
    logger.warn("The server does not allow the delta revision to be applied because it is not safe. "
                + "Hence got the full registry.");
    getAndStoreFullRegistry();
  } else if (fetchRegistryGeneration.compareAndSet(currentUpdateGeneration, currentUpdateGeneration + 1)) {
    logger.debug("Got delta update with apps hashcode {}", delta.getAppsHashCode());
    String reconcileHashCode = "";
    if (fetchRegistryUpdateLock.tryLock()) {
      try {
        updateDelta(delta);
        reconcileHashCode = getReconcileHashCode(applications);
      } finally {
        fetchRegistryUpdateLock.unlock();
      }
    } else {
      logger.warn("Cannot acquire update lock, aborting getAndUpdateDelta");
    }
    // There is a diff in number of instances for some reason
    if (!reconcileHashCode.equals(delta.getAppsHashCode()) || clientConfig.shouldLogDeltaDiff()) {
      reconcileAndLogDifference(delta, reconcileHashCode);  // this makes a remoteCall
    }
  } else {
    logger.warn("Not updating application delta as another thread is updating it already");
    logger.debug("Ignoring delta update with apps hashcode {}, as another thread is updating it already", delta.getAppsHashCode());
  }
}
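
Besides the lock, the method also uses `fetchRegistryGeneration` as an optimistic guard: it reads the generation before the remote call and only applies the result if the compare-and-set still succeeds. A small sketch of that pattern with a bare `AtomicLong` follows; the class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only: "apply my result only if nobody else updated in the meantime".
final class GenerationGuardSketch {
    private final AtomicLong generation = new AtomicLong(0);

    /** Read the generation before starting a (slow) remote fetch. */
    long snapshot() {
        return generation.get();
    }

    /** Returns true if the caller may apply its result; false if another fetch already won. */
    boolean tryCommit(long snapshot) {
        return generation.compareAndSet(snapshot, snapshot + 1);
    }
}
```

If two fetches start from the same snapshot, only the first `tryCommit` succeeds; the second one corresponds to the "another thread is updating it already" branch above and its result is dropped.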

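On the server, recentlyChangedQueue holds the instances that changed within the last 3 minutes.

A timer task runs every 30 seconds and removes the entries that have been in the queue for more than 3 minutes.

The delta response also carries the hashCode computed over the server's full registry.

Client: applies the delta to its local data, then computes the hashCode of the result

and compares it with the hashCode sent by the server. If the two differ, the client falls back to a full fetch. On the server the hash is set like this: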

Applications allApps = getApplicationsFromMultipleRegions(remoteRegions);
apps.setAppsHashCode(allApps.getReconcileHashCode());
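
A rough sketch of the hash comparison on the client side follows. It assumes the reconcile hash is a string of per-status instance counts in sorted order, such as `DOWN_1_UP_2_`, which is my understanding of what `Applications#getReconcileHashCode` produces; treat the exact format as an assumption. The class and method names are illustrative.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: compare a locally computed status-count hash with the server's.
final class ReconcileHashSketch {

    /** Builds a hash such as "DOWN_1_UP_2_" from the status of every known instance. */
    static String hashOf(List<String> instanceStatuses) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted, so the result is deterministic
        for (String status : instanceStatuses) {
            counts.merge(status, 1, Integer::sum);
        }
        StringBuilder hash = new StringBuilder();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            hash.append(entry.getKey()).append('_').append(entry.getValue()).append('_');
        }
        return hash.toString();
    }

    /** After applying a delta, a mismatch means the local copy has drifted and must be re-fetched. */
    static boolean needsFullFetch(List<String> localStatusesAfterDelta, String serverHash) {
        return !hashOf(localStatusesAfterDelta).equals(serverHash);
    }
}
```

The hash is cheap to compute and only detects drift in the number of instances per status, which is enough to decide whether a full fetch is needed.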

IV. Service deregistration

  1. Manual deregistration: inject the DiscoveryClient and call its shutdown method
// DiscoveryClient
public synchronized void shutdown() {
  if (isShutdown.compareAndSet(false, true)) {
    logger.info("Shutting down DiscoveryClient ...");

    if (statusChangeListener != null && applicationInfoManager != null) {
      applicationInfoManager.unregisterStatusChangeListener(statusChangeListener.getId());
    }

    cancelScheduledTasks();

    // If APPINFO was registered
    if (applicationInfoManager != null
        && clientConfig.shouldRegisterWithEureka()
        && clientConfig.shouldUnregisterOnShutdown()) {
      applicationInfoManager.setInstanceStatus(InstanceStatus.DOWN);
      unregister();
    }

    if (eurekaTransport != null) {
      eurekaTransport.shutdown();
    }

    heartbeatStalenessMonitor.shutdown();
    registryStalenessMonitor.shutdown();

    Monitors.unregisterObject(this);

    logger.info("Completed shut down of DiscoveryClient");
  }
}
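
The `isShutdown.compareAndSet(false, true)` check makes shutdown idempotent: however many times it is called, the deregistration runs once. A minimal sketch of that guard, with a counter standing in for the real `unregister()` call (all names are hypothetical):

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: run the shutdown body at most once.
final class ShutdownOnceSketch {
    private final AtomicBoolean isShutdown = new AtomicBoolean(false);
    private final AtomicInteger unregisterCalls = new AtomicInteger();

    void shutdown() {
        // Only the first caller flips false -> true and enters the body.
        if (isShutdown.compareAndSet(false, true)) {
            unregisterCalls.incrementAndGet(); // stands in for unregister()
        }
    }

    int unregisterCount() {
        return unregisterCalls.get();
    }
}
```

This matters because shutdown can be triggered both manually and by the container when the application context closes.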

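Eureka summary

Eureka is a solid service registry. In CAP terms it is an AP system: it favors availability and partition tolerance over consistency.

  1. Eureka Server keeps all instance registration data in memory, and registry queries go through a multi-level cache.
    • Multi-level cache: a read first hits the first-level cache **readOnlyCacheMap**, then the second-level cache **readWriteCacheMap**, and finally the real registry data.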
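    • Cache invalidation: register and cancel requests invalidate the second-level cache (a renew only refreshes the lease and does not); evicting an instance also removes its second-level entries; and second-level entries expire on their own.
    • Cache update: when a key is missing from the first-level cache it is looked up in the second-level cache, and the value found there is copied into the first-level cache. A timer task also syncs the first-level cache from the second-level cache periodically (every 30 seconds by default).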
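  2. In a Eureka Server cluster, a client registers with only one server node; that node then iterates over its peer nodes and replicates the registration to them.
  3. By default a Eureka Client sends a heartbeat every 30 seconds to renew its lease, telling Eureka Server that it is still alive. If Eureka Server receives no renewal from a client for 90 seconds, it removes the instance from the registry.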
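  4. Self-preservation: when many leases expire at once and the renewals received in the last minute drop below 85% of the expected number, Eureka Server enters self-preservation mode and stops evicting expired instances. Separately, each eviction pass removes at most 15% of the registered instances.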
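
To illustrate the multi-level cache from item 1, here is a simplified sketch of the read path with two maps and a backing lookup function. It is not Eureka's `ResponseCacheImpl`; the class, the `String` keys and values, and the explicit `syncReadOnly()` method (which stands in for the periodic timer task) are all assumptions made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Illustrative only: read-only cache in front of a read-write cache in front of the registry.
final class TwoLevelCacheSketch {
    private final Map<String, String> readOnly = new ConcurrentHashMap<>();
    private final Map<String, String> readWrite = new ConcurrentHashMap<>();
    private final Function<String, String> registry; // must not return null in this sketch

    TwoLevelCacheSketch(Function<String, String> registry) {
        this.registry = registry;
    }

    /** Level 1, then level 2, then the real registry. */
    String get(String key) {
        String value = readOnly.get(key);
        if (value == null) {
            value = readWrite.computeIfAbsent(key, registry); // load on miss
            readOnly.put(key, value);                         // copy up to level 1
        }
        return value;
    }

    /** Called on registry changes (for example register or cancel): only level 2 is dropped. */
    void invalidate(String key) {
        readWrite.remove(key);
    }

    /** Stand-in for the periodic task that refreshes level 1 from level 2. */
    void syncReadOnly() {
        for (String key : readOnly.keySet()) {
            readOnly.put(key, readWrite.computeIfAbsent(key, registry));
        }
    }
}
```

Because invalidation touches only the second level, readers can keep seeing the old value until the next sync, which is one reason Eureka clients observe registry changes with some delay.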