3. Apollo client startup flow

1. Client startup entries

In the client module there is a META-INF/spring.factories file that declares the classes Spring Boot loads at startup. It registers two classes: ApolloAutoConfiguration and ApolloApplicationContextInitializer.

org.springframework.boot.autoconfigure.EnableAutoConfiguration=\
com.ctrip.framework.apollo.spring.boot.ApolloAutoConfiguration
org.springframework.context.ApplicationContextInitializer=\
com.ctrip.framework.apollo.spring.boot.ApolloApplicationContextInitializer
org.springframework.boot.env.EnvironmentPostProcessor=\
com.ctrip.framework.apollo.spring.boot.ApolloApplicationContextInitializer

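ApolloAutoConfiguration is a Spring Boot auto-configuration class; as a rule, it is responsible for instantiating the beans related to the module's functionality. ApolloApplicationContextInitializer initializes the container and is the key to reading configuration and service information from the server. It implements both the EnvironmentPostProcessor and ApplicationContextInitializer interfaces, and both entry points end up calling its initialize method.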

[Figure: overriding the postProcessEnvironment method of EnvironmentPostProcessor]

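If apollo.bootstrap.eagerLoad.enabled is turned on in application.properties, the Apollo configuration is loaded early, during Spring Boot startup. Note: with this property on, Apollo is loaded before the logging system is initialized, so Apollo's startup logs cannot be printed.

Now look at the initialize method. First, it reads all the namespaces configured in application.properties. Next, it creates a CompositePropertySource; which class is instantiated depends on the apollo.property.names.cache.enable switch. CachedCompositePropertySource was added in 1.9.0, mainly to address slow Spring startup when there are many configuration items. Finally, the composite source is added to the environment's property sources: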
protected void initialize(ConfigurableEnvironment environment) {

  if (environment.getPropertySources().contains(PropertySourcesConstants.APOLLO_BOOTSTRAP_PROPERTY_SOURCE_NAME)) {
    DeferredLogger.replayTo();
    return;
  }

  String namespaces = environment.getProperty(PropertySourcesConstants.APOLLO_BOOTSTRAP_NAMESPACES, ConfigConsts.NAMESPACE_APPLICATION);
  logger.debug("Apollo bootstrap namespaces: {}", namespaces);
  List<String> namespaceList = NAMESPACE_SPLITTER.splitToList(namespaces);

  CompositePropertySource composite;
  final ConfigUtil configUtil = ApolloInjector.getInstance(ConfigUtil.class);
  // Check whether the apollo.property.names.cache.enable switch is on
  if (configUtil.isPropertyNamesCacheEnabled()) {
    composite = new CachedCompositePropertySource(PropertySourcesConstants.APOLLO_BOOTSTRAP_PROPERTY_SOURCE_NAME);
  } else {
    composite = new CompositePropertySource(PropertySourcesConstants.APOLLO_BOOTSTRAP_PROPERTY_SOURCE_NAME);
  }
  // Iterate over the namespaces in their configured order, which is also their precedence order
  for (String namespace : namespaceList) {
    // Core step: fetch this namespace's configuration from Apollo's config service
    Config config = ConfigService.getConfig(namespace);

    composite.addPropertySource(configPropertySourceFactory.getConfigPropertySource(namespace, config));
  }
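  // If Apollo must not override system properties, place the composite right after systemEnvironment;
  // otherwise give Apollo the highest priority (position 0 in environment.getPropertySources())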
  if (!configUtil.isOverrideSystemProperties()) {
    if (environment.getPropertySources().contains(StandardEnvironment.SYSTEM_ENVIRONMENT_PROPERTY_SOURCE_NAME)) {
      environment.getPropertySources().addAfter(StandardEnvironment.SYSTEM_ENVIRONMENT_PROPERTY_SOURCE_NAME, composite);
      return;
    }
  }
  environment.getPropertySources().addFirst(composite);
}
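
To make the ordering at the end of initialize concrete, here is a minimal sketch in plain Java. It does not use Spring's classes; it only assumes that property sources are consulted in list order and that the first source containing a key wins. The class PropertySourceOrderSketch, the source name "apolloBootstrap" and the "timeout" key are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: an ordered list of named property sources in which
// the first source that contains a key wins. These are not Spring's real classes.
class PropertySourceOrderSketch {
  private static final class Source {
    final String name;
    final Map<String, String> props;

    Source(String name, Map<String, String> props) {
      this.name = name;
      this.props = props;
    }
  }

  private final List<Source> sources = new ArrayList<>();

  void addFirst(String name, Map<String, String> props) {
    sources.add(0, new Source(name, props));
  }

  void addLast(String name, Map<String, String> props) {
    sources.add(new Source(name, props));
  }

  // Inserts the new source directly behind the named one, i.e. with lower priority.
  void addAfter(String existingName, String name, Map<String, String> props) {
    for (int i = 0; i < sources.size(); i++) {
      if (sources.get(i).name.equals(existingName)) {
        sources.add(i + 1, new Source(name, props));
        return;
      }
    }
    throw new IllegalArgumentException("No such source: " + existingName);
  }

  String getProperty(String key) {
    for (Source source : sources) {
      String value = source.props.get(key);
      if (value != null) {
        return value;
      }
    }
    return null;
  }

  // Hypothetical scenario: the same key is defined in the OS environment and in Apollo.
  static String resolveTimeout(boolean apolloOverridesSystem) {
    PropertySourceOrderSketch env = new PropertySourceOrderSketch();
    env.addLast("systemProperties", Map.of());
    env.addLast("systemEnvironment", Map.of("timeout", "100"));
    Map<String, String> apollo = Map.of("timeout", "200");
    if (apolloOverridesSystem) {
      env.addFirst("apolloBootstrap", apollo);
    } else {
      env.addAfter("systemEnvironment", "apolloBootstrap", apollo);
    }
    return env.getProperty("timeout");
  }

  public static void main(String[] args) {
    System.out.println(resolveTimeout(true));  // Apollo placed first
    System.out.println(resolveTimeout(false)); // Apollo placed behind systemEnvironment
  }
}
```

In this model, placing the Apollo source first lets it shadow the environment value, while placing it after systemEnvironment lets the environment value win.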

2. Core implementation of fetching configuration

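Next, look at ConfigService#getConfig(), the core method for fetching a namespace's configuration from the config service. It first looks the configuration up in memory; if nothing is cached for that namespace, it obtains a config factory and calls its create() method to pull the configuration from the server.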

// Entry point: obtain the config manager first, then get the configuration
public static Config getConfig(String namespace) {
  return s_instance.getManager().getConfig(namespace);
}

// Get the configuration
public Config getConfig(String namespace) {
  // Look up the in-memory cache
  Config config = m_configs.get(namespace);
  // Double-checked locking (DCL)
  if (config == null) {
    synchronized (this) {
      config = m_configs.get(namespace);

      if (config == null) {
        ConfigFactory factory = m_factoryManager.getFactory(namespace);
        // Pull the configuration from the server
        config = factory.create(namespace);
        m_configs.put(namespace, config);
      }
    }
  }

  return config;
}
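
The double-checked locking above ensures that the Config for a namespace is created only once. As a minimal sketch of the same create-once-per-key idea, the example below uses ConcurrentHashMap.computeIfAbsent instead of an explicit synchronized block. This is a swapped-in technique for illustration, not what the Apollo source does; the class NamespaceCacheSketch and the use of a plain String as the "config" are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Illustrative sketch only: caches one value per namespace and runs the
// (potentially expensive) factory at most once per key.
class NamespaceCacheSketch {
  private final Map<String, String> configs = new ConcurrentHashMap<>();
  private final Function<String, String> factory;

  NamespaceCacheSketch(Function<String, String> factory) {
    this.factory = factory;
  }

  String getConfig(String namespace) {
    // computeIfAbsent invokes the factory only when the key is missing,
    // and does so atomically with respect to other callers for that key.
    return configs.computeIfAbsent(namespace, factory);
  }

  // Counts how many times the factory runs for repeated lookups of one namespace.
  static int creationsFor(int lookups) {
    AtomicInteger created = new AtomicInteger();
    NamespaceCacheSketch cache = new NamespaceCacheSketch(ns -> {
      created.incrementAndGet();
      return "config-of-" + ns;
    });
    for (int i = 0; i < lookups; i++) {
      cache.getConfig("application");
    }
    return created.get();
  }

  public static void main(String[] args) {
    System.out.println("factory invocations: " + creationsFor(3));
  }
}
```

One trade-off worth noting: computeIfAbsent can block other updates that hash to the same bin while the factory runs, so a slow remote pull inside the factory holds that bin for its duration. The explicit DCL in the original keeps the fast path a plain map read.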

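The Apollo client fetches configuration through com.ctrip.framework.apollo.spi.DefaultConfigFactory#create. The method has two core parts: (1) createConfigRepository() pulls the configuration from the server synchronously; (2) createRepositoryConfig() starts listening for changes to the namespace's configuration.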

public Config create(String namespace) {
  ConfigFileFormat format = determineFileFormat(namespace);

  ConfigRepository configRepository = null;
  
  if (ConfigFileFormat.isPropertiesCompatible(format) &&
      format != ConfigFileFormat.Properties) {
    configRepository = createPropertiesCompatibleFileConfigRepository(namespace, format);
  } else {
    // Pull the configuration from the server synchronously
    configRepository = createConfigRepository(namespace);
  }

  logger.debug("Created a configuration repository of type [{}] for namespace [{}]",
      configRepository.getClass().getName(), namespace);
  return this.createRepositoryConfig(namespace, configRepository);
}

// Create the RemoteConfigRepository object that pulls the configuration
ConfigRepository createConfigRepository(String namespace) {
  if (m_configUtil.isPropertyFileCacheEnabled()) {
    return createLocalConfigRepository(namespace);
  }
  return createRemoteConfigRepository(namespace);
}

[Figure: RemoteConfigRepository constructor]

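As the code above shows, createConfigRepository essentially initializes a RemoteConfigRepository object. Now look at the RemoteConfigRepository constructor. In box 1 of the screenshot, a ConfigServiceLocator object is obtained through dependency injection; it handles discovery and heartbeat checks of the config service. Box 2 calls three internal methods: trySync() pulls the configuration from the server synchronously; schedulePeriodicRefresh() starts a scheduled thread that pulls the configuration from the remote server every 5 minutes by default; scheduleLongPollingRefresh() also starts a thread, which uses long polling to receive the change notifications pushed by the server and refresh in real time. The four parts are examined in turn below.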

[Figure: ConfigServiceLocator constructor]

[Figure: fetching the service registration information]

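(1) In the ConfigServiceLocator constructor, the internal method tryUpdateConfigServices() is called. It opens a connection to the MetaService, fetches the addresses that the server side registered in Eureka, and keeps them in memory; the member variable m_configServices of ConfigServiceLocator exists specifically to hold these server addresses. In addition, schedulePeriodicRefresh() starts a scheduled thread that fetches the service information from the MetaService every five minutes, which serves as a heartbeat check.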

[Figure: sync, the synchronous configuration fetch]

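(2) trySync() calls sync() to synchronize the configuration: it first reads the previous configuration from local memory, then pulls the configuration from the remote server; if the remote configuration differs from the previous one, it overwrites the in-memory copy and notifies local listeners of the change. The pull itself is implemented in loadApolloConfig(): the client sends a GET request for the configuration; if nothing has changed the server returns HTTP status 304, otherwise it returns 200 and the client reads the new configuration.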
private ApolloConfig loadApolloConfig() {
  // Rate limiting: wait up to 5 seconds for a permit
  if (!m_loadConfigRateLimiter.tryAcquire(5, TimeUnit.SECONDS)) {
    //wait at most 5 seconds
    try {
      TimeUnit.SECONDS.sleep(5);
    } catch (InterruptedException e) {
    }
  }
  String appId = m_configUtil.getAppId();
  String cluster = m_configUtil.getCluster();
  String dataCenter = m_configUtil.getDataCenter();
  String secret = m_configUtil.getAccessKeySecret();
  Tracer.logEvent("Apollo.Client.ConfigMeta", STRING_JOINER.join(appId, cluster, m_namespace));
  int maxRetries = m_configNeedForceRefresh.get() ? 2 : 1;
  long onErrorSleepTime = 0; // 0 means no sleep
  Throwable exception = null;
  // Get the list of remote config service addresses, i.e. the addresses obtained by ConfigServiceLocator above
  List<ServiceDTO> configServices = getConfigServices();
  String url = null;
  retryLoopLabel:
  for (int i = 0; i < maxRetries; i++) {
    List<ServiceDTO> randomConfigServices = Lists.newLinkedList(configServices);
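    // Shuffle the address list, which amounts to picking a random server to call (load balancing)
    Collections.shuffle(randomConfigServices);
    // Prefer the server that just notified this client: put it first in the list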
    if (m_longPollServiceDto.get() != null) {
      randomConfigServices.add(0, m_longPollServiceDto.getAndSet(null));
    }

    for (ServiceDTO configService : randomConfigServices) {
      if (onErrorSleepTime > 0) {
        logger.warn(
            "Load config failed, will retry in {} {}. appId: {}, cluster: {}, namespaces: {}",
            onErrorSleepTime, m_configUtil.getOnErrorRetryIntervalTimeUnit(), appId, cluster, m_namespace);

        try {
          m_configUtil.getOnErrorRetryIntervalTimeUnit().sleep(onErrorSleepTime);
        } catch (InterruptedException e) {
          //ignore
        }
      }
      // Assemble the URL, e.g. http://192.168.0.1:8080/configs/apollo-demo/default/application?ip=192.168.100.2
      // This hits the /{appId}/{clusterName}/{namespace:.+} endpoint of the config service's ConfigController
      url = assembleQueryConfigUrl(configService.getHomepageUrl(), appId, cluster, m_namespace,
              dataCenter, m_remoteMessages.get(), m_configCache.get());

      logger.debug("Loading config from {}", url);

      HttpRequest request = new HttpRequest(url);
      if (!StringUtils.isBlank(secret)) {
        Map<String, String> headers = Signature.buildHttpHeaders(url, appId, secret);
        request.setHeaders(headers);
      }

      Transaction transaction = Tracer.newTransaction("Apollo.ConfigService", "queryConfig");
      transaction.addData("Url", url);
      try {

        HttpResponse<ApolloConfig> response = m_httpClient.doGet(request, ApolloConfig.class);
        m_configNeedForceRefresh.set(false);
        m_loadConfigFailSchedulePolicy.success();

        transaction.addData("StatusCode", response.getStatusCode());
        transaction.setStatus(Transaction.SUCCESS);
        // A 304 from the server means the configuration has not changed
        if (response.getStatusCode() == 304) {
          logger.debug("Config server responds with 304 HTTP status code.");
          return m_configCache.get();
        }

        ApolloConfig result = response.getBody();

        logger.debug("Loaded config for {}: {}", m_namespace, result);

        return result;
      } catch (ApolloConfigStatusCodeException ex) {
        ...
      } catch (Throwable ex) {
        ...
      } finally {
        transaction.complete();
      }
    }

  }
  ...
}
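
The server-selection step inside the retry loop can be isolated as a small sketch: shuffle a copy of the address list, then put the server that last sent a notification at the front. The class ServerOrderSketch, the method candidateOrder and the use of plain strings for addresses are hypothetical; like the loop above, the sketch does not remove the duplicate entry of the preferred server.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustrative sketch only: decides the order in which servers are tried.
class ServerOrderSketch {
  // Returns a shuffled copy of the servers; if a server recently notified the
  // client, it is additionally placed at index 0 so it is tried first.
  static List<String> candidateOrder(List<String> servers, String notifiedServer, Random random) {
    List<String> candidates = new ArrayList<>(servers);
    Collections.shuffle(candidates, random);
    if (notifiedServer != null) {
      candidates.add(0, notifiedServer);
    }
    return candidates;
  }

  public static void main(String[] args) {
    List<String> servers = Arrays.asList("http://10.0.0.1:8080", "http://10.0.0.2:8080", "http://10.0.0.3:8080");
    // The first element is always the notified server; the rest is in random order.
    System.out.println(candidateOrder(servers, "http://10.0.0.2:8080", new Random()));
    // Without a notified server the result is just a random permutation.
    System.out.println(candidateOrder(servers, null, new Random()));
  }
}
```

Trying the notifying server first makes sense because that server is known to have seen the change already, whereas another instance might still serve the old configuration for a short time.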

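(3) Next, schedulePeriodicRefresh() starts a thread that calls trySync() every 5 minutes to fetch the latest configuration, which keeps the configuration up to date.

[Figure: schedulePeriodicRefresh pulling the configuration synchronously]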
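
A periodic pull of this kind can be sketched with the JDK's ScheduledExecutorService. The sketch below is an assumption about the general shape, not a copy of the Apollo code: PeriodicRefreshSketch, refreshedAtLeast and the millisecond intervals are hypothetical, and the first run is made to fail on purpose to show why the scheduled task swallows exceptions, in the spirit of a "try"-style sync.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only: runs a refresh task at a fixed rate on a daemon thread.
class PeriodicRefreshSketch {
  // Returns true if the refresh succeeded at least `times` times within the timeout.
  static boolean refreshedAtLeast(int times, long intervalMillis, long timeoutMillis) {
    CountDownLatch successes = new CountDownLatch(times);
    AtomicInteger attempts = new AtomicInteger();
    ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
      Thread t = new Thread(r, "refresh-sketch");
      t.setDaemon(true);
      return t;
    });
    try {
      scheduler.scheduleAtFixedRate(() -> {
        try {
          // Simulate a failed pull on the very first attempt.
          if (attempts.incrementAndGet() == 1) {
            throw new IllegalStateException("simulated pull failure");
          }
          successes.countDown();
        } catch (RuntimeException ex) {
          // Swallow the failure: if the task threw, scheduleAtFixedRate
          // would suppress all subsequent executions.
        }
      }, intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
      return successes.await(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      scheduler.shutdownNow();
    }
  }

  public static void main(String[] args) {
    // Short intervals keep the demo fast; the article's interval is 5 minutes.
    System.out.println(refreshedAtLeast(3, 10, 2000));
  }
}
```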

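(4) Finally, scheduleLongPollingRefresh(). The remoteConfigLongPollService object, likewise initialized in the constructor, creates a thread pool when it is constructed; this method submits the task that runs on that pool. Until it receives an interrupt signal, the task keeps polling for configuration changes.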
private void scheduleLongPollingRefresh() {
  // Submit the task that issues the long-polling HTTP request
  remoteConfigLongPollService.submit(m_namespace, this);
}

public boolean submit(String namespace, RemoteConfigRepository remoteConfigRepository) {
  boolean added = m_longPollNamespaces.put(namespace, remoteConfigRepository);
  m_notifications.putIfAbsent(namespace, INIT_NOTIFICATION_ID);
  if (!m_longPollStarted.get()) {
    startLongPolling();
  }
  return added;
}


private void startLongPolling() {
  // m_longPollStarted is an atomic boolean; true means long polling has already started and must not be started again
  if (!m_longPollStarted.compareAndSet(false, true)) {
    //already started
    return;
  }
  try {
    // Parameters for the configuration request
    final String appId = m_configUtil.getAppId();
    final String cluster = m_configUtil.getCluster();
    final String dataCenter = m_configUtil.getDataCenter();
    final String secret = m_configUtil.getAccessKeySecret();
    // Initial delay before long polling starts, 2s by default
    final long longPollingInitialDelayInMills = m_configUtil.getLongPollingInitialDelayInMills();
    m_longPollingService.submit(new Runnable() {
      @Override
      public void run() {
        if (longPollingInitialDelayInMills > 0) {
          try {
            logger.debug("Long polling will start in {} ms.", longPollingInitialDelayInMills);
            TimeUnit.MILLISECONDS.sleep(longPollingInitialDelayInMills);
          } catch (InterruptedException e) {
            //ignore
          }
        }
        // Core step: receive the configuration changes pushed by the server
        doLongPollingRefresh(appId, cluster, dataCenter, secret);
      }
    });
  } catch (Throwable ex) {
    m_longPollStarted.set(false);
    ApolloConfigException exception =
        new ApolloConfigException("Schedule long polling refresh failed", ex);
    Tracer.logError(exception);
    logger.warn(ExceptionUtil.getDetailMessage(exception));
  }
}
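
The compareAndSet guard at the top of startLongPolling() is what makes the start idempotent even when several namespaces are submitted concurrently. A minimal sketch of that guard follows; StartOnceSketch and its launch counter are hypothetical and stand in for submitting the polling task.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only: a start() that takes effect at most once.
class StartOnceSketch {
  private final AtomicBoolean started = new AtomicBoolean(false);
  private final AtomicInteger launches = new AtomicInteger();

  // Returns true only for the single caller that wins the false -> true transition.
  boolean start() {
    if (!started.compareAndSet(false, true)) {
      return false; // already started by another caller
    }
    launches.incrementAndGet(); // stands in for submitting the polling task
    return true;
  }

  int launches() {
    return launches.get();
  }

  // Calls start() from several threads at once and reports how many launches happened.
  static int launchesAfterConcurrentStarts(int threadCount) {
    StartOnceSketch service = new StartOnceSketch();
    Thread[] threads = new Thread[threadCount];
    for (int i = 0; i < threadCount; i++) {
      threads[i] = new Thread(service::start);
      threads[i].start();
    }
    for (Thread t : threads) {
      try {
        t.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
    return service.launches();
  }

  public static void main(String[] args) {
    System.out.println("launches: " + launchesAfterConcurrentStarts(8));
  }
}
```

A plain `if (!started.get()) { started.set(true); ... }` would leave a window in which two threads both observe false; compareAndSet closes that window by making the check and the update a single atomic step.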

Now look at doLongPollingRefresh(). It runs a while loop that keeps issuing long-polling requests until the thread receives an interrupt signal or a stop command.

private void doLongPollingRefresh(String appId, String cluster, String dataCenter, String secret) {
  ServiceDTO lastServiceDto = null;
  while (!m_longPollingStopped.get() && !Thread.currentThread().isInterrupted()) {
    // Rate limiting
    if (!m_longPollRateLimiter.tryAcquire(5, TimeUnit.SECONDS)) {
      //wait at most 5 seconds
      try {
        TimeUnit.SECONDS.sleep(5);
      } catch (InterruptedException e) {
      }
    }
    Transaction transaction = Tracer.newTransaction("Apollo.ConfigService", "pollNotification");
    String url = null;
    try {
      if (lastServiceDto == null) {
        // Pick one at random from the list of server addresses
        lastServiceDto = this.resolveConfigService();
      }
      // Long-polling URL, e.g. http://192.168.0.1:8080/notifications/v2?cluster=default&appId=apollo-demo&ip=192.168.100.2&notifications=[{"namespaceName":"application","notificationId":-1}]
      url = assembleLongPollRefreshUrl(lastServiceDto.getHomepageUrl(), appId, cluster, dataCenter,
              m_notifications);

      logger.debug("Long polling from {}", url);

      HttpRequest request = new HttpRequest(url);
      // Read timeout is 90s; the server holds a request for at most 60s, so the client timeout must be longer than that
      request.setReadTimeout(LONG_POLLING_READ_TIMEOUT);
      if (!StringUtils.isBlank(secret)) {
        Map<String, String> headers = Signature.buildHttpHeaders(url, appId, secret);
        request.setHeaders(headers);
      }

      transaction.addData("Url", url);
      // Send the GET request; the long-polling behaviour itself is implemented mainly on the server side
      final HttpResponse<List<ApolloConfigNotification>> response =
          m_httpClient.doGet(request, m_responseType);

      logger.debug("Long polling response: {}, url: {}", response.getStatusCode(), url);
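      // Status 200 means some configuration changed, so the local listeners must be notified to fetch it
      if (response.getStatusCode() == 200 && response.getBody() != null) {
        updateNotifications(response.getBody());
        // Update the locally held notification messages
        updateRemoteNotifications(response.getBody());
        transaction.addData("Result", response.getBody().toString());
        // Notify the local repositories to pull the configuration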
        notify(lastServiceDto, response.getBody());
      }

      // Status 304 means nothing changed; with 50% probability reset lastServiceDto to null so the next round picks a server at random
      if (response.getStatusCode() == 304 && ThreadLocalRandom.current().nextBoolean()) {
        lastServiceDto = null;
      }

      m_longPollFailSchedulePolicyInSecond.success();
      transaction.addData("StatusCode", response.getStatusCode());
      transaction.setStatus(Transaction.SUCCESS);
    } catch (Throwable ex) {
      lastServiceDto = null;
      Tracer.logEvent("ApolloConfigException", ExceptionUtil.getDetailMessage(ex));
      transaction.setStatus(ex);
      long sleepTimeInSecond = m_longPollFailSchedulePolicyInSecond.fail();
      logger.warn(
          "Long polling failed, will retry in {} seconds. appId: {}, cluster: {}, namespaces: {}, long polling url: {}, reason: {}",
          sleepTimeInSecond, appId, cluster, assembleNamespaces(), url, ExceptionUtil.getDetailMessage(ex));
      try {
        TimeUnit.SECONDS.sleep(sleepTimeInSecond);
      } catch (InterruptedException ie) {
        //ignore
      }
    } finally {
      transaction.complete();
    }
  }
}
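
The catch block above asks m_longPollFailSchedulePolicyInSecond for a sleep time after each failure and resets it on success. The article does not show how that delay is computed, so the following is only a sketch under the assumption of exponential backoff: the delay starts at a lower bound, doubles on each consecutive failure up to an upper bound, and resets on success. BackoffPolicySketch and the bounds used in main are hypothetical.

```java
// Illustrative sketch only: exponential backoff with a lower and an upper bound.
class BackoffPolicySketch {
  private final long lowerBoundSeconds;
  private final long upperBoundSeconds;
  private long lastDelaySeconds; // 0 means "no failure since the last success"

  BackoffPolicySketch(long lowerBoundSeconds, long upperBoundSeconds) {
    this.lowerBoundSeconds = lowerBoundSeconds;
    this.upperBoundSeconds = upperBoundSeconds;
  }

  // Called after a failed request; returns how long to sleep before retrying.
  long fail() {
    if (lastDelaySeconds == 0) {
      lastDelaySeconds = lowerBoundSeconds;
    } else {
      lastDelaySeconds = Math.min(lastDelaySeconds * 2, upperBoundSeconds);
    }
    return lastDelaySeconds;
  }

  // Called after a successful request; the next failure starts from the lower bound again.
  void success() {
    lastDelaySeconds = 0;
  }

  public static void main(String[] args) {
    BackoffPolicySketch policy = new BackoffPolicySketch(1, 8);
    for (int i = 0; i < 5; i++) {
      System.out.println("retry in " + policy.fail() + "s");
    }
    policy.success();
    System.out.println("after success, retry in " + policy.fail() + "s");
  }
}
```

Backing off this way keeps a client from hammering a config service that is down, while the reset on success restores fast retries once the service is healthy again.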

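At this point the client startup process is essentially complete. The most important operations can be summarized in three points: 1. Service discovery, with a heartbeat check every 5 minutes. 2. A synchronous pull of the configuration, plus a scheduled thread that pulls the configuration every 5 minutes. 3. A thread that keeps issuing long-polling requests; after receiving a request, the server returns status 304 if no configuration changes within 60s, and responds immediately with the change notification if the configuration does change.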