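A Problem Caused by Nacos Configuration Hot Reload

It is easy to step into this trap without noticing: a production incident caused by a Nacos configuration hot reload, which turned out to involve quite a few moving parts.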

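Problem description

Recently we hit a production issue: a project that had been running normally suddenly could not upload images. The logs showed that the upload call failed with the error "OssUpload线程池满" ("OssUpload thread pool is full"). The relevant pseudocode is shown below.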

@Bean
public Executor ossUploadPicExecutor() {
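    // executor is a ThreadPoolTaskExecutor field of the configuration class
    executor.setThreadNamePrefix("ossUpload-");
    executor.setRejectedExecutionHandler((runnable, threadPoolExecutor) -> {
        // rejected tasks are only logged ("OssUpload thread pool is full"), never retried
        log.error("OssUpload线程池满");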
    });
    executor.initialize();
    return executor;
}

public void uploadImage(File image) {
    executor.execute(() -> {
        // do oss upload here
    });
}

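Investigation

Judging from the error log, the first suspect was that uploads were too slow and tasks had piled up in the thread pool. However, the service is probably not called frequently enough to build up a backlog, so the next step was to inspect the thread states with jstack.

The result was surprising: the process did not contain a single thread whose name starts with ossUpload; all image upload threads had "disappeared". A test showed that if the executor thread pool is shut down, a subsequent execute call produces the same "OssUpload线程池满" error, because a closed pool hands every new task to the rejection handler. This raised the suspicion that the production thread pool had been shut down as well. If so, what operation could have shut it down?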
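The test described above can be reproduced without Spring. The following is a minimal sketch that uses the JDK ThreadPoolExecutor (the class that ThreadPoolTaskExecutor delegates to); the class name ShutdownRejectionDemo and the pool sizes are made up for illustration and are not taken from the project.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ShutdownRejectionDemo {

    /** Submits one task to a pool that is already shut down; returns how often the handler fired. */
    static int rejectedAfterShutdown() {
        AtomicInteger rejected = new AtomicInteger();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 4, 60, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(100),
                // Same shape as the handler in the article: it only records the rejection.
                (task, executor) -> rejected.incrementAndGet());

        pool.shutdown();          // the pool is idle and its queue is empty, yet it is closed
        pool.execute(() -> { });  // rejected: a shut-down pool accepts no new tasks
        return rejected.get();
    }

    public static void main(String[] args) {
        System.out.println("rejected tasks: " + rejectedAfterShutdown());
    }
}
```

The pool is nowhere near full here, so a "pool is full" message in the rejection handler can be misleading: the handler fires for a closed pool as well as for a saturated one.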

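A code search showed that only one place in the whole project shuts the thread pool down explicitly; the sample code is below. The @PreDestroy method here was added to implement graceful shutdown (see the separate article on SpringBoot graceful shutdown). Further log inspection did find an "executor.shutdown" entry. So the question became: why was the preDestroy method called back when no shutdown had been triggered?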

@Configuration
public class AliyunOssConfiguration {

    private final ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();

    private OSS oss;

    @PreDestroy
    public void preDestroy() {
        log.info("executor.shutdown");
        executor.shutdown();
        oss.shutdown();
    }

}

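Working backwards from the point where "executor.shutdown" appears in the log, there was Nacos configuration update output just before that moment. Could the configuration hot reload have triggered the preDestroy callback? And since other places in the project also use @PreDestroy, why were those not called back?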

Read the fucking source code

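Since the suspicion is that the Nacos configuration update triggered the problem, let's first look at how Nacos updates configuration. From the source code, the process is roughly as follows.

When the configuration changes, NacosContextRefresher publishes a RefreshEvent to notify the Spring container that the configuration has changed.
RefreshEventListener receives the RefreshEvent and calls refreshEnvironment on ContextRefresher to refresh the environment.
While refreshing the environment, ContextRefresher publishes an EnvironmentChangeEvent, which triggers rebind in ConfigurationPropertiesRebinder.
Finally, rebind destroys and re-initializes each configuration class annotated with @ConfigurationProperties, and destroying the bean invokes its destroy callbacks, that is, the method annotated with @PreDestroy in this project.

Because the other beans in the project are not annotated with @ConfigurationProperties, their @PreDestroy methods were not triggered.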
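To make the consequence of this chain concrete, here is a simplified plain-Java model of the rebind step. It is not Spring's code: ConfigBean, preDestroy, initialize and rebind are hypothetical stand-ins, and the only assumption taken from the analysis above is that rebind runs the destroy callbacks and then re-initializes the same instance.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class RebindModel {

    /** Stand-in for a @ConfigurationProperties class that also owns a pool (hypothetical). */
    static class ConfigBean {
        final ExecutorService executor = Executors.newFixedThreadPool(1);
        int maxPoolSize = 4;

        void preDestroy() {      // plays the role of the @PreDestroy method
            executor.shutdown();
        }

        void initialize() {      // plays the role of the init callbacks; nothing re-creates the pool
        }
    }

    /** Simplified rebind: destroy callbacks, new property values, init callbacks, same instance. */
    static void rebind(ConfigBean bean, int newMaxPoolSize) {
        bean.preDestroy();
        bean.maxPoolSize = newMaxPoolSize;
        bean.initialize();
    }

    public static void main(String[] args) {
        ConfigBean bean = new ConfigBean();
        rebind(bean, 8);
        System.out.println("maxPoolSize=" + bean.maxPoolSize
                + ", pool shut down=" + bean.executor.isShutdown());
    }
}
```

In this model the properties end up with their new values, but the pool held in the final field stays closed, because nothing in the rebind path constructs a new one. That matches the symptom seen in production.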

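Why does this mechanism exist

Knowing the call path of @PreDestroy is not enough; we still need to understand why this mechanism exists.

We normally use configuration values in one of the following two ways.

Reading on demand: once a configuration value has changed, the next request sees the latest value. This is the most common scenario that configuration hot reload covers. (In Spring Cloud, a @Value field is only re-injected when its bean is re-created, for example a bean in refresh scope.)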

@Value("${xConfig}")
private String xConfig;

@GetMapping("/demo")
public String demo() {
	return xConfig;
}

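Reading once during initialization: the configuration is read a single time, for example when creating a thread pool at startup whose parameters are specified in the configuration file.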

@Configuration
@ConfigurationProperties(prefix = "aliyun.oss")
public class AliyunOssConfiguration {

	// value bound from the configuration file (setter omitted)
	private int maxPoolSize;
		
	@Bean
	public Executor ossUploadPicExecutor() {
		ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
		executor.setMaxPoolSize(maxPoolSize);
		executor.initialize();
		return executor;
	}
}

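It is not hard to see that the destroy call on the configuration class described earlier serves this second scenario: destroy, then rebuild, so that our beans are re-created with the new configuration and hot reload is achieved.

Updating a Bean

Having guessed (probably) why this mechanism exists, the next question is how to update our beans with the new configuration. Taking the earlier example, once the thread pool has been created, Spring injects it into every place that references this bean.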

@RestController
public class DemoController {
	
	@Resource
	Executor executor;
}

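If we now want to build the thread pool from the new configuration, we must not only destroy the old pool (assuming its parameters cannot be adjusted dynamically) but also create a new one, and the new pool has to be re-injected everywhere the bean is referenced. That re-injection could be very expensive.

Is there a better way to update a bean? One fairly direct idea, shown below, is to add a layer of wrapping that holds the Executor to be updated. Other code no longer references the Executor directly but references ExecutorWrapper, so after re-creating the Executor we only need to store it in ExecutorWrapper.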

@Component
public class ExecutorWrapper {
	
	@Resource
	Executor executor;
	
	// executor getter and setter
}

@RestController
public class DemoController {
	
	@Resource
	ExecutorWrapper executorWrapper;
}

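This works, but it touches a lot of code: every existing executor.execute() call has to become executorWrapper.getExecutor().execute(). Thinking one step further, what if the wrapper were replaced by a proxy (similar to AOP around advice)? That would remove the getExecutor step. Spring has most likely already solved this, though, so there is no need to implement it ourselves; keep the program DRY (Don't Repeat Yourself).

At this point we can pretty much guess how Spring implements bean updates.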
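The proxy idea can be sketched with a JDK dynamic proxy. This is only an illustration of the concept under assumed names (SwappableExecutorDemo, proxyFor), not how Spring implements it: callers hold one Executor reference for the whole lifetime of the application, while the target behind it is swapped.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicReference;

class SwappableExecutorDemo {

    /** Returns an Executor proxy that forwards every call to whatever target is currently set. */
    static Executor proxyFor(AtomicReference<Executor> target) {
        InvocationHandler handler = (proxy, method, args) -> method.invoke(target.get(), args);
        return (Executor) Proxy.newProxyInstance(
                SwappableExecutorDemo.class.getClassLoader(),
                new Class<?>[] {Executor.class},
                handler);
    }

    /** Runs one task before and one after swapping the target; returns which target ran each. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        Executor oldPool = task -> { log.add("old"); task.run(); };
        Executor newPool = task -> { log.add("new"); task.run(); };

        AtomicReference<Executor> target = new AtomicReference<>(oldPool);
        Executor injected = proxyFor(target);   // what callers would hold on to

        injected.execute(() -> { });
        target.set(newPool);                    // "refresh": callers keep the same reference
        injected.execute(() -> { });
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());              // prints [old, new]
    }
}
```

Compared with the wrapper class, call sites stay as executor.execute(...), and only the code that performs the swap needs to know about the indirection.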

RefreshScope

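Skipping the exploration in between and going straight to the conclusion: Spring has indeed prepared for the hot reload scenario above, and it is implemented with a proxy class, as guessed. The simple example below shows how to use the @RefreshScope annotation to update a bean.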

@Slf4j
@Component
@RefreshScope
public class TestRefreshScope {

    public String createdAt;

    public TestRefreshScope() {
        log.info("re-created");
        createdAt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(new Date());
    }

    public String getCreatedAt() {
        return createdAt;
    }
}

@RestController
public class DemoController {

    @Resource
    private TestRefreshScope testRefreshScope;

    @GetMapping("/check")
    public Object o() {
        return testRefreshScope.getCreatedAt();
    }
}

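Calling the endpoints shows the behaviour: before /actuator/refresh is called to refresh the environment (equivalent to updating the configuration in Nacos), two reads of createdAt from TestRefreshScope return the same value. After the refresh, createdAt changes, and it then stays the same as long as the environment is not refreshed again.

A quick debugging session confirms that @RefreshScope updates the bean in the way guessed earlier. Without @RefreshScope, what Spring injects is a plain object.

With @RefreshScope added, what Spring injects is an enhanced (proxy) object.

Note that an object annotated with @RefreshScope is not updated immediately after /actuator/refresh; it is re-created only on the next access. Also, member variables cannot be read directly through the proxy object, as in testRefreshScope.createdAt; they can only be reached through a method call such as testRefreshScope.getCreatedAt().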
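The "re-created on next access" behaviour can be modelled in a few lines of plain Java. This is a simplified sketch, not Spring's RefreshScope implementation; RefreshableHolder and its methods are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.function.Supplier;

class LazyRefreshModel {

    /** Hypothetical holder: caches a target, drops it on refresh(), re-creates it on the next get(). */
    static class RefreshableHolder<T> {
        private final Supplier<T> factory;
        private T cached;

        RefreshableHolder(Supplier<T> factory) {
            this.factory = factory;
        }

        synchronized T get() {
            if (cached == null) {
                cached = factory.get();   // created on the first access after a refresh
            }
            return cached;
        }

        synchronized void refresh() {
            cached = null;                // only evicts; nothing is created here
        }
    }

    /** Returns how many instances the factory had produced at each step. */
    static int[] run() {
        int[] created = {0};
        RefreshableHolder<Object> holder = new RefreshableHolder<>(() -> {
            created[0]++;
            return new Object();
        });

        holder.get();
        holder.get();
        int beforeRefresh = created[0];     // both calls saw the same instance
        holder.refresh();
        int afterRefresh = created[0];      // refresh alone creates nothing
        holder.get();
        int afterNextAccess = created[0];   // re-created lazily on access
        return new int[] {beforeRefresh, afterRefresh, afterNextAccess};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));   // prints [1, 1, 2]
    }
}
```

The model also hints at why method calls matter: only get() goes through the holder, so anything that bypasses it (such as reading a field on the outer object) never reaches the current target.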

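Should we use configuration hot reload

We now roughly know what configuration hot reload affects and how to consume hot-reloaded configuration, but one question remains: should hot reload be used at all, or should its scope be restricted? Take the thread pool update mentioned earlier: we probably do not want the pool to be restarted while the program is running, because that affects the business. At least for now, Spring's refresh is not scoped so that changing one configuration value affects only one configuration class; it simply does a refreshAll. It is quite possible that, without anyone realizing it, changing something like the log retention days restarts a thread pool or makes a database connection pool disconnect and reconnect.

After weighing this, only configuration that is read on demand is allowed to hot reload. Hot reload is not enabled for thread pools that need re-initialization; they read the new configuration only when the service restarts.

The final change is shown below: remove the configuration-update callback that caused the thread pool to restart, and shut the pool down only when the server stops.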

diff --git a/AliyunOssConfiguration.java b/AliyunOssConfiguration.java
index 633a8af..02762cb 100644
--- a/AliyunOssConfiguration.java
+++ b/AliyunOssConfiguration.java
@@ -77,41 +68,8 @@ public class AliyunOssConfiguration {
     /** Idle time allowed for threads, in seconds */
     private int keepAliveSeconds = 5;

-    private final ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
-
-    private OSS oss;
-
     @PreDestroy
     public void preDestroy() {
-        log.info("executor.shutdown");
-        executor.shutdown();
-        oss.shutdown();
-    }
-
-    @Bean
-    public OSS initClient() {
-        ClientBuilderConfiguration conf = new ClientBuilderConfiguration();
-        conf.setMaxConnections(maxConnectionsCount);
-        conf.setSocketTimeout(socketTimeout);
-        conf.setConnectionTimeout(connectionTimeout);
-        conf.setConnectionRequestTimeout(connectionRequestTimeout);
-        conf.setIdleConnectionTime(idleConnectionTime);
-        oss =  new OSSClientBuilder().build(this.endPoint, this.accessKeyId, this.accessKeySecret, conf);
-        return oss;
+        log.info("AliyunOssConfiguration PreDestroy");
     }
-
-    @Bean
-    public Executor ossUploadPicExecutor() {
-        executor.setCorePoolSize(corePoolSize);
-        executor.setMaxPoolSize(maxPoolSize);
-        executor.setQueueCapacity(queueCapacity);
-        executor.setKeepAliveSeconds(keepAliveSeconds);
-        executor.setThreadNamePrefix("ossUpload-");
-        executor.setRejectedExecutionHandler((runnable, threadPoolExecutor) -> {
-            log.error("OssUpload线程池满");
-        });
-        executor.initialize();
-        return executor;
-    }
-
 }

diff --git a/AliyunOssFactory.java b/AliyunOssFactory.java
new file mode 100644
index 0000000..203c6eb
--- /dev/null
+++ b/AliyunOssFactory.java
@@ -0,0 +1,48 @@
+@Slf4j
+@Component
+public class AliyunOssFactory {
+
+    @Resource
+    public AliyunOssConfiguration aliyunOssConfiguration;
+
+    @Bean(name = "ossClient", destroyMethod = "shutdown")
+    public OSS initClient() {
+        log.info("initClient endPoint:" + aliyunOssConfiguration.getEndPoint());
+        ClientBuilderConfiguration conf = new ClientBuilderConfiguration();
+        conf.setMaxConnections(aliyunOssConfiguration.getMaxConnectionsCount());
+        conf.setSocketTimeout(aliyunOssConfiguration.getSocketTimeout());
+        conf.setConnectionTimeout(aliyunOssConfiguration.getConnectionTimeout());
+        conf.setConnectionRequestTimeout(aliyunOssConfiguration.getConnectionRequestTimeout());
+        conf.setIdleConnectionTime(aliyunOssConfiguration.getIdleConnectionTime());
+        return new OSSClientBuilder().build(aliyunOssConfiguration.getEndPoint(),
+                aliyunOssConfiguration.getAccessKeyId(), aliyunOssConfiguration.getAccessKeySecret(), conf);
+    }
+
+    @Bean(name = "ossUploadPicExecutor", destroyMethod = "shutdown")
+    public ThreadPoolTaskExecutor ossUploadPicExecutor(AliyunOssConfiguration config) {
+        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
+        executor.setCorePoolSize(config.getCorePoolSize());
+        executor.setMaxPoolSize(config.getMaxPoolSize());
+        executor.setQueueCapacity(config.getQueueCapacity());
+        executor.setKeepAliveSeconds(config.getKeepAliveSeconds());
+        executor.setThreadNamePrefix("ossUpload-");
+        executor.setRejectedExecutionHandler((runnable, threadPoolExecutor) -> {
+            log.error("OssUpload线程池满");
+        });
+        executor.initialize();
+        return executor;
+    }
+}
+

Original article: coding.bozhen.live/#/articles/…