【问题记录】HikariCP 连接池线上问题排查

6,993 阅读4分钟

问题描述

这几天线上遇到了一个问题,就是应用程序执行一段时间后,会突然报数据库连接超时,然后程序就一直处于不可用状态。

### Cause: org.springframework.jdbc.CannotGetJdbcConnectionException: Failed to obtain JDBC Connection; nested exception is java.sql.SQLTransientConnectionException: LoanMarketPool - Connection is not available, request timed out after 30001ms.

配置如下

configuration:
connectionTimeout...............30000
driverClassName................."com.mysql.jdbc.Driver"
idleTimeout.....................600000
maxLifetime.....................900000
maximumPoolSize.................50
minimumIdle.....................50

排查过程

一开始怀疑是连接池不够用导致,通过将连接池数量改大重新运行了一段时间后还是会抛这个异常,于是通过修改nacos配置放开了 hikari 连接池的 DEBUG 日志来定位问题。

logging:
    com:
      zaxxer:
        hikari: debug

Hikari 初始化的时候,会先调用 checkFailFast 方法创建一个连接并放入池中以检查创建数据库连接是否有问题。

private void checkFailFast()
   {
      ......
      do {
         final PoolEntry poolEntry = createPoolEntry();
         if (poolEntry != null) {
            if (config.getMinimumIdle() > 0) {
               connectionBag.add(poolEntry);
               logger.debug("{} - Added connection {}", poolName, poolEntry.connection);
            }
            ......
         }
      } while (elapsedMillis(startTime) < initializationTimeout);
   }

然后在通过创建 houseKeeperTask 定时任务来检查连接状态,并调用 logPoolState 打印连接池里的连接数量信息。

void logPoolState(String... prefix)
   {
      if (logger.isDebugEnabled()) {
         logger.debug("{} - {}stats (total={}, active={}, idle={}, waiting={})",
                      poolName, (prefix.length > 0 ? prefix[0] : ""),
                      getTotalConnections(), getActiveConnections(), getIdleConnections(), getThreadsAwaitingConnection());
      }
   }

它每过一段时间就会检查连接池里的连接是否超过最小空闲超时时间,以及连接最大存活时间,如果有超时并且没有在使用的连接的话就会被关闭。同时在关闭完连接之后会调用 fillPool 方法来填充连接池最小连接数量

private synchronized void fillPool()
   {
      final int connectionsToAdd = Math.min(config.getMaximumPoolSize() - getTotalConnections(), config.getMinimumIdle() - getIdleConnections())
                                   - addConnectionQueueReadOnlyView.size();
      if (connectionsToAdd <= 0) logger.debug("{} - Fill pool skipped, pool is at sufficient level.", poolName);

      for (int i = 0; i < connectionsToAdd; i++) {
         addConnectionExecutor.submit((i < connectionsToAdd - 1) ? poolEntryCreator : postFillPoolEntryCreator);
      }
   }

刚开始启动的时候,Hikari 会填充连接直到 minimumIdle 数量为止,但是通过日志发现,并不是每次启动都会填充至 minimumIdle 数量,每次启动的数量都是不固定的。

2021-01-20 00:50:31.964 [LoanMarketPool housekeeper] DEBUG c.z.h.p.HikariPool - [logPoolState,421] - LoanMarketPool - Pool stats (total=1, active=1, idle=0, waiting=0)
2021-01-20 00:50:32.023 [LoanMarketPool connection adder] DEBUG c.z.h.p.HikariPool - [call,728] - LoanMarketPool - Added connection com.mysql.jdbc.JDBC4Connection@3f3c777
2021-01-20 00:50:32.047 [LoanMarketPool connection adder] DEBUG c.z.h.p.HikariPool - [call,728] - LoanMarketPool - Added connection com.mysql.jdbc.JDBC4Connection@777d19cc
2021-01-20 00:50:32.073 [LoanMarketPool connection adder] DEBUG c.z.h.p.HikariPool - [call,728] - LoanMarketPool - Added connection com.mysql.jdbc.JDBC4Connection@41062367
2021-01-20 00:50:32.099 [LoanMarketPool connection adder] DEBUG c.z.h.p.HikariPool - [call,728] - LoanMarketPool - Added connection com.mysql.jdbc.JDBC4Connection@2290fc45
2021-01-20 00:50:32.123 [LoanMarketPool connection adder] DEBUG c.z.h.p.HikariPool - [call,728] - LoanMarketPool - Added connection com.mysql.jdbc.JDBC4Connection@4f3653a2
2021-01-20 00:50:32.167 [LoanMarketPool connection adder] DEBUG c.z.h.p.HikariPool - [call,728] - LoanMarketPool - Added connection com.mysql.jdbc.JDBC4Connection@5d41f91
2021-01-20 00:50:32.201 [LoanMarketPool connection adder] DEBUG c.z.h.p.HikariPool - [call,728] - LoanMarketPool - Added connection com.mysql.jdbc.JDBC4Connection@12e3e900
2021-01-20 00:50:32.227 [LoanMarketPool connection adder] DEBUG c.z.h.p.HikariPool - [call,728] - LoanMarketPool - Added connection com.mysql.jdbc.JDBC4Connection@46a02511
2021-01-20 00:50:32.253 [LoanMarketPool connection adder] DEBUG c.z.h.p.HikariPool - [call,728] - LoanMarketPool - Added connection com.mysql.jdbc.JDBC4Connection@1149ea06
2021-01-20 00:50:32.278 [LoanMarketPool connection adder] DEBUG c.z.h.p.HikariPool - [call,728] - LoanMarketPool - Added connection com.mysql.jdbc.JDBC4Connection@58940a8d
2021-01-20 00:50:32.304 [LoanMarketPool connection adder] DEBUG c.z.h.p.HikariPool - [call,728] - LoanMarketPool - Added connection com.mysql.jdbc.JDBC4Connection@6ac7ac81
2021-01-20 00:50:32.328 [LoanMarketPool connection adder] DEBUG c.z.h.p.HikariPool - [call,728] - LoanMarketPool - Added connection com.mysql.jdbc.JDBC4Connection@47afa53b
2021-01-20 00:51:01.967 [LoanMarketPool housekeeper] DEBUG c.z.h.p.HikariPool - [logPoolState,421] - LoanMarketPool - Pool stats (total=13, active=0, idle=13, waiting=0)
2021-01-20 00:51:31.967 [LoanMarketPool housekeeper] DEBUG c.z.h.p.HikariPool - [logPoolState,421] - LoanMarketPool - Pool stats (total=13, active=0, idle=13, waiting=0)

这里为了快速暴露问题,将 maxlifetime 修改为 90 秒,然后发现在 houseKeeper 将超过最大生命周期的连接关闭之后,没有再往连接池里添加新的连接,导致最后连接池里面的连接最后没有可用的连接,程序抛出异常。

2021-01-20 00:54:26.239 [http-nio-8085-exec-8] DEBUG c.z.h.p.HikariPool - [addBagItem,342] - LoanMarketPool - Add connection elided, waiting 1, queue 50
2021-01-20 00:54:31.971 [LoanMarketPool housekeeper] DEBUG c.z.h.p.HikariPool - [logPoolState,421] - LoanMarketPool - Pool stats (total=0, active=0, idle=0, waiting=1)
2021-01-20 00:54:31.971 [LoanMarketPool housekeeper] DEBUG c.z.h.p.HikariPool - [fillPool,517] - LoanMarketPool - Fill pool skipped, pool is at sufficient level.

Add connection elided, waiting 1, queue 50 通过这行日志得知,其实 houseKeeper 是有再通过 fillPool 方法往 addConnectionExecutor 线程池提交创建数据库连接的任务的,但不知道为什么好像任务被阻塞了,无法创建连接,导致后面的添加数据库连接任务一直在队列中排队着。

总结

目前能力有限,就只能排查到这里了,目前我是通过修改 maxlifetime 这个参数,把它设置为 0 来让 houseKeeper 每次执行定时任务的时候不会关闭连接,只要刚开始启动的时候将连接池填满,并且里面的连接永远不被清空就不会有问题,但这也只是缓兵之计,可能要接着往下排查我呢提还是需要理解底层 Socket,同时我也在 GitHub 的 Hikari 项目底下找到了类似问题的 issue 地址贴在下面,希望对你有所帮助。 github.com/brettwooldr…