记一次连接泄漏问题排查

695 阅读1分钟

最近发现某个服务偶发性抛出如下错误:

Caused by: java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 30000ms.
        at com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:676)
        at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:190)
        at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:155)
        at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:128)

首先进行常见的检查:数据库连接数、CPU、慢查询一切正常,服务 QPS、RT、网络连接也均无异常。

检查源码了解是有个异步线程创建连接:

@Override
public void addBagItem(final int waiting)
{
   final boolean shouldAdd = waiting - addConnectionQueue.size() >= 0; // Yes, >= is intentional.
   if (shouldAdd) {
      addConnectionExecutor.submit(POOL_ENTRY_CREATOR);
   }
}

然而检查线程状态却是在等待任务,这就显得非常奇怪。

"HikariPool-1 connection adder" Id=8393 TIMED_WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@20dfb873
        at sun.misc.Unsafe.park(Native Method)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
        at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

进一步检查源码,有如下判断逻辑:

private boolean shouldCreateAnotherConnection() {
   return getTotalConnections() < config.getMaximumPoolSize() &amp;&amp;
      (connectionBag.getWaitingThreadCount() > 0 || getIdleConnections() < config.getMinimumIdle());
}

使用 Arthas 发现该方法返回 False ( 即意味着不会创建连接 ),进一步检查发现,现有的 PoolEntry ( 代表连接 ) 均处于 IN_USE 状态,且 lastAccessed 都是一两天前的时间,这看起来是明显是发生了连接泄漏。

打开连接泄漏探测等待一两天,得到如下调用堆栈:

java.lang.Exception: Apparent connection leak detected
        at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:128) ~[HikariCP-3.2.0.jar!/:?]
        at com.zaxxer.hikari.HikariDataSource$$FastClassBySpringCGLIB$$eeb1ae86.invoke(<generated>) ~[HikariCP-3.2.0.jar!/:?]
        at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218) ~[spring-core-5.1.18.RELEASE.jar!/:5.1.18.RELEASE]
        at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:752) ~[spring-aop-5.1.18.RELEASE.jar!/:5.1.18.RELEASE]
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163) ~[spring-aop-5.1.18.RELEASE.jar!/:5.1.18.RELEASE]
        at org.springframework.aop.support.DelegatingIntroductionInterceptor.doProceed(DelegatingIntroductionInterceptor.java:136) ~[spring-aop-5.1.18.RELEASE.jar!/:5.1.18.RELEASE]
        at org.springframework.aop.support.DelegatingIntroductionInterceptor.invoke(DelegatingIntroductionInterceptor.java:124) ~[spring-aop-5.1.18.RELEASE.jar!/:5.1.18.RELEASE]
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) ~[spring-aop-5.1.18.RELEASE.jar!/:5.1.18.RELEASE]
        at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:691) ~[spring-aop-5.1.18.RELEASE.jar!/:5.1.18.RELEASE]
        at com.zaxxer.hikari.HikariDataSource$$EnhancerBySpringCGLIB$$f71d77fa.getConnection(<generated>) ~[HikariCP-3.2.0.jar!/:?]
        at com.tencent.tsf.sleuth.instrument.jdbc.proxy.TraceJdbcDataSource.getConnection(TraceJdbcDataSource.java:68) ~[spring-cloud-tsf-sleuth-1.23.16-Greenwich-RELEASE.jar!/:1.23.16-Greenwich-RELEASE]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_232]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_232]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_232]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_232]
        at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:688) ~[Qoconut-1.0-SNAPSHOT.jar:1.0-SNAPSHOT]
        at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:722) ~[Qoconut-1.0-SNAPSHOT.jar:1.0-SNAPSHOT]
        at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeWithType(BeanSerializerBase.java:607) ~[Qoconut-1.0-SNAPSHOT.jar:1.0-SNAPSHOT]
        at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:729) ~[Qoconut-1.0-SNAPSHOT.jar:1.0-SNAPSHOT]
        at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:722) ~[Qoconut-1.0-SNAPSHOT.jar:1.0-SNAPSHOT]
        at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:166) ~[Qoconut-1.0-SNAPSHOT.jar:1.0-SNAPSHOT]
        at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:727) ~[Qoconut-1.0-SNAPSHOT.jar:1.0-SNAPSHOT]
        at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:722) ~[Qoconut-1.0-SNAPSHOT.jar:1.0-SNAPSHOT]
        at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeWithType(BeanSerializerBase.java:607) ~[Qoconut-1.0-SNAPSHOT.jar:1.0-SNAPSHOT]
        at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:729) ~[Qoconut-1.0-SNAPSHOT.jar:1.0-SNAPSHOT]
        at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:722) ~[Qoconut-1.0-SNAPSHOT.jar:1.0-SNAPSHOT]
        at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:166) ~[Qoconut-1.0-SNAPSHOT.jar:1.0-SNAPSHOT]
        at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider._serialize(DefaultSerializerProvider.java:480) ~[Qoconut-1.0-SNAPSHOT.jar:1.0-SNAPSHOT]
        at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:400) ~[Qoconut-1.0-SNAPSHOT.jar:1.0-SNAPSHOT]
        at com.fasterxml.jackson.databind.ObjectWriter$Prefetch.serialize(ObjectWriter.java:1429) ~[Qoconut-1.0-SNAPSHOT.jar:1.0-SNAPSHOT]
        at com.fasterxml.jackson.databind.ObjectWriter._configAndWriteValue(ObjectWriter.java:1135) ~[Qoconut-1.0-SNAPSHOT.jar:1.0-SNAPSHOT]
        at com.fasterxml.jackson.databind.ObjectWriter.writeValueAsString(ObjectWriter.java:1005) ~[Qoconut-1.0-SNAPSHOT.jar:1.0-SNAPSHOT]
        at org.apache.skywalking.apm.commons.replay.utils.JsonUtils.toJsonWithType(JsonUtils.java:159) ~[apm-replay-2.0-SNAPSHOT.jar!/:?]
        at org.apache.skywalking.apm.plugin.guaua.cache.v1.GuauaCacheInterceptor.afterMethod(GuauaCacheInterceptor.java:116) ~[?:?]
        at org.apache.skywalking.apm.agent.core.plugin.interceptor.enhance.InstMethodsInter.intercept(InstMethodsInter.java:97) ~[skywalking-agent.jar:8.1.0]
        at com.google.common.cache.LocalCache$LocalManualCache.getIfPresent(LocalCache.java) ~[guava-20.0.jar!/:?]
        at com.github.pagehelper.cache.GuavaCache.get(GuavaCache.java:67) ~[pagehelper-5.1.4.jar!/:?]
        at com.github.pagehelper.PageInterceptor.intercept(PageInterceptor.java:106) ~[pagehelper-5.1.4.jar!/:?]
        at org.apache.ibatis.plugin.Plugin.invoke(Plugin.java:61) ~[mybatis-3.5.3.jar!/:3.5.3]
        at com.sun.proxy.$Proxy410.query(Unknown Source) ~[?:?]
        at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:147) ~[mybatis-3.5.3.jar!/:3.5.3]
        at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:140) ~[mybatis-3.5.3.jar!/:3.5.3]
        at sun.reflect.GeneratedMethodAccessor563.invoke(Unknown Source) ~[?:?]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_232]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_232]
        at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:426) ~[mybatis-spring-2.0.3.jar!/:2.0.3]
        at com.sun.proxy.$Proxy238.selectList(Unknown Source) ~[?:?]
        at org.mybatis.spring.SqlSessionTemplate.selectList(SqlSessionTemplate.java:223) ~[mybatis-spring-2.0.3.jar!/:2.0.3]
        at org.apache.ibatis.binding.MapperMethod.executeForMany(MapperMethod.java:147) ~[mybatis-3.5.3.jar!/:3.5.3]
        at org.apache.ibatis.binding.MapperMethod.execute(MapperMethod.java:80) ~[mybatis-3.5.3.jar!/:3.5.3]
        at org.apache.ibatis.binding.MapperProxy.invoke(MapperProxy.java:93) ~[mybatis-3.5.3.jar!/:3.5.3]
        at com.sun.proxy.$Proxy288.selectPageByRequestVo(Unknown Source) ~[?:?]

分析堆栈发现,在 TraceJdbcConnection 类中有如下方法:

public Connection getConnection() throws SQLException {
    return new TraceJdbcConnection(this.dataSource.getConnection(), this.tracing);
}

至此,一切清晰明了:某个方法使用 Jackson 框架序列化某个对象,该对象持有 TraceJdbcConnection 对象引用,由于 getConnection() 符合 Getter 格式会在序列化过程中被执行,从而意外触发创建连接,但没有释放。

无感知的代码注入确实致命!