Notes on Troubleshooting a Production Full GC Problem


Symptoms

[root@2wkvx /]# jstat -gc 1 2000
S0C    S1C    S0U    S1U      EC       EU        OC         OU       MC     MU    CCSC   CCSU   YGC     YGCT    FGC    FGCT     GCT   
465408.0 465920.0  0.0    0.0   446464.0 446464.0 2796544.0  2796531.9  109388.0 104798.1 12364.0 11638.8    181    4.508 4129   914.951  919.459
465408.0 465920.0  0.0    0.0   446464.0 446464.0 2796544.0  2796536.4  109388.0 104798.1 12364.0 11638.8    181    4.508 4139   916.811  921.318
465408.0 465920.0  0.0    0.0   446464.0 446464.0 2796544.0  2796542.0  109388.0 104798.1 12364.0 11638.8    181    4.508 4150   918.822  923.330
465408.0 465920.0  0.0    0.0   446464.0 446464.0 2796544.0  2796541.4  109388.0 104798.1 12364.0 11638.8    181    4.508 4160   920.780  925.287
465408.0 465920.0  0.0    0.0   446464.0 446464.0 2796544.0  2796543.5  109388.0 104798.1 12364.0 11638.8    181    4.508 4171   922.759  927.267
465408.0 465920.0  0.0    0.0   446464.0 446464.0 2796544.0  2796544.0  109388.0 104798.1 12364.0 11638.8    181    4.508 4182   924.802  929.310
465408.0 465920.0  0.0    0.0   446464.0 446464.0 2796544.0  2796544.0  109388.0 104798.1 12364.0 11638.8    181    4.508 4193   926.755  931.263
465408.0 465920.0  0.0    0.0   446464.0 446464.0 2796544.0  2796544.0  109388.0 104798.1 12364.0 11638.8    181    4.508 4204   928.730  933.238
465408.0 465920.0  0.0    0.0   446464.0 446455.5 2796544.0  2796542.0  109388.0 104798.1 12364.0 11638.8    181    4.508 4215   930.797  935.304
465408.0 465920.0  0.0    0.0   446464.0 446464.0 2796544.0  2796544.0  109388.0 104798.1 12364.0 11638.8    181    4.508 4222   932.032  936.540
465408.0 465920.0  0.0    0.0   446464.0 446464.0 2796544.0  2796521.6  109388.0 104798.5 12364.0 11638.8    181    4.508 4230   934.761  939.269
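* S0C: capacity of the first survivor space (KB)
* S1C: capacity of the second survivor space (KB)
* S0U: used size of the first survivor space (KB)
* S1U: used size of the second survivor space (KB)
* EC: capacity of the Eden space (KB)
* EU: used size of the Eden space (KB)
* OC: capacity of the old generation (KB)
* OU: used size of the old generation (KB)
* MC: capacity of the Metaspace (KB)
* MU: used size of the Metaspace (KB)
* CCSC: capacity of the compressed class space (KB)
* CCSU: used size of the compressed class space (KB)
* YGC: number of young generation GC events
* YGCT: time spent in young generation GC (seconds)
* FGC: number of full GC events
* FGCT: time spent in full GC (seconds)
* GCT: total time spent in GC (seconds)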
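
To make the jstat output above concrete, the sketch below parses two consecutive rows and derives how full the old generation is and what share of wall-clock time went to full GC. The class and method names (`JstatGcDelta`, `parse`, `fullGcTimeFraction`) are made up for illustration, and it assumes the 17-column Java 8 layout shown above and that the 2000 ms sampling interval is exact. For the first two rows this works out to 10 full GCs and roughly 1.86 s of full GC time inside a 2 s window (about 93%), with the old generation essentially 100% full: the JVM is doing almost nothing but full GC and reclaiming almost nothing.

```java
import java.util.Locale;

/** Hypothetical helper: derives GC overhead from two `jstat -gc` rows. */
final class JstatGcDelta {

    /** The subset of one `jstat -gc` row that this sketch needs. */
    static final class Sample {
        final double oldCapacityKb;  // OC
        final double oldUsedKb;      // OU
        final long fullGcCount;      // FGC
        final double fullGcSeconds;  // FGCT

        Sample(double oc, double ou, long fgc, double fgct) {
            this.oldCapacityKb = oc;
            this.oldUsedKb = ou;
            this.fullGcCount = fgc;
            this.fullGcSeconds = fgct;
        }
    }

    /** Parses one data row; assumes the column order
     *  S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT FGC FGCT GCT. */
    static Sample parse(String row) {
        String[] c = row.trim().split("\\s+");
        return new Sample(Double.parseDouble(c[6]), Double.parseDouble(c[7]),
                Long.parseLong(c[14]), Double.parseDouble(c[15]));
    }

    /** Fraction of the old generation in use (0.0 to 1.0). */
    static double oldGenUtilization(Sample s) {
        return s.oldUsedKb / s.oldCapacityKb;
    }

    /** Fraction of the sampling interval spent in full GC between two rows. */
    static double fullGcTimeFraction(Sample earlier, Sample later, double intervalSeconds) {
        return (later.fullGcSeconds - earlier.fullGcSeconds) / intervalSeconds;
    }

    public static void main(String[] args) {
        // The first two data rows from the jstat output above.
        Sample a = parse("465408.0 465920.0 0.0 0.0 446464.0 446464.0 2796544.0 2796531.9 "
                + "109388.0 104798.1 12364.0 11638.8 181 4.508 4129 914.951 919.459");
        Sample b = parse("465408.0 465920.0 0.0 0.0 446464.0 446464.0 2796544.0 2796536.4 "
                + "109388.0 104798.1 12364.0 11638.8 181 4.508 4139 916.811 921.318");

        System.out.println(String.format(Locale.ROOT,
                "full GCs in interval: %d, time in full GC: %.1f%%, old gen used: %.2f%%",
                b.fullGcCount - a.fullGcCount,
                100 * fullGcTimeFraction(a, b, 2.0),
                100 * oldGenUtilization(b)));
    }
}
```

Note that YGC stays at 181 throughout: with Eden full and the old generation unable to accept promotions, only full collections are running.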

[root@kuai-websocket-77b9dc5dbf-8z5mp /]# jinfo 1
VM Flags:
Non-default VM flags: -XX:CICompilerCount=15 -XX:InitialHeapSize=1073741824 -XX:MaxHeapSize=4294967296 -XX:MaxNewSize=1431306240 -XX:MinHeapDeltaBytes=524288 -XX:NewSize=357564416 -XX:OldSize=716177408 -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseFastUnorderedTimeStamps -XX:+UseParallelGC
Command line:  -Xms1024m -Xmx4096m

[root@kuai-server-757799875b-2wkvx /]# jmap -heap 1
Attaching to process ID 1, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.181-b13


using thread-local object allocation.
Parallel GC with 33 thread(s)


Heap Configuration:
   MinHeapFreeRatio         = 0
   MaxHeapFreeRatio         = 100
   MaxHeapSize              = 4294967296 (4096.0MB)
   NewSize                  = 357564416 (341.0MB)
   MaxNewSize               = 1431306240 (1365.0MB)
   OldSize                  = 716177408 (683.0MB)
   NewRatio                 = 2
   SurvivorRatio            = 8
   MetaspaceSize            = 21807104 (20.796875MB)
   CompressedClassSpaceSize = 1073741824 (1024.0MB)
   MaxMetaspaceSize         = 17592186044415 MB
   G1HeapRegionSize         = 0 (0.0MB)


Heap Usage:
PS Young Generation
Eden Space:
   capacity = 457179136 (436.0MB)
   used     = 457179136 (436.0MB)
   free     = 0 (0.0MB)
   100.0% used
From Space:
   capacity = 476577792 (454.5MB)
   used     = 0 (0.0MB)
   free     = 476577792 (454.5MB)
   0.0% used
To Space:
   capacity = 477102080 (455.0MB)
   used     = 0 (0.0MB)
   free     = 477102080 (455.0MB)
   0.0% used
PS Old Generation
   capacity = 2863661056 (2731.0MB)
   used     = 2861965360 (2729.382858276367MB)
   free     = 1695696 (1.6171417236328125MB)
   99.94078572963629% used

[root@kuai-server-757799875b-slmzd /]#  jmap -histo:live 1|grep WsFrameServer
175:           202          22624  org.apache.tomcat.websocket.server.WsFrameServer
[root@kuai-server-757799875b-slmzd /]#  jmap -dump:live,format=b,file=dump.hprof 1

Analyzing the dump file:

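  1. In the Histogram view, WsFrameServer and the objects it references account for about 3.1 GB in total

    Shallow Heap is the memory occupied by the object itself, not including the objects it references: the object header plus its fields (the reference slots themselves, not the objects they point to).

    Retained Heap is the object's own Shallow Heap plus the Shallow Heap of every object that is reachable only through it. In other words, Retained Heap is the total memory that would be reclaimed if the object were garbage collected.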

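  2. Find the external references from GC roots that keep WsFrameServer from being released

    Paths to GC Roots -> exclude all phantom/weak/soft etc. references (leaves only strong references) -> find the GC root thread -> find the code path holding the most unreleased memory (most likely the code responsible for running out of memory)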

  3. Filter for InstanceMonitorHandler and check its size

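  4. Inspect the memory held by each WsFrameServer: it contains two buffers, messageBufferText and messageBufferBinary, one about 10 MB and the other about 5 MB. With 202 live instances at roughly 15 MB each, that comes to about 3 GB, which is in line with the total seen in the Histogram.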

  5. Code diagnosis:

    a. The buffer sizes are configured in spring-websocket.xml (5242800, about 5 MB each):

<bean class="org.springframework.web.socket.server.standard.ServletServerContainerFactoryBean">
    <property name="maxTextMessageBufferSize" value="5242800"/>
    <property name="maxBinaryMessageBufferSize" value="5242800"/>
</bean>

    b. When the page is closed, the WebSocket connection is not closed, so the number of connections keeps growing
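
One possible server-side mitigation for point b is to close connections that have gone quiet, so that each abandoned page eventually releases its buffers. The following is a minimal sketch using only the JDK, not the fix applied in this incident: `IdleSessionRegistry`, `register`, `touch`, and `closeIdle` are hypothetical names, and `java.io.Closeable` stands in for whatever session type the application actually uses. It also assumes clients send messages or heartbeats often enough that silence really does mean the page is gone.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry that closes sessions with no recent activity. */
final class IdleSessionRegistry {

    private static final class Entry {
        final Closeable session;
        volatile long lastActiveMillis;

        Entry(Closeable session, long nowMillis) {
            this.session = session;
            this.lastActiveMillis = nowMillis;
        }
    }

    private final Map<String, Entry> sessions = new ConcurrentHashMap<>();
    private final long idleTimeoutMillis;

    IdleSessionRegistry(long idleTimeoutMillis) {
        this.idleTimeoutMillis = idleTimeoutMillis;
    }

    /** Call when a connection is established. */
    void register(String id, Closeable session, long nowMillis) {
        sessions.put(id, new Entry(session, nowMillis));
    }

    /** Call on every inbound message or heartbeat. */
    void touch(String id, long nowMillis) {
        Entry e = sessions.get(id);
        if (e != null) {
            e.lastActiveMillis = nowMillis;
        }
    }

    /** Call from the connection-closed callback so the entry itself is not leaked. */
    void unregister(String id) {
        sessions.remove(id);
    }

    /** Closes and removes every session idle longer than the timeout; returns the count. */
    int closeIdle(long nowMillis) {
        int closed = 0;
        Iterator<Entry> it = sessions.values().iterator();
        while (it.hasNext()) {
            Entry e = it.next();
            if (nowMillis - e.lastActiveMillis > idleTimeoutMillis) {
                it.remove();
                try {
                    e.session.close();
                } catch (IOException ignored) {
                    // The connection is already broken; dropping the reference is enough.
                }
                closed++;
            }
        }
        return closed;
    }

    int size() {
        return sessions.size();
    }
}
```

In a real service `closeIdle(System.currentTimeMillis())` would be called periodically, for example from a `ScheduledExecutorService`. The timestamp is passed in explicitly only to keep the sketch easy to test.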