JVM 之 parNew 和 CMS 日志简单介绍

124 阅读12分钟

日志介绍

ParNew 收集器

日志

选取生产环境某台机器(pro-group1-jgb4b)进行日志分析

2023-09-17T10:19:15.929+0800: 11356346.845: [GC (Allocation Failure) 2023-09-17T10:19:15.929+0800: 11356346.845: [ParNew: 1627488K->4787K(1824768K), 0.0321305 secs] 1917451K->294751K(5040128K), 0.0326831 secs] [Times: user=0.06 sys=0.05, real=0.04 secs] 
2023-09-17T10:29:13.592+0800: 11356944.508: [GC (Allocation Failure) 2023-09-17T10:29:13.592+0800: 11356944.508: [ParNew: 1626803K->3238K(1824768K), 0.0271679 secs] 1916767K->293206K(5040128K), 0.0276740 secs] [Times: user=0.09 sys=0.00, real=0.02 secs] 
2023-09-17T10:38:45.919+0800: 11357516.835: [GC (Allocation Failure) 2023-09-17T10:38:45.919+0800: 11357516.836: [ParNew: 1625254K->3476K(1824768K), 0.0265338 secs] 1915222K->293446K(5040128K), 0.0270676 secs] [Times: user=0.08 sys=0.00, real=0.03 secs] 
2023-09-17T10:48:15.945+0800: 11358086.861: [GC (Allocation Failure) 2023-09-17T10:48:15.945+0800: 11358086.862: [ParNew: 1625492K->5590K(1824768K), 0.0235735 secs] 1915462K->295561K(5040128K), 0.0240974 secs] [Times: user=0.05 sys=0.03, real=0.03 secs] 
2023-09-17T10:58:15.894+0800: 11358686.811: [GC (Allocation Failure) 2023-09-17T10:58:15.894+0800: 11358686.811: [ParNew: 1627606K->3970K(1824768K), 0.0261846 secs] 1917577K->293943K(5040128K), 0.0266812 secs] [Times: user=0.09 sys=0.00, real=0.03 secs] 
2023-09-17T11:07:45.922+0800: 11359256.839: [GC (Allocation Failure) 2023-09-17T11:07:45.923+0800: 11359256.839: [ParNew: 1625986K->3726K(1824768K), 0.0305558 secs] 1915959K->293701K(5040128K), 0.0310881 secs] [Times: user=0.05 sys=0.06, real=0.03 secs] 
2023-09-17T11:17:15.941+0800: 11359826.858: [GC (Allocation Failure) 2023-09-17T11:17:15.942+0800: 11359826.858: [ParNew: 1625742K->6153K(1824768K), 0.0266791 secs] 1915717K->296129K(5040128K), 0.0271493 secs] [Times: user=0.09 sys=0.00, real=0.02 secs] 
2023-09-17T11:27:15.902+0800: 11360426.818: [GC (Allocation Failure) 2023-09-17T11:27:15.902+0800: 11360426.818: [ParNew: 1628169K->5127K(1824768K), 0.0244332 secs] 1918145K->295105K(5040128K), 0.0248581 secs] [Times: user=0.04 sys=0.04, real=0.02 secs] 
2023-09-17T11:36:45.932+0800: 11360996.848: [GC (Allocation Failure) 2023-09-17T11:36:45.932+0800: 11360996.848: [ParNew: 1627143K->4621K(1824768K), 0.0267798 secs] 1917121K->294600K(5040128K), 0.0271773 secs] [Times: user=0.10 sys=0.00, real=0.02 secs] 
2023-09-17T11:46:15.971+0800: 11361566.887: [GC (Allocation Failure) 2023-09-17T11:46:15.971+0800: 11361566.887: [ParNew: 1622786K->8561K(1824768K), 0.0310064 secs] 1912765K->298543K(5040128K), 0.0316434 secs] [Times: user=0.08 sys=0.01, real=0.03 secs] 
2023-09-17T11:56:15.907+0800: 11362166.824: [GC (Allocation Failure) 2023-09-17T11:56:15.908+0800: 11362166.824: [ParNew: 1630577K->5443K(1824768K), 0.0265778 secs] 1920559K->295426K(5040128K), 0.0270651 secs] [Times: user=0.09 sys=0.00, real=0.03 secs] 
2023-09-17T12:05:45.945+0800: 11362736.862: [GC (Allocation Failure) 2023-09-17T12:05:45.946+0800: 11362736.862: [ParNew: 1627459K->5275K(1824768K), 0.0253777 secs] 1917442K->295259K(5040128K), 0.0260535 secs] [Times: user=0.04 sys=0.04, real=0.03 secs] 
## 2023-09-17T12:15:45.898+0800: 11363336.814: [GC (Allocation Failure) 2023-09-17T12:15:45.898+0800: 11363336.815: [ParNew: 1627291K->3691K(1824768K), 0.0287133 secs] 1917275K->293677K(5040128K), 0.0292527 secs] [Times: user=0.04 sys=0.05, real=0.03 secs]
2023-09-17T12:25:15.936+0800: 11363906.852: [GC (Allocation Failure) 2023-09-17T12:25:15.936+0800: 11363906.852: [ParNew: 1625707K->3968K(1824768K), 0.0272128 secs] 1915693K->293956K(5040128K), 0.0276816 secs] [Times: user=0.10 sys=0.00, real=0.03 secs] 

从日志可以看出,平均 10分钟发生一次 young GC

监控

WX20230917-163823@2x.png 从grafana jvm监控上可以看出,ParNew 每10分钟收集一次

日志分析

2023-09-17T11:36:45.932+0800: 11360996.848: [GC (Allocation Failure) 
2023-09-17T11:36:45.932+0800: 11360996.848: [ParNew: 1627143K->4621K(1824768K), 0.0267798 secs] 1917121K->294600K(5040128K), 0.0271773 secs]
[Times: user=0.10 sys=0.00, real=0.02 secs]
  • GC开始的时钟时间和JVM启动的相对时间
  • GC——区分minor GC 和 Full GC的标志
  • Allocation Failure——> 垃圾收集的原因,年轻代没有足够的空间被分配
  • ParNew——> 用到的收集器名称,ParNew 是年轻代垃圾收集器,多线程收集、使用复制算法、STW。被设计和老年代的CMS收集器结合使用。
  • 1627143K->4621K(1824768K), 0.0267798 secs
    • 年轻代收集前后使用量 1.55G、4.51M
    • 年轻代总容量 1.74G
  • 1917121K->294600K(5040128K), 0.0271773 secs
    • 整个堆收集前后使用量
    • 整个堆的可用容量4.8G
  • 总结:GC前后,年轻代使用量变小,而整个堆的使用量变大,即老年代使用量变大,有很多对象超过年龄阈值从年轻代晋升到老年代。
  • Times: user=0.10 sys=0.00, real=0.02 secs
    • user:收集阶段 GC线程 使用的总的CPU时间
    • sys:系统调用使用的时间
    • real:用户线程停止的时钟时间,多线程GC,这个数字略大于 (user + sys)/ GC 线程数,由于一些活动不能并行化。本例中并行收集器的线程数设置为8。

CMS收集器

监控

查看grafna监控,15:30发生了一次FullGC WX20230917-165218@2x.png

日志

查看gc日志,pod名称 9wq5k

2023-09-17T15:26:55.433+0800: 518583.749: [GC (CMS Initial Mark) [1 CMS-initial-mark: 5087358K(6359040K)] 5113671K(10073088K), 0.0036786 secs] [Times: user=0
.02 sys=0.00, real=0.01 secs] 
2023-09-17T15:26:55.438+0800: 518583.754: [CMS-concurrent-mark-start]
2023-09-17T15:26:55.580+0800: 518583.896: [CMS-concurrent-mark: 0.133/0.143 secs] [Times: user=0.56 sys=0.31, real=0.14 secs] 
2023-09-17T15:26:55.580+0800: 518583.896: [CMS-concurrent-preclean-start]
2023-09-17T15:26:55.598+0800: 518583.914: [CMS-concurrent-preclean: 0.017/0.017 secs] [Times: user=0.03 sys=0.02, real=0.02 secs] 
2023-09-17T15:26:55.598+0800: 518583.914: [CMS-concurrent-abortable-preclean-start]
2023-09-17T15:26:58.248+0800: 518586.564: [CMS-concurrent-abortable-preclean: 2.629/2.650 secs] [Times: user=6.07 sys=2.13, real=2.65 secs] 
2023-09-17T15:26:58.253+0800: 518586.569: [GC (CMS Final Remark) [YG occupancy: 1726006 K (3714048 K)]2023-09-17T15:26:58.253+0800: 518586.569: [Rescan (para
llel) , 0.1338179 secs]2023-09-17T15:26:58.387+0800: 518586.703: [weak refs processing, 0.0009936 secs]2023-09-17T15:26:58.388+0800: 518586.704: [class unloa
ding, 0.0364321 secs]2023-09-17T15:26:58.425+0800: 518586.741: [scrub symbol table, 0.0274942 secs]2023-09-17T15:26:58.452+0800: 518586.768: [scrub string ta
ble, 0.0018922 secs][1 CMS-remark: 5087358K(6359040K)] 6813364K(10073088K), 0.2012828 secs] [Times: user=1.13 sys=0.00, real=0.20 secs] 
2023-09-17T15:26:58.455+0800: 518586.771: [CMS-concurrent-sweep-start]
2023-09-17T15:27:00.780+0800: 518589.096: [GC (Allocation Failure) 2023-09-17T15:27:00.780+0800: 518589.096: [ParNew: 3327483K->26043K(3714048K), 0.0148547 secs] 6771221K->3470626K(10073088K), 0.0155098 secs] [Times: user=0.08 sys=0.03, real=0.02 secs] 
2023-09-17T15:27:05.833+0800: 518594.149: [GC (Allocation Failure) 2023-09-17T15:27:05.833+0800: 518594.149: [ParNew: 3327419K->25889K(3714048K), 0.0162620 secs] 5002227K->1701564K(10073088K), 0.0169240 secs] [Times: user=0.09 sys=0.02, real=0.02 secs] 
2023-09-17T15:27:10.444+0800: 518598.760: [CMS-concurrent-sweep: 11.861/11.989 secs] [Times: user=36.43 sys=4.77, real=11.98 secs] 
2023-09-17T15:27:10.445+0800: 518598.761: [CMS-concurrent-reset-start]
2023-09-17T15:27:10.457+0800: 518598.773: [CMS-concurrent-reset: 0.012/0.012 secs] [Times: user=0.03 sys=0.00, real=0.02 secs] 

Jps

使用jps命令查看JVM参数 jps -v

1 AppXXXApplication 
-Xms10240m -Xmx10240m -Xmn4030m 
-XX:MaxMetaspaceSize=1024m 
-XX:MetaspaceSize=1024m 
-Dfile.encoding=UTF-8 
-XX:+UseConcMarkSweepGC 
-XX:+UseParNewGC 
-XX:+UseCMSCompactAtFullCollection 
-XX:CMSFullGCsBeforeCompaction=0 
-XX:CMSInitiatingOccupancyFraction=80 
-XX:SurvivorRatio=8 
-XX:+CMSClassUnloadingEnabled 
-XX:+ExplicitGCInvokesConcurrent 
-XX:ParallelGCThreads=8 
-XX:+UseCMSInitiatingOccupancyOnly 
-Xloggc:/workspace/carkey/AppLocationSemantics/latest/logs/gc.log 
-XX:+PrintGCDateStamps 
-XX:+PrintGCDetails 
-XX:+HeapDumpOnOutOfMemoryError 
XX:HeapDumpPath=/workspace/carkey/AppLocationSemantics/latest/logs/heapdump.hprof 
-XX:+UseGCLogFileRotation 
-XX:GCLogFileSize=100M 
-XX:NumberOfGCLogFiles=3 
-Dcom.sun.management.jmxremote.port=5555 
-Dcom.sun.management.jmxremote.authenticate=false 
-Dcom.sun.management.jmxremote.ssl=false 
-Dcom.sun.management.jmxremote=true 
  • -Xms10240m ——初始堆大小 10G
  • -Xmx10240m ——最大堆大小 10G
  • -Xmn4030m ——新生代大小 3.94G

CMS 基础知识复习

CMS(Concurrent Mark Sweep) 收集器是一种获取最短回收停顿时间为目标的收集器 基于标记-清除算法 分为4个步骤

  • 初始标记(CMS Initial Mark)
  • 并发标记(CMS concurrent Mark)
  • 重新标记(CMS Remark)
  • 并发清除(CMS concurrent Sweep)

初始标记、重新标记需要 stop the world。

初始标记(CMS Initial Mark)

初始标记需要 Stop The World 初始标记用于标记老年代中GC root 能直接关联到的对象,或者被年轻代对象引用的存活对象。

g1-06.png 图片来源于## GC Algorithms: Implementations

2023-09-17T15:26:55.433+0800: 518583.749: 
[GC (CMS Initial Mark) [1 CMS-initial-mark: 5087358K(6359040K)] 5113671K(10073088K), 0.0036786 secs] 
[Times: user=0 .02 sys=0.00, real=0.01 secs]

注释

  • GC开始的时间,以及相对于JVM启动的时间
  • 初始标记,收集所有GC root,以及直接引用的对象
    • 5087358K _ 当前老年代使用的容量,这里是4.8G
    • 6359040K _ 老年代可用最大容量,这里是6G
    • 5113671K _ 整个堆目前使用的容量
    • 10073088K _ 堆可用容量 9.6G
  • 该阶段时间统计

并发标记(CMS concurrent Mark)

cms-2.png 根据GC root 遍历老年代,标记所有存活的对象。并不是老年代中所有存活对象都会被标记,并发标记期间,用户程序继续运作而导致标记产生变动。

2023-09-17T15:26:55.438+0800: 518583.754: [CMS-concurrent-mark-start] 
2023-09-17T15:26:55.580+0800: 518583.896: [CMS-concurrent-mark: 0.133/0.143 secs] [Times: user=0.56 sys=0.31, real=0.14 secs] 
  • CMS-concurrent-mark _ 并发标记阶段,遍历老年代标记所有存活对象
  • 0.133/0.143 secs _ 这个阶段持续的时间和时钟时间
  • [Times: user=0.56 sys=0.31, real=0.14 secs] _ 并发阶段意义不大,因为从并发标记开始时间开始计算,包含的不仅仅是并发标记的时间

阶段3: concurrent-preclean

cms-3.png

2023-09-17T15:26:55.580+0800: 518583.896: [CMS-concurrent-preclean-start]
2023-09-17T15:26:55.598+0800: 518583.914: [CMS-concurrent-preclean: 0.017/0.017 secs] [Times: user=0.03 sys=0.02, real=0.02 secs] 

阶段4:concurrent-abortable-preclean

2023-09-17T15:26:55.598+0800: 518583.914: [CMS-concurrent-abortable-preclean-start]
2023-09-17T15:26:58.248+0800: 518586.564: [CMS-concurrent-abortable-preclean: 2.629/2.650 secs] [Times: user=6.07 sys=2.13, real=2.65 secs] 

重新标记(CMS Remark)

2023-09-17T15:26:58.253+0800: 518586.569: [GC (CMS Final Remark) 
[YG occupancy: 1726006 K (3714048 K)]
2023-09-17T15:26:58.253+0800: 518586.569: [Rescan (para llel) , 0.1338179 secs]
2023-09-17T15:26:58.387+0800: 518586.703: [weak refs processing, 0.0009936 secs]
2023-09-17T15:26:58.388+0800: 518586.704: [class unloa ding, 0.0364321 secs]
2023-09-17T15:26:58.425+0800: 518586.741: [scrub symbol table, 0.0274942 secs]
2023-09-17T15:26:58.452+0800: 518586.768: [scrub string ta ble, 0.0018922 secs]
[1 CMS-remark: 5087358K(6359040K)] 6813364K(10073088K), 0.2012828 secs] [Times: user=1.13 sys=0.00, real=0.20 secs]
  • [YG occupancy: 1726006 K (3714048 K)]——年轻代当前的占用量和容量,这里分别是 1.64G和 3.54G.
  • [1 CMS-remark: 5087358K(6359040K)] 6813364K(10073088K), 0.2012828 secs] ——这个阶段之后,
    • 老年代的使用量和总量,4.85, 6.06G
    • 堆的使用量和总量: 6.49G, 9.6G
    • 这个阶段持续的时间

并发清除(CMS concurrent Sweep)

和用户线程一起工作,不需要STW,清除那些不在使用的对象,回收他们占用的空间为将来使用。

2023-09-17T15:26:58.455+0800: 518586.771: [CMS-concurrent-sweep-start]
2023-09-17T15:27:10.444+0800: 518598.760: [CMS-concurrent-sweep: 11.861/11.989 secs] [Times: user=36.43 sys=4.77, real=11.98 secs] 
  • CMS-concurrent-sweep —— 并发清除阶段,清除没有被标记的对象,回收他们占用的空间
  • 11.861/11.989 secs —— 这个阶段的持续时间与时钟时间
  • [Times: user=36.43 sys=4.77, real=11.98 secs] —— 并发阶段意义不大,从并发标记开始的时间开始计算,包含的不仅仅是并发标记的工作

阶段7:Concurrent Reset

并发执行,重设CMS算法内部数据结构,为下一次GC做准备

2023-09-17T15:27:10.445+0800: 518598.761: [CMS-concurrent-reset-start]
2023-09-17T15:27:10.457+0800: 518598.773: [CMS-concurrent-reset: 0.012/0.012 secs] [Times: user=0.03 sys=0.00, real=0.02 secs] 
  • 0.012/0.012 secs —— 这个阶段的持续时间和时钟时间
  • 并发阶段意义不大

参考