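The conclusion first: in most cases, using ZGC only requires enabling -XX:+UseZGC and setting -Xmx. The other parameters can stay at their defaults, so a very simple configuration already gives very good results.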
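To confirm that the two options actually took effect, the running JVM can be asked which collectors it registered. The sketch below is only a quick check, not part of the original setup: the class name ZgcCheck is made up for illustration, and it assumes that ZGC's collector beans have names starting with "ZGC" (for example "ZGC Cycles" and "ZGC Pauses" on Java 17).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class, used only for this illustration.
class ZgcCheck {
    // Assumption: ZGC registers collector beans whose names start with "ZGC".
    static boolean looksLikeZgc(List<String> collectorNames) {
        return collectorNames.stream().anyMatch(name -> name.startsWith("ZGC"));
    }

    // Names of the collectors registered in the current JVM.
    static List<String> activeCollectorNames() {
        return ManagementFactory.getGarbageCollectorMXBeans().stream()
                .map(GarbageCollectorMXBean::getName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = activeCollectorNames();
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("Collectors: " + names);
        System.out.println("Max heap (MB): " + maxHeapMb);
        System.out.println("ZGC active: " + looksLikeZgc(names));
    }
}
```

It can be started with, for example, java -XX:+UseZGC -Xmx8g ZgcCheck.java, where 8g is only a placeholder heap size.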
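1 ZGC results
With -XX:+UseZGC, keep in mind that a large heap is a prerequisite. In data collected from real production traffic, individual pauses of the process did stay within 10 ms. The figures below show, in order, G1, Shenandoah, and ZGC; the environment is Java 17 with 8 CPU cores and 24 GB of memory.
In terms of CPU usage, G1 is somewhat cheaper and delivers higher throughput. If throughput is the goal, however, Parallel GC may be the better choice: according to OptaPlanner's benchmarks, Parallel GC is 16.39% faster than G1 GC on Java 17.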
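2 Tuning parameters
To see the JVM's default parameter values, run it with -XX:+PrintFlagsFinal. These values are computed from the specification of the machine the JVM runs on.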
double ZAllocationSpikeTolerance = 1.000000 {product} {default}
double ZCollectionInterval = 0.000000 {product} {default}
uint ParallelGCThreads = 10 {product} {default}
uint ConcGCThreads = 4 {product} {default}
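The same values can also be read from inside a running process, which is handy when the startup output is no longer available. The following is a minimal sketch that assumes a HotSpot-based JVM, where com.sun.management.HotSpotDiagnosticMXBean is available; the class name FlagReader is hypothetical.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;
import java.util.Optional;

// Hypothetical helper class, used only for this illustration.
class FlagReader {
    // Returns the flag as seen by the running JVM, or empty if the JVM does not know it.
    static Optional<VMOption> read(String flagName) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            return Optional.of(bean.getVMOption(flagName));
        } catch (IllegalArgumentException e) {
            // getVMOption throws this when no flag with that name exists.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String[] flags = {"ZAllocationSpikeTolerance", "ZCollectionInterval",
                          "ParallelGCThreads", "ConcGCThreads"};
        for (String flag : flags) {
            String shown = read(flag)
                    .map(o -> o.getValue() + " (origin: " + o.getOrigin() + ")")
                    .orElse("not available in this JVM");
            System.out.println(flag + " = " + shown);
        }
    }
}
```

The origin field distinguishes defaults from values set on the command line, which corresponds to the {default} column in the output above.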
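The most important tuning option is -Xmx. In theory, the more memory the better: regardless of heap size, GC pause times stay within a small range, which is close to O(1) with respect to heap size. The ZGC documentation describes the main options as follows.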
Setting the Heap Size
The most important tuning option for ZGC is setting the max heap size (-Xmx). Since ZGC is a concurrent collector a max heap size must be selected such that, 1) the heap can accommodate the live-set of your application, and 2) there is enough headroom in the heap to allow allocations to be serviced while the GC is running. How much headroom is needed very much depends on the allocation rate and the live-set size of the application. In general, the more memory you give to ZGC the better. But at the same time, wasting memory is undesirable, so it’s all about finding a balance between memory usage and how often the GC needs to run.
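The two conditions above can be turned into a rough lower bound for -Xmx: the live set, plus whatever the application allocates while one GC cycle is running. The sketch below is only a back-of-the-envelope aid, not a formula from the ZGC documentation; the class name, the method name, and all numbers are hypothetical, and a real setting should leave extra room for allocation spikes.

```java
// Hypothetical helper class, used only for this illustration.
class HeapHeadroom {
    // Rough lower bound in MB: live set + allocation during one concurrent GC cycle.
    static double minHeapMb(double liveSetMb, double allocRateMbPerSec, double gcCycleSec) {
        return liveSetMb + allocRateMbPerSec * gcCycleSec;
    }

    public static void main(String[] args) {
        // Made-up inputs: 2048 MB live set, 500 MB/s allocation rate, 2 s per GC cycle.
        double lowerBound = minHeapMb(2048, 500, 2);
        System.out.println("Heap lower bound (MB): " + lowerBound);
    }
}
```

A heap close to this bound forces the GC to run almost continuously, so in practice the value only indicates where allocation stalls become likely.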
Setting Number of Concurrent GC Threads
The second tuning option one might want to look at is setting the number of concurrent GC threads (-XX:ConcGCThreads). ZGC has heuristics to automatically select this number. This heuristic usually works well but depending on the characteristics of the application this might need to be adjusted. This option essentially dictates how much CPU-time the GC should be given. Give it too much and the GC will steal too much CPU-time from the application. Give it too little, and the application might allocate garbage faster than the GC can collect it.
Returning Unused Memory to the Operating System
By default, ZGC uncommits unused memory, returning it to the operating system. This is useful for applications and environments where memory footprint is a concern. This feature can be disabled using -XX:-ZUncommit. Furthermore, memory will not be uncommitted so that the heap size shrinks below the minimum heap size (-Xms). This means this feature will be implicitly disabled if the minimum heap size (-Xms) is configured to be equal to the maximum heap size (-Xmx).
An uncommit delay can be configured using -XX:ZUncommitDelay=<seconds> (default is 300 seconds). This delay specifies for how long memory should have been unused before it's eligible for uncommit.
3 ZGC logs
-Xlog:gc*:stdout:time,level,tags
[2024-09-06T16:20:07.263+0800][info][gc,phases ] GC(2352) Pause Mark Start 0.023ms
[2024-09-06T16:20:07.265+0800][info][gc,phases ] GC(2352) Concurrent Mark 1.524ms
[2024-09-06T16:20:07.265+0800][info][gc,phases ] GC(2352) Pause Mark End 0.006ms
[2024-09-06T16:20:07.265+0800][info][gc,phases ] GC(2352) Concurrent Mark Free 0.001ms
[2024-09-06T16:20:07.265+0800][info][gc,phases ] GC(2352) Concurrent Process Non-Strong References 0.346ms
[2024-09-06T16:20:07.265+0800][info][gc,phases ] GC(2352) Concurrent Reset Relocation Set 0.000ms
[2024-09-06T16:20:07.267+0800][info][gc ] Allocation Stall (Thread-0) 4.156ms
[2024-09-06T16:20:07.267+0800][info][gc ] Allocation Stall (Thread-1) 2.448ms
[2024-09-06T16:20:07.268+0800][info][gc,phases ] GC(2352) Concurrent Select Relocation Set 2.851ms
[2024-09-06T16:20:07.268+0800][info][gc,phases ] GC(2352) Pause Relocate Start 0.011ms
[2024-09-06T16:20:07.268+0800][info][gc,phases ] GC(2352) Concurrent Relocate 0.031ms
[2024-09-06T16:20:07.268+0800][info][gc,load ] GC(2352) Load: 1.03/0.51/0.30
[2024-09-06T16:20:07.268+0800][info][gc,mmu ] GC(2352) MMU: 2ms/93.0%, 5ms/97.0%, 10ms/98.3%, 20ms/99.1%, 50ms/99.6%, 100ms/99.8%
[2024-09-06T16:20:07.268+0800][info][gc,marking ] GC(2352) Mark: 1 stripe(s), 2 proactive flush(es), 1 terminate flush(es), 0 completion(s), 0 continuation(s)
[2024-09-06T16:20:07.268+0800][info][gc,marking ] GC(2352) Mark Stack Usage: 32M
[2024-09-06T16:20:07.268+0800][info][gc,nmethod ] GC(2352) NMethods: 76 registered, 0 unregistered
[2024-09-06T16:20:07.268+0800][info][gc,metaspace] GC(2352) Metaspace: 0M used, 0M committed, 1088M reserved
[2024-09-06T16:20:07.268+0800][info][gc,ref ] GC(2352) Soft: 9 encountered, 7 discovered, 0 enqueued
[2024-09-06T16:20:07.268+0800][info][gc,ref ] GC(2352) Weak: 49 encountered, 0 discovered, 0 enqueued
[2024-09-06T16:20:07.268+0800][info][gc,ref ] GC(2352) Final: 0 encountered, 0 discovered, 0 enqueued
[2024-09-06T16:20:07.268+0800][info][gc,ref ] GC(2352) Phantom: 1 encountered, 0 discovered, 0 enqueued
[2024-09-06T16:20:07.268+0800][info][gc,reloc ] GC(2352) Small Pages: 1 / 2M, Empty: 0M, Relocated: 0M, In-Place: 0
[2024-09-06T16:20:07.268+0800][info][gc,reloc ] GC(2352) Medium Pages: 0 / 0M, Empty: 0M, Relocated: 0M, In-Place: 0
[2024-09-06T16:20:07.268+0800][info][gc,reloc ] GC(2352) Large Pages: 53 / 456M, Empty: 450M, Relocated: 0M, In-Place: 0
[2024-09-06T16:20:07.268+0800][info][gc,reloc ] GC(2352) Forwarding Usage: 0M
[2024-09-06T16:20:07.268+0800][info][gc,heap ] GC(2352) Min Capacity: 8M(2%)
[2024-09-06T16:20:07.268+0800][info][gc,heap ] GC(2352) Max Capacity: 460M(100%)
[2024-09-06T16:20:07.268+0800][info][gc,heap ] GC(2352) Soft Max Capacity: 460M(100%)
[2024-09-06T16:20:07.268+0800][info][gc,heap ] GC(2352) Mark Start Mark End Relocate Start Relocate End High Low
[2024-09-06T16:20:07.268+0800][info][gc,heap ] GC(2352) Capacity: 460M (100%) 460M (100%) 460M (100%) 460M (100%) 460M (100%) 460M (100%)
[2024-09-06T16:20:07.268+0800][info][gc,heap ] GC(2352) Free: 2M (0%) 2M (0%) 434M (94%) 434M (94%) 452M (98%) 2M (0%)
[2024-09-06T16:20:07.268+0800][info][gc,heap ] GC(2352) Used: 458M (100%) 458M (100%) 26M (6%) 26M (6%) 458M (100%) 8M (2%)
[2024-09-06T16:20:07.268+0800][info][gc,heap ] GC(2352) Live: - 6M (1%) 6M (1%) 6M (1%) - -
[2024-09-06T16:20:07.268+0800][info][gc,heap ] GC(2352) Allocated: - 0M (0%) 18M (4%) 18M (4%) - -
[2024-09-06T16:20:07.268+0800][info][gc,heap ] GC(2352) Garbage: - 451M (98%) 1M (0%) 1M (0%) - -
[2024-09-06T16:20:07.268+0800][info][gc,heap ] GC(2352) Reclaimed: - - 450M (98%) 450M (98%) - -
[2024-09-06T16:20:07.268+0800][info][gc ] GC(2352) Garbage Collection (Allocation Stall) 458M(100%)->26M(6%)
[2024-09-06T16:20:07.312+0800][info][gc,start ] GC(2353) Garbage Collection (Allocation Stall)
[2024-09-06T16:20:07.312+0800][info][gc,ref ] GC(2353) Clearing All SoftReferences
[2024-09-06T16:20:07.312+0800][info][gc,task ] GC(2353) Using 1 workers
[2024-09-06T16:20:07.312+0800][info][gc,ref ] GC(2353) Clearing All SoftReferences
Note this line in the log:
[2024-09-06T16:20:07.267+0800][info][gc ] Allocation Stall (Thread-0) 4.156ms
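This line means that, at the given time, memory was being allocated faster than the GC could reclaim it, so a thread was blocked for 4.156 ms while trying to allocate. Such stalls should be avoided as far as possible, because they hurt application performance and responsiveness. In ZGC, they can be reduced or avoided by tuning the GC trigger heuristics and the heap size, for example with ZAllocationSpikeTolerance.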
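To see how often stalls occur and which threads are affected, the log can be scanned for these lines. The following is a minimal sketch that assumes the line format shown above ("Allocation Stall (<thread>) <time>ms"); the class name StallSummary is hypothetical, and the log file path is passed as the first argument.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, used only for this illustration.
class StallSummary {
    // Matches e.g. "Allocation Stall (Thread-0) 4.156ms", but not the
    // per-cycle summary line "Garbage Collection (Allocation Stall) ...".
    private static final Pattern STALL =
            Pattern.compile("Allocation Stall \\((.+)\\) (\\d+(?:\\.\\d+)?)ms");

    // Sums the stall time in milliseconds per thread name.
    static Map<String, Double> stallMillisByThread(List<String> logLines) {
        Map<String, Double> totals = new LinkedHashMap<>();
        for (String line : logLines) {
            Matcher m = STALL.matcher(line);
            if (m.find()) {
                totals.merge(m.group(1), Double.parseDouble(m.group(2)), Double::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) throws IOException {
        List<String> lines = Files.readAllLines(Path.of(args[0]));
        stallMillisByThread(lines).forEach((thread, ms) ->
                System.out.printf("%s stalled %.3f ms in total%n", thread, ms));
    }
}
```

If the totals keep growing under normal load, that is a sign the heap is too small or the GC is getting too little CPU time.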
4 Auxiliary tools
4.1 async-profiler
async-profiler can help inspect how the process behaves in production. For example, use -e to profile a single event such as cpu or alloc.
async-profiler-3.0-linux-x64/bin/asprof -d 30 -e cpu -f cpu.html [pid]
async-profiler-3.0-linux-x64/bin/asprof -d 30 -e alloc -f alloc.html [pid]
Profiling several events at once is only supported with JFR output:
asprof -e cpu,alloc,lock -f profile.jfr [pid]
4.2 pidstat
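Some OS-level commands are also useful for observation, for example pidstat. pidstat is part of the sysstat package and monitors the CPU, memory, thread, and device I/O usage of all processes or of selected ones. Run without an interval, it reports statistics since system startup; run with an interval, each report covers the time since the previous report. The number of reports and the interval can be set to collect exactly the statistics needed. To install it, install sysstat, for example: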
yum install sysstat
Sample every second and include per-thread information (-t); -u reports CPU, -r memory, and -d I/O:
pidstat -t -u -p 31663 1
pidstat -t -r -p 31663 1
pidstat -t -d -p 31663 1
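Context switches:
pidstat -t -w -p 4625 1
- cswch/s: voluntary context switches, which happen when a task gives up the CPU on its own because it is waiting for a resource (such as an I/O operation). These switches are usually benign and do not hurt system performance.
- nvcswch/s: non-voluntary context switches, which happen when the system forcibly takes a task off the CPU because its time slice has run out. These switches can affect system performance; when they are very frequent, they may point to CPU contention.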
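On Linux, the kernel exposes the underlying counters in /proc/<pid>/status as voluntary_ctxt_switches and nonvoluntary_ctxt_switches. The sketch below reads them directly, as a rough cross-check when pidstat is not installed. It is Linux-only, the class name CtxSwitches is hypothetical, and the values are cumulative counts rather than per-second rates. For a multithreaded JVM, per-thread values should be available under /proc/<pid>/task/<tid>/status.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper class, used only for this illustration.
class CtxSwitches {
    // Returns the counter for the given key from /proc/<pid>/status lines, or -1 if absent.
    static long counter(List<String> statusLines, String key) {
        for (String line : statusLines) {
            if (line.startsWith(key + ":")) {
                return Long.parseLong(line.substring(key.length() + 1).trim());
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        // Without an argument, inspect the current process via /proc/self.
        String pid = args.length > 0 ? args[0] : "self";
        List<String> lines = Files.readAllLines(Path.of("/proc", pid, "status"));
        System.out.println("voluntary: " + counter(lines, "voluntary_ctxt_switches"));
        System.out.println("non-voluntary: " + counter(lines, "nonvoluntary_ctxt_switches"));
    }
}
```

Sampling the counters twice and dividing the difference by the elapsed seconds gives rates comparable to the pidstat columns.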