A Rather Curious Hadoop MR GC Incident | Java Debug Notes

1 Background

I've found that with limited resources it is quite easy to run into all sorts of problems, which turned out to be pretty interesting.
This time I set up a small Hadoop environment for testing, with very little resource allocated:
one machine with 2 cores and 4 GB of RAM.

2 The GC Anomaly

  • Java heap space
    • Heap overflow, occurring during the map phase
    • This usually happens when -Xmx is too small: neither the young generation nor the old generation can fit the newly created object

  • Later I enabled GC logging
    • At the end there were several Full GCs even though there was hardly anything on the heap, only 2% used
    • PS: only while writing this up did I notice they were all Allocation Failure events, so something kept failing to fit into the heap (a minimal repro sketch follows the GC log below)
      • Allocation Failure means the JVM failed to allocate memory for a new object
[GC (Allocation Failure) [PSYoungGen: 14848K->2038K(16896K)] 14848K->2606K(55296K), 0.0026520 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
[GC (Allocation Failure) [PSYoungGen: 16886K->2041K(31744K)] 17454K->4037K(70144K), 0.0034995 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
[GC (Allocation Failure) [PSYoungGen: 31737K->2021K(31744K)] 33733K->7351K(70144K), 0.0052218 secs] [Times: user=0.01 sys=0.01, real=0.01 secs] 
[GC (Allocation Failure) [PSYoungGen: 31717K->2034K(61440K)] 37047K->10238K(99840K), 0.0072265 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
[GC (Metadata GC Threshold) [PSYoungGen: 42160K->2021K(61440K)] 50365K->12918K(99840K), 0.0056580 secs] [Times: user=0.01 sys=0.00, real=0.01 secs] 
[Full GC (Metadata GC Threshold) [PSYoungGen: 2021K->0K(61440K)] [ParOldGen: 10897K->8623K(35328K)] 12918K->8623K(96768K), [Metaspace: 20839K->20839K(1069056K)], 0.0277290 secs] [Times: user=0.05 sys=0.00, real=0.02 secs] 
[GC (Allocation Failure) [PSYoungGen: 59392K->5374K(93184K)] 68015K->13997K(128512K), 0.0035573 secs] [Times: user=0.01 sys=0.01, real=0.01 secs] 
[GC (Allocation Failure) [PSYoungGen: 49468K->4018K(124416K)] 58091K->12649K(474112K), 0.0037595 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
[GC (Allocation Failure) [PSYoungGen: 4018K->4003K(135168K)] 12649K->12634K(484864K), 0.0027440 secs] [Times: user=0.01 sys=0.01, real=0.01 secs] 
[Full GC (Allocation Failure) [PSYoungGen: 4003K->0K(135168K)] [ParOldGen: 8631K->8546K(39424K)] 12634K->8546K(174592K), [Metaspace: 28225K->28225K(1075200K)], 0.0297214 secs] [Times: user=0.04 sys=0.00, real=0.03 secs] 
[GC (Allocation Failure) [PSYoungGen: 0K->0K(135168K)] 8546K->8546K(484864K), 0.0005536 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
[Full GC (Allocation Failure) [PSYoungGen: 0K->0K(135168K)] [ParOldGen: 8546K->8057K(49152K)] 8546K->8057K(184320K), [Metaspace: 28225K->28214K(1075200K)], 0.0371439 secs] [Times: user=0.07 sys=0.00, real=0.04 secs] 
Heap
 PSYoungGen      total 135168K, used 9416K [0x00000000f5580000, 0x0000000100000000, 0x0000000100000000)
  eden space 128512K, 7% used [0x00000000f5580000,0x00000000f5eb21d8,0x00000000fd300000)
  from space 6656K, 0% used [0x00000000fd300000,0x00000000fd300000,0x00000000fd980000)
  to   space 7168K, 0% used [0x00000000ff900000,0x00000000ff900000,0x0000000100000000)
 ParOldGen       total 349696K, used 8057K [0x00000000e0000000, 0x00000000f5580000, 0x00000000f5580000)
  object space 349696K, 2% used [0x00000000e0000000,0x00000000e07de4f8,0x00000000f5580000)
 Metaspace       used 28262K, capacity 28534K, committed 28928K, reserved 1075200K
  class space    used 3154K, capacity 3259K, committed 3328K, reserved 1048576K
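
To make the "Allocation Failure with an almost empty heap" symptom concrete, here is a minimal repro sketch (mine, not from the job): with a small -Xmx, one oversized array allocation fails even though hardly anything else is on the heap. The class name and sizes are made up for illustration.

// Run with e.g.: java -Xmx64m -verbose:gc AllocationFailureDemo
public class AllocationFailureDemo {
    public static void main(String[] args) {
        // Heap usage is still tiny at this point...
        byte[] small = new byte[1024];
        // ...but a single allocation larger than the heap can ever hold triggers
        // GC (Allocation Failure) and then OutOfMemoryError: Java heap space.
        byte[] huge = new byte[512 * 1024 * 1024]; // 512 MB, same size as the kvbuffer later in the article
        System.out.println(small.length + huge.length);
    }
}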

3 Key Configuration

  • etc/hadoop/yarn-site.xml
    • Resources here are fairly small
    • During earlier debugging the minimum allocation was probably set too large, and the job got stuck at map 0%
    • Checking the logs showed the ApplicationMaster was requesting 4 GB of memory, which could never be granted
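  <!-- Total memory available to YARN containers on this NodeManager; default: 8192 -->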
  <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>3172</value>
  </property>
  <!-- Minimum memory allocation per container; default: 1024 -->
  <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>512</value>
  </property>
  <!-- Maximum memory allocation per container; default: 8192 -->
  <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>3172</value>
  </property>
  • etc/hadoop/mapred-site.xml
    • The configuration that produced the OOM (a memory-budget sanity check follows the config below)
<!-- Maximum memory for map tasks; default: 1536 (1.5 GB) -->
  <property>
        <name>mapreduce.map.memory.mb</name>
        <value>512</value>
  </property>
  <!-- JVM options for map tasks; default: -Xmx1024M -->
  <property>
        <name>mapreduce.map.java.opts</name>
        <value>-Xmx512M</value>
  </property>
  
  <!-- Maximum memory for reduce tasks; default: 3072 (3 GB) -->
  <property>
        <name>mapreduce.reduce.memory.mb</name>
        <value>1024</value>
  </property>
  <!-- JVM options for reduce tasks; default: -Xmx2560M -->
  <property>
        <name>mapreduce.reduce.java.opts</name>
        <value>-Xmx1024M</value>
  </property>
  
  <!-- Memory for the map-side sort buffer; default: 100 -->
  <property>
        <name>mapreduce.task.io.sort.mb</name>
        <value>512</value>
  </property>
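
With these numbers the map task's memory budget cannot work: the container gets 512 MB, the heap is capped at 512 MB, and the sort buffer alone wants 512 MB of that heap. A rough sanity-check sketch of this arithmetic, with the values hard-coded from the config above and a purely illustrative 0.7 rule of thumb:

// Sanity check: the io.sort buffer must fit comfortably inside the map task heap.
public class MapMemoryBudgetCheck {
    public static void main(String[] args) {
        int containerMb = 512; // mapreduce.map.memory.mb
        int xmxMb = 512;       // -Xmx from mapreduce.map.java.opts
        int ioSortMb = 512;    // mapreduce.task.io.sort.mb

        System.out.printf("container=%dMB heap=%dMB sortBuffer=%dMB%n",
                containerMb, xmxMb, ioSortMb);

        // The sort buffer is one byte[ioSortMb << 20] allocated on the heap,
        // so it must leave room for everything else the mapper does.
        if (ioSortMb >= xmxMb) {
            System.out.println("io.sort.mb >= -Xmx: the sort buffer alone cannot fit -> OOM");
        } else if (ioSortMb > xmxMb * 0.7) {
            System.out.println("io.sort.mb is dangerously close to -Xmx");
        } else {
            System.out.println("looks sane");
        }
    }
}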

4 Pinpointing Where the Problem Was

  • Modified mapred-site.xml to raise both the map heap (-Xmx) and the map container memory to 1024
  • Reading the GC log revealed the problem (I'm still not very experienced at GC troubleshooting)
    • ParOldGen total 560128K, used 532981K
    • 532981 K used, minus the 8693 K already sitting in the old gen before this, is 524288 K, which is exactly 512 MB
    • So some object, from somewhere, had requested more than 500 MB of memory in one go... (the arithmetic is spelled out right after the log below)
{Heap before GC invocations=7 (full 1):
 PSYoungGen      total 61440K, used 59392K [0x00000000eab00000, 0x00000000f2c00000, 0x0000000100000000)
  eden space 59392K, 100% used [0x00000000eab00000,0x00000000ee500000,0x00000000ee500000)
  from space 2048K, 0% used [0x00000000ee500000,0x00000000ee500000,0x00000000ee700000)
  to   space 6656K, 0% used [0x00000000f2580000,0x00000000f2580000,0x00000000f2c00000)
 ParOldGen       total 35328K, used 8685K [0x00000000c0000000, 0x00000000c2280000, 0x00000000eab00000)
  object space 35328K, 24% used [0x00000000c0000000,0x00000000c087b598,0x00000000c2280000)
 Metaspace       used 24829K, capacity 25096K, committed 25472K, reserved 1071104K
  class space    used 2792K, capacity 2882K, committed 2944K, reserved 1048576K
1.426: [GC (Allocation Failure) [PSYoungGen: 59392K->5394K(95744K)] 68077K->14087K(131072K), 0.0046666 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
Heap after GC invocations=7 (full 1):
 PSYoungGen      total 95744K, used 5394K [0x00000000eab00000, 0x00000000f2b00000, 0x0000000100000000)
  eden space 90112K, 0% used [0x00000000eab00000,0x00000000eab00000,0x00000000f0300000)
  from space 5632K, 95% used [0x00000000f2580000,0x00000000f2ac48b0,0x00000000f2b00000)
  to   space 7168K, 0% used [0x00000000f1d00000,0x00000000f1d00000,0x00000000f2400000)
 ParOldGen       total 35328K, used 8693K [0x00000000c0000000, 0x00000000c2280000, 0x00000000eab00000)
  object space 35328K, 24% used [0x00000000c0000000,0x00000000c087d598,0x00000000c2280000)
 Metaspace       used 24829K, capacity 25096K, committed 25472K, reserved 1071104K
  class space    used 2792K, capacity 2882K, committed 2944K, reserved 1048576K
}
Heap
 PSYoungGen      total 95744K, used 67268K [0x00000000eab00000, 0x00000000f2b00000, 0x0000000100000000)
  eden space 90112K, 68% used [0x00000000eab00000,0x00000000ee76ca98,0x00000000f0300000)
  from space 5632K, 95% used [0x00000000f2580000,0x00000000f2ac48b0,0x00000000f2b00000)
  to   space 7168K, 0% used [0x00000000f1d00000,0x00000000f1d00000,0x00000000f2400000)
 ParOldGen       total 560128K, used 532981K [0x00000000c0000000, 0x00000000e2300000, 0x00000000eab00000)
  object space 560128K, 95% used [0x00000000c0000000,0x00000000e087d5a8,0x00000000e2300000)
 Metaspace       used 29087K, capacity 29334K, committed 29824K, reserved 1075200K
  class space    used 3248K, capacity 3333K, committed 3456K, reserved 1048576K
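
A quick check of the jump in the old generation from this log (my arithmetic, not from the original article):

public class OldGenJumpCheck {
    public static void main(String[] args) {
        long usedAfterKb = 532981; // ParOldGen used in the final heap dump
        long usedBeforeKb = 8693;  // ParOldGen used right after GC invocation 7
        long deltaKb = usedAfterKb - usedBeforeKb; // 524288 KB
        // Prints "524288 KB = 512 MB": exactly one mapreduce.task.io.sort.mb-sized buffer
        // landed in the old generation in a single step.
        System.out.println(deltaKb + " KB = " + (deltaKb / 1024) + " MB");
    }
}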

5 The Offending Configuration

  • etc/hadoop/mapred-site.xml
    • This 512 MB makes each map task allocate a byte array of exactly that size on the map node, used later for sorting, merging, and so on
    • Let's see how the source code does it
  <!-- Memory for the map-side sort buffer; default: 100 -->
  <!-- Note: this amount of heap is consumed for sorting during the map phase -->
  <property>
        <name>mapreduce.task.io.sort.mb</name>
        <value>512</value>
  </property>

6 Source Code Analysis

  • MRJobConfig.IO_SORT_MB
    • The configuration key
public interface MRJobConfig {
  String IO_SORT_MB = "mapreduce.task.io.sort.mb";
  public static final int DEFAULT_IO_SORT_MB = 100;
}
  • MapTask
    • This looks like the circular (ring) buffer part
// Each record's metadata takes 16 bytes; NMETA is the number of metadata ints per record
private static final int NMETA = 4;            // num meta ints
private static final int METASIZE = NMETA * 4; // size in bytes

// The default is 100 MB
final int sortmb = job.getInt(MRJobConfig.IO_SORT_MB,
          MRJobConfig.DEFAULT_IO_SORT_MB);

// Left shift by 20: 100 << 20 = 104857600, i.e. 100*1024*1024, converting MB to bytes
int maxMemUsage = sortmb << 20;
// Rounds the buffer size down to a multiple of METASIZE (16 bytes) so it holds whole metadata entries;
// since sortmb << 20 is always a multiple of 16, IDEA even warns
// "Result of 'maxMemUsage % METASIZE' is always '0'"
maxMemUsage -= maxMemUsage % METASIZE;
// Here it is: this single array is the cause of the OOM
kvbuffer = new byte[maxMemUsage];

// And this is where the "spill when the circular buffer is 80% full" rule comes from
final float spillper = job.getFloat(JobContext.MAP_SORT_SPILL_PERCENT, (float)0.8);
softLimit = (int)(kvbuffer.length * spillper);
  • Program log (a quick sanity check of these numbers follows the log)
mapreduce.task.io.sort.mb: 512
soft limit at 429496736
bufstart = 0; bufvoid = 536870912
kvstart = 134217724; length = 33554432
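
To tie the log back to the source, the sketch below reproduces those numbers. It is mine, not part of MapTask: it just mirrors, in simplified form, the buffer-sizing arithmetic of MapTask.MapOutputBuffer with sortmb = 512; in particular the kvstart line is a simplification of what setEquator(0) computes.

public class SortBufferNumbers {
    public static void main(String[] args) {
        final int NMETA = 4;            // metadata ints per record
        final int METASIZE = NMETA * 4; // 16 bytes of metadata per record

        int sortmb = 512;                      // mapreduce.task.io.sort.mb
        int maxMemUsage = sortmb << 20;        // 536870912 bytes -> "bufvoid"
        maxMemUsage -= maxMemUsage % METASIZE; // still 536870912, already a multiple of 16

        float spillper = 0.8f;                          // mapreduce.map.sort.spill.percent
        int softLimit = (int) (maxMemUsage * spillper); // 429496736 -> "soft limit at ..."

        int kvstart = (maxMemUsage - METASIZE) / 4;     // 134217724 -> "kvstart = ..."
        int maxRec = (maxMemUsage / 4) / NMETA;         // 33554432  -> "length = ..."

        System.out.println("bufvoid    = " + maxMemUsage);
        System.out.println("soft limit = " + softLimit);
        System.out.println("kvstart    = " + kvstart);
        System.out.println("length     = " + maxRec);
    }
}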