1. 程式人生 > >MapReduce job Shuffle 過程的ERROR

MapReduce job Shuffle 過程的ERROR

1.錯誤描述

error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#43
at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1550)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:56)
at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:46)
at org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.<init>(InMemoryMapOutput.java:63)
at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.unconditionalReserve(MergeManagerImpl.java:297)
at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.reserve(MergeManagerImpl.java:287)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:414)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:343)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:166)

2.問題解決

           從error的stack資訊可以看出是mapreduce 的job shuffle過程中發生的OOM,map的輸出資料在reduce節點load的過多導致的OOM,所以將reduce的shuffle.input 由預設的0.7 改為0.6,該引數的變化需要根據實際情況確定。
  conf.set("mapreduce.reduce.shuffle.input.buffer.percent", "0.6");

3.相關引數

  • mapreduce.reduce.shuffle.input.buffer.percent :
the percentage of the reducer's heap memory to be allocated for the circular buffer to store the intermedite outputs copied from multiple mappers.

  • mapreduce.reduce.shuffle.memory.limit.percent
the maximum percentage of the above memory buffer that a single shuffle (output copied from single Map task) should take. The shuffle's size above this size will not be copied to the memory buffer, instead they will be directly written to the disk of the reducer.

  • mapreduce.reduce.shuffle.merge.percent:
the threshold percentage by where the in-memory merger thread will run to merge the available shuffle contents on the memory buffer into a single file and immediately spills the merged file into the disk.