1. 程式人生 > >hadoop平臺讀取檔案報錯

hadoop平臺讀取檔案報錯

背景: 生產環境有個指令碼執行讀取st層表資料時出現IO錯誤,查看錶目錄下的檔案,都是壓縮後的檔案。詳細資訊如下:


Task with the most failures(4):
-----
Task ID:
task_201408301703_172845_m_003505


URL:
http://master:50030/taskdetails.jsp?jobid=job_201408301703_172845&tipid=task_201408301703_172845_m_003505
-----
Diagnostic Messages for this Task:
java.io.IOException: IO error in map input file hdfs://master:9000/user/hive/warehouse/pc.db/dwd_st_pc_list/dt=startup/startup-m-03653.gz
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:242)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:216)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.io.IOException: java.io.EOFException: Unexpected end of input stream
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:276)
at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:108)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:236)
... 9 more
Caused by: java.io.EOFException: Unexpected end of input stream
at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:137)
at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:77)
at java.io.InputStream.read(InputStream.java:101)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:176)
at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:43)
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:274)
... 13 more




FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 12137 Reduce: 100 Cumulative CPU: 23227.51 sec HDFS Read: 2121148099 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 days 6 hours 27 minutes 7 seconds 510 msec


檢視對應的job任務日誌,發現有些檔案解壓報錯,找到對應的檔案,下載到本地解壓,還是報錯。
考慮到hadoop的備份了幾份,於是把這份出錯的檔案刪掉,刪除重試上面的指令碼即可。