
Starting Hadoop reports ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile

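    Problem: After rebooting all the servers, I started the Hadoop cluster and found that the standby NameNode would not come up. The log showed ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile. The full error output follows: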

2016-04-29 18:01:03,770 WARN org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' has not been set, no TrustStore will be loaded
2016-04-29 18:01:04,582 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/opt/hadoop/dfs/name/current/fsimage_0000000000000001344, cpktTxId=0000000000000001344)
java.io.IOException: Premature EOF from inputStream
	at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:194)
	at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:221)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:913)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:899)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:722)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:660)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:279)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:955)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:700)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:529)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:585)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:751)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:735)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1407)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1473)
2016-04-29 18:01:04,596 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception loading fsimage
java.io.IOException: Failed to load an FSImage file!
	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:671)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:279)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:955)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:700)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:529)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:585)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:751)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:735)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1407)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1473)
2016-04-29 18:01:04,602 INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@lida2:50070
2016-04-29 18:01:04,607 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics system...
2016-04-29 18:01:04,607 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system stopped.
2016-04-29 18:01:04,608 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
2016-04-29 18:01:04,608 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
java.io.IOException: Failed to load an FSImage file!
	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:671)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:279)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:955)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:700)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:529)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:585)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:751)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:735)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1407)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1473)
2016-04-29 18:01:04,609 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2016-04-29 18:01:04,614 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at lida2/10.30.12.170
************************************************************/
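Cause: the standby NameNode failed to load its FSImage file. The "Premature EOF from inputStream" exception means the file ended before the loader had read everything it expected, so fsimage_0000000000000001344 on the standby is most likely truncated or otherwise damaged.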
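
One way to confirm which image files are damaged is to compare each fsimage file with the checksum file stored next to it. The following is only a minimal sketch: it assumes every fsimage_N file has a sidecar named fsimage_N.md5 whose single line looks like "<md5 hex> *<file name>", and the function names check_fsimages and md5_of are made up for this illustration.

```python
import hashlib
import re
from pathlib import Path

# Assumed sidecar line format: "<32 hex chars> *<file name>"
MD5_LINE = re.compile(r"^([0-9a-fA-F]{32}) \*(.+)$")


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fsimages(current_dir):
    """Map each fsimage_* file to 'ok', 'mismatch', 'no-md5' or 'bad-md5-file'."""
    results = {}
    for image in sorted(Path(current_dir).glob("fsimage_*")):
        if image.suffix == ".md5":
            continue  # skip the checksum sidecars themselves
        sidecar = image.with_name(image.name + ".md5")
        if not sidecar.is_file():
            results[image.name] = "no-md5"
            continue
        match = MD5_LINE.match(sidecar.read_text().strip())
        if match is None:
            results[image.name] = "bad-md5-file"
            continue
        expected = match.group(1).lower()
        results[image.name] = "ok" if md5_of(image) == expected else "mismatch"
    return results
```

Calling check_fsimages("/opt/hadoop/dfs/name/current") on the standby would then be expected to report "mismatch" for a truncated image; running it on the active NameNode as well shows whether the copy there is intact before it is used as the source for a repair.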

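    Solutions:

    1. Manually copy all the files under XXX/dfs/name/current/ on the server hosting the active NameNode into the corresponding directory on the server hosting the standby NameNode. (This is the fix I used.)

    2. Reformat the active NameNode, then copy the freshly formatted metadata to the standby NameNode. (Someone else suggested this, and I do not think it is appropriate: reformatting the NameNode wipes its metadata, and losing the metadata has severe consequences. This approach is very risky, so use it with great caution!)

    3. Someone said: "There is a command you can just run. It goes through all the metadata once, keeps the good parts and throws out the bad ones. The consequence is that a small amount of bad metadata is lost." I do not know which command this is and could not find it by searching. If anyone knows, please tell me!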
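
For option 1, it is safer to keep the standby's old directory instead of overwriting it in place. The sketch below assumes the active NameNode's current/ directory has already been transferred to a staging directory on the standby host (for example with scp) and that the standby NameNode is stopped. The function name replace_with_backup and both paths are hypothetical.

```python
import shutil
import time
from pathlib import Path


def replace_with_backup(staged_dir, standby_current):
    """Move the standby's current/ aside, then install the staged copy.

    Returns the backup path, or None if there was nothing to back up.
    """
    staged = Path(staged_dir)
    target = Path(standby_current)
    if not staged.is_dir():
        raise FileNotFoundError(f"staged copy not found: {staged}")

    backup = None
    if target.exists():
        stamp = time.strftime("%Y%m%d-%H%M%S")
        backup = target.with_name(f"{target.name}.bak-{stamp}")
        target.rename(backup)  # keep the damaged files for later inspection

    shutil.copytree(staged, target)  # target must not exist at this point
    return backup
```

After the copy, file ownership and permissions may need to be restored for the user that runs the NameNode. Hadoop also ships `hdfs namenode -bootstrapStandby`, which is intended for re-syncing a standby from the active NameNode, and the command described in option 3 may be the NameNode recovery mode, `hdfs namenode -recover`, although I have not confirmed that this is what was meant.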