
Building a Basic Hadoop Cluster with Docker (Part 5)

8.1. The DataNode fails to start with an UnknownHostException; the log shows the following fatal error:

2017-01-08 03:36:29,815 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
java.net.UnknownHostException: 26b72653d296: 26b72653d296: unknown error
	at java.net.InetAddress.getLocalHost(InetAddress.java:1505)
	at org.apache.hadoop.security.SecurityUtil.getLocalHostName(SecurityUtil.java:187)
	at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:207)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2289)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2338)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2515)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2539)
Caused by: java.net.UnknownHostException: 26b72653d296: unknown error
	at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
	at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
	at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
	at java.net.InetAddress.getLocalHost(InetAddress.java:1500)

Judging from the log, hostname resolution inside the container seemed to be the problem, so I replaced all the hostnames stored in the slaves file with IP addresses, which solved it.
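As a sketch, the fix amounts to rewriting the slaves file so it lists IP addresses instead of hostnames. The IPs below are hypothetical stand-ins; use your containers' actual addresses (e.g. from `docker inspect`). The real file lives at $HADOOP_HOME/etc/hadoop/slaves; the scratch default path here just keeps the sketch safe to run outside a real Hadoop install.

```shell
# Rewrite the slaves file to list DataNode IPs instead of hostnames.
# The IPs are placeholders for this article's two DataNode containers.
CONF_DIR="${HADOOP_HOME:-/tmp/hadoop}/etc/hadoop"
mkdir -p "$CONF_DIR"
cat > "$CONF_DIR/slaves" <<'EOF'
172.17.0.3
172.17.0.4
EOF
cat "$CONF_DIR/slaves"
```

After editing the file, restart HDFS with stop-dfs.sh / start-dfs.sh so the new worker list takes effect.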

8.2. start-dfs.sh reports that every DataNode started successfully, but no nodes show up on the 50070 web UI; the log contains the following error:

2017-01-08 05:39:48,880 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool BP-1803144284-192.168.1.1-1483792012201 (Datanode Uuid null) service to 172.17.0.2/172.17.0.2:9000 Datanode denied communication with namenode because hostname cannot be resolved (ip=172.17.0.3, hostname=172.17.0.3): DatanodeRegistration(0.0.0.0, datanodeUuid=c19b9c4d-0e64-43ba-b458-281ae1af4738, infoPort=50075, ipcPort=50020, storageInfo=lv=-56;cid=CID-40010481-40ed-4830-8547-27eabc5af90f;nsid=517203298;c=0)
	at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:904)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:5088)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1141)
	at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:93)
	at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28293)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2038)

Adding the following property to hdfs-site.xml resolves the issue (it disables the NameNode's hostname check during DataNode registration):

<property>
        <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
        <value>false</value>
</property>
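For background, the NameNode rejects the registration because it cannot reverse-resolve the DataNode's IP (172.17.0.3 in the log above) back to a hostname. You can check what the NameNode would see by running a reverse lookup inside the NameNode container; the IP here is the one from this article's log, and yours may differ:

```shell
# Try the reverse lookup the NameNode performs during registration.
# No mapping means the registration check will reject the DataNode
# unless the ip-hostname check is disabled or /etc/hosts is fixed.
getent hosts 172.17.0.3 || echo "no reverse mapping for 172.17.0.3"
```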

8.3. Do not run hdfs namenode -format more than once. At one point only a single node showed up on 50070 and its IP changed on every refresh (unfortunately I did not save the corresponding log). After deleting everything under /data/hdfs/name, /data/hdfs/data and /data/tmp and re-formatting, re-running start-dfs.sh made all nodes show up normally on 50070. But after formatting yet again and running start-dfs.sh, the DataNodes failed to start, with this log:

2017-01-08 11:27:03,961 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to /172.17.0.2:9000. Exiting. 
java.io.IOException: All specified directories are failed to load.
	at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:479)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1398)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1363)
	at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:317)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:228)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:845)
	at java.lang.Thread.run(Thread.java:745)
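The likely cause: each hdfs namenode -format generates a fresh clusterID in the NameNode's VERSION file, and DataNode storage directories still carrying the old clusterID then refuse to load, which is what "All specified directories are failed to load" indicates. The cure is to clear the data directories on every node before re-formatting. A sketch, assuming the /data layout used in this article; the scratch default root makes it safe to run anywhere:

```shell
# The dirs below mirror this article's layout under /data; they are
# created under a scratch root here so the sketch is harmless to run.
DATA_ROOT="${DATA_ROOT:-/tmp/hadoop-demo}"   # real cluster: DATA_ROOT=/data
mkdir -p "$DATA_ROOT/hdfs/name" "$DATA_ROOT/hdfs/data" "$DATA_ROOT/tmp"
touch "$DATA_ROOT/hdfs/name/VERSION"         # stand-in for stale metadata
# Wipe stale state (run on EVERY node; this destroys all HDFS data):
for d in hdfs/name hdfs/data tmp; do
  rm -rf "${DATA_ROOT:?}/$d"/*               # ":?" guards an empty variable
done
ls "$DATA_ROOT/hdfs/name" | wc -l            # prints 0: dirs are now empty
# Then, on the NameNode only:
#   hdfs namenode -format
#   start-dfs.sh
```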

8.4. Running the wordcount example fails with the following error:

17/01/09 15:15:05 INFO mapreduce.Job: Job job_1483969899500_0004 failed with state FAILED due to: Application application_1483969899500_0004 failed 2 times due to Error launching appattempt_1483969899500_0004_000002. Got exception: java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: "6696a6544d4c/172.17.0.2"; destination host is: "3e2a08477956":39523;
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
        at org.apache.hadoop.ipc.Client.call(Client.java:1472)
        at org.apache.hadoop.ipc.Client.call(Client.java:1399)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
        at com.sun.proxy.$Proxy32.startContainers(Unknown Source)
        at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:96)
        at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:119)
        at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:254)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Couldn't set up IO streams
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:786)
        at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
        at org.apache.hadoop.ipc.Client.call(Client.java:1438)
        ... 9 more
Caused by: java.nio.channels.UnresolvedAddressException
        at sun.nio.ch.Net.checkAddress(Net.java:101)
        at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:622)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)

In the log above, destination host is: "3e2a08477956":39523 contains only a hostname with no corresponding IP, while local host is: "6696a6544d4c/172.17.0.2" contains both the hostname and the IP. So I figured the IP of host 3e2a08477956 could not be resolved. I then added each container's IP and hostname to /etc/hosts, and the job ran successfully.
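A sketch of the /etc/hosts fix, using the container IDs from the log above as hostnames; the IP/hostname pairs are this article's and yours will differ (take them from docker inspect). The scratch default path keeps the sketch safe to run; on the real containers append to /etc/hosts on every node:

```shell
# Map each container's IP to its hostname so YARN can resolve peers.
# Pairs below come from this article's log; substitute your own.
HOSTS_FILE="${HOSTS_FILE:-/tmp/demo-hosts}"   # real file: /etc/hosts
cat >> "$HOSTS_FILE" <<'EOF'
172.17.0.2 6696a6544d4c
172.17.0.3 3e2a08477956
EOF
grep 3e2a08477956 "$HOSTS_FILE"
```

Note that plain docker containers regenerate /etc/hosts on restart, so these entries may need to be re-added (or the containers started with --add-host) after each restart.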