INFO org.apache.hadoop.ipc.RPC: Server at /:9000 not available yet, Zzzzz

I thought everything was done at this point.

Then I ran bin/hadoop dfsadmin -report to check the state of the cluster, and it reported the following:
Configured Capacity: 0 (0 KB)
Present Capacity: 0 (0 KB)
DFS Remaining: 0 (0 KB)
DFS Used: 0 (0 KB)
DFS Used%: ?%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
----------------------------------------------------
Datanodes available: 0 (0 total, 0 dead)
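"Datanodes available: 0" means no datanode has registered with the namenode at all. Two quick checks, sketched under the assumption that Hadoop is installed the same way on both machines:

# On the slave: is a DataNode JVM even running?
jps | grep DataNode

# On the master: how many datanodes does the namenode currently see?
bin/hadoop dfsadmin -report | grep -i "datanodes available"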

The datanode kept failing to connect to the namenode.

Checking the datanode log on the slave (the datanode machine) showed the following error:

2011-10-26 17:57:05,231 INFO org.apache.hadoop.ipc.RPC: Server at /192.168.0.100:9000 not available yet, Zzzzz...
2011-10-26 17:57:07,235 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /192.168.0.100:9000. Already tried 0 time(s).
2011-10-26 17:57:08,236 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /192.168.0.100:9000. Already tried 1 time(s).
2011-10-26 17:57:09,237 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /192.168.0.100:9000. Already tried 2 time(s).
2011-10-26 17:57:10,239 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /192.168.0.100:9000. Already tried 3 time(s).
2011-10-26 17:57:11,240 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /192.168.0.100:9000. Already tried 4 time(s).
2011-10-26 17:57:12,241 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /192.168.0.100:9000. Already tried 5 time(s).

Again, the datanode could not connect to the namenode.
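Before blaming the network, it is worth confirming which address the namenode is actually bound to. A rough check (commands run as shown; the grep pattern assumes the fs port 9000 used above):

# On the master: see which local address port 9000 is bound to.
# If it shows 127.0.0.1:9000 instead of 192.168.0.100:9000 (or 0.0.0.0:9000),
# remote datanodes cannot reach it.
netstat -tnlp | grep 9000

# From the slave: verify the port is reachable at all.
telnet 192.168.0.100 9000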

Meanwhile, the namenode log showed:

2011-10-26 14:18:49,686 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000, call addBlock(/root/hadoop/tmp/mapred/system/jobtracker.info, DFSClient_-1928560478, null, null) from 127.0.0.1:32817: error: java.io.IOException: File /root/hadoop/tmp/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
java.io.IOException: File /root/hadoop/tmp/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1448)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:690)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:342)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1350)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1346)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1344)

In other words, on the namenode side the write was failing too: the jobtracker was trying to store jobtracker.info in HDFS, and there was no datanode to hold the block. After searching online, I found the root cause was the IP mapping in /etc/hosts. On the master, /etc/hosts was configured as:

127.0.0.1      master

127.0.1.1     server.ubuntu-domain    server

192.168.0.100 server

192.168.0.111  hdfs1

The problem was most likely that name resolution matches the first entry it encounters, so the master's hostname resolved to a loopback address instead of 192.168.0.100. I commented out the first two lines (and later changed the first line to "127.0.0.1 localhost"). After a normal hadoop format and startup, the datanode could connect.
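For reference, a minimal working /etc/hosts for this layout might look like the sketch below (hostnames taken from the listing above; the slave's own /etc/hosts needs matching entries for itself and for the master):

127.0.0.1      localhost
192.168.0.100  server        # master: namenode / jobtracker
192.168.0.111  hdfs1         # slave: datanode / tasktracker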

Finally, the datanode may still sometimes shut itself down. The fix for that is to delete the tmp files on all of the masters and slaves, then format and restart.
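A rough sequence for that cleanup, assuming the tmp directory is the /root/hadoop/tmp path that appears in the namenode error above (note that namenode -format wipes HDFS, so only do this on a cluster with no data worth keeping):

# On the master: stop all daemons first.
bin/stop-all.sh

# On every master and slave: remove the old tmp data
# (path is an assumption based on the error message above).
rm -rf /root/hadoop/tmp/*

# Back on the master: reformat HDFS and restart the daemons.
bin/hadoop namenode -format
bin/start-all.sh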