How to restart a three-node Hadoop/HBase cluster after one node suddenly goes down
阿新 · Published 2018-12-12
Because of a gateway problem, one server lost its connection and both Hadoop NameNodes went down. Checking HBase then produced the following "can't find master" error:
ERROR: Can't get master address from ZooKeeper; znode data == null
So the question is: how do we bring everything back up?
$ stop-hbase.sh
stopping hbase....................
Prerequisite: the ZooKeeper process must be healthy, i.e. jps shows:
24454 QuorumPeerMain
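Rather than eyeballing the jps output, the prerequisite check can be scripted. A minimal sketch (the helper name `zk_running` is mine, not a stock command):

```shell
# Check that ZooKeeper's QuorumPeerMain appears in a jps listing.
# Reads the listing from stdin so it can also be run against saved output.
zk_running() {
  grep -q 'QuorumPeerMain'
}

# usage: jps | zk_running && echo "zookeeper ok" || echo "zookeeper DOWN"
```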
1. HBase turns out to be very hard to stop this way, so the HBase processes have to be killed manually with kill -9 <pid>. From the jps listing:
25075 HMaster
25076 HRegionServer
kill -9 25075
kill -9 25076
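Looking the pids up by hand works, but the step above can be automated. A hedged sketch (`kill_hbase` is a hypothetical helper, not part of HBase):

```shell
# Force-kill any HMaster/HRegionServer processes still listed by jps.
kill_hbase() {
  jps | awk '$2 == "HMaster" || $2 == "HRegionServer" {print $1}' |
  while read -r pid; do
    kill -9 "$pid"
  done
}
```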
2. With HBase force-stopped, shut down the Hadoop processes. Run ./stop-all.sh in ${HADOOP_HOME}/sbin; it prints:
$ ./stop-all.sh
This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh
Stopping namenodes on [hadoop001 hadoop002]
hadoop002: no namenode to stop
hadoop001: no namenode to stop
hadoop003: no datanode to stop
hadoop002: no datanode to stop
hadoop001: no datanode to stop
Stopping journal nodes [hadoop001 hadoop002 hadoop003]
hadoop002: no journalnode to stop
hadoop003: no journalnode to stop
hadoop001: no journalnode to stop
Stopping ZK Failover Controllers on NN hosts [hadoop001 hadoop002]
hadoop002: no zkfc to stop
hadoop001: no zkfc to stop
stopping yarn daemons
no resourcemanager to stop
hadoop003: no nodemanager to stop
hadoop002: no nodemanager to stop
hadoop001: no nodemanager to stop
no proxyserver to stop
When it finishes, check with jps whether the Hadoop processes have actually stopped (the NameNodes were already down). They had not, so force-kill them with kill -9 <pid>:
$ kill -9 5778
$ kill -9 6130
$ kill -9 5977
$ jps
2677 Bootstrap
22837 jar
24454 QuorumPeerMain
9254 ConsoleConsumer
31640 Jps
30411 Bootstrap
Do the same on the other servers:
$ jps
18562 DataNode
18754 DFSZKFailoverController
22340 QuorumPeerMain
18661 JournalNode
23605 Bootstrap
17847 Jps
1626 Kafka
$ kill -9 18562
$ kill -9 18661
$ jps
17905 Jps
18754 DFSZKFailoverController
22340 QuorumPeerMain
23605 Bootstrap
1626 Kafka
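Repeating the kill-and-verify cycle on each remaining node can be wrapped in a loop. A sketch that assumes passwordless SSH between the nodes (`stop_remote_daemons` is a name I made up; hadoop002/hadoop003 are the hostnames from the logs above):

```shell
# Force-kill leftover HDFS daemons on each remote host via SSH.
stop_remote_daemons() {
  for host in "$@"; do
    ssh "$host" '
      for proc in NameNode DataNode JournalNode; do
        pid=$(jps | awk -v p="$proc" "\$2 == p {print \$1}")
        [ -n "$pid" ] && kill -9 "$pid"
      done'
  done
}

# usage: stop_remote_daemons hadoop002 hadoop003
```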
3. Once everything is stopped, start Hadoop again:

$ ./start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [hadoop001 hadoop002]
hadoop002: starting namenode, logging to /data/hadoop/hadoop/logs/hadoop-hadoop-namenode-hadoop002.out
hadoop001: starting namenode, logging to /data/hadoop/hadoop/logs/hadoop-hadoop-namenode-hadoop001.out
hadoop002: starting datanode, logging to /data/hadoop/hadoop/logs/hadoop-hadoop-datanode-hadoop002.out
hadoop003: starting datanode, logging to /data/hadoop/hadoop/logs/hadoop-hadoop-datanode-hadoop003.out
hadoop001: starting datanode, logging to /data/hadoop/hadoop/logs/hadoop-hadoop-datanode-hadoop001.out
Starting journal nodes [hadoop001 hadoop002 hadoop003]
hadoop003: starting journalnode, logging to /data/hadoop/hadoop/logs/hadoop-hadoop-journalnode-hadoop003.out
hadoop002: starting journalnode, logging to /data/hadoop/hadoop/logs/hadoop-hadoop-journalnode-hadoop002.out
hadoop001: starting journalnode, logging to /data/hadoop/hadoop/logs/hadoop-hadoop-journalnode-hadoop001.out
Starting ZK Failover Controllers on NN hosts [hadoop001 hadoop002]
hadoop002: starting zkfc, logging to /data/hadoop/hadoop/logs/hadoop-hadoop-zkfc-hadoop002.out
hadoop001: starting zkfc, logging to /data/hadoop/hadoop/logs/hadoop-hadoop-zkfc-hadoop001.out
starting yarn daemons
starting resourcemanager, logging to /data/hadoop/hadoop/logs/yarn-hadoop-resourcemanager-hadoop001.out
hadoop002: starting nodemanager, logging to /data/hadoop/hadoop/logs/yarn-hadoop-nodemanager-hadoop002.out
hadoop003: starting nodemanager, logging to /data/hadoop/hadoop/logs/yarn-hadoop-nodemanager-hadoop003.out
hadoop001: starting nodemanager, logging to /data/hadoop/hadoop/logs/yarn-hadoop-nodemanager-hadoop001.out
$ jps
2019 NameNode
2677 Bootstrap
22837 jar
2133 DataNode
24454 QuorumPeerMain
9254 ConsoleConsumer
2345 JournalNode
2633 ResourceManager
30411 Bootstrap
3115 Jps
2540 DFSZKFailoverController
2765 NodeManager
Check with jps that all the expected daemons are present. If they are running normally, Hadoop is back.
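The "check with jps" step can be made mechanical by comparing the listing against the daemons this HA node should run. A sketch (`check_hadoop` and the expected list are assumptions based on the jps output above, for the node that also hosts the ResourceManager):

```shell
# Report any expected Hadoop daemon missing from the local jps listing.
check_hadoop() {
  expected="NameNode DataNode JournalNode DFSZKFailoverController ResourceManager NodeManager"
  listing=$(jps)
  missing=""
  for d in $expected; do
    printf '%s\n' "$listing" | grep -qw "$d" || missing="$missing $d"
  done
  if [ -z "$missing" ]; then
    echo "all daemons up"
  else
    echo "missing:$missing"
  fi
}
```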
4. Start HBase:

$ ./start-hbase.sh
starting master, logging to /data/hadoop/hbase/logs/hbase-hadoop-master-hadoop001.out
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
hadoop001: starting regionserver, logging to /data/hadoop/hbase/logs/hbase-hadoop-regionserver-hadoop001.out
hadoop003: starting regionserver, logging to /data/hadoop/hbase/bin/../logs/hbase-hadoop-regionserver-hadoop003.out
hadoop002: starting regionserver, logging to /data/hadoop/hbase/bin/../logs/hbase-hadoop-regionserver-hadoop002.out
hadoop001: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
hadoop001: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
hadoop003: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
hadoop003: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
hadoop002: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
hadoop002: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
The HBase processes came up successfully. Verify in hbase shell (e.g. the status command shows the live master and region servers).