
A three-node Hadoop/HBase cluster: one node suddenly went down. How do you restart it?

Because of a gateway problem, one server could not be reached, and as a result both of Hadoop's NameNodes went down.

Checking HBase showed the following can't-find-master error:

ERROR: Can't get master address from ZooKeeper; znode data == null

So the question is: how do we restart everything?
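Before tearing anything down, it helps to detect this condition mechanically rather than by eyeballing the shell. A minimal sketch: the grep pattern is simply the error text shown above, and `master_znode_missing` is a made-up helper name, not an HBase tool.

```shell
#!/bin/sh
# Hypothetical helper: succeed iff the "can't get master address" error
# appears in HBase shell output read from stdin.
master_znode_missing() {
  grep -q "Can't get master address from ZooKeeper"
}

# On a live cluster you would pipe real output into it, e.g.:
#   echo "status" | hbase shell 2>&1 | master_znode_missing
printf "ERROR: Can't get master address from ZooKeeper; znode data == null\n" \
  | master_znode_missing && echo "master znode missing"
```

This error means the HMaster never registered its address in ZooKeeper, which is consistent with the NameNodes being down underneath it.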

$ stop-hbase.sh
stopping hbase..............................................................

Prerequisite: the ZooKeeper process must be running normally, i.e. `jps` shows:

24454 QuorumPeerMain
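That prerequisite can be checked mechanically too. A sketch that reads `jps` output and succeeds only when ZooKeeper's QuorumPeerMain is listed (`zk_running` is our name, not a standard tool):

```shell
#!/bin/sh
# zk_running: read `jps` output on stdin; exit 0 iff QuorumPeerMain is listed.
zk_running() {
  awk '$2 == "QuorumPeerMain" { found = 1 } END { exit !found }'
}

# Real usage on a node would be: jps | zk_running || echo "start zookeeper first"
printf '24454 QuorumPeerMain\n25075 HMaster\n' | zk_running && echo "zookeeper ok"
```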

1. HBase turned out to be very hard to shut down this way. At this point the HBase processes have to be force-killed by hand:

kill -9 <pid>

25075 HMaster

25076 HRegionServer

kill -9 25075
kill -9 25076
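Rather than typing each kill by hand, the PIDs can be pulled out of `jps` output. A sketch (the awk filter and the `hbase_pids` name are ours; the sample PIDs are the ones from the listing above):

```shell
#!/bin/sh
# hbase_pids: read `jps` output on stdin and print only HBase daemon PIDs.
hbase_pids() {
  awk '$2 == "HMaster" || $2 == "HRegionServer" { print $1 }'
}

# On the real node this would feed kill:  jps | hbase_pids | xargs -r kill -9
printf '24454 QuorumPeerMain\n25075 HMaster\n25076 HRegionServer\n' | hbase_pids
```

Filtering by daemon name keeps ZooKeeper (QuorumPeerMain) out of the kill list, which matters because the restart relies on ZooKeeper staying up.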

2. After force-killing HBase, stop the Hadoop processes.

Run ./stop-all.sh under ${HADOOP_HOME}/sbin. It prints the following:

$ ./stop-all.sh 
This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh
Stopping namenodes on [hadoop001 hadoop002]
hadoop002: no namenode to stop
hadoop001: no namenode to stop
hadoop003: no datanode to stop
hadoop002: no datanode to stop
hadoop001: no datanode to stop
Stopping journal nodes [hadoop001 hadoop002 hadoop003]
hadoop002: no journalnode to stop
hadoop003: no journalnode to stop
hadoop001: no journalnode to stop
Stopping ZK Failover Controllers on NN hosts [hadoop001 hadoop002]
hadoop002: no zkfc to stop
hadoop001: no zkfc to stop
stopping yarn daemons
no resourcemanager to stop
hadoop003: no nodemanager to stop
hadoop002: no nodemanager to stop
hadoop001: no nodemanager to stop
no proxyserver to stop

When it finishes, use jps to check whether the Hadoop processes have actually stopped (the NameNode was already dead).

They had not stopped.

So force-kill the remaining ones with kill -9 <pid>:

$ kill -9 5778
$ kill -9 6130
$ kill -9 5977
$ jps
2677 Bootstrap
22837 jar
24454 QuorumPeerMain
9254 ConsoleConsumer
31640 Jps
30411 Bootstrap

Do the same on the other servers.

$ jps
18562 DataNode
18754 DFSZKFailoverController
22340 QuorumPeerMain
18661 JournalNode
23605 Bootstrap
17847 Jps
1626 Kafka
$ kill -9 18562
$ kill -9 18661
$ jps
17905 Jps
18754 DFSZKFailoverController
22340 QuorumPeerMain
23605 Bootstrap
1626 Kafka
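Repeating this cleanup on every node can also be scripted. The daemon-name filter below is keyed on the names seen in the jps listings; the hostnames match this cluster, but the ssh loop is only a sketch of how it would be driven, and `hadoop_pids` is a name we made up:

```shell
#!/bin/sh
# hadoop_pids: read `jps` output on stdin and print PIDs of Hadoop daemons
# only, leaving ZooKeeper, Kafka and other processes untouched.
hadoop_pids() {
  awk '$2 ~ /^(NameNode|DataNode|JournalNode|DFSZKFailoverController|ResourceManager|NodeManager)$/ { print $1 }'
}

# Sketch of driving it per node (assumes passwordless ssh; not run here):
#   for host in hadoop001 hadoop002 hadoop003; do
#     ssh "$host" 'jps' | hadoop_pids    # then kill each PID on that host
#   done
printf '18562 DataNode\n22340 QuorumPeerMain\n18661 JournalNode\n1626 Kafka\n' | hadoop_pids
```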

3. Once everything is stopped, start Hadoop again.

./start-all.sh

$ ./start-all.sh 
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [hadoop001 hadoop002]
hadoop002: starting namenode, logging to /data/hadoop/hadoop/logs/hadoop-hadoop-namenode-hadoop002.out
hadoop001: starting namenode, logging to /data/hadoop/hadoop/logs/hadoop-hadoop-namenode-hadoop001.out
hadoop002: starting datanode, logging to /data/hadoop/hadoop/logs/hadoop-hadoop-datanode-hadoop002.out
hadoop003: starting datanode, logging to /data/hadoop/hadoop/logs/hadoop-hadoop-datanode-hadoop003.out
hadoop001: starting datanode, logging to /data/hadoop/hadoop/logs/hadoop-hadoop-datanode-hadoop001.out
Starting journal nodes [hadoop001 hadoop002 hadoop003]
hadoop003: starting journalnode, logging to /data/hadoop/hadoop/logs/hadoop-hadoop-journalnode-hadoop003.out
hadoop002: starting journalnode, logging to /data/hadoop/hadoop/logs/hadoop-hadoop-journalnode-hadoop002.out
hadoop001: starting journalnode, logging to /data/hadoop/hadoop/logs/hadoop-hadoop-journalnode-hadoop001.out
Starting ZK Failover Controllers on NN hosts [hadoop001 hadoop002]
hadoop002: starting zkfc, logging to /data/hadoop/hadoop/logs/hadoop-hadoop-zkfc-hadoop002.out
hadoop001: starting zkfc, logging to /data/hadoop/hadoop/logs/hadoop-hadoop-zkfc-hadoop001.out
starting yarn daemons
starting resourcemanager, logging to /data/hadoop/hadoop/logs/yarn-hadoop-resourcemanager-hadoop001.out
hadoop002: starting nodemanager, logging to /data/hadoop/hadoop/logs/yarn-hadoop-nodemanager-hadoop002.out
hadoop003: starting nodemanager, logging to /data/hadoop/hadoop/logs/yarn-hadoop-nodemanager-hadoop003.out
hadoop001: starting nodemanager, logging to /data/hadoop/hadoop/logs/yarn-hadoop-nodemanager-hadoop001.out
$ jps
2019 NameNode
2677 Bootstrap
22837 jar
2133 DataNode
24454 QuorumPeerMain
9254 ConsoleConsumer
2345 JournalNode
2633 ResourceManager
30411 Bootstrap
3115 Jps
2540 DFSZKFailoverController
2765 NodeManager

Use jps to check that the daemons are running.

If everything is up, Hadoop is back.
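That jps check can be made explicit. A sketch that verifies every expected daemon name appears in the output (the daemon list is taken from the jps listing above; `check_daemons` is our name):

```shell
#!/bin/sh
# check_daemons: read `jps` output on stdin; report the first expected
# daemon that is missing, or print "all daemons up" when none are.
check_daemons() {
  input=$(cat)
  for d in NameNode DataNode JournalNode DFSZKFailoverController \
           ResourceManager NodeManager; do
    printf '%s\n' "$input" | awk -v n="$d" '$2 == n { ok = 1 } END { exit !ok }' \
      || { echo "missing: $d"; return 1; }
  done
  echo "all daemons up"
}

# Real usage on a node: jps | check_daemons
printf '2019 NameNode\n2133 DataNode\n2345 JournalNode\n2540 DFSZKFailoverController\n2633 ResourceManager\n2765 NodeManager\n' | check_daemons
```

Note the expected set differs per node (e.g. only the NameNode hosts run a DFSZKFailoverController), so in practice the list would be adjusted per host.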

4. Start HBase

Start HBase with ./start-hbase.sh:

./start-hbase.sh 
starting master, logging to /data/hadoop/hbase/logs/hbase-hadoop-master-hadoop001.out
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
hadoop001: starting regionserver, logging to /data/hadoop/hbase/logs/hbase-hadoop-regionserver-hadoop001.out
hadoop003: starting regionserver, logging to /data/hadoop/hbase/bin/../logs/hbase-hadoop-regionserver-hadoop003.out
hadoop002: starting regionserver, logging to /data/hadoop/hbase/bin/../logs/hbase-hadoop-regionserver-hadoop002.out
hadoop001: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
hadoop001: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
hadoop003: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
hadoop003: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
hadoop002: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
hadoop002: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0

The HBase processes came up successfully.

Verify with the hbase shell.
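In the shell, `status` should report three live region servers for this cluster. A small sketch that parses the count out of that summary line; the exact line format is an assumption based on stock HBase shell output, and `live_servers` is our name:

```shell
#!/bin/sh
# live_servers: extract the live region-server count from an HBase `status`
# summary line such as "... 3 servers, 0 dead, ..." (line format assumed).
live_servers() {
  grep -o '[0-9][0-9]* servers' | awk '{ print $1 }'
}

# Real usage: echo "status" | hbase shell 2>/dev/null | live_servers
printf '1 active master, 0 backup masters, 3 servers, 0 dead, 2.0000 average load\n' \
  | live_servers
```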