
Hadoop Cluster (4): Hadoop Upgrade


The cluster installed in the earlier posts runs Hadoop 2.6; this post upgrades it to 2.7.

Note that HBase runs on this cluster, so HBase has to be stopped before the upgrade and started again afterwards.

For the installation steps, see:

Hadoop Cluster (1): ZooKeeper Setup

Hadoop Cluster (2): HDFS Setup

Hadoop Cluster (3): HBase Setup

The upgrade procedure is as follows.

Cluster IP list

```
Namenode:
192.168.143.46
192.168.143.103

Journalnode:
192.168.143.101
192.168.143.102
192.168.143.103

Datanode & Hbase regionserver:
192.168.143.196
192.168.143.231
192.168.143.182
192.168.143.235
192.168.143.41
192.168.143.127

Hbase master:
192.168.143.103
192.168.143.101

Zookeeper:
192.168.143.101
192.168.143.102
192.168.143.103
```

1. First, determine the path Hadoop runs from, distribute the new release to that same path on every node, and unpack it.

```
# ll /usr/local/hadoop/
total 493244
drwxrwxr-x 9 root root      4096 Mar 21  2017 hadoop-release -> hadoop-2.6.0-EDH-0u1-SNAPSHOT-HA-SECURITY
drwxr-xr-x 9 root root      4096 Oct 11 11:06 hadoop-2.7.1
-rw-r--r-- 1 root root 194690531 Oct  9 10:55 hadoop-2.7.1.tar.gz
drwxrwxr-x 7 root root      4096 May 21  2016 hbase-1.1.3
-rw-r--r-- 1 root root 128975247 Apr 10  2017 hbase-1.1.3.tar.gz
lrwxrwxrwx 1 root root        29 Apr 10  2017 hbase-release -> /usr/local/hadoop/hbase-1.1.3
```

Since this is an upgrade, the configuration does not change at all: copy the etc/hadoop directory from the old hadoop-2.6.0 tree into hadoop-2.7.1, replacing the stock configuration files there. A loop such as the sketch below can do this on every node.
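A minimal sketch of that copy, assuming the paths from the listing above; the HOSTS list mirrors the cluster IP list and is illustrative:

```bash
# Sketch only: at this point hadoop-release still points at 2.6.0, so its
# etc/hadoop holds the live configuration. Adjust HOSTS to your cluster.
HOSTS="192.168.143.46 192.168.143.103 192.168.143.101 192.168.143.102 \
192.168.143.196 192.168.143.231 192.168.143.182 192.168.143.235 \
192.168.143.41 192.168.143.127"
for h in $HOSTS; do
  ssh -t -q $h "cp -a /usr/local/hadoop/hadoop-release/etc/hadoop/. \
                      /usr/local/hadoop/hadoop-2.7.1/etc/hadoop/"
done
```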

At this point the pre-upgrade preparation is complete.

The upgrade procedure itself follows. Every command is issued from a single jump host through shell scripts, which avoids repeatedly logging in to each node over ssh. The per-host command lists below all follow one pattern, so they can be wrapped in a small helper, as the sketch after this paragraph shows.
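A minimal wrapper sketch, assuming passwordless ssh and sudo from the jump host; the script name run_on.sh is hypothetical:

```bash
#!/bin/bash
# run_on.sh: run one command as a service user on a list of hosts.
# Usage: ./run_on.sh "host1 host2 ..." user "command"
HOSTS="$1"; RUNAS="$2"; CMD="$3"
for h in $HOSTS; do
  echo "== $h =="
  ssh -t -q "$h" sudo su -l "$RUNAS" -c "$CMD"
done
```

For instance, `./run_on.sh "192.168.143.196 192.168.143.231" hdfs "jps"` would replace two of the jps check lines below.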

## Stop HBase (run as the hbase user)

2. Stop the HBase masters (hbase user)

Check the status page to confirm which master is active; stop the standby master first:

```
http://192.168.143.101:16010/master-status
```

```
ssh -t -q 192.168.143.103 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ stop\ master"
ssh -t -q 192.168.143.103 sudo su -l hbase -c "jps"
ssh -t -q 192.168.143.101 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ stop\ master"
ssh -t -q 192.168.143.101 sudo su -l hbase -c "jps"
```

3. Stop the HBase regionservers (hbase user)

```
ssh -t -q 192.168.143.196 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ stop\ regionserver"
ssh -t -q 192.168.143.231 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ stop\ regionserver"
ssh -t -q 192.168.143.182 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ stop\ regionserver"
ssh -t -q 192.168.143.235 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ stop\ regionserver"
ssh -t -q 192.168.143.41 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ stop\ regionserver"
ssh -t -q 192.168.143.127 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ stop\ regionserver"
```

Check that they have stopped:

```
ssh -t -q 192.168.143.196 sudo su -l hbase -c "jps"
ssh -t -q 192.168.143.231 sudo su -l hbase -c "jps"
ssh -t -q 192.168.143.182 sudo su -l hbase -c "jps"
ssh -t -q 192.168.143.235 sudo su -l hbase -c "jps"
ssh -t -q 192.168.143.41 sudo su -l hbase -c "jps"
ssh -t -q 192.168.143.127 sudo su -l hbase -c "jps"
```

## Stop services: HDFS

4. First confirm which NameNode is active, via the web UI. This NameNode must be the first one started later.

```
https://192.168.143.46:50470/dfshealth.html#tab-overview
```
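The active/standby state can also be checked from the command line; a sketch, where the HA service IDs nn1 and nn2 are assumptions (look up dfs.ha.namenodes.* in hdfs-site.xml for the real names):

```bash
# Run as hdfs; nn1/nn2 are assumed service IDs, not taken from this cluster.
sudo su -l hdfs -c "hdfs haadmin -getServiceState nn1"
sudo su -l hdfs -c "hdfs haadmin -getServiceState nn2"
```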

5. Stop the NameNodes (hdfs user)

Stop the standby NameNode first:

```
ssh -t -q 192.168.143.103 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ namenode"
ssh -t -q 192.168.143.46 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ namenode"
```

Check status:

```
ssh -t -q 192.168.143.103 sudo su -l hdfs -c "jps"
ssh -t -q 192.168.143.46 sudo su -l hdfs -c "jps"
```

6. Stop the DataNodes (hdfs user)

```
ssh -t -q 192.168.143.196 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ datanode"
ssh -t -q 192.168.143.231 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ datanode"
ssh -t -q 192.168.143.182 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ datanode"
ssh -t -q 192.168.143.235 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ datanode"
ssh -t -q 192.168.143.41 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ datanode"
ssh -t -q 192.168.143.127 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ datanode"
```

7. Stop the ZKFCs (hdfs user)

```
ssh -t -q 192.168.143.46 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ zkfc"
ssh -t -q 192.168.143.103 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ zkfc"
```

8. Stop the JournalNodes (hdfs user)

```
ssh -t -q 192.168.143.101 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ journalnode"
ssh -t -q 192.168.143.102 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ journalnode"
ssh -t -q 192.168.143.103 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ journalnode"
```

### Back up the NameNode data. This is a production cluster, so the existing data must be backed up to allow a rollback if the upgrade fails.

9. Back up namenode1

```
ssh -t -q 192.168.143.46 "cp -r /data1/dfs/name /data1/dfs/name.bak.20171011-2; ls -al /data1/dfs/; du -sm /data1/dfs/*"
ssh -t -q 192.168.143.46 "cp -r /data2/dfs/name /data2/dfs/name.bak.20171011-2; ls -al /data2/dfs/; du -sm /data2/dfs/*"
```

10. Back up namenode2

```
ssh -t -q 192.168.143.103 "cp -r /data1/dfs/name /data1/dfs/name.bak.20171011-2; ls -al /data1/dfs/; du -sm /data1/dfs/*"
```

11. Back up the JournalNode data

```
ssh -t -q 192.168.143.101 "cp -r /data1/journalnode /data1/journalnode.bak.20171011; ls -al /data1/; du -sm /data1/*"
ssh -t -q 192.168.143.102 "cp -r /data1/journalnode /data1/journalnode.bak.20171011; ls -al /data1/; du -sm /data1/*"
ssh -t -q 192.168.143.103 "cp -r /data1/journalnode /data1/journalnode.bak.20171011; ls -al /data1/; du -sm /data1/*"
```

The journal path can be read from hdfs-site.xml:

```
dfs.journalnode.edits.dir: /data1/journalnode
```
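A quick way to confirm it on a JournalNode, a sketch assuming the config lives under the release tree as set up in step 1:

```bash
# Print the property name and the value line that follows it.
grep -A1 "dfs.journalnode.edits.dir" \
  /usr/local/hadoop/hadoop-release/etc/hadoop/hdfs-site.xml
```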

### Upgrade steps

12. Copy the files (already done ahead of time; see step 1).

Switch the soft link to the 2.7.1 release ($h below stands for each node's IP; step 13 runs the command host by host):

```
ssh -t -q $h "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"
```

13. Switch the soft link on every node (run as root)

```
ssh -t -q 192.168.143.46 "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"
ssh -t -q 192.168.143.103 "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"
ssh -t -q 192.168.143.101 "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"
ssh -t -q 192.168.143.102 "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"
ssh -t -q 192.168.143.196 "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"
ssh -t -q 192.168.143.231 "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"
ssh -t -q 192.168.143.182 "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"
ssh -t -q 192.168.143.235 "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"
ssh -t -q 192.168.143.41 "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"
ssh -t -q 192.168.143.127 "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"
```

Confirm:

```
ssh -t -q 192.168.143.46 "cd /usr/local/hadoop; ls -al"
ssh -t -q 192.168.143.103 "cd /usr/local/hadoop; ls -al"
ssh -t -q 192.168.143.101 "cd /usr/local/hadoop; ls -al"
ssh -t -q 192.168.143.102 "cd /usr/local/hadoop; ls -al"
ssh -t -q 192.168.143.196 "cd /usr/local/hadoop; ls -al"
ssh -t -q 192.168.143.231 "cd /usr/local/hadoop; ls -al"
ssh -t -q 192.168.143.182 "cd /usr/local/hadoop; ls -al"
ssh -t -q 192.168.143.235 "cd /usr/local/hadoop; ls -al"
ssh -t -q 192.168.143.41 "cd /usr/local/hadoop; ls -al"
ssh -t -q 192.168.143.127 "cd /usr/local/hadoop; ls -al"
```

### Start HDFS (run as the hdfs user)

14. Start the JournalNodes

```
ssh -t -q 192.168.143.101 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ start\ journalnode"
ssh -t -q 192.168.143.102 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ start\ journalnode"
ssh -t -q 192.168.143.103 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ start\ journalnode"
```

Check status:

```
ssh -t -q 192.168.143.101 sudo su -l hdfs -c "jps"
ssh -t -q 192.168.143.102 sudo su -l hdfs -c "jps"
ssh -t -q 192.168.143.103 sudo su -l hdfs -c "jps"
```

15. Start the first NameNode (the previously active one) with the -upgrade flag

```
ssh 192.168.143.46
su - hdfs
/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh start namenode -upgrade
```
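The upgrade progress can be followed in the NameNode log; the path below is an assumption (check HADOOP_LOG_DIR and the actual file name on your install):

```bash
# Assumed default log location under the release tree; adjust as needed.
tail -f /usr/local/hadoop/hadoop-release/logs/hadoop-hdfs-namenode-*.log
```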

16. Confirm the status; only once everything looks healthy may the second NameNode be started.

```
https://192.168.143.46:50470/dfshealth.html#tab-overview
```
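A command-line cross-check from the NameNode (run as hdfs); during startup the NameNode normally sits in safe mode until enough DataNodes report in:

```bash
sudo su -l hdfs -c "hdfs dfsadmin -safemode get"
```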

17. Start the first ZKFC

```
# on 192.168.143.46
su - hdfs
/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh start zkfc
```

18. Start the second NameNode

```
ssh 192.168.143.103
su - hdfs
/usr/local/hadoop/hadoop-release/bin/hdfs namenode -bootstrapStandby
/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh start namenode
```

19. Start the second ZKFC

```
ssh 192.168.143.103
su - hdfs
/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh start zkfc
```

20. Start the DataNodes

```
ssh -t -q 192.168.143.196 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ start\ datanode"
ssh -t -q 192.168.143.231 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ start\ datanode"
ssh -t -q 192.168.143.182 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ start\ datanode"
ssh -t -q 192.168.143.235 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ start\ datanode"
ssh -t -q 192.168.143.41 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ start\ datanode"
ssh -t -q 192.168.143.127 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ start\ datanode"
```

Confirm:

```
ssh -t -q 192.168.143.196 sudo su -l hdfs -c "jps"
ssh -t -q 192.168.143.231 sudo su -l hdfs -c "jps"
ssh -t -q 192.168.143.182 sudo su -l hdfs -c "jps"
ssh -t -q 192.168.143.235 sudo su -l hdfs -c "jps"
ssh -t -q 192.168.143.41 sudo su -l hdfs -c "jps"
ssh -t -q 192.168.143.127 sudo su -l hdfs -c "jps"
```

21. Once everything is healthy, start HBase (run as the hbase user)

Start the HBase masters; it is best to start the previously active master first.

```
ssh -t -q 192.168.143.101 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ start\ master"
ssh -t -q 192.168.143.103 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ start\ master"
```

Start the HBase regionservers:

```
ssh -t -q 192.168.143.196 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ start\ regionserver"
ssh -t -q 192.168.143.231 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ start\ regionserver"
ssh -t -q 192.168.143.182 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ start\ regionserver"
ssh -t -q 192.168.143.235 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ start\ regionserver"
ssh -t -q 192.168.143.41 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ start\ regionserver"
ssh -t -q 192.168.143.127 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ start\ regionserver"
```

22. The HBase region balancer must be switched on and off manually

Log in to the HBase shell and run the following commands (a sketch for reaching the shell follows them).

Enable:

```
balance_switch true
```

Disable:

```
balance_switch false
```
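To reach the shell, for instance, using the install path from this cluster:

```bash
sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase shell"
```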

23. Do not finalize now. Let the cluster run for a week to confirm it is stable, then finalize the upgrade.

Note: until the upgrade is finalized, disk usage may grow quickly; finalizing releases part of that space.

Finalize the upgrade:

```
hdfs dfsadmin -finalizeUpgrade
```

http://blog.51cto.com/hsbxxl/1976472
