Hadoop Cluster (Part 4): Upgrading Hadoop
The cluster installed in the earlier parts runs Hadoop 2.6; this post upgrades it to 2.7.
Note that HBase runs on this cluster, so HBase must be stopped before the upgrade and restarted afterwards.
For the original installation steps, see:
Hadoop Cluster (Part 1): Zookeeper Setup
Hadoop Cluster (Part 2): HDFS Setup
Hadoop Cluster (Part 3): HBase Setup
The upgrade steps are as follows.
Cluster IP list
Namenode:
192.168.143.46
192.168.143.103
Journalnode:
192.168.143.101
192.168.143.102
192.168.143.103
Datanode&Hbase regionserver:
192.168.143.196
192.168.143.231
192.168.143.182
192.168.143.235
192.168.143.41
192.168.143.127
Hbase master:
192.168.143.103
192.168.143.101
Zookeeper:
192.168.143.101
192.168.143.102
192.168.143.103
1. First, determine the path Hadoop runs from, distribute the new release to that path on every node, and extract it.
# ll /usr/local/hadoop/
total 493244
drwxrwxr-x 9 root root      4096 Mar 21  2017 hadoop-2.6.0-EDH-0u1-SNAPSHOT-HA-SECURITY
lrwxrwxrwx 1 root root        41 Mar 21  2017 hadoop-release -> hadoop-2.6.0-EDH-0u1-SNAPSHOT-HA-SECURITY
drwxr-xr-x 9 root root      4096 Oct 11 11:06 hadoop-2.7.1
-rw-r--r-- 1 root root 194690531 Oct  9 10:55 hadoop-2.7.1.tar.gz
drwxrwxr-x 7 root root      4096 May 21  2016 hbase-1.1.3
-rw-r--r-- 1 root root 128975247 Apr 10  2017 hbase-1.1.3.tar.gz
lrwxrwxrwx 1 root root        29 Apr 10  2017 hbase-release -> /usr/local/hadoop/hbase-1.1.3
Since this is an upgrade, the configuration files stay exactly the same: copy the etc/hadoop directory from the old hadoop-2.6.0 tree into hadoop-2.7.1 wholesale, replacing the shipped defaults.
That completes the pre-upgrade preparation.
The upgrade procedure itself follows. Every command is issued from a single jump host via shell scripts, which avoids repeatedly ssh-ing into each node.
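The per-host commands in the remaining steps are written out in full so each one can be checked, but from the jump host they are easy to drive with a small wrapper. A minimal sketch, assuming one plain-text host list per role (runall.sh and the host-list files are illustrative, not part of the original setup):

#!/bin/bash
# runall.sh -- run the same command on every host in a list, from the jump host.
# Usage: ./runall.sh datanodes.txt 'sudo su -l hdfs -c "jps"'
hostfile=$1; shift
for h in $(cat "$hostfile"); do
    echo "=== $h ==="
    ssh -t -q "$h" "$@"   # -t allocates a tty for sudo; -q suppresses banners
done

For example, replicating the unchanged configuration into the new release tree on every node (step 1): ./runall.sh allhosts.txt 'sudo cp -rf /usr/local/hadoop/hadoop-release/etc/hadoop/* /usr/local/hadoop/hadoop-2.7.1/etc/hadoop/'.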
## Stopping HBase (run as the hbase user)
2. Stop the HBase masters (as the hbase user).
Check the status page to confirm which master is active, and stop the standby master first:
http://192.168.143.101:16010/master-status
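The check can also be scripted against each master's JMX servlet. In HBase 1.x the Master bean exposes a tag.isActiveMaster attribute; treat the exact bean and attribute names as assumptions to verify against your build:

for h in 192.168.143.101 192.168.143.103; do
    echo -n "$h: "
    curl -s "http://$h:16010/jmx?qry=Hadoop:service=HBase,name=Master,sub=Server" | grep -o '"tag.isActiveMaster" *: *"[a-z]*"'
done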
ssh -t -q 192.168.143.103 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ stop\ master"
ssh -t -q 192.168.143.103 sudo su -l hbase -c "jps"
ssh -t -q 192.168.143.101 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ stop\ master"
ssh -t -q 192.168.143.101 sudo su -l hbase -c "jps"
3. Stop the HBase regionservers (as the hbase user):
ssh -t -q 192.168.143.196 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ stop\ regionserver"
ssh -t -q 192.168.143.231 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ stop\ regionserver"
ssh -t -q 192.168.143.182 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ stop\ regionserver"
ssh -t -q 192.168.143.235 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ stop\ regionserver"
ssh -t -q 192.168.143.41 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ stop\ regionserver"
ssh -t -q 192.168.143.127 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ stop\ regionserver"
Check that they have stopped:
ssh -t -q 192.168.143.196 sudo su -l hbase -c "jps"
ssh -t -q 192.168.143.231 sudo su -l hbase -c "jps"
ssh -t -q 192.168.143.182 sudo su -l hbase -c "jps"
ssh -t -q 192.168.143.235 sudo su -l hbase -c "jps"
ssh -t -q 192.168.143.41 sudo su -l hbase -c "jps"
ssh -t -q 192.168.143.127 sudo su -l hbase -c "jps"
## Stopping HDFS
4. First confirm which namenode is active, via the web UI; this namenode must be the first one started later:
https://192.168.143.46:50470/dfshealth.html#tab-overview
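The same confirmation can be scripted with the stock HA admin tool. The service IDs nn1/nn2 below are assumptions; use the values configured under dfs.ha.namenodes.&lt;nameservice&gt; in hdfs-site.xml:

ssh -t -q 192.168.143.46 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/bin/hdfs\ haadmin\ -getServiceState\ nn1"
ssh -t -q 192.168.143.46 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/bin/hdfs\ haadmin\ -getServiceState\ nn2"

Each command prints active or standby for the corresponding namenode.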
5. Stop the NameNodes (as the hdfs user).
Stop the standby namenode first:
ssh -t -q 192.168.143.103 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ namenode"
ssh -t -q 192.168.143.46 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ namenode"
Check the status:
ssh -t -q 192.168.143.103 sudo su -l hdfs -c "jps"
ssh -t -q 192.168.143.46 sudo su -l hdfs -c "jps"
6. Stop the DataNodes (as the hdfs user):
ssh -t -q 192.168.143.196 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ datanode"
ssh -t -q 192.168.143.231 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ datanode"
ssh -t -q 192.168.143.182 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ datanode"
ssh -t -q 192.168.143.235 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ datanode"
ssh -t -q 192.168.143.41 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ datanode"
ssh -t -q 192.168.143.127 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ datanode"
7. Stop the ZKFCs (as the hdfs user):
ssh -t -q 192.168.143.46 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ zkfc"
ssh -t -q 192.168.143.103 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ zkfc"
8. Stop the JournalNodes (as the hdfs user):
ssh -t -q 192.168.143.101 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ journalnode"
ssh -t -q 192.168.143.102 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ journalnode"
ssh -t -q 192.168.143.103 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ journalnode"
### Back up the NameNode data. This is a production environment, so the existing metadata must be backed up in case the upgrade fails and has to be rolled back.
9. Back up namenode1:
ssh -t -q 192.168.143.46 "cp -r /data1/dfs/name /data1/dfs/name.bak.20171011-2;ls -al /data1/dfs/;du -sm /data1/dfs/*"
ssh -t -q 192.168.143.46 "cp -r /data2/dfs/name /data2/dfs/name.bak.20171011-2;ls -al /data1/dfs/;du -sm /data1/dfs/*"
10. Back up namenode2:
ssh -t -q 192.168.143.103 "cp -r /data1/dfs/name
/data1/dfs/name.bak.20171011-2;ls -al /data1/dfs/;du -sm /data1/dfs/*"
11. Back up the journalnode data:
ssh -t -q 192.168.143.101 "cp -r /data1/journalnode /data1/journalnode.bak.20171011;ls -al /data1/dfs/;du -sm /data1/*"
ssh -t -q 192.168.143.102 "cp -r /data1/journalnode /data1/journalnode.bak.20171011;ls -al /data1/dfs/;du -sm /data1/*"
ssh -t -q 192.168.143.103 "cp -r /data1/journalnode /data1/journalnode.bak.20171011;ls -al /data1/dfs/;du -sm /data1/*"
The journal path can be read from hdfs-site.xml:
dfs.journalnode.edits.dir:
/data1/journalnode
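It can also be read from the live configuration with the stock getconf tool, which avoids opening the file by hand:

ssh -t -q 192.168.143.101 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/bin/hdfs\ getconf\ -confKey\ dfs.journalnode.edits.dir"

This prints /data1/journalnode on this cluster.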
### Performing the upgrade
12. Copy the files (already done ahead of time; see step 1), then switch the hadoop-release symlink to the 2.7.1 version. The per-host template, with $h standing for each host:
ssh -t -q $h "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"
13. Switch the symlink on every node (as root):
ssh -t -q 192.168.143.46 "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"
ssh -t -q 192.168.143.103 "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"
ssh -t -q 192.168.143.101 "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"
ssh -t -q 192.168.143.102 "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"
ssh -t -q 192.168.143.196 "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"
ssh -t -q 192.168.143.231 "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"
ssh -t -q 192.168.143.182 "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"
ssh -t -q 192.168.143.235 "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"
ssh -t -q 192.168.143.41 "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"
ssh -t -q 192.168.143.127 "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"
Confirm the result:
ssh -t -q 192.168.143.46 "cd /usr/local/hadoop; ls -al"
ssh -t -q 192.168.143.103 "cd /usr/local/hadoop; ls -al"
ssh -t -q 192.168.143.101 "cd /usr/local/hadoop; ls -al"
ssh -t -q 192.168.143.102 "cd /usr/local/hadoop; ls -al"
ssh -t -q 192.168.143.196 "cd /usr/local/hadoop; ls -al"
ssh -t -q 192.168.143.231 "cd /usr/local/hadoop; ls -al"
ssh -t -q 192.168.143.182 "cd /usr/local/hadoop; ls -al"
ssh -t -q 192.168.143.235 "cd /usr/local/hadoop; ls -al"
ssh -t -q 192.168.143.41 "cd /usr/local/hadoop; ls -al"
ssh -t -q 192.168.143.127 "cd /usr/local/hadoop; ls -al"
### Starting HDFS (run as the hdfs user)
14. Start the JournalNodes:
ssh -t -q 192.168.143.101 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ start\ journalnode"
ssh -t -q 192.168.143.102 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ start\ journalnode"
ssh -t -q 192.168.143.103 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ start\ journalnode"
Check the status:
ssh -t -q 192.168.143.101 sudo su -l hdfs -c "jps"
ssh -t -q 192.168.143.102 sudo su -l hdfs -c "jps"
ssh -t -q 192.168.143.103 sudo su -l hdfs -c "jps"
15. Start the first NameNode with the -upgrade option; this must be the namenode confirmed as active in step 4:
ssh 192.168.143.46
su - hdfs
/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh start namenode -upgrade
16. Confirm its status; only once everything is completely OK may the other namenode be started:
https://192.168.143.46:50470/dfshealth.html#tab-overview
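Besides the web UI, the namenode's HA state can be cross-checked from the command line (nn1 is again the assumed service ID of this namenode):

ssh -t -q 192.168.143.46 sudo su -l hdfs -c "jps"
ssh -t -q 192.168.143.46 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/bin/hdfs\ haadmin\ -getServiceState\ nn1"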
17. Start the first ZKFC:
ssh 192.168.143.46
su - hdfs
/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh start zkfc
18. Start the second NameNode:
ssh 192.168.143.103
su - hdfs
/usr/local/hadoop/hadoop-release/bin/hdfs namenode -bootstrapStandby
/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh start namenode
19. Start the second ZKFC:
ssh 192.168.143.103
su - hdfs
/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh start zkfc
20. Start the DataNodes:
ssh -t -q 192.168.143.196 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ start\ datanode"
ssh -t -q 192.168.143.231 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ start\ datanode"
ssh -t -q 192.168.143.182 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ start\ datanode"
ssh -t -q 192.168.143.235 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ start\ datanode"
ssh -t -q 192.168.143.41 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ start\ datanode"
ssh -t -q 192.168.143.127 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ start\ datanode"
Confirm the status:
ssh -t -q 192.168.143.196 sudo su -l hdfs -c "jps"
ssh -t -q 192.168.143.231 sudo su -l hdfs -c "jps"
ssh -t -q 192.168.143.182 sudo su -l hdfs -c "jps"
ssh -t -q 192.168.143.235 sudo su -l hdfs -c "jps"
ssh -t -q 192.168.143.41 sudo su -l hdfs -c "jps"
ssh -t -q 192.168.143.127 sudo su -l hdfs -c "jps"
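Once all datanodes are up, the namenode's view of the cluster can be confirmed in one shot with the stock report command (run as hdfs); it should show 6 live datanodes and none dead:

ssh -t -q 192.168.143.46 sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/bin/hdfs\ dfsadmin\ -report"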
21. Once everything is healthy, start HBase (as the hbase user).
Start the HBase masters first, preferably beginning with the previously active master:
ssh -t -q 192.168.143.101 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ start\ master"
ssh -t -q 192.168.143.103 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ start\ master"
Start the HBase regionservers:
ssh -t -q 192.168.143.196 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ start\ regionserver"
ssh -t -q 192.168.143.231 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ start\ regionserver"
ssh -t -q 192.168.143.182 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ start\ regionserver"
ssh -t -q 192.168.143.235 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ start\ regionserver"
ssh -t -q 192.168.143.41 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ start\ regionserver"
ssh -t -q 192.168.143.127 sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ start\ regionserver"
22. The HBase region balancer has to be enabled and disabled manually.
Log in to the HBase shell and run:
Enable:
balance_switch true
Disable:
balance_switch false
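The HBase shell also reads commands from stdin, so the toggle can be scripted rather than typed interactively. A sketch, run on one of the master nodes (not from the original write-up):

echo "balance_switch true" | sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase shell"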
23. Do not finalize this time. Let the system run for a week, and only once it is confirmed stable execute the finalize step.
Note: during this window disk usage may grow quickly; part of the space is released after the finalize is executed.
Finalize the upgrade: hdfs dfsadmin -finalizeUpgrade
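The extra space is held in the previous/ directories that HDFS keeps for rollback until the upgrade is finalized. A quick way to see them on a datanode; the /data*/dfs/data layout is an assumption, substitute your dfs.datanode.data.dir:

ssh -t -q 192.168.143.196 "ls -d /data*/dfs/data/current/BP-*/previous 2>/dev/null; du -sm /data*/dfs/data"

After hdfs dfsadmin -finalizeUpgrade runs, the datanodes remove these directories in the background and the space is reclaimed.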