
Hadoop Cluster Setup, Part 3: Starting the Cluster

The cluster was set up successfully in the previous parts; now let's try starting it. On the very first startup, initialization (formatting) is required.

Starting ZooKeeper

1. The ZooKeeper control command is: ./zkServer.sh start|stop|status

[hadoop@hadoop001 ~]$ zkServer.sh start    (run on all three machines; the script is already on the PATH, so there is no need to go into ZooKeeper's bin directory)
JMX enabled by default
Using config: /home/hadoop/app/zookeeper-3.4.6/bin/../conf/zoo.cfg
[hadoop@hadoop001 ~]$ zkServer.sh status    (check its state)
Mode: leader    (one machine reports leader and the other two report follower, which means the configuration and startup succeeded)

Starting Hadoop (HDFS + YARN)

a. Before formatting, start the JournalNode process on each of the JournalNode machines.

[hadoop@hadoop001 hadoop]$ cd /home/hadoop/app/hadoop-2.6.0-cdh5.7.0
[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ sbin/hadoop-daemon.sh start journalnode    (start this on all three machines)
[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ jps
17857 JournalNode       (the JournalNode process)
17759 QuorumPeerMain    (the ZooKeeper process)

b. Format the NameNode. (Format only the NameNode on the first machine; the NameNode on the second machine must NOT be formatted as well.)

[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ hadoop namenode -format    (format the NameNode on hadoop001)
INFO common.Storage: Storage directory /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/data/dfs/name has been successfully formatted.

c. The NameNodes on hadoop001 and hadoop002 must hold identical metadata, but they cannot both be formatted, so sync the metadata from hadoop001 over to hadoop002. This mainly concerns dfs.namenode.name.dir and dfs.namenode.edits.dir; also make sure the shared storage directory (dfs.namenode.edits.dir) contains all of the NameNode's metadata.

[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ scp -r data hadoop002:/home/hadoop/app/hadoop-2.6.0-cdh5.7.0/    (this sends the whole /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/data directory from hadoop001 to hadoop002, replacing the jn directory under /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/data/dfs/ on hadoop002; since the jn configuration is identical on all three machines, that does no harm. In effect hadoop002 receives both name and jn from hadoop001's data/dfs/ directory; name is where the NameNode metadata lives.)

d. Initialize ZKFC
This writes the HA initialization metadata into ZooKeeper (you will see "session connected" in the output).

[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ hdfs zkfc -formatZK    (run this on the current machine only; watch for the following line on the console)
18/11/08 10:37:34 INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/ruozeclusterg5 in ZK.
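To double-check that the znode was actually created, ZooKeeper's own CLI can list it directly. This is an optional sanity check, not part of the original procedure; the address assumes ZooKeeper's default client port 2181 on hadoop001:

```shell
# List the HA root znode; the nameservice (ruozeclusterg5) should appear under it.
# Assumes ZooKeeper's default client port 2181 and that zkCli.sh is on the PATH.
zkCli.sh -server hadoop001:2181 ls /hadoop-ha
```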

e. Start the HDFS distributed storage system
Start it from the first machine; it will bring up the processes on the second and third machines by itself. Pay attention to the order in which things start.

[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ start-dfs.sh    (starts HDFS from the current machine; the sbin/ prefix is unnecessary because sbin was added to the personal environment variables earlier)
Starting namenodes on [hadoop001 hadoop002]
hadoop002: starting namenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-namenode-hadoop002.out
hadoop001: starting namenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-namenode-hadoop001.out
hadoop001: starting datanode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-hadoop001.out
hadoop002: starting datanode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-hadoop002.out
hadoop003: starting datanode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-hadoop003.out
Starting journal nodes [hadoop001 hadoop002 hadoop003]
hadoop001: journalnode running as process 17857. Stop it first.
hadoop003: journalnode running as process 3208. Stop it first.
hadoop002: journalnode running as process 3198. Stop it first.
Starting ZK Failover Controllers on NN hosts [hadoop001 hadoop002]
hadoop002: starting zkfc, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-zkfc-hadoop002.out
hadoop001: starting zkfc, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-zkfc-hadoop001.out

After starting HDFS, use jps to check that the processes on all three machines came up properly. (The "journalnode running as process ... Stop it first." lines above are expected: the JournalNodes were already started manually in step a, and start-dfs.sh simply leaves them running.)

[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ jps    (check on all three machines whether the processes started; if one did not, inspect the logs under /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs)
Note: /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs holds the NameNode, DataNode, JournalNode and ZKFC logs; of course, some machines do not run ZKFC or a NameNode, so those two logs will be missing there.
On startup, the DataNode on the third machine did not come up; its log said: namenode address dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not configured.
Fix: suspecting a problem with hdfs-site.xml, I found the third machine's copy was wrong (the rz upload had probably failed and gone unnoticed at the time). Re-copying the file from the first machine with scp solved the problem.
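For reference, this error goes away once hdfs-site.xml declares the HA nameservice and the NameNode RPC addresses. The fragment below is an illustrative sketch only: the nameservice name ruozeclusterg5 comes from the ZKFC log above and nn1 matches the id used with hdfs haadmin later, while nn2 and port 8020 are assumed common defaults, not values copied from this cluster's actual file:

```xml
<!-- Illustrative sketch of the HA-related hdfs-site.xml properties.
     nn2 and port 8020 are assumed defaults, not this cluster's real file. -->
<property>
  <name>dfs.nameservices</name>
  <value>ruozeclusterg5</value>
</property>
<property>
  <name>dfs.ha.namenodes.ruozeclusterg5</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ruozeclusterg5.nn1</name>
  <value>hadoop001:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ruozeclusterg5.nn2</name>
  <value>hadoop002:8020</value>
</property>
```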

Side note: starting individual processes

namenode (hadoop001, hadoop002):
hadoop-daemon.sh start namenode

datanode (hadoop001, hadoop002, hadoop003):
hadoop-daemon.sh start datanode

JournalNode (hadoop001, hadoop002, hadoop003):
hadoop-daemon.sh start journalnode

ZKFC (hadoop001, hadoop002):
hadoop-daemon.sh start zkfc

f. Start YARN
1. Start YARN on hadoop001. (Again, watch the startup order.)

[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ start-yarn.sh    (starts YARN from the current machine)
starting resourcemanager, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-resourcemanager-hadoop001.out
hadoop002: starting nodemanager, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-nodemanager-hadoop002.out
hadoop001: starting nodemanager, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-nodemanager-hadoop001.out
hadoop003: starting nodemanager, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-nodemanager-hadoop003.out
[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ jps    (check the processes on all three machines)
The NodeManagers on all three machines are up, and so is the ResourceManager on the first machine; the ResourceManager on the second machine is not, however, and has to be started manually on that machine.

2. Start the standby ResourceManager on hadoop002.

[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ yarn-daemon.sh start resourcemanager
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ jps    (check whether this machine's RM is now up)

Side note: starting individual YARN processes, and shutting everything down

ResourceManager (hadoop001, hadoop002):
yarn-daemon.sh start resourcemanager

NodeManager (hadoop001, hadoop002, hadoop003):
yarn-daemon.sh start nodemanager

Shutdown goes in the reverse direction (from YARN down to HDFS):
a. Stop Hadoop
Step 1: [hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ stop-yarn.sh
Step 2: [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ yarn-daemon.sh stop resourcemanager    (the standby RM on hadoop002 was started manually, so it must be stopped manually too)
Step 3: [hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ stop-dfs.sh
b. Stop ZooKeeper
[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ zkServer.sh stop    (run on all three machines)

Starting the cluster again (no formatting is needed after the first time):
[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ zkServer.sh start    (on all three machines)
[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ start-dfs.sh
[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ start-yarn.sh
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ yarn-daemon.sh start resourcemanager    (the standby RM on hadoop002)
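The routine restart sequence can be wrapped in a small helper script run from hadoop001. This is only a sketch of the steps above; it assumes passwordless SSH from hadoop001 to the other nodes and that the scripts are on each user's PATH, as configured earlier:

```shell
#!/bin/bash
# Sketch: bring the whole cluster up in the required order.
# Assumes passwordless SSH and that zkServer.sh / start-dfs.sh etc. are on the PATH.

# 1. ZooKeeper on all three machines
for host in hadoop001 hadoop002 hadoop003; do
  ssh "$host" "zkServer.sh start"
done

# 2. HDFS (NameNodes, DataNodes, JournalNodes, ZKFCs), started from hadoop001
start-dfs.sh

# 3. YARN (ResourceManager on this machine plus all NodeManagers)
start-yarn.sh

# 4. Standby ResourceManager on hadoop002, which start-yarn.sh does not cover
ssh hadoop002 "yarn-daemon.sh start resourcemanager"
```

Shutting down is the same commands in reverse (stop-yarn.sh, then the standby RM, then stop-dfs.sh, then zkServer.sh stop on each node).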

That concludes starting and stopping the Hadoop cluster.

Viewing the web UIs

hadoop001:
http://192.168.2.65:50070/
hadoop002:
http://192.168.2.199:50070/
resourcemanager(active):
http://192.168.2.65:8088
resourcemanager(standby):
http://192.168.2.199:8088/cluster/cluster
jobhistory:
http://192.168.2.65:19888/jobhistory

[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ hdfs haadmin -getServiceState nn1    (shows whether that NameNode is active or standby)
[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ hdfs dfsadmin -report    (reports on the cluster's state)
[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ $HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver    (starts the JobHistory server so the history of completed jobs can be browsed)