[Hadoop] Installing a Fully Distributed Hadoop Cluster
By 阿新 • Published 2018-11-12
Preface
I plan to follow up with posts on HDFS operations (shell-command version) and on HBase and Hive, so here is the Hadoop cluster installation first.
Prerequisites
1. hadoop-2.6.5.tar.gz
2. Three servers (virtual machines are fine)
3. CentOS 7
Installation Steps
-
Server plan
Hostnames are used below instead of IP addresses.

| master (192.168.31.60) | slave1 (192.168.31.61) | slave2 (192.168.31.62) |
| --- | --- | --- |
| NameNode | ResourceManager | SecondaryNameNode |
| DataNode | DataNode | DataNode |
| NodeManager | NodeManager | NodeManager |
| HistoryServer | | |
-
Download the Hadoop tarball and the JDK
Hadoop official download:
https://archive.apache.org/dist/hadoop/common/
Java official download:
https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
-
Upload to the master server
Put the files wherever fits your own layout.

```shell
# cd /app/install
# ls
hadoop-2.6.5.tar.gz  jdk-8u171-linux-x64.tar.gz
```
-
Create the hadoop user

```shell
# useradd hadoop
# passwd hadoop
```
-
Configure hostnames

```shell
# vi /etc/hosts
192.168.31.60 master
192.168.31.61 slave1
192.168.31.62 slave2
```
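The same edit can be scripted so it is safe to re-run; a minimal sketch (the `add_hosts_entries` helper is my name, not from the original post — on the cluster, call it on /etc/hosts as root; here it is rehearsed on a temp file):

```shell
# Sketch: append each cluster entry to a hosts file only when it is missing,
# so running the function twice does not duplicate lines.
add_hosts_entries() {
  local file="$1" entry
  for entry in "192.168.31.60 master" "192.168.31.61 slave1" "192.168.31.62 slave2"; do
    grep -qxF "$entry" "$file" 2>/dev/null || echo "$entry" >> "$file"
  done
}

tmp=$(mktemp)
add_hosts_entries "$tmp"   # appends all three entries
add_hosts_entries "$tmp"   # appends nothing the second time
cat "$tmp"
```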
-
Set up passwordless SSH login

```shell
# cd ~/.ssh/
# ssh-keygen -t rsa
# ssh-copy-id -i 192.168.31.60
# scp -r /root/.ssh/ root@slave1:/root/
# scp -r /root/.ssh/ root@slave2:/root/
```
-
Install the JDK

```shell
$ cd /app/install
$ mkdir -p /usr/local/java
$ tar -zxvf jdk-8u171-linux-x64.tar.gz -C /usr/local/java
```

Configure the Java environment variables

```shell
$ vi /etc/profile
# set java environment
JAVA_HOME=/usr/local/java/jdk1.8.0_171
JRE_HOME=$JAVA_HOME/jre
PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
export JAVA_HOME JRE_HOME PATH CLASSPATH
```
-
Install Hadoop

```shell
$ cd /app/install
$ tar -zxvf hadoop-2.6.5.tar.gz -C /usr/local/
```

Configure the Hadoop environment variables

```shell
$ vi /etc/profile
# set hadoop environment
export HADOOP_HOME=/usr/local/hadoop-2.6.5
export PATH=$PATH:$HADOOP_HOME/bin
```
-
Apply the configuration
/etc/hosts takes effect immediately; only /etc/profile needs to be sourced.

```shell
$ source /etc/profile
```
-
Edit the Hadoop configuration files

```shell
$ cd /usr/local/hadoop-2.6.5/etc/hadoop
```
-
Add the JDK path to hadoop-env.sh, mapred-env.sh, and yarn-env.sh

```shell
export JAVA_HOME=/usr/local/java/jdk1.8.0_171
```
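Rather than opening each of the three files, the export can be appended in one pass; because a later assignment overrides the default, appending is enough. A sketch (the `set_java_home` helper is mine; on the cluster, pass the real config directory — here it is rehearsed against a temp directory):

```shell
# Sketch: append the JAVA_HOME export to the three env scripts in one loop.
set_java_home() {
  local conf_dir="$1" f
  for f in hadoop-env.sh mapred-env.sh yarn-env.sh; do
    echo 'export JAVA_HOME=/usr/local/java/jdk1.8.0_171' >> "$conf_dir/$f"
  done
}

# On the cluster: set_java_home /usr/local/hadoop-2.6.5/etc/hadoop
# Rehearsal in an empty temp directory:
d=$(mktemp -d)
set_java_home "$d"
ls "$d"
```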
-
Configure core-site.xml

```shell
$ vi core-site.xml
```

```xml
<configuration>
  <!-- NameNode address and port -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:8020</value>
  </property>
  <!-- Hadoop temp directory; by default the NameNode and DataNode data files
       both live under this directory -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop-2.6.5/data/tmp</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file://${hadoop.tmp.dir}/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file://${hadoop.tmp.dir}/dfs/data</value>
  </property>
</configuration>
```
-
Configure hdfs-site.xml

```shell
$ vi hdfs-site.xml
```

```xml
<configuration>
  <!-- SecondaryNameNode address and port -->
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>slave2:50090</value>
  </property>
</configuration>
```
-
Configure slaves

```
master
slave1
slave2
```
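Since every node in the plan runs a DataNode, all three hosts go into the file, one hostname per line. It can also be generated instead of typed (a sketch; run inside /usr/local/hadoop-2.6.5/etc/hadoop — here it writes to the current directory):

```shell
# Sketch: generate the slaves file, one hostname per line.
printf '%s\n' master slave1 slave2 > slaves
cat slaves
```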
-
Configure yarn-site.xml

```shell
$ vi yarn-site.xml
```

```xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- ResourceManager address -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>slave1</value>
  </property>
  <!-- Enable log aggregation -->
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <!-- Log retention time in seconds -->
  <property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>106800</value>
  </property>
</configuration>
```
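Note the retention value: 106800 seconds is an odd figure, just under 30 hours; a full week would be 604800. A quick check of what the configured value amounts to:

```shell
# 106800 seconds expressed in hours and minutes
echo "$((106800 / 3600))h $((106800 % 3600 / 60))m"
```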
-
Configure mapred-site.xml

```shell
$ cp mapred-site.xml.template mapred-site.xml
$ vi mapred-site.xml
```

```xml
<configuration>
  <!-- Run MapReduce jobs on YARN -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <!-- Node where the MapReduce JobHistory server runs -->
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>master:10020</value>
  </property>
  <!-- JobHistory web UI address -->
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>master:19888</value>
  </property>
</configuration>
```
-
Delete the docs (optional; it shortens the copy to the slaves)

```shell
$ cd /usr/local/hadoop-2.6.5/share
$ rm -rf doc
```
-
Configure the other two servers (slave1, slave2)

```shell
# Copy Hadoop to slave1 and slave2
$ scp -r hadoop-2.6.5/ root@slave1:/usr/local/
$ scp -r hadoop-2.6.5/ root@slave2:/usr/local/
# Copy the JDK to slave1 and slave2
$ scp -r java/ root@slave1:/usr/local/
$ scp -r java/ root@slave2:/usr/local/
# Copy the environment variables to slave1 and slave2
$ scp /etc/profile root@slave1:/etc/
$ scp /etc/profile root@slave2:/etc/
# Copy the hosts file to slave1 and slave2
$ scp /etc/hosts root@slave1:/etc/
$ scp /etc/hosts root@slave2:/etc/
# Remember to source /etc/profile on each slave
```
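The eight scp commands all follow one pattern, so they can be generated. A dry-run sketch (the `plan_scp` helper is my name; it only prints the commands — drop the leading `echo`, or pipe the output to `sh`, to execute them for real):

```shell
# Sketch: print the scp commands for distributing Hadoop, the JDK,
# /etc/profile, and /etc/hosts to both slaves (dry run, nothing is copied).
plan_scp() {
  local host path
  for host in slave1 slave2; do
    for path in /usr/local/hadoop-2.6.5 /usr/local/java /etc/profile /etc/hosts; do
      echo scp -r "$path" "root@$host:$(dirname "$path")/"
    done
  done
}
plan_scp
```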
-
Format the NameNode

```shell
$ cd /usr/local/hadoop-2.6.5/bin
$ ./hdfs namenode -format
```
-
Start the cluster

```shell
$ cd /usr/local/hadoop-2.6.5/sbin
$ ./start-dfs.sh
```
-
Start YARN

```shell
$ ./start-yarn.sh
```
-
Start the ResourceManager on slave1

```shell
$ ssh slave1
$ cd /usr/local/hadoop-2.6.5/sbin
$ ./yarn-daemon.sh start resourcemanager
```
-
Start the HistoryServer on master

```shell
$ cd /usr/local/hadoop-2.6.5/sbin
$ ./mr-jobhistory-daemon.sh start historyserver
```
-
Web UI access
- HDFS NameNode: http://master:50070
- YARN ResourceManager: http://slave1:8088
- JobHistory: http://master:19888
(Screenshots of the result omitted.)
Summary
- Building the cluster is not hard; what matters is doing it hands-on yourself.
- When later posts use Hive, I'll add Hive; when they use HBase, I'll add HBase.
- Once we get to ZooKeeper, this setup will gradually be upgraded to high availability.
- If you repost, please credit the author. Thanks!