Linux Consolidation Notes (3): Hadoop 2.7.4 Environment Setup
Since I'll be using Hadoop and related tools for upcoming tasks, I've been spending more time in Linux.
Previously I only set up J2EE runtime environments on Linux: configure the JDK, deploy Tomcat, and push builds automatically with Docker or Jenkins,
plus basics like inspecting processes and copying, pasting, and deleting files. Much of this fades when unused, so I'm writing these demos to consolidate my old Linux knowledge.
Later posts will cover setting up and using Hadoop and other mainstream big-data environments.
---------------------------------------------------------------------------------------------------------------------------------------------------------
This post covers setting up Hadoop 2.7.4.
Three nodes are needed; all operations are performed as the root user:
192.168.0.80 master
192.168.0.81 slave1
192.168.0.82 slave2
1. Following Linux Consolidation Notes (1) (J2EE environment setup and network configuration), configure the network and JDK on all three VMs so they can reach each other (turn off the firewall on all of them).
2. Change the hostnames: the .80 VM to master, .81 to slave1, .82 to slave2:
vi /etc/sysconfig/network
Taking .80 as an example: delete localhost and add HOSTNAME=master
3. Edit /etc/hosts on all three VMs (identical content on each):
vi /etc/hosts
192.168.0.80 master
192.168.0.81 slave1
192.168.0.82 slave2
4. Edit the sshd configuration:
vi /etc/ssh/sshd_config
# uncomment these two lines
RSAAuthentication yes
PubkeyAuthentication yes
5. Reboot all three VMs: shutdown -r now
--------------------------------------------------------------
6. Configure SSH keys.
cd ~/.ssh  # .ssh is a directory; if it does not exist yet, run $ ssh xxxxxx once to create it

#master
ssh master
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
scp -r root@slave1:~/.ssh/id_rsa.pub slave1.pub
scp -r root@slave2:~/.ssh/id_rsa.pub slave2.pub
cat ~/.ssh/slave2.pub >> ~/.ssh/authorized_keys
cat ~/.ssh/slave1.pub >> ~/.ssh/authorized_keys

#slave1
ssh slave1
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
scp -r root@master:~/.ssh/id_rsa.pub master.pub
scp -r root@slave2:~/.ssh/id_rsa.pub slave2.pub
cat ~/.ssh/slave2.pub >> ~/.ssh/authorized_keys
cat ~/.ssh/master.pub >> ~/.ssh/authorized_keys

#slave2
ssh slave2
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
scp -r root@master:~/.ssh/id_rsa.pub master.pub
scp -r root@slave1:~/.ssh/id_rsa.pub slave1.pub
cat ~/.ssh/slave1.pub >> ~/.ssh/authorized_keys
cat ~/.ssh/master.pub >> ~/.ssh/authorized_keys

After this, passwordless login works, e.g. from master to slave1:
[root@master /]# ssh slave1
Last login: Wed Aug 30 21:34:51 2017 from slave2
[root@slave1 ~]#
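One caveat about the append steps above: re-running cat ... >> authorized_keys adds duplicate entries. A minimal idempotent sketch (the key string below is a made-up placeholder; in practice pass "$(cat ~/.ssh/slave1.pub)"):

```shell
# add_key: append a public key to authorized_keys only if that exact line
# is not already present, keeping the permissions sshd expects
add_key() {
  mkdir -p ~/.ssh && chmod 700 ~/.ssh
  touch ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys
  grep -qxF "$1" ~/.ssh/authorized_keys || echo "$1" >> ~/.ssh/authorized_keys
}

add_key "ssh-rsa AAAAexamplekey root@slave1"   # hypothetical key string
add_key "ssh-rsa AAAAexamplekey root@slave1"   # second call is a no-op
```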
Hadoop only needs to be configured on master; when done, copy the configuration over to the slaves.
7. Download the Hadoop 2.7.4 tarball into /home on master, extract it, and rename the resulting directory to hadoop-2.7.4: tar -xzvf xxxxxx -C /home
8. Set JAVA_HOME in the environment scripts:
vi /home/hadoop-2.7.4/etc/hadoop/hadoop-env.sh  # set JAVA_HOME
vi /home/hadoop-2.7.4/etc/hadoop/mapred-env.sh  # set JAVA_HOME
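The same two edits can also be scripted instead of made by hand; a sketch, assuming the JDK path below (an example location, adjust it to your install; the mkdir -p only keeps the snippet self-contained and is a no-op on a real installation):

```shell
# append a JAVA_HOME export to both env scripts; the appended line is read
# last when the script is sourced, so it takes effect
HADOOP_CONF=/home/hadoop-2.7.4/etc/hadoop
JDK=/usr/java/jdk1.8.0_144      # assumption: example JDK location
mkdir -p "$HADOOP_CONF"         # no-op when hadoop is already unpacked here
for f in hadoop-env.sh mapred-env.sh; do
  echo "export JAVA_HOME=$JDK" >> "$HADOOP_CONF/$f"
done
```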
9. Edit /home/hadoop-2.7.4/etc/hadoop/core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
    <description>NameNode hostname and port (changing the port is not recommended)</description>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
    <description>I/O buffer size</description>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/home/hadoop-2.7.4/tmp</value>
    <description>Directory for temporary files</description>
  </property>
  <property>
    <name>hadoop.security.authorization</name>
    <value>false</value>
  </property>
</configuration>
10. Edit /home/hadoop-2.7.4/etc/hadoop/hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hadoop-2.7.4/hdfs/name</value>
    <description>Local filesystem path where the NameNode persists the namespace and edit logs</description>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/hadoop-2.7.4/hdfs/data</value>
    <description>Comma-separated list of local directories where a DataNode stores its blocks</description>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
    <description>Number of replicas kept for each HDFS file (the default is 3)</description>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
11. Edit /home/hadoop-2.7.4/etc/hadoop/mapred-site.xml (this file does not exist by default; copy it from mapred-site.xml.template first):
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    <final>true</final>
  </property>
  <property>
    <name>mapreduce.jobtracker.http.address</name>
    <value>master:50030</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>master:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>master:19888</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>http://master:9001</value>
  </property>
</configuration>
12. Edit /home/hadoop-2.7.4/etc/hadoop/yarn-site.xml:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>master:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>master:8088</value>
  </property>
</configuration>
13. Create the corresponding directories, e.g. mkdir -p /home/hadoop-2.7.4/logs (you can create them all before copying the distribution to the slaves; extra directories do no harm).
On every node, create the data directory /home/hadoop-2.7.4/hdfs to hold cluster data.
On the master node, create /home/hadoop-2.7.4/hdfs/name to hold the filesystem metadata.
On each slave node, create /home/hadoop-2.7.4/hdfs/data to hold the actual block data.
The log directory on every node is /home/hadoop-2.7.4/logs.
The temporary directory on every node is /home/hadoop-2.7.4/tmp.
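The layout above can be created in one go; a sketch to run on each node (strictly, only the master needs hdfs/name and only the slaves need hdfs/data, but empty extra directories are harmless):

```shell
# create the HDFS, log, and tmp directories referenced by the configs above
H=/home/hadoop-2.7.4
mkdir -p "$H/hdfs/name" "$H/hdfs/data" "$H/logs" "$H/tmp"
```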
14. Copy the configured installation to the slave nodes:
scp -r /home/hadoop-2.7.4 root@slave1:/home/hadoop-2.7.4
scp -r /home/hadoop-2.7.4 root@slave2:/home/hadoop-2.7.4
15. On master, add the worker nodes to Hadoop's slaves file:
vi /home/hadoop-2.7.4/etc/hadoop/slaves
Add:
slave1
slave2
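Equivalently, the file can be written non-interactively with a heredoc (the mkdir -p only keeps the snippet self-contained; on a real install the directory already exists):

```shell
# write the list of worker hostnames, one per line
mkdir -p /home/hadoop-2.7.4/etc/hadoop
cat > /home/hadoop-2.7.4/etc/hadoop/slaves <<'EOF'
slave1
slave2
EOF
```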
16. Format the NameNode and start the cluster (run this on master only; nothing needs to be run on the slaves, and the DataNode directories are initialized automatically on first start, so only the NameNode is formatted):
/home/hadoop-2.7.4/bin/hadoop namenode -format
/home/hadoop-2.7.4/sbin/start-all.sh
17. Use the jps command to check that everything started:
[root@master ~]# ssh master
Last login: Sat Sep 2 00:47:50 2017 from slave1
[root@master ~]# jps
9187 Jps
3221 ResourceManager
3062 SecondaryNameNode
2856 NameNode
[root@master ~]# ssh slave1
Last login: Sat Sep 2 00:25:55 2017 from master
[root@slave1 ~]# jps
6044 Jps
2685 NodeManager
2590 DataNode
[root@slave1 ~]# ssh slave2
Last login: Wed Aug 30 21:34:38 2017 from master
[root@slave2 ~]# jps
2679 NodeManager
5994 Jps
2590 DataNode
[root@slave2 ~]#
If anything fails to start, read the logs carefully and fix the configuration.