
3-3 Hadoop Cluster: Fully Distributed Configuration and Deployment



Unless a step says otherwise, perform every step below on all three servers. For convenience, everything is done as the root user.

1. Preparation

1.1 Three CentOS 6 servers

Manually assign the following settings on the three servers:

hostname   IP              mask            gateway        DNS             Note
master     172.17.138.82   255.255.255.0   172.17.138.1   202.203.85.88   Server 1
slave1     172.17.138.83   255.255.255.0   172.17.138.1   202.203.85.88   Server 2
slave2     172.17.138.84   255.255.255.0   172.17.138.1   202.203.85.88   Server 3
PC         172.17.138.61   255.255.255.0   172.17.138.1   202.203.85.88   Windows PC

1.2 Software packages

hadoop-2.7.6.tar.gz

jdk-8u171-linux-x64.tar.gz

Upload both packages to the /soft directory on all three servers.

(Download: https://pan.baidu.com/s/1a_Pjl8uJ2d_-r1hbN05fWA)

1.3 Disable the firewall

Disable, stop, and verify iptables:

[root@ ~]# chkconfig iptables off
[root@ ~]# service iptables stop
[root@ ~]# service iptables status

1.4 Disable SELinux

Disable it for the current session:

[root@ ~]# setenforce 0

To disable it permanently, change SELINUX=enforcing to SELINUX=disabled:

[root@ ~]# vi /etc/selinux/config

#SELINUX=enforcing
SELINUX=disabled

1.5 Start sshd and connect from Windows with Xshell

Start sshd so you can connect to all three virtual machines from Windows with Xshell, which makes configuration much easier (copy and paste):

[root@ ~]# service sshd start

(Xshell download: https://pan.baidu.com/s/1K052DJT9Pq0xy8XAVa764Q)

1.6 Install the JDK

Extract the JDK:

[root@ ~]# mkdir -p /soft/java
[root@ soft]# tar -zxvf jdk-8u171-linux-x64.tar.gz -C /soft/java/

Configure the environment variables. Note that $PATH, $JAVA_HOME, and $CLASSPATH must be escaped with a backslash so the literal variable references, not their current values, are written into /etc/profile:

[root@ soft]# echo -e "\nexport JAVA_HOME=/soft/java/jdk1.8.0_171" >> /etc/profile
[root@ soft]# echo -e "\nexport PATH=\$PATH:\$JAVA_HOME/bin" >> /etc/profile
[root@ soft]# echo -e "\nexport CLASSPATH=.:\$JAVA_HOME/lib/dt.jar:\$JAVA_HOME/lib/tools.jar" >> /etc/profile
[root@ soft]# source /etc/profile
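Before relying on these variables, it can help to confirm that the extracted JDK actually contains the paths the exports point at. A minimal check, assuming the layout above (adjust the version string if your JDK differs):

```shell
# Check that each path referenced by the exports exists; prints OK or MISSING per path.
JAVA_HOME=/soft/java/jdk1.8.0_171
for p in "$JAVA_HOME/bin/java" "$JAVA_HOME/lib/dt.jar" "$JAVA_HOME/lib/tools.jar"; do
  if [ -e "$p" ]; then echo "OK  $p"; else echo "MISSING  $p"; fi
done
```

If any line prints MISSING, re-check the tar extraction before continuing.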

1.7 Configure hostnames

1) On master (172.17.138.82):

[root@ soft]# hostname master
[root@master ~]# vi /etc/hostname

master

2) On slave1 (172.17.138.83):

[root@ soft]# hostname slave1
[root@slave1 ~]# vi /etc/hostname

slave1

3) On slave2 (172.17.138.84):

[root@ soft]# hostname slave2
[root@slave2 ~]# vi /etc/hostname

slave2

(Note: on CentOS 6 the hostname that survives a reboot is normally set via HOSTNAME= in /etc/sysconfig/network; /etc/hostname is only read automatically from CentOS 7 on. Setting both does no harm.)

1.8 Configure hosts

Run on all three servers:

[root@master ~]# echo '172.17.138.82 master' >> /etc/hosts
[root@master ~]# echo '172.17.138.83 slave1' >> /etc/hosts
[root@master ~]# echo '172.17.138.84 slave2' >> /etc/hosts
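Running those echo lines twice leaves duplicate entries in /etc/hosts. A guarded variant that only appends an entry when the hostname is not yet present, and is therefore safe to re-run (shown against a scratch file; on the servers HOSTS_FILE would be /etc/hosts):

```shell
# Append each entry only if its hostname is not already in the file.
HOSTS_FILE=$(mktemp)
for entry in '172.17.138.82 master' '172.17.138.83 slave1' '172.17.138.84 slave2'; do
  name=${entry##* }                       # the hostname is the last field
  grep -qw "$name" "$HOSTS_FILE" || echo "$entry" >> "$HOSTS_FILE"
done
cat "$HOSTS_FILE"
```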

1.9 Passwordless SSH login

On master:

[root@master home]# ssh-keygen -t rsa

Generating public/private rsa key pair.

Enter file in which to save the key (/root/.ssh/id_rsa):

Created directory '/root/.ssh'.

Enter passphrase (empty for no passphrase):

Enter same passphrase again:

Your identification has been saved in /root/.ssh/id_rsa.

Your public key has been saved in /root/.ssh/id_rsa.pub.

The key fingerprint is:

1d:33:50:ac:03:2f:d8:10:8f:3d:48:95:d3:f8:7a:05 root@master

The key's randomart image is:

+--[ RSA 2048]----+

| oo.+.o. |

| ..== E.. |

| o++= o+ |

| . o.=..+ |

| oSo. |

| . . |

| . |

| |

| |

+-----------------+

[root@master home]#

Press Enter at every prompt; the output shows the path of .ssh/id_rsa.pub.

[root@master ~]# cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys


Check whether /root on slave1 and slave2 contains a .ssh directory, and create it if it does not. Use ll -a, since .ssh is a hidden directory.

On slave1 and slave2:

[root@ ~]# ll -a /root/

total 36

dr-xr-x---. 2 root root 4096 Nov 16 17:31 .

dr-xr-xr-x. 18 root root 4096 Nov 17 16:49 ..

-rw-------. 1 root root 953 Nov 16 17:27 anaconda-ks.cfg

-rw-------. 1 root root 369 Nov 17 18:12 .bash_history

-rw-r--r--. 1 root root 18 Dec 29 2013 .bash_logout

-rw-r--r--. 1 root root 176 Dec 29 2013 .bash_profile

-rw-r--r--. 1 root root 176 Dec 29 2013 .bashrc

-rw-r--r--. 1 root root 100 Dec 29 2013 .cshrc

-rw-r--r--. 1 root root 129 Dec 29 2013 .tcshrc

[root@ ~]# mkdir /root/.ssh

Copy /root/.ssh/authorized_keys from master to /root/.ssh on slave1 and slave2.

On master:

[root@master ~]# scp /root/.ssh/authorized_keys [email protected]:/root/.ssh/

[root@master ~]# scp /root/.ssh/authorized_keys [email protected]:/root/.ssh/

Run on master, slave1, and slave2:

[root@master ~]# chmod 700 /root/.ssh
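With sshd's default StrictModes setting, key-based login also fails if authorized_keys itself is group- or world-readable, so a chmod 600 on the file is worth doing alongside the chmod 700 on the directory (a plausible cause of the "failed login attempts" visible in the transcripts below). The expected permissions, demonstrated on a scratch directory:

```shell
# .ssh must be 700 and authorized_keys 600 for sshd StrictModes to accept the key.
D=$(mktemp -d)
mkdir -p "$D/.ssh"
touch "$D/.ssh/authorized_keys"
chmod 700 "$D/.ssh"
chmod 600 "$D/.ssh/authorized_keys"
stat -c '%a %n' "$D/.ssh" "$D/.ssh/authorized_keys"
```

On the servers the real path is /root/.ssh, i.e. run chmod 600 /root/.ssh/authorized_keys on all three machines.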


Verify

On master, run ssh master, ssh slave1, and ssh slave2; none of them should ask for a password:

[root@master .ssh]# ssh slave1

Last failed login: Fri Nov 18 16:52:28 CST 2016 from master on ssh:notty

There were 2 failed login attempts since the last successful login.

Last login: Fri Nov 18 16:22:23 2016 from 192.168.174.1

[root@slave1 ~]# logout

Connection to slave1 closed.

[root@master .ssh]# ssh slave2

The authenticity of host 'slave2 (172.17.138.84)' can't be established.

ECDSA key fingerprint is 95:76:9a:bc:ef:5e:f2:b3:cf:35:67:7a:3e:da:0e:e2.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'slave2' (ECDSA) to the list of known hosts.

Last failed login: Fri Nov 18 16:57:12 CST 2016 from master on ssh:notty

There was 1 failed login attempt since the last successful login.

Last login: Fri Nov 18 16:22:40 2016 from 192.168.174.1

[root@slave2 ~]# logout

Connection to slave2 closed.

[root@master .ssh]# ssh master

Last failed login: Fri Nov 18 16:51:45 CST 2016 from master on ssh:notty

There was 1 failed login attempt since the last successful login.

Last login: Fri Nov 18 15:33:56 2016 from 192.168.174.1

[root@master ~]#



2. Configure the Hadoop cluster

Unless stated otherwise, run the steps below on all three servers.

2.1 Extract Hadoop

[root@master soft]# mkdir -p /soft/hadoop/

[root@master soft]# tar -zxvf hadoop-2.7.6.tar.gz -C /soft/hadoop/

2.2 Configure the environment

[root@master ~]# vim /root/.bashrc

#HADOOP START

#export HADOOP_HOME=/soft/hadoop

export HADOOP_HOME=/soft/hadoop/hadoop-2.7.6

#HADOOP END

export PATH=/usr/local/sbin:/usr/local/bin/:/usr/bin:/usr/sbin:/sbin:/bin:/soft/hadoop/hadoop-2.7.6/bin:/soft/hadoop/hadoop-2.7.6/sbin

[root@master ~]# source ~/.bashrc

[root@master hadoop-2.7.6]# source /etc/profile

[root@master hadoop-2.7.6]# hadoop version

Hadoop 2.7.6

Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r baa91f7c6bc9cb92be5982de4719c1c8af91ccff

Compiled by root on 2016-08-18T01:41Z

Compiled with protoc 2.5.0

From source with checksum 2e4ce5f957ea4db193bce3734ff29ff4

This command was run using /soft/hadoop/hadoop-2.7.6/share/hadoop/common/hadoop-common-2.7.6.jar

[root@master hadoop-2.7.6]#

Edit the Hadoop configuration files

Add a JAVA_HOME setting to hadoop-env.sh and yarn-env.sh:

[root@slave2 soft]# echo -e "export JAVA_HOME=/soft/java/jdk1.8.0_171" >> /soft/hadoop/hadoop-2.7.6/etc/hadoop/hadoop-env.sh

[root@slave2 soft]# echo -e "export JAVA_HOME=/soft/java/jdk1.8.0_171" >> /soft/hadoop/hadoop-2.7.6/etc/hadoop/yarn-env.sh
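These echo appends add a duplicate export line every time they are re-run. A guarded append that is safe to repeat (shown against a scratch file; on the servers the target is /soft/hadoop/hadoop-2.7.6/etc/hadoop/hadoop-env.sh):

```shell
# Append the export only if the exact line is not already present; re-running is a no-op.
ENV_FILE=$(mktemp)
LINE='export JAVA_HOME=/soft/java/jdk1.8.0_171'
grep -qxF "$LINE" "$ENV_FILE" || echo "$LINE" >> "$ENV_FILE"
grep -qxF "$LINE" "$ENV_FILE" || echo "$LINE" >> "$ENV_FILE"   # second run changes nothing
grep -cxF "$LINE" "$ENV_FILE"                                  # prints 1
```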

Create the directories /hadoop, /hadoop/tmp, /hadoop/hdfs/data, and /hadoop/hdfs/name:

[root@master hadoop]# mkdir -p /hadoop/tmp

[root@master hadoop]# mkdir -p /hadoop/hdfs/data

[root@master hadoop]# mkdir -p /hadoop/hdfs/name

Edit core-site.xml:

[root@ ~]# vi /soft/hadoop/hadoop-2.7.6/etc/hadoop/core-site.xml

<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>4096</value>
    </property>
</configuration>

Edit hdfs-site.xml:

[root@ ~]# vi /soft/hadoop/hadoop-2.7.6/etc/hadoop/hdfs-site.xml

<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/hadoop/hdfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/hadoop/hdfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>master:9001</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
</configuration>

Copy mapred-site.xml.template to mapred-site.xml and edit it:

[root@master hadoop]# cd /soft/hadoop/hadoop-2.7.6/etc/hadoop/
[root@master hadoop]# cp mapred-site.xml.template mapred-site.xml
[root@master hadoop]# vi mapred-site.xml

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
        <final>true</final>
    </property>
    <!-- mapreduce.jobtracker.http.address and mapred.job.tracker are MRv1
         settings; they are ignored when mapreduce.framework.name is yarn. -->
    <property>
        <name>mapreduce.jobtracker.http.address</name>
        <value>master:50030</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
    </property>
    <property>
        <name>mapred.job.tracker</name>
        <value>http://master:9001</value>
    </property>
</configuration>

Edit yarn-site.xml (the properties must sit inside a <configuration> root element):

[root@master hadoop]# vi yarn-site.xml

<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
    </property>
</configuration>

Edit /soft/hadoop/hadoop-2.7.6/etc/hadoop/slaves: delete the default entry and add slave1 and slave2:

[root@master hadoop]# echo -e "slave1\nslave2" > /soft/hadoop/hadoop-2.7.6/etc/hadoop/slaves

2.3 Start the cluster

Format the NameNode (on master only):

[root@master hadoop]# cd /soft/hadoop/hadoop-2.7.6/bin/
[root@master bin]# ./hadoop namenode -format

Start all daemons (on master only). Note that start-all.sh is deprecated in Hadoop 2.x in favor of start-dfs.sh followed by start-yarn.sh, but it still works:

[root@master bin]# cd /soft/hadoop/hadoop-2.7.6/sbin/
[root@master sbin]# ./start-all.sh

3. Verification

The results should match the screenshots in the previous pseudo-distributed article exactly.

3.1 Check the processes on each node with jps

master

[root@master sbin]# jps

3337 Jps

2915 SecondaryNameNode

3060 ResourceManager

2737 NameNode

[root@master sbin]#



slave1

[root@slave1 hadoop]# jps

2608 DataNode

2806 Jps

2706 NodeManager

[root@slave1 hadoop]#


slave2

[root@slave2 hadoop]# jps

2614 DataNode

2712 NodeManager

2812 Jps

[root@slave2 hadoop]#
In a browser, open port 50070 on master, e.g. http://172.17.138.82:50070, and the YARN web UI:

http://172.17.138.82:8088/

If both pages load, the Hadoop cluster is up and working.

3.2 Create input data, using /etc/protocols as the test file

Copy the file into HDFS, creating the target directory first. (hadoop dfs is a deprecated alias of hadoop fs in Hadoop 2.x; absolute paths are used because the job runs as root, whose HDFS home is /user/root, not /user/hadoop.)

[root@master sbin]# hadoop fs -mkdir -p /user/hadoop
[root@master sbin]# hadoop fs -put /etc/protocols /user/hadoop/input

3.3 Run the Hadoop WordCount example (word-frequency count)

# If an output directory is left over from an earlier run, Hadoop refuses to overwrite it and the job fails, so delete it first:
$ hadoop fs -rm -r /user/hadoop/output

Run WordCount from the compiled examples jar (the -sources jar next to it contains Java source files, not runnable classes):

$ hadoop jar /soft/hadoop/hadoop-2.7.6/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.6.jar wordcount /user/hadoop/input /user/hadoop/output
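For intuition about what the job computes, the same word-frequency count can be sketched with plain shell tools on a tiny local input:

```shell
# Split on whitespace, one word per line, then count duplicates:
# "a" appears 3 times and "b" appears 2 times in this input.
printf 'a b a\nb a\n' | tr -s ' \t' '\n' | sort | uniq -c
```

WordCount does the same thing, except the splitting runs in parallel map tasks and the counting in reduce tasks.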

3.4 View the generated word counts

$ hadoop fs -cat /user/hadoop/output/*

3.5 Stop the cluster

[root@master bin]# cd /soft/hadoop/hadoop-2.7.6/sbin/

[root@master sbin]# ./stop-all.sh
