
Setting Up a Hadoop Cluster (VM Preparation, JDK and Hadoop Installation, Hadoop Directory Structure)

Contents

Preparing the virtual machine

Installing the JDK and Hadoop

Hadoop directory structure


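Preparing the virtual machine

Environment: a freshly installed CentOS machine. Start with:

[root ~]# ifconfig

This shows the host's current IP address. Use it to log in from a remote shell, which is more convenient than working in the VM console. Once logged in, first look at the current network settings: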

# Inspect the interface's IP settings
[root ~]# vim /etc/sysconfig/network-scripts/ifcfg-eth0

DEVICE=eth0
HWADDR=00:0C:29:22:E8:45
TYPE=Ethernet
UUID=bd07f61f-25b4-4899-810d-91046e2b145a
ONBOOT=no
NM_CONTROLLED=yes
BOOTPROTO=dhcp


# Inspect the NIC naming rules:
[root ~]# vim /etc/udev/rules.d/70-persistent-net.rules

# This file was automatically generated by the /lib/udev/write_net_rules
# program, run by the persistent-net-generator.rules rules file.
#
# You can modify it, as long as you keep each rule on a single
# line, and change only the value of the NAME= key.

# PCI device 0x8086:0x100f (e1000)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:0c:29:22:e8:45", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

# PCI device 0x8086:0x100f (e1000)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:0c:29:66:02:e5", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"

Now start the configuration.

#1. Disable the firewall at boot (it stays off after the reboot in step 9)
[root ~]# chkconfig iptables off
[root ~]# chkconfig --list iptables
iptables       	0:off	1:off	2:off	3:off	4:off	5:off	6:off

#2. Set a static IP
[root ~]# vim /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
TYPE=Ethernet
ONBOOT=yes
BOOTPROTO=static
NAME="eth0"
IPADDR=192.168.1.101
PREFIX=24
GATEWAY=192.168.1.2
DNS1=192.168.1.2
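
PREFIX=24 in the file above is the CIDR form of the netmask 255.255.255.0 that ifconfig reports after the reboot. The following is a minimal sketch of that conversion; the function name prefix_to_netmask is made up for illustration and is not part of CentOS.

```shell
# Convert a CIDR prefix length (0-32) into a dotted-quad netmask.
# The body runs in a subshell so its variables do not leak to the caller.
prefix_to_netmask() (
    prefix=$1
    mask=""
    for octet in 1 2 3 4; do
        # Each octet takes up to 8 of the remaining prefix bits.
        if [ "$prefix" -ge 8 ]; then
            bits=8
        elif [ "$prefix" -gt 0 ]; then
            bits=$prefix
        else
            bits=0
        fi
        prefix=$((prefix - bits))
        # An octet with its top $bits bits set equals 256 - 2^(8 - bits).
        value=$((256 - (1 << (8 - bits))))
        mask="${mask}${mask:+.}${value}"
    done
    echo "$mask"
)

prefix_to_netmask 24
```

Running it with 24 prints 255.255.255.0, which matches the Mask field in the ifconfig output.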

#3. Change the hostname
[root ~]# vim /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=hadoop101

#4. Edit the NIC rules: delete the eth0 entry and rename the eth1 entry to eth0
[root ~]# vim /etc/udev/rules.d/70-persistent-net.rules
# PCI device 0x8086:0x100f (e1000)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:0c:29:66:02:e5", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

#5. Configure /etc/hosts. Add more entries than are needed right now so spares are ready. The hosts file on Windows needs the same entries (not shown here).
[root ~]# vim /etc/hosts
192.168.1.100 hadoop100
192.168.1.101 hadoop101
192.168.1.102 hadoop102
192.168.1.103 hadoop103
192.168.1.104 hadoop104
192.168.1.105 hadoop105
192.168.1.106 hadoop106
192.168.1.107 hadoop107
192.168.1.108 hadoop108
192.168.1.109 hadoop109
192.168.1.110 hadoop110
192.168.1.111 hadoop111
192.168.1.112 hadoop112
192.168.1.113 hadoop113
192.168.1.114 hadoop114

#6. Create a regular user and set its password
[root ~]# useradd isea
[root ~]# passwd isea

#7. Grant the user root privileges through sudo. In vim, typing 91 and then Shift+G jumps straight to that line.
[root ~]# vim /etc/sudoers
isea ALL=(ALL) NOPASSWD:ALL

#8. Create the module and software directories under /opt and give ownership to the regular user
[root ~]# mkdir /opt/module /opt/software
[root ~]# chown isea:isea /opt/software/ /opt/module/
[root ~]# ll /opt/
total 12
drwxr-xr-x. 2 isea isea 4096 Nov 14 17:12 module
drwxr-xr-x. 2 root root 4096 Mar 26  2015 rh
drwxr-xr-x. 2 isea isea 4096 Nov 14 17:12 software

#9. Reboot, log in as the regular user, and check that the machine's IP address is correct.
[isea opt]$ ifconfig
eth0      Link encap:Ethernet  HWaddr 00:0C:29:82:89:7B
          inet addr:192.168.1.101  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe82:897b/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:549 errors:0 dropped:0 overruns:0 frame:0
          TX packets:194 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:43261 (42.2 KiB)  TX bytes:23769 (23.2 KiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:16 errors:0 dropped:0 overruns:0 frame:0
          TX packets:16 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1248 (1.2 KiB)  TX bytes:1248 (1.2 KiB)
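
The /etc/hosts entries in step 5 follow a regular pattern, so they can be generated instead of typed by hand. This is a rough sketch; the function name gen_hosts and its argument order are assumptions made for this example.

```shell
# Print hosts entries for a contiguous range of cluster nodes.
# $1 = network part, $2 = first host number, $3 = last host number, $4 = hostname prefix
gen_hosts() (
    net=$1
    i=$2
    last=$3
    name=$4
    while [ "$i" -le "$last" ]; do
        printf '%s.%s %s%s\n' "$net" "$i" "$name" "$i"
        i=$((i + 1))
    done
)

# Print the same 15 entries used in step 5 (hadoop100 through hadoop114).
gen_hosts 192.168.1 100 114 hadoop
```

The sketch only prints to standard output. After reviewing the result, it could be appended as root with something like gen_hosts 192.168.1 100 114 hadoop | sudo tee -a /etc/hosts.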

That completes the configuration of one virtual machine. Next, install the JDK and Hadoop.

Installing the JDK and Hadoop

#1. Copy the JDK and Hadoop tarballs into the software directory and check that they arrived
[isea software]$ ll
total 374196
-rw-rw-r--. 1 isea isea 197657687 Nov 14 17:55 hadoop-2.7.2.tar.gz
-rw-rw-r--. 1 isea isea 185515842 Nov 14 17:55 jdk-8u144-linux-x64.tar.gz


#2. Extract both archives into the module directory and check the result
[isea software]$ tar -zxvf jdk-8u144-linux-x64.tar.gz -C /opt/module/
*
*
*
[isea software]$ tar -zxvf hadoop-2.7.2.tar.gz -C /opt/module/
*
*
*
[isea module]$ ll
total 8
drwxr-xr-x. 9 isea isea 4096 May 22  2017 hadoop-2.7.2
drwxr-xr-x. 8 isea isea 4096 Jul 22  2017 jdk1.8.0_144
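
Note that the extracted directory name (jdk1.8.0_144) differs from the archive name (jdk-8u144-linux-x64.tar.gz). A quick way to see what an archive will create is to list it with -t before extracting it with -x. The sketch below demonstrates the flags on a throwaway archive in a temporary directory; the name demo-1.0 is hypothetical.

```shell
# Build a small scratch archive so the flags can be tried without the real tarballs.
work=$(mktemp -d)
mkdir -p "$work/src/demo-1.0/bin" "$work/module"
echo 'echo hello' > "$work/src/demo-1.0/bin/run.sh"

# -c create, -z gzip, -f archive file; -C switches directory before adding files.
tar -czf "$work/demo-1.0.tar.gz" -C "$work/src" demo-1.0

# -t lists the contents without extracting; the first path component is
# the directory that extraction will create.
tar -tzf "$work/demo-1.0.tar.gz"

# -x extracts; -C chooses the target directory, as in the commands above.
tar -zxf "$work/demo-1.0.tar.gz" -C "$work/module"
ls "$work/module"
```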


#3. Enter the JDK and Hadoop installation directories to get their paths for the environment variables
[isea module]$ cd jdk1.8.0_144/
[isea jdk1.8.0_144]$ pwd
/opt/module/jdk1.8.0_144
[isea jdk1.8.0_144]$ cd ..
[isea module]$ cd hadoop-2.7.2/
[isea hadoop-2.7.2]$ pwd
/opt/module/hadoop-2.7.2

#4. Set the JDK and Hadoop environment variables: edit /etc/profile (sudo is required) and append the following at the end of the file (Shift+G jumps there)
[isea module]$ sudo vim /etc/profile

#JAVA_HOME
export JAVA_HOME=/opt/module/jdk1.8.0_144
export PATH=$PATH:$JAVA_HOME/bin

#HADOOP_HOME
export HADOOP_HOME=/opt/module/hadoop-2.7.2
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
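
Appending by hand works, but running the step twice would duplicate the lines. The following sketch appends the same exports only when they are missing. It writes to a scratch file so it can be tried without root; the function name add_env_block is an assumption for this example, and pointing it at /etc/profile would need sudo.

```shell
# Append the JAVA_HOME/HADOOP_HOME exports to a profile file once.
# $1 = path of the profile file to modify
add_env_block() (
    profile=$1
    # The #HADOOP_HOME comment line doubles as an "already added" marker.
    if grep -qxF '#HADOOP_HOME' "$profile" 2>/dev/null; then
        exit 0
    fi
    cat >> "$profile" <<'EOF'
#JAVA_HOME
export JAVA_HOME=/opt/module/jdk1.8.0_144
export PATH=$PATH:$JAVA_HOME/bin

#HADOOP_HOME
export HADOOP_HOME=/opt/module/hadoop-2.7.2
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
EOF
)

profile_demo=$(mktemp)
add_env_block "$profile_demo"
add_env_block "$profile_demo"   # second call changes nothing

# Load the variables into the current shell, as sourcing /etc/profile would.
. "$profile_demo"
echo "$JAVA_HOME"
echo "$HADOOP_HOME"
```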

#5. Source the profile file, then check that the installation works
[isea module]$ source /etc/profile
[isea module]$ hadoop version
Hadoop 2.7.2
Subversion Unknown -r Unknown
Compiled by root on 2017-05-22T10:49Z
Compiled with protoc 2.5.0
From source with checksum d0fda26633fa762bff87ec759ebe689c
This command was run using /opt/module/hadoop-2.7.2/share/hadoop/common/hadoop-common-2.7.2.jar
[isea module]$ java -version
java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)
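
When this check is scripted, it helps to know that java -version prints to standard error, so the output has to be redirected before it can be parsed. Below is a small sketch that pulls out the quoted version string; the function name parse_java_version is made up for illustration.

```shell
# Read `java -version` style text on stdin and print the quoted version
# string found on its first line.
parse_java_version() {
    sed -n '1s/.*version "\([^"]*\)".*/\1/p'
}

# Try it on the first two lines of the output shown above.
sample='java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)'
printf '%s\n' "$sample" | parse_java_version

# On a real machine, redirect stderr into the pipe (2>&1).
if command -v java >/dev/null 2>&1; then
    java -version 2>&1 | parse_java_version
else
    echo "java not found on PATH" >&2
fi
```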

With that, Hadoop and the JDK are installed. Clone this virtual machine twice, change the IP address on each clone, and the Hadoop journey can begin.
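
Changing the address on a clone comes down to rewriting the IPADDR line in ifcfg-eth0. The sketch below does that on a scratch copy of the file so nothing under /etc is touched. The function name set_ipaddr and the address 192.168.1.102 are assumptions for illustration. A clone would presumably also need its own hostname and udev rule, following steps 3 and 4 above.

```shell
# Rewrite the IPADDR line of an ifcfg-style file.
# $1 = path to the file, $2 = new IPv4 address
set_ipaddr() {
    sed "s/^IPADDR=.*/IPADDR=$2/" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Scratch copy holding a few of the keys from the static configuration.
ifcfg_demo=$(mktemp)
cat > "$ifcfg_demo" <<'EOF'
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.1.101
PREFIX=24
GATEWAY=192.168.1.2
EOF

# 192.168.1.102 is an assumed address for the first clone.
set_ipaddr "$ifcfg_demo" 192.168.1.102
grep '^IPADDR=' "$ifcfg_demo"
```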

Hadoop directory structure:

[isea hadoop-2.7.2]$ ll
total 52
drwxr-xr-x. 2 isea isea  4096 May 22  2017 bin
drwxr-xr-x. 3 isea isea  4096 May 22  2017 etc
drwxr-xr-x. 2 isea isea  4096 May 22  2017 include
drwxr-xr-x. 3 isea isea  4096 May 22  2017 lib
drwxr-xr-x. 2 isea isea  4096 May 22  2017 libexec
-rw-r--r--. 1 isea isea 15429 May 22  2017 LICENSE.txt
-rw-r--r--. 1 isea isea   101 May 22  2017 NOTICE.txt
-rw-r--r--. 1 isea isea  1366 May 22  2017 README.txt
drwxr-xr-x. 2 isea isea  4096 May 22  2017 sbin
drwxr-xr-x. 4 isea isea  4096 May 22  2017 share

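(1) bin: scripts for operating the Hadoop services (HDFS, YARN)

(2) etc: Hadoop's configuration directory, which holds its configuration files

(3) lib: Hadoop's native libraries (used to compress and decompress data)

(4) sbin: scripts that start or stop the Hadoop services

(5) share: Hadoop's dependency jars, documentation, and official examples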
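
As a quick sanity check after extraction, a script can confirm that the five directories described above exist. This is only a sketch: the function name check_hadoop_layout is hypothetical, and the demo runs against a mock tree in a temporary directory instead of a real installation.

```shell
# Return success only if all five described directories exist under $1.
check_hadoop_layout() (
    home=$1
    missing=0
    for d in bin etc lib sbin share; do
        if [ ! -d "$home/$d" ]; then
            echo "missing: $home/$d" >&2
            missing=1
        fi
    done
    exit "$missing"
)

# Mock tree standing in for /opt/module/hadoop-2.7.2.
mock_home=$(mktemp -d)
mkdir -p "$mock_home/bin" "$mock_home/etc" "$mock_home/lib" \
         "$mock_home/sbin" "$mock_home/share"

if check_hadoop_layout "$mock_home"; then
    echo "layout looks complete"
fi
```

On the real machine the call would be check_hadoop_layout /opt/module/hadoop-2.7.2.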