
Hadoop Installation Notes (1)

Tags: hadoop, installation, pseudo-distributed mode, basics

I. Hadoop Basics

1. Pseudo-Distributed Mode (Single Node)

1.1 Configure environment variables for the default JDK 1.7 on CentOS 7

[root@master1 ~]# vim /etc/profile.d/java.sh
export JAVA_HOME=/usr

[root@master1 ~]# source /etc/profile.d/java.sh

Install the JDK devel package:
[root@master1 ~]# yum install java-1.7.0-openjdk-devel.x86_64
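
A quick sanity check (my own addition, not part of the original notes) is to confirm that the JDK resolves through the JAVA_HOME just exported:

[root@master1 ~]# echo $JAVA_HOME
[root@master1 ~]# $JAVA_HOME/bin/java -version
[root@master1 ~]# javac -version

Both java and javac should report an OpenJDK 1.7.0 build once the devel package is installed.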

1.2 Create the Hadoop directory and extract the Hadoop tarball into it

[root@master1 ~]# mkdir /bdapps
[root@master1 ~]# tar xf hadoop-2.6.2.tar.gz -C /bdapps/

[root@master1 ~]# cd /bdapps/
Create a symlink:
[root@master1 bdapps]# ln -sv hadoop-2.6.2 hadoop
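
The -s flag creates a symbolic link and -v prints what was linked; afterwards /bdapps/hadoop should point at the versioned directory, which can be confirmed with (an optional check, not in the original notes):

[root@master1 bdapps]# ls -ld /bdapps/hadoop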

1.3 Set the Hadoop environment variables

[root@master1 hadoop]# vim /etc/profile.d/hadoop.sh

export HADOOP_PREFIX=/bdapps/hadoop
export PATH=$PATH:${HADOOP_PREFIX}/bin:${HADOOP_PREFIX}/sbin
export HADOOP_YARN_HOME=${HADOOP_PREFIX}
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}

Reload the file:
[root@master1 ~]# source /etc/profile.d/hadoop.sh
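
With hadoop.sh sourced, the hadoop and yarn commands should resolve from PATH; a quick way to verify this (an extra check of my own) is:

[root@master1 ~]# which hadoop
[root@master1 ~]# hadoop version

hadoop version should report 2.6.2 along with build details if HADOOP_PREFIX points at the right directory.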

1.4 Create the users and directories for running the Hadoop processes

Create the group:
[root@master1 ~]# groupadd hadoop
Create the users and add them to the hadoop group:
[root@master1 ~]# useradd -g hadoop yarn
[root@master1 ~]# useradd -g hadoop hdfs
[root@master1 ~]# useradd -g hadoop mapred

Create the data directories:
[root@master1 ~]# mkdir -pv /data/hadoop/hdfs/{nn,snn,dn}
Set ownership on the data directories:
[root@master1 ~]# chown -R hdfs:hadoop /data/hadoop/hdfs
[root@master1 ~]# ll /data/hadoop/hdfs
total 0
drwxr-xr-x 2 hdfs hadoop 6 Apr 19 08:44 dn
drwxr-xr-x 2 hdfs hadoop 6 Apr 19 08:44 nn
drwxr-xr-x 2 hdfs hadoop 6 Apr 19 08:44 snn

Create the log directory and set its permissions (done inside the installation directory):
[root@master1 ~]# cd /bdapps/hadoop
[root@master1 hadoop]# mkdir logs
[root@master1 hadoop]# chmod g+w logs/
[root@master1 hadoop]# chown -R yarn:hadoop logs
[root@master1 hadoop]# ll | grep log
drwxrwxr-x 2 yarn  hadoop     6 Apr 19 08:47 logs

Change the owner and group of the installation directory contents:
[root@master1 hadoop]# chown -R yarn:hadoop ./*

1.5 Configure Hadoop

Configure the NameNode address (the default filesystem) in core-site.xml:
[root@master1 hadoop]# pwd
/bdapps/hadoop/etc/hadoop
[root@master1 hadoop]# vim core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:8020</value>
        <final>true</final>
    </property>
</configuration>
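
fs.defaultFS tells every HDFS client which NameNode to talk to. Whether the value was picked up can be read straight back from the configuration files (an optional check, not part of the original steps); getconf does not need any daemon to be running:

[root@master1 hadoop]# hdfs getconf -confKey fs.defaultFS
hdfs://localhost:8020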

Configure the HDFS-related properties:
[root@master1 hadoop]# vim hdfs-site.xml 

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///data/hadoop/hdfs/nn</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:///data/hadoop/hdfs/dn</value>
    </property>
    <property>
        <name>fs.checkpoint.dir</name>
        <value>file:///data/hadoop/hdfs/snn</value>
    </property>
    <property>
        <name>fs.checkpoint.edits.dir</name>
        <value>file:///data/hadoop/hdfs/snn</value>
    </property>
</configuration>
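
dfs.replication is set to 1 because a single node can hold only one copy of each block. Note that fs.checkpoint.dir and fs.checkpoint.edits.dir are the older property names; to the best of my knowledge Hadoop 2.x still accepts them but maps them to dfs.namenode.checkpoint.dir and dfs.namenode.checkpoint.edits.dir. The effective values can be read back the same way as above:

[root@master1 hadoop]# hdfs getconf -confKey dfs.replication
[root@master1 hadoop]# hdfs getconf -confKey dfs.namenode.name.dir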

Configure MapReduce:
[root@master1 hadoop]# cp mapred-site.xml.template mapred-site.xml
[root@master1 hadoop]# vim mapred-site.xml

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

Configure YARN:
[root@master1 hadoop]# vim yarn-site.xml 

<configuration>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>localhost:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>localhost:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>localhost:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>localhost:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>10.201.106.131:8088</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
    </property>
</configuration>
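
These addresses pin the ResourceManager's ports: 8032 for client submissions, 8030 for the scheduler, 8031 for NodeManager heartbeats, 8033 for admin commands, and 8088 for the web UI. Once the YARN daemons are running (section 1.8.2), one way to confirm they are listening (my own check, not from the original notes) is:

[root@master1 ~]# ss -tnl | grep -E '8030|8031|8032|8033|8088'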

1.6 Define the slave nodes. In pseudo-distributed mode the slave defaults to the local node, so nothing needs to be changed.

[root@master1 hadoop]# cat slaves 
localhost

1.7 Format HDFS

Switch to the hdfs user:
[root@master1 ~]# su - hdfs

View the hdfs command's help:
[hdfs@master1 ~]$ hdfs --help

Format the NameNode:
[hdfs@master1 ~]$ hdfs namenode -format
Check the result:
[hdfs@master1 ~]$ ls /data/hadoop/hdfs/nn/current/
fsimage_0000000000000000000      seen_txid
fsimage_0000000000000000000.md5  VERSION
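
Besides the initial fsimage, the format step writes a VERSION file recording identifiers such as the namespaceID and clusterID, which the DataNode will later register against; it can be inspected directly (an extra check, not in the original notes):

[hdfs@master1 ~]$ cat /data/hadoop/hdfs/nn/current/VERSION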

1.8 Start Hadoop

1.8.1 Start the HDFS daemons

Start the daemons as the hdfs user.
Start the NameNode:
[hdfs@master1 ~]$ hadoop-daemon.sh start namenode

Check the Java processes:
[hdfs@master1 ~]$ jps
9127 NameNode
9220 Jps
Show detailed Java process information:
[hdfs@master1 ~]$ jps -v

Start the SecondaryNameNode:
[hdfs@master1 ~]$ hadoop-daemon.sh start secondarynamenode

Start the DataNode:
[hdfs@master1 ~]$ hadoop-daemon.sh start datanode

Test uploading a file to HDFS:
[hdfs@master1 ~]$ hdfs dfs -mkdir /test
[hdfs@master1 ~]$ hdfs dfs -put /etc/fstab /test/fstab
[hdfs@master1 ~]$ hdfs dfs -ls /test
Found 1 items
-rw-r--r--   1 hdfs supergroup       1065 2018-04-20 15:04 /test/fstab

This block is the fstab file that was just uploaded:
[hdfs@master1 ~]$ cat /data/hadoop/hdfs/dn/current/BP-908063675-10.201.106.131-1524136482474/current/finalized/subdir0/subdir0/blk_1073741825

The directory on the local host's filesystem where the DataNode stores its data:
[hdfs@master1 ~]$ ls /data/hadoop/hdfs/dn/current/
BP-908063675-10.201.106.131-1524136482474  VERSION
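
The BP-* directory is the block pool created for the NameNode formatted earlier, and the blk_* file under it holds the raw block data. The same file-to-block mapping can be shown from the HDFS side with fsck (an optional check, not in the original walkthrough):

[hdfs@master1 ~]$ hdfs fsck /test/fstab -files -blocks -locations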

1.8.2 Start the YARN daemons

Switch to the yarn user:
[root@master1 ~]# su - yarn

Start the ResourceManager:
[yarn@master1 ~]$ yarn-daemon.sh start resourcemanager

Start the NodeManager:
[yarn@master1 ~]$ yarn-daemon.sh start nodemanager
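
As with the HDFS daemons, jps run as the yarn user should now list ResourceManager and NodeManager; if either is missing, check the log files under /bdapps/hadoop/logs:

[yarn@master1 ~]$ jps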

1.9 Check Hadoop status

Open in a browser: http://10.201.106.131:50070 (the HDFS NameNode web UI)


Open in a browser: http://10.201.106.131:8088 (the YARN ResourceManager web UI)

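If no browser is handy, the same information can be probed from the shell; for example (my own quick check, not in the original notes), the NameNode front page and the ResourceManager's cluster-info REST endpoint can be fetched with curl:

[root@master1 ~]# curl -s http://10.201.106.131:50070/ | head
[root@master1 ~]# curl -s http://10.201.106.131:8088/ws/v1/cluster/info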

1.10 Submit and run a program on Hadoop

1.10.1 Run the MapReduce example program

Switch users:
[root@master1 mapreduce]# su - hdfs

Run the example jar (with no arguments it prints the list of available examples):
[hdfs@master1 ~]$ yarn jar /bdapps/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.2.jar 

Count the words in /test/fstab:
[hdfs@master1 ~]$ yarn jar /bdapps/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.2.jar wordcount /test/fstab /test/fstab.out
View the word-count results:
[hdfs@master1 ~]$ hdfs dfs -cat /test/fstab.out/part-r-00000
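
As a further check, the job's output directory can be listed: it should contain an empty _SUCCESS marker plus one part-r-* file per reducer (just part-r-00000 here):

[hdfs@master1 ~]$ hdfs dfs -ls /test/fstab.out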
