
Installing and Configuring Hadoop and HBase (Standalone Mode)

(Be sure to read the pitfalls I hit at the end; if you run into problems during installation, check the issues and fixes listed there.)

Download the Hadoop package

Version installed here: hadoop-1.0.4.tar.gz

Before installing Hadoop, the server must already have a JDK installed.

One way to install the JDK: download the Linux RPM package from the official site: jdk-8u181-linux-x64.rpm

Upload it to the server (you can use the rz command in Xshell).

Run the command: rpm -ivh jdk-8u181-linux-x64.rpm
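A quick way to confirm the JDK landed correctly before going further (the exact version string and package name will differ on your machine):

$ java -version
$ rpm -qa | grep -i jdk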

Once the installation finishes, configure the JDK path:

Method one: run the exports directly:

$ export JAVA_HOME=/opt/jdk1.6.0_24    #replace with your actual JDK directory, e.g. /usr/java/jdk1.8.0_181-amd64 for the RPM installed above

$ export PATH=$JAVA_HOME/bin:${PATH}

Method two (recommended): write a small script file, hadoop-env.sh:

touch hadoop-env.sh;

chmod a+x hadoop-env.sh;

Copy the commands from method one into hadoop-env.sh.

Before each Hadoop session, simply run the script:

Command: source hadoop-env.sh

1. Unpack the archive

tar -xvf hadoop-1.0.4.tar.gz    #I extracted it under /opt/

mv hadoop-1.0.4 hadoop        #Hadoop install path: /opt/hadoop

#Add the Hadoop install path to hadoop-env.sh

export HADOOP_HOME=/opt/hadoop    #use your own install path here

export PATH=$HADOOP_HOME/bin:$PATH

My startup script file, hadoop-env.sh:

[root@localhost opt]# cat /root/hadoop-env.sh

export HADOOP_HOME=/home/LiuHuan/hadoop

export PATH=$HADOOP_HOME/bin:$PATH

export JAVA_HOME=/usr/java/jdk1.8.0_181-amd64/

export PATH=$JAVA_HOME/bin:${PATH}
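To confirm the script works, source it and check that Hadoop is on the PATH (a quick sanity check; the script path assumes the listing above):

$ source /root/hadoop-env.sh
$ hadoop version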

2. Set up passwordless SSH

#Press Enter at every prompt

[root@localhost opt]# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:NmlTPEeHk9I1XdHtJ5x5h/FmYP2jXcyJM1sWr+LZloo root@localhost.localdomain
The key's randomart image is:
+---[RSA 2048]----+
(randomart omitted)
+----[SHA256]-----+

#My .ssh directory is under /root, so I run the following command from /root:
$ cp .ssh/id_rsa.pub .ssh/authorized_keys

#Connect to the local machine. If the setup above succeeded, no password is required and ssh connects directly.
$ ssh localhost
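If ssh localhost still asks for a password after this, directory permissions are the usual culprit; sshd requires .ssh and authorized_keys not to be group- or world-writable (a general OpenSSH requirement, not specific to Hadoop):

$ chmod 700 /root/.ssh
$ chmod 600 /root/.ssh/authorized_keys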

3. Edit the configuration files under /opt/hadoop/conf/: core-site.xml, hdfs-site.xml, and mapred-site.xml. (If your Hadoop version is not 1.0.*, these files may be elsewhere; for 2.0 and later they are under hadoop/etc/hadoop/.)

#Edit core-site.xml

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>

</configuration>

#Edit hdfs-site.xml

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>

<name>dfs.replication</name>

<value>1</value>

</property>

</configuration>

#Edit mapred-site.xml (2.x releases do not ship this file; copy mapred-site.xml.template instead, as shown in the sketch after this block)

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>

<name>mapred.job.tracker</name>

<value>localhost:9001</value>

</property>

</configuration>
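For Hadoop 2.x, the rough equivalent of the step above (a sketch based on the standard 2.x layout; mapred.job.tracker only applies to 1.x) is to copy the template and point MapReduce at YARN:

cp mapred-site.xml.template mapred-site.xml

#then add to mapred-site.xml:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>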

4. Create Hadoop's data storage directory:

mkdir /var/lib/hadoop

chmod 777 /var/lib/hadoop

5. Edit core-site.xml again and add the following:

<property>

<name>hadoop.tmp.dir</name>

<value>/var/lib/hadoop</value>

</property>
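After steps 3 and 5, the complete core-site.xml should look roughly like this (a sketch combining the two edits, using the fs.default.name value that appears elsewhere in this walkthrough):

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/var/lib/hadoop</value>
</property>
</configuration>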

6. Format the HDFS filesystem

$ hadoop namenode -format

#Output:

12/10/26 22:45:25 INFO namenode.NameNode: STARTUP_MSG:

/************************************************************

STARTUP_MSG: Starting NameNode

STARTUP_MSG: host = vm193/10.0.0.193

STARTUP_MSG: args = [-format]

…

12/10/26 22:45:25 INFO namenode.FSNamesystem:  fsOwner=hadoop,hadoop

12/10/26 22:45:25 INFO namenode.FSNamesystem:  supergroup=supergroup

12/10/26 22:45:25 INFO namenode.FSNamesystem:  isPermissionEnabled=true

12/10/26 22:45:25 INFO common.Storage: Image file of size 96  saved in 0 seconds.

12/10/26 22:45:25 INFO common.Storage: Storage directory /var/lib/hadoop/dfs/name has been successfully formatted.

12/10/26 22:45:26 INFO namenode.NameNode: SHUTDOWN_MSG:

/************************************************************

SHUTDOWN_MSG: Shutting down NameNode at vm193/10.0.0.193

$

7. Start Hadoop:

./bin/start-dfs.sh

starting namenode, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-namenode-vm193.out

localhost: starting datanode, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-datanode-vm193.out

localhost: starting secondarynamenode, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-secondarynamenode-vm193.out

$ jps

9550 DataNode

9687 Jps

9638 SecondaryNameNode

9471 NameNode

$ start-mapred.sh

starting jobtracker, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-jobtracker-vm193.out

localhost: starting tasktracker, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-tasktracker-vm193.out

$ jps

9550 DataNode

9877 TaskTracker

9638 SecondaryNameNode

9471 NameNode

9798 JobTracker

9913 Jps

Hadoop can also be started with a single command: ./sbin/start-all.sh (in Hadoop 2.x the start-all.sh script lives under hadoop/sbin/).

Then run jps to check the running processes (DataNode and NameNode must be present). The exact list varies by version; the processes above are from the 1.0 release.

Below are the processes for Hadoop 2.7.7:

[root@localhost hadoop]# jps

4817 NameNode

4945 DataNode

5110 SecondaryNameNode

5560 Jps

5337 NodeManager

4173 ResourceManager

If you see the error "localhost: Error: JAVA_HOME is not set and could not be found.", edit:

vi /opt/hadoop/etc/hadoop/hadoop-env.sh

and add JAVA_HOME, replacing the value after = with your JDK installation directory.

8. Basic Hadoop usage:

$ hadoop dfs -mkdir /user
$ hadoop dfs -mkdir /user/hadoop
$ hadoop fs -ls /user
Found 1 items
drwxr-xr-x - hadoop supergroup 0 2012-10-26 23:09 /user/hadoop
$ echo "This is a test." >> test.txt
$ cat test.txt
This is a test.
$ hadoop dfs -copyFromLocal test.txt .
$ hadoop dfs -ls
Found 1 items
-rw-r--r-- 1 hadoop supergroup 16 2012-10-26 23:19 /user/hadoop/test.txt
$ hadoop dfs -cat test.txt
This is a test.
$ rm test.txt
$ hadoop dfs -cat test.txt
This is a test.
$ hadoop fs -copyToLocal test.txt
$ cat test.txt
This is a test.
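Beyond basic file operations, a quick way to confirm HDFS itself is healthy is the dfsadmin report, and the test file can be removed afterwards (a small sketch using the same 1.x CLI as above; output omitted):

$ hadoop dfsadmin -report
$ hadoop dfs -rm test.txt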

9. Monitoring Hadoop:

Enter http://localhost:50030 in the browser's address bar.    #Monitoring is usually done from a different machine than the Hadoop server, so replace localhost with the IP address of the server where Hadoop is installed.

In version 3.0 the ports changed:

The HDFS web UI defaults to port 9870, and the YARN web UI uses port 8088.
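Whether the UIs are actually up can also be checked from the shell (a minimal sketch assuming the 3.x ports above; use 50070 and 50030 for 1.x):

$ curl -I http://localhost:9870
$ curl -I http://localhost:8088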

Not my original work; the steps come from the book "Hadoop.Data.Processing.and.Modelling".

Installing HBase

Step 1:

Unpack: tar -xvf hbase-2.0.2-bin.tar.gz

My HBase installation path is /opt/hbase-2.0.2.

Step 2:

Add the HBase installation path to the startup script hadoop-env.sh (or to /etc/profile):

export HBASE_HOME=/opt/hbase-2.0.2

export PATH=$HBASE_HOME/bin:${PATH}

Run the script:

source hadoop-env.sh
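At this point my complete hadoop-env.sh would contain roughly the following (a sketch assuming the /opt install paths used in this walkthrough; adjust to your own directories):

export JAVA_HOME=/usr/java/jdk1.8.0_181-amd64/
export HADOOP_HOME=/opt/hadoop
export HBASE_HOME=/opt/hbase-2.0.2
export PATH=$HADOOP_HOME/bin:$HBASE_HOME/bin:$JAVA_HOME/bin:$PATH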

Step 3:

#Configure HBase:
vi hbase-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_181-amd64/
export HBASE_CLASSPATH=/opt/hbase-2.0.2/conf
export HBASE_MANAGES_ZK=true


vi hbase-site.xml

<configuration>

<property>
  <name>hbase.rootdir</name>
  <value>hdfs://localhost:9000/hbase</value>
</property>

<property>
  <name>hbase.zookeeper.quorum</name>
  <value>localhost</value>
</property>

<property>
  <name>hbase.tmp.dir</name>
  <value>/root/hbase/tmp</value>
</property>

<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>


</configuration>

In addition, copy the Hadoop configuration files hdfs-site.xml and core-site.xml (under hadoop/etc/hadoop/ for Hadoop 2.7.7) into HBase's conf/ directory.

Step 4:

Start HBase:
./bin/start-hbase.sh


#These processes should be running

[root@localhost hbase-2.0.2]# jps
1764 NameNode
6837 HQuorumPeer
7045 Jps
2246 ResourceManager
3801 HRegionServer
6905 HMaster
2075 SecondaryNameNode
1868 DataNode
2349 NodeManager



#Enter the HBase shell:

hbase shell

[root@localhost hbase-2.0.2]# hbase shell
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
Version 2.0.2, r1cfab033e779df840d5612a85277f42a6a4e8172, Tue Aug 28 20:50:40 PDT 2018
Took 0.0185 seconds                                                                          

hbase(main):001:0> list
TABLE
0 row(s)
Took 2.2076 seconds                                                                          
=> []
hbase(main):002:0> create 'member', 'm_id', 'address', 'info'
Created table member
Took 2.2952 seconds                                                                          
=> Hbase::Table - member
hbase(main):003:0> list 'member'
TABLE                                                                                    
member                                                                                   
1 row(s)
Took 0.0371 seconds                                                                          
=> ["member"]
hbase(main):004:0> list
TABLE                                                                                    
member                                                                                   
1 row(s)
Took 0.0324 seconds                                                                          
=> ["member"]

hbase(main):005:0> exit
#Exit the HBase shell
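Back in the shell, you can also go one step beyond list and write and read a cell in the member table created above (a minimal sketch; the row key and value are made up for illustration, using the info column family defined at create time):

hbase(main):001:0> put 'member', 'row1', 'info:name', 'test'
hbase(main):002:0> get 'member', 'row1'
hbase(main):003:0> scan 'member'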

Summary:

Problems I ran into:

Errors:

Error 1:

[root@localhost hadoop]# ./sbin/start-dfs.sh

Starting namenodes on [localhost]

ERROR: Attempting to operate on hdfs namenode as root

ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.

Starting datanodes

ERROR: Attempting to operate on hdfs datanode as root

ERROR: but there is no HDFS_DATANODE_USER defined. Aborting operation.

Starting secondary namenodes [localhost.localdomain]

ERROR: Attempting to operate on hdfs secondarynamenode as root

ERROR: but there is no HDFS_SECONDARYNAMENODE_USER defined. Aborting operation.

Fix 1: the errors are caused by missing user definitions, so edit both the start and stop scripts:

$ vim sbin/start-dfs.sh

$ vim sbin/stop-dfs.sh

Add the following near the top:

HDFS_DATANODE_USER=root

HADOOP_SECURE_DN_USER=hdfs

HDFS_NAMENODE_USER=root

HDFS_SECONDARYNAMENODE_USER=root

Or:

Add the following to hadoop-env.sh (note: this hadoop-env.sh is the script file I wrote myself to declare environment variables; it must be sourced before starting Hadoop):

export HDFS_NAMENODE_USER="root"

export HDFS_DATANODE_USER="root"

export HDFS_SECONDARYNAMENODE_USER="root"

export YARN_RESOURCEMANAGER_USER="root"

export YARN_NODEMANAGER_USER="root"

Error 2 (same class of problem as error 1):

Starting resourcemanager

ERROR: Attempting to launch yarn resourcemanager as root

ERROR: but there is no YARN_RESOURCEMANAGER_USER defined. Aborting launch.

Starting nodemanagers

ERROR: Attempting to launch yarn nodemanager as root

ERROR: but there is no YARN_NODEMANAGER_USER defined. Aborting launch.

Fix 2: again caused by missing user definitions, so edit both the start and stop scripts:

$ vim sbin/start-yarn.sh

$ vim sbin/stop-yarn.sh

Add the following:

YARN_RESOURCEMANAGER_USER=root

HADOOP_SECURE_DN_USER=yarn

YARN_NODEMANAGER_USER=root

Also add the following to hadoop-env.sh:

export JAVA_HOME=<your Java installation path>

Error 3:

[root@localhost sbin]# ./start-dfs.sh

ls: Call From localhost/127.0.0.1 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

Starting namenodes on [localhost]

Last login: Wed Oct 17 07:53:07 EDT 2018 from 172.16.7.1 on pts/1

/opt/hadoop/etc/hadoop/hadoop-env.sh: line 37: hdfs: command not found

ERROR: JAVA_HOME is not set and could not be found.

Starting datanodes

Last login: Wed Oct 17 07:54:50 EDT 2018 on pts/0

/opt/hadoop/etc/hadoop/hadoop-env.sh: line 37: hdfs: command not found

ERROR: JAVA_HOME is not set and could not be found.

Starting secondary namenodes [localhost.localdomain]

Last login: Wed Oct 17 07:54:50 EDT 2018 on pts/0

/opt/hadoop/etc/hadoop/hadoop-env.sh: line 37: hdfs: command not found

ERROR: JAVA_HOME is not set and could not be found.

Fix:

Edit the configuration file hadoop-env.sh (this one ships with the unpacked Hadoop distribution; it is not the script I wrote myself).

My installation directory is /opt/hadoop/.

For hadoop-1.*.* releases the file is at <Hadoop install dir>/conf/hadoop-env.sh.

For later releases it is at <Hadoop install dir>/etc/hadoop/hadoop-env.sh.

vi /opt/hadoop/etc/hadoop/hadoop-env.sh

Add JAVA_HOME, replacing the value after = with your JDK installation directory.

Alternatively, set the environment variables globally:

sudo vi ~/.bashrc

Append the following at the end of the file:

#set oracle jdk environment

export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_151    ## change this to the directory where you unpacked your own JDK

export JRE_HOME=${JAVA_HOME}/jre

export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib

export PATH=${JAVA_HOME}/bin:$PATH

Make the environment variables take effect immediately:

source ~/.bashrc
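A quick check that the new variables are picked up (the version string will reflect whatever JDK directory you pointed JAVA_HOME at):

$ echo $JAVA_HOME
$ java -version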