Installing and Configuring Hadoop and HBase (Standalone Mode)
(Be sure to read the pitfalls I hit, listed at the end; if you run into problems during installation, check the issues and fixes there.)
Download the Hadoop package
Version installed here: hadoop-1.0.4.tar.gz
Before installing Hadoop, the server must have a JDK installed.
One way to install the JDK: download the Linux rpm package from the official site: jdk-8u181-linux-x64.rpm
Upload it to the server (you can use the rz command in Xshell):
Run: rpm -ivh jdk-8u181-linux-x64.rpm
Once installed, configure the JDK path.
Method 1: run the export commands directly:
$ export JAVA_HOME=/usr/java/jdk1.8.0_181-amd64 # the path where the rpm above installed the JDK
$ export PATH=$JAVA_HOME/bin:${PATH}
Method 2 (recommended): write a small script file, hadoop-env.sh:
touch hadoop-env.sh;
chmod a+x hadoop-env.sh;
Copy the commands from Method 1 into hadoop-env.sh.
Then, before each Hadoop session, simply source the script:
Command: source hadoop-env.sh
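To confirm the JDK is picked up after sourcing (paths assume the rpm install above):
$ echo $JAVA_HOME # should print /usr/java/jdk1.8.0_181-amd64
$ java -version # should report 1.8.0_181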
1. Extract the package
tar -xvf hadoop-1.0.4.tar.gz # I extracted it under /opt/
mv hadoop-1.0.4 hadoop # Hadoop install path: /opt/hadoop
# Add the Hadoop install path to hadoop-env.sh
export HADOOP_HOME=/opt/hadoop
export PATH=$HADOOP_HOME/bin:$PATH
My startup script file, hadoop-env.sh:
[root@localhost opt]# cat /root/hadoop-env.sh
export HADOOP_HOME=/home/LiuHuan/hadoop
export PATH=$HADOOP_HOME/bin:$PATH
export JAVA_HOME=/usr/java/jdk1.8.0_181-amd64/
export PATH=$JAVA_HOME/bin:${PATH}
2. Set up passwordless SSH
# press Enter at every prompt
[root@localhost opt]# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:NmlTPEeHk9I1XdHtJ5x5h/FmYP2jXcyJM1sWr+LZloo root@localhost
The key's randomart image is:
(randomart image omitted)
# My .ssh directory is under /root, so I run the following command from /root:
$ cp .ssh/id_rsa.pub .ssh/authorized_keys
# Connect to localhost; if the setup above worked, ssh logs in without asking for a password
$ ssh localhost
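An equivalent way to authorize the key is ssh-copy-id, which ships with OpenSSH and also fixes file permissions (a sketch; run it from the same account):
$ ssh-copy-id -i ~/.ssh/id_rsa.pub localhost
If ssh localhost still asks for a password, check that ~/.ssh is mode 700 and ~/.ssh/authorized_keys is mode 600.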
3. Edit the configuration files under /opt/hadoop/conf/: core-site.xml, hdfs-site.xml, and mapred-site.xml. (On Hadoop versions other than 1.0.x these files may live elsewhere; on 2.0 and later they are in hadoop/etc/hadoop/. Find them and edit there.)
# Edit core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
# Edit hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
# Edit mapred-site.xml (2.0 does not ship this file; copy mapred-site.xml.template to mapred-site.xml and edit that)
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
</configuration>
4. Create Hadoop's data storage directory:
mkdir /var/lib/hadoop
chmod 777 /var/lib/hadoop
5. Edit core-site.xml again and add the following:
<property>
<name>hadoop.tmp.dir</name>
<value>/var/lib/hadoop</value>
</property>
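Putting the two edits together, the complete core-site.xml should look roughly like this (a sketch; the port in fs.default.name must match what you use later for hbase.rootdir):
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/var/lib/hadoop</value>
</property>
</configuration>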
6. Format the HDFS filesystem
$ hadoop namenode -format
# Output:
12/10/26 22:45:25 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = vm193/10.0.0.193
STARTUP_MSG: args = [-format]
…
12/10/26 22:45:25 INFO namenode.FSNamesystem: fsOwner=hadoop,hadoop
12/10/26 22:45:25 INFO namenode.FSNamesystem: supergroup=supergroup
12/10/26 22:45:25 INFO namenode.FSNamesystem: isPermissionEnabled=true
12/10/26 22:45:25 INFO common.Storage: Image file of size 96 saved in 0 seconds.
12/10/26 22:45:25 INFO common.Storage: Storage directory /var/lib/hadoop/dfs/name has been successfully formatted.
12/10/26 22:45:26 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at vm193/10.0.0.193
$
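To double-check that the format succeeded, look for the newly created name directory (the path assumes the hadoop.tmp.dir value set above):
$ ls /var/lib/hadoop/dfs/name/current # should list files such as fsimage and VERSION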
7. Start Hadoop:
./bin/start-dfs.sh
starting namenode, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-namenode-vm193.out
localhost: starting datanode, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-datanode-vm193.out
localhost: starting secondarynamenode, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-secondarynamenode-vm193.out
$ jps
9550 DataNode
9687 Jps
9638 SecondaryNameNode
9471 NameNode
$ start-mapred.sh
starting jobtracker, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-jobtracker-vm193.out
localhost: starting tasktracker, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-tasktracker-vm193.out
$ jps
9550 DataNode
9877 TaskTracker
9638 SecondaryNameNode
9471 NameNode
9798 JobTracker
9913 Jps
You can also start everything with a single command: ./sbin/start-all.sh (in Hadoop 2.0 the start-all.sh script lives under hadoop/sbin/).
Then run jps to check the running processes (DataNode and NameNode must be present). The process list varies by version; the ones above are from a 1.0 install.
Below are the processes for Hadoop 2.7.7:
[root@localhost hadoop]# jps
4817 NameNode
4945 DataNode
5110 SecondaryNameNode
5560 Jps
5337 NodeManager
4173 ResourceManager
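If a daemon is missing from the jps output, an HDFS status report is a quick first check (this form assumes 2.x; on 1.x use hadoop dfsadmin -report):
$ hdfs dfsadmin -report # shows live datanodes and configured capacity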
If you see the error: localhost: Error: JAVA_HOME is not set and could not be found.
vi /opt/hadoop/etc/hadoop/hadoop-env.sh
Add a JAVA_HOME line, replacing the value after = with your JDK install directory.
8. Basic Hadoop usage:
$ hadoop dfs -mkdir /user
$ hadoop dfs -mkdir /user/hadoop
$ hadoop fs -ls /user
Found 1 items
drwxr-xr-x - hadoop supergroup 0 2012-10-26 23:09 /user/hadoop
$ echo "This is a test." >> test.txt
$ cat test.txt
This is a test.
$ hadoop dfs -copyFromLocal test.txt .
$ hadoop dfs -ls
Found 1 items
-rw-r--r-- 1 hadoop supergroup 16 2012-10-26 23:19 /user/hadoop/test.txt
$ hadoop dfs -cat test.txt
This is a test.
$ rm test.txt # removes only the local copy; the file still lives in HDFS
$ hadoop dfs -cat test.txt
This is a test.
$ hadoop fs -copyToLocal test.txt
$ cat test.txt
This is a test.
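As a further smoke test, you can run the bundled WordCount example on the file just uploaded (the jar path assumes the 1.0.4 layout; on 2.x the examples jar lives under share/hadoop/mapreduce/):
$ hadoop jar /opt/hadoop/hadoop-examples-1.0.4.jar wordcount test.txt wc-out
$ hadoop dfs -cat wc-out/part-r-00000 # per-word counts for test.txt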
9. Monitoring Hadoop:
Open http://localhost:50030 in a browser. # The monitoring machine is usually not the Hadoop server itself, so replace localhost with the Hadoop server's IP address.
In version 3.0 the ports changed:
the HDFS web UI defaults to port 9870, and the YARN web UI to port 8088.
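A quick reachability check from the shell (ports assume the 3.x defaults just mentioned; on 1.x the JobTracker UI is on 50030 and the NameNode UI on 50070):
$ curl -s -o /dev/null -w "%{http_code}\n" http://localhost:9870
$ curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8088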
Not original content; the steps above come from the book Hadoop.Data.Processing.and.Modelling.
Installing HBase
Step 1:
Extract: tar -xvf hbase-2.0.2-bin.tar.gz
My HBase install path: /opt/hbase-2.0.2
Step 2:
Add the HBase install path to the startup script hadoop-env.sh (or to /etc/profile):
export HBASE_HOME=/opt/hbase-2.0.2
export PATH=$HBASE_HOME/bin:${PATH}
Source the script:
source hadoop-env.sh
Step 3:
# Configure HBase:
vi hbase-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_181-amd64/
export HBASE_CLASSPATH=/opt/hbase-2.0.2/conf
export HBASE_MANAGES_ZK=true
vi hbase-site.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:9000/hbase</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>localhost</value>
</property>
<property>
<name>hbase.tmp.dir</name>
<value>/root/hbase/tmp</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
</configuration>
Also copy hdfs-site.xml and core-site.xml from the Hadoop configuration directory (hadoop/etc/hadoop/ in Hadoop 2.7.7) into HBase's conf/ directory, for example:
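With the paths used in this guide (Hadoop under /opt/hadoop, HBase under /opt/hbase-2.0.2):
cp /opt/hadoop/etc/hadoop/hdfs-site.xml /opt/hbase-2.0.2/conf/
cp /opt/hadoop/etc/hadoop/core-site.xml /opt/hbase-2.0.2/conf/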
Step 4:
Start HBase:
./bin/start-hbase.sh
# These processes should all be present:
[root@localhost hbase-2.0.2]# jps
1764 NameNode
6837 HQuorumPeer
7045 Jps
2246 ResourceManager
3801 HRegionServer
6905 HMaster
2075 SecondaryNameNode
1868 DataNode
2349 NodeManager
# Enter the HBase shell:
[root@localhost hbase-2.0.2]# hbase shell
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
Version 2.0.2, r1cfab033e779df840d5612a85277f42a6a4e8172, Tue Aug 28 20:50:40 PDT 2018
Took 0.0185 seconds
hbase(main):001:0> list
TABLE
0 row(s)
Took 2.2076 seconds
=> []
hbase(main):002:0> create 'member', 'm_id', 'address', 'info'
Created table member
Took 2.2952 seconds
=> Hbase::Table - member
hbase(main):003:0> list 'member'
TABLE
member
1 row(s)
Took 0.0371 seconds
=> ["member"]
hbase(main):004:0> list
TABLE
member
1 row(s)
Took 0.0324 seconds
=> ["member"]
hbase(main):005:0> exit
# Exit the HBase shell
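As a quick sanity check on the new table, a few basic shell commands (the row key 'row1' and value '25' are made-up examples; 'info' is one of the column families created above):
put 'member', 'row1', 'info:age', '25'
get 'member', 'row1'
scan 'member'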
Summary
Problems encountered:
1. Error 1:
[root@localhost hadoop]# ./sbin/start-dfs.sh
Starting namenodes on [localhost]
ERROR: Attempting to operate on hdfs namenode as root
ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.
Starting datanodes
ERROR: Attempting to operate on hdfs datanode as root
ERROR: but there is no HDFS_DATANODE_USER defined. Aborting operation.
Starting secondary namenodes [localhost.localdomain]
ERROR: Attempting to operate on hdfs secondarynamenode as root
ERROR: but there is no HDFS_SECONDARYNAMENODE_USER defined. Aborting operation.
Fix 1: the errors are caused by missing user definitions, so edit both the start and stop scripts:
$ vim sbin/start-dfs.sh
$ vim sbin/stop-dfs.sh
Add the following near the top:
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
Or:
add the following to hadoop-env.sh (note: hadoop-env.sh here is the script file you wrote yourself to declare environment variables, which must be sourced before starting Hadoop):
export HDFS_NAMENODE_USER="root"
export HDFS_DATANODE_USER="root"
export HDFS_SECONDARYNAMENODE_USER="root"
export YARN_RESOURCEMANAGER_USER="root"
export YARN_NODEMANAGER_USER="root"
2. Error 2 (same class of problem as Error 1):
Starting resourcemanager
ERROR: Attempting to launch yarn resourcemanager as root
ERROR: but there is no YARN_RESOURCEMANAGER_USER defined. Aborting launch.
Starting nodemanagers
ERROR: Attempting to launch yarn nodemanager as root
ERROR: but there is no YARN_NODEMANAGER_USER defined. Aborting launch.
Fix 2:
Again caused by missing user definitions; edit both the start and stop scripts:
$ vim sbin/start-yarn.sh
$ vim sbin/stop-yarn.sh
Add:
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
Also add to hadoop-env.sh:
export JAVA_HOME=<your Java path>
3. Error 3:
[root@localhost sbin]# ./start-dfs.sh
ls: Call From localhost/127.0.0.1 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Starting namenodes on [localhost]
Last login: Wed Oct 17 07:53:07 EDT 2018 from 172.16.7.1 on pts/1
/opt/hadoop/etc/hadoop/hadoop-env.sh: line 37: hdfs: command not found
ERROR: JAVA_HOME is not set and could not be found.
Starting datanodes
Last login: Wed Oct 17 07:54:50 EDT 2018 on pts/0
/opt/hadoop/etc/hadoop/hadoop-env.sh: line 37: hdfs: command not found
ERROR: JAVA_HOME is not set and could not be found.
Starting secondary namenodes [localhost.localdomain]
Last login: Wed Oct 17 07:54:50 EDT 2018 on pts/0
/opt/hadoop/etc/hadoop/hadoop-env.sh: line 37: hdfs: command not found
ERROR: JAVA_HOME is not set and could not be found.
Fix:
Edit the configuration file hadoop-env.sh (this one ships with the tarball; it is not the script you created yourself).
My install directory is /opt/hadoop/.
For hadoop-1.*.* releases the file is at <hadoop install dir>/conf/hadoop-env.sh
For later versions it is at <hadoop install dir>/etc/hadoop/hadoop-env.sh
vi /opt/hadoop/etc/hadoop/hadoop-env.sh
Add a JAVA_HOME line, replacing the value after = with your JDK install directory.
Then update the environment variables:
sudo vi ~/.bashrc
Append the following at the end of the file:
#set oracle jdk environment
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_151 ## change this to your own unpacked JDK directory
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
Apply the changes immediately:
source ~/.bashrc
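Once the variables are loaded, Hadoop should be able to locate Java; a quick check:
$ hadoop version # prints version information if JAVA_HOME now resolves correctly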