
Hadoop-1.2.1/1.0.1 install on Ubuntu

1. Hadoop-1.2.1 official package: click here

   Hadoop-1.0.1 official package: click here

1.1 Extract it into the Hadoop directory under your home directory

sudo tar -zxvf <package name>

cd hadoop-1.2.1
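
Concretely, assuming the tarball was downloaded into the current directory and ~/Hadoop is the target directory (both paths are just this guide's convention; adjust them to your setup):

mkdir -p ~/Hadoop                                   # target directory under home
sudo tar -zxvf hadoop-1.2.1.tar.gz -C ~/Hadoop      # extract the downloaded package there
cd ~/Hadoop/hadoop-1.2.1                            # the extracted Hadoop home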

2. JDK 1.6: click here

  JDK 1.8: click here

Move the JDK into /usr/local/java and run it directly with ./ (it is a .bin file); that extracts it. For the environment variables, add JAVA_HOME and friends in ~/.bashrc or /etc/profile (vi either one), and remember to reload the file with source afterwards. PS: if you are new to this and an operation fails with a permission-related message such as "not permitted", simply put sudo in front of the command; sudo means running it with administrator privileges. (The same steps also work for JDK 1.8.)
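
A minimal sketch of that install step, assuming the JDK 1.6 self-extracting .bin landed in ~/Downloads (the file name is illustrative; use whatever you actually downloaded):

sudo mkdir -p /usr/local/java
sudo mv ~/Downloads/jdk-6u45-linux-x64.bin /usr/local/java/
cd /usr/local/java
sudo chmod +x jdk-6u45-linux-x64.bin    # the .bin is a self-extracting installer
sudo ./jdk-6u45-linux-x64.bin           # unpacks into jdk1.6.0_45 under the current directory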

~/.bashrc example:

# Hadoop environment variables

export HADOOP_PREFIX=/home/root1/Hadoop/hadoop-1.2.1
export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin

# JDK environment variables
export JAVA_HOME=/usr/local/java/jdk1.6.0_45
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib  
export PATH=${JAVA_HOME}/bin:$PATH
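
After saving, reload the file and do a quick sanity check (these verification commands are not part of the original steps):

source ~/.bashrc      # or: source /etc/profile, depending on which file you edited
java -version         # should report the JDK you just installed
hadoop version        # should print 1.2.1 once $HADOOP_PREFIX/bin is on PATH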

3. Hadoop configuration

3.1 Configure $HADOOP_HOME/conf/hadoop-env.sh

sudo gedit conf/hadoop-env.sh

Uncomment the line export JAVA_HOME=/usr/local/java/jdk1.6.0_45 (remove the leading #) and replace the path with your own JDK path.
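
If you would rather not open an editor, appending the line also works, since the last assignment wins when the script is sourced (adjust the path to your JDK):

echo 'export JAVA_HOME=/usr/local/java/jdk1.6.0_45' | sudo tee -a conf/hadoop-env.sh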

3.2 Configure conf/core-site.xml

sudo gedit conf/core-site.xml
<configuration>
     <property>
         <name>fs.default.name</name>
         <value>hdfs://localhost:9000</value>
     </property>
</configuration>

3.3 Configure conf/hdfs-site.xml

<configuration>
     <property>
         <name>dfs.replication</name>
         <value>1</value>
     </property>
</configuration>

3.4 Configure conf/mapred-site.xml

<configuration>
     <property>
         <name>mapred.job.tracker</name>
         <value>localhost:9001</value>
     </property>
</configuration>

4.1 Configure SSH and set it up for passwordless login

# Install gedit and the SSH service

sudo apt-get update
sudo apt-get install gedit

sudo apt-get install ssh
sudo apt-get install openssh-server

# After installation, you can log in to the local machine with the following command

hadoop@ubuntu:~$ ssh localhost
The authenticity of host 'localhost (127.0.0.1)' can't be established.
ECDSA key fingerprint is a6:34:ed:64:8b:7b:2d:6e:6e:0c:97:c3:dc:33:ba:ae.
Are you sure you want to continue connecting (yes/no)?     yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
hadoop@localhost's password: // enter the password of your Linux login user
Welcome to Ubuntu 14.04.5 LTS (GNU/Linux 4.4.0-31-generic x86_64)
* Documentation: https://help.ubuntu.com/
277 packages can be updated.
183 updates are security updates.

The login in the previous step required a password; we want to make it passwordless.
First exit that ssh session to get back to the original terminal window, then generate a key pair with ssh-keygen and add the public key to the authorized keys:

hadoop@ubuntu:~$ exit
logout
Connection to localhost closed.
hadoop@ubuntu:~$ cd ~/.ssh/
hadoop@ubuntu:~/.ssh$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa): // press Enter
Enter passphrase (empty for no passphrase): // press Enter
Enter same passphrase again: // press Enter
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
f4:6b:33:97:43:2d:e4:f0:96:ca:e9:79:b2:f6:51:6c hadoop@ubuntu
The key's randomart image is:
+--[ RSA 2048]----+
(randomart omitted)
+-----------------+
hadoop@ubuntu:~/.ssh$ cat ./id_rsa.pub >> authorized_keys
hadoop@ubuntu:~/.ssh$

# Log in again with ssh localhost; this time no password is required, as shown below

hadoop@ubuntu:~/Desktop$ ssh localhost
Welcome to Ubuntu 17.10 (GNU/Linux 4.13.0-21-generic x86_64)
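
If ssh localhost still asks for a password at this point, overly open permissions on ~/.ssh are a common cause; a typical fix (not part of the original write-up) is:

chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys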

5. Start it up

Format the NameNode

$ bin/hadoop namenode -format    (run this from the directory above bin, i.e. the Hadoop home)

Start the Hadoop daemons

$ bin/start-all.sh

Error:

chown: changing ownership of '/home/root1/Hadoop/hadoop-1.2.1/libexec/../logs': Operation not permitted
starting namenode, logging to /home/root1/Hadoop/hadoop-1.2.1/libexec/../logs/hadoop-root1-namenode-ubuntu.out
/home/root1/Hadoop/hadoop-1.2.1/bin/hadoop-daemon.sh: line 137: /home/root1/Hadoop/hadoop-1.2.1/libexec/../logs/hadoop-root1-namenode-ubuntu.out: Permission denied

This happens because your user does not have enough permissions on the Hadoop directory. You can fix it with sudo chown -hR Eddie (your current username) hadoop-xxx (your Hadoop version), which gives your user ownership.
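
A concrete sketch of that chown, assuming the layout used earlier in this guide ($(whoami) just substitutes your current username):

cd ~/Hadoop
sudo chown -hR $(whoami) hadoop-1.2.1    # take ownership of the whole Hadoop tree, including the logs directory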

Alternatively, go up one level and make every folder under the Hadoop directory world-writable (turn them all into "commoner folders", haha): sudo chmod 777 -R ./hadoop-1.2.1 (the path after ./ is the directory whose contents you want to change).

Run bin/start-all.sh again and it will start successfully.

jps

6847 SecondaryNameNode
6517 NameNode
6935 JobTracker
7123 TaskTracker
6310 RunJar
7282 Jps
6682 DataNode
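
With the daemons up, the standard Hadoop 1.x web interfaces should also be reachable, and bin/stop-all.sh shuts everything down again (these are the default ports, not something specific to this write-up):

# NameNode web UI:   http://localhost:50070
# JobTracker web UI: http://localhost:50030
bin/stop-all.sh    # stop all daemons when you are done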

The above is the pseudo-distributed setup.

Standalone (single-machine) version:

By default, Hadoop is configured to run in non-distributed mode, as a single Java process, which makes it convenient for debugging.

The following example copies the unpacked conf directory to use as input and then finds and displays every match of the given regular expression. Output is written to the given output directory. 
$ mkdir input 
$ cp conf/*.xml input 
$ bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+' 
$ cat output/*

Problem:

INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
18/10/17 23:34:20 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
Still unresolved. Good night.
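
Not a confirmed fix for this particular case, but the usual first checks when connections to localhost:9000 keep being retried are whether the NameNode is actually running and whether anything is listening on that port:

jps                            # a NameNode process should appear in the list
netstat -tln | grep 9000       # is anything listening on port 9000?
bin/stop-all.sh                # if not, re-format HDFS and restart the daemons
bin/hadoop namenode -format
bin/start-all.sh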