
hadoop-2.6.0.tar.gz + spark-1.5.2-bin-hadoop2.6.tgz cluster setup (single node, Ubuntu)

Preface

A few questions, and a few lessons learned!

a. NAT, bridged, or host-only networking?

b. A static IP, or DHCP?

Answer: static.

c. Don't write off snapshots and clones as unimportant. Used more flexibly than most people use them, these little tricks save a lot of time and greatly reduce mistakes.

d. Make a habit of reusing scripting languages such as Python or shell.

 For example: the scp -r command, or deploy.conf (a configuration file), deploy.sh (a shell script that copies files to remote nodes), and runRemoteCmd.sh (a shell script that runs commands on remote nodes). A sketch of such a deploy.sh follows after this list.

e. VMware Tools is important; alternatively, upload with rz and download with sz.

f. Stick with what most people commonly use.
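Here is that deploy.sh sketch. It is an assumption about how such a script might look, not the author's original: it expects deploy.conf to hold one hostname per line, and it copies a file or directory to the same destination path on every node.

#!/bin/bash
# deploy.sh -- copy a file or directory to every node listed in deploy.conf
# (a sketch; assumes deploy.conf holds one hostname per line)
if [ $# -lt 2 ]; then
    echo "Usage: $0 <source> <dest-dir>"
    exit 1
fi
src=$1
dest=$2
while read -r host; do
    [ -z "$host" ] && continue      # skip blank lines
    scp -r "$src" "$host:$dest"     # relies on passwordless SSH (configured later)
done < deploy.conf

Usage example: ./deploy.sh /usr/local/jdk/jdk1.8.0_60 /usr/local/jdk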

What you will need:

  1. VMware-workstation-full-11.1.2.61471.1437365244.exe

  2. ubuntukylin-14.04-desktop-amd64.iso

  3. jdk-8u60-linux-x64.tar.gz

  4. hadoop-2.6.0.tar.gz

  5. scala-2.10.4.tgz

  6. spark-1.5.2-bin-hadoop2.6.tgz

Machine plan:

  192.168.80.128   ----------------  SparkSingleNode

Directory plan:

  1. Download directory

   /home/spark/Downloads/Spark_Cluster_Software  ----------------  holds all the installation packages

  2. New directories

  3. Installation directories

  jdk-8u60-linux-x64.tar.gz  --------------------------------------------------  /usr/local/jdk/jdk1.8.0_60

  hadoop-2.6.0.tar.gz ----------------------------------------------------------  /usr/local/hadoop/hadoop-2.6.0

  scala-2.10.4.tgz --------------------------------------------------------------- /usr/local/scala/scala-2.10.4

  spark-1.5.2-bin-hadoop2.6.tgz ---------------------------------------------- /usr/local/spark/spark-1.5.2-bin-hadoop2.6

4. Snapshot plan

  Snapshot 1:

    Fresh install, with working network access

  Snapshot 2:

    root user enabled, vim installed, ssh installed, static IP set, /etc/hostname and /etc/hosts configured, firewall permanently disabled

    (Passwordless SSH configuration comes later.)
      The static IP is 192.168.80.128
      /etc/hostname is SparkSingleNode
      /etc/hosts contains
      192.168.80.128  SparkSingleNode

  Snapshot 3:

    JDK installed, Scala installed, passwordless SSH configured, python and ipython installed (you can skip this; Ubuntu ships with Python)

    spark user created (i.e. the jdk, scala, SSH, hadoop, and spark work is all done as the spark user)

  Snapshot 4:

    hadoop installed (not yet formatted), lrzsz installed, default config files replaced with your own, directories created

  Snapshot 5:

    hadoop formatted successfully, processes start normally

  Snapshot 6:

    spark installed and configured

  Snapshot 7:

    hadoop and spark started successfully; check the web UIs on ports 50070 (HDFS NameNode), 8088 (YARN ResourceManager), 8080 (Spark master), and 4040 (running Spark application)

Step 1:

    Install the VMware Workstation virtual machine; here I use VMware Workstation 11.

    For details, see ->

Step 2:

    Install the ubuntukylin-14.04-desktop system (an English-language install is recommended)

    For details, see ->

Step 3: Install the VMware Tools enhancements

    For details, see ->

Step 4: Small preparatory tweaks (learn to use snapshots and clones; snapshot at points that suit your own workflow)

    For details, see ->

    1. Enable the root user (Ubuntu installs without an active root account)

    2. Install the vim editor

    3. Install ssh (the passwordless configuration comes later)

    4. Set a static IP (a sample configuration follows the /etc/hosts output below)

    5. /etc/hostname and /etc/hosts

root@SparkSingleNode:~# sudo cat /etc/hostname

SparkSingleNode

root@SparkSingleNode:~# sudo cat /etc/hosts

127.0.0.1 localhost

127.0.1.1 zhouls-virtual-machine

192.168.80.128  SparkSingleNode

# The following lines are desirable for IPv6 capable hosts

::1     ip6-localhost ip6-loopback

fe00::0 ip6-localnet

ff00::0 ip6-mcastprefix

ff02::1 ip6-allnodes

ff02::2 ip6-allrouters
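As referenced in item 4 above, here is a sketch of a static-IP setup for Ubuntu 14.04 in /etc/network/interfaces. The interface name, gateway, and DNS values are assumptions (VMware's NAT gateway is usually x.x.x.2); adjust them to your own virtual network:

# /etc/network/interfaces (sketch; eth0 and 192.168.80.2 are assumptions)
auto eth0
iface eth0 inet static
    address 192.168.80.128
    netmask 255.255.255.0
    gateway 192.168.80.2
    dns-nameservers 192.168.80.2

After editing, restart networking (sudo /etc/init.d/networking restart) or reboot.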

  6. Permanently disable the firewall

        When building a hadoop/spark cluster it is generally best to disable the firewall permanently, because the firewall can block communication between the cluster's processes.

root@SparkSingleNode:~# sudo ufw status

Status: inactive

root@SparkSingleNode:~#

This shows that Ubuntu 14.04 leaves the firewall disabled by default.
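If ufw were active, it can be turned off permanently (the setting survives reboots) with:

sudo ufw disable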

 

      (In a multi-node setup, every machine gets the same treatment; this guide has only the one node.)

 This point stayed fuzzy for me for a long time!!!

  In production, the habit is as follows:

  useradd by default creates a group with the same name and adds the new user to it. For example, after useradd zhouls, the user is placed in a zhouls group of the same name.

 But in production this is generally not done that way. Use useradd -m -g instead; otherwise the user gets created but ends up with no home directory. Be careful!!! (Important enough to say three times.)

################### On Ubuntu ###########################

Step 1: sudo groupadd <new group>

sudo groupadd spark    # creates the spark group

Step 2: sudo useradd -m -g <existing group> <new user>

sudo useradd -m -g spark spark    # creates the spark user with a home directory and adds it to the spark group

Step 3: sudo passwd <user>

sudo passwd spark    # set the spark user's password

Changing password for user spark

New password:

Retype new password:

###################################

  

root@SparkSingleNode:~# sudo groupadd spark
root@SparkSingleNode:~# sudo useradd -m -g spark spark
root@SparkSingleNode:~# sudo passwd spark
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
root@SparkSingleNode:~# su spark
spark@SparkSingleNode:/root$ cd
spark@SparkSingleNode:~$ pwd
/home/spark
spark@SparkSingleNode:~$

A mental map before installing:

  ***********************************************************************************
  *  Languages  ->  hadoop cluster  ->  spark cluster
  *  1. Install the JDK
  *  2. Install Scala
  *  3. Configure passwordless SSH login (SparkSingleNode to itself)
  *  4. Install python and ipython (skippable; Ubuntu ships with Python)
  *  5. Install hadoop
  *  6. Install spark
  *  7. Start the cluster
  *  8. Check the web UIs
  *  9. Done (remember to snapshot)
  ***********************************************************************************

    Download with wget and get into the habit of putting everything in /home/spark/Downloads/Spark_Cluster_Software/; or, if you installed the VMware Tools enhancements, simply drag the files in. Either works.
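For example (the archive URL is an assumption; any Apache archive mirror carrying hadoop-2.6.0 will do):

wget -P /home/spark/Downloads/Spark_Cluster_Software \
  http://archive.apache.org/dist/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz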

 

 

I. Install the JDK

  jdk-8u60-linux-x64.tar.gz  --------------------------------------------------  /usr/local/jdk/jdk1.8.0_60

    1. Download jdk-8u60-linux-x64.tar.gz

    

    2. Upload jdk-8u60-linux-x64.tar.gz

  

  


  3. First, check Ubuntu's bundled OpenJDK

  

spark@SparkSingleNode:~$ java -version
The program 'java' can be found in the following packages:
* default-jre
* gcj-4.8-jre-headless
* openjdk-7-jre-headless
* gcj-4.6-jre-headless
* openjdk-6-jre-headless
Ask your administrator to install one of them
spark@SparkSingleNode:~$ sudo apt-get purge openjdk*
[sudo] password for spark:
spark is not in the sudoers file. This incident will be reported.
spark@SparkSingleNode:~$

  As you can see, this Ubuntu system has no OpenJDK preinstalled.

Ran into the "XXX is not in the sudoers file. This incident will be reported." problem?

     Solution:
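(The original post showed the fix as a screenshot. What it most likely contained, stated here as an assumption, is adding the spark user to the sudo group from a root shell:)

su root
usermod -aG sudo spark    # give spark sudo rights on Ubuntu
exit                      # log out and back in as spark for this to take effect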

spark@SparkSingleNode:~$ sudo apt-get purge openjdk*
Reading package lists... Done
Building dependency tree
Reading state information... Done
Note, selecting 'openjdk-jre' for regex 'openjdk*'
Note, selecting 'openjdk-6-jre-lib' for regex 'openjdk*'
Note, selecting 'openjdk-7' for regex 'openjdk*'
Note, selecting 'openjdk-6-jdk' for regex 'openjdk*'
Note, selecting 'openjdk-7-jre-zero' for regex 'openjdk*'
Note, selecting 'openjdk-6-source' for regex 'openjdk*'
Note, selecting 'openjdk-6-jre-headless' for regex 'openjdk*'
Note, selecting 'openjdk-6-dbg' for regex 'openjdk*'
Note, selecting 'openjdk-7-jdk' for regex 'openjdk*'
Note, selecting 'openjdk-7-jre-headless' for regex 'openjdk*'
Note, selecting 'openjdk-6-jre' for regex 'openjdk*'
Note, selecting 'openjdk-7-dbg' for regex 'openjdk*'
Note, selecting 'openjdk-7-jre-lib' for regex 'openjdk*'
Note, selecting 'uwsgi-plugin-jvm-openjdk-6' for regex 'openjdk*'
Note, selecting 'uwsgi-plugin-jvm-openjdk-7' for regex 'openjdk*'
Note, selecting 'openjdk-6-doc' for regex 'openjdk*'
Note, selecting 'openjdk-7-jre' for regex 'openjdk*'
Note, selecting 'openjdk-7-source' for regex 'openjdk*'
Note, selecting 'openjdk-6-jre-zero' for regex 'openjdk*'
Note, selecting 'openjdk-7-demo' for regex 'openjdk*'
Note, selecting 'openjdk-7-doc' for regex 'openjdk*'
Note, selecting 'openjdk-6-demo' for regex 'openjdk*'
Note, selecting 'uwsgi-plugin-jwsgi-openjdk-6' for regex 'openjdk*'
Note, selecting 'uwsgi-plugin-jwsgi-openjdk-7' for regex 'openjdk*'
Package 'openjdk-7' is not installed, so not removed
Package 'openjdk-jre' is not installed, so not removed
Package 'uwsgi-plugin-jvm-openjdk-6' is not installed, so not removed
Package 'uwsgi-plugin-jvm-openjdk-7' is not installed, so not removed
Package 'uwsgi-plugin-jwsgi-openjdk-6' is not installed, so not removed

4. Now create a jdk directory under /usr/local/

  

spark@SparkSingleNode:~$ su root
Password:
root@SparkSingleNode:/home/spark# cd
root@SparkSingleNode:~# mkdir -p /usr/local/jdk
root@SparkSingleNode:~# cd /usr/local/jdk/
root@SparkSingleNode:/usr/local/jdk# ls
root@SparkSingleNode:/usr/local/jdk#

 5. Copy the downloaded JDK file into the newly created /usr/local/jdk

  

root@SparkSingleNode:/usr/local/jdk# su spark
spark@SparkSingleNode:/usr/local/jdk$ sudo cp /home/spark/Downloads/Spark_Cluster_Software/jdk-8u60-linux-x64.tar.gz /usr/local/jdk/
spark@SparkSingleNode:/usr/local/jdk$ cd /usr/local/jdk/
spark@SparkSingleNode:/usr/local/jdk$ ls
jdk-8u60-linux-x64.tar.gz
spark@SparkSingleNode:/usr/local/jdk$

   Prefer cp here; don't reach for mv lightly (cp leaves the original download intact).

  6. Unpack the JDK tarball

    

spark@SparkSingleNode:/usr/local/jdk$ ll
total 177000
drwxr-xr-x 2 root root 4096 9月 9 09:34 ./
drwxr-xr-x 11 root root 4096 9月 9 09:07 ../
-rwxr--r-- 1 root root 181238643 9月 9 09:34 jdk-8u60-linux-x64.tar.gz*
spark@SparkSingleNode:/usr/local/jdk$ su root
Password:
root@SparkSingleNode:/usr/local/jdk# ll
total 177000
drwxr-xr-x 2 root root 4096 9月 9 09:34 ./
drwxr-xr-x 11 root root 4096 9月 9 09:07 ../
-rwxr--r-- 1 root root 181238643 9月 9 09:34 jdk-8u60-linux-x64.tar.gz*
root@SparkSingleNode:/usr/local/jdk# ls
jdk-8u60-linux-x64.tar.gz
root@SparkSingleNode:/usr/local/jdk# tar -zxvf jdk-8u60-linux-x64.tar.gz

  7. Delete the tarball, keep the unpacked directory, and change its ownership (this is the most important part!)

  

root@SparkSingleNode:/usr/local/jdk# ll
total 177004
drwxr-xr-x 3 root root 4096 9月 9 09:54 ./
drwxr-xr-x 11 root root 4096 9月 9 09:07 ../
drwxr-xr-x 8 uucp 143 4096 8月 5 2015 jdk1.8.0_60/
-rwxr--r-- 1 root root 181238643 9月 9 09:34 jdk-8u60-linux-x64.tar.gz*
root@SparkSingleNode:/usr/local/jdk# ls
jdk1.8.0_60 jdk-8u60-linux-x64.tar.gz
root@SparkSingleNode:/usr/local/jdk# rm -rf jdk-8u60-linux-x64.tar.gz
root@SparkSingleNode:/usr/local/jdk# ls
jdk1.8.0_60
root@SparkSingleNode:/usr/local/jdk# ll
total 12
drwxr-xr-x 3 root root 4096 9月 9 09:55 ./
drwxr-xr-x 11 root root 4096 9月 9 09:07 ../
drwxr-xr-x 8 uucp 143 4096 8月 5 2015 jdk1.8.0_60/
root@SparkSingleNode:/usr/local/jdk#

root@SparkSingleNode:/usr/local/jdk# ll
total 12
drwxr-xr-x 3 root root 4096 9月 9 09:55 ./
drwxr-xr-x 11 root root 4096 9月 9 09:07 ../
drwxr-xr-x 8 uucp 143 4096 8月 5 2015 jdk1.8.0_60/
root@SparkSingleNode:/usr/local/jdk# chown -R spark:spark jdk1.8.0_60/
root@SparkSingleNode:/usr/local/jdk# ll
total 12
drwxr-xr-x 3 root root 4096 9月 9 09:55 ./
drwxr-xr-x 11 root root 4096 9月 9 09:07 ../
drwxr-xr-x 8 spark spark 4096 8月 5 2015 jdk1.8.0_60/
root@SparkSingleNode:/usr/local/jdk# su spark
spark@SparkSingleNode:/usr/local/jdk$ ll
total 12
drwxr-xr-x 3 root root 4096 9月 9 09:55 ./
drwxr-xr-x 11 root root 4096 9月 9 09:07 ../
drwxr-xr-x 8 spark spark 4096 8月 5 2015 jdk1.8.0_60/
spark@SparkSingleNode:/usr/local/jdk$

***********************************************

  chown -R <user>:<group> <file>   (note the order: owner first, then group)

  Alternatively, you could have named the group sparkuser when creating it earlier, with users spark1, spark2, ... inside it.

  The command would then be: chown -R spark1:sparkuser jdk1.8.0_60

  Likewise, to fix ownership of the directory unpacked from hadoop-2.6.0.tar.gz:

  chown -R hduser:hadoop hadoop-2.6.0

**********************************************

8. Set environment variables

  vim ~/.bash_profile   or   vim /etc/profile

  You can configure this in the per-user file ~/.bash_profile, or just as well in the global file /etc/profile.

  Here, I use vim /etc/profile

  

#java
export JAVA_HOME=/usr/local/jdk/jdk1.8.0_60
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin

  

root@SparkSingleNode:/usr/local/jdk/jdk1.8.0_60# vim /etc/profile
root@SparkSingleNode:/usr/local/jdk/jdk1.8.0_60# source /etc/profile
root@SparkSingleNode:/usr/local/jdk/jdk1.8.0_60# java -version
java version "1.8.0_60"
Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)
root@SparkSingleNode:/usr/local/jdk/jdk1.8.0_60#

spark@SparkSingleNode:/usr/local/jdk/jdk1.8.0_60$ java -version
java version "1.8.0_60"
Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)
spark@SparkSingleNode:/usr/local/jdk/jdk1.8.0_60$

  At this point, Java is installed.


 II. Install Scala

    scala-2.10.4.tgz --------------------------------------------------------------- /usr/local/scala/scala-2.10.4

  1. Download Scala

  

  2. Upload scala-2.10.4.tgz

   


  3. Now create a scala directory under /usr/local/

    

root@SparkSingleNode:/usr/local# pwd
/usr/local
root@SparkSingleNode:/usr/local# mkdir -p /usr/local/scala
root@SparkSingleNode:/usr/local#

  4. Copy the downloaded Scala file into the newly created /usr/local/scala

  

root@SparkSingleNode:/usr/local/scala# pwd
/usr/local/scala
root@SparkSingleNode:/usr/local/scala# ls
root@SparkSingleNode:/usr/local/scala# sudo cp /home/spark/Downloads/Spark_Cluster_Software/scala-2.10.4.tgz /usr/local/scala/
root@SparkSingleNode:/usr/local/scala# ls
scala-2.10.4.tgz
root@SparkSingleNode:/usr/local/scala#

Prefer cp; don't reach for mv lightly.

   5. Unpack the Scala tarball

  

root@SparkSingleNode:/usr/local/scala# pwd
/usr/local/scala
root@SparkSingleNode:/usr/local/scala# ls
scala-2.10.4.tgz
root@SparkSingleNode:/usr/local/scala# ll
total 29244
drwxr-xr-x 2 root root 4096 9月 9 10:15 ./
drwxr-xr-x 12 root root 4096 9月 9 10:14 ../
-rwxr--r-- 1 root root 29937534 9月 9 10:15 scala-2.10.4.tgz*
root@SparkSingleNode:/usr/local/scala# tar -zxvf scala-2.10.4.tgz

  6. Delete the tarball, keep the unpacked directory, and change its ownership (the most important part!!!)

  

root@SparkSingleNode:/usr/local/scala# ls
scala-2.10.4 scala-2.10.4.tgz
root@SparkSingleNode:/usr/local/scala# ll
total 29248
drwxr-xr-x 3 root root 4096 9月 9 10:17 ./
drwxr-xr-x 12 root root 4096 9月 9 10:14 ../
drwxrwxr-x 9 2000 2000 4096 3月 18 2014 scala-2.10.4/
-rwxr--r-- 1 root root 29937534 9月 9 10:15 scala-2.10.4.tgz*
root@SparkSingleNode:/usr/local/scala# rm -rf scala-2.10.4.tgz
root@SparkSingleNode:/usr/local/scala# ll
total 12
drwxr-xr-x 3 root root 4096 9月 9 10:18 ./
drwxr-xr-x 12 root root 4096 9月 9 10:14 ../
drwxrwxr-x 9 2000 2000 4096 3月 18 2014 scala-2.10.4/
root@SparkSingleNode:/usr/local/scala# chown -R spark:spark scala-2.10.4/
root@SparkSingleNode:/usr/local/scala# ll
total 12
drwxr-xr-x 3 root root 4096 9月 9 10:18 ./
drwxr-xr-x 12 root root 4096 9月 9 10:14 ../
drwxrwxr-x 9 spark spark 4096 3月 18 2014 scala-2.10.4/
root@SparkSingleNode:/usr/local/scala#

 7. Set environment variables

  vim ~/.bash_profile   or   vim /etc/profile

  You can configure this in the per-user file ~/.bash_profile, or just as well in the global file /etc/profile.

  Here, I use vim /etc/profile

  #scala
  export SCALA_HOME=/usr/local/scala/scala-2.10.4
  export PATH=$PATH:$SCALA_HOME/bin

root@SparkSingleNode:/usr/local/scala/scala-2.10.4# vim /etc/profile
root@SparkSingleNode:/usr/local/scala/scala-2.10.4# source /etc/profile
root@SparkSingleNode:/usr/local/scala/scala-2.10.4# scala -version
Scala code runner version 2.10.4 -- Copyright 2002-2013, LAMP/EPFL
root@SparkSingleNode:/usr/local/scala/scala-2.10.4#

At this point, Scala is installed.


8. Typing the scala command drops you straight into the Scala REPL.

root@SparkSingleNode:/usr/local/scala/scala-2.10.4# scala
Welcome to Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_60).
Type in expressions to have them evaluated.
Type :help for more information.

scala> 9*9
res0: Int = 81

scala> exit;
warning: there were 1 deprecation warning(s); re-run with -deprecation for details
root@SparkSingleNode:/usr/local/scala/scala-2.10.4#

III. Configure passwordless SSH login

  1. To configure passwordless SSH verification, first switch to the newly created spark user.

    I will build the hadoop cluster first and the spark cluster on top of it, doing all of the work as the spark user.

  To recap: root and zhouls both carry administrator privileges, and in production you generally would not use those administrator accounts directly.

  Spark needs passwordless login to its worker nodes. In a single-node deployment the current node is both master and worker, so it needs a passwordless ssh key for itself. Here is how:

   

root@SparkSingleNode:/usr/local/scala/scala-2.10.4# cd
root@SparkSingleNode:~# su spark
spark@SparkSingleNode:/root$ cd
spark@SparkSingleNode:~$ pwd
/home/spark
spark@SparkSingleNode:~$

  2. Create the .ssh directory and generate a key pair

  mkdir .ssh

  ssh-keygen -t rsa    (note: ssh-keygen is a single hyphenated command, with no space between ssh and keygen)

 

spark@SparkSingleNode:~$ mkdir .ssh
spark@SparkSingleNode:~$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/spark/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/spark/.ssh/id_rsa.
Your public key has been saved in /home/spark/.ssh/id_rsa.pub.
The key fingerprint is:
85:28:3f:f3:5b:47:3a:1d:bb:ed:6c:59:af:3e:9f:6b spark@SparkSingleNode
The key's randomart image is:
+--[ RSA 2048]----+
| |
| . . |
| . . . . |
| o . |
| + S o |
| + + o .|
| . + + o.|
| o o ++Eo|
| . .+B*o|
+-----------------+
spark@SparkSingleNode:~$

  3. Switch into the .ssh directory and look at the public and private keys

  cd .ssh

  ls

  

spark@SparkSingleNode:~$ cd .ssh
spark@SparkSingleNode:~/.ssh$ ls
id_rsa id_rsa.pub
spark@SparkSingleNode:~/.ssh$

  4. Copy the public key into the authorized_keys file, then check that the copy succeeded

  cp id_rsa.pub authorized_keys

  ls

   

spark@SparkSingleNode:~/.ssh$ cp id_rsa.pub authorized_keys
spark@SparkSingleNode:~/.ssh$ ls
authorized_keys id_rsa id_rsa.pub
spark@SparkSingleNode:~/.ssh$

  5. View the contents of authorized_keys

spark@SparkSingleNode:~/.ssh$ cat authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCqLwZVCWJOQT57Y9MAYw8YJtzqvJTnBob656jvKgLSaM5X8/cikS0HHGlfNqzldbP03+Z6ZrpaF2hyEV1v43kOhlqA9SFwTVhzbPzou2K0e7mgCjJlM4PQMOSZY+DUlHn08hDxdbgAhczj6pix4VNSORg2nBRLvk1CDFYSiviv+FRTxy4IhYfG0M74fOE/9jHnbXKNRmryexzSwEylVqISQFmt5X5ksqurTsIxc2M70mGnkoTAVNOMC/qNVw98FsTBwFLT9J8X3vtic7nn5PjLNi/Khyc/vOhiDpzRsJJ7r7BuaKvd/ENIu9WAjvSGvJKLfqx6SSGcociom7ol1S/Z spark@SparkSingleNode
spark@SparkSingleNode:~/.ssh$

  6. Go back to /home/spark and set the permissions

  cd ..

  chmod 700 .ssh    (the .ssh directory itself gets 700)

  chmod 600 .ssh/*    (the files inside it, id_rsa, id_rsa.pub, and authorized_keys, get 600)

  

spark@SparkSingleNode:~/.ssh$ cd ..
spark@SparkSingleNode:~$ pwd
/home/spark
spark@SparkSingleNode:~$ chmod 700 .ssh
spark@SparkSingleNode:~$ chmod 600 .ssh/*
spark@SparkSingleNode:~$

  7. Test passwordless ssh access

spark@SparkSingleNode:~$ ssh SparkSingleNode
The authenticity of host 'sparksinglenode (192.168.80.128)' can't be established.
ECDSA key fingerprint is c7:ae:2f:38:e6:88:6f:ed:ee:f0:14:d8:98:f4:9e:3b.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'sparksinglenode,192.168.80.128' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-24-generic x86_64)

* Documentation: https://help.ubuntu.com/

Last login: Fri Sep 9 08:51:53 2016 from 192.168.80.1
$ pwd
/home/spark

spark@SparkSingleNode:~$ ssh SparkSingleNode
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-24-generic x86_64)

* Documentation: https://help.ubuntu.com/

Last login: Fri Sep 9 10:35:48 2016 from sparksinglenode
$ exit;
Connection to sparksinglenode closed.
spark@SparkSingleNode:~$

  IV. Install python and ipython (you can skip this; Ubuntu ships with Python)

    The default install location is under /usr/lib/

      

spark@SparkSingleNode:~$ sudo apt-get install python ipython -y
Reading package lists... Done
Building dependency tree
Reading state information... Done
python is already the newest version.
The following extra packages will be installed:
python-decorator python-simplegeneric
Suggested packages:
ipython-doc ipython-notebook ipython-qtconsole python-matplotlib python-numpy python-zmq
The following NEW packages will be installed:
ipython python-decorator python-simplegeneric
0 upgraded, 3 newly installed, 0 to remove and 740 not upgraded.
Need to get 619 kB of archives.
After this operation, 3,436 kB of additional disk space will be used.
Get:1 http://cn.archive.ubuntu.com/ubuntu/ trusty/main python-decorator all 3.4.0-2build1 [19.2 kB]
Get:2 http://cn.archive.ubuntu.com/ubuntu/ trusty/main python-simplegeneric all 0.8.1-1 [11.5 kB]
Get:3 http://cn.archive.ubuntu.com/ubuntu/ trusty/universe ipython all 1.2.1-2 [588 kB]
Fetched 619 kB in 31s (19.8 kB/s)
Selecting previously unselected package python-decorator.
(Reading database ... 147956 files and directories currently installed.)
Preparing to unpack .../python-decorator_3.4.0-2build1_all.deb ...
Unpacking python-decorator (3.4.0-2build1) ...
Selecting previously unselected package python-simplegeneric.
Preparing to unpack .../python-simplegeneric_0.8.1-1_all.deb ...
Unpacking python-simplegeneric (0.8.1-1) ...
Selecting previously unselected package ipython.
Preparing to unpack .../ipython_1.2.1-2_all.deb ...
Unpacking ipython (1.2.1-2) ...
Processing triggers for man-db (2.6.7.1-1) ...
Processing triggers for hicolor-icon-theme (0.13-1) ...
Processing triggers for gnome-menus (3.10.1-0ubuntu2) ...
Processing triggers for desktop-file-utils (0.22-1ubuntu1) ...
Processing triggers for bamfdaemon (0.5.1+14.04.20140409-0ubuntu1) ...

Rebuilding /usr/share/applications/bamf-2.index...
Processing triggers for mime-support (3.54ubuntu1) ...
Setting up python-decorator (3.4.0-2build1) ...
Setting up python-simplegeneric (0.8.1-1) ...
Setting up ipython (1.2.1-2) ...
spark@SparkSingleNode:~$

   Test whether the install succeeded

  

spark@SparkSingleNode:~$ python --version
Python 2.7.6
spark@SparkSingleNode:~$ ipython --version
1.2.1
spark@SparkSingleNode:~$

  One more note about ipython:

  IPYTHON and IPYTHON_OPTS are removed in Spark 2.0+. Remove these from the environment and set PYSPARK_DRIVER_PYTHON and PYSPARK_DRIVER_PYTHON_OPTS instead.
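On Spark 2.0+, the equivalent setting would look like this (a sketch):

export PYSPARK_DRIVER_PYTHON=ipython
# optionally, e.g. for the notebook interface:
# export PYSPARK_DRIVER_PYTHON_OPTS=notebook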

   python can now be run from any directory.

spark@SparkSingleNode:~$ python
Python 2.7.6 (default, Mar 22 2014, 22:59:56)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> exit()
spark@SparkSingleNode:~$ cd /usr/local/
spark@SparkSingleNode:/usr/local$ python
Python 2.7.6 (default, Mar 22 2014, 22:59:56)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> exit()
spark@SparkSingleNode:/usr/local$ cd /usr/lib/
spark@SparkSingleNode:/usr/lib$ python
Python 2.7.6 (default, Mar 22 2014, 22:59:56)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> exit()
spark@SparkSingleNode:/usr/lib$

  

   V. Install hadoop

    hadoop-2.6.0.tar.gz ----------------------------------------------------------  /usr/local/hadoop/hadoop-2.6.0

  1. Download hadoop

  

  2. Upload hadoop-2.6.0.tar.gz

        

  3. Now create a hadoop directory under /usr/local/

          

root@SparkSingleNode:/usr/local# pwd
/usr/local
root@SparkSingleNode:/usr/local# mkdir -p /usr/local/hadoop
root@SparkSingleNode:/usr/local# ls
bin etc games hadoop include jdk lib man sbin scala share src
root@SparkSingleNode:/usr/local# cd hadoop/
root@SparkSingleNode:/usr/local/hadoop# pwd
/usr/local/hadoop
root@SparkSingleNode:/usr/local/hadoop# ls
root@SparkSingleNode:/usr/local/hadoop#

  4. Copy the downloaded hadoop file into the newly created /usr/local/hadoop

  Prefer cp; don't reach for mv lightly.

root@SparkSingleNode:/usr/local/hadoop# sudo cp /home/spark/Downloads/Spark_Cluster_Software/hadoop-2.6.0.tar.gz /usr/local/hadoop/
root@SparkSingleNode:/usr/local/hadoop# ls
hadoop-2.6.0.tar.gz
root@SparkSingleNode:/usr/local/hadoop#

  5. Unpack the hadoop tarball

  

root@SparkSingleNode:/usr/local/hadoop# ls
hadoop-2.6.0.tar.gz
root@SparkSingleNode:/usr/local/hadoop# tar -zxvf hadoop-2.6.0.tar.gz

6. Delete the tarball, keep the unpacked directory,

      and change its owner and group (the most important part!)

  

root@SparkSingleNode:/usr/local/hadoop# ls
hadoop-2.6.0 hadoop-2.6.0.tar.gz
root@SparkSingleNode:/usr/local/hadoop# rm -rf hadoop-2.6.0.tar.gz
root@SparkSingleNode:/usr/local/hadoop# ls
hadoop-2.6.0
root@SparkSingleNode:/usr/local/hadoop# ll
total 12
drwxr-xr-x 3 root root 4096 9月 9 11:33 ./
drwxr-xr-x 13 root root 4096 9月 9 11:28 ../
drwxr-xr-x 9 20000 20000 4096 11月 14 2014 hadoop-2.6.0/
root@SparkSingleNode:/usr/local/hadoop# chown -R spark:spark hadoop-2.6.0/
root@SparkSingleNode:/usr/local/hadoop# ll
total 12
drwxr-xr-x 3 root root 4096 9月 9 11:33 ./
drwxr-xr-x 13 root root 4096 9月 9 11:28 ../
drwxr-xr-x 9 spark spark 4096 11月 14 2014 hadoop-2.6.0/
root@SparkSingleNode:/usr/local/hadoop#

7. Set environment variables

    vim ~/.bash_profile   or   vim /etc/profile

    You can configure this in the per-user file ~/.bash_profile, or just as well in the global file /etc/profile.

    Here, I use vim /etc/profile

  

#hadoop
export HADOOP_HOME=/usr/local/hadoop/hadoop-2.6.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

root@SparkSingleNode:/usr/local/hadoop# vim /etc/profile
root@SparkSingleNode:/usr/local/hadoop# source /etc/profile
root@SparkSingleNode:/usr/local/hadoop# hadoop version
Hadoop 2.6.0
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r e3496499ecb8d220fba99dc5ed4c99c8f9e33bb1
Compiled by jenkins on 2014-11-13T21:10Z
Compiled with protoc 2.5.0
From source with checksum 18e43357c8f927c0695f1e9522859d6a
This command was run using /usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/hadoop-common-2.6.0.jar
root@SparkSingleNode:/usr/local/hadoop#

 At this point, hadoop itself is installed.

  Next, edit hadoop's configuration files.

    From experience, it's easiest to get them right in Notepad++ first and then push them up.

  Unpack the tarball on Windows, open its config files there, and write them out.
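For instance, an edited file can be pushed back to the VM with scp (the paths below match this guide; rz from the lrzsz package works just as well):

scp core-site.xml spark@192.168.80.128:/usr/local/hadoop/hadoop-2.6.0/etc/hadoop/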

Core (core-site.xml):

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://SparkSingleNode:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/hadoop-2.6.0/tmp</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.groups</name>
    <value>*</value>
  </property>
</configuration>

    The proxy-user settings above exist because hadoop 1.0 introduced a security mechanism under which every job submitted from a client appears to come from the hadoop user, regardless of the original submitter. To solve this, hadoop added an impersonation (proxy user) feature: it allows one superuser to submit jobs or run commands on behalf of other users, while externally the executor still appears to be the ordinary user. Hence hosts is set to allow any client (*) and groups to allow any user group (*).

 Storage (hdfs-site.xml):

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/usr/local/hadoop/hadoop-2.6.0/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/usr/local/hadoop/hadoop-2.6.0/dfs/data</value>
  </property>
 </configuration>

Compute (mapred-site.xml):

  Hadoop 2.6.0 ships only a mapred-site.xml.template here; it becomes mapred-site.xml (copy it, then edit the copy). The original screenshots for this step are missing.
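As a sketch of what a typical single-node mapred-site.xml for this setup contains (an assumption, not the author's verbatim file), MapReduce is pointed at YARN:

cp mapred-site.xml.template mapred-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>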
