
Spark Standalone Mode Installation and Configuration

1. Spark Download and Installation

Official download page: http://spark.apache.org/downloads.html

root@ubuntu:/usr/local# tar -zxvf spark-1.6.0-bin-hadoop2.6.tgz
root@ubuntu:/usr/local# cd spark-1.6.0-bin-hadoop2.6

2. Scala Download and Installation

Official download page: http://www.scala-lang.org/download/2.11.7.html

root@ubuntu:/usr/local# tar -zxvf scala-2.11.7.tgz
Configure the environment variables:

root@ubuntu:/usr/local# vi /etc/profile
# add the following lines
export SCALA_HOME=/usr/local/scala-2.11.7
export PATH=$SCALA_HOME/bin:$PATH
Run the following command to make the changes take effect:

root@ubuntu:/usr/local# source /etc/profile

Check the installed version:

root@ubuntu:/usr/local# scala -version
Scala code runner version 2.11.7 -- Copyright 2002-2013, LAMP/EPFL
3. Spark Configuration

root@ubuntu:/usr/local/spark-1.6.0-bin-hadoop2.6# cd conf
root@ubuntu:/usr/local/spark-1.6.0-bin-hadoop2.6/conf# ls
The directory contains only template files, so copy a template to create the corresponding configuration file, for example:

root@ubuntu:/usr/local/spark-1.6.0-bin-hadoop2.6/conf# cp spark-env.sh.template spark-env.sh
Edit the configuration files as needed.
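For example, a few settings commonly placed in spark-env.sh (the variable names are documented in the template itself; the values below are illustrative only):

export JAVA_HOME=/usr/local/jdk1.7.0      # example path; point this at your JDK
export SPARK_MASTER_IP=ubuntu             # host the master binds to (SPARK_MASTER_IP in Spark 1.6)
export SPARK_WORKER_CORES=2               # CPU cores each worker offers to applications
export SPARK_WORKER_MEMORY=1g             # memory each worker offers to applications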

4. Starting the Master

root@ubuntu:/usr/local/spark-1.6.0-bin-hadoop2.6# sbin/start-master.sh

By default, the Web UI is available at http://localhost:8080.
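To verify that the master started, jps (shipped with the JDK) should now list a Master process; the startup log under logs/ also records the spark://HOST:PORT master URL that workers and applications connect to:

root@ubuntu:/usr/local/spark-1.6.0-bin-hadoop2.6# jps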


5. Starting Workers

Similarly, you can start one or more workers and connect them to the master with:

./sbin/start-slave.sh <master-spark-URL>, for example:

root@ubuntu:/usr/local/spark-1.6.0-bin-hadoop2.6# sbin/start-slave.sh spark://ubuntu:7077

Refreshing the Web UI will now show the new worker in the Workers list.
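By default a worker offers all CPU cores and most of the machine's memory. The standalone scripts also document -c/--cores and -m/--memory flags for limiting this; a sketch with example values, following the 1.6 start-slave.sh convention of passing the master URL first:

root@ubuntu:/usr/local/spark-1.6.0-bin-hadoop2.6# sbin/start-slave.sh spark://ubuntu:7077 -c 2 -m 1G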



6. Testing

Run the following command to enter the interactive shell:

./bin/spark-shell --master spark://IP:PORT
root@ubuntu:/usr/local/spark-1.6.0-bin-hadoop2.6# bin/spark-shell --master spark://ubuntu:7077

Enter the following statements one by one:

scala> val textFile = sc.textFile("hdfs://hadoop:9000/user/root/input/a.txt")
scala> val counts = textFile.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
scala> counts.collect()
You should see output like:

res1: Array[(String, Int)] = Array((iceBox,1), (config,2), (text,1), (world.,1), (ice,2), (hello,2))

Note: the HDFS service must be running, and the directory and file above must already exist.
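If the input is missing, it can be prepared with standard HDFS commands; a sketch, assuming HDFS runs at hadoop:9000 and a local file a.txt exists:

root@ubuntu:/usr/local/spark-1.6.0-bin-hadoop2.6# hdfs dfs -mkdir -p /user/root/input
root@ubuntu:/usr/local/spark-1.6.0-bin-hadoop2.6# hdfs dfs -put a.txt /user/root/input/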

Run the following to save the results to HDFS:

scala> counts.saveAsTextFile("hdfs://hadoop:9000/user/root/output/test")
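Note that saveAsTextFile writes one part-NNNNN file per partition and fails if the output directory already exists. The result can be inspected with:

root@ubuntu:/usr/local/spark-1.6.0-bin-hadoop2.6# hdfs dfs -cat /user/root/output/test/part-*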

7. Stopping

root@ubuntu:/usr/local/spark-1.6.0-bin-hadoop2.6# sbin/stop-master.sh
root@ubuntu:/usr/local/spark-1.6.0-bin-hadoop2.6# sbin/stop-slave.sh

Note that stop-slave.sh stops the worker(s) on the local machine and takes no master URL.
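If conf/slaves lists the worker hosts (and password-less SSH to them is set up), sbin/stop-all.sh stops the master and all workers in one step, and sbin/start-all.sh starts them:

root@ubuntu:/usr/local/spark-1.6.0-bin-hadoop2.6# sbin/stop-all.sh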