
Installing Spark 1.2.0 on Docker

It has been a long time since I wrote a blog post; I have a bit of time lately, so I plan to write something.

1. What is Docker

Docker is an open-source project born in early 2013, originally an internal side project at dotCloud. It is implemented in Go, the programming language released by Google. The project later joined the Linux Foundation, adopted the Apache 2.0 license, and its code is maintained on GitHub.

Since being open-sourced, Docker has received broad attention and discussion, so much so that dotCloud later renamed itself Docker Inc. Red Hat has integrated Docker support into RHEL 6.5, and Google uses it widely in its PaaS products.

The goal of the Docker project is a lightweight solution for operating-system-level virtualization. Docker builds on technologies such as Linux Containers (LXC).

On top of LXC, Docker adds a further layer of packaging so that users do not have to deal with container management themselves, which makes operation much simpler. Working with a Docker container feels as simple as operating a fast, lightweight virtual machine.

The image below compares Docker with traditional virtualization: containers virtualize at the operating-system level and reuse the host's operating system directly, whereas traditional virtualization works at the hardware level.


2. Why use Docker

As an emerging virtualization approach, Docker has many advantages over traditional virtualization.

First, Docker containers start in seconds, far faster than traditional virtual machines. Second, Docker uses system resources very efficiently: a single host can run thousands of Docker containers at once.

Beyond the application it runs, a container consumes almost no extra system resources, so applications perform well and system overhead stays minimal. To run 10 different applications the traditional way, you would boot 10 virtual machines; with Docker you just start 10 isolated applications.

Specifically, Docker has clear advantages in the following areas.

Faster delivery and deployment

What developers and operations (DevOps) staff want most is to build or configure once and run correctly anywhere.

Developers can use a standard image to build a development container, and once development is done, operations staff can deploy the code using that same container. Docker creates containers quickly, supports fast iteration on the application, and keeps the whole process visible, which makes it easier for the rest of the team to understand how the application is built and how it works. Docker containers are light and fast: startup takes seconds, which saves a great deal of development, testing, and deployment time.

More efficient virtualization

Docker containers run without an extra hypervisor; the virtualization happens at the kernel level, which allows higher performance and efficiency.

Easier migration and scaling

Docker containers run on almost any platform: physical machines, virtual machines, public clouds, private clouds, personal computers, servers, and so on. This compatibility lets users move an application directly from one platform to another.

Simpler management

With Docker, a small change can replace what used to be a large amount of update work. All changes are distributed and applied incrementally, which makes management automated and efficient.

Summary: containers vs. traditional virtual machines

| Feature | Container | Virtual machine |
| --- | --- | --- |
| Startup | Seconds | Minutes |
| Disk usage | Typically MB | Typically GB |
| Performance | Near native | Slower than native |
| Density per host | Thousands of containers | Typically a few dozen |


3. Installing Docker on CentOS

On CentOS 7, Docker is already included in the CentOS-Extras repository, so it can be installed directly:

$ sudo yum install docker

After installation, start the Docker service and set it to load automatically at boot:

$ sudo service docker start

$ sudo chkconfig docker on
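CentOS 7 actually manages services with systemd (the `service` and `chkconfig` commands above are redirected to it), so it is worth knowing the native equivalents and verifying the installation afterwards. A minimal sketch, assuming the stock `docker` package and the standard `hello-world` test image:

```shell
# Native systemd equivalents of "service docker start" / "chkconfig docker on"
sudo systemctl start docker
sudo systemctl enable docker

# Sanity check: client and daemon should both report a version
sudo docker version

# Optional smoke test: pulls and runs a tiny image, then removes the container
sudo docker run --rm hello-world
```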

4. Installing Spark

In this article we want to help you get started with Spark on Docker, using the latest release, Spark 1.2.0.

Docker and Spark are two of the most hyped technologies of late, so we put Spark and Docker together; the code for the container can be found in our GitHub repository.

4.1 Pulling the image from the Docker registry

# docker pull sequenceiq/spark:1.2.0

Pulling repository sequenceiq/spark

334aabfef5f1: Pulling dependent layers

89b52f216c6c: Download complete

...

6047f6052c38: Download complete

Status: Downloaded newer image for sequenceiq/spark:1.2.0

The pull takes quite a while; the image is around 2 GB. I plan to export the image to a local file and upload it to Baidu Pan to make it easier for everyone to download.

You can then use docker load to import the exported file back into the local image store, for example:

sudo docker load --input spark.tar
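The export side of this round trip is `docker save`, the counterpart of `docker load`. A sketch of both steps, with the `spark.tar` filename matching the example above:

```shell
# Export the pulled image to a tar archive
IMG="sequenceiq/spark:1.2.0"
sudo docker save -o spark.tar "$IMG"

# On another machine, import the archive back into the local image store
sudo docker load --input spark.tar

# Confirm the image is now available locally
sudo docker images | grep sequenceiq/spark
```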

4.2 Running the image

Once the image has been pulled from the registry, you can run it:

# docker run -i -t -h sandbox sequenceiq/spark:1.2.0 /etc/bootstrap.sh -bash
/
Starting sshd:                                             [  OK  ]
Starting namenodes on [sandbox]
sandbox: starting namenode, logging to /usr/local/hadoop/logs/hadoop-root-namenode-sandbox.out
localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-root-datanode-sandbox.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-root-secondarynamenode-sandbox.out
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn--resourcemanager-sandbox.out
localhost: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-root-nodemanager-sandbox.out
bash-4.1# jps
304 SecondaryNameNode
625 Jps
505 ResourceManager
188 DataNode
112 NameNode
588 NodeManager
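The command above runs the sandbox interactively. If you would rather keep it running in the background, the same image can be started detached. A sketch, assuming (per the sequenceiq image's documentation; verify against the README for your tag) that `/etc/bootstrap.sh -d` starts the Hadoop/Spark daemons and keeps the container alive:

```shell
# Run the sandbox detached, publishing the web UIs to the host:
#   8088 = YARN ResourceManager UI, 4040 = Spark application UI
# The trailing "-d" goes to /etc/bootstrap.sh (assumption: it starts
# the daemons and blocks, instead of dropping into a shell).
sudo docker run -d -h sandbox -p 8088:8088 -p 4040:4040 \
    sequenceiq/spark:1.2.0 /etc/bootstrap.sh -d

# Attach a shell later when you need one
sudo docker ps                       # note the container ID
sudo docker exec -it <container-id> bash
```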

4.3 Testing

Once everything is up, let's check whether the installation works:

bash-4.1# cd /usr/local/spark
bash-4.1# ./bin/spark-shell --master yarn-client --driver-memory 1g --executor-memory 1g --executor-cores 1
Spark assembly has been built with Hive, including Datanucleus jars on classpath
15/02/11 20:56:58 INFO spark.SecurityManager: Changing view acls to: root
15/02/11 20:56:58 INFO spark.SecurityManager: Changing modify acls to: root
15/02/11 20:56:59 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
15/02/11 20:56:59 INFO spark.HttpServer: Starting HTTP Server
15/02/11 20:56:59 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/02/11 20:56:59 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:45752
15/02/11 20:56:59 INFO util.Utils: Successfully started service 'HTTP class server' on port 45752.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.2.0
      /_/


Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_51)
Type in expressions to have them evaluated.
Type :help for more information.
15/02/11 20:57:17 INFO spark.SecurityManager: Changing view acls to: root
15/02/11 20:57:17 INFO spark.SecurityManager: Changing modify acls to: root
15/02/11 20:57:17 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
15/02/11 20:57:18 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/02/11 20:57:19 INFO Remoting: Starting remoting
15/02/11 20:57:20 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@sandbox:58553]
15/02/11 20:57:20 INFO util.Utils: Successfully started service 'sparkDriver' on port 58553.
15/02/11 20:57:20 INFO spark.SparkEnv: Registering MapOutputTracker
15/02/11 20:57:20 INFO spark.SparkEnv: Registering BlockManagerMaster
15/02/11 20:57:20 INFO storage.DiskBlockManager: Created local directory at /tmp/spark-local-20150211205720-f7e6
15/02/11 20:57:20 INFO storage.MemoryStore: MemoryStore started with capacity 530.3 MB
15/02/11 20:57:23 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/02/11 20:57:24 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-d90ad2bb-e82f-4446-8ccd-e79ff4c6d076
15/02/11 20:57:24 INFO spark.HttpServer: Starting HTTP Server
15/02/11 20:57:24 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/02/11 20:57:24 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:52012
15/02/11 20:57:24 INFO util.Utils: Successfully started service 'HTTP file server' on port 52012.
15/02/11 20:57:25 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/02/11 20:57:25 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
15/02/11 20:57:25 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
15/02/11 20:57:25 INFO ui.SparkUI: Started SparkUI at http://sandbox:4040
15/02/11 20:57:26 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/02/11 20:57:27 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
15/02/11 20:57:27 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
15/02/11 20:57:27 INFO yarn.Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
15/02/11 20:57:27 INFO yarn.Client: Setting up container launch context for our AM
15/02/11 20:57:27 INFO yarn.Client: Preparing resources for our AM container
15/02/11 20:57:31 WARN yarn.ClientBase: SPARK_JAR detected in the system environment. This variable has been deprecated in favor of the spark.yarn.jar configuration variable.
15/02/11 20:57:31 INFO yarn.Client: Source and destination file systems are the same. Not copying hdfs:/spark/spark-assembly-1.2.0-hadoop2.4.0.jar
15/02/11 20:57:31 INFO yarn.Client: Setting up the launch environment for our AM container
15/02/11 20:57:31 WARN yarn.ClientBase: SPARK_JAR detected in the system environment. This variable has been deprecated in favor of the spark.yarn.jar configuration variable.
15/02/11 20:57:31 INFO spark.SecurityManager: Changing view acls to: root
15/02/11 20:57:31 INFO spark.SecurityManager: Changing modify acls to: root
15/02/11 20:57:31 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
15/02/11 20:57:31 INFO yarn.Client: Submitting application 1 to ResourceManager
15/02/11 20:57:32 INFO impl.YarnClientImpl: Submitted application application_1423706171480_0001
15/02/11 20:57:33 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:57:33 INFO yarn.Client: 
         client token: N/A
         diagnostics: N/A
         ApplicationMaster host: N/A
         ApplicationMaster RPC port: -1
         queue: default
         start time: 1423706251906
         final status: UNDEFINED
         tracking URL: http://sandbox:8088/proxy/application_1423706171480_0001/
         user: root
15/02/11 20:57:34 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
...
15/02/11 20:58:19 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:58:19 INFO cluster.YarnClientSchedulerBackend: ApplicationMaster registered as Actor[akka.tcp://sparkYarnAM@sandbox:54672/user/YarnAM#-192648481]
15/02/11 20:58:19 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> sandbox, PROXY_URI_BASES -> http://sandbox:8088/proxy/application_1423706171480_0001), /proxy/application_1423706171480_0001
15/02/11 20:58:19 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
15/02/11 20:58:20 INFO yarn.Client: Application report for application_1423706171480_0001 (state: RUNNING)
15/02/11 20:58:20 INFO yarn.Client: 
         client token: N/A
         diagnostics: N/A
         ApplicationMaster host: sandbox
         ApplicationMaster RPC port: 0
         queue: default
         start time: 1423706251906
         final status: UNDEFINED
         tracking URL: http://sandbox:8088/proxy/application_1423706171480_0001/
         user: root
15/02/11 20:58:20 INFO cluster.YarnClientSchedulerBackend: Application application_1423706171480_0001 has started running.
15/02/11 20:58:20 INFO netty.NettyBlockTransferService: Server created on 60949
15/02/11 20:58:20 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/02/11 20:58:20 INFO storage.BlockManagerMasterActor: Registering block manager sandbox:60949 with 530.3 MB RAM, BlockManagerId(<driver>, sandbox, 60949)
15/02/11 20:58:20 INFO storage.BlockManagerMaster: Registered BlockManager
15/02/11 20:58:21 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
15/02/11 20:58:21 INFO repl.SparkILoop: Created spark context..
Spark context available as sc.


scala> 15/02/11 20:58:43 INFO cluster.YarnClientSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@sandbox:37188/user/Executor#375257054] with ID 1
15/02/11 20:58:43 INFO util.RackResolver: Resolved sandbox to /default-rack
15/02/11 20:58:43 INFO cluster.YarnClientSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@sandbox:52808/user/Executor#1782772186] with ID 2
15/02/11 20:58:45 INFO storage.BlockManagerMasterActor: Registering block manager sandbox:55768 with 530.3 MB RAM, BlockManagerId(1, sandbox, 55768)
15/02/11 20:58:45 INFO storage.BlockManagerMasterActor: Registering block manager sandbox:41242 with 530.3 MB RAM, BlockManagerId(2, sandbox, 41242)




scala> sc.parallelize(1 to 1000).count()
15/02/11 20:59:45 INFO spark.SparkContext: Starting job: count at <console>:13
15/02/11 20:59:45 INFO scheduler.DAGScheduler: Got job 0 (count at <console>:13) with 2 output partitions (allowLocal=false)
15/02/11 20:59:45 INFO scheduler.DAGScheduler: Final stage: Stage 0(count at <console>:13)
15/02/11 20:59:45 INFO scheduler.DAGScheduler: Parents of final stage: List()
15/02/11 20:59:45 INFO scheduler.DAGScheduler: Missing parents: List()
15/02/11 20:59:45 INFO scheduler.DAGScheduler: Submitting Stage 0 (ParallelCollectionRDD[0] at parallelize at <console>:13), which has no missing parents
15/02/11 20:59:45 INFO storage.MemoryStore: ensureFreeSpace(1088) called with curMem=0, maxMem=556038881
15/02/11 20:59:45 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1088.0 B, free 530.3 MB)
15/02/11 20:59:45 INFO storage.MemoryStore: ensureFreeSpace(842) called with curMem=1088, maxMem=556038881
15/02/11 20:59:45 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 842.0 B, free 530.3 MB)
15/02/11 20:59:45 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on sandbox:60949 (size: 842.0 B, free: 530.3 MB)
15/02/11 20:59:45 INFO storage.BlockManagerMaster: Updated info of block broadcast_0_piece0
15/02/11 20:59:45 INFO spark.SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:838
15/02/11 20:59:45 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from Stage 0 (ParallelCollectionRDD[0] at parallelize at <console>:13)
15/02/11 20:59:45 INFO cluster.YarnClientClusterScheduler: Adding task set 0.0 with 2 tasks
15/02/11 20:59:45 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, sandbox, PROCESS_LOCAL, 1260 bytes)
15/02/11 20:59:45 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, sandbox, PROCESS_LOCAL, 1260 bytes)
15/02/11 20:59:52 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on sandbox:41242 (size: 842.0 B, free: 530.3 MB)
15/02/11 20:59:52 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on sandbox:55768 (size: 842.0 B, free: 530.3 MB)
15/02/11 20:59:52 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 6625 ms on sandbox (1/2)
15/02/11 20:59:52 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 6691 ms on sandbox (2/2)
15/02/11 20:59:52 INFO scheduler.DAGScheduler: Stage 0 (count at <console>:13) finished in 6.695 s
15/02/11 20:59:52 INFO cluster.YarnClientClusterScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool 
15/02/11 20:59:52 INFO scheduler.DAGScheduler: Job 0 finished: count at <console>:13, took 7.036182 s
res0: Long = 1000


scala> exit
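The interactive check above can also be scripted: `spark-shell` reads statements from standard input, so the same count can be run non-interactively as a smoke test. A sketch, using the paths inside the container:

```shell
cd /usr/local/spark

# Feed a single statement to the REPL and let it exit at end of input.
# A successful run prints "res0: Long = 1000" among the log output.
echo 'sc.parallelize(1 to 1000).count()' | \
    ./bin/spark-shell --master yarn-client \
        --driver-memory 1g --executor-memory 1g --executor-cores 1
```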

4.4 Running a Spark example program

bash-4.1# ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client --driver-memory 1g --executor-memory 1g --executor-cores 1 ./lib/spark-examples-1.2.0-hadoop2.4.0.jar
Spark assembly has been built with Hive, including Datanucleus jars on classpath
15/02/11 21:09:37 INFO spark.SecurityManager: Changing view acls to: root
15/02/11 21:09:37 INFO spark.SecurityManager: Changing modify acls to: root
15/02/11 21:09:37 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
15/02/11 21:09:38 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/02/11 21:09:38 INFO Remoting: Starting remoting
15/02/11 21:09:38 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@sandbox:46836]
15/02/11 21:09:38 INFO util.Utils: Successfully started service 'sparkDriver' on port 46836.
15/02/11 21:09:38 INFO spark.SparkEnv: Registering MapOutputTracker
15/02/11 21:09:38 INFO spark.SparkEnv: Registering BlockManagerMaster
15/02/11 21:09:38 INFO storage.DiskBlockManager: Created local directory at /tmp/spark-local-20150211210938-ba7a
15/02/11 21:09:38 INFO storage.MemoryStore: MemoryStore started with capacity 530.3 MB
15/02/11 21:09:39 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/02/11 21:09:39 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-590b1b3d-95b6-4d7c-bef4-36b0cafeafe9
15/02/11 21:09:39 INFO spark.HttpServer: Starting HTTP Server
15/02/11 21:09:39 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/02/11 21:09:39 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:60161
15/02/11 21:09:39 INFO util.Utils: Successfully started service 'HTTP file server' on port 60161.
15/02/11 21:09:40 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/02/11 21:09:40 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
15/02/11 21:09:40 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
15/02/11 21:09:40 INFO ui.SparkUI: Started SparkUI at http://sandbox:4040
15/02/11 21:09:41 INFO spark.SparkContext: Added JAR file:/usr/local/spark-1.2.0-bin-hadoop2.4/./lib/spark-examples-1.2.0-hadoop2.4.0.jar at http://172.17.0.2:60161/jars/spark-examples-1.2.0-hadoop2.4.0.jar with timestamp 1423706981078
15/02/11 21:09:41 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/02/11 21:09:41 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
15/02/11 21:09:41 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
15/02/11 21:09:41 INFO yarn.Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
15/02/11 21:09:41 INFO yarn.Client: Setting up container launch context for our AM
15/02/11 21:09:41 INFO yarn.Client: Preparing resources for our AM container
15/02/11 21:09:42 WARN yarn.ClientBase: SPARK_JAR detected in the system environment. This variable has been deprecated in favor of the spark.yarn.jar configuration variable.
15/02/11 21:09:42 INFO yarn.Client: Source and destination file systems are the same. Not copying hdfs:/spark/spark-assembly-1.2.0-hadoop2.4.0.jar
15/02/11 21:09:42 INFO yarn.Client: Setting up the launch environment for our AM container
15/02/11 21:09:42 WARN yarn.ClientBase: SPARK_JAR detected in the system environment. This variable has been deprecated in favor of the spark.yarn.jar configuration variable.
15/02/11 21:09:42 INFO spark.SecurityManager: Changing view acls to: root
15/02/11 21:09:42 INFO spark.SecurityManager: Changing modify acls to: root
15/02/11 21:09:42 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
15/02/11 21:09:42 INFO yarn.Client: Submitting application 3 to ResourceManager
15/02/11 21:09:43 INFO impl.YarnClientImpl: Submitted application application_1423706171480_0003
15/02/11 21:09:44 INFO yarn.Client: Application report for application_1423706171480_0003 (state: ACCEPTED)
15/02/11 21:09:44 INFO yarn.Client: 
         client token: N/A
         diagnostics: N/A
         ApplicationMaster host: N/A
         ApplicationMaster RPC port: -1
         queue: default
         start time: 1423706982964
         final status: UNDEFINED
         tracking URL: http://sandbox:8088/proxy/application_1423706171480_0003/
         user: root
15/02/11 21:09:45 INFO yarn.Client: Application report for application_1423706171480_0003 (state: ACCEPTED)
15/02/11 21:09:46 INFO yarn.Client: Application report for application_1423706171480_0003 (state: ACCEPTED)
15/02/11 21:09:47 INFO yarn.Client: Application report for application_1423706171480_0003 (state: ACCEPTED)
15/02/11 21:09:48 INFO yarn.Client: Application report for application_1423706171480_0003 (state: ACCEPTED)
15/02/11 21:09:49 INFO cluster.YarnClientSchedulerBackend: ApplicationMaster registered as Actor[akka.tcp://sparkYarnAM@sandbox:36886/user/YarnAM#250082351]
15/02/11 21:09:49 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> sandbox, PROXY_URI_BASES -> http://sandbox:8088/proxy/application_1423706171480_0003), /proxy/application_1423706171480_0003
15/02/11 21:09:49 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
15/02/11 21:09:49 INFO yarn.Client: Application report for application_1423706171480_0003 (state: RUNNING)
15/02/11 21:09:49 INFO yarn.Client: 
         client token: N/A
         diagnostics: N/A
         ApplicationMaster host: sandbox
         ApplicationMaster RPC port: 0
         queue: default
         start time: 1423706982964
         final status: UNDEFINED
         tracking URL: http://sandbox:8088/proxy/application_1423706171480_0003/
         user: root
15/02/11 21:09:49 INFO cluster.YarnClientSchedulerBackend: Application application_1423706171480_0003 has started running.
15/02/11 21:09:49 INFO netty.NettyBlockTransferService: Server created on 56981
15/02/11 21:09:49 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/02/11 21:09:49 INFO storage.BlockManagerMasterActor: Registering block manager sandbox:56981 with 530.3 MB RAM, BlockManagerId(<driver>, sandbox, 56981)
15/02/11 21:09:49 INFO storage.BlockManagerMaster: Registered BlockManager
15/02/11 21:10:03 INFO cluster.YarnClientSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@sandbox:56995/user/Executor#-1663552722] with ID 2
15/02/11 21:10:04 INFO util.RackResolver: Resolved sandbox to /default-rack
15/02/11 21:10:04 INFO cluster.YarnClientSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@sandbox:35154/user/Executor#1336228035] with ID 1
15/02/11 21:10:04 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
15/02/11 21:10:04 INFO spark.SparkContext: Starting job: reduce at SparkPi.scala:35
15/02/11 21:10:04 INFO scheduler.DAGScheduler: Got job 0 (reduce at SparkPi.scala:35) with 2 output partitions (allowLocal=false)
15/02/11 21:10:04 INFO scheduler.DAGScheduler: Final stage: Stage 0(reduce at SparkPi.scala:35)
15/02/11 21:10:04 INFO scheduler.DAGScheduler: Parents of final stage: List()
15/02/11 21:10:04 INFO scheduler.DAGScheduler: Missing parents: List()
15/02/11 21:10:04 INFO scheduler.DAGScheduler: Submitting Stage 0 (MappedRDD[1] at map at SparkPi.scala:31), which has no missing parents
15/02/11 21:10:04 INFO storage.MemoryStore: ensureFreeSpace(1728) called with curMem=0, maxMem=556038881
15/02/11 21:10:05 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1728.0 B, free 530.3 MB)
15/02/11 21:10:05 INFO storage.MemoryStore: ensureFreeSpace(1235) called with curMem=1728, maxMem=556038881
15/02/11 21:10:05 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1235.0 B, free 530.3 MB)
15/02/11 21:10:05 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on sandbox:56981 (size: 1235.0 B, free: 530.3 MB)
15/02/11 21:10:05 INFO storage.BlockManagerMaster: Updated info of block broadcast_0_piece0
15/02/11 21:10:05 INFO spark.SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:838
15/02/11 21:10:05 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from Stage 0 (MappedRDD[1] at map at SparkPi.scala:31)
15/02/11 21:10:05 INFO cluster.YarnClientClusterScheduler: Adding task set 0.0 with 2 tasks
15/02/11 21:10:05 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, sandbox, PROCESS_LOCAL, 1335 bytes)
15/02/11 21:10:05 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, sandbox, PROCESS_LOCAL, 1335 bytes)
15/02/11 21:10:06 INFO storage.BlockManagerMasterActor: Registering block manager sandbox:48023 with 530.3 MB RAM, BlockManagerId(2, sandbox, 48023)
15/02/11 21:10:06 INFO storage.BlockManagerMasterActor: Registering block manager sandbox:46354 with 530.3 MB RAM, BlockManagerId(1, sandbox, 46354)
15/02/11 21:10:23 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on sandbox:46354 (size: 1235.0 B, free: 530.3 MB)
15/02/11 21:10:23 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on sandbox:48023 (size: 1235.0 B, free: 530.3 MB)
15/02/11 21:10:24 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 18997 ms on sandbox (1/2)
15/02/11 21:10:24 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 19283 ms on sandbox (2/2)
15/02/11 21:10:24 INFO scheduler.DAGScheduler: Stage 0 (reduce at SparkPi.scala:35) finished in 19.324 s
15/02/11 21:10:24 INFO cluster.YarnClientClusterScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool 
15/02/11 21:10:24 INFO scheduler.DAGScheduler: Job 0 finished: reduce at SparkPi.scala:35, took 20.226582 s
Pi is roughly 3.143
15/02/11 21:10:24 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage/kill,null}
...
15/02/11 21:10:24 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs,null}
15/02/11 21:10:24 INFO ui.SparkUI: Stopped Spark web UI at http://sandbox:4040
15/02/11 21:10:24 INFO scheduler.DAGScheduler: Stopping DAGScheduler
15/02/11 21:10:24 INFO cluster.YarnClientSchedulerBackend: Shutting down all executors
15/02/11 21:10:24 INFO cluster.YarnClientSchedulerBackend: Asking each executor to shut down
15/02/11 21:10:24 INFO cluster.YarnClientSchedulerBackend: Stopped
15/02/11 21:10:25 INFO spark.MapOutputTrackerMasterActor: MapOutputTrackerActor stopped!
15/02/11 21:10:25 INFO storage.MemoryStore: MemoryStore cleared
15/02/11 21:10:25 INFO storage.BlockManager: BlockManager stopped
15/02/11 21:10:25 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
15/02/11 21:10:25 INFO spark.SparkContext: Successfully stopped SparkContext
15/02/11 21:10:25 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/02/11 21:10:25 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
bash-4.1#
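SparkPi also accepts an optional trailing argument, the number of slices (partitions); in Spark 1.2 each slice samples 100,000 points, so more slices usually yields a tighter estimate of Pi at the cost of a longer run. A sketch, run from /usr/local/spark inside the container:

```shell
cd /usr/local/spark

# The trailing "100" is the slice count; 100 slices = 10,000,000 samples.
./bin/spark-submit --class org.apache.spark.examples.SparkPi \
    --master yarn-client --driver-memory 1g \
    --executor-memory 1g --executor-cores 1 \
    ./lib/spark-examples-1.2.0-hadoop2.4.0.jar 100
```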