Spark Learning 14: Using Maven to Quickly Switch the Spark Version for Local Debugging

1. Explanation
Sometimes a cluster already has one version of Spark installed and you want to try another without reinstalling anything on the cluster. The simplest option is to debug locally through Maven in IDEA: the Spark version is just a property in the POM, so changing it and re-importing the project swaps the whole dependency set.
This article switches from Spark 1.5.2 to Spark 1.6.1.

2. Code:

spark-1.5.2:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>org.apache.spark.version</groupId>
    <artifactId>sparkVersion</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <scala.version>2.10.4</scala.version>
        <spark.version>1.5.2</spark.version>
        <scala.version.prefix>2.10</scala.version.prefix>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>${scala.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_${scala.version.prefix}</artifactId>
            <version>${spark.version}</version>
            <!--<scope>provided</scope>-->
        </dependency>
    </dependencies>

</project>

spark-1.6.1:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>org.apache.spark.version</groupId>
    <artifactId>sparkVersion</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <scala.version>2.10.4</scala.version>
        <spark.version>1.6.1</spark.version>
        <scala.version.prefix>2.10</scala.version.prefix>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>${scala.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_${scala.version.prefix}</artifactId>
            <version>${spark.version}</version>
            <!--<scope>provided</scope>-->
        </dependency>
    </dependencies>

</project>
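
The two POMs above differ only in the spark.version property. A lighter-weight variant (not from the original setup, just a sketch) is a single POM with Maven profiles, so the version flips with a command-line switch instead of an edit; the profile ids below are made up for illustration:

    <!-- Sketch: one profile per Spark version; the ids are illustrative. -->
    <profiles>
        <profile>
            <id>spark-1.5</id>
            <properties>
                <spark.version>1.5.2</spark.version>
            </properties>
        </profile>
        <profile>
            <id>spark-1.6</id>
            <activation>
                <activeByDefault>true</activeByDefault>
            </activation>
            <properties>
                <spark.version>1.6.1</spark.version>
            </properties>
        </profile>
    </profiles>

With this in place, mvn -Pspark-1.5 compile builds against 1.5.2 while the default stays 1.6.1, and IDEA exposes the same profiles as checkboxes in its Maven tool window, so switching for local debugging is one click.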

Test code:

import org.apache.spark.{SparkConf, SparkContext}

/**
  * Created by xubo on 2016/5/23.
  */
object sparkTest {
  def main(args: Array[String]) {
    //    val conf = new SparkConf().setAppName(this.getClass().getSimpleName().filter(!_.equals('$'))).setMaster("local[4]")
    val conf = new SparkConf().setAppName("test").setMaster("local[4]")
    val sc = new SparkContext(conf)
    val rdd = sc.parallelize(Array((1, 2), (3, 1), (3, 3)))
    rdd.foreach(println)
    // partition count via the RDD's partitions array; works on all Spark versions
    println(rdd.partitions.length)

    // RDD.getNumPartitions only exists since Spark 1.6.0
    println(rdd.getNumPartitions)
    sc.stop()
  }
}
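
After changing spark.version and letting IDEA re-import the Maven project, it is worth confirming which jars were actually resolved before debugging anything else: mvn dependency:tree -Dincludes=org.apache.spark shows the artifact Maven picked, and printing the version from the driver gives the same check at runtime. A minimal sketch (an add-on to the test above, not part of the original program):

// Sanity check (sketch): print the Spark version the driver actually loaded,
// so a stale Maven import is caught immediately.
println(s"Spark version in use: ${sc.version}")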

3. Results:

Spark 1.5.2:

Not supported: the build fails, because RDD.getNumPartitions was only introduced in Spark 1.6.0, so this call does not compile under 1.5.2:

//since 1.6.0
    println(rdd.getNumPartitions)
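
If the same source has to compile against both versions, rdd.partitions.length is the portable form; in Spark 1.6.0, getNumPartitions is essentially a convenience wrapper around it. A small sketch:

import org.apache.spark.rdd.RDD

// Portable partition count (sketch): compiles on Spark 1.5.x and 1.6.x alike,
// because RDD.partitions has been in the public API from the start.
def numPartitions[T](rdd: RDD[T]): Int = rdd.partitions.length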

Spark 1.6.1 runs it successfully:

com.intellij.rt.execution.application.AppMain sparkTest
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/05/23 11:28:20 INFO SparkContext: Running Spark version 1.6.1
16/05/23 11:28:36 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/05/23 11:28:43 INFO SecurityManager: Changing view acls to: xubo
16/05/23 11:28:43 INFO SecurityManager: Changing modify acls to: xubo
16/05/23 11:28:43 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(xubo); users with modify permissions: Set(xubo)
16/05/23 11:29:48 INFO Utils: Successfully started service 'sparkDriver' on port 55625.
16/05/23 11:30:20 INFO Slf4jLogger: Slf4jLogger started
16/05/23 11:30:23 INFO Remoting: Starting remoting
16/05/23 11:30:25 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@211.86.159.133:55638]
16/05/23 11:30:25 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 55638.
16/05/23 11:30:27 INFO SparkEnv: Registering MapOutputTracker
16/05/23 11:30:31 INFO SparkEnv: Registering BlockManagerMaster
16/05/23 11:30:32 INFO DiskBlockManager: Created local directory at C:\Users\xubo\AppData\Local\Temp\blockmgr-5e5515f9-540a-4b6c-98f4-3a3775f5e72f
16/05/23 11:30:34 INFO MemoryStore: MemoryStore started with capacity 789.7 MB
16/05/23 11:30:38 INFO SparkEnv: Registering OutputCommitCoordinator
16/05/23 11:30:55 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/05/23 11:30:55 INFO SparkUI: Started SparkUI at http://211.86.159.133:4040
16/05/23 11:31:03 INFO Executor: Starting executor ID driver on host localhost
16/05/23 11:31:04 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 55674.
16/05/23 11:31:04 INFO NettyBlockTransferService: Server created on 55674
16/05/23 11:31:05 INFO BlockManagerMaster: Trying to register BlockManager
16/05/23 11:31:05 INFO BlockManagerMasterEndpoint: Registering block manager localhost:55674 with 789.7 MB RAM, BlockManagerId(driver, localhost, 55674)
16/05/23 11:31:05 INFO BlockManagerMaster: Registered BlockManager
16/05/23 11:31:30 INFO SparkContext: Starting job: foreach at sparkTest.scala:12
16/05/23 11:31:32 INFO DAGScheduler: Got job 0 (foreach at sparkTest.scala:12) with 4 output partitions
16/05/23 11:31:32 INFO DAGScheduler: Final stage: ResultStage 0 (foreach at sparkTest.scala:12)
16/05/23 11:31:32 INFO DAGScheduler: Parents of final stage: List()
16/05/23 11:31:33 INFO DAGScheduler: Missing parents: List()
16/05/23 11:31:34 INFO DAGScheduler: Submitting ResultStage 0 (ParallelCollectionRDD[0] at parallelize at sparkTest.scala:11), which has no missing parents
16/05/23 11:31:56 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1112.0 B, free 1112.0 B)
16/05/23 11:31:56 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 783.0 B, free 1895.0 B)
16/05/23 11:31:56 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:55674 (size: 783.0 B, free: 789.7 MB)
16/05/23 11:31:56 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1006
16/05/23 11:31:57 INFO DAGScheduler: Submitting 4 missing tasks from ResultStage 0 (ParallelCollectionRDD[0] at parallelize at sparkTest.scala:11)
16/05/23 11:31:57 INFO TaskSchedulerImpl: Adding task set 0.0 with 4 tasks
16/05/23 11:31:59 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, partition 0,PROCESS_LOCAL, 2076 bytes)
16/05/23 11:31:59 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, partition 1,PROCESS_LOCAL, 2194 bytes)
16/05/23 11:31:59 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, localhost, partition 2,PROCESS_LOCAL, 2194 bytes)
16/05/23 11:31:59 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, localhost, partition 3,PROCESS_LOCAL, 2194 bytes)
16/05/23 11:31:59 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
16/05/23 11:31:59 INFO Executor: Running task 3.0 in stage 0.0 (TID 3)
16/05/23 11:31:59 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
16/05/23 11:31:59 INFO Executor: Running task 2.0 in stage 0.0 (TID 2)
(1,2)
(3,3)
(3,1)
16/05/23 11:32:01 INFO Executor: Finished task 3.0 in stage 0.0 (TID 3). 915 bytes result sent to driver
16/05/23 11:32:01 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 915 bytes result sent to driver
16/05/23 11:32:01 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 915 bytes result sent to driver
16/05/23 11:32:01 INFO Executor: Finished task 2.0 in stage 0.0 (TID 2). 915 bytes result sent to driver
16/05/23 11:32:01 INFO TaskSetManager: Finished task 3.0 in stage 0.0 (TID 3) in 1828 ms on localhost (1/4)
16/05/23 11:32:01 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 3055 ms on localhost (2/4)
16/05/23 11:32:01 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 1938 ms on localhost (3/4)
16/05/23 11:32:01 INFO TaskSetManager: Finished task 2.0 in stage 0.0 (TID 2) in 1936 ms on localhost (4/4)
16/05/23 11:32:01 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
16/05/23 11:32:01 INFO DAGScheduler: ResultStage 0 (foreach at sparkTest.scala:12) finished in 3.961 s
16/05/23 11:32:02 INFO DAGScheduler: Job 0 finished: foreach at sparkTest.scala:12, took 31.532340 s
4
4
16/05/23 11:32:02 WARN QueuedThreadPool: 6 threads could not be stopped
16/05/23 11:32:02 INFO SparkUI: Stopped Spark web UI at http://211.86.159.133:4040
16/05/23 11:32:04 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/05/23 11:32:04 INFO MemoryStore: MemoryStore cleared
16/05/23 11:32:04 INFO BlockManager: BlockManager stopped
16/05/23 11:32:04 INFO BlockManagerMaster: BlockManagerMaster stopped
16/05/23 11:32:04 INFO SparkContext: Successfully stopped SparkContext
16/05/23 11:32:04 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/05/23 11:32:04 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/05/23 11:32:04 INFO ShutdownHookManager: Shutdown hook called
16/05/23 11:32:04 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/05/23 11:32:04 INFO ShutdownHookManager: Deleting directory C:\Users\xubo\AppData\Local\Temp\spark-69d1558f-c3a5-4820-863b-9b7b8986b668

Process finished with exit code 0