Running Spark in yarn-cluster mode on CDH

With Spark on YARN installed through CDH, submit the job directly in cluster mode:

spark-submit --class org.apache.spark.examples.SparkPi  \
   --master yarn-cluster  \
   --num-executors 3  \
   --driver-memory 4g \
   --executor-memory 2g   \
   --executor-cores 1   \
   --queue thequeue   \
    ./spark-examples*.jar

The job fails with the following error:

Exit code: 15
Stack trace: ExitCodeException exitCode=15:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:197)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

Exit code 15 alone says little, so pull the application's logs to trace the error:

yarn logs -applicationId application_1429759514549_0001

The logs show the root cause:

Exception in thread "Driver" java.io.IOException: Error in creating log directory: file:/user/spark/applicationHistory/application_1429759514549_0001
    at org.apache.spark.util.FileLogger.createLogDir(FileLogger.scala:133)
    at org.apache.spark.util.FileLogger.start(FileLogger.scala:115)
    at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:74)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:353)
    at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:28)
    at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:427)

The key message is:

Error in creating log directory:

Solution

Open Spark's configuration file, spark-defaults.conf, which contains:

spark.eventLog.dir=/user/spark/applicationHistory
spark.eventLog.enabled=true
spark.yarn.historyServer.address=http://slave3.hadoop.gitv.we:18088
spark.driver.extraLibraryPath=/opt/soft/BI/cloudera/cm/cm5.3.1/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/lib/hadoop/lib/native
spark.executor.extraLibraryPath=/opt/soft/BI/cloudera/cm/cm5.3.1/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/lib/hadoop/lib/native

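spark.eventLog.dir has no scheme here, and the error message shows it was resolved as file:/user/spark/applicationHistory, that is, on the local filesystem. Change spark.eventLog.dir to an HDFS URI (hdfs://nameservice1 is my HDFS nameservice, because HA is configured):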

spark.eventLog.dir=hdfs://nameservice1/user/spark/applicationHistory
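
To check whether a given spark-defaults.conf has the same problem, a small helper along these lines may be useful. This is only a sketch: the function name event_log_dir_scheme is hypothetical, and it assumes the file uses either the key=value form shown above or a whitespace-separated key and value. It prints the URI scheme of spark.eventLog.dir, "none" if the value has no scheme, or "unset" if the key is missing.

```shell
# Hypothetical helper (not part of Spark or CDH): report the URI scheme
# of spark.eventLog.dir in the spark-defaults.conf passed as $1.
event_log_dir_scheme() {
    conf_file="$1"
    # Strip the key and its separator ("=" or whitespace); keep the value.
    # If the key appears more than once, the last occurrence is used.
    value=$(sed -n 's/^[[:space:]]*spark\.eventLog\.dir[[:space:]=][[:space:]=]*//p' "$conf_file" | tail -n 1)
    case "$value" in
        "")            echo "unset" ;;          # key not present
        [a-zA-Z]*:/*)  echo "${value%%:*}" ;;   # e.g. hdfs, file
        *)             echo "none" ;;           # bare path such as /user/spark/...
    esac
}
```

Called as event_log_dir_scheme followed by the path to spark-defaults.conf, it would print none for the original configuration above and hdfs after the change.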