
Compiling the Spark source with Maven on Linux

1. Install Maven


1) Extract the installation package to the target directory:

[[email protected] apache-maven-3.5.3]# tar -zxf /opt/maven/apache-maven-3.5.3-bin.tar.gz  -C /usr/local/

2) Configure the Maven environment variables, then check that Maven was installed correctly:

[[email protected] apache-maven-3.5.3]# vi /etc/profile
#maven 
export MAVEN_HOME=/usr/local/apache-maven-3.5.3
export PATH=$PATH:$MAVEN_HOME/bin
export MAVEN_OPTS="-Xmx2048m -XX:MetaspaceSize=1024m -XX:MaxMetaspaceSize=1524m -Xss2m"
[[email protected] apache-maven-3.5.3]# source /etc/profile
[[email protected] apache-maven-3.5.3]# mvn -version
Apache Maven 3.5.3 (3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T11:49:05-08:00)
Maven home: /usr/local/apache-maven-3.5.3
Java version: 1.8.0_171, vendor: Oracle Corporation
Java home: /usr/local/jdk1.8.0_171/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "3.10.0-123.el7.x86_64", arch: "amd64", family: "unix"
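
The version check above can also be scripted. The following is a minimal sketch, not part of the original setup: the helper names maven_version and version_at_least are hypothetical, the minimum of 3.3.9 is my understanding of what the Spark 2.3 build expects and should be treated as an assumption, and the comparison relies on GNU sort -V.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the version number from `mvn -version` output on stdin.
maven_version() {
  sed -n 's/^Apache Maven \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: true when version $1 >= version $2 (uses GNU sort -V).
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Assumed minimum Maven version for this build; adjust if your Spark release differs.
required="3.3.9"

# The first line of the output shown above is used as sample input,
# so the sketch runs even where Maven is not installed.
sample='Apache Maven 3.5.3 (3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T11:49:05-08:00)'
detected=$(printf '%s\n' "$sample" | maven_version)

if version_at_least "$detected" "$required"; then
  echo "Maven $detected is new enough (>= $required)"
else
  echo "Maven $detected is older than $required"
fi
```

On a real machine, replace the sample with `mvn -version | maven_version`.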

2. Download the Spark source


1) Mount the source package under /opt


2) Extract it to the working directory

[[email protected] home]# tar -zxf /opt/spark/spark-2.3.1.tgz  -C /home/andy/work
[[email protected] home]# cd /home/andy/work
[[email protected] work]# ll
total 4
drwxrwxr-x. 29 andy andy 4096 Jun  1 13:34 spark-2.3.1
[[email protected] work]# cd spark-2.3.1/
[[email protected] spark-2.3.1]# ll
total 228
-rw-rw-r--.  1 andy andy   2318 Jun  1 13:34 appveyor.yml
drwxrwxr-x.  3 andy andy     43 Jun  1 13:34 assembly
drwxrwxr-x.  2 andy andy   4096 Jun  1 13:34 bin
drwxrwxr-x.  2 andy andy     75 Jun  1 13:34 build
drwxrwxr-x.  9 andy andy   4096 Jun  1 13:34 common
drwxrwxr-x.  2 andy andy   4096 Jun  1 13:34 conf
-rw-rw-r--.  1 andy andy    995 Jun  1 13:34 CONTRIBUTING.md
drwxrwxr-x.  3 andy andy     30 Jun  1 13:34 core
drwxrwxr-x.  5 andy andy     47 Jun  1 13:34 data
drwxrwxr-x.  6 andy andy   4096 Jun  1 13:34 dev
drwxrwxr-x.  9 andy andy   4096 Jun  1 13:34 docs
drwxrwxr-x.  3 andy andy     30 Jun  1 13:34 examples
drwxrwxr-x. 15 andy andy   4096 Jun  1 13:34 external
drwxrwxr-x.  3 andy andy     30 Jun  1 13:34 graphx
drwxrwxr-x.  2 andy andy     20 Jun  1 13:34 hadoop-cloud
drwxrwxr-x.  3 andy andy     30 Jun  1 13:34 launcher
-rw-rw-r--.  1 andy andy  18045 Jun  1 13:34 LICENSE
drwxrwxr-x.  2 andy andy   4096 Jun  1 13:34 licenses
drwxrwxr-x.  3 andy andy     30 Jun  1 13:34 mllib
drwxrwxr-x.  3 andy andy     30 Jun  1 13:34 mllib-local
-rw-rw-r--.  1 andy andy  24913 Jun  1 13:34 NOTICE
-rw-rw-r--.  1 andy andy 101718 Jun  1 13:34 pom.xml
drwxrwxr-x.  2 andy andy   4096 Jun  1 13:34 project
drwxrwxr-x.  6 andy andy   4096 Jun  1 13:34 python
drwxrwxr-x.  3 andy andy   4096 Jun  1 13:34 R
-rw-rw-r--.  1 andy andy   3809 Jun  1 13:34 README.md
drwxrwxr-x.  5 andy andy     64 Jun  1 13:34 repl
drwxrwxr-x.  5 andy andy     46 Jun  1 13:34 resource-managers
drwxrwxr-x.  2 andy andy   4096 Jun  1 13:34 sbin
-rw-rw-r--.  1 andy andy  17624 Jun  1 13:34 scalastyle-config.xml
drwxrwxr-x.  6 andy andy   4096 Jun  1 13:34 sql
drwxrwxr-x.  3 andy andy     30 Jun  1 13:34 streaming
drwxrwxr-x.  3 andy andy     30 Jun  1 13:34 tools

3. Compile the Spark source

This build follows on from my previous post on installing a Spark 2.0 cluster on CentOS 7, so the tools shown below are already configured:

#scala
export SCALA_HOME=/usr/local/scala-2.12.6
export PATH=$PATH:$SCALA_HOME/bin

#jdk
export JAVA_HOME=/usr/local/jdk1.8.0_171
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin

#spark
export SPARK_HOME=/usr/local/spark-2.3.1-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin
export SPARK_EXAMPLES_JAR=$SPARK_HOME/examples/jars/spark-examples_2.11-2.3.1.jar

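1) Set Maven's memory usage. The Spark build needs more memory than Maven uses by default, and you configure this through MAVEN_OPTS. Spark's build documentation recommends raising the limit; the values used here are: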

export MAVEN_OPTS="-Xmx2048m -XX:MetaspaceSize=1024m -XX:MaxMetaspaceSize=1524m -Xss2m"

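I recommend giving the virtual machine 4 GB of memory; it must be larger than the maximum memory set in MAVEN_OPTS. I initially gave the VM only 1 GB, and the build kept hanging.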
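
To catch this kind of mismatch before starting a long build, the comparison can be scripted. This is a rough sketch under stated assumptions: xmx_mib is a hypothetical helper name, the fallback string is the MAVEN_OPTS value from this post, and total memory is read from /proc/meminfo, which exists on Linux only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the -Xmx value from a MAVEN_OPTS-style string in MiB.
# Returns non-zero when the string contains no -Xmx flag.
xmx_mib() {
  local opts=$1 value number unit
  # Use the last -Xmx flag if the option appears more than once.
  value=$(printf '%s\n' "$opts" | tr ' ' '\n' | sed -n 's/^-Xmx//p' | tail -n 1)
  [ -n "$value" ] || return 1
  number=${value%[kKmMgG]}   # numeric part, e.g. 2048
  unit=${value#"$number"}    # suffix, e.g. m (empty means bytes)
  case $unit in
    g|G) echo $((number * 1024)) ;;
    m|M) echo "$number" ;;
    k|K) echo $((number / 1024)) ;;
    *)   echo $((number / 1024 / 1024)) ;;
  esac
}

# Fall back to the value from this post when MAVEN_OPTS is not set.
opts=${MAVEN_OPTS:-"-Xmx2048m -XX:MetaspaceSize=1024m -XX:MaxMetaspaceSize=1524m -Xss2m"}

if heap_mib=$(xmx_mib "$opts") && [ -r /proc/meminfo ]; then
  # MemTotal is reported in kB.
  total_mib=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)
  if [ "$total_mib" -le "$heap_mib" ]; then
    echo "Warning: ${total_mib} MiB of RAM is not more than the ${heap_mib} MiB Maven heap limit"
  else
    echo "OK: ${total_mib} MiB of RAM, Maven heap limit ${heap_mib} MiB"
  fi
fi
```

The check only compares the heap limit; metaspace and the operating system itself need room on top of that, which is why 4 GB is a more comfortable figure than anything just above 2 GB.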

2) Compile

[[email protected] spark-2.3.1]# mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -Phadoop-provided -Phive -Phive-thriftserver -Pnetlib-lgpl -DskipTests clean package
[INFO] Scanning for projects...
Downloading from central: https://repo.maven.apache.org/maven2/org/apache/apache/18/apache-18.pom
Downloaded from central: https://repo.maven.apache.org/maven2/org/apache/apache/18/apache-18.pom (16 kB at 4.8 kB/s)
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Build Order:
[INFO] 
[INFO] Spark Project Parent POM                                           [pom]
[INFO] Spark Project Tags                                                 [jar]
[INFO] Spark Project Sketch                                               [jar]
[INFO] Spark Project Local DB                                             [jar]
[INFO] Spark Project Networking                                           [jar]
[INFO] Spark Project Shuffle Streaming Service                            [jar]
[INFO] Spark Project Unsafe                                               [jar]
[INFO] Spark Project Launcher                                             [jar]
[INFO] Spark Project Core                                                 [jar]
[INFO] Spark Project ML Local Library                                     [jar]
[INFO] Spark Project GraphX                                               [jar]
[INFO] Spark Project Streaming                                            [jar]
[INFO] Spark Project Catalyst                                             [jar]
[INFO] Spark Project SQL                                                  [jar]
[INFO] Spark Project ML Library                                           [jar]
[INFO] Spark Project Tools                                                [jar]
[INFO] Spark Project Hive                                                 [jar]
[INFO] Spark Project REPL                                                 [jar]
[INFO] Spark Project YARN Shuffle Service                                 [jar]
[INFO] Spark Project YARN                                                 [jar]
[INFO] Spark Project Hive Thrift Server                                   [jar]
[INFO] Spark Project Assembly                                             [pom]
[INFO] Spark Integration for Kafka 0.10                                   [jar]
[INFO] Kafka 0.10 Source for Structured Streaming                         [jar]
[INFO] Spark Project Examples                                             [jar]
[INFO] Spark Integration for Kafka 0.10 Assembly                          [jar]
[INFO] 
[INFO] -----------------< org.apache.spark:spark-parent_2.11 >-----------------

3) Build succeeded
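
Since the build runs for a long time, it helps to save the output to a file and check the result afterwards. A minimal sketch, assuming the output is written to a log file: build_succeeded is a hypothetical helper name, and the check relies on the "BUILD SUCCESS" line that Maven prints at the end of a successful run.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the given Maven log reports a successful build.
build_succeeded() {
  grep -q 'BUILD SUCCESS' "$1"
}

# In a real run the log would come from the build command used above, for example:
#   mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -Phadoop-provided -Phive \
#       -Phive-thriftserver -Pnetlib-lgpl -DskipTests clean package 2>&1 | tee build.log
# A two-line stand-in log keeps this sketch runnable on its own.
log=$(mktemp)
printf '%s\n' '[INFO] Scanning for projects...' '[INFO] BUILD SUCCESS' > "$log"

if build_succeeded "$log"; then
  echo "Build succeeded"
else
  echo "Build failed; see $log"
fi
```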