
Packaging a Scala Project with sbt on Windows

1. Install sbt and a Scala IDE on Windows: https://blog.csdn.net/weixin_42247685/article/details/80390858

2. Create a new Scala sbt project.

3. Create an example Scala script.

Script contents:

import java.io.File
import org.apache.spark.sql.{Row, SaveMode, SparkSession}

object helloWorld {
  def main(args: Array[String]): Unit = {
    //val warehouseLocation = new File("spark-warehouse").getAbsolutePath

    // Build a SparkSession with Hive support enabled so Hive tables are visible.
    val spark = SparkSession
      .builder()
      .appName("Spark Hive Example")
      //.config("spark.sql.warehouse.dir", warehouseLocation)
      .enableHiveSupport()
      .getOrCreate()

    import spark.implicits._
    import spark.sql

    // Count the rows in one partition of a Hive table and print the result.
    sql("SELECT count(*) FROM dwb.dwb_trde_cfm_ordr_goods_i_d where pt = '2018-07-15'").show()

    spark.stop()
  }
}

After you paste the code above, the IDE will flag a pile of errors. Ignore them for now; the dependencies have not been added yet.
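If you want to smoke-test the SparkSession wiring on your own machine before going to the cluster, a minimal local sketch is shown below. The local[*] master, the orders temp view, and its column names are assumptions made for illustration; the real Hive table only exists on the cluster:

import org.apache.spark.sql.SparkSession

object LocalSmokeTest {
  def main(args: Array[String]): Unit = {
    // local[*] runs Spark inside this JVM; this works from the IDE while the
    // "provided" Spark dependencies are still on the compile/run classpath.
    val spark = SparkSession
      .builder()
      .appName("Local Smoke Test")
      .master("local[*]")
      .getOrCreate()

    import spark.implicits._

    // A tiny in-memory stand-in for the Hive table on the cluster.
    Seq(("2018-07-15", 1L), ("2018-07-15", 2L)).toDF("pt", "order_id")
      .createOrReplaceTempView("orders")

    spark.sql("SELECT count(*) FROM orders WHERE pt = '2018-07-15'").show()

    spark.stop()
  }
}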

4. Adjust the sbt-related settings in the IDE:

5. Add the following to the build.sbt file:

name := "Graph"

version := "0.1"

scalaVersion := "2.11.8"  // note: 2.11.9 was never published to Maven Central

updateOptions := updateOptions.value.withCachedResolution(true)

fullResolvers := Seq(
  "Pdd" at "http://maven-pdd.corp.yiran.com:8081/repository/maven-public/",
  "Local Maven" at Path.userHome.asFile.toURI.toURL + ".m2/repository",
  "Ali" at "http://maven.aliyun.com/nexus/content/groups/public/",
  "Repo1" at "http://repo1.maven.org/maven2/"
)

libraryDependencies += "org.rogach" %% "scallop" % "3.1.1"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.1.1" % "provided"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.1.1" % "provided"
libraryDependencies += "org.apache.spark" %% "spark-mllib" % "2.1.1" % "provided"
libraryDependencies += "org.apache.spark" %% "spark-hive" % "2.1.1" % "provided"

//libraryDependencies += "org.apache.httpcomponents" % "httpclient" % "4.5.6"
//libraryDependencies += "net.liftweb" %% "lift-json" % "3.3.0"

libraryDependencies += "org.testng" % "testng" % "6.14.3" % Test
libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.5" % Test

test in assembly := {}
//mainClass in assembly := Some("com.pdd.bigdata.risk.rimo.feature.Application")
assemblyMergeStrategy in assembly := {
  case PathList(ps @ _*) if ps.last endsWith "Log$Logger.class" => MergeStrategy.first
  case PathList(ps @ _*) if ps.last endsWith "Log.class" => MergeStrategy.first
  case PathList("org", "jfree", xs @ _*) => MergeStrategy.first
  case PathList("jfree", xs @ _*) => MergeStrategy.first
  case "application.conf" => MergeStrategy.concat
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}

In the code above, change the commented-out main class and the project name to your own.
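One detail worth knowing: with name := "Graph" and version := "0.1" above, sbt-assembly names its output Graph-assembly-0.1.jar by default. If you want a different file name (for example the HelloScala-assembly-0.1.jar used in step 8), a minimal sketch using sbt-assembly's assemblyJarName key (the file name here is just an example):

// Optional: override the default <name>-assembly-<version>.jar output name
assemblyJarName in assembly := "HelloScala-assembly-0.1.jar"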

At this point a prompt will pop up in the bottom-right corner asking whether to import; choose auto-import and wait for loading to finish.

6. Under the project directory, add the file shown in the red box (conventionally project/plugins.sbt).

File contents:

addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.7")

Once this is done, all the errors will disappear.

7. Make the IDE show the Tool Buttons (left image), then in the sbt panel on the right side of the IDE double-click assembly at the arrow to package automatically (right image). When packaging completes, the sbt shell prints the path of the jar.
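If you prefer a terminal over the IDE button, the same packaging step can also be run from the project root (assuming sbt is on your PATH); with the settings above the jar is written under target/scala-2.11/:

sbt clean assembly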

8. Upload the packaged jar to the Spark cluster, then run the following command:

spark-submit \
--class work._01_Graph_mallid_buyerid.step01_buildGraph \
--master yarn \
--deploy-mode cluster \
--files /etc/bigdata/conf/spark/hive-site.xml \
/home/buming/work/spark_scala/HelloScala-assembly-0.1.jar

Notes:
1. The argument after --class is your own class name.
2. The last line is the path of the jar on the cluster (you can check it with pwd).
3. --deploy-mode specifies the deploy mode.
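Because the job is submitted with --deploy-mode cluster, the driver runs inside YARN and its output does not come back to your terminal. To follow the job afterwards you can use the standard YARN CLI (the application ID is printed by spark-submit when the job is accepted; <application_id> below is a placeholder):

yarn application -list
yarn logs -applicationId <application_id>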