SparkSQL (Part 1): Usage and Differences of SQLContext, HiveContext, and SparkSession
I. SQLContext
1. Applicable Spark version: Spark 1.x
2. Add the dependencies
<dependency>
    <groupId>org.scala-lang</groupId>
    <artifactId>scala-library</artifactId>
    <version>2.11.8</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.1.0</version>
    <scope>compile</scope>
</dependency>
3. Code
(1) Create the context
(2) Do the processing (load the data)
(3) Close the connection
package MoocSparkSQL

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

/**
 * SQLContext usage
 */
object SQLContextApp {
  def main(args: Array[String]): Unit = {
    val path = args(0)
    // 1) Create the context
    val sparkConf = new SparkConf()
      .setAppName("SQLContextApp").setMaster("local[2]")
    val sc = new SparkContext(sparkConf)
    val sqlContext = new SQLContext(sc)
    // 2) Do the processing
    val people = sqlContext.read.format("json").load(path)
    people.printSchema()
    people.show()
    // 3) Close the resources (stop the SparkContext)
    sc.stop()
  }
}
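For reference, Spark's JSON source expects one JSON object per line, so a minimal input file for this program could look like the sample below (this mirrors the people.json shipped with Spark's examples; the exact records are an illustration, not data from this article):

{"name":"Michael"}
{"name":"Andy", "age":30}
{"name":"Justin", "age":19}

The path to the file is supplied as the first program argument (args(0)), for example when submitting the job with spark-submit.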
II. HiveContext
1. Applicable Spark version: Spark 1.x
2. Prerequisites:
(1) A full Hive installation is not required
(2) hive-site.xml is required
Copy hive-site.xml into the project's resources directory: ...\src\sources\hive-site.xml (in a standard Maven layout this would be src/main/resources).
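As a rough sketch, a minimal hive-site.xml usually just points Spark at the Hive metastore. The thrift URI below is a placeholder assumption, not a value given in this article:

<configuration>
    <property>
        <!-- Placeholder metastore URI; replace with your own -->
        <name>hive.metastore.uris</name>
        <value>thrift://localhost:9083</value>
    </property>
</configuration>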
3. Add the dependency
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-hive_2.11</artifactId>
    <version>2.1.0</version>
</dependency>
4. Code
package SparkSQL

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

/**
 * HiveContext usage
 */
object HiveContextApp {
  def main(args: Array[String]): Unit = {
    // 1) Create the context
    val sparkConf = new SparkConf()
      // In production, comment out the line below and pass these settings via spark-submit
      .setAppName("HiveContextApp").setMaster("local[2]")
    val sc = new SparkContext(sparkConf)
    val hiveContext = new HiveContext(sc)
    // 2) Do the processing: query the Hive table "emp"
    hiveContext.table("emp").show()
    // 3) Close the context
    sc.stop()
  }
}
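HiveQL can also be run directly through sql() rather than table(). A minimal self-contained sketch (the emp table and its deptno column are assumptions about the demo data, not defined by this article):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object HiveSqlApp {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("HiveSqlApp").setMaster("local[2]"))
    val hiveContext = new HiveContext(sc)
    // sql() accepts arbitrary HiveQL, not just whole-table lookups
    hiveContext.sql("SELECT deptno, count(*) AS cnt FROM emp GROUP BY deptno").show()
    sc.stop()
  }
}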
III. SparkSession
1. Applicable Spark version: Spark 2.x (SparkSession is the unified entry point introduced in Spark 2.x, subsuming both SQLContext and HiveContext)
2. Code
package SparkSQL
import org.apache.spark.sql.SparkSession
/**
 * SparkSession usage
 */
object SparkSessionApp {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .appName("SparkSessionApp")
      .master("local[2]")
      .getOrCreate()
    val people = spark.read.json("datas/people.json")
    people.show()
    spark.stop()
  }
}
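SparkSession also covers HiveContext's role in Spark 2.x: calling enableHiveSupport() on the builder wires in the Hive metastore. A minimal sketch, assuming hive-site.xml is on the classpath, the spark-hive dependency from section II is present, and the same assumed emp demo table exists:

import org.apache.spark.sql.SparkSession

object SparkSessionHiveApp {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .appName("SparkSessionHiveApp")
      .master("local[2]")
      .enableHiveSupport() // replaces HiveContext in Spark 2.x
      .getOrCreate()
    spark.table("emp").show()
    spark.stop()
  }
}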