
Spark GraphX Property Graph Operations


package Spark_GraphX

import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD
import org.apache.spark.{SparkConf, SparkContext}

object 屬性圖 {
  def main(args: Array[String]): Unit = {
    val conf=new SparkConf().setAppName("SimpleGraphX").setMaster("local[4]")
    val sc=new SparkContext(conf)
   
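    // Define the vertices as (id, (name, occupation))
    val users: RDD[(VertexId, (String, String))] = sc.parallelize(Array(
      (3L, ("soyo", "student")),
      (7L, ("soyo2", "postdoc")),
      (5L, ("xiaozhou", "professor")),
      (2L, ("xiaocui", "professor"))))
    // Define the edges
    val relationships: RDD[Edge[String]] = sc.parallelize(Array(
      Edge(3L, 7L, "collab"),
      Edge(5L, 3L, "advisor"),
      Edge(2L, 5L, "colleague"),
      Edge(5L, 7L, "parent")))
    // Default user, in case an edge refers to a user that does not exist
    val defaultUser = ("Jone", "Dance")
    val graph = Graph(users, relationships, defaultUser)

    println("*****************")
    println("Vertices whose occupation is student")
    graph.vertices.filter { case (id, (name, occupation)) => occupation == "student" }
      .collect.foreach { case (id, (name, occupation)) => println(s"$name is $occupation") }
    println("--------------------------")
    println("Edges whose attribute is advisor")
    graph.edges.filter(x => x.attr == "advisor").collect()
      .foreach(x => println(s"${x.srcId} to ${x.dstId} has attribute ${x.attr}"))
    println("--------------------------")
    println("Maximum out-degree, in-degree, and degree in the graph")
    println("Max out-degree: " + graph.outDegrees.reduce(max))
    println("Max in-degree: " + graph.inDegrees.reduce(max))
    println("Max degree: " + graph.degrees.reduce(max))
    // Scala can call Java code directly:
    // System.out.print("hello world")

    // Property operators
    println("------------------------")
    println("Append the string Spark to each vertex's occupation")
    // Note: the function passed to mapVertices returns the new vertex attribute.
    // Returning (id, (name, occupation)) nests the id inside the attribute,
    // which is why each printed line shows the id twice.
    graph.mapVertices { case (id, (name, occupation)) => (id, (name, occupation + "Spark")) }
      .vertices.collect.foreach(x => println(s"${x._2._1} is ${x._2._2} : ${x._1} : ${x._2}"))
    println("------------------------")
    println("Set each triplet's edge attribute to source occupation + edge attribute + destination occupation:")
    graph.mapTriplets(x => x.srcAttr._2 + "+" + x.attr + "+" + x.dstAttr._2).edges.collect().foreach(println)
    // This shows that property operators leave the graph structure unchanged.
    graph.mapTriplets(x => x.srcId + x.dstId).edges.collect().foreach(println)

    // Structural operators; each triplet represents an edge together with its two endpoint attributes.
    /*
      reverse returns a new graph with every edge direction reversed. It does not modify
      the vertex or edge properties, and it does not add any edges.
      subgraph takes vertex and edge predicates and returns a new graph that contains only
      the vertices and edges satisfying them. It is commonly used to restrict the graph to
      the vertices and edges of interest, or to remove broken links.
    */
    println("------Structural operators---------")
    // foreach on an RDD runs on the executors; in local mode the lines appear on this
    // console, but in no guaranteed order.
    graph.triplets.map(x => x.srcAttr._1 + " is the " + x.attr + " of " + x.dstAttr._1).foreach(println)
    println("-------Subgraph without the postdoc vertices----------")
    val validGraph = graph.subgraph(vpred = (id, attr) => attr._2 != "postdoc")
    validGraph.vertices.foreach(println)
    validGraph.triplets.map(x => x.srcAttr._1 + " is the " + x.attr + " of " + x.dstAttr._1).foreach(println)
    println("----------Subgraph of professors; print its vertices--------")
    val subGraph = graph.subgraph(vpred = (id, attr) => attr._2 == "professor")
    subGraph.vertices.collect().foreach(x => println(s"${x._2._1} is ${x._2._2}"))
  }

  // VertexId: the vertex; Int: its degree
  def max(a: (VertexId, Int), b: (VertexId, Int)): (VertexId, Int) = {
    if (a._2 > b._2) a else b
  }
}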

Output of the 屬性圖 program:

*****************
Vertices whose occupation is student
soyo is student
--------------------------
Edges whose attribute is advisor
5 to 3 has attribute advisor
--------------------------
Maximum out-degree, in-degree, and degree in the graph
Max out-degree: (5,2)
Max in-degree: (7,2)
Max degree: (5,3)
------------------------
Append the string Spark to each vertex's occupation
5 is (xiaozhou,professorSpark) : 5 : (5,(xiaozhou,professorSpark))
2 is (xiaocui,professorSpark) : 2 : (2,(xiaocui,professorSpark))
3 is (soyo,studentSpark) : 3 : (3,(soyo,studentSpark))
7 is (soyo2,postdocSpark) : 7 : (7,(soyo2,postdocSpark))
------------------------
Set each triplet's edge attribute to source occupation + edge attribute + destination occupation:
Edge(3,7,student+collab+postdoc)
Edge(5,3,professor+advisor+student)
Edge(2,5,professor+colleague+professor)
Edge(5,7,professor+parent+postdoc)
Edge(3,7,10)
Edge(5,3,8)
Edge(2,5,7)
Edge(5,7,12)
------Structural operators---------
xiaozhou is the parent of soyo2
soyo is the collab of soyo2
xiaozhou is the advisor of soyo
xiaocui is the colleague of xiaozhou
-------Subgraph without the postdoc vertices----------
(5,(xiaozhou,professor))
(2,(xiaocui,professor))
(3,(soyo,student))
xiaozhou is the advisor of soyo
xiaocui is the colleague of xiaozhou
----------Subgraph of professors; print its vertices--------
xiaozhou is professor
xiaocui is professor
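The mapVertices lines in the output read "5 is (xiaozhou,professorSpark) : 5 : (5,(xiaozhou,professorSpark))" because the function passed to mapVertices returns the whole new vertex attribute, and the program returns the id together with the (name, occupation) pair. If the intent was only to append a suffix to the occupation, a sketch along the following lines keeps the attribute shape unchanged. MapVerticesSketch and tagOccupation are hypothetical names, and the two-vertex graph in main is a cut-down stand-in for the one in the article.

```scala
import org.apache.spark.graphx._
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical helper object, not part of the original program.
object MapVerticesSketch {
  type UserGraph = Graph[(String, String), String]

  // The function returns only the new attribute; GraphX keeps the vertex id itself.
  def tagOccupation(g: UserGraph, suffix: String): UserGraph =
    g.mapVertices { case (_, (name, occupation)) => (name, occupation + suffix) }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("MapVerticesSketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val users = sc.parallelize(Seq[(VertexId, (String, String))](
      (3L, ("soyo", "student")), (5L, ("xiaozhou", "professor"))))
    val edges = sc.parallelize(Seq(Edge(5L, 3L, "advisor")))
    val tagged = tagOccupation(Graph(users, edges, ("Jone", "Dance")), "Spark")
    // Each vertex is now (id, (name, occupation + "Spark")).
    tagged.vertices.collect().foreach { case (_, (name, occupation)) =>
      println(s"$name is $occupation")
    }
    sc.stop()
  }
}
```

With this form, vertex 3 carries (soyo,studentSpark) rather than (3,(soyo,studentSpark)), so the print statement produces lines such as "soyo is studentSpark".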

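The program finds the maximum degree by passing a hand-written max function to reduce. As an alternative sketch, RDD.max accepts an Ordering, so the comparison can be expressed with Ordering.by on the degree field. MaxDegreeSketch, byDegree, and maxDegree are names invented here; the edge list in main repeats the edges from the article and relies on Graph.fromEdges to create the vertices.

```scala
import org.apache.spark.graphx._
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical helper object, not part of the original program.
object MaxDegreeSketch {
  // Compares (vertexId, degree) pairs by degree only.
  val byDegree: Ordering[(VertexId, Int)] = Ordering.by[(VertexId, Int), Int](_._2)

  // VertexRDD[Int] is an RDD[(VertexId, Int)], so RDD.max applies directly.
  def maxDegree(degrees: VertexRDD[Int]): (VertexId, Int) = degrees.max()(byDegree)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("MaxDegreeSketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val edges = sc.parallelize(Seq(
      Edge(3L, 7L, "collab"), Edge(5L, 3L, "advisor"),
      Edge(2L, 5L, "colleague"), Edge(5L, 7L, "parent")))
    // fromEdges creates every vertex mentioned by an edge with the given default attribute.
    val graph = Graph.fromEdges(edges, "user")
    println("Max out-degree: " + maxDegree(graph.outDegrees))
    println("Max in-degree: " + maxDegree(graph.inDegrees))
    println("Max degree: " + maxDegree(graph.degrees))
    sc.stop()
  }
}
```

When several vertices share the maximum degree, which one is returned depends on the order in which partitions are combined, just as with the reduce-based version. In this graph each maximum is unique.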