Scala from Beginner to Mastery: Section 13, Higher-Order Functions

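Topics in this section

  1. Introduction to higher-order functions
  2. Commonly used higher-order functions in Scala
  3. SAM conversion
  4. Function currying
  5. Partially applied functions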

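1. Introduction to higher-order functions

Higher-order functions come in two main kinds: functions that take another function as a parameter (a function parameter), and functions whose return value is a function. Both were touched on in Section 5 of this tutorial, Functions and Closures; here is a brief recap:
(1) Function parameters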

// Function parameter: the argument passed to the function is itself a function
// ((Int)=>String)=>String
scala> def convertIntToString(f:(Int)=>String)=f(4)
convertIntToString: (f: Int => String)String

scala> convertIntToString((x:Int)=>x+" s")
res32: String = 4 s

(2) Functions whose return value is a function

// A higher-order function can produce a new function, that is, its return value is a function
// (Double)=>((Double)=>Double)
scala> def multiplyBy(factor:Double)=(x:Double)=>factor*x
multiplyBy: (factor: Double)Double => Double

scala> val x=multiplyBy(10)
x: Double => Double = <function1>

scala> x(50)
res33: Double = 500.0
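The two kinds of higher-order function described above can be combined in one small sketch. The names applyTwice and adder are hypothetical and chosen only for illustration; they do not come from the Scala library.

```scala
object HigherOrderIntro {
  // Takes a function as a parameter: applies f to x, then applies f to the result.
  def applyTwice(f: Int => Int, x: Int): Int = f(f(x))

  // Returns a function: the returned function adds n to its argument.
  def adder(n: Int): Int => Int = (x: Int) => x + n

  def main(args: Array[String]): Unit = {
    val addThree = adder(3)
    println(applyTwice(addThree, 10)) // (10 + 3) + 3
    println(applyTwice(_ * 2, 5))     // (5 * 2) * 2
  }
}
```

Here adder(3) is a value of type Int => Int, so it can be passed straight to applyTwice.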

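Higher-order functions are everywhere in Scala, as the Scala API documentation confirms. The figure below lists the Array methods that take a function as a parameter:
[Figure: Array API methods that take a function as a parameter]
Take the flatMap method as an example; the details of its API are as follows: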

def flatMap[B](f: (A) ⇒ GenTraversableOnce[B]): Array[B]
[use case]
Builds a new collection by applying a function to all elements of this array and using the elements of the resulting collections.

// The following code shows how the function is used
For example:

def getWords(lines: Seq[String]): Seq[String] =
  lines flatMap (line => line split "\\W+")

The type of the resulting collection is guided by the static type of array. This might cause unexpected results sometimes. For example:

// lettersOf will return a Seq[Char] of likely repeated letters, instead of a Set
def lettersOf(words: Seq[String]) = words flatMap (word => word.toSet)

// lettersOf will return a Set[Char], not a Seq
def lettersOf(words: Seq[String]) = words.toSet flatMap (word => word.toSeq)

// xs will be an Iterable[Int]
val xs = Map("a" -> List(11,111), "b" -> List(22,222)).flatMap(_._2)

// ys will be a Map[Int, Int]
val ys = Map("a" -> List(1 -> 11,1 -> 111), "b" -> List(2 -> 22,2 -> 222)).flatMap(_._2)

// The following lines describe the parameters
B        the element type of the returned collection.
// f is a function whose parameter type is A and whose return type is GenTraversableOnce[B]
f        the function to apply to each element.
returns  a new array resulting from applying the given collection-valued function f to each element of this array and concatenating the results.
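The API description above can be tried out with a minimal sketch. The object name FlatMapWords and the value allValues are hypothetical; getWords follows the documentation example, with an explicit toSeq added on the assumption that it keeps the code portable across Scala versions.

```scala
object FlatMapWords {
  // Split each line on runs of non-word characters and concatenate all the pieces.
  def getWords(lines: Seq[String]): Seq[String] =
    lines.flatMap(line => line.split("\\W+").toSeq)

  // Each map entry contributes its list of values; the results are concatenated.
  val allValues: List[Int] =
    Map("a" -> List(11, 111), "b" -> List(22, 222)).flatMap(_._2).toList.sorted

  def main(args: Array[String]): Unit = {
    println(getWords(Seq("hello world", "foo, bar")))
    println(allValues)
  }
}
```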

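2. Commonly used higher-order functions in Scala

1 The map function
Every collection type has a map function. For example, the API of Array's map has the following form: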

def map[B](f: (A) ⇒ B): Array[B]
Purpose: Builds a new collection by applying a function to all elements of this array.
B: the element type of the returned collection.
f: the function to apply to each element.
Returns: a new array resulting from applying the given function f to each element of this array and collecting the results.

// An anonymous function is used here. string * n produces the string repeated n times, a feature of String operations in Scala
scala> Array("spark","hive","hadoop").map((x:String)=>x*2)
res3: Array[String] = Array(sparkspark, hivehive, hadoophadoop)

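// As mentioned in the section on functions and closures, the code above can be simplified further
// Omit the parameter type of the anonymous function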
scala> Array("spark","hive","hadoop").map((x)=>x*2)
res4: Array[String] = Array(sparkspark, hivehive, hadoophadoop)

// With a single parameter, the parentheses can be dropped as well
scala> Array("spark","hive","hadoop").map(x=>x*2)
res5: Array[String] = Array(sparkspark, hivehive, hadoophadoop)

// If the parameter appears only once on the right-hand side, the placeholder syntax can be used
scala> Array("spark","hive","hadoop").map(_*2)
res6: Array[String] = Array(sparkspark, hivehive, hadoophadoop)

With a List:

scala> val list=List("Spark"->1,"hive"->2,"hadoop"->2)
list: List[(String, Int)] = List((Spark,1), (hive,2), (hadoop,2))

// Style 1
scala> list.map(x=>x._1)
res20: List[String] = List(Spark, hive, hadoop)
// Style 2
scala> list.map(_._1)
res21: List[String] = List(Spark, hive, hadoop)

scala> list.map(_._2)
res22: List[Int] = List(1, 2, 2)

With a Map:

// Style 1
scala> Map("spark"->1,"hive"->2,"hadoop"->3).map(_._1)
res23: scala.collection.immutable.Iterable[String] = List(spark, hive, hadoop)

scala> Map("spark"->1,"hive"->2,"hadoop"->3).map(_._2)
res24: scala.collection.immutable.Iterable[Int] = List(1, 2, 3)

// Style 2
scala> Map("spark"->1,"hive"->2,"hadoop"->3).map(x=>x._2)
res25: scala.collection.immutable.Iterable[Int] = List(1, 2, 3)

scala> Map("spark"->1,"hive"->2,"hadoop"->3).map(x=>x._1)
res26: scala.collection.immutable.Iterable[String] = List(spark, hive, hadoop)
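As the transcripts above show, mapping a Map to single values gives a plain Iterable. The following sketch (all names in MapDemo are hypothetical) shows that returning a pair keeps the result a Map, and that a named method can be passed where map expects a function.

```scala
object MapDemo {
  val versions = Map("spark" -> 1, "hive" -> 2, "hadoop" -> 3)

  // Returning a (key, value) pair from the function keeps the result a Map.
  val scaled: Map[String, Int] =
    versions.map { case (name, n) => (name.toUpperCase, n * 10) }

  // Returning a single value yields an Iterable; convert it explicitly if a List is wanted.
  val names: List[String] = versions.map(_._1).toList.sorted

  // A method can be passed directly; the compiler turns it into a function value.
  def twice(s: String): String = s * 2
  val doubled: Array[String] = Array("spark", "hive").map(twice)
}
```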

2 The flatMap function

// Style 1
scala> List(List(1,2,3),List(2,3,4)).flatMap(x=>x)
res40: List[Int] = List(1, 2, 3, 2, 3, 4)

// Style 2
scala> List(List(1,2,3),List(2,3,4)).flatMap(x=>x.map(y=>y))
res41: List[Int] = List(1, 2, 3, 2, 3, 4)
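One way to read the two styles above is that flatMap behaves like map followed by flatten. The sketch below illustrates this; parse is a hypothetical helper introduced only for the example.

```scala
object FlatMapDemo {
  val nested = List(List(1, 2, 3), List(2, 3, 4))

  // flatMap(f) gives the same elements as map(f) followed by flatten.
  val viaFlatMap: List[Int] = nested.flatMap(xs => xs.map(_ * 10))
  val viaFlatten: List[Int] = nested.map(xs => xs.map(_ * 10)).flatten

  // Hypothetical helper: parse a string as an Int, or None if it is not a number.
  def parse(s: String): Option[Int] =
    try Some(s.trim.toInt) catch { case _: NumberFormatException => None }

  // An Option acts like a collection of zero or one element, so failed parses vanish.
  val parsed: List[Int] = List("1", "x", "3").flatMap(s => parse(s))
}
```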

3 The filter function

scala> Array(1,2,4,3,5).filter(_>3)
res48: Array[Int] = Array(4, 5)

scala> List("List","Set","Array").filter(_.length>3)
res49: List[String] = List(List, Array)

scala> Map("List"->3,"Set"->5,"Array"->7).filter(_._2>3)
res50: scala.collection.immutable.Map[String,Int] = Map(Set -> 5, Array -> 7)
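The predicates above are all written inline. As a small sketch (names in FilterDemo are hypothetical), a named predicate can be reused, filterNot keeps the elements that fail the test, and a pattern-matching function gives names to the key and value of a Map entry.

```scala
object FilterDemo {
  // A named predicate can be reused across calls.
  def isLong(s: String): Boolean = s.length > 3

  val names = List("List", "Set", "Array")
  val longNames: List[String] = names.filter(isLong)     // elements that satisfy the predicate
  val shortNames: List[String] = names.filterNot(isLong) // elements that do not

  // For a Map, a pattern-matching function names the key and the value.
  val sizes = Map("List" -> 3, "Set" -> 5, "Array" -> 7)
  val large: Map[String, Int] = sizes.filter { case (_, n) => n > 3 }
}
```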

4 The reduce function

// Style 1
scala> Array(1,2,4,3,5).reduce(_+_)
res51: Int = 15

scala> List("Spark","Hive","Hadoop").reduce(_+_)
res52: String = SparkHiveHadoop

// Style 2: print each pair of operands to show the order of evaluation
scala> Array(1,2,4,3,5).reduce((x:Int,y:Int)=>{println(x,y);x+y})
(1,2)
(3,4)
(7,3)
(10,5)
res60: Int = 15

scala> Array(1,2,4,3,5).reduceLeft((x:Int,y:Int)=>{println(x,y);x+y})
(1,2)
(3,4)
(7,3)
(10,5)
res61: Int = 15

scala> Array(1,2,4,3,5).reduceRight((x:Int,y:Int)=>{println(x,y);x+y})
(3,5)
(4,8)
(2,12)
(1,14)
res62: Int = 15
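With addition the direction of the reduction does not change the result. A non-associative operation such as subtraction makes the difference between reduceLeft and reduceRight visible. The sketch below is illustrative; the object and value names are hypothetical.

```scala
object ReduceDemo {
  val xs = Array(1, 2, 4, 3, 5)

  // reduceLeft groups from the left: ((((1 - 2) - 4) - 3) - 5)
  val left: Int = xs.reduceLeft(_ - _)

  // reduceRight groups from the right: 1 - (2 - (4 - (3 - 5)))
  val right: Int = xs.reduceRight(_ - _)

  // reduce throws on an empty collection; reduceOption returns None instead.
  val empty: Option[Int] = Array.empty[Int].reduceOption(_ + _)
}
```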

5 The fold function

scala> Array(1,2,4,3,5).foldLeft(0)((x:Int,y:Int)=>{println(x,y);x+y})
(0,1)
(1,2)
(3,4)
(7,3)
(10,5)
res66: Int = 15

scala> Array(1,2,4,3,5).foldRight(0)((x:Int,y:Int)=>{println(x,y);x+y})
(5,0)
(3,5)
(4,8)
(2,12)
(1,14)
res67: Int = 15

scala> Array(1,2,4,3,5).foldLeft(0)(_+_)
res68: Int = 15

scala> Array(1,2,4,3,5).foldRight(10)(_+_)
res69: Int = 25

// /: is equivalent to foldLeft
scala> (0 /: Array(1,2,4,3,5)) (_+_)
res70: Int = 15


scala> (0 /: Array(1,2,4,3,5)) ((x:Int,y:Int)=>{println(x,y);x+y})
(0,1)
(1,2)
(3,4)
(7,3)
(10,5)
res72: Int = 15
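Two properties of fold are worth a short sketch: the accumulator may have a different type from the elements, and an empty collection simply yields the initial value. The names below are hypothetical. Note also that, as far as I am aware, the /: operator is deprecated from Scala 2.13 in favour of foldLeft, so the sketch uses the method names.

```scala
object FoldDemo {
  val xs = List(1, 2, 3)

  // foldLeft:  ((0 - 1) - 2) - 3
  val left: Int = xs.foldLeft(0)(_ - _)
  // foldRight: 1 - (2 - (3 - 0))
  val right: Int = xs.foldRight(0)(_ - _)

  // The accumulator type can differ from the element type: count words into a Map.
  val counts: Map[String, Int] =
    List("spark", "hive", "spark").foldLeft(Map.empty[String, Int]) { (acc, word) =>
      acc.updated(word, acc.getOrElse(word, 0) + 1)
    }

  // Unlike reduce, fold is defined for an empty collection and returns the initial value.
  val emptySum: Int = List.empty[Int].foldLeft(0)(_ + _)
}
```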

6 The scan function

// Scan from the left, keeping the result of every step; the results are returned as an array
scala> Array(1,2,4,3,5).scanLeft(0)((x:Int,y:Int)=>{println(x,y);x+y})
(0,1)
(1,2)
(3,4)
(7,3)
(10,5)
res73: Array[Int] = Array(0, 1, 3, 7, 10, 15)

// Scan from the right, keeping the result of every step; the results are returned as an array
scala> Array(1,2,4,3,5).scanRight(0)((x:Int,y:Int)=>{println(x,y);x+y})
(5,0)
(3,5)
(4,8)
(2,12)
(1,14)
res74: Array[Int] = Array(15, 14, 12, 8, 5, 0)
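A scan can be thought of as a fold that keeps every intermediate result. The following sketch (hypothetical names) checks that relationship and shows a running maximum.

```scala
object ScanDemo {
  val xs = List(1, 2, 4, 3, 5)

  // Running totals, starting from the initial value 0.
  val totals: List[Int] = xs.scanLeft(0)(_ + _)

  // The last element of scanLeft equals the result of foldLeft with the same arguments.
  val sameAsFold: Boolean = totals.last == xs.foldLeft(0)(_ + _)

  // Running maximum; tail drops the seed value.
  val runningMax: List[Int] =
    xs.scanLeft(Int.MinValue)((a, b) => math.max(a, b)).tail
}
```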

3. SAM conversion

In Java GUI programming, setting a listener on a button usually involves code like the following (written here in Scala):

import java.awt.event.{ActionEvent, ActionListener}
import javax.swing.JButton

var counter = 0
val button = new JButton("click")
button.addActionListener(new ActionListener {
  override def actionPerformed(event: ActionEvent): Unit = {
    counter += 1
  }
})

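The code above passes addActionListener an anonymous inner class that implements the ActionListener interface. In that code, the part

new ActionListener {
  override def actionPerformed(event: ActionEvent): Unit = {
    // handler body
  }
}

is boilerplate: every implementation of the interface has to repeat it. Because the ActionListener interface has only one method, actionPerformed, it is called a single abstract method (SAM) type. SAM conversion means passing just a function as the argument to addActionListener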

button.addActionListener((event:ActionEvent)=>counter+=1)

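// and providing an implicit conversion (implicit conversions are covered in detail later)
import scala.language.implicitConversions

implicit def makeAction(action: ActionEvent => Unit): ActionListener =
  new ActionListener {
    override def actionPerformed(event: ActionEvent): Unit = action(event)
  }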

This removes a great deal of boilerplate from GUI code and makes it more concise.
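The same idea can be tried without a GUI. The sketch below is illustrative only: Listener, Source, functionToListener and countEvents are hypothetical names standing in for ActionListener, JButton and the implicit conversion above. On Scala 2.12 and later the compiler can also convert a function literal to a single-abstract-method type directly, so the implicit conversion matters mainly on earlier versions.

```scala
import scala.language.implicitConversions

object SamDemo {
  // Hypothetical single-abstract-method interface standing in for ActionListener.
  trait Listener {
    def onEvent(message: String): Unit
  }

  // Hypothetical event source standing in for JButton.
  class Source {
    private var listeners = List.empty[Listener]
    def addListener(l: Listener): Unit = { listeners = l :: listeners }
    def fire(message: String): Unit = listeners.foreach(_.onEvent(message))
  }

  // The implicit conversion wraps a plain function in the interface.
  implicit def functionToListener(f: String => Unit): Listener =
    new Listener {
      override def onEvent(message: String): Unit = f(message)
    }

  // Registers a function as a listener, fires n events, and returns the count seen.
  def countEvents(n: Int): Int = {
    var counter = 0
    val source = new Source
    source.addListener((message: String) => counter += 1)
    (1 to n).foreach(i => source.fire("event " + i))
    counter
  }
}
```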

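4. Function currying

In the section on functions and closures, we defined a function like this:

// multiplyBy returns a function
// whose input is a Double and whose return value is also a Double
scala> def multiplyBy(factor:Double)=(x:Double)=>factor*x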
multiplyBy: (factor: Double)Double => Double

// The returned function is assigned, as a function value, to the variable x
scala> val x=multiplyBy(10)
x: Double => Double = <function1>

// The variable x can now be used directly as a function
scala> x(50)
res33: Double = 500.0   

The code above can also be used like this:

scala> def multiplyBy(factor:Double)=(x:Double)=>factor*x
multiplyBy: (factor: Double)Double => Double

// This is another way of calling a higher-order function
scala> multiplyBy(10)(50)
res77: Double = 500.0

So what is function currying? It simply means defining multiplyBy in the following form:

scala> def multiplyBy(factor:Double)(x:Double)=x*factor
multiplyBy: (factor: Double)(x: Double)Double

That is, the parameters are declared in two separate lists, (factor:Double)(x:Double). The function is called as follows:

// Calling the curried function
scala> multiplyBy(10)(50)
res81: Double = 500.0

// Unlike def multiplyBy(factor:Double)=(x:Double)=>factor*x, however, it cannot be called with a single argument
scala> multiplyBy(10)

<console>:10: error: missing arguments for method multiplyBy;
follow this method with `_' if you want to treat it as a partially applied function
              multiplyBy(10)
                        ^

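The error says that multiplyBy is missing arguments. To call it with a single argument, turn it into a partially applied function by appending an underscore: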

scala> multiplyBy(10)_
res79: Double => Double = <function1>
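Currying can also be applied to function values rather than methods. The sketch below uses Function2.curried and Function.uncurried from the standard library; the object and value names are hypothetical. Unlike the curried method above, a curried function value can be called with a single argument without the trailing underscore.

```scala
object CurryDemo {
  // Uncurried: a single function taking two arguments.
  val multiply: (Double, Double) => Double = (factor, x) => factor * x

  // Curried: a chain of one-argument functions, obtained with Function2.curried.
  val multiplyCurried: Double => Double => Double = multiply.curried

  // Supplying the first argument yields a function that awaits the second.
  val tenTimes: Double => Double = multiplyCurried(10)

  // Function.uncurried converts back to the two-argument form.
  val multiplyAgain: (Double, Double) => Double = Function.uncurried(multiplyCurried)
}
```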

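Next, we look at partially applied functions in more detail.

5. Partially applied functions

In the section on arrays, we saw that the contents of a Scala array can be printed with the foreach method:

scala> Array("Hadoop","Hive","Spark").foreach(x=>println(x))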
Hadoop
Hive
Spark
// The code above is equivalent to the following
scala> def print(x:String)=println(x)
print: (x: String)Unit

scala> Array("Hadoop","Hive","Spark").foreach(print)
Hadoop
Hive
Spark

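So what is a partially applied function? When a function takes several parameters and we do not want to supply all of them at the point of use (say the function has 3 parameters and we supply only 0 to 2 of them), the function we get back is a partially applied function. The partially applied function for the print function above is defined as follows:

// Define the partially applied function of print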
scala> val p=print _
p: String => Unit = <function1>

scala> Array("Hadoop","Hive","Spark").foreach(p)
Hadoop
Hive
Spark

scala> Array("Hadoop","Hive","Spark").foreach(print _)
Hadoop
Hive
Spark

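In the simplified output code above, the underscore _ does not act as a placeholder; it marks the definition of a partially applied function. So far the partially applied function was built from a one-parameter function. Now let us define a function with several input parameters:

// Define a sum function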
scala> def sum(x:Int,y:Int,z:Int)=x+y+z
sum: (x: Int, y: Int, z: Int)Int

// Partially applied function with no arguments specified
scala> val s1=sum _
s1: (Int, Int, Int) => Int = <function3>

scala> s1(1,2,3)
res91: Int = 6

// Partially applied function with two arguments specified
scala> val s2=sum(1,_:Int,3)
s2: Int => Int = <function1>

scala> s2(2)
res92: Int = 6

// Partially applied function with one argument specified
scala> val s3=sum(1,_:Int,_:Int)
s3: (Int, Int) => Int = <function2>

scala> s3(2,3)
res93: Int = 6
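A partially applied function is equivalent to an anonymous function that forwards the missing arguments. The sketch below makes that explicit; log, warn and warnExplicit are hypothetical names introduced only for illustration.

```scala
object PartialDemo {
  def sum(x: Int, y: Int, z: Int): Int = x + y + z

  // Fix the first and last arguments; the placeholder marks the one left open.
  val addFour: Int => Int = sum(1, _: Int, 3)

  // Hypothetical helper: format a log line from a level and a message.
  def log(level: String, message: String): String = "[" + level + "] " + message

  // Fixing the level gives a specialised one-argument function.
  val warn: String => String = log("WARN", _: String)

  // The same function written as an explicit anonymous function.
  val warnExplicit: String => String = message => log("WARN", message)
}
```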

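In the currying section we noted that calling the curried multiplyBy with a single argument does not return a function, as the non-curried version does, but raises an error. To get a function back, define its partially applied function, as follows:

// Define the partially applied function of multiplyBy; the result is a function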
scala> val m=multiplyBy(10)_
m: Double => Double = <function1>

scala> m(50)
res94: Double = 500.0
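A common use of a partially applied curried method is to pass it to a collection method. In the sketch below (hypothetical names apart from multiplyBy), the trailing underscore is not written because the compiler already expects a function type at each of these positions; this is my understanding of how method values are expanded, so treat it as an assumption to verify on your Scala version.

```scala
object CurriedPartialDemo {
  def multiplyBy(factor: Double)(x: Double): Double = x * factor

  // map expects a function, so supplying only the first argument list is enough.
  val scaled: List[Double] = List(1.0, 2.5, 4.0).map(multiplyBy(10))

  // Each partial application is an independent function value and can be composed.
  val twice: Double => Double = multiplyBy(2)
  val thrice: Double => Double = multiplyBy(3)
  val sixTimes: Double => Double = twice.andThen(thrice)
}
```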
