Regular Expressions in Scala and Combining Them with Pattern Matching

Regular Expressions

    // Triple-quoted (""") strings are raw literals: backslashes need no escaping
    val regex="""([0-9]+)([a-z]+)""".r
    val numPattern="[0-9]+".r
    val numberPattern="""\s+[0-9]+\s+""".r
  • Note: the .r method converts a string into a regular expression (a scala.util.matching.Regex). Its Scaladoc reads:
  /** You can follow a string with `.r`, turning it into a `Regex`. E.g.
   *
   *  `"""A\w*""".r`   is the regular expression for identifiers starting with `A`.
   */
  def r: Regex = r()
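
To make the benefit of the triple-quoted form concrete, the following minimal sketch builds the same pattern three ways and checks that they are equivalent. The object name `RawStringRegexDemo` and the sample input `"a 42 b"` are hypothetical and chosen only for this illustration.

```scala
import scala.util.matching.Regex

object RawStringRegexDemo {
  // In an ordinary string literal every backslash has to be doubled.
  val escaped: Regex = "\\s+[0-9]+\\s+".r
  // In a triple-quoted literal the backslash is taken literally.
  val raw: Regex = """\s+[0-9]+\s+""".r
  // `.r` is shorthand for constructing a Regex directly.
  val explicit: Regex = new Regex("""\s+[0-9]+\s+""")

  def main(args: Array[String]): Unit = {
    // All three hold the same underlying pattern text.
    println(escaped.pattern.pattern == raw.pattern.pattern)  // true
    println(raw.pattern.pattern == explicit.pattern.pattern) // true
    println(raw.findFirstIn("a 42 b"))                       // Some( 42 )
  }
}
```

The raw form is usually easier to read once a pattern contains several backslashes, which is why the examples in this article use it.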

Pattern Matching 1

    // findAllIn() returns an iterator over all matches
    for(matchString <- numPattern.findAllIn("99345 Scala,22298 Spark"))
      println(matchString)
  • Note: the Scaladoc for findAllIn(…) reads:
  /** Return all non-overlapping matches of this `Regex` in the given character
   *  sequence as a [[scala.util.matching.Regex.MatchIterator]],
   *  which is a special [[scala.collection.Iterator]] that returns the
   *  matched strings but can also be queried for more data about the last match,
   *  such as capturing groups and start position.
   *
   *  A `MatchIterator` can also be converted into an iterator
   *  that returns objects of type [[scala.util.matching.Regex.Match]],
   *  such as is normally returned by `findAllMatchIn`.
   *
   *  Where potential matches overlap, the first possible match is returned,
   *  followed by the next match that follows the input consumed by the
   *  first match:
   *
   *  {{{
   *  val hat = "hat[^a]+".r
   *  val hathaway = "hathatthattthatttt"
   *  val hats = (hat findAllIn hathaway).toList                    // List(hath, hattth)
   *  val pos = (hat findAllMatchIn hathaway map (_.start)).toList  // List(0, 7)
   *  }}}
   *
   *  To return overlapping matches, it is possible to formulate a regular expression
   *  with lookahead (`?=`) that does not consume the overlapping region.
   *
   *  {{{
   *  val madhatter = "(h)(?=(at[^a]+))".r
   *  val madhats = (madhatter findAllMatchIn hathaway map {
   *    case madhatter(x,y) => s"$x$y"
   *  }).toList                                       // List(hath, hatth, hattth, hatttt)
   *  }}}
   *
   *  Attempting to retrieve match information before performing the first match
   *  or after exhausting the iterator results in [[java.lang.IllegalStateException]].
   *  See [[scala.util.matching.Regex.MatchIterator]] for details.
   *
   *  @param source The text to match against.
   *  @return       A [[scala.util.matching.Regex.MatchIterator]] of matched substrings.
   *  @example      {{{for (words <- """\w+""".r findAllIn "A simple example.") yield words}}}
   */
  def findAllIn(source: CharSequence) = new Regex.MatchIterator(source, this, groupNames)
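
The difference between `findAllIn` (matched strings) and `findAllMatchIn` (full `Match` objects) described in the Scaladoc can be sketched as follows. The pattern and input are the ones used in this article; the object name `FindAllDemo` and its helper methods are hypothetical.

```scala
object FindAllDemo {
  val numPattern = "[0-9]+".r
  val text = "99345 Scala,22298 Spark"

  // findAllIn yields only the matched substrings.
  def numbers: List[String] = numPattern.findAllIn(text).toList

  // findAllMatchIn yields Match objects, which also expose
  // details such as the start offset of each match.
  def positions: List[(String, Int)] =
    numPattern.findAllMatchIn(text).map(m => (m.matched, m.start)).toList

  def main(args: Array[String]): Unit = {
    println(numbers)   // List(99345, 22298)
    println(positions) // List((99345,0), (22298,12))
  }
}
```

Because the result is an iterator, it can be traversed only once; converting it to a `List`, as here, is one way to keep the matches around.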

Pattern Matching 2

    // Find the first match
    println(numberPattern.findFirstIn("99ss java, 222 spark,333 hadoop"))
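
`findFirstIn` returns an `Option[String]`, so the call above prints a `Some` wrapping the match, including the surrounding whitespace that `\s+` consumed. The sketch below shows one way to handle both outcomes; the object `FindFirstDemo` and the helper `firstNumber` are hypothetical names introduced for this example.

```scala
object FindFirstDemo {
  val numberPattern = """\s+[0-9]+\s+""".r
  val text = "99ss java, 222 spark,333 hadoop"

  // Some(match) when the pattern occurs, None otherwise.
  // trim drops the whitespace captured by \s+ on both sides.
  def firstNumber(s: String): Option[String] =
    numberPattern.findFirstIn(s).map(_.trim)

  def main(args: Array[String]): Unit = {
    // "99" and "333" are not surrounded by whitespace, so " 222 " is the first match.
    println(numberPattern.findFirstIn(text)) // Some( 222 )
    println(firstNumber(text))               // Some(222)
    println(firstNumber("no digits here"))   // None
  }
}
```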

Pattern Matching 3

    // A regular expression combining digits and letters; each (...) group becomes one extracted variable
    val numitemPattern="""([0-9]+) ([a-z]+)""".r
    val numitemPattern(num, item)="99 hadoop"
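
A `val` definition with a regex extractor binds one variable per capturing group, but it assumes the input matches. As a hedged sketch of what happens otherwise: a non-matching input throws `scala.MatchError` at runtime, and wrapping the extraction in a `match` avoids that. The object `ExtractorDemo` and the helper `parse` are hypothetical; note also that recent Scala 3 compilers may warn about a refutable pattern in a `val` definition.

```scala
import scala.util.Try

object ExtractorDemo {
  val numitemPattern = """([0-9]+) ([a-z]+)""".r

  // Safe alternative: returns None instead of throwing on a mismatch.
  def parse(s: String): Option[(String, String)] = s match {
    case numitemPattern(num, item) => Some((num, item))
    case _                         => None
  }

  def main(args: Array[String]): Unit = {
    // Matching input: both groups are bound.
    val numitemPattern(num, item) = "99 hadoop"
    println(s"$num $item") // 99 hadoop

    // Non-matching input: the val pattern throws scala.MatchError.
    val failed = Try { val numitemPattern(n, i) = "99h hadoop"; (n, i) }
    println(failed.isFailure) // true

    println(parse("99 hadoop"))  // Some((99,hadoop))
    println(parse("99h hadoop")) // None
  }
}
```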

Pattern Matching 4

    // A regular expression combining digits and letters, used as a case pattern
    val numitemPattern="""([0-9]+) ([a-z]+)""".r
    val line="93459 spark"
    line match{
      case numitemPattern(num,blog)=> println(num+"\t"+blog)
      case _=>println("hahaha...")
    }

    // "93459h" is not all digits, so the pattern does not match and the default case runs
    val line="93459h spark"
    line match{
      case numitemPattern(num,blog)=> println(num+"\t"+blog)
      case _=>println("hahaha...")
    }
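
The second input falls through to the default case because a `Regex` used as a `case` pattern must match the entire string. If matching somewhere inside the input is wanted instead, `Regex.unanchored` can be used. The sketch below contrasts the two; the object `AnchoringDemo`, the helper `describe`, and the input `"id 93459 spark!"` are hypothetical.

```scala
object AnchoringDemo {
  val numitemPattern = """([0-9]+) ([a-z]+)""".r
  // Same pattern, but allowed to match anywhere inside the input.
  val anywhere = numitemPattern.unanchored

  def describe(line: String): String = line match {
    case numitemPattern(num, item) => s"whole line: $num $item"
    case anywhere(num, item)       => s"inside line: $num $item"
    case _                         => "no match"
  }

  def main(args: Array[String]): Unit = {
    println(describe("93459 spark"))     // whole line: 93459 spark
    println(describe("id 93459 spark!")) // inside line: 93459 spark
    println(describe("93459h spark"))    // no match
  }
}
```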

Complete Source Code for This Section

package kmust.hjr.learningScala19


/**
 * Created by Administrator on 2015/10/17.
 */
object RegularExpressOps {
  def main(args:Array[String]):Unit={
    val regex="""([0-9]+)([a-z]+)""".r // triple-quoted raw string literal
    val numPattern="[0-9]+".r
    val numberPattern="""\s+[0-9]+\s+""".r

    // findAllIn() returns an iterator over all matches
    for(matchString <- numPattern.findAllIn("99345 Scala,22298 Spark"))
      println(matchString)

    // Find the first match
    println(numberPattern.findFirstIn("99ss java, 222 spark,333 hadoop"))

    // A regular expression combining digits and letters
    val numitemPattern="""([0-9]+) ([a-z]+)""".r

    val numitemPattern(num, item)="99 hadoop"

    val line="93459h spark"
    line match{
      case numitemPattern(num,blog)=> println(num+"\t"+blog)
      case _=>println("hahaha...")
    }
  }
}
