Python3 re (regular expressions)

阿新 • Published: 2019-02-11

#coding=utf-8
# regular.py: regular expressions
import re  # regular expression module

def regular():
    data = "She is more than pretty. 520"

    # --- Patterns ---
    reg = r"mo"  # literal characters => span=(7, 9), match='mo'
    reg = r"."  # (.) any single character except a newline => span=(0, 1), match='S'
    reg = r"\."  # (\) escape character => span=(23, 24), match='.'
    reg = r"[.]"  # ([]) character set (note: most special characters lose their special meaning inside) => span=(23, 24), match='.'
    reg = r"[love]"  # any one of the characters inside [] => span=(2, 3), match='e'
    reg = r"[i-u]"  # (-) range => span=(4, 5), match='i'
    reg = r"t{2}"  # {N} exactly N repetitions (here two t's) => span=(20, 22), match='tt'
    reg = r"t{1,3}"  # {M,N} / {M,} / {,N} / {N} => span=(12, 13), match='t'

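    reg = r"(i|o|u){1}"  # (()) group with alternatives => span=(4, 5), match='i'
    reg = r"^S"  # (^) start of string => span=(0, 1), match='S'
    reg = r"[^S]"  # ([^]) negated set (any character except S) => span=(1, 2), match='h'
    reg = r"520$"  # ($) end of string => span=(25, 28), match='520'
    reg = r"et*"  # (*) zero or more repetitions, same as {0,} => findall: ['e', 'e', 'ett']
    reg = r"et+"  # (+) one or more repetitions, same as {1,} => findall: ['ett']
    reg = r"et?"  # (?) zero or one repetition, same as {0,1} => findall: ['e', 'e', 'et']
    reg = r".+?e"  # (?) after a quantifier: non-greedy mode (greedy r".+e" => span=(0, 20), match='She is more than pre'; non-greedy => span=(0, 3), match='She')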

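    reg = r"\145"  # octal escape for an ASCII code (octal 145 = decimal 101 = 'e') => span=(2, 3), match='e'
    reg = r"\d"  # (\d) a single digit => span=(25, 26), match='5' (ASCII-only alternative: [0-9])
    reg = r"\D"  # (\D) a non-digit => span=(0, 1), match='S' (ASCII-only alternative: [^0-9])
    reg = r"\s"  # (\s) a whitespace character => span=(3, 4), match=' ' (ASCII-only alternative: [ \t\n\r\f\v])
    reg = r"\S"  # (\S) a non-whitespace character => span=(0, 1), match='S' (ASCII-only alternative: [^ \t\n\r\f\v])
    reg = r"\w"  # (\w) a word character => span=(0, 1), match='S' (ASCII-only alternative: [a-zA-Z0-9_])
    reg = r"\W"  # (\W) a non-word character => span=(3, 4), match=' ' (ASCII-only alternative: [^a-zA-Z0-9_])
    reg = r"\AS"  # (\A) start of string => span=(0, 1), match='S'
    reg = r"520\Z"  # (\Z) end of string => span=(25, 28), match='520'
    reg = r"y\b"  # (\b) word boundary => span=(22, 23), match='y'
    reg = r"o\B"  # (\B) not a word boundary => span=(8, 9), match='o'
    reg = r"[01]\d\d|2[0-4]\d|25[0-5]"  # (|) alternation; matches three-digit numbers from 000 to 255 (7 or 42 would need leading zeros) => no match in data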


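    index = re.search(reg, data)  # find the first match anywhere in the string (None for the last reg assigned above)
    index = re.match(r"She", data)  # match at the start of the string => span=(0, 3), match='She'
    index = re.fullmatch(r".+", data)  # match the whole string => span=(0, 28), match='She is more than pretty. 520'

    lists = re.findall(reg, data)  # find all matches as a list ([] for the last reg assigned above)
    lists = re.split(r"o", data, maxsplit=1)  # split the string on a pattern (maxsplit: maximum number of splits) => ['She is m', 're than pretty. 520']

    strs = re.sub(r"\.", r"!", data, count=1)  # replace (count: maximum number of replacements; text that does not match is kept as-is) => She is more than pretty! 520

    re.purge()  # clear the regular expression cache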



    # --- Compiled pattern object ---
    pat = re.compile(r"e")  # compile into a pattern object

    index = pat.search(data)  # find the first match => span=(2, 3), match='e'
    index = pat.search(data, 5)  # start searching at index 5 => span=(10, 11), match='e'
    index = pat.search(data, 1, 10)  # search only within indices 1..10 => span=(2, 3), match='e'
    index = pat.match(data)  # match at the start => None
    index = pat.match(data, 2)  # match starting at index 2 => span=(2, 3), match='e'
    index = pat.match(data, 1, 10)  # data[1] is 'h' => None
    index = pat.fullmatch(data)  # match the whole string => None
    index = pat.fullmatch(data, 2)  # => None
    index = pat.fullmatch(data, 2, 3)  # => span=(2, 3), match='e'

    lists = pat.split(data, maxsplit=0)  # split (maxsplit=0: no limit) => ['Sh', ' is mor', ' than pr', 'tty. 520']
    lists = pat.findall(data)  # find all matches => ['e', 'e', 'e']
    lists = pat.findall(data, 5)  # => ['e', 'e']
    lists = pat.findall(data, 1, 10)  # => ['e']

    strs = pat.sub(r"o", data, count=0)  # replace (count=0: replace all) => Sho is moro than protty. 520


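    # --- Match object ---
    match = index  # result of pat.fullmatch(data, 2, 3) above => span=(2, 3), match='e'
    strs = match.string  # the string that was passed to the matching method => She is more than pretty. 520
    strs = match.group()  # the matched text => e
    pos = match.pos  # start index of the search window => 2
    pos = match.endpos  # end index of the search window => 3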



if __name__ == "__main__":
    regular()
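
The Match section of the listing only reads `string`, `group()`, `pos` and `endpos`. The sketch below is an illustration, not part of the original script. It reuses the same `data` string with a two-group pattern (chosen here only for demonstration) to show `group(n)`, `groups()`, `span()`, `start()` and `end()`. It also shows that passing a start position to a compiled pattern is not the same as slicing the string, because `^` still refers to the real start of the string.

```python
import re

data = "She is more than pretty. 520"

# Illustrative pattern (not from the original listing): the word before
# and the word after " than ".
pat = re.compile(r"(\w+) than (\w+)")
m = pat.search(data)

print(m.group())             # whole match: more than pretty
print(m.group(1))            # first group: more
print(m.groups())            # all groups: ('more', 'pretty')
print(m.span())              # (start, end) of the whole match: (7, 23)
print(m.start(2), m.end(2))  # position of the second group: 17 23

# A start position does not move the "^" anchor; slicing the string does.
anchored = re.compile(r"^e")
with_pos = anchored.search(data, 2)    # None: index 2 is not the real start
with_slice = anchored.search(data[2:])  # matches: the slice starts with 'e'
print(with_pos, with_slice)
```

Note that the span reported for `with_slice` is relative to the sliced string, so it is (0, 1) rather than (2, 3).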

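
The last pattern in the listing, `[01]\d\d|2[0-4]\d|25[0-5]`, only matches three-digit forms such as `042`, so plain `42` is not matched. One possible way to accept 0 to 255 without leading zeros is sketched below. The alternation, the `OCTET` name and the `is_octet` helper are assumptions made for this example and do not come from the original article.

```python
import re

original = re.compile(r"[01]\d\d|2[0-4]\d|25[0-5]")
print(original.search("42"))    # None: two digits are not enough
print(original.search("042"))   # matches '042'

# Hypothetical alternative: 250-255, 200-249, 100-199, then 0-99.
# re.ASCII restricts \d to [0-9].
OCTET = re.compile(r"25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d", re.ASCII)


def is_octet(text):
    """Return True if text is a decimal integer from 0 to 255 with no leading zeros."""
    # fullmatch requires the whole string to match, so "2555" is rejected.
    return OCTET.fullmatch(text) is not None


for sample in ["0", "7", "42", "199", "255", "256", "520", "007"]:
    print(sample, is_octet(sample))
```

Using `fullmatch` instead of `search` is the key design choice here: without it, the pattern would happily match the first digits of a longer number.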