A Detailed Introduction to the re Module

阿新 • Published: 2017-07-19


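\w  Matches a letter, digit, or underscore
\W  Matches any character that is not a letter, digit, or underscore
\s  Matches any whitespace character
\S  Matches any non-whitespace character
\d  Matches any digit; equivalent to [0-9] for ASCII text
\D  Matches any non-digit
\A  Matches only at the start of the string
\Z  Matches only at the end of the string
\n  Matches a newline character
\t  Matches a tab character
^   Matches at the start of the string (and after each newline when re.MULTILINE is set)
$   Matches at the end of the string or just before a trailing newline (and before each newline when re.MULTILINE is set)
.   Matches any character except a newline; with the re.DOTALL flag it matches any character, including a newline
[...] A set of characters, listed individually: [amk] matches 'a', 'm', or 'k'
[^...] Any character not in the set: [^amk] matches any character other than a, m, and k
*   Matches 0 or more repetitions of the preceding expression
+   Matches 1 or more repetitions of the preceding expression
?   Matches 0 or 1 repetition of the preceding expression (greedy); placed after another quantifier, as in *?, +?, or {n,m}?, it makes that quantifier non-greedy
{n}   Matches exactly n repetitions of the preceding expression
{n,m} Matches from n to m repetitions of the preceding expression, greedily
a|b   Matches a or b
()    Groups the enclosed expression and captures what it matched
.*  Greedy by default
.*? Non-greedy: recommended
Summary: keep patterns as simple as possible. In detail:
    Use the generic pattern .* for the parts you do not care about
    Prefer the non-greedy form: .*?
    Use parentheses to capture the target, then read it with group(n)
    If the text contains newlines, pass the re.S flag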
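The two roles of ? are easy to mix up: on its own it is a quantifier meaning "0 or 1", while after another quantifier it switches that quantifier to non-greedy matching. A minimal sketch, using a made-up sample string and illustrative variable names:

```python
import re

sample = 'abbbb'  # hypothetical sample string

greedy = re.findall(r'ab{2,4}', sample)        # takes as many b's as allowed
lazy = re.findall(r'ab{2,4}?', sample)         # takes as few b's as allowed
optional = re.findall(r'ab?', sample)          # ? as a quantifier: 0 or 1 b, greedy
lazy_optional = re.findall(r'ab??', sample)    # ?? prefers 0 b's

print(greedy)         # ['abbbb']
print(lazy)           # ['abb']
print(optional)       # ['ab']
print(lazy_optional)  # ['a']
```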
------------------------------------------------------------------------------------------------------------------
Examples:

# coding=utf-8

import re
----- \w  \W
ret = re.findall(r'\w', 'hello egon 123')
print(ret)
ret = re.findall(r'\W', 'hello egon 123')
print(ret)  # [' ', ' ']

------ \s  \S
ret = re.findall(r'\s', 'hello egon 123')
print(ret)  # [' ', ' ']
ret = re.findall(r'\S', 'hello egon 123')
print(ret)

----- \d  \D
ret = re.findall(r'\d', 'hello egon 123')
print(ret)  # ['1', '2', '3']
ret = re.findall(r'\D', 'hello egon 123')
print(ret)

----- \A  \Z
ret = re.findall(r'\Ah', 'hello egon 123')
print(ret)  # ['h']
ret = re.findall(r'123\Z', 'hello egon 123')
print(ret)  # ['123']

----- \n  \t
ret = re.findall(r'\n', 'hello egon \n123')
print(ret)  # ['\n']
ret = re.findall(r'\t', 'hello egon \t123')
print(ret)  # ['\t']

---- ^  $
print(re.findall(r'^h', 'hello egon 123'))  # ['h']
print(re.findall(r'123$', 'hello egon 123'))  # ['123']
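^ and \A (and likewise $ and \Z) only behave differently once re.MULTILINE is set. A small sketch with a hypothetical two-line string shows the difference:

```python
import re

text = 'hello egon\nhello alex'  # hypothetical two-line sample

# Without flags, ^ only matches at the very start of the string.
default_hits = re.findall(r'^hello', text)

# With re.MULTILINE, ^ also matches right after every newline.
multiline_hits = re.findall(r'^hello', text, re.MULTILINE)

# \A ignores re.MULTILINE and still anchors to the start of the string only.
anchored_hits = re.findall(r'\Ahello', text, re.MULTILINE)

print(default_hits)    # ['hello']
print(multiline_hits)  # ['hello', 'hello']
print(anchored_hits)   # ['hello']
```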

---- .
print(re.findall(r'a.b', 'alb'))  # ['alb']

---- ?
print(re.findall(r'ab?', 'a'))  # ['a']
print(re.findall(r'ab?', 'abbb'))  # ['ab']

Match numbers, including decimals
print(re.findall(r'\d+\.?\d*', 'asdfasdf123as1.13dfa12adsf1asdf3'))
# ['123', '1.13', '12', '1', '3']

---- .* is greedy by default
print(re.findall(r'a.*b', 'a1b22222222b'))  # ['a1b22222222b']

----- .*? is non-greedy: recommended
print(re.findall(r'a(.*?)b', 'a1b22222222b'))  # ['1']

---- +
print(re.findall(r'ab+', 'a'))  # []
print(re.findall(r'ab+', 'abbbb123bbb'))  # ['abbbb']

---- {n,m}
print(re.findall(r'ab{2}', 'abbbb'))  # ['abb']
print(re.findall(r'ab{2,4}', 'abbbb'))  # ['abbbb']
print(re.findall(r'ab{1,}', 'abbbb'))  # ['abbbb']

----- []
print(re.findall(r'a[l*-]b', 'alb a*b a-b'))  # ['alb', 'a*b', 'a-b']; most metacharacters are literal inside [], and an unescaped - should go first or last
print(re.findall(r'a[^1*-]b', 'a1b a*b a-b a=b'))  # ['a=b']; a leading ^ inside [] negates the set
print(re.findall(r'a[a-z]b', 'alb a*b a-b a=b aeb'))  # ['alb', 'aeb']
print(re.findall(r'a[a-zA-Z]b', 'a2b a*b a-b a=b aeb aEb'))  # ['aeb', 'aEb']

------- \
print(re.findall(r'a\\c', r'a\c'))  # ['a\\c']

re_str_patt = "\\\\d\\+"
print(re_str_patt)  # \\d\+
reObj = re.compile(re_str_patt)
print(reObj.findall("\\d+"))  # ['\\d+']

-------- (): grouping
print(re.findall(r'(ab)+123', 'ababab123'))  # ['ab']: the last repetition of the group, i.e. the ab in the trailing ab123
print(re.findall(r'(?:ab)+123', 'ababab123'))  # ['ababab123']: with a capturing group, findall returns the group's content instead of the whole match; (?:...) makes the group non-capturing, so the whole match is returned
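When a pattern has more than one capturing group, findall returns tuples, and finditer is an alternative when the full match is also needed. A brief sketch on a made-up sample string:

```python
import re

text = 'ababab123 ab456'  # hypothetical sample

# Two capturing groups: findall returns a list of tuples, one per match.
pairs = re.findall(r'((?:ab)+)(\d+)', text)

# finditer yields match objects, so the full match is always available
# through group(), whatever groups the pattern contains.
full_matches = [m.group() for m in re.finditer(r'(ab)+\d+', text)]

print(pairs)         # [('ababab', '123'), ('ab', '456')]
print(full_matches)  # ['ababab123', 'ab456']
```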

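--------- |
print(re.findall(r'compan(?:y|ies)', 'Too many companies have gone bankrupt, and the next one is my company'))
# ['companies', 'company']


------ Functions provided by the re module
findall
print(re.findall(r'e', 'alex make love'))  # ['e', 'e', 'e']: returns every non-overlapping match, collected in a list
search
print(re.search(r'e', 'alex make love').group())  # e: stops at the first match and returns a match object; group() returns the matched string. If nothing matches, search returns None, so calling group() on the result would raise AttributeError.
match
print(re.match(r'e', 'alex make love'))  # None: like search, but only matches at the start of the string; search with a leading ^ is equivalent as long as re.MULTILINE is not set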
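The three entry points differ only in where the pattern is allowed to match. The following sketch puts match, search, and fullmatch side by side (the variable names are illustrative):

```python
import re

text = 'alex make love'

at_start = re.match(r'alex', text)       # matches: the pattern is found at position 0
not_at_start = re.match(r'make', text)   # None: 'make' is not at the start
anywhere = re.search(r'make', text)      # matches wherever the pattern first occurs
whole = re.fullmatch(r'alex', text)      # None: the pattern must cover the whole string

print(at_start.group())  # alex
print(not_at_start)      # None
print(anywhere.span())   # (5, 9)
print(whole)             # None
```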
split
print(re.split(r'[ab]', 'abcd'))  # ['', '', 'cd']: the string is split at every 'a' or 'b'; the two empty strings are the text before 'a' and the text between 'a' and 'b'
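Two options of re.split are worth knowing: a capturing group in the pattern keeps the delimiters in the result, and maxsplit limits the number of splits. A short sketch on the same 'abcd' string:

```python
import re

# A capturing group in the pattern keeps the delimiters in the result.
with_delims = re.split(r'([ab])', 'abcd')

# maxsplit limits how many splits are performed.
first_only = re.split(r'[ab]', 'abcd', maxsplit=1)

print(with_delims)  # ['', 'a', '', 'b', 'cd']
print(first_only)   # ['', 'bcd']
```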
sub
print(re.sub(r'a', 'A', 'alex make love'))  # Alex mAke love

print(re.sub(r'a', 'A', 'alex make love', count=1))  # Alex make love

print(re.sub(r'a', 'A', 'alex make love', count=2))  # Alex mAke love

print(re.sub(r'^(\w+)(.*?\s)(\w+)(.*?\s)(\w+)(.*?)$', r'\5\2\3\4\1', 'alex make love'))  # love make alex

print(re.subn(r'a', 'A', 'alex make love'))  # ('Alex mAke love', 2): the result also carries the number of replacements
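The replacement passed to re.sub can also be a function, or a template that refers to named groups with \g<name>. A minimal sketch; the helper capitalize_word and the group names first and second are made up for illustration:

```python
import re

def capitalize_word(match):
    # Called once per match; the return value replaces the matched text.
    word = match.group()
    return word[0].upper() + word[1:]

titled = re.sub(r'\w+', capitalize_word, 'alex make love')

# Named groups can be referenced in the replacement template with \g<name>.
swapped = re.sub(r'(?P<first>\w+) (?P<second>\w+)', r'\g<second> \g<first>', 'alex make')

print(titled)   # Alex Make Love
print(swapped)  # make alex
```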

compile
obj = re.compile(r'\d{2}')
s = 'abc123eeee'
print(obj.findall(s))  # ['12']
print(obj.search(s).group())  # 12
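re.compile also accepts flags. As one possible use, re.VERBOSE lets a longer pattern be laid out over several lines with comments. A sketch, where the pattern and the sample string are assumptions chosen for illustration:

```python
import re

# With re.VERBOSE, whitespace in the pattern is ignored and # starts a comment.
number = re.compile(r'''
    \d+         # integer part
    (?:\.\d+)?  # optional fractional part
''', re.VERBOSE)

sample = 'abc123eeee4.5'  # hypothetical sample
found = number.findall(sample)
first = number.search(sample).group()

print(found)  # ['123', '4.5']
print(first)  # 123
```

The compiled object can be reused across many calls, which keeps the pattern definition in one place.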

-------- Additional notes

print(re.findall(r'<(?P<tag_name>\w+)>\w+</(?P=tag_name)>', '<h1>hello</h1>'))  # ['h1']

print(re.search(r'<(?P<tag_name>\w+)>\w+</(?P=tag_name)>', '<h1>hello</h1>').group())  # <h1>hello</h1>

print(re.search(r'<(\w+)>\w+</(\w+)>', '<h1>hello</h1>').group())  # <h1>hello</h1>
print(re.search(r'<(\w+)>\w+</\1>', '<h1>hello</h1>').group())  # <h1>hello</h1>
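Named groups can be read back by name, and the backreference (?P=tag_name) rejects mismatched tags. A small sketch; the second group name, body, is added here only for illustration:

```python
import re

pattern = r'<(?P<tag_name>\w+)>(?P<body>\w+)</(?P=tag_name)>'

m = re.search(pattern, '<h1>hello</h1>')
tag = m.group('tag_name')   # read a group by name
groups = m.groupdict()      # all named groups as a dict

# The closing tag differs, so the backreference fails and there is no match.
mismatch = re.search(pattern, '<h1>hello</h2>')

print(tag)       # h1
print(groups)    # {'tag_name': 'h1', 'body': 'hello'}
print(mismatch)  # None
```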

print(re.findall(r'-?\d+\.\d*|(-?\d+)', '1-2*(60+(-40.35/5)-(-4*3))'))
# Finds all integers: ['1', '-2', '60', '', '5', '-4', '3']; the empty string marks where the decimal -40.35 matched the first alternative, which has no group
print(re.findall(r'-?\d+\.?\d*', '1-2*(60+(-40.35/5)-(-4*3))'))  # ['1', '-2', '60', '-40.35', '5', '-4', '3']

expression = '1-2*((60+2*(-3-40.0/5)*(9-2*5/3+7/3*99/4*2998+10*568/14))-(-4*3)/(16-3*2))'
print(re.search(r'\(([\+\-\*/]\d+\.?\d*)+\)', expression).group())  # (-3-40.0/5)
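If the goal is to locate every innermost pair of parentheses rather than only the first one that starts with an operator, a simpler pattern is one option. This is a sketch of an alternative, not the pattern used above:

```python
import re

expression = '1-2*((60+2*(-3-40.0/5)*(9-2*5/3+7/3*99/4*2998+10*568/14))-(-4*3)/(16-3*2))'

# \( ... \) with no parentheses in between can only be an innermost pair.
innermost = re.findall(r'\([^()]+\)', expression)

print(innermost)
# ['(-3-40.0/5)', '(9-2*5/3+7/3*99/4*2998+10*568/14)', '(-4*3)', '(16-3*2)']
```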

The most basic match
content = 'Hello 123 456 World_This is a Regex Demo'
res = re.match(r'Hello\s\d\d\d\s\d{3}\s\w{10}.*Demo', content)
print(res)
print(res.group())
print(res.span())

Generic match
content = 'Hello 123 456 World_This is a Regex Demo'
res = re.match(r'^Hello.*', content)
print(res.group())

Capturing a target to extract specific data

content = 'Hello 123 456 World_This is a Regex Demo'
res = re.match(r'^Hello\s(\d+)\s(\d+)\s.*Demo', content)
print(res.group())   # the whole match
print(res.group(1))  # the content of the first pair of parentheses
print(res.group(2))  # the content of the second pair of parentheses

Greedy matching: .* matches as many characters as possible
content = 'Hello 123 456 World_This is a Regex Demo'
res = re.match(r'^He.*(\d+).*Demo$', content)
print(res.group(1))  # 6: .* consumes as much as it can, leaving only the single digit that \d+ requires

Non-greedy matching: .*? matches as few characters as possible
content = 'Hello 123 456 World_This is a Regex Demo'
res = re.match(r'^He.*?(\d+).*Demo$', content)
print(res.group(1))  # 123

Match flags: by default . does not match a newline
content = '''Hello 123 456 World_This
is a Regex Demo
'''
res = re.match(r'He.*?(\d+).*?Demo$', content)
print(res)  # None
res = re.match(r'He.*?(\d+).*?Demo$', content, re.S)
print(res.group(1))  # 123
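re.S is a short alias for re.DOTALL. The same behaviour can be switched on inside the pattern with the inline flag (?s), and several flags can be combined with |. A brief sketch reusing the multi-line content from above:

```python
import re

content = 'Hello 123 456 World_This\nis a Regex Demo\n'

same_flag = (re.S == re.DOTALL)  # the two names refer to the same flag

# Inline flag: (?s) at the start of the pattern has the same effect as re.S.
inline = re.match(r'(?s)He.*?(\d+).*?Demo$', content)

# Flags combine with |, here DOTALL plus case-insensitive matching.
combined = re.match(r'he.*?(\d+).*?demo$', content, re.S | re.I)

print(same_flag)          # True
print(inline.group(1))    # 123
print(combined.group(1))  # 123
```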

Escaping: \
content = 'price is $5.00'
res = re.match('price is $5.00', content)
print(res)  # None: unescaped, $ is an end-of-string anchor rather than a literal dollar sign

res = re.match(r'price is \$5\.00', content)
print(res.group())  # price is $5.00
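When the text to match is only known at run time, re.escape can add the backslashes instead of writing them by hand. A minimal sketch, assuming the literal text arrives in a variable named literal:

```python
import re

content = 'price is $5.00'
literal = 'price is $5.00'  # hypothetical text that must be matched literally

# re.escape backslash-escapes every character that is special in a pattern.
pattern = re.escape(literal)
res = re.match(pattern, content)

print(res.group())  # price is $5.00
```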
