The re Module and Regular Expressions

阿新 • Published: 2018-02-15

Introduction

Extract all the mobile phone numbers from the file below.

Name      City        Height  Weight  Phone
況詠蜜    Beijing     171     48      13651054608
王心顏    Shanghai    169     46      13813234424
馬纖羽    Shenzhen    173     50      13744234523
喬亦菲    Guangzhou   172     52      15823423525
羅夢竹    Beijing     175     49      18623423421
劉諾涵    Beijing     170     48      18623423765
嶽妮妮    Shenzhen    177     54      18835324553
賀婉萱    Shenzhen    174     52      18933434452
葉梓萱    Shanghai    171     49      18042432324
杜姍姍    Beijing     167     49      13324523342

What approach comes to mind?

Probably something like this:

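phones = []

# The file name and the gbk encoding come from the original example.
with open("兼職白領學生空姐模特護士聯系方式.txt", "r", encoding="gbk") as f:
    for line in f:
        fields = line.split()
        if len(fields) != 5:
            continue  # skip blank or malformed lines
        name, city, height, weight, phone = fields
        # A valid number is 11 characters long and starts with 1.
        if phone.startswith("1") and len(phone) == 11:
            phones.append(phone)

print(phones)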

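Is there a simpler way?

Mobile numbers follow a rule: they consist only of digits and are 11 characters long and, to be stricter, they all start with 1. If that rule could be written down as code, we could simply match it against the file contents.

What is this powerful technique called? Regular expressions!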

The re module

A regular expression is a matching rule for strings. Most programming languages support them; in Python the corresponding module is re.
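As a minimal sketch of the idea from the introduction, the rule "11 digits, starting with 1" can be written as a pattern and handed to re.findall. The `text` sample (three rows adapted from the table above) and the pattern `\b1\d{10}\b` are illustrative choices, not part of the original article.

```python
import re

# Hypothetical sample: three rows adapted from the contact table above.
text = """\
況詠蜜    Beijing     171     48      13651054608
王心顏    Shanghai    169     46      13813234424
杜姍姍    Beijing     167     49      13324523342
"""

# A 1 followed by ten more digits; \b keeps longer digit runs from matching.
phone_pattern = re.compile(r"\b1\d{10}\b")

phones = phone_pattern.findall(text)
print(phones)  # ['13651054608', '13813234424', '13324523342']
```

The height and weight columns are skipped automatically because they are far shorter than 11 digits.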

Common pattern syntax

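'.'      By default matches any single character except \n; with the DOTALL flag it matches any character, including a newline
'^'      Matches the start of the string; with the MULTILINE flag it also matches at the start of each line, so re.search(r"^a", "\nabc\neee", flags=re.MULTILINE) finds a match
'$'      Matches the end of the string; with the MULTILINE flag it also matches at the end of each line: re.search('foo.$', 'foo1\nfoo2\n', re.MULTILINE).group() returns 'foo1'
'*'      Matches the preceding item 0 or more times: re.search('a*', 'aaaabac').group() returns 'aaaa'
'+'      Matches the preceding item 1 or more times: re.findall("ab+", "ab+cd+abb+bba") returns ['ab', 'abb']
'?'      Matches the preceding item 0 or 1 times: re.search('b?', 'alex').group() returns '' (b matched 0 times)
'{m}'    Matches the preceding item exactly m times: re.search('b{3}', 'alexbbbs').group() returns 'bbb'
'{n,m}'  Matches the preceding item n to m times: re.findall("ab{1,3}", "abb abc abbcbbb") returns ['abb', 'ab', 'abb']
'|'      Matches the expression on the left or on the right of |: re.search("abc|ABC", "ABCBabcCD").group() returns 'ABC'
'(...)'  Grouping: re.search("(abc){2}a(123|45)", "abcabca456c").group() returns 'abcabca45'


'\A'     Matches only at the start of the string: re.search(r"\Aabc", "alexabc") finds nothing; equivalent to re.match('abc', "alexabc") or to ^ without MULTILINE
'\Z'     Matches only at the very end of the string; similar to $, except that $ also matches just before a trailing newline
'\d'     Matches a digit 0-9
'\D'     Matches a non-digit
'\w'     Matches [A-Za-z0-9_] (for str patterns in Python 3, other Unicode word characters as well unless re.ASCII is set)
'\W'     Matches anything \w does not match
'\s'     Matches whitespace such as a space, \t, \n, \r: re.search(r"\s+", "ab\tc1\n3").group() returns '\t'

'(?P<name>...)'  Named group: re.search("(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})", "371481199306143242").groupdict() returns {'province': '3714', 'city': '81', 'birthday': '1993'}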
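Several of the rules above are easy to check in a short script. The following is only a sketch: the ID-number and quantifier examples are reused from the table, while the string "user_name-01" is made up here to show that \w includes the underscore.

```python
import re

# Quantifiers: + needs at least one b, {1,3} accepts one to three.
plus_hits = re.findall(r"ab+", "ab+cd+abb+bba")
range_hits = re.findall(r"ab{1,3}", "abb abc abbcbbb")
print(plus_hits)   # ['ab', 'abb']
print(range_hits)  # ['abb', 'ab', 'abb']

# Named groups can be read one at a time or all at once as a dict.
id_match = re.search(
    r"(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})",
    "371481199306143242",
)
print(id_match.group("city"))  # 81
print(id_match.groupdict())    # {'province': '3714', 'city': '81', 'birthday': '1993'}

# \w covers letters, digits and the underscore, but not the hyphen.
words = re.findall(r"\w+", "user_name-01")
print(words)  # ['user_name', '01']
```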

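The re module provides the following matching functions

re.match: match from the start of the string
re.search: find the first match anywhere in the string
re.findall: return all matches as a list
re.split: split the string, using the matches as delimiters
re.sub: replace the matches
re.fullmatch: match the entire string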
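The difference between the three single-match functions shows up most clearly when they are run on the same input. A small sketch, using a made-up string "alex123":

```python
import re

sample = "alex123"  # hypothetical input, chosen for illustration

# match only looks at the start of the string, so the digits are not found.
at_start = re.match(r"\d+", sample)
print(at_start)  # None

# search scans forward until the pattern fits somewhere.
anywhere = re.search(r"\d+", sample)
print(anywhere.group())  # 123

# fullmatch succeeds only if the pattern consumes the whole string.
whole = re.fullmatch(r"[a-z]+\d+", sample)
partial = re.fullmatch(r"[a-z]+", sample)
print(whole.group())  # alex123
print(partial)        # None
```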

re.compile(pattern, flags=0)

Compile a regular expression pattern into a regular expression object, which can be used for matching using its match(), search() and other methods, described below.

The sequence

prog = re.compile(pattern)
result = prog.match(string)

is equivalent to

result = re.match(pattern, string)

but using re.compile() and saving the resulting regular expression object for reuse is more efficient when the expression will be used several times in a single program.
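A brief sketch of reusing one compiled pattern across several strings; the `lines` list is invented for the example.

```python
import re

# Compile once, then reuse the pattern object for every line.
digits = re.compile(r"\d+")

lines = ["order 66", "no numbers here", "room 101, floor 3"]  # hypothetical data
found = [digits.findall(line) for line in lines]
print(found)  # [['66'], [], ['101', '3']]
```

The re module also caches recently used patterns internally, so for a handful of patterns the gain from compiling is often more about readability than speed.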

re.match(pattern, string, flags=0)

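Tries to match the pattern at the start of the string and returns a single match object, or None if there is no match.

pattern  the regular expression
string   the string to match against
flags    flags that control how the pattern is matched

import re
obj = re.match(r'\d+', '123uuasf')
if obj:
    print(obj.group())  # 123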

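Flags

re.I (re.IGNORECASE): ignore case (the full name is given in parentheses, likewise below)
re.M (re.MULTILINE): multi-line mode; changes the behavior of '^' and '$'
re.S (re.DOTALL): changes the behavior of '.': makes the '.' special character match any character at all, including a newline; without this flag, '.' will match anything except a newline.
re.X (re.VERBOSE): lets you add whitespace and comments to a pattern to make it more readable; the two patterns below mean the same thing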

a = re.compile(r"""\d + # the integral part
                \. # the decimal point
                \d * # some fractional digits""", 
                re.X)

b = re.compile(r"\d+\.\d*")
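The effect of the first three flags can be seen by running the same patterns with and without them. This is a sketch on a made-up two-line string; flags are combined with the | operator.

```python
import re

text = "Foo1\nfoo2\n"  # hypothetical two-line input

# IGNORECASE: 'foo' also matches 'Foo'.
case_sensitive = re.findall(r"foo\d", text)
case_insensitive = re.findall(r"foo\d", text, re.I)
print(case_sensitive)    # ['foo2']
print(case_insensitive)  # ['Foo1', 'foo2']

# MULTILINE: ^ and $ anchor at every line, not just the whole string.
single_line = re.findall(r"^foo\d$", text, re.I)
multi_line = re.findall(r"^foo\d$", text, re.I | re.M)
print(single_line)  # []
print(multi_line)   # ['Foo1', 'foo2']

# DOTALL: '.' is allowed to match the newline between the two lines.
no_dotall = re.search(r"Foo1.foo2", text)
with_dotall = re.search(r"Foo1.foo2", text, re.S)
print(no_dotall)            # None
print(with_dotall.group())  # the two lines joined by the newline
```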

re.search(pattern, string, flags=0)

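Scans the string for the first place where the pattern matches and returns a single match object, or None if there is no match.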

import re
obj = re.search(r'\d+', 'u123uu888asf')
if obj:
    print(obj.group())  # 123

re.findall(pattern, string, flags=0)

match and search both return a single result, i.e. at most one match in the string. To get every non-overlapping match in the string, use findall.

import re
obj = re.findall(r'\d+', 'fa123uu888asf')
print(obj)  # ['123', '888']
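One detail worth knowing: when the pattern contains capture groups, findall returns the groups instead of the whole match. A short sketch on the same string as above (the grouped pattern is an illustrative addition):

```python
import re

text = "fa123uu888asf"

# No groups: each element is the full match.
numbers = re.findall(r"\d+", text)
print(numbers)  # ['123', '888']

# Two groups: each element is a tuple of (letters, digits).
pairs = re.findall(r"([a-z]+)(\d+)", text)
print(pairs)  # [('fa', '123'), ('uu', '888')]
```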

re.sub(pattern, repl, string, count=0, flags=0)

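Replaces the text matched by the pattern.

>>> re.sub('[a-z]+', 'sb', '武配齊是abc123')
'武配齊是sb123'

>>> re.sub(r'\d+', '|', 'alex22wupeiqi33oldboy55', count=2)
'alex|wupeiqi|oldboy55'

It is more powerful than str.replace, because what gets replaced is described by a pattern rather than a fixed string.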
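Two things re.sub can do that str.replace cannot are reusing captured groups in the replacement and computing the replacement with a function. A minimal sketch; the date string and the doubling rule are invented for illustration.

```python
import re

# Backreferences: \1, \2, \3 refer to the captured groups, here reordered.
reordered = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", "Published: 2018-02-15")
print(reordered)  # Published: 15/02/2018

# Function replacement: the callable receives each match object
# and returns the text to substitute.
doubled = re.sub(
    r"\d+",
    lambda m: str(int(m.group()) * 2),
    "alex22wupeiqi33oldboy55",
)
print(doubled)  # alex44wupeiqi66oldboy110
```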

re.split(pattern, string, maxsplit=0, flags=0)

>>> s = '9-2*5/3+7/3*99/4*2998+10*568/14'
>>> re.split(r'[\*\-\/\+]', s)
['9', '2', '5', '3', '7', '3', '99', '4', '2998', '10', '568', '14']

>>> re.split(r'[\*\-\/\+]', s, maxsplit=3)
['9', '2', '5', '3+7/3*99/4*2998+10*568/14']
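If the delimiters themselves are needed, for instance to keep the operators of an arithmetic expression, wrapping the pattern in a capture group makes re.split include them in the result. A sketch with a shortened expression chosen for the example:

```python
import re

expr = "9-2*5/3"  # shortened, hypothetical expression

# Without a group the operators are dropped.
operands = re.split(r"[*/+\-]", expr)
print(operands)  # ['9', '2', '5', '3']

# With a capture group the operators are kept between the operands.
tokens = re.split(r"([*/+\-])", expr)
print(tokens)  # ['9', '-', '2', '*', '5', '/', '3']
```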

re.fullmatch(pattern, string, flags=0)

Returns a match object if the whole string matches the pattern; otherwise returns None.

re.fullmatch(r'\w+@\w+\.(com|cn|edu)', "[email protected]")
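To see why fullmatch is the right choice for validation, compare it with match on an input that has trailing junk. The addresses below are placeholders invented for this sketch, and the pattern is the simplified one from the article, not a complete email validator.

```python
import re

email_pattern = re.compile(r"\w+@\w+\.(com|cn|edu)")

# Hypothetical addresses used only for illustration.
good = email_pattern.fullmatch("alex@example.com")
wrong_suffix = email_pattern.fullmatch("alex@example.org")
trailing_junk = email_pattern.fullmatch("alex@example.com!!!")
prefix_only = email_pattern.match("alex@example.com!!!")

print(bool(good))           # True
print(bool(wrong_suffix))   # False
print(bool(trailing_junk))  # False: the whole string must match
print(bool(prefix_only))    # True: match only anchors the start
```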

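Exercises:

1. Check whether a mobile phone number is valid

2. Check whether an email address is valid

3. Write a simple Python calculator that handles addition, subtraction, multiplication, division and parenthesis precedence

Given user input such as 1 - 2 * ( (60-30 +(-40/5) * (9-2*5/3 + 7 /3*99/4*2998 +10 * 568/14 )) - (-4*3)/ (16-3*2) ), the program must parse the (), +, -, *, / symbols and the formula by itself (no shortcuts through eval or similar functions) and compute the result, which must agree with the result a real calculator gives.

hint:

re.search(r'\([^()]+\)', s).group()  # returns the first innermost pair of parentheses with its contents

'(-40/5)'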
