
Python: String Parsing with Regular Expressions (re)


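Regular expressions

  A sequence of special characters used to match, search for, and replace text

  Ordinary characters + special characters + quantifiers; the ordinary characters set the boundaries of the match

Approach to modifying strings

  Order of preference: string functions > regular expressions > for loops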
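The order of preference above can be sketched as follows. The sample string and variable names are made up for illustration: a fixed substring is handled by a plain string method, while a variable pattern (any run of digits) is where a regular expression becomes the better fit.

```python
import re

# Hypothetical sample text, used only for illustration.
text = "order 66 shipped, order 7 pending"

# Fixed substring: a plain string method is enough.
fixed = text.replace("order", "Order")

# Variable pattern (any run of digits): use a regular expression.
masked = re.sub(r"\d+", "#", text)

print(fixed)
print(masked)
```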

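Metacharacters: each matches a single character

  # An uppercase metacharacter generally negates its lowercase counterpart

  1. Digits 0~9          \d      negation  \D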

import re

example_str = "Beautiful is better than ugly 78966828 $ \r \r\n ^Explicit is better than implicit"

print(re.findall(r"\d", example_str))  # every digit, one per list item
print(re.findall(r"\D", example_str))  # every non-digit character

  2. Letters, digits, underscore     \w      negation  \W

import re

example_str = "Beautiful is better_ than ugly 78966828 $ \r \r\n ^Explicit is better than implicit"

print(re.findall(r'\w', example_str))  # every letter, digit, and underscore
print(re.findall(r'\W', example_str))  # everything else (spaces, $, ^, \r, \n)

  3. Whitespace characters (space, \t, \r, \n)   \s      negation  \S

import re

example_str = "Beautiful is better_ than ugly 78966828 $ \r \r\n ^Explicit is better than implicit"

print(re.findall(r'\s', example_str))  # every whitespace character
print(re.findall(r'\S', example_str))  # every non-whitespace character

  4. Any one character from a set    []    e.g. 0-9 a-z A-Z  negation  [^]

import re

example_str = "Beautiful is better_ than ugly 78966828 $ \r \r\n ^Explicit is better than implicit"

print(re.findall(r'[0-9]', example_str))   # any character in the range 0-9
print(re.findall(r'[^0-9]', example_str))  # any character NOT in the range 0-9

  5. Any character except \n     .

import re

example_str = "Beautiful is better_ than ugly 78966828 $ \r \r\n ^Explicit is better than implicit"

print(re.findall(r".", example_str))

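Quantifiers: specify how many times the preceding character occurs

  1. Greedy and non-greedy

    a. Matching is greedy by default: it consumes as much as possible and stops only when a character no longer satisfies the pattern (longest match)

    b. Non-greedy matching: append ? to the quantifier to get the shortest match

    c. Confusing greedy and non-greedy matching is a major source of bugs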

import re

example_str = "Beautiful is better_ than ugly 78966828 $ \r \r\n ^Explicit is better than implicit"

print(re.findall(r'.*u', example_str))   # greedy: runs to the last "u" on the line
print(re.findall(r'.*?u', example_str))  # non-greedy: stops at each nearest "u"
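The difference is often easier to see on delimited text. The following is a minimal sketch with a made-up HTML-like string (`html` is an assumption for illustration, not part of the example above):

```python
import re

# Hypothetical sample string with several delimited tags.
html = "<b>bold</b> and <i>italic</i>"

# Greedy: .* runs to the LAST ">" in the string, so there is one long match.
greedy = re.findall(r"<.*>", html)

# Non-greedy: .*? stops at the FIRST ">" it can, so each tag matches separately.
lazy = re.findall(r"<.*?>", html)

print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']
```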

  2. Repeat a specified number of times        {n} {n,m}

import re

example_str = "Beautiful is better_ than ugly 78966828 $ \r \r\n ^Explicit is better than implicit"

print(re.findall(r'\d{3}', example_str))  # non-overlapping groups of exactly 3 digits
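The example above only exercises {n}. As a small sketch of the {n,m} form, using just the digit run from the example string (the variable names are illustrative):

```python
import re

digits = "78966828"

# {3}: exactly three digits per match; the trailing "28" is too short to match.
exact = re.findall(r"\d{3}", digits)

# {2,5}: between two and five digits, greedy, so it takes five first.
ranged = re.findall(r"\d{2,5}", digits)

# {2,5}?: the non-greedy form takes the minimum (two) each time.
ranged_lazy = re.findall(r"\d{2,5}?", digits)

print(exact)        # ['789', '668']
print(ranged)       # ['78966', '828']
print(ranged_lazy)  # ['78', '96', '68', '28']
```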

  3. 0 or more times        *

import re

example_str = "Beautiful is better_ than ugly 78966828 $ \r \r\n ^Explicit is better than implicit"

print(re.findall(r'.*', example_str))  # .* can match nothing, so empty strings also appear in the result

  4. 1 or more times         +

import re

example_str = "Beautiful is better_ than ugly 78966828 $ \r \r\n ^Explicit is better than implicit"

print(re.findall(r'\d+', example_str))  # each run of consecutive digits as one match

  5. 0 or 1 time           ?     typical use: deduplication (the preceding character is optional)

import re

example_str = "Beautiful is better_ than ugly 78966828 $ \r \r\n ^Explicit is better than implicit"

print(re.findall(r'7896?', example_str))  # the final 6 is optional: matches "789" or "7896"
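One common use of ? is to accept two spellings with a single pattern. A minimal sketch, with a made-up sample string:

```python
import re

# Hypothetical sample text with two valid spellings and one invalid one.
text = "color colour colouur"

# "u?" means the "u" may appear zero times or once, but not twice.
spellings = re.findall(r"colou?r", text)

print(spellings)  # ['color', 'colour']
```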

Boundary matching

  1. Match at the start of the string ^

  2. Match at the end of the string $
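A short sketch of both anchors follows. The sample strings are assumptions for illustration. Using ^ and $ together forces the pattern to cover the whole string, which is handy for validation.

```python
import re

# Hypothetical sample text.
text = "Python is fun, I like Python"

# ^ only matches at the very start of the string.
at_start = re.findall(r"^Python", text)
not_at_start = re.findall(r"^fun", text)

# $ only matches at the very end of the string.
at_end = re.findall(r"Python$", text)

# ^...$ together: the whole string must be 4 to 8 digits.
valid = re.findall(r"^\d{4,8}$", "78966828")
too_long = re.findall(r"^\d{4,8}$", "123456789")

print(at_start, not_at_start, at_end)  # ['Python'] [] ['Python']
print(valid, too_long)                 # ['78966828'] []
```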

Alternation (OR) in regular expressions    |

  Matches the regular expression on either the left or the right of |

import re

example_str = "Beautiful is better_ than ugly 78966828 $ \r \r\n ^Explicit is better than implicit"

print(re.findall(r'\d+|\w+', example_str))  # each match is a run of digits or a run of word characters

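  () treats the regular expression inside the parentheses as a single unit and returns only what that inner expression matched; several groups may be used together, and they combine as an AND relationship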
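The behaviour of groups can be sketched as follows. The sample strings and variable names are assumptions for illustration. Note that with one group findall() returns only the group's text, and with several groups it returns tuples.

```python
import re

# Hypothetical sample text.
text = "life is short, I use Python"

# One group: findall returns only the text captured inside ().
between = re.findall(r"life(.*)Python", text)

# Two groups: both must match, and each result is a tuple.
pair = re.findall(r"(\w+) is (\w+)", text)

# A group acts as a single unit for a quantifier: "ab" repeated twice.
repeated = re.search(r"(ab){2}", "ababab").group()

print(between)   # [' is short, I use ']
print(pair)      # [('life', 'short')]
print(repeated)  # abab
```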

Python regular expression module: re

  1. Find every substring that matches a regular expression: findall()

import re
name = "Hello Python 3.7, 123456789"

total = re.findall(r"\d+", name)
print(total)

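  2. Replace the substrings that match a regular expression: sub()

import re

name = "Hello Python 3.7, 123456789"


def replace(value):
    # value is a match object; add 1 to the matched digit and return it as text
    return str(int(value.group()) + 1)


# count=0 (the default) replaces every match
result_str = re.sub(r"\d", replace, name, count=0)
print(result_str)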
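sub() also accepts a plain replacement string, and its count argument limits how many matches are replaced. A minimal sketch reusing the same sample string (the "#" replacement and the variable names are illustrative):

```python
import re

name = "Hello Python 3.7, 123456789"

# Replacement given as a string; every run of digits is replaced.
all_replaced = re.sub(r"\d+", "#", name)

# count=1 replaces only the first match and leaves the rest untouched.
first_only = re.sub(r"\d+", "#", name, count=1)

print(all_replaced)  # Hello Python #.#, #
print(first_only)    # Hello Python #.7, 123456789
```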

Match a single Chinese character [\u4E00-\u9FA5]
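As a small sketch of this character class in use (the sample string is made up for illustration; the range covers the commonly used CJK ideographs, not every Chinese character in Unicode):

```python
import re

# Hypothetical sample text mixing Latin, Chinese, and digits.
text = "Python 很好用 3.7"

# One Chinese character per match.
chars = re.findall(r"[\u4E00-\u9FA5]", text)

# Adding + groups consecutive Chinese characters into one match.
words = re.findall(r"[\u4E00-\u9FA5]+", text)

print(chars)  # ['很', '好', '用']
print(words)  # ['很好用']
```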
