
Python Basics, Day 13 [Modules: re, subprocess (unfinished)]


re (continued):

  re is greedy by default.

  Greedy mode: when a pattern can match, each quantifier consumes as much of the string as it can.

import re
s = 'askldlaksdabccccccccasdabcccalsdacbcccacbcccabccc'

res = re.findall(r'abc+', s)     # greedy: c+ takes every consecutive c
print(res)

res = re.findall(r'abc+?', s)    # a ? after the quantifier turns greedy mode off
print(res)

Output:
D:\Python\Python36-32\python.exe E:/Python/DAY-15/3213.py
['abcccccccc', 'abccc', 'abccc']
['abc', 'abc', 'abc']

Process finished with exit code 0
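
The difference matters most with `.*`. The following is a minimal sketch on a made-up string (the `text` value is an assumption for illustration, not something from the notes above): the greedy form runs to the last closing tag, while the non-greedy form stops at the first one.

```python
import re

# Hypothetical sample text, not from the original notes.
text = '<b>one</b> and <b>two</b>'

# Greedy: .* expands as far as possible, so it swallows everything up to the LAST </b>.
greedy = re.findall(r'<b>(.*)</b>', text)

# Non-greedy: .*? expands as little as possible, so it stops at the FIRST </b>.
lazy = re.findall(r'<b>(.*?)</b>', text)

print(greedy)
print(lazy)
```

This is why scraping patterns usually use `.*?` between two fixed markers.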

Commonly used functions in the re module:

re.split(): similar to str.split(), but more powerful because the separator is a regular expression.

import re
s = 'askldlaksdab8ccccc.cccas8dabc8cc.alsdacbcccac.cccab8ccc'

res = re.split(r'\d', s)
print(res)
res = re.split(r'(\d+)', s)    # wrap the pattern in () to keep the separators in the result
print(res)


Output:
D:\Python\Python36-32\python.exe E:/Python/DAY-15/3213.py
['askldlaksdab', 'ccccc.cccas', 'dabc', 'cc.alsdacbcccac.cccab', 'ccc']
['askldlaksdab', '8', 'ccccc.cccas', '8', 'dabc', '8', 'cc.alsdacbcccac.cccab', '8', 'ccc']

Process finished with exit code 0
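
To illustrate the "more powerful" part, here is a small sketch that splits on several kinds of separator at once, which str.split() cannot do in a single call. The `text` value is a made-up example, and `maxsplit` is the standard optional argument of re.split().

```python
import re

# Hypothetical sample text, not from the original notes.
text = 'red, green;blue  yellow'

# One character class covers commas, semicolons and runs of whitespace.
parts = re.split(r'[,;\s]+', text)
print(parts)

# maxsplit limits the number of splits; the rest of the string is left intact.
head_tail = re.split(r'[,;\s]+', text, maxsplit=1)
print(head_tail)
```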

re.sub(): a replace operation, similar to str.replace().

import re
s = 'askldlaksdab8ccccc.cccas8dabc8cc.alsdacbcccac.cccab8ccc'
res = re.sub(r'abc+', '123', s)    # only "abc" in "dabc8" matches; the "ab8" runs do not
print(res)

Output:
D:\Python\Python36-32\python.exe E:/Python/DAY-15/3213.py
askldlaksdab8ccccc.cccas8d1238cc.alsdacbcccac.cccab8ccc

Process finished with exit code 0
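
re.sub() can do a few things str.replace() cannot. The sketch below reuses the same string and shows three standard options: the `count` argument, a backreference (`\1`) in the replacement, and a function as the replacement. The variable names are illustrative only.

```python
import re

s = 'askldlaksdab8ccccc.cccas8dabc8cc.alsdacbcccac.cccab8ccc'

# count=1 replaces only the first match.
first_only = re.sub(r'\d', '#', s, count=1)

# \1 in the replacement refers to whatever group 1 captured.
wrapped = re.sub(r'(\d)', r'[\1]', s)

# A function receives each match object and returns the replacement text.
doubled = re.sub(r'\d', lambda m: str(int(m.group()) * 2), s)

print(first_only)
print(wrapped)
print(doubled)
```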

re.compile(): compiles a pattern into a reusable pattern object.

import re
s = 'askldlaksdab8ccccc.cccas8dabc8cc.alsdacbcccac.cccab8ccc'

obj = re.compile(r'\d+')   # build a pattern object that holds the compiled rule
res = obj.findall(s)       # call its methods to do the matching
print(res)

Output:
D:\Python\Python36-32\python.exe E:/Python/DAY-15/3213.py
['8', '8', '8', '8']

Process finished with exit code 0
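
A compiled pattern object offers the same operations as the module-level functions. As a rough sketch on a made-up string (`sample` is an assumption for illustration), search(), match() and finditer() behave like this:

```python
import re

digits = re.compile(r'\d+')

# Hypothetical sample text, not from the original notes.
sample = 'ab8ccc12d'

# search() scans the whole string and returns the first match object.
first = digits.search(sample)
print(first.group(), first.span())

# match() only succeeds if the pattern matches at the very start of the string.
at_start = digits.match(sample)
print(at_start)

# finditer() yields match objects, so positions are available as well as text.
found = [(m.group(), m.start()) for m in digits.finditer(sample)]
print(found)
```

Compiling once is mainly a convenience when the same pattern is applied many times, for example inside a loop.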

A small crawler for regex practice (scraping xiaohuar.com)

import json
import re

import requests

url = 'http://www.xiaohuar.com/2014.html'    # top-120 ranking page

def req():
    req_str = requests.get(url)
    # print('encoding', req_str.encoding)
    return req_str.text

def run():
    html = req()
    # The page is GBK; turn the Latin-1-decoded text back into raw bytes, then decode as GBK.
    html = html.encode('Latin-1').decode('gbk')
    # print(html)
    # Capture the rank label and the name/school text; re.S lets . match newlines too.
    obj = re.compile(r'<div class="top-title">(.*?)</div>.*?<div class="title">.*?target="_blank">(.*?)</a></span></div>', re.S)
    res = obj.findall(html)
    return res

dic = {}
res = run()
for x in res:
    dic[x[0]] = x[1]
data = json.dumps(dic)       # serialize
# 'w' instead of 'a': appending a second JSON document would make json.load fail.
with open('xiaohua.json', 'w', encoding='utf-8') as f:
    f.write(data)

with open('xiaohua.json', 'r', encoding='utf-8') as f:
    data = json.load(f)   # deserialize
    print(data)
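
The pattern can be tried offline without hitting the site. The sketch below runs it against a hand-written HTML fragment; that fragment, the names in it and the `ranking` variable are assumptions made up for illustration, and the real page markup may well differ. It also shows why re.S is needed: without it, `.` cannot cross the line breaks between the two div elements.

```python
import re

# Hypothetical HTML fragment shaped the way the pattern expects; not copied from the real page.
html = '''
<div class="top-title">TOP 1</div>
<div class="title"><span>
<a href="/p-1.html" target="_blank">Name A, School X</a></span></div>
<div class="top-title">TOP 2</div>
<div class="title"><span>
<a href="/p-2.html" target="_blank">Name B, School Y</a></span></div>
'''

regex = (r'<div class="top-title">(.*?)</div>.*?<div class="title">.*?'
         r'target="_blank">(.*?)</a></span></div>')

# With re.S, . also matches newlines, so .*? can span several lines.
ranking = dict(re.compile(regex, re.S).findall(html))
print(ranking)

# Without re.S the same pattern finds nothing in this multi-line fragment.
without_flag = re.compile(regex).findall(html)
print(without_flag)
```

Since findall() returns a list of 2-tuples here, dict() builds the same mapping as the explicit loop in the crawler.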

subprocess:

   The subprocess module lets a process spawn a new child process, connect to the child's stdin/stdout/stderr through pipes, and obtain the child's return code.

import subprocess

s = subprocess.Popen('dir', shell=True, stdout=subprocess.PIPE)
print(s.stdout.read().decode('gbk'))    # the console output here is GBK-encoded bytes

Output (from a Chinese-locale Windows console):
D:\Python\Python36-32\python.exe E:/Python/DAY-15/3213.py
 驅動器 E 中的卷沒有標簽。
 卷的序列號是 383D-453A

 E:\Python\DAY-15 的目錄

2017/06/27  19:52    <DIR>          .
2017/06/27  19:52    <DIR>          ..
2017/06/27  19:52               338 3213.py
2017/06/27  19:47               778 tmp.py
2017/06/27  19:25             9,146 xiaohua.json
               3 個文件         10,262 字節
               2 個目錄 117,877,260,288 可用字節


Process finished with exit code 0
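
The stdin and return-code parts of the description can be sketched with subprocess.run(), available since Python 3.5. To keep the example independent of the operating system's shell, the child here is the Python interpreter itself (`sys.executable`); that choice and the one-line child program are assumptions for illustration, not part of the notes above.

```python
import subprocess
import sys

# The child reads everything from stdin and prints it back in upper case.
child_code = 'import sys; print(sys.stdin.read().upper())'

result = subprocess.run(
    [sys.executable, '-c', child_code],  # argument list, so no shell is involved
    input=b'hello',                      # bytes written to the child's stdin
    stdout=subprocess.PIPE,              # capture the child's stdout
    stderr=subprocess.PIPE,              # capture the child's stderr
)

print(result.returncode)                 # 0 means the child exited normally
print(result.stdout.decode().strip())
```

run() waits for the child to finish and collects everything into one result object, which is usually simpler than reading from a Popen object's pipes by hand.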
