Python Journey, Chapter 4: Modules and Packages (4.09)

阿新 • Published: 2018-04-09



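I. The shelve module

shelve (worth knowing, rarely needed) is a higher-level wrapper around pickle. When using it, refer only to the file name you passed to shelve.open(); the extra files that some platforms generate automatically alongside it can be ignored.

JSON's intermediate format is a string, so it is written to a file opened in text mode ('w').

Pickle's intermediate format is bytes, so it is written to a file opened in binary mode ('wb').

Of the two, JSON is the more commonly used for serialization.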
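
The difference between the two intermediate formats can be checked directly. The following is a minimal sketch; the dictionary is the same sample data used in this section, and the variable names are only illustrative.

```python
import json
import pickle

info = {'age': 18, 'height': 180, 'weight': 80}

# json.dumps produces a str, which belongs in a file opened with 'w'
json_text = json.dumps(info)

# pickle.dumps produces bytes, which belong in a file opened with 'wb'
pickle_bytes = pickle.dumps(info)

print(type(json_text).__name__)     # str
print(type(pickle_bytes).__name__)  # bytes

# Both formats restore an equal dictionary
restored_from_json = json.loads(json_text)
restored_from_pickle = pickle.loads(pickle_bytes)
print(restored_from_json == info, restored_from_pickle == info)
```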

import shelve

info1 = {'age': 18, 'height': 180, 'weight': 80}
info2 = {'age': 73, 'height': 150, 'weight': 80}

# Write: each key stores one Python object
d = shelve.open('db.shv')
d['egon'] = info1
d['alex'] = info2
d.close()

# Read the objects back
d = shelve.open('db.shv')
print(d['egon'])
print(d['alex'])
d.close()

# To modify a stored object in place, open with writeback=True
d = shelve.open('db.shv', writeback=True)
d['alex']['age'] = 10000
print(d['alex'])
d.close()

# Reopen to confirm that the change was saved
d = shelve.open('db.shv')
print(d['alex'])
d.close()
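
Why writeback=True matters is easiest to see by running the same in-place update with and without it. This is a sketch under a few assumptions: the helper name age_after_update and the file name demo.shv are made up for illustration, and a temporary directory is used so that no real data file is touched.

```python
import os
import shelve
import tempfile


def age_after_update(writeback):
    """Store a nested dict, mutate it in place, reopen, and return the saved age."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'demo.shv')  # hypothetical file name

        with shelve.open(path) as d:
            d['alex'] = {'age': 73}

        # d['alex'] returns a fresh copy loaded from disk; without writeback
        # the mutated copy is thrown away instead of being saved.
        with shelve.open(path, writeback=writeback) as d:
            d['alex']['age'] = 10000

        with shelve.open(path) as d:
            return d['alex']['age']


print(age_after_update(False))  # 73: the change was lost
print(age_after_update(True))   # 10000: the change was written back on close
```

Without writeback, the alternative is to read the object, change it, and assign it back to the key explicitly.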


II. The xml module

XML is a format for organizing data.

Every XML element has three properties: tag, attrib, and text.
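
The three properties can be inspected on a small in-memory document. This is a minimal sketch; the XML string is invented for illustration and is not the a.xml file used in this section.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document
sample = '<country name="Panama"><rank updated="yes">69</rank></country>'

country = ET.fromstring(sample)
rank = country.find('rank')

print(country.tag)     # country
print(country.attrib)  # {'name': 'Panama'}
print(rank.tag)        # rank
print(rank.attrib)     # {'updated': 'yes'}
print(rank.text)       # 69 (always a string)
```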

#==========================================> Query

import xml.etree.ElementTree as ET

tree = ET.parse('a.xml')
root = tree.getroot()

Three ways to look up nodes:

res = root.iter('rank')  # searches the whole tree and yields every match
for item in res:
    print('=' * 50)
    print(item.tag)     # tag name
    print(item.attrib)  # attributes
    print(item.text)    # text content

res = root.find('country')  # searches only the direct children of the current element and stops at the first match
print(res.tag)
print(res.attrib)
print(res.text)

nh = res.find('neighbor')
print(nh.attrib)

cy = root.findall('country')  # also searches only the direct children, but returns every match
print([item.attrib for item in cy])
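
The difference in search depth between iter, find, and findall shows up when the target is a grandchild of the root. The sketch below uses an invented two-country document rather than a.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: rank is a grandchild of the root
sample = """
<data>
    <country name="A"><rank>1</rank></country>
    <country name="B"><rank>2</rank></country>
</data>
""".strip()

root = ET.fromstring(sample)

# iter walks the whole tree, so it reaches the nested rank elements
all_ranks = [e.text for e in root.iter('rank')]
print(all_ranks)  # ['1', '2']

# find looks only at direct children, so it cannot see rank from the root
direct_rank = root.find('rank')
print(direct_rank)  # None

# find returns the first matching child, findall returns all of them
first_country = root.find('country')
all_countries = root.findall('country')
print(first_country.attrib)                # {'name': 'A'}
print([c.attrib for c in all_countries])   # [{'name': 'A'}, {'name': 'B'}]
```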

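#==========================================> Modify

import xml.etree.ElementTree as ET

tree = ET.parse('a.xml')
root = tree.getroot()

for year in root.iter('year'):
    year.text = str(int(year.text) + 10)  # text is a string, so convert before doing arithmetic
    year.attrib = {'updated': 'yes'}      # the tag itself is usually left unchanged

tree.write('a.xml')  # save the changes back to the file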

#==========================================> Add

import xml.etree.ElementTree as ET

tree = ET.parse('a.xml')
root = tree.getroot()

for country in root.iter('country'):
    year = country.find('year')
    if int(year.text) > 2020:
        print(country.attrib)
        ele = ET.Element('egon')      # create a new element
        ele.attrib = {'nb': 'yes'}
        ele.text = 'very handsome'
        country.append(ele)           # add it as the last child of country
        country.remove(year)          # delete the year child

tree.write('b.xml')  # write the result to a new file
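
The same add-and-remove steps can be tried without any files by building the tree from a string and serializing it with ET.tostring. This is a sketch with invented data; the element name note and its attribute are placeholders, and ET.SubElement is used as a shortcut for creating an element and appending it in one call.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document
root = ET.fromstring(
    '<data>'
    '<country name="A"><year>2021</year></country>'
    '<country name="B"><year>2019</year></country>'
    '</data>'
)

for country in root.findall('country'):
    year = country.find('year')
    if int(year.text) > 2020:
        # Create the child and attach it to country in one step
        note = ET.SubElement(country, 'note', {'checked': 'yes'})
        note.text = 'recent'
        country.remove(year)

result = ET.tostring(root, encoding='unicode')
print(result)
```

Only country A passes the year test, so only that element gains a note child and loses its year child.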

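III. The re module (regular expressions)

Regular expressions are used most often in web crawlers. Crawlers can also import other modules to help clean the data, and regular expressions are useful outside crawling as well.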

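import re

print(re.findall(r'\w', r'ab 12\+- *&_'))         # \w: letters, digits, and underscore
print(re.findall(r'\W', r'ab 12\+- *&_'))         # \W: anything \w does not match
print(re.findall(r'\s', 'ab \r1\n2\t\\+- *&_'))   # \s: whitespace characters
print(re.findall(r'\S', 'ab \r1\n2\t\\+- *&_'))   # \S: non-whitespace characters
print(re.findall(r'\d', 'ab \r1\n2\t\\+- *&_'))   # \d: digits
print(re.findall(r'\D', 'ab \r1\n2\t\\+- *&_'))   # \D: non-digits

print(re.findall(r'\w_sb', 'egon alex_sb123123wxx_sb,lxx_sb'))

print(re.findall(r'\Aalex', 'abcalex is salexb'))  # \A: start of the string
print(re.findall(r'\Aalex', 'alex is salexb'))
print(re.findall(r'^alex', 'alex is salexb'))

print(re.findall(r'sb\Z', 'alexsb is sbalexbsb'))  # \Z: end of the string
print(re.findall(r'sb$', 'alexsb is sbalexbsb'))

print(re.findall(r'^ebn$', 'ebn1'))  # ^ebn$ matches only when the whole string is exactly ebn, so this prints []

print(re.findall(r'a\nc', 'a\nc a\tc a1c'))

\t is the tab character; how many columns it occupies depends on the platform or editor.

\A <=> ^     # prefer ^

\Z <=> $     # prefer $

Each pair behaves the same in the examples above.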
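
The pairs stop being interchangeable in two situations that the examples above do not cover: when re.MULTILINE is set, and when the string ends with a newline. The following sketch illustrates both; the sample strings are invented.

```python
import re

text = 'alex one\nalex two'

# Without flags, ^ anchors only at the very start of the string
start_default = re.findall(r'^alex', text)
print(start_default)    # ['alex']

# With re.MULTILINE, ^ also anchors at the start of every line ...
start_multiline = re.findall(r'^alex', text, re.MULTILINE)
print(start_multiline)  # ['alex', 'alex']

# ... while \A still means the start of the whole string
start_absolute = re.findall(r'\Aalex', text, re.MULTILINE)
print(start_absolute)   # ['alex']

# $ also matches just before a trailing newline, \Z does not
end_dollar = re.findall(r'two$', 'alex two\n')
end_absolute = re.findall(r'two\Z', 'alex two\n')
print(end_dollar)       # ['two']
print(end_absolute)     # []
```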

# Repetition:

# .   ?   *   +   {m,n}   .*   .*?

1. . matches any single character except a newline. To make it match newlines as well, pass re.DOTALL.

print(re.findall(r'a.c', 'abc a1c aAc aaaaaca\nc'))
print(re.findall(r'a.c', 'abc a1c aAc aaaaaca\nc', re.DOTALL))

2. ? means the character to its left occurs 0 or 1 times. ? cannot be used on its own.

print(re.findall(r'ab?', 'a ab abb abbb abbbb abbbb'))

3. * means the character to its left occurs 0 or more times.

print(re.findall(r'ab*', 'a ab abb abbb abbbb abbbb a1bbbbbbb'))

4. + means the character to its left occurs 1 or more times.

print(re.findall(r'ab+', 'a ab abb abbb abbbb abbbb a1bbbbbbb'))

5. {m,n} means the character to its left occurs m to n times. Each pair below is equivalent: ? is {0,1}, * is {0,}, and + is {1,}.

print(re.findall(r'ab?', 'a ab abb abbb abbbb abbbb'))
print(re.findall(r'ab{0,1}', 'a ab abb abbb abbbb abbbb'))

print(re.findall(r'ab*', 'a ab abb abbb abbbb abbbb a1bbbbbbb'))
print(re.findall(r'ab{0,}', 'a ab abb abbb abbbb abbbb a1bbbbbbb'))

print(re.findall(r'ab+', 'a ab abb abbb abbbb abbbb a1bbbbbbb'))
print(re.findall(r'ab{1,}', 'a ab abb abbb abbbb abbbb a1bbbbbbb'))

print(re.findall(r'ab{1,3}', 'a ab abb abbb abbbb abbbb a1bbbbbbb'))

6. .* matches any characters, of any length. This is greedy matching: it takes as much as it can.

print(re.findall(r'a.*c', 'ac a123c aaaac a *123)()c asdfasfdsadf'))

7. .*? is non-greedy matching: it takes as little as it can.

print(re.findall(r'a.*?c', 'a123c456c'))
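
Running the greedy and non-greedy forms against the same string makes the difference concrete. A minimal sketch, reusing the sample string from item 7:

```python
import re

text = 'a123c456c'

# Greedy: .* runs to the end and backtracks to the LAST c
greedy = re.findall(r'a.*c', text)
print(greedy)      # ['a123c456c']

# Non-greedy: .*? stops at the FIRST c that lets the pattern succeed
non_greedy = re.findall(r'a.*?c', text)
print(non_greedy)  # ['a123c']
```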

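(): grouping. When the pattern contains a group, findall returns only the text captured by the group.

print(re.findall(r'(alex)_sb', 'alex_sb asdfsafdafdaalex_sb'))

print(re.findall(
    r'href="(.*?)"',
    '<li><a id="blog_nav_sitehome" class="menu" href="http://www.cnblogs.com/">博客園</a></li>'))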
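
What findall returns depends on how many groups the pattern has. The sketch below compares zero, one, and two groups on an invented sample string.

```python
import re

text = 'alex_sb wxx_sb'

# No group: each result is the whole match
no_group = re.findall(r'[a-z]+_sb', text)
print(no_group)    # ['alex_sb', 'wxx_sb']

# One group: each result is the text captured by that group
one_group = re.findall(r'([a-z]+)_sb', text)
print(one_group)   # ['alex', 'wxx']

# Two groups: each result is a tuple with one item per group
two_groups = re.findall(r'([a-z]+)_(sb)', text)
print(two_groups)  # [('alex', 'sb'), ('wxx', 'sb')]
```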

[]: matches exactly one character, taken from the set defined inside the brackets.

Characters written inside [] stand for themselves. Ranges such as 0-9 and a-zA-Z are allowed.

print(re.findall(r'a[0-9][0-9]c', 'a1c a+c a2c a9c a11c a-c acc aAc'))

Inside [], a-b denotes a range, so a - that should match a literal hyphen must be placed first or last in the brackets.

print(re.findall(r'a[-+*]c', 'a1c a+c a2c a9c a*c a11c a-c acc aAc'))

print(re.findall(r'a[a-zA-Z]c', 'a1c a+c a2c a9c a*c a11c a-c acc aAc'))

A ^ at the start of [] negates the set.

print(re.findall(r'a[^a-zA-Z]c', 'a c a1c a+c a2c a9c a*c a11c a-c acc aAc'))

print(re.findall(r'a[^0-9]c', 'a c a1c a+c a2c a9c a*c a11c a-c acc aAc'))

print(re.findall(r'([a-z]+)_sb', 'egon alex_sb123123wxxxxxxxxxxxxx_sb,lxx_sb'))
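
Hyphen placement inside [] changes the meaning of the set. The sketch below, on an invented sample string, compares a leading hyphen, an escaped hyphen in the middle (a third option not mentioned above), and an unescaped hyphen in the middle that silently becomes a range.

```python
import re

text = 'a+c a-c a*c a,c'

# Hyphen first: matched literally
leading = re.findall(r'a[-+*]c', text)
print(leading)         # ['a+c', 'a-c', 'a*c']

# Hyphen in the middle but escaped: also matched literally
escaped = re.findall(r'a[+\-*]c', text)
print(escaped)         # ['a+c', 'a-c', 'a*c']

# Unescaped hyphen in the middle: *-. is the range from * to . in
# character-code order, which happens to include the comma as well
accidental_range = re.findall(r'a[*-.]c', text)
print(accidental_range)  # ['a+c', 'a-c', 'a*c', 'a,c']
```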

|: alternation (or).

print(re.findall(r'compan(ies|y)', 'Too many companies have gone bankrupt, and the next one is my company'))  # returns only the group: ['ies', 'y']

(?: ): a non-capturing group. findall returns the entire match rather than only the text inside the parentheses.

print(re.findall(r'compan(?:ies|y)', 'Too many companies have gone bankrupt, and the next one is my company'))  # ['companies', 'company']

print(re.findall(r'alex|sb', 'alex sb sadfsadfasdfegon alex sb egon'))

Other functions in the re module:

print(re.findall(r'alex|sb', '123123 alex sb sadfsadfasdfegon alex sb egon'))

print(re.search(r'alex|sb', '123213 alex sb sadfsadfasdfegon alex sb egon').group())

print(re.search(r'^alex', '123213 alex sb sadfsadfasdfegon alex sb egon'))  # None

print(re.search(r'^alex', 'alex sb sadfsadfasdfegon alex sb egon').group())

re.search returns only the first match, or None if there is no match. Call .group() on the result to get the matched text; calling .group() on None raises an error.

print(re.match(r'alex', 'alex sb sadfsadfasdfegon alex sb egon').group())

print(re.match(r'alex', '123213 alex sb sadfsadfasdfegon alex sb egon'))  # None

re.match is equivalent to re.search with ^ at the start of the pattern.
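
Because both functions can return None, checking the result before calling .group() avoids the error. The helper below is a hypothetical convenience wrapper, not part of the re module.

```python
import re


def first_match(pattern, text):
    """Return the first matched text, or None when nothing matches."""
    m = re.search(pattern, text)
    return m.group() if m else None


found = first_match(r'alex|sb', '123 alex sb')
missing = first_match(r'^alex', '123 alex sb')
print(found)    # alex
print(missing)  # None

# match only tries at the start of the string, so this is None as well
at_start = re.match(r'alex', '123 alex sb')
print(at_start)  # None
```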

info = 'a:b:c:d'

print(info.split(':'))

print(re.split(':', info))

info = r'get :a.txt\3333/rwx'

print(re.split(r'[ :\\/]', info))  # split on a space, colon, backslash, or slash

Unlike str.split, re.split accepts a regular expression as the separator.

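print('egon is beutifull egon'.replace('egon', 'EGON', 1))

print(re.sub(r'(.*?)(egon)(.*?)(egon)(.*?)', r'\1\2\3EGON\5', '123 egon is beutifull egon 123'))  # replaces only the second egon

print(re.sub(r'(lqz)(.*?)(SB)', r'\3\2\1', 'lqz is SB'))  # swaps the first and last groups

print(re.sub(r'([a-zA-Z]+)([^a-zA-Z]+)([a-zA-Z]+)([^a-zA-Z]+)([a-zA-Z]+)', r'\5\2\3\4\1', 'lqzzzz123+ is SB'))

Unlike str.replace, re.sub accepts a regular expression as the pattern, and the replacement can refer to captured groups with \1, \2, and so on.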
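
re.sub also takes a count argument, like the third argument of str.replace, and the replacement can be a function that receives each match object. A minimal sketch with invented sample strings:

```python
import re

# count limits how many matches are replaced
first_only = re.sub(r'egon', 'EGON', 'egon is egon', count=1)
print(first_only)  # EGON is egon

# Group references reorder the captured pieces
swapped = re.sub(r'(lqz)(.*?)(SB)', r'\3\2\1', 'lqz is SB')
print(swapped)     # SB is lqz

# A function computes each replacement from its match object
doubled = re.sub(r'\d+', lambda m: str(int(m.group()) * 2), 'a1 b20')
print(doubled)     # a2 b40
```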

pattern = re.compile(r'alex')  # compile once, then reuse the pattern object

print(pattern.findall('alex is alex alex'))

print(pattern.findall('alexasdfsadfsadfasdfasdfasfd is alex alex'))
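
A compiled pattern offers the same operations as the module-level functions, as methods that no longer take the pattern argument. The sketch below uses invented sample strings.

```python
import re

pattern = re.compile(r'alex')

# findall, search, and sub are all available on the compiled object
all_hits = pattern.findall('alex is alex alex')
print(all_hits)   # ['alex', 'alex', 'alex']

position = pattern.search('x alex').span()
print(position)   # (2, 6)

replaced = pattern.sub('ALEX', 'alex is alex', count=1)
print(replaced)   # ALEX is alex
```

Compiling is mainly a convenience when the same pattern is applied to many strings.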
