Common Modules: shutil, json, pickle, shelve, xml, configparser

阿新 • Published: 2018-11-11

1. shutil

A high-level module for working with files, directories, and archives


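shutil.copyfileobj(fsrc, fdst[, length])   Copy the contents of one file object into another

import shutil
# Context managers make sure both files are flushed and closed
with open('old.xml', 'r') as fsrc, open('new.xml', 'w') as fdst:
    shutil.copyfileobj(fsrc, fdst)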
 

shutil.copyfile(src, dst)     Copy a file's contents
shutil.copyfile('f1.log', 'f2.log')  # the destination file does not need to exist
 

shutil.copymode(src, dst)    Copy only the permission bits; content, group, and owner are unchanged

shutil.copymode('f1.log', 'f2.log')  # the destination file must already exist
 

shutil.copystat(src, dst)   Copy only the status information: mode bits, atime, mtime, flags

shutil.copystat('f1.log', 'f2.log')  # the destination file must already exist
 

shutil.copy(src, dst)   Copy the file contents and permission bits

import shutil
shutil.copy('f1.log', 'f2.log')


shutil.copy2(src, dst)   Copy the file contents and status information

import shutil

shutil.copy2('f1.log', 'f2.log')
 
shutil.ignore_patterns(*patterns)
shutil.copytree(src, dst, symlinks=False, ignore=None)
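Recursively copy a directory tree

import shutil
# The destination directory must not already exist, and the parent directory of folder2 must be writable.
# ignore lists the glob patterns to exclude from the copy.
shutil.copytree('folder1', 'folder2', ignore=shutil.ignore_patterns('*.pyc', 'tmp*'))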


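import shutil
shutil.copytree('f1', 'f2', symlinks=True, ignore=shutil.ignore_patterns('*.pyc', 'tmp*'))

'''
By default (symlinks=False) a symbolic link is followed and the file it points to is copied, so a new regular file is created.
With symlinks=True the link itself is recreated as a symbolic link in the new tree.
'''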
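
To see what the ignore argument actually filters out, here is a minimal sketch that builds a throwaway tree in a temporary directory. The file names (keep.py, skip.pyc, tmp_cache) are made up for illustration and are not part of the original example.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "folder1"
    dst = Path(tmp) / "folder2"      # must not exist before copytree runs
    src.mkdir()
    (src / "keep.py").write_text("print('hi')\n")
    (src / "skip.pyc").write_text("compiled")
    (src / "tmp_cache").mkdir()
    (src / "tmp_cache" / "junk.txt").write_text("junk")

    # Anything matching *.pyc or tmp* is left out of the copy
    shutil.copytree(src, dst, ignore=shutil.ignore_patterns("*.pyc", "tmp*"))
    copied = sorted(p.name for p in dst.iterdir())

print(copied)
```

Only keep.py should end up in the destination; the patterns are matched against file and directory names at every level of the tree.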

 

shutil.rmtree(path[, ignore_errors[, onerror]])
Recursively delete a directory tree

import shutil

shutil.rmtree('folder1')
 

shutil.move(src, dst)
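Recursively move a file or directory, much like the mv command. On the same filesystem this is simply a rename; otherwise the source is copied to the destination and then removed.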

import shutil

shutil.move('folder1', 'folder3')
 

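shutil.make_archive(base_name, format, ...)

Create an archive file (for example zip or tar) and return its path

base_name: Name of the archive without the format-specific extension; it may include a path. A bare name is saved in the current directory, otherwise the archive is saved at the given path,
    e.g. data_bak        => saved in the current directory
    e.g. /tmp/data_bak   => saved in /tmp/
format:    Archive format: "zip", "tar", "bztar", "gztar"
root_dir:  Directory to archive (defaults to the current directory)
owner:     Owner, defaults to the current user
group:     Group, defaults to the current group
logger:    Used for logging, usually a logging.Logger object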

Archive the files under /data and place the result in the program's current directory
import shutil
ret = shutil.make_archive("data_bak", 'gztar', root_dir='/data')
 

Archive the files under /data and place the result in /tmp/
import shutil
ret = shutil.make_archive("/tmp/data_bak", 'gztar', root_dir='/data') 


Extract directly with shutil
shutil.unpack_archive("1111.zip")
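
The two calls can be combined into a round trip. The sketch below works entirely inside a temporary directory, so the directory and file names (data, a.log, restored) are illustrative assumptions rather than the paths used above.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "data"
    data_dir.mkdir()
    (data_dir / "a.log").write_text("hello")

    # base_name carries no extension; make_archive appends ".tar.gz" for gztar
    archive_path = shutil.make_archive(
        str(Path(tmp) / "data_bak"), "gztar", root_dir=str(data_dir)
    )

    # unpack_archive picks the format from the file extension
    out_dir = Path(tmp) / "restored"
    shutil.unpack_archive(archive_path, str(out_dir))

    archive_name = Path(archive_path).name
    restored_text = (out_dir / "a.log").read_text()

print(archive_name, restored_text)
```

Passing a second argument to unpack_archive keeps the extracted files out of the current working directory, which is where they go when only the archive name is given.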


shutil handles archives by calling the zipfile and tarfile modules

import zipfile

# Compress
z = zipfile.ZipFile('laxi.zip', 'w')
z.write('a.log')
z.write('data.data')
z.close()

# Extract
z = zipfile.ZipFile('laxi.zip', 'r')
z.extractall(path='.')
z.close()

Compressing and extracting with zipfile

import tarfile

# Compress
t = tarfile.open('/tmp/egon.tar', 'w')
t.add('/test1/a.py', arcname='a.bak')  # arcname is the name stored inside the archive
t.add('/test1/b.py', arcname='b.bak')
t.close()


# Extract
t = tarfile.open('/tmp/egon.tar', 'r')
t.extractall('/egon')
t.close()

Compressing and extracting with tarfile

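2. json and pickle

What is serialization? Converting a dictionary, list, or similar object into a string is called serialization.

We saw earlier that the built-in eval can turn a string into a Python object, but eval has limitations. For ordinary data types both json.loads and eval work, but eval fails on special values such as the JSON literals true, false, and null. eval is really meant for evaluating a string expression and returning its value.


eval() is very powerful, but what does it actually do? The official description is that it evaluates a string as a valid expression and returns the result. That power comes at a price, and security is its biggest weakness.
Imagine that what we read from a file is not a data structure but a destructive statement such as "delete files"; the consequences would be disastrous.
Using eval means accepting that risk.
For this reason we do not recommend eval for deserialization (converting a str into a Python data structure).

Why not use eval for deserialization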
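
A small sketch of both points: eval breaks on JSON literals, and json.loads refuses to run code. The malicious string here is a harmless stand-in chosen for illustration (it only asks for the current directory).

```python
import json

text = '{"ok": true, "value": null, "items": [1, 2.5, "x"]}'

# json.loads understands JSON literals and maps them to Python values
data = json.loads(text)

# eval treats true/null as undefined Python names
try:
    eval(text)
    eval_error = None
except NameError as exc:
    eval_error = exc

# json.loads only parses data, so an expression is rejected instead of executed
malicious = '__import__("os").getcwd()'
try:
    json.loads(malicious)
    rejected = False
except json.JSONDecodeError:
    rejected = True

print(data, type(eval_error).__name__, rejected)
```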

Purposes of serialization:

1. Persisting state

2. Exchanging data across platforms

json

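To pass objects between different programming languages, we must serialize them into a standard format such as XML. A better choice is JSON, because JSON is just a string that every language can read and that is easy to save to disk or send over a network. JSON is not only a standard format, it is also generally faster to process than XML and can be read directly in a web page, which is very convenient.

A JSON value uses standard JavaScript object syntax. JSON types map to Python built-in types as follows:

 JSON (JavaScript) type    Python type
    {}                     dict
    []                     list
    "string"               str
    number                 int/float
    true/false             True/False
    null                   None

JS and JSON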

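JSON syntax rules:
The outermost value is usually an object or an array
{} or []
When writing JSON data by hand, you can simply start with {} as the outermost layer
Strings must use double quotes
Values can be nested to any depth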
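
The double-quote rule is the one that trips people up most often. A minimal sketch, with sample data invented for illustration:

```python
import json

valid = '{"name": "egon", "tags": ["a", "b"], "profile": {"age": 18}}'
invalid = "{'name': 'egon'}"   # looks like a Python dict, but is not JSON

parsed = json.loads(valid)

try:
    json.loads(invalid)
    single_quotes_ok = True
except json.JSONDecodeError:
    single_quotes_ok = False

print(parsed["profile"]["age"], single_quotes_ok)
```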

The core functions of the json module are dump, dumps, load, and loads. The versions without the s wrap the file write and read for you.

import json
dic = {'k1':'v1','k2':'v2','k3':'v3'}
str_dic = json.dumps(dic)  # serialize: convert a dict into a string
print(type(str_dic),str_dic)  #<class 'str'> {"k1": "v1", "k2": "v2", "k3": "v3"}
# Note: strings inside the JSON text produced by json are written with ""

dic2 = json.loads(str_dic)  # deserialize: convert the JSON string back into a dict
# Note: strings inside the text passed to json.loads must be written with ""
print(type(dic2),dic2)  #<class 'dict'> {'k1': 'v1', 'k2': 'v2', 'k3': 'v3'}


list_dic = [1,['a','b','c'],3,{'k1':'v1','k2':'v2'}]
str_dic = json.dumps(list_dic) # nested data types work too
print(type(str_dic),str_dic) #<class 'str'> [1, ["a", "b", "c"], 3, {"k1": "v1", "k2": "v2"}]
list_dic2 = json.loads(str_dic)
print(type(list_dic2),list_dic2) #<class 'list'> [1, ['a', 'b', 'c'], 3, {'k1': 'v1', 'k2': 'v2'}]

loads and dumps

import json
f = open('json_file','w')
dic = {'k1':'v1','k2':'v2','k3':'v3'}
json.dump(dic,f)  # dump takes a file handle and writes the dict to the file as a JSON string
f.close()

f = open('json_file')
dic2 = json.load(f)  # load takes a file handle and returns the file's JSON string as a data structure
f.close()
print(type(dic2),dic2)

load and dump

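import json
f = open('file','w',encoding='utf-8')  # an explicit encoding so the non-ASCII text can be written on any platform
json.dump({'國籍':'中國'},f)  # written as \uXXXX escapes
ret = json.dumps({'國籍':'中國'})
f.write(ret+'\n')
json.dump({'國籍':'美國'},f,ensure_ascii=False)  # written as readable characters
ret = json.dumps({'國籍':'美國'},ensure_ascii=False)
f.write(ret+'\n')
f.close()

The ensure_ascii keyword argument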
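
The difference is easier to see without a file in the way. A minimal sketch that compares the two outputs in memory:

```python
import json

data = {'國籍': '中國'}

escaped = json.dumps(data)                       # default: ensure_ascii=True
readable = json.dumps(data, ensure_ascii=False)  # keep the original characters

print(escaped)
print(readable)

# Both forms decode back to the same dict
same = json.loads(escaped) == json.loads(readable) == data
```

Either form is valid JSON; ensure_ascii=False only changes how the text looks on disk, not what it means.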

json.dumps serializes obj to a JSON-formatted str. Its main parameters:
skipkeys: Defaults to False. If a dict key is not a basic Python type (str, int, float, bool, None), a TypeError is raised while skipkeys is False. Set it to True and such keys are skipped instead.
ensure_ascii: Defaults to True, in which case every non-ASCII character is written as a \uXXXX escape. Set ensure_ascii to False when dumping and Chinese text is stored in readable form.
check_circular: If false, the circular reference check for container types is skipped and a circular reference will result in a RecursionError (or worse).
allow_nan: If false, it is a ValueError to serialize out-of-range float values (nan, inf, -inf), in strict compliance with the JSON specification, instead of using the JavaScript equivalents (NaN, Infinity, -Infinity).
indent: Should be a non-negative integer. 0 puts each item on its own line with no indentation, None (the default) gives the most compact single-line form, and any other value breaks lines and indents each level by that many spaces. Output printed this way is called pretty-printed JSON.
separators: A tuple (item_separator, key_separator). The default is (', ', ': ') when indent is None and (',', ': ') otherwise: items are separated by "," and each key is separated from its value by ":". Pass (',', ':') for the most compact output.
default: default(obj) is a function that should return a serializable version of obj or raise TypeError. The default simply raises TypeError.
sort_keys: Sort the output by dict key.
cls: To use a custom JSONEncoder subclass (e.g. one that overrides the .default() method to serialize additional types), specify it with the cls kwarg; otherwise JSONEncoder is used.

Other parameters
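
The default parameter is the usual hook for types json cannot handle by itself. The sketch below is one possible converter; the function name to_jsonable and the sample record are hypothetical and chosen only for illustration.

```python
import json
from datetime import date


def to_jsonable(obj):
    """Return a JSON-friendly stand-in for obj, or raise TypeError."""
    if isinstance(obj, set):
        return sorted(obj)          # a set becomes a sorted list
    if isinstance(obj, date):
        return obj.isoformat()      # a date becomes "YYYY-MM-DD"
    raise TypeError(f"{type(obj).__name__} is not JSON serializable")


record = {"tags": {"b", "a"}, "joined": date(2018, 11, 11)}
text = json.dumps(record, default=to_jsonable, sort_keys=True)
print(text)  # {"joined": "2018-11-11", "tags": ["a", "b"]}
```

json only calls the default function for objects it does not already know how to encode, so ordinary dicts, lists, strings, and numbers never reach it.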

import json
data = {'username':['李華','二愣子'],'sex':'male','age':16}
json_dic2 = json.dumps(data,sort_keys=True,indent=2,separators=(',',':'),ensure_ascii=False)
print(json_dic2)

Formatted JSON output

pickle

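The main functions of the pickle module are dump, load, dumps, and loads.
dump serializes and load deserializes.
The versions without the s wrap the file write and read for you, which is more convenient.

load can be called repeatedly on the same file. Each call reads the next object, and once none are left it raises EOFError: Ran out of input.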
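
Because the number of stored objects is usually not known in advance, a common pattern is to keep calling load until EOFError appears. A minimal sketch using a temporary file and made-up records:

```python
import os
import pickle
import tempfile

records = [{"name": "a"}, [1, 2, 3], "last"]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "userdb.pkl")

    # Append several objects to the same file, one dump per object
    with open(path, "ab") as f:
        for record in records:
            pickle.dump(record, f)

    # Read them back one by one until the file is exhausted
    loaded = []
    with open(path, "rb") as f:
        while True:
            try:
                loaded.append(pickle.load(f))
            except EOFError:
                break

print(loaded)
```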

import pickle
# Data collected when a user registers
name = "高跟"
password = "123"
height = 1.5
hobby = ["吃","喝","賭","飄",{1,2,3}]

# Storing it by hand as text: every value has to be turned into a string first
with open("userdb.txt","wt",encoding="utf-8") as f:
     text = "|".join([name,password,str(height)])
     f.write(text)

# pickle supports almost all Python data types
user = {"name":name,"password":password,"height":height,"hobby":hobby,"test":3}


# Serialization
with open("userdb.pkl","ab") as f:
     userbytes = pickle.dumps(user)
     f.write(userbytes)


# Deserialization
with open("userdb.pkl","rb") as f:
    userbytes = f.read()
    user = pickle.loads(userbytes)
    print(user)
    print(type(user))

# dump serializes straight to a file
with open("userdb.pkl","ab") as f:
    pickle.dump(user,f)

# load deserializes from a file
with open("userdb.pkl","rb") as f:
    user = pickle.load(f)
    print(user)
    print(pickle.load(f))  # the second object appended above
    # Each further pickle.load(f) reads the next object, and raises EOFError once none are left


If we serialize a dictionary or list to JSON and store it in a file, Java or JavaScript code can use it as well. If we serialize with pickle, other languages cannot read it.
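
The trade-off runs in both directions: pickle is Python-only, but it handles types that JSON has no representation for. A minimal sketch with a set, using sample data invented for illustration:

```python
import json
import pickle

user = {"name": "egon", "hobby": {1, 2, 3}}   # a set has no JSON equivalent

try:
    json.dumps(user)
    json_ok = True
except TypeError:
    json_ok = False

# pickle round-trips the same object, set included
restored = pickle.loads(pickle.dumps(user))

print(json_ok, restored)
```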

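3. shelve

shelve is also used for serialization. Unlike pickle, you do not need to worry about file modes: you can treat it like a dictionary and modify the stored data in place without overwriting everything else, whereas with pickle the only way to change data is to rewrite the file in wb mode.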

import shelve
user = {"name":"高根"}
s = shelve.open("userdb.shv")
s["user"] = user
s.close()


s = shelve.open("userdb.shv",writeback=True)
print(s["user"])
s["user"]["age"] = 20
s.close()

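
The writeback=True argument matters whenever a stored value is mutated in place. The sketch below compares the two behaviours in a temporary directory; the user name is sample data for illustration.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "userdb.shv")

    with shelve.open(path) as s:
        s["user"] = {"name": "egon"}

    # Without writeback, s["user"] returns a copy, so the change is lost
    with shelve.open(path) as s:
        s["user"]["age"] = 20
    with shelve.open(path) as s:
        without_writeback = dict(s["user"])

    # With writeback=True, accessed entries are cached and saved on close
    with shelve.open(path, writeback=True) as s:
        s["user"]["age"] = 20
    with shelve.open(path) as s:
        with_writeback = dict(s["user"])

print(without_writeback)
print(with_writeback)
```

An alternative to writeback=True is to read the value, modify it, and assign it back to the key, which avoids caching every accessed entry in memory.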

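4. xml

XML is a format for exchanging data between different languages or programs, much like JSON, although JSON is simpler to use. Before JSON existed XML was the only choice, and to this day the interfaces of many systems at traditional companies, for example in the financial industry, are still mainly XML.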

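from xml.etree import ElementTree

# Parse d.xml
tree = ElementTree.parse("d.xml")
print(tree)

# Get the root element
rootTree = tree.getroot()

# Three ways to look up elements
# 1. iter() searches the whole document; here it finds everyone's age
for item in rootTree.iter("age"):
     # An element has three parts
     print(item.tag) # tag name
     print(item.attrib) # attributes
     print(item.text) # text content

# 2. find() looks among the direct children for an element named age; if there are several, it returns the first
print(rootTree.find("age").attrib)

# 3. findall() returns every direct child named age
print(rootTree.findall("age"))

# Read individual attributes
stu = rootTree.find("stu")
print(stu.get("age"))
print(stu.get("name"))

# Remove a child element
rootTree.remove(stu)

# Add a child element
# The new element has to be created first
newTag = ElementTree.Element("new_tag",{"some_attr":"value"})
rootTree.append(newTag)

# Elements also have set(), which sets an attribute

# Write to a file
tree.write("f.xml",encoding="utf-8")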

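
The contents of d.xml are not shown, so the sketch below assumes a small document with stu elements that carry name and age attributes and contain an age child. It parses that from a string and shows how iter, find, and set behave on it.

```python
import xml.etree.ElementTree as ET

# Assumed structure; the real d.xml may differ
xml_text = """
<students>
    <stu name="egon" age="18">
        <age>18</age>
    </stu>
    <stu name="alex" age="20">
        <age>20</age>
    </stu>
</students>
"""

root = ET.fromstring(xml_text)

# iter() walks the whole tree, so nested age elements are found
all_ages = [element.text for element in root.iter("age")]

# find() only looks at direct children, so this returns None here
direct_age = root.find("age")

# set() creates or overwrites an attribute
first_stu = root.find("stu")
first_stu.set("age", "19")

serialized = ET.tostring(root, encoding="unicode")
print(all_ages, direct_age, first_stu.get("age"))
```

Since find returns None when nothing matches, it is worth checking the result before reading .attrib or .text from it.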

import xml.etree.ElementTree as ET

new_xml = ET.Element("namelist")
name = ET.SubElement(new_xml,"name",attrib={"enrolled":"yes"})
age = ET.SubElement(name,"age",attrib={"checked":"no"})
sex = ET.SubElement(name,"sex")
sex.text = '33'
name2 = ET.SubElement(new_xml,"name",attrib={"enrolled":"no"})
age = ET.SubElement(name2,"age")
age.text = '19'
 
et = ET.ElementTree(new_xml) # build the document object
et.write("test.xml", encoding="utf-8",xml_declaration=True)
 
ET.dump(new_xml) # print the generated XML

Building an XML document yourself

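5. configparser

This module works with configuration files whose format is similar to Windows INI files. A file can contain one or more sections, and each section can hold multiple parameters (key=value).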

[section1]
k1 = v1
k2:v2
user=egon
age=18
is_admin=true
salary=31

[section2]
k1 = v1

Configuration file
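
To experiment without creating a file, the same content can be parsed from a string with read_string. This is a minimal sketch; the option name nope is a made-up example of a missing key.

```python
import configparser

ini_text = """
[section1]
k1 = v1
k2:v2
user=egon
age=18
is_admin=true
salary=31

[section2]
k1 = v1
"""

config = configparser.ConfigParser()
config.read_string(ini_text)

sections = config.sections()
k2 = config.get("section1", "k2")                  # ":" works as a delimiter too
age = config.getint("section1", "age")
salary = config.getfloat("section1", "salary")
is_admin = config.getboolean("section1", "is_admin")
missing = config.get("section1", "nope", fallback="default")

print(sections, k2, age, salary, is_admin, missing)
```

Every value is stored as a string; getint, getfloat, and getboolean convert on the way out, and fallback avoids an exception when an option is absent.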

import configparser

config=configparser.ConfigParser()
config.read('a.cfg')

# List all section names
res=config.sections() #['section1', 'section2']
print(res)

# List the keys of every key=value pair in section1
options=config.options('section1')
print(options) #['k1', 'k2', 'user', 'age', 'is_admin', 'salary']

# List the (key, value) pairs in section1
item_list=config.items('section1')
print(item_list) #[('k1', 'v1'), ('k2', 'v2'), ('user', 'egon'), ('age', '18'), ('is_admin', 'true'), ('salary', '31')]

# Get the value of user in section1 => as a string
val=config.get('section1','user')
print(val) #egon

# Get the value of age in section1 => as an integer
val1=config.getint('section1','age')
print(val1) #18

# Get the value of is_admin in section1 => as a boolean
val2=config.getboolean('section1','is_admin')
print(val2) #True

Reading

import configparser

config=configparser.ConfigParser()
config.read('a.cfg',encoding='utf-8')


# Remove the whole section2
config.remove_section('section2')

# Remove k1 and k2 from section1
config.remove_option('section1','k1')
config.remove_option('section1','k2')

# Check whether a section exists
print(config.has_section('section1'))

# Check whether section1 has a user option
print(config.has_option('section1','user'))


# Add a section
config.add_section('egon')

# Under section egon, add name=egon and age=18
config.set('egon','name','egon')
config.set('egon','age','18') # values must be strings; passing the int 18 raises TypeError


# Finally write the changes back to the file to complete the modification
with open('a.cfg','w',encoding='utf-8') as f:
    config.write(f)

Modifying
