
Python day 3, basics 1: sets, file operations, character encoding and transcoding


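Sets

A set is an unordered collection of unique items. Its main uses are:

  • Deduplication: converting a list to a set removes duplicates automatically
  • Relationship testing: checking the intersection, difference, union and similar relationships between two groups of data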
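As a minimal sketch of the deduplication use, the list below is hypothetical sample data. Because a set has no defined order, the sketch also shows `dict.fromkeys`, which is one way to drop duplicates while keeping first-seen order.

```python
# Hypothetical sample data: host names collected from two sources.
names = ["web1", "db1", "web1", "cache1", "db1"]

unique_names = set(names)            # duplicates collapse automatically
print(len(names), len(unique_names))

# A set has no defined order; sort it when a stable order is needed.
print(sorted(unique_names))

# dict.fromkeys keeps first-seen order (Python 3.7+) when order matters.
ordered_unique = list(dict.fromkeys(names))
print(ordered_unique)
```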

Common operations

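s = set([3, 5, 9, 10])   # create a set of numbers

t = set("Hello")         # create a set of unique characters: 'H', 'e', 'l', 'o'


a = t | s          # union of t and s

b = t & s          # intersection of t and s

c = t - s          # difference (items in t but not in s)

d = t ^ s          # symmetric difference (items in t or s, but not in both)

Basic operations:

t.add('x')                # add one item
s.update([10, 37, 42])    # add several items to s

Use remove() to delete one item:

t.remove('H')

len(s)
    the number of items in the set

x in s
    test whether x is a member of s

x not in s
    test whether x is not a member of s

s.issubset(t)
s <= t
    test whether every element of s is in t

s.issuperset(t)
s >= t
    test whether every element of t is in s

s.union(t)
s | t
    return a new set containing every element of s and t

s.intersection(t)
s & t
    return a new set containing the elements common to s and t

s.difference(t)
s - t
    return a new set containing the elements that are in s but not in t

s.symmetric_difference(t)
s ^ t
    return a new set containing the elements that are in s or t but not in both

s.copy()
    return a shallow copy of the set s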
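The operators and methods listed above can be tried on two overlapping sets. This is only an illustrative sketch: the class rosters and names are made up, and `discard()` is shown next to `remove()` because the two differ when the item is missing.

```python
# Hypothetical rosters for two courses.
python_class = {"alice", "bob", "carol"}
linux_class = {"bob", "carol", "dave"}

both = python_class & linux_class          # intersection
either = python_class | linux_class        # union
only_python = python_class - linux_class   # difference
exactly_one = python_class ^ linux_class   # symmetric difference

# Each operator has a method twin that returns the same result.
same_result = both == python_class.intersection(linux_class)

# Methods accept any iterable; operators need a set on both sides.
with_list = python_class.union(["erin"])

# remove() raises KeyError for a missing item; discard() stays silent.
python_class.discard("zoe")
try:
    python_class.remove("zoe")
    remove_raised = False
except KeyError:
    remove_raised = True

print(sorted(both), sorted(only_python), sorted(exactly_one))
```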

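File operations

Workflow for working with a file

  1. Open the file, get a file handle and assign it to a variable
  2. Operate on the file through the handle
  3. Close the file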
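The three steps above can be sketched as follows. The file name and the temporary directory are assumptions made so the example does not touch real data; the `try`/`finally` is one way to make sure step 3 runs even if step 2 fails.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

f = open(path, "w", encoding="utf-8")   # 1. open and keep the handle
try:
    f.write("hello\n")                  # 2. operate through the handle
finally:
    f.close()                           # 3. close, even if step 2 raised

print(f.closed)
```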

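Suppose we have the following file, named lyrics (the leading numbers are display line numbers, not part of the file):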

 1 Somehow, it seems the love I knew was always the most destructive kind
 2 不知為何,我經歷的愛情總是最具毀滅性的的那種
 3 Yesterday when I was young
 4 昨日當我年少輕狂
 5 The taste of life was sweet
 6 生命的滋味是甜的
 7 As rain upon my tongue
 8 就如舌尖上的雨露
 9 I teased at life as if it were a foolish game
10 我戲弄生命 視其為愚蠢的遊戲
11 The way the evening breeze
12 就如夜晚的微風
13 May tease the candle flame
14 逗弄蠟燭的火苗
15 The thousand dreams I dreamed
16 我曾千萬次夢見
17 The splendid things I planned
18 那些我計劃的絢麗藍圖
19 I always built to last on weak and shifting sand
20 但我總是將之建築在易逝的流沙上
21 I lived by night and shunned the naked light of day
22 我夜夜笙歌 逃避白晝赤裸的陽光
23 And only now I see how the time ran away
24 事到如今我才看清歲月是如何匆匆流逝
25 Yesterday when I was young
26 昨日當我年少輕狂
27 So many lovely songs were waiting to be sung
28 有那麽多甜美的曲兒等我歌唱
29 So many wild pleasures lay in store for me
30 有那麽多肆意的快樂等我享受
31 And so much pain my eyes refused to see
32 還有那麽多痛苦 我的雙眼卻視而不見
33 I ran so fast that time and youth at last ran out
34 我飛快地奔走 最終時光與青春消逝殆盡
35 I never stopped to think what life was all about
36 我從未停下腳步去思考生命的意義
37 And every conversation that I can now recall
38 如今回想起的所有對話
39 Concerned itself with me and nothing else at all
40 除了和我相關的 什麽都記不得了
41 The game of love I played with arrogance and pride
42 我用自負和傲慢玩著愛情的遊戲
43 And every flame I lit too quickly, quickly died
44 所有我點燃的火焰都熄滅得太快
45 The friends I made all somehow seemed to slip away
46 所有我交的朋友似乎都不知不覺地離開了
47 And only now Im left alone to end the play, yeah
48 只剩我一個人在臺上來結束這場鬧劇
49 Oh, yesterday when I was young
50 噢 昨日當我年少輕狂
51 So many, many songs were waiting to be sung
52 有那麽那麽多甜美的曲兒等我歌唱
53 So many wild pleasures lay in store for me
54 有那麽多肆意的快樂等我享受
55 And so much pain my eyes refused to see
56 還有那麽多痛苦 我的雙眼卻視而不見
57 There are so many songs in me that wont be sung
58 我有太多歌曲永遠不會被唱起
59 I feel the bitter taste of tears upon my tongue
60 我嘗到了舌尖淚水的苦澀滋味
61 The time has come for me to pay for yesterday
62 終於到了付出代價的時間 為了昨日
63 When I was young
64 當我年少輕狂

Basic operations

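f = open("lyrics", encoding="utf-8")  # open the file (assumes it was saved as UTF-8)
first_line = f.readline()             # read one line
print("first line:", first_line)
print("divider".center(50, "-"))
data = f.read()  # read everything that is left; avoid this on large files
print(data)      # print the rest of the file

f.close()  # close the file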
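Since `read()` loads the rest of the file into memory, a large file is usually processed one line at a time by iterating over the file object. A minimal sketch, using a small made-up file in a temporary directory:

```python
import os
import tempfile

# Hypothetical scratch file standing in for a large one.
path = os.path.join(tempfile.mkdtemp(), "big.txt")
f = open(path, "w", encoding="utf-8")
f.write("line one\nline two\nline three\n")
f.close()

line_count = 0
longest = ""
f = open(path, encoding="utf-8")
for line in f:                 # the file object yields one line at a time
    line = line.rstrip("\n")
    line_count += 1
    if len(line) > len(longest):
        longest = line
f.close()

print(line_count, longest)
```

Only one line is held in memory at a time, so the same loop works for files far larger than available RAM.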

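The file open modes are:

  • r: read-only mode (the default).
  • w: write-only mode. [not readable; creates the file if it does not exist; erases the content if it does]
  • a: append mode. [not readable; creates the file if it does not exist; writes are always added at the end]

"+" means the file can be read and written at the same time

  • r+: read and write. [readable; writable; the file must already exist and is not truncated, so writes overwrite from the current position]
  • w+: write and read. [the file is truncated on open, so only newly written content can be read back]
  • a+: append and read. [same as a, but also readable]

"U" means universal newlines: when reading, \r, \n and \r\n are all converted to \n (used together with r or r+ mode). This flag belongs to Python 2. Python 3 text mode does this conversion by default, and the "U" flag was deprecated and then removed in Python 3.11.

  • rU
  • r+U

"b" means binary mode (for example, sending or uploading an ISO image over FTP). In Python 2 the flag could be ignored on Linux but was required on Windows for binary files. In Python 3 it matters on every platform, because binary mode reads and writes bytes while text mode reads and writes str.

  • rb
  • wb
  • ab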
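The differences between the modes are easiest to see by running them in sequence on one file. The sketch below is illustrative only; the scratch file name is an assumption, and the `with` form is used simply so each handle is closed before the next mode is tried.

```python
import io
import os
import tempfile

# Hypothetical scratch file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "modes.txt")

# "w": creates (or truncates) the file; the handle cannot be read.
with open(path, "w", encoding="utf-8") as f:
    f.write("first\n")
    try:
        f.read()
        w_can_read = True
    except io.UnsupportedOperation:
        w_can_read = False

# "a": writes always land at the end; still not readable.
with open(path, "a", encoding="utf-8") as f:
    f.write("second\n")
    a_can_read = f.readable()

# "a+": appends, and can read after seeking back to the start.
with open(path, "a+", encoding="utf-8") as f:
    f.write("third\n")
    f.seek(0)
    seen_by_a_plus = f.read()

# "r+": no truncation; writing starts at position 0 and overwrites in place.
with open(path, "r+", encoding="utf-8") as f:
    f.write("FIRST")
with open(path, encoding="utf-8") as f:
    after_r_plus = f.read()

# "rb": binary mode returns bytes instead of str.
with open(path, "rb") as f:
    raw = f.read(5)

# "w+": truncates on open, so there is nothing left to read.
with open(path, "w+", encoding="utf-8") as f:
    after_w_plus = f.read()

print(w_can_read, a_can_read, raw, repr(after_w_plus))
```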

Other methods

class FileIO:  # method stubs for the raw binary file object _io.FileIO
    def close(self): # real signature unknown; restored from __doc__
        """
        Close the file.
        
        A closed file cannot be used for further I/O operations.  close() may be
        called more than once without error.
        """
        pass

    def fileno(self, *args, **kwargs): # real signature unknown
        """ Return the underlying file descriptor (an integer). """
        pass

    def isatty(self, *args, **kwargs): # real signature unknown
        """ True if the file is connected to a TTY device. """
        pass

    def read(self, size=-1): # known case of _io.FileIO.read
        """
        Note: a single call may not return everything that was requested.
        Read at most size bytes, returned as bytes.
        
        Only makes one system call, so less data may be returned than requested.
        In non-blocking mode, returns None if no data is available.
        Return an empty bytes object at EOF.
        """
        return ""

    def readable(self, *args, **kwargs): # real signature unknown
        """ True if file was opened in a read mode. """
        pass

    def readall(self, *args, **kwargs): # real signature unknown
        """
        Read all data from the file, returned as bytes.
        
        In non-blocking mode, returns as much as is immediately available,
        or None if no data is available.  Return an empty bytes object at EOF.
        """
        pass

    def readinto(self): # real signature unknown; restored from __doc__
        """ Same as RawIOBase.readinto(). """
        pass  # reads bytes into a pre-allocated writable buffer (e.g. a bytearray) and returns how many bytes were read

    def seek(self, *args, **kwargs): # real signature unknown
        """
        Move to new file position and return the file position.
        
        Argument offset is a byte count.  Optional argument whence defaults to
        SEEK_SET or 0 (offset from start of file, offset should be >= 0); other values
        are SEEK_CUR or 1 (move relative to current position, positive or negative),
        and SEEK_END or 2 (move relative to end of file, usually negative, although
        many platforms allow seeking beyond the end of a file).
        
        Note that not all file objects are seekable.
        """
        pass

    def seekable(self, *args, **kwargs): # real signature unknown
        """ True if file supports random-access. """
        pass

    def tell(self, *args, **kwargs): # real signature unknown
        """
        Current file position.
        
        Can raise OSError for non seekable files.
        """
        pass

    def truncate(self, *args, **kwargs): # real signature unknown
        """
        Truncate the file to at most size bytes and return the truncated size.
        
        Size defaults to the current file position, as returned by tell().
        The current file position is changed to the value of size.
        """
        pass

    def writable(self, *args, **kwargs): # real signature unknown
        """ True if file was opened in a write mode. """
        pass

    def write(self, *args, **kwargs): # real signature unknown
        """
        Write bytes b to file, return number written.
        
        Only makes one system call, so not all of the data may be written.
        The number of bytes actually written is returned.  In non-blocking mode,
        returns None if the write would block.
        """
        pass
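Of the methods listed above, `seek()`, `tell()` and `truncate()` are the ones that control the file position. The following is a small sketch on a made-up ten-byte file. Binary mode is used on purpose, since offsets are then plain byte counts and seeking relative to the end is allowed.

```python
import os
import tempfile

# Hypothetical scratch file containing ten known bytes.
path = os.path.join(tempfile.mkdtemp(), "positions.bin")
with open(path, "wb") as f:
    f.write(b"0123456789")

with open(path, "r+b") as f:
    f.seek(3)                      # absolute offset from the start
    pos_after_seek = f.tell()
    chunk = f.read(4)              # reading advances the position
    pos_after_read = f.tell()

    f.seek(-2, os.SEEK_END)        # two bytes before the end
    tail = f.read()

    f.seek(5)
    f.truncate()                   # cut the file at the current position
    new_size = f.seek(0, os.SEEK_END)

print(pos_after_seek, chunk, pos_after_read, tail, new_size)
```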

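The with statement

To avoid forgetting to close a file after opening it, let a context manager handle it:

with open("log", "r") as f:

    ...

Written this way, the file is closed and its resources are released automatically when the with block finishes, even if an exception is raised inside the block.

Since Python 2.7, with can also manage the contexts of several files at once:

with open("log1") as obj1, open("log2") as obj2:
    pass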
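A typical use of the multi-file form is reading one file while writing another. The sketch below filters a made-up log; the file names, the log lines and the "ERROR" prefix are all assumptions for illustration.

```python
import os
import tempfile

# Hypothetical input and output files in a temporary directory.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "log1")
dst_path = os.path.join(workdir, "log2")

with open(src_path, "w", encoding="utf-8") as f:
    f.write("INFO start\nERROR disk full\nINFO done\n")

# Both files are opened together and both are closed when the block ends.
with open(src_path, encoding="utf-8") as src, \
        open(dst_path, "w", encoding="utf-8") as dst:
    for line in src:
        if line.startswith("ERROR"):
            dst.write(line)

print(src.closed, dst.closed)
```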

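Character encoding and transcoding

Detailed articles:

http://www.cnblogs.com/yuanchenqi/articles/5956943.html

http://www.diveintopython3.net/strings.html

Key points:

1. In Python 2 the default encoding is ASCII; in Python 3 the default is UTF-8 and every str is a Unicode string.

2. Unicode has several encodings: UTF-32 (always 4 bytes per character), UTF-16 (2 bytes for most characters, 4 bytes for characters outside the Basic Multilingual Plane) and UTF-8 (1 to 4 bytes). UTF-16 is common as an in-memory representation (for example in Windows and Java), but files are usually stored as UTF-8 because it saves space.

3. In Python 3, encode converts the text and also turns the str into bytes; decode converts it back and turns the bytes into a str again.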
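The byte counts and the str/bytes round trip described above can be checked directly in Python 3. This is a small sketch; the sample characters are arbitrary, and the `-le` codec variants are used so that no byte-order mark is added to the counts.

```python
text = "中"            # U+4E2D, inside the Basic Multilingual Plane
emoji = "\U0001F600"   # U+1F600, outside the Basic Multilingual Plane

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")
utf32 = text.encode("utf-32-le")
print(len(utf8), len(utf16), len(utf32))

# ASCII needs one byte in UTF-8; the emoji needs four bytes in every encoding.
ascii_len = len("a".encode("utf-8"))
emoji_lens = (
    len(emoji.encode("utf-8")),
    len(emoji.encode("utf-16-le")),
    len(emoji.encode("utf-32-le")),
)

# encode: str -> bytes, decode: bytes -> str
round_trip = utf8.decode("utf-8")
print(type(utf8).__name__, type(round_trip).__name__)
```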

The figure at this point in the original post (not reproduced here) applies only to Python 2.

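In Python 2:

# -*- coding: utf-8 -*-
__author__ = "Alex Li"

import sys
print(sys.getdefaultencoding())   # 'ascii' in Python 2


msg = "我爱北京天安门"   # a UTF-8 byte string in Python 2 ("I love Beijing Tiananmen"); simplified characters, since GB2312 cannot encode traditional ones
msg_gb2312 = msg.decode("utf-8").encode("gb2312")        # UTF-8 bytes -> unicode -> GB2312 bytes
gb2312_to_gbk = msg_gb2312.decode("gbk").encode("gbk")   # GBK is a superset of GB2312

print(msg)
print(msg_gb2312)
print(gb2312_to_gbk)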


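In Python 3:

# -*- coding: utf-8 -*-   # optional in Python 3; if present, it must match how the file is actually saved
__author__ = "Alex Li"

import sys
print(sys.getdefaultencoding())   # 'utf-8' in Python 3


msg = "我爱北京天安门"   # a Unicode str ("I love Beijing Tiananmen"); simplified characters, since GB2312 cannot encode traditional ones
#msg_gb2312 = msg.decode("utf-8").encode("gb2312")
msg_gb2312 = msg.encode("gb2312")   # str is already Unicode, so no decode is needed first
gb2312_to_unicode = msg_gb2312.decode("gb2312")
gb2312_to_utf8 = msg_gb2312.decode("gb2312").encode("utf-8")

print(msg)
print(msg_gb2312)          # bytes, printed as b'...'
print(gb2312_to_unicode)
print(gb2312_to_utf8)      # bytes, printed as b'...'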

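In practice, transcoding in Python 3 is often done at the file boundary: the `encoding` argument of `open()` decodes on read and encodes on write. Below is a minimal sketch that converts a GB2312 file to UTF-8. The file names are hypothetical, and the message is written in simplified characters because GB2312 covers only those.

```python
import os
import tempfile

# Hypothetical legacy file and its converted copy.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "legacy_gb2312.txt")
dst_path = os.path.join(workdir, "converted_utf8.txt")

msg = "我爱北京天安门"
with open(src_path, "wb") as f:
    f.write(msg.encode("gb2312"))          # simulate a file saved as GB2312

# Reading decodes GB2312 into str; writing encodes that str as UTF-8.
with open(src_path, encoding="gb2312") as src, \
        open(dst_path, "w", encoding="utf-8") as dst:
    dst.write(src.read())

with open(src_path, "rb") as f:
    gb2312_bytes = f.read()
with open(dst_path, "rb") as f:
    utf8_bytes = f.read()

print(len(gb2312_bytes), len(utf8_bytes))
```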
