String Encoding in Python 3 vs. Python 2

阿新 • Published: 2018-12-27

Python 3 and Python 2 handle string encoding differently. The sections below compare them point by point.

1. Checking the Python version

import sys
__author__ = "author"
print(sys.version_info)  # a named tuple: major, minor, micro, releaselevel, serial
print(sys.version)

[Screenshot: output under Python 3.6.0]
[Screenshot: output under Python 2.7.11]
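
The version check is often used to branch between the two string models. The following is a minimal sketch, assuming it runs under Python 3; the names IS_PY3 and text_type are illustrative and do not come from the original article.

```python
import sys

# sys.version_info is a named tuple, so it compares like a plain tuple.
IS_PY3 = sys.version_info >= (3, 0)

# Its fields can also be read by name.
major, minor = sys.version_info.major, sys.version_info.minor

if IS_PY3:
    text_type = str      # text is str in Python 3
else:
    text_type = unicode  # noqa: F821 (this branch only runs on Python 2)

print("Python %d.%d, text type: %s" % (major, minor, text_type.__name__))
```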

2. Checking Python's default encoding

print(sys.getdefaultencoding()) #python3
print sys.getdefaultencoding() #python2

Python 3 prints utf-8; Python 2 prints ascii.
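
To see what the default means in practice, here is a small Python 3 sketch. The sample text and the variable names are assumptions made for illustration. It shows that encode() with no argument uses UTF-8, and that the same text cannot be encoded as ASCII, which is the Python 2 default.

```python
import sys

# The default used when encode()/decode() are called without an argument.
default_encoding = sys.getdefaultencoding()
print(default_encoding)

text = "\u4e2d\u6587"  # two Chinese characters, written as escapes

# With no argument, str.encode() uses UTF-8 in Python 3.
assert text.encode() == text.encode("utf-8")

# The same text has no ASCII representation.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)
```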

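3. How string types differ between Python 3 and Python 2
Python 3 has two types: bytes for binary data and str for text. Python 2 also has two: unicode for text and str for byte strings (bytes is only an alias for str there).
A bytes value is raw binary data; the literal b"hello", for example, is bytes. In both versions a byte string holds 8-bit bytes. The difference lies in implicit conversion: Python 2 silently converts between str and unicode using its 7-bit ASCII default, while Python 3 never converts implicitly.
Python 3 is strict about encoding and decoding: bytes and str are distinct types, so b"abc" == "abc" is False. In Python 2 the same comparison is True, because both literals are str.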
Python 3: [screenshot not available] Python 2: [screenshot not available]
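
Since the original screenshots are missing, the Python 3 side of the comparison can be reproduced with a short sketch. The variable names are illustrative only.

```python
text = "abc"   # str: text
data = b"abc"  # bytes: binary data

print(type(text).__name__, type(data).__name__)

# Different types never compare equal in Python 3.
same_raw = (text == data)
# After an explicit decode, the two values are both str and compare equal.
same_decoded = (text == data.decode("utf-8"))
print(same_raw, same_decoded)

# Mixing the two types in one expression is rejected outright.
try:
    text + data
    mixed = "ok"
except TypeError:
    mixed = "TypeError"
print(mixed)
```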
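The comparison results make the difference clear. To check which type a value has and convert it to the one you need, you can write helper functions built on isinstance: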

#python3
def Get_Str(str_or_bytes):
    if isinstance(str_or_bytes, bytes):
        VALUE = str_or_bytes.decode("utf-8")
    else:
        VALUE = str_or_bytes
    return VALUE

def Get_Bytes(str_or_bytes):
    if isinstance(str_or_bytes, str):
        VALUE = str_or_bytes.encode("utf-8")
    else:
        VALUE = str_or_bytes
    return VALUE
# Test
str1 = "abc"
str2 = b"abc"
print(Get_Bytes(str1))
print(Get_Str(str2))

# -*- coding: utf-8 -*-
#python2
def Get_Unicode(str_or_bytes):
    if isinstance(str_or_bytes, bytes):
        VALUE = str_or_bytes.decode("utf-8")
    else:
        VALUE = str_or_bytes
    return VALUE

def Get_Str(str_or_bytes):
    if isinstance(str_or_bytes, unicode):  # in Python 2, str is already bytes; only unicode needs encoding
        VALUE = str_or_bytes.encode("utf-8")
    else:
        VALUE = str_or_bytes
    return VALUE
# Test
str1 = "abc"
str2 = b"abc"
print Get_Unicode(str2)
print Get_Str(str1)

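4. %s formatting with bytes is restricted in Python 3 but works freely in Python 2. In Python 3 a bytes template rejects str arguments (and before Python 3.5 bytes had no % operator at all), while a str template renders a bytes argument as its repr, such as b'abc'.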
[Screenshot: %s formatting with bytes fails in Python 3]

[Screenshot: %s formatting with bytes succeeds in Python 2]
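
The Python 3 behaviour can be illustrated with the sketch below, which assumes Python 3.5 or later (where bytes gained the % operator). The variable names are made up for this example.

```python
name_bytes = b"China"

# str template, bytes argument: no error, but the repr is embedded.
as_text = "Welcome to %s" % name_bytes
print(as_text)

# bytes template, str argument: rejected with TypeError.
try:
    b"Welcome to %s" % "China"
    result = "ok"
except TypeError:
    result = "TypeError"
print(result)

# bytes template, bytes argument: allowed on Python 3.5+.
both_bytes = b"Welcome to %s" % name_bytes

# Decoding first yields clean text.
clean = "Welcome to %s" % name_bytes.decode("utf-8")
print(clean)
```

Decoding before formatting, as in the last statement, is usually the safer choice when the final result is meant to be text.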

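5. Writing to files also differs. In Python 3 a file opened in text mode does not accept bytes; to write bytes, open the file in binary mode. Python 2 accepts byte strings in either mode.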

#python3
"""
with open("test.text", "w+") as f:
    f.write(b"Welcome to China")  # TypeError: write() argument must be str, not bytes
"""
with open("test.text", "wb+") as f:
    f.write(b"Welcome to China")  # OK: binary mode accepts bytes

#python2
with open("test.text", 'a+') as f:
    f.write(b"Welcome to China\n")  # OK

with open("test.text", "ab+") as f:
    f.write(b"Welcome to China")  # OK
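
A runnable Python 3 sketch of the same rule follows. It writes to a temporary directory rather than the working directory; that choice, the explicit encoding="utf-8" argument, and the variable names are assumptions of this example, not part of the original code.

```python
import os
import tempfile

# Use a scratch directory so the example leaves the working directory alone.
path = os.path.join(tempfile.mkdtemp(), "test.text")

# Text mode: accepts str and rejects bytes.
with open(path, "w", encoding="utf-8") as f:
    f.write("Welcome to China\n")
    try:
        f.write(b"Welcome to China\n")
        text_mode_bytes = "ok"
    except TypeError:
        text_mode_bytes = "TypeError"

# Binary mode: accepts bytes, so encode the text explicitly first.
with open(path, "ab") as f:
    f.write("Welcome to China\n".encode("utf-8"))

# Reading back: binary mode returns bytes, text mode returns str.
with open(path, "rb") as f:
    raw = f.read()
with open(path, "r", encoding="utf-8") as f:
    text = f.read()

print(text_mode_bytes, type(raw).__name__, type(text).__name__)
```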

6. encode and decode should be paired: decode bytes with the same encoding that produced them, as in the following code:

s = "Welcome to China"
print(s.encode("gbk"))
print(s.encode("utf-8"))
print(s.encode("utf-8").decode("utf-8"))
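
With pure ASCII text such as "Welcome to China", GBK and UTF-8 produce identical bytes, so a mismatch goes unnoticed. The sketch below uses non-ASCII sample text (an assumption of this example) to show what happens in Python 3 when the pair does not match.

```python
text = "\u4e2d\u6587"  # two Chinese characters, written as escapes

utf8_bytes = text.encode("utf-8")
gbk_bytes = text.encode("gbk")

# The same text yields different byte sequences under different encodings.
print(len(utf8_bytes), len(gbk_bytes))

# Matching pairs round-trip cleanly.
assert utf8_bytes.decode("utf-8") == text
assert gbk_bytes.decode("gbk") == text

# A mismatched pair fails: these GBK bytes are not valid UTF-8.
try:
    gbk_bytes.decode("utf-8")
    mismatch = "ok"
except UnicodeDecodeError:
    mismatch = "UnicodeDecodeError"
print(mismatch)
```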

For anyone who enjoys working with Python, getting encoding and decoding right is especially important in Python 3.
