decode raises UnicodeDecodeError: 'gb2312' codec can't decode byte 0x8f in position 6018: illegal multib

After fetching a web page with Python and decoding the response with decode, I got the following error:

Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    html = html.decode("gb2312")
UnicodeDecodeError: 'gb2312' codec can't decode byte 0x8f in position 6018: illegal multibyte sequence

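My initial guess is that some bytes in the page are invalid, or that they are not in the encoding declared by charset in the <meta> tag. In that case the problem can be worked around by setting the second parameter of decode, 'errors'. Note that 'ignore' silently drops every byte that cannot be decoded.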

Example:

html = html.decode("gb2312", errors="ignore")
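
To see what the errors parameter actually does, here is a minimal sketch. The byte string is made up for illustration (it is not the page from the traceback): a stray 0x8f byte is inserted into otherwise valid GB2312 data, and the same data is then decoded with the default 'strict' handler, with 'ignore', and with 'replace'.

```python
# Illustrative data only: valid GB2312 bytes with a stray 0x8f byte injected,
# mimicking the kind of bad byte reported in the traceback above.
good = "中文".encode("gb2312")
bad = good[:2] + b"\x8f" + good[2:]

try:
    bad.decode("gb2312")  # errors="strict" is the default
    strict_failed = False
except UnicodeDecodeError as exc:
    strict_failed = True
    print("strict:", exc)

# 'ignore' drops the undecodable byte without any trace.
ignored = bad.decode("gb2312", errors="ignore")

# 'replace' keeps a visible marker (U+FFFD) where the bad byte was.
replaced = bad.decode("gb2312", errors="replace")

print(repr(ignored))
print(repr(replaced))
print(replaced.count("\ufffd"), "byte(s) could not be decoded")
```

'replace' can be handy while debugging because it shows how much data was lost. If the page was really written in GBK or GB18030, which extend GB2312, decoding with "gb18030" may recover those characters instead of discarding them; that is an assumption worth testing on the actual page.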

Note: do not mistype 'ignore' as 'ignone', otherwise an error is raised!

Error message:

LookupError: unknown error handler name 'ignone'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:\Personal\Desktop\測試.py", line 8, in <module>
    html = rep.read().decode("gb2312",errors="ignone")
LookupError: decoding with 'gb2312' codec failed (LookupError: unknown error handler name 'ignone')
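
A mistyped handler name may only surface once the decoder actually hits a bad byte, so it can help to validate the name up front. The sketch below uses the standard library's codecs.lookup_error for that; checked_decode is a hypothetical helper name introduced here for illustration, not something from the original script.

```python
import codecs


def checked_decode(data: bytes, encoding: str, errors: str = "strict") -> str:
    """Hypothetical helper: reject an unknown error handler before decoding."""
    # codecs.lookup_error raises LookupError if the handler name is not registered.
    codecs.lookup_error(errors)
    return data.decode(encoding, errors=errors)


try:
    checked_decode(b"abc", "gb2312", errors="ignone")  # typo on purpose
    typo_detected = False
except LookupError as exc:
    typo_detected = True
    print("rejected:", exc)

# The correctly spelled handler name passes the check.
text = checked_decode(b"abc", "gb2312", errors="ignore")
print(text)
```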
