
Common Python Web Scraping Pitfalls and Solutions


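1. HTTP Error 403: Forbidden when sending a request

Some sites reject requests that carry urllib's default User-Agent. Sending a browser-like User-Agent header usually gets past this check: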

import urllib.request

# url is the address of the page to fetch
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:23.0) Gecko/20100101 Firefox/23.0'}

req = urllib.request.Request(url=url, headers=headers)

html = urllib.request.urlopen(req).read()

Details: https://www.2cto.com/kf/201309/242273.html
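As a minimal sketch, the request above can be wrapped in a small helper so the header is applied consistently. The names `build_request` and `fetch` and the 10-second timeout are hypothetical and not part of the original post; the User-Agent string is the one shown above.

```python
import urllib.error
import urllib.request

# Browser-like User-Agent taken from the snippet above.
USER_AGENT = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:23.0) Gecko/20100101 Firefox/23.0"


def build_request(url):
    """Return a Request that carries a browser-like User-Agent header."""
    return urllib.request.Request(url=url, headers={"User-Agent": USER_AGENT})


def fetch(url, timeout=10):
    """Download url and return the raw bytes (hypothetical helper)."""
    try:
        with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
            return resp.read()
    except urllib.error.HTTPError as err:
        # If the server still answers 403, it probably checks more than the
        # User-Agent (cookies, Referer, request rate, and so on).
        raise RuntimeError(f"request failed with HTTP {err.code}") from err
```

Calling `fetch(url)` returns the page as bytes. Building the request in its own function makes it possible to inspect the headers without touching the network.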

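2. Python UnicodeEncodeError: 'gbk' codec can't encode character when saving HTML content

When no encoding is given, open() uses the system's locale encoding. If that is gbk, any character in the page that gbk cannot represent raises this error on write.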

f = open("out.html","w")  

Change it to:

f = open("out.html","w",encoding='utf-8')

Details: http://www.jb51.net/article/64816.htm
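The following sketch reproduces the error and the fix without depending on the system locale. It passes `encoding="gbk"` explicitly to simulate a machine whose default encoding is gbk. The sample HTML string and the temporary file path are hypothetical and chosen only for illustration.

```python
import os
import tempfile

# Hypothetical page content: "\xa0" (non-breaking space) has no gbk mapping.
html = "<p>price:\xa0100</p>"

path = os.path.join(tempfile.mkdtemp(), "out.html")

# Simulate a gbk default encoding; this mirrors what open(path, "w")
# does on a system whose locale encoding is gbk.
try:
    with open(path, "w", encoding="gbk") as f:
        f.write(html)
    gbk_failed = False
except UnicodeEncodeError:
    gbk_failed = True

# With an explicit utf-8 encoding the same text is written without error.
with open(path, "w", encoding="utf-8") as f:
    f.write(html)

# Read the file back with the same encoding to confirm nothing was lost.
with open(path, encoding="utf-8") as f:
    round_trip = f.read()
```

Using a `with` block also closes the file automatically, which the bare `open()` call in the original snippet leaves to the caller.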
