
Python Crawler Troubleshooting Roundup (continuously updated)

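@The slave node of a distributed crawler cannot find scrapy_redis:

  • Run the slave with sudo: sudo scrapy crawl spidername, or sudo scrapy runspider mycrawler_redis.py;
  • Without sudo it reports that the module cannot be found, for no obvious reason, which is frustrating;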
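
One plausible explanation, which the notes above do not confirm, is that scrapy_redis was installed with sudo pip and therefore lives only in root's Python environment. The sketch below is a small diagnostic under that assumption; describe_module is a hypothetical helper name. Running it once normally and once under sudo shows which interpreter is in use and whether it can see the package.

```python
import importlib.util
import sys


def describe_module(name):
    """Return the file a top-level module would be imported from, or None if it is not importable."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # Compare this output between `python check.py` and `sudo python check.py`.
    print("interpreter :", sys.executable)
    print("scrapy_redis:", describe_module("scrapy_redis"))
```

If the two runs print different interpreters or different results, installing scrapy_redis into the environment the normal user actually runs (for example with that interpreter's own pip) is likely to remove the need for sudo.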

@The distributed crawler is refused when connecting to a remote redis:

  • Error: redis.exceptions.ResponseError: DENIED Redis is running in protected mode…:
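
The notes do not record which fix was used here. One common approach, assumed for this sketch, is to set a password on the master's redis server (the requirepass directive in redis.conf) and have each slave authenticate through the REDIS_URL setting that scrapy-redis reads. The host, password, and the build_redis_url helper below are hypothetical placeholders.

```python
from urllib.parse import quote


def build_redis_url(host, port=6379, password=None, db=0):
    """Build a redis:// URL, percent-encoding the password so special characters survive."""
    auth = f":{quote(password, safe='')}@" if password else ""
    return f"redis://{auth}{host}:{port}/{db}"


# In the slave's settings.py; the address and password are made-up examples.
REDIS_URL = build_redis_url("192.0.2.10", password="change-me")
```

Disabling protected mode without a password would also silence the error, but it leaves the redis instance open to anyone who can reach the port.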

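@The crawler reports a connection-lost error

  • Error: twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion.
  • The site's anti-crawling measures have kicked in; configure request headers or an IP proxy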
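
As a rough illustration of "configure request headers or an IP proxy", here is a minimal Scrapy downloader middleware sketch. The class name, the User-Agent strings, and the proxy address are all hypothetical examples; the only Scrapy conventions relied on are the process_request(request, spider) hook and the request.meta["proxy"] key.

```python
import random

# Example values only; replace with headers and proxies you actually control.
USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:56.0) Gecko/20100101 Firefox/56.0",
]
PROXIES = ["http://127.0.0.1:8888"]  # hypothetical proxy endpoint


class RandomHeaderProxyMiddleware:
    """Attach a random User-Agent and, if any are configured, a random proxy to each request."""

    def process_request(self, request, spider):
        request.headers["User-Agent"] = random.choice(USER_AGENTS)
        if PROXIES:
            request.meta["proxy"] = random.choice(PROXIES)
        return None  # returning None lets Scrapy continue processing the request
```

To enable it, add the class to DOWNLOADER_MIDDLEWARES in settings.py, for example {"myproject.middlewares.RandomHeaderProxyMiddleware": 543}, where myproject is a placeholder for the real project package. Lowering the request rate with DOWNLOAD_DELAY may also help.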

@Installing the Chrome browser on Ubuntu 16:

@Installing chromedriver and phantomjs:

@The chromedriver version must match the Chrome version; otherwise an invalid-context error is raised (Runtime.executionContextCreated has invalid 'context'):
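
A quick way to catch a mismatch before launching the browser is to compare the version strings the two binaries print. The sketch below assumes the newer numbering scheme (ChromeDriver 73 and later), where the driver's major version equals Chrome's; older 2.x drivers use a separate compatibility table instead, so this check does not apply to them. The function names are hypothetical.

```python
import re


def major_version(version_output):
    """Extract the leading major version number from text such as 'Google Chrome 114.0.5735.90'."""
    match = re.search(r"(\d+)\.\d+", version_output)
    if match is None:
        raise ValueError(f"no version number found in {version_output!r}")
    return int(match.group(1))


def versions_match(chrome_output, driver_output):
    """Return True when Chrome and chromedriver report the same major version."""
    return major_version(chrome_output) == major_version(driver_output)
```

The input strings can be taken from the output of google-chrome --version and chromedriver --version.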