Scrapy crawler error: "Temporary failure in name resolution"

阿新 • Published: 2019-02-19

The cause is unclear. Following a forum post, I made the change below, but it did not solve the problem:

$ vim /etc/resolv.conf

and then edited the nameserver entries in that file.

This had no effect.
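
Before editing /etc/resolv.conf again, it can help to confirm from Python whether name resolution works at all on the machine. The following is a minimal sketch, not part of the original post: the helper name can_resolve is hypothetical, and it is written for Python 3 (the post itself runs Python 2.7).

```python
import socket


def can_resolve(hostname):
    """Return True if the system resolver can resolve hostname."""
    try:
        # getaddrinfo goes through the same resolver configuration
        # (/etc/resolv.conf, /etc/hosts) that the crawler depends on.
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        # "Temporary failure in name resolution" surfaces here as gaierror.
        return False


if __name__ == "__main__":
    # Replace the hostname with the site the crawler targets.
    print(can_resolve("www.cnblogs.com"))
```

If this returns False for the target site but True once the VPN is connected, the problem lies with the network's DNS rather than with Scrapy.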

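Later, after connecting to a VPN and running the crawler again, this error was no longer reported.

Instead, a different error was reported, which I handled with: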

$ apt-get update
$ apt-get install openssl

The error reported is now no longer that one. I do not know whether it is fully resolved; I will keep updating this post.

2016.12.21

=====================================================================

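After running, the crawler reports "Crawled (404)". This is probably because the crawler has been banned by the site.

I looked up some solutions online and, following the post "How to keep your crawler from getting banned" (如何讓你的爬蟲不再被ban), added a middlewares.py to the project.

According to that post, in a Scrapy crawler it is not enough to change only DELAY, PROXY and USER_AGENT in settings.py; these values need to be managed through middlewares.py.

So I added the middleware file from the post to the project, with the following contents: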

[root@bogon cnblogs]# vi cnblogs/middlewares.py

import random
import base64
from settings import PROXIES

class RandomUserAgent(object):
    """Randomly rotate user agents based on a list of predefined ones"""

    def __init__(self, agents):
        self.agents = agents

    @classmethod
    def from_crawler(cls, crawler):
        return cls(crawler.settings.getlist('USER_AGENTS'))

    def process_request(self, request, spider):
        #print "**************************" + random.choice(self.agents)
        request.headers.setdefault('User-Agent', random.choice(self.agents))

class ProxyMiddleware(object):
    def process_request(self, request, spider):
        proxy = random.choice(PROXIES)
        if proxy['user_pass'] is not None:
            request.meta['proxy'] = "http://%s" % proxy['ip_port']
            encoded_user_pass = base64.b64encode(proxy['user_pass'])  # b64encode, unlike encodestring, appends no newline
            request.headers['Proxy-Authorization'] = 'Basic ' + encoded_user_pass
            print "**************ProxyMiddleware have pass************" + proxy['ip_port']
        else:
            print "**************ProxyMiddleware no pass************" + proxy['ip_port']
            request.meta['proxy'] = "http://%s" % proxy['ip_port']
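
For these classes to run, they also have to be registered in settings.py, which the post does not show. The fragment below is a sketch under assumptions: the module path cnblogs.middlewares follows from the file location above, the priority numbers are my own choice, and the proxy entries are placeholders whose keys (ip_port, user_pass) are the ones the middleware reads.

```python
# settings.py (fragment): a sketch, all values are placeholders

DOWNLOADER_MIDDLEWARES = {
    # Lower numbers run earlier in process_request.
    "cnblogs.middlewares.RandomUserAgent": 400,
    "cnblogs.middlewares.ProxyMiddleware": 410,
    # Disable the built-in User-Agent middleware so that only
    # RandomUserAgent sets the header.
    "scrapy.downloadermiddlewares.useragent.UserAgentMiddleware": None,
}

# Each entry needs both keys, because ProxyMiddleware indexes them directly.
PROXIES = [
    {"ip_port": "127.0.0.1:8080", "user_pass": None},             # hypothetical proxy
    {"ip_port": "127.0.0.1:8081", "user_pass": "user:password"},  # hypothetical proxy
]
```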

After adding middlewares.py as described in the referenced post, running the crawler produced this error:

File "/usr/lib/python2.7/random.py", line 273, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty

IndexError: list index out of range

Opening the file named in the error shows that the choice function raises this error when its argument (seq) is empty.
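
The behaviour is easy to reproduce outside Scrapy. A minimal sketch, with a hypothetical helper name pick, written for Python 3:

```python
import random


def pick(seq):
    """Return a random element of seq, or None when seq is empty."""
    try:
        return random.choice(seq)
    except IndexError:
        # random.choice raises IndexError for an empty sequence,
        # which is the error shown in the traceback above.
        return None


if __name__ == "__main__":
    print(pick([]))           # empty list: the IndexError is caught
    print(pick(["only-one"])) # single element: always that element
```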

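Going back to my own middlewares.py, choice is used on line 16 (in RandomUserAgent.process_request) and line 20 (in ProxyMiddleware.process_request). On line 20 the argument to choice is the PROXIES list imported from settings at the top of the file, which is not empty, so the error must come from line 16.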

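On line 16, the argument to choice is the value stored when the class is initialised, in the statement:

self.agents = agents

Here agents is whatever from_crawler passes in, namely crawler.settings.getlist('USER_AGENTS'). getlist returns an empty list when that setting is not defined, which appears to be what happened here. Line 16 is meant to pick at random from our USER_AGENT, so I added this to the module imports at the top: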

from settings import USER_AGENT

and changed the statement in the class initialiser to:

self.agents = USER_AGENT
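
Note that this fix only behaves as intended if USER_AGENT is a list of strings; with a single string, random.choice would return one character of it. An alternative, sketched below under assumptions, is to keep from_crawler as it was, define a USER_AGENTS list in settings.py, and fail loudly when it is missing. This is not the post's code: the string handling and the ValueError are my additions, and the sketch targets Python 3.

```python
import random


class RandomUserAgent(object):
    """Set a random User-Agent taken from the USER_AGENTS setting."""

    def __init__(self, agents):
        # A bare string would make random.choice() return single
        # characters, so wrap it in a list.
        if isinstance(agents, str):
            agents = [agents]
        self.agents = list(agents)
        if not self.agents:
            # Fail at start-up instead of with IndexError on the first request.
            raise ValueError("USER_AGENTS is empty; define it in settings.py")

    @classmethod
    def from_crawler(cls, crawler):
        # getlist() returns [] when USER_AGENTS is not defined.
        return cls(crawler.settings.getlist("USER_AGENTS"))

    def process_request(self, request, spider):
        # setdefault keeps a User-Agent that was already set on the request.
        request.headers.setdefault("User-Agent", random.choice(self.agents))
```

With this version, settings.py would need an entry such as USER_AGENTS = ["...", "..."] containing real browser strings.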

2016.12.22

=====================================================================================

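After the changes above, the added middleware works properly. However, the crawler still cannot connect to the site and scrape its content. Hopefully this is not because the site's anti-crawler measures are just too good... (lol)

The message shown for the failed crawl is: Connection was refused by other side: 111: Connection refused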
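
Error 111 means the TCP connection itself was rejected. One possible cause, which the post does not confirm, is that a randomly chosen entry in PROXIES is not accepting connections. A minimal sketch for checking an ip:port pair before using it, with a hypothetical helper name can_connect, written for Python 3:

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises ConnectionRefusedError (errno 111 on
        # Linux) when nothing is listening on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical proxy address; use the ip_port values from PROXIES.
    print(can_connect("127.0.0.1", 8080))
```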
