Notes on Fixing Scrapy Installation Problems on Windows 7
Introduction: Scrapy is a big name in the web-crawling world. It installs very smoothly on Linux, but on Windows a great many problems come up. This post records the problems I ran into and how to resolve each of them.
1. Installing Scrapy fails
Installing Scrapy directly on Windows is very likely to fail, for a variety of reasons; mostly, not all of the dependency packages are in place, so the fix depends on the specific error:
>> pip install scrapy
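When the install fails, it helps to know which compiled dependency is actually missing before chasing error messages. A minimal diagnostic sketch; the module list below is my own guess at the usual Windows trouble spots, not an official list:

```python
import importlib


def check_modules(names):
    """Try to import each named module; report which ones are importable."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


# Compiled dependencies that commonly break Scrapy installs on Windows:
print(check_modules(["lxml", "twisted", "OpenSSL", "cryptography"]))
```

Any entry reported as False points at the package to (re)install before retrying `pip install scrapy`.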
2. Install the Visual C++ Build Tools
This tool is provided by Microsoft and is needed to compile the C extension sources that several of Scrapy's dependencies ship with:
Download: http://www.microsoft.com/zh-CN/download/details.aspx?id=48159
3. OpenSSL: "%1 is not a valid Win32 application"
Error message:
Traceback (most recent call last):
  File "D:\Program Files\python\Scripts\scrapy-script.py", line 11, in <module>
    load_entry_point('scrapy==1.1.3', 'console_scripts', 'scrapy')()
  File "D:\Program Files\python\lib\site-packages\scrapy-1.1.3-py3.5.egg\scrapy\cmdline.py", line 121, in execute
    cmds = _get_commands_dict(settings, inproject)
  File "D:\Program Files\python\lib\site-packages\scrapy-1.1.3-py3.5.egg\scrapy\cmdline.py", line 45, in _get_commands_dict
    cmds = _get_commands_from_module('scrapy.commands', inproject)
  File "D:\Program Files\python\lib\site-packages\scrapy-1.1.3-py3.5.egg\scrapy\cmdline.py", line 28, in _get_commands_from_module
    for cmd in _iter_command_classes(module):
  File "D:\Program Files\python\lib\site-packages\scrapy-1.1.3-py3.5.egg\scrapy\cmdline.py", line 19, in _iter_command_classes
    for module in walk_modules(module_name):
  File "D:\Program Files\python\lib\site-packages\scrapy-1.1.3-py3.5.egg\scrapy\utils\misc.py", line 71, in walk_modules
    submod = import_module(fullpath)
  File "D:\Program Files\python\lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 978, in _gcd_import
  File "<frozen importlib._bootstrap>", line 961, in _find_and_load
  File "<frozen importlib._bootstrap>", line 950, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 205, in _call_with_frames_removed
  File "D:\Program Files\python\lib\site-packages\scrapy-1.1.3-py3.5.egg\scrapy\commands\version.py", line 6, in <module>
    import OpenSSL
  File "D:\Program Files\python\lib\site-packages\OpenSSL\__init__.py", line 8, in <module>
    from OpenSSL import rand, crypto, SSL
  File "D:\Program Files\python\lib\site-packages\OpenSSL\rand.py", line 12, in <module>
    from OpenSSL._util import (
  File "D:\Program Files\python\lib\site-packages\OpenSSL\_util.py", line 6, in <module>
    from cryptography.hazmat.bindings.openssl.binding import Binding
  File "D:\Program Files\python\lib\site-packages\cryptography\hazmat\bindings\openssl\binding.py", line 14, in <module>
    from cryptography.hazmat.bindings._openssl import ffi, lib
ImportError: DLL load failed: %1 is not a valid Win32 application.
Fix: uninstall pyOpenSSL and cryptography, then reinstall them so that binary wheels matching the current interpreter are fetched:
>> pip3 uninstall pyopenssl
>> pip3 uninstall cryptography
>> pip3 install pyopenssl
>> pip3 install cryptography
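This particular DLL error typically means the installed cryptography/pyOpenSSL binaries do not match the bitness of the Python interpreter (a 32-bit DLL loaded into 64-bit Python, or vice versa). Before reinstalling, it is worth checking which interpreter you are actually running; a small sketch:

```python
import struct
import sys


def python_bits():
    """Pointer size in bits: 32 for a 32-bit interpreter, 64 for 64-bit."""
    return struct.calcsize("P") * 8


print(sys.version)    # full interpreter version string
print(python_bits())  # must match the bitness of the wheels you install
```

If this prints 64 but you downloaded 32-bit packages (or the reverse), reinstalling with pip from the same interpreter picks up the matching wheels.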
4. ModuleNotFoundError: No module named 'win32api'
Error message:
2017-03-31 14:56:42 [scrapy] INFO: Scrapy 1.1.3 started (bot: scrapybot)
2017-03-31 14:56:42 [scrapy] INFO: Overridden settings: {'DUPEFILTER_CLASS': 'scrapy.dupefilters.BaseDupeFilter', 'LOGSTATS_INTERVAL': 0}
2017-03-31 14:56:42 [scrapy] INFO: Enabled extensions: ['scrapy.extensions.corestats.CoreStats', 'scrapy.extensions.telnet.TelnetConsole']
Traceback (most recent call last):
  File "D:\Program Files\python\Scripts\scrapy-script.py", line 11, in <module>
    load_entry_point('scrapy==1.1.3', 'console_scripts', 'scrapy')()
  File "D:\Program Files\python\lib\site-packages\scrapy-1.1.3-py3.5.egg\scrapy\cmdline.py", line 142, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "D:\Program Files\python\lib\site-packages\scrapy-1.1.3-py3.5.egg\scrapy\cmdline.py", line 88, in _run_print_help
    func(*a, **kw)
  File "D:\Program Files\python\lib\site-packages\scrapy-1.1.3-py3.5.egg\scrapy\cmdline.py", line 149, in _run_command
    cmd.run(args, opts)
  File "D:\Program Files\python\lib\site-packages\scrapy-1.1.3-py3.5.egg\scrapy\commands\shell.py", line 65, in run
    crawler.engine = crawler._create_engine()
  File "D:\Program Files\python\lib\site-packages\scrapy-1.1.3-py3.5.egg\scrapy\crawler.py", line 97, in _create_engine
    return ExecutionEngine(self, lambda _: self.stop())
  File "D:\Program Files\python\lib\site-packages\scrapy-1.1.3-py3.5.egg\scrapy\core\engine.py", line 68, in __init__
    self.downloader = downloader_cls(crawler)
  File "D:\Program Files\python\lib\site-packages\scrapy-1.1.3-py3.5.egg\scrapy\core\downloader\__init__.py", line 88, in __init__
    self.middleware = DownloaderMiddlewareManager.from_crawler(crawler)
  File "D:\Program Files\python\lib\site-packages\scrapy-1.1.3-py3.5.egg\scrapy\middleware.py", line 58, in from_crawler
    return cls.from_settings(crawler.settings, crawler)
  File "D:\Program Files\python\lib\site-packages\scrapy-1.1.3-py3.5.egg\scrapy\middleware.py", line 34, in from_settings
    mwcls = load_object(clspath)
  File "D:\Program Files\python\lib\site-packages\scrapy-1.1.3-py3.5.egg\scrapy\utils\misc.py", line 44, in load_object
    mod = import_module(module)
  File "D:\Program Files\python\lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 978, in _gcd_import
  File "<frozen importlib._bootstrap>", line 961, in _find_and_load
  File "<frozen importlib._bootstrap>", line 950, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 205, in _call_with_frames_removed
  File "D:\Program Files\python\lib\site-packages\scrapy-1.1.3-py3.5.egg\scrapy\downloadermiddlewares\retry.py", line 23, in <module>
    from scrapy.xlib.tx import ResponseFailed
  File "D:\Program Files\python\lib\site-packages\scrapy-1.1.3-py3.5.egg\scrapy\xlib\tx\__init__.py", line 3, in <module>
    from twisted.web import client
  File "D:\Program Files\python\lib\site-packages\twisted\web\client.py", line 42, in <module>
    from twisted.internet.endpoints import HostnameEndpoint, wrapClientTLS
  File "D:\Program Files\python\lib\site-packages\twisted\internet\endpoints.py", line 37, in <module>
    from twisted.internet.stdio import StandardIO, PipeAddress
  File "D:\Program Files\python\lib\site-packages\twisted\internet\stdio.py", line 30, in <module>
    from twisted.internet import _win32stdio
  File "D:\Program Files\python\lib\site-packages\twisted\internet\_win32stdio.py", line 9, in <module>
    import win32api
ModuleNotFoundError: No module named 'win32api'
Fix: install the pywin32 bindings (packaged for pip as pypiwin32):
>> pip install pypiwin32
5. scrapy shell url raises TypeError: 'float' object is not iterable
Error message:
2017-03-31 15:07:16 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
2017-03-31 15:07:16 [scrapy] INFO: Spider opened
Traceback (most recent call last):
  File "D:\Program Files\python\Scripts\scrapy-script.py", line 11, in <module>
    load_entry_point('scrapy==1.1.3', 'console_scripts', 'scrapy')()
  File "D:\Program Files\python\lib\site-packages\scrapy-1.1.3-py3.5.egg\scrapy\cmdline.py", line 142, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "D:\Program Files\python\lib\site-packages\scrapy-1.1.3-py3.5.egg\scrapy\cmdline.py", line 88, in _run_print_help
    func(*a, **kw)
  File "D:\Program Files\python\lib\site-packages\scrapy-1.1.3-py3.5.egg\scrapy\cmdline.py", line 149, in _run_command
    cmd.run(args, opts)
  File "D:\Program Files\python\lib\site-packages\scrapy-1.1.3-py3.5.egg\scrapy\commands\shell.py", line 71, in run
    shell.start(url=url)
  File "D:\Program Files\python\lib\site-packages\scrapy-1.1.3-py3.5.egg\scrapy\shell.py", line 47, in start
    self.fetch(url, spider)
  File "D:\Program Files\python\lib\site-packages\scrapy-1.1.3-py3.5.egg\scrapy\shell.py", line 112, in fetch
    reactor, self._schedule, request, spider)
  File "D:\Program Files\python\lib\site-packages\twisted\internet\threads.py", line 122, in blockingCallFromThread
    result.raiseException()
  File "D:\Program Files\python\lib\site-packages\twisted\python\failure.py", line 372, in raiseException
    raise self.value.with_traceback(self.tb)
TypeError: 'float' object is not iterable
The installed Twisted version here is 17.1.0; it needs to be downgraded to 16.6.0:
>> pip uninstall twisted
>> pip install twisted==16.6.0
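To confirm the downgrade took, you can compare the installed version string against the 16.6.0 ceiling. A small sketch of the tuple comparison, assuming plain numeric version strings and no third-party packaging helpers:

```python
def version_tuple(version):
    """Turn a version string like '16.6.0' into (16, 6, 0) for comparison."""
    return tuple(int(part) for part in version.split("."))


def needs_downgrade(installed, ceiling="16.6.0"):
    """True when the installed version is newer than the known-good one."""
    return version_tuple(installed) > version_tuple(ceiling)


print(needs_downgrade("17.1.0"))  # True  -> downgrade required
print(needs_downgrade("16.6.0"))  # False -> OK
```

The installed version string itself can be read from `twisted.version.short()` (or `twisted.__version__` on recent releases). Comparing tuples rather than raw strings avoids the classic trap where "9.0.0" sorts after "16.6.0" alphabetically.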
6. Summary
Scrapy installs very smoothly on Linux but hits problem after problem on Windows. Personally, if you are doing serious development work, I recommend Linux or macOS.