1.8.1 Installing pyspider · Python 3 Web Crawler Notes

# 1.8.1 Installing pyspider

## 1. Description

pyspider comes with a powerful WebUI, script editor, task monitor, project manager, and result processor. It supports multiple database backends and multiple message queues, and it can crawl JavaScript-rendered pages, which makes it very convenient to use.

## 2. Related links {#1-相關鏈接}

* Official documentation: [http://docs.pyspider.org/](http://docs.pyspider.org/)
* PyPI: [https://pypi.python.org/pypi/pyspider](https://pypi.python.org/pypi/pyspider)
* GitHub: [https://github.com/binux/pyspider](https://github.com/binux/pyspider)
* Official tutorial: [http://docs.pyspider.org/en/latest/tutorial](http://docs.pyspider.org/en/latest/tutorial)
* Online demo: [http://demo.pyspider.org](http://demo.pyspider.org/)

## 3. Prerequisites

pyspider supports JavaScript rendering, and that feature depends on PhantomJS, so [install](../12-qing-qiu-ku-de-an-zhuang/125-phantomjsde-an-zhuang.md) PhantomJS before installing pyspider.

## 4. Common errors

On Windows, you may see an error like this:

Command "python setup.py egg\_info" failed with error code 1 in /tmp/pip-build-vXo1W3/pycurl

This means the pycurl installation failed, and the pycurl library has to be installed separately. Go to [http://www.lfd.uci.edu/~gohlke/pythonlibs/\#pycurl](http://www.lfd.uci.edu/~gohlke/pythonlibs/#pycurl) and find the wheel file that matches your system. For example, on 64-bit Windows with Python 3.6, download `pycurl-7.43.0-cp36-cp36m-win_amd64.whl` and install it:

```text
pip install pycurl-7.43.0-cp36-cp36m-win_amd64.whl
```

On Linux, you may run into this PyCurl error:

\_\_main\_\_.ConfigurationError: Could not run curl-config: \[Errno 2\] No such file or directory

Solution:

```text
sudo apt-get install libcurl4-gnutls-dev
```

Once this package is installed, `pycurl` installs normally.

## 5. Verifying the installation {#5-驗證安裝}

Start pyspider:

```text
pyspider all
```

The console prints output like the following:

```text
C:\Users\miku>pyspider all
e:\python36\lib\site-packages\pyspider\libs\utils.py:196: FutureWarning: timeout is not supported on your platform.
  warnings.warn("timeout is not supported on your platform.", FutureWarning)
phantomjs fetcher running on port 25555
[I 180729 16:04:42 result_worker:49] result_worker starting...
[I 180729 16:04:43 processor:211] processor starting...
[I 180729 16:04:43 scheduler:647] scheduler starting...
[I 180729 16:04:43 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
[I 180729 16:04:43 scheduler:782] scheduler.xmlrpc listening on 127.0.0.1:23333
[I 180729 16:04:43 tornado_fetcher:638] fetcher starting...
[I 180729 16:04:44 app:76] webui running on 0.0.0.0:5000
```

If the command fails with the following error:

```text
C:\Users\miku>pyspider all
....
pkg_resources.DistributionNotFound: wsgidav
```

Solution:

```text
pip install -U setuptools
```

Then run `pyspider all` again. Once it starts successfully, pyspider's web service runs on local port 5000. Open [http://localhost:5000/](http://localhost:5000/) in a browser to reach the pyspider WebUI management page.

![](https://box.kancloud.cn/f606e3cd06dcc00b4a7bc5c8900a1fda_1909x381.png)

If you see a page similar to the one above, the installation succeeded.
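
The browser check above can also be done from a script. The following is a minimal sketch using only the Python standard library, assuming `pyspider all` is still running in another terminal and the WebUI listens on the default port 5000 shown in the log. The function name `webui_is_up` is a hypothetical helper for illustration, not part of pyspider.

```python
import urllib.error
import urllib.request


def webui_is_up(url="http://localhost:5000/", timeout=3.0):
    """Return True if an HTTP server answers at `url` with a 2xx/3xx status.

    Hypothetical helper for illustration; it is not part of pyspider.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    if webui_is_up():
        print("pyspider WebUI is reachable")
    else:
        print("no response; is `pyspider all` still running?")
```

A `False` result only means that nothing answered at that address within the timeout. It does not say why, so the console output of `pyspider all` remains the first place to look.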