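I recently wanted to learn web scraping with Python 3 and Scrapy, which means installing Python 3 and Scrapy first. macOS ships with Python 2.7, so there are two ways to get Python 3.6: upgrade the existing Python, or install 3.6 alongside it.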
Installing Python 3.6
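Installing 3.6 just means downloading the installer from the official site and running it, the same way you would install any other Mac application.
After installation the original Python is not overwritten. On my machine the new Python 3.6 interpreter lives at /usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/bin/python3.6 (/usr/local/Cellar is where Homebrew keeps its packages; the python.org installer normally uses /Library/Frameworks/Python.framework instead, so your path may differ).
At this point, typing python in the terminal still runs Python 2.7: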
$ python
Python 2.7.10 (default, Jul 15 2017, 17:16:57)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.31)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
Typing python3 runs Python 3.6:
$ python3
Python 3.6.2 (default, Sep 11 2017, 16:24:44)
[GCC 4.2.1 Compatible Apple LLVM 8.1.0 (clang-802.0.42)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
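With two interpreters installed side by side, it is easy to run a script with the wrong one. The following is a minimal sketch of a check you could put at the top of a script; the function names describe_interpreter and is_python3 are my own and not part of any library. It uses only sys, so it runs under both python and python3.

```python
import sys


def describe_interpreter():
    """Return (major, minor, executable path) for the running interpreter."""
    return sys.version_info[0], sys.version_info[1], sys.executable


def is_python3():
    """True when the script is being run by Python 3 or later."""
    return sys.version_info[0] >= 3


if __name__ == "__main__":
    major, minor, path = describe_interpreter()
    print("Python %d.%d at %s" % (major, minor, path))
    if not is_python3():
        print("This is Python 2; run the script with python3 instead.")
```

Saved as a file and run once with python and once with python3, it should report the two different versions and paths described above.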
Installing Scrapy
Now we can install Scrapy.
Python 3.6 comes with pip, so there is nothing extra to install. Run pip3 --version in the terminal to see its version and path:
$ pip3 --version
pip 9.0.1 from /usr/local/lib/python3.6/site-packages (python 3.6)
Install Scrapy with pip3:
$ pip3 install Scrapy
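I first blamed the failure below on typing the name in lowercase, but pip treats package names case-insensitively, so pip3 install scrapy and pip3 install Scrapy are equivalent. The "Connection refused" lines show the real cause: pip could not reach the package index, which is a network problem (for example a proxy or an unreachable mirror). Retrying once the connection works is enough.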
Collecting scrapy
Retrying (Retry(total=4, connect=None, read=None, redirect=None)) after connection broken by 'NewConnectionError('<pip._vendor.requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x103aa2c88>: Failed to establish a new connection: [Errno 61] Connection refused',)': /simple/scrapy/
Retrying (Retry(total=3, connect=None, read=None, redirect=None)) after connection broken by 'NewConnectionError('<pip._vendor.requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x103aa29e8>: Failed to establish a new connection: [Errno 61] Connection refused',)': /simple/scrapy/
Retrying (Retry(total=2, connect=None, read=None, redirect=None)) after connection broken by 'NewConnectionError('<pip._vendor.requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x103aa2630>: Failed to establish a new connection: [Errno 61] Connection refused',)': /simple/scrapy/
Retrying (Retry(total=1, connect=None, read=None, redirect=None)) after connection broken by 'NewConnectionError('<pip._vendor.requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x103aa2f28>: Failed to establish a new connection: [Errno 61] Connection refused',)': /simple/scrapy/
Retrying (Retry(total=0, connect=None, read=None, redirect=None)) after connection broken by 'NewConnectionError('<pip._vendor.requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x103aa2be0>: Failed to establish a new connection: [Errno 61] Connection refused',)': /simple/scrapy/
Could not find a version that satisfies the requirement scrapy (from versions: )
No matching distribution found for scrapy
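Note that every retry in the log requests /simple/scrapy/, in lowercase. As far as I know, package indexes compare project names using the normalization rule from PEP 503, which lowercases the name and collapses runs of "-", "_" and "." into a single "-". A minimal sketch of that rule (the function name normalize is my own choice):

```python
import re


def normalize(name):
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


if __name__ == "__main__":
    # Both spellings map to the same index path component.
    print(normalize("Scrapy"), normalize("scrapy"))
```

Both spellings end up as the same name, so capitalization cannot be what makes the lookup fail.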
Once the installation succeeds, type scrapy in the terminal to see the version number and usage:
$ scrapy
Scrapy 1.4.0 - no active project
Usage:
  scrapy <command> [options] [args]
Available commands:
  bench         Run quick benchmark test
  fetch         Fetch a URL using the Scrapy downloader
  genspider     Generate new spider using pre-defined templates
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy
  [ more ]      More commands available when run from project directory
Use "scrapy <command> -h" to see more info about a command
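The same check can be done from Python, which helps when the scrapy command works in the terminal but an IDE is using a different interpreter. This is only a sketch using the standard library; check_install is a hypothetical helper name, and it requires Python 3.

```python
import importlib.util
import shutil


def check_install(module_name, command_name):
    """Report whether a module is importable and where its command lives.

    module_found: True if this interpreter can locate the module.
    command_path: full path of the command on PATH, or None if absent.
    """
    spec = importlib.util.find_spec(module_name)
    return {
        "module_found": spec is not None,
        "command_path": shutil.which(command_name),
    }


if __name__ == "__main__":
    print(check_install("scrapy", "scrapy"))
```

If module_found is False while command_path is set, the command on PATH most likely belongs to a different Python installation than the one running the script.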
Creating a Scrapy project
PyCharm has no built-in option for creating a Scrapy project, so create one manually with the scrapy command (ArticleSpider is the project name):
$ scrapy startproject ArticleSpider
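The command creates an ArticleSpider directory containing scrapy.cfg and a Python package of the same name. Before opening it in PyCharm, a quick sanity check can confirm the skeleton is there. The sketch below is an assumption-laden helper of my own (missing_project_files is not a Scrapy API), and the file list covers only the core files I am sure startproject generates; your Scrapy version may create more.

```python
from pathlib import Path

# Core files of a freshly generated project, relative to the project root.
# {name} stands for the project name passed to "scrapy startproject".
EXPECTED_FILES = [
    "scrapy.cfg",
    "{name}/__init__.py",
    "{name}/items.py",
    "{name}/pipelines.py",
    "{name}/settings.py",
    "{name}/spiders/__init__.py",
]


def missing_project_files(root, name):
    """Return the expected files that do not exist under root."""
    root = Path(root)
    expected = [template.format(name=name) for template in EXPECTED_FILES]
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    # Run from the directory where "scrapy startproject" was executed.
    print(missing_project_files("ArticleSpider", "ArticleSpider"))
```

An empty list means the skeleton is complete; when opening the project in PyCharm, choose the outer ArticleSpider directory (the one containing scrapy.cfg) as the project root.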