Overview: installing pip
Mac: run `sudo easy_install pip` in the terminal, enter your Mac password when prompted, and wait a moment.
Command line: sudo easy_install pip
Password:
Searching for pip
Reading https://pypi.python.org/simple/pip/
...
##1. Web scraping with requests and BeautifulSoup

    # -*- coding: UTF-8 -*-
    import requests
    from bs4 import BeautifulSoup

    # Fetch the target page
    html = requests.get("http://www.jianshu.com/")
    # Parse the returned page source with BeautifulSoup, naming the parser explicitly
    soup = BeautifulSoup(html.content, 'html.parser')
    # select() returns a list of every tag whose class is "content"
    for item in soup.select(".content"):
        # print item.select(".avatar")[0]
        name = item.select(".blue-link")[0].text
        title = item.select(".title")[0].text
        content = item.select(".abstract")[0].text
        time = item.select(".time")[0]["data-shared-at"]
        print "Name:", name
        print "Title:", title
        print "Content:", content
        print "Time:", time
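
The loop above indexes `[0]` on every `select()` result, so a single post that lacks one of the elements raises `IndexError` and stops the whole run. Below is a minimal sketch of a more defensive variant, assuming the same class names as above. The inline `SAMPLE_HTML` snippet and the `text_of` and `extract_posts` helpers are hypothetical and only stand in for the live page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that imitates the structure the loop above expects.
SAMPLE_HTML = """
<div class="content">
  <a class="blue-link">junwen</a>
  <a class="title">First post</a>
  <p class="abstract">Some summary text</p>
  <span class="time" data-shared-at="2017-01-01T10:00:00+08:00"></span>
</div>
<div class="content">
  <a class="blue-link">haha</a>
  <a class="title">Post without abstract</a>
</div>
"""


def text_of(item, selector):
    """Return the stripped text of the first match, or None if nothing matches."""
    tag = item.select_one(selector)
    return tag.get_text(strip=True) if tag is not None else None


def extract_posts(html):
    """Collect one dict per .content block, tolerating missing child elements."""
    soup = BeautifulSoup(html, "html.parser")
    posts = []
    for item in soup.select(".content"):
        time_tag = item.select_one(".time")
        posts.append({
            "name": text_of(item, ".blue-link"),
            "title": text_of(item, ".title"),
            "content": text_of(item, ".abstract"),
            # .get() returns None instead of raising KeyError when the attribute is absent
            "time": time_tag.get("data-shared-at") if time_tag is not None else None,
        })
    return posts


posts = extract_posts(SAMPLE_HTML)
for post in posts:
    print(post)
```

`select_one` returns `None` when nothing matches, which lets the caller decide whether a missing field is an error or just an empty value.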
##2. Database operations
MySQL driver: https://dev.mysql.com/downloads/connector/python/
```
# -*- coding: UTF-8 -*-
import mysql.connector

config = {
    'user': 'root',
    'password': 'root',
    'host': '127.0.0.1',
    'database': 'test',
}
con = mysql.connector.connect(**config)
cursor = con.cursor()

# Insert
cursor.execute("insert into User values(null,%s,%s)", ['haha', '123'])
row = cursor.rowcount  # number of affected rows
print row  # 1
con.commit()  # mysql.connector does not autocommit by default

# Query
cursor = con.cursor()
cursor.execute("select * from User")
fetchall = cursor.fetchall()
print fetchall  # [(1, u'junwen', u'123'), (2, u'junwen', u'123'), (3, u'junwen', u'123')]

cursor.close()
con.close()
```
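
The insert and query steps above follow the DB-API pattern shared by Python database drivers. The sketch below swaps in the standard-library `sqlite3` module with an in-memory database so that it runs without a MySQL server; it is a stand-in, not the MySQL connector. The `User` table layout (id, name, password) is an assumption inferred from the sample output, and `sqlite3` uses `?` placeholders where `mysql.connector` uses `%s`. The sketch also calls `commit()`, which is what makes an insert permanent.

```python
import sqlite3

# In-memory database: a stand-in for the MySQL server, nothing is written to disk.
con = sqlite3.connect(":memory:")
cursor = con.cursor()

# Assumed schema: the original post does not show how User was created.
cursor.execute(
    "create table User (id integer primary key autoincrement, name text, password text)"
)

# Insert with bound parameters rather than building SQL by string concatenation.
cursor.execute("insert into User values (null, ?, ?)", ("haha", "123"))
inserted = cursor.rowcount  # number of rows affected by the last statement
con.commit()                # make the insert permanent

# Query
cursor.execute("select * from User")
rows = cursor.fetchall()
print(inserted)
print(rows)

cursor.close()
con.close()
```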
##3. Splinter, a testing tool that can drive web pages automatically
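Install the packages with `pip install splinter` and `pip install selenium`.
**Download chromedriver.exe and geckodriver.exe and add each to the PATH environment variable. Add the folder that contains them, not the path to the .exe file itself.**
chromedriver: http://download.csdn.net/download/qianaier/7966945
http://download.csdn.net/download/anan_ss/9723479
geckodriver: https://github.com/mozilla/geckodriver/releases/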
After updating the environment variable, also copy chromedriver.exe into the directory of the .py file you want to run.

    # coding=utf-8
    from splinter.browser import Browser

    xx = Browser(driver_name="chrome")
    xx.visit("http://item.jd.com/2707976.html")
    print xx.title  # page title: 京东...
    print xx.driver_name  # browser name: chrome
    print xx.url  # URL of the current page
    xx.click_link_by_text("你好,请登录")  # click the link with this exact text ("Hello, please log in")
    xx.click_link_by_text("账户登录")  # "Account login"
    xx.fill("loginname", "YOUR_USERNAME")  # fill a field, looked up by its name attribute
    xx.fill("nloginpwd", "YOUR_PASSWORD")
    xx.click_link_by_id("loginsubmit")

##4. Notes
**1. Encoding: if the source file contains Chinese characters, add `# -*- coding: UTF-8 -*-` as the first line of the code**

**2. Cause of the error 'module' object is not callable**

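Cause and fix: [Python](http://lib.csdn.net/base/python) has two ways to import a module: `import module` and `from module import ...`. With the former, everything you use from the module must be qualified with the module name; with the latter, no qualification is needed. The error appears when a module imported with `import module` is called as if it were the class or function defined inside it.
Correct code:

    import Person
    person = Person.Person('dnawo','man')
    print person.Name

or

    from Person import *
    person = Person('dnawo','man')
    print person.Name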
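
The error is easy to reproduce with any module whose main class shares the module's name. Below is a minimal sketch that uses the standard-library `datetime` module in place of the `Person` module, whose source file is not shown in this post.

```python
import datetime

# Wrong: `datetime` here is the module, not the class inside it.
try:
    datetime(2017, 1, 1)
    message = None
except TypeError as exc:
    message = str(exc)
print(message)

# Right, option 1: qualify the class with the module name.
d1 = datetime.datetime(2017, 1, 1)

# Right, option 2: import the class itself, then use it unqualified.
from datetime import date
d2 = date(2017, 1, 1)

print(d1.date() == d2)
```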
**3. WindowsError: [Error 183]: this happens because a folder with the same name already exists**
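
Error 183 is what Windows reports when code tries to create a file or directory that is already there, for example a second `os.mkdir` call on the same path. Below is a minimal sketch of one way to make directory creation safe to repeat, assuming Python 3.2 or later for the `exist_ok` argument. The `output` folder name is hypothetical.

```python
import os
import shutil
import tempfile

# Work inside a throwaway directory so the example does not touch real folders.
base = tempfile.mkdtemp()
target = os.path.join(base, "output")  # hypothetical folder name

os.mkdir(target)

# A second plain mkdir on the same path fails; on Windows this is Error 183.
try:
    os.mkdir(target)
    second_mkdir_failed = False
except OSError:
    second_mkdir_failed = True

# exist_ok=True turns "already exists" into a no-op instead of an error.
os.makedirs(target, exist_ok=True)
still_a_directory = os.path.isdir(target)
print(second_mkdir_failed, still_a_directory)

# Clean up the throwaway directory.
shutil.rmtree(base)
```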

**4. UnicodeDecodeError: 'ascii' codec can't decode byte 0xe5 in position 0: ordinal not in range(128)**

Fix: add the following code:

    import sys
    reload(sys)
    sys.setdefaultencoding('utf8')
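
`sys.setdefaultencoding` exists only in Python 2 and changes decoding behaviour for the whole process. A narrower alternative is to decode bytes explicitly at the point where they enter the program. Below is a minimal sketch, assuming the incoming bytes are UTF-8. The sample bytes encode a single Chinese character (U+597D) whose first byte is 0xe5, matching the error message above.

```python
# UTF-8 bytes for the character U+597D; the first byte is 0xe5.
raw = b"\xe5\xa5\xbd"

# Decoding as ASCII reproduces the error, because 0xe5 is outside range(128).
try:
    raw.decode("ascii")
    error = None
except UnicodeDecodeError as exc:
    error = exc
print(error)

# Decoding with the real encoding works, and encoding again gives back the same bytes.
text = raw.decode("utf-8")
roundtrip = text.encode("utf-8")
print(len(text), roundtrip == raw)
```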
##5. Learning resources
http://cuiqingcai.com/1319.html
http://www.liaoxuefeng.com/wiki/001374738125095c955c1e6d8bb493182103fac9270762a000/001391435131816c6a377e100ec4d43b3fc9145f3bb8056000
http://www.runoob.com/python/python-object.html