What Is the Difference Between the Python Crawler Modules urllib and Requests?

In the previous article (see the references section), we already discussed Pyt

In this article, we focus on the differences between Python's urllib and Requests modules.

1. Differences in fetching web page data

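The imports differ. urllib is part of the Python standard library and is imported as urllib.request, while Requests is a third-party package that has to be installed before it can be imported.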

# import the python urllib module.
import urllib.request

# import the python requests module.
import requests

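The way a request is sent differs. urllib sends a web page request through the urlopen() function, while Requests provides one function per request type, such as requests.get() or requests.post(), and you call the one that matches the request you want to make.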

# The urllib module sends a web page request with the urlopen() function.
resp = urllib.request.urlopen("https://www.google.com")

# The requests module sends a GET request with requests.get().
resp = requests.get("https://www.google.com")
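The two calls also differ in how the response is read afterwards. The following is a minimal sketch of that difference, not part of the original article: it starts a throwaway local HTTP server (the HelloHandler class and the fetch_both function are hypothetical names) so that the same page can be fetched with both modules without touching a real website. It assumes the requests package is installed.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests  # third-party package: pip install requests


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a short HTML page."""

    def do_GET(self):
        body = "<h1>hello</h1>".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch_both():
    """Fetch the same local page once with urllib and once with requests."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_port}/"
    try:
        # urllib: the response is file-like; read the bytes and decode them yourself.
        with urllib.request.urlopen(url) as resp:
            urllib_result = (resp.status, resp.read().decode("utf-8"))

        # requests: the status code and the decoded text are plain attributes.
        resp = requests.get(url)
        requests_result = (resp.status_code, resp.text)
    finally:
        server.shutdown()
        server.server_close()
    return urllib_result, requests_result


if __name__ == "__main__":
    print(fetch_both())
```

With urllib the body arrives as bytes and the caller chooses the encoding; Requests picks the encoding from the response headers and exposes the decoded string as resp.text.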

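The way request data is packaged differs. For more complex requests, calling urlopen() with a bare URL is not enough.

With the urllib module, a website that uses anti-crawler mechanisms requires us to wrap the URL, together with the headers and any form data, in a Request object before the data can be fetched.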

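import ssl
import urllib.parse
import urllib.request

url = "https://www.google.com"

headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.71 Safari/537.36"
}

# Form data must be URL-encoded and converted to bytes; passing data makes this a POST request.
data = bytes(urllib.parse.urlencode({"hello": "world"}), encoding="utf-8")

# Wrap the URL, form data, and headers in a Request object.
req = urllib.request.Request(url=url, data=data, headers=headers)

# ctx was not defined in the original snippet; a default SSL context is assumed here.
ctx = ssl.create_default_context()

resp = urllib.request.urlopen(req, context=ctx)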
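To see what the Request object actually carries, it can be built and inspected without sending anything. This is a small sketch under assumptions: the URL is a placeholder, the user-agent string is a shortened stand-in, and build_request is a hypothetical helper name.

```python
import urllib.parse
import urllib.request

URL = "https://www.example.com/"  # placeholder URL, not from the article
HEADERS = {"user-agent": "Mozilla/5.0 (demo)"}  # shortened, hypothetical UA string


def build_request(form=None):
    """Wrap the URL, optional form data, and headers in a Request (nothing is sent)."""
    data = None
    if form is not None:
        # urllib expects the form body as URL-encoded bytes.
        data = urllib.parse.urlencode(form).encode("utf-8")
    return urllib.request.Request(url=URL, data=data, headers=HEADERS)


get_req = build_request()
post_req = build_request({"hello": "world"})

print(get_req.get_method())               # GET
print(post_req.get_method())              # POST
print(post_req.data)                      # b'hello=world'
print(post_req.get_header("User-agent"))  # Mozilla/5.0 (demo)
```

Note that urllib switches to POST as soon as data is supplied, and that it stores header names in capitalized form, which is why the lookup uses "User-agent".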

The Requests module does not need any of these extra steps. Passing the headers argument when sending the request is enough.

import requests

headers = {
    
    "user-agent": "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.71 Safari/537.36"
}

resp=requests.get("https://www.google.com", headers=headers)
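Form data is handled in the same spirit. The sketch below, which assumes the requests package is installed, uses requests.Request(...).prepare() to build the request that would go over the wire without sending it. The URL and user-agent string are placeholders, and prepare is a hypothetical helper name.

```python
import requests  # third-party package: pip install requests

URL = "https://www.example.com/"  # placeholder URL, not from the article
HEADERS = {"user-agent": "Mozilla/5.0 (demo)"}  # shortened, hypothetical UA string


def prepare(method, form=None):
    """Build the request that requests would send, without sending it."""
    return requests.Request(method, URL, headers=HEADERS, data=form).prepare()


get_req = prepare("GET")
post_req = prepare("POST", {"hello": "world"})

print(get_req.method, get_req.body)      # GET None
print(post_req.method)                   # POST
print(post_req.body)                     # hello=world
print(post_req.headers["User-Agent"])    # Mozilla/5.0 (demo)
print(post_req.headers["Content-Type"])  # application/x-www-form-urlencoded
```

A plain dict is enough for the form body: Requests URL-encodes it and sets the Content-Type header, so there is no manual urlencode() or bytes() step. In everyday code the same thing is written as requests.post(url, data=..., headers=...).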

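2. Differences in parsing web data

Responses fetched with either urllib or Requests can be parsed with the bs4 and re modules. Response data can also be parsed with XPath, an approach that is commonly paired with Requests; XPath support comes from a separate parsing library such as lxml rather than from Requests itself.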
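Whichever module fetched the page, the parsing step works on the decoded text. The sketch below parses a hard-coded sample string (a hypothetical stand-in for a fetched response body) in two ways: with re, and with the limited XPath subset of the standard library's xml.etree.ElementTree. ElementTree is swapped in here only to keep the example dependency-free; it needs well-formed markup, so real pages are usually handled with bs4 or lxml instead.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for a decoded response body.
PAGE = (
    "<html><body>"
    "<h1>Title</h1>"
    '<a href="/a">A</a>'
    '<a href="/b">B</a>'
    "</body></html>"
)


def links_with_re(text):
    """Extract href values with a regular expression."""
    return re.findall(r'href="([^"]+)"', text)


def links_with_xpath(text):
    """Extract href values with an XPath-style query (well-formed markup only)."""
    root = ET.fromstring(text)
    return [a.get("href") for a in root.findall(".//a")]


def heading_with_xpath(text):
    """Return the text of the first h1 element."""
    return ET.fromstring(text).findtext(".//h1")


print(links_with_re(PAGE))       # ['/a', '/b']
print(links_with_xpath(PAGE))    # ['/a', '/b']
print(heading_with_xpath(PAGE))  # Title
```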