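1. Introduction
httpx is a next-generation HTTP client library for Python. Its main features:
A full-featured HTTP client built for Python 3
Supports both synchronous and asynchronous requests
Supports HTTP/1.1 and HTTP/2
Can send requests directly to WSGI and ASGI applications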
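2. Installation
The httpx release used in this article (0.18.2) requires Python 3.6+; newer releases require a newer Python. The async examples below call asyncio.run(), which needs Python 3.7+.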
pip install httpx
HTTP/2 support is optional: it has to be enabled on a client with http2=True, and its dependencies are installed with:
pip install httpx[http2]
3. Basic usage
3.1 Sending requests
3.1.1 GET request
import httpx
r = httpx.get('https://www.baidu.com/')
print(r)
result:
<Response [200 OK]>
3.1.2 POST request
r = httpx.post('https://httpbin.org/post', data={'key': 'value'})
3.1.3 PUT request
r = httpx.put('https://httpbin.org/put', data={'key': 'value'})
3.1.4 DELETE request
r = httpx.delete('https://httpbin.org/delete')
3.1.5 HEAD request
r = httpx.head('https://httpbin.org/get')
3.1.6 OPTIONS request
r = httpx.options('https://httpbin.org/get')
3.2 Request headers and query parameters
import httpx
headers = {'user-agent': 'my-app/1.0.0'}
params = {'key1': 'value1', 'key2': 'value2'}
url = 'https://httpbin.org/get'
r = httpx.get(url, headers=headers, params=params)
3.2.1 Status code
print(r.status_code)
result:
200
The status code can also be compared against the named constants in httpx.codes:
assert r.status_code == httpx.codes.OK
3.2.2 Text encoding
r = httpx.get('https://www.example.org')
print(r.encoding)
result:
UTF-8
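3.2.3 Response text
r = httpx.get(url, headers=headers, params=params)  # the httpbin.org request from section 3.2
print(r.text)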
result:
{
  "args": {
    "key1": "value1",
    "key2": "value2"
  },
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Host": "httpbin.org",
    "User-Agent": "my-app/1.0.0",
    "X-Amzn-Trace-Id": "Root=1-61aecbb8-0f6c4dc5758108010b1537f2"
  },
  "origin": "121.52.252.26",
  "url": "https://httpbin.org/get?key1=value1&key2=value2"
}
3.2.4 Response body parsed as JSON
print(r.json())
result:
{'args': {'key1': 'value1', 'key2': 'value2'}, 'headers': {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 'Host': 'httpbin.org', 'User-Agent': 'my-app/1.0.0', 'X-Amzn-Trace-Id': 'Root=1-61aecbb8-0f6c4dc5758108010b1537f2'}, 'origin': '121.52.252.26', 'url': 'https://httpbin.org/get?key1=value1&key2=value2'}
3.3 Handling cookies
3.3.1 Sending cookies with a request
import httpx
url = 'http://httpbin.org/cookies'
cookies = {'color': 'green'}
r = httpx.get(url, cookies=cookies)
print(r.json())
result:
{'cookies': {'color': 'green'}}
3.3.2 Scoping cookies to a domain
import httpx
url = 'http://httpbin.org/cookies'
cookies = httpx.Cookies()
cookies.set('cookie_on_domain', 'start', domain='httpbin.org')
cookies.set('cookie_off_domain', 'end', domain='example.org')
r = httpx.get(url, cookies=cookies)
print(r.json())
result:
{'cookies': {'cookie_on_domain': 'start'}}
3.4 Timeouts
import httpx
r = httpx.get('http://httpbin.org', timeout=0.001)
print(r)
result:
httpx.ReadTimeout: timed out
If the request takes longer than the configured timeout, httpx raises an exception, here httpx.ReadTimeout: timed out.
3.5 Uploading files
import httpx
# Open the file in binary mode and close it once the upload is done
with open('code_text.txt', 'rb') as f:
    files = {'upload_file': f}
    r = httpx.post('https://httpbin.org/post', files=files)
3.6 Sending JSON data
import httpx
data = {"integer":123, "boolean":True, "list":["a", "b", "c"]}
r = httpx.post('https://httpbin.org/post', json=data)
print(r.text)
result:
{
  "args": {},
  "data": "{\"integer\": 123, \"boolean\": true, \"list\": [\"a\", \"b\", \"c\"]}",
  "files": {},
  "form": {},
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Content-Length": "58",
    "Content-Type": "application/json",
    "Host": "httpbin.org",
    "User-Agent": "python-httpx/0.18.2",
    "X-Amzn-Trace-Id": "Root=1-61af1dca-5ca9cc7026faad621b4b8d88"
  },
  "json": {
    "boolean": true,
    "integer": 123,
    "list": [
      "a",
      "b",
      "c"
    ]
  },
  "origin": "121.52.252.26",
  "url": "https://httpbin.org/post"
}
3.7 HTTP authentication
3.7.1 Basic authentication
import httpx
url = 'https://example.com'
r = httpx.get(url, auth=("my_user", 'password123'))
print(r)
result:
<Response [200 OK]>
3.7.2 Digest authentication
import httpx
url = 'https://example.com'
auth = httpx.DigestAuth("my_user", 'password123')
r = httpx.get(url, auth=auth)
print(r)
result:
<Response [200 OK]>
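4. Advanced usage
With the top-level functions shown above, httpx opens a new connection for every request. With a large number of requests this becomes inefficient and wastes resources.
httpx provides Client to solve this. It is backed by an HTTP connection pool: when several requests go to the same host, the client reuses the existing TCP connection instead of opening a new one.
4.1 Sending requests with a Client object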
import httpx
with httpx.Client() as client:
    headers = {'X-Custom': 'value'}
    r = client.get('https://example.com', headers=headers)
    print(r.text)
result:
<!doctype html>
<html>
<head>
<title>Example Domain</title>
<meta charset="utf-8" />
<meta http-equiv="Content-type" content="text/html; charset=utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<style type="text/css">
body {
background-color: #f0f0f2;
margin: 0;
padding: 0;
font-family: -apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;
}
div {
width: 600px;
margin: 5em auto;
padding: 2em;
background-color: #fdfdff;
border-radius: 0.5em;
box-shadow: 2px 3px 7px 2px rgba(0,0,0,0.02);
}
a:link, a:visited {
color: #38488f;
text-decoration: none;
}
@media (max-width: 700px) {
div {
margin: 0 auto;
width: auto;
}
}
</style>
</head>
<body>
<div>
<h1>Example Domain</h1>
<p>This domain is for use in illustrative examples in documents. You may use this
domain in literature without prior coordination or asking for permission.</p>
<p><a href="https://www.iana.org/domains/example">More information...</a></p>
</div>
</body>
</html>
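4.2 Sharing configuration across requests
Parameters such as headers, cookies, and params can be passed to httpx.Client(); every request made through that client then shares them.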
import httpx
headers1 = {'x-auth': 'from-client'}
params1 = {'client_id': '1234'}
url = 'https://example.com'
with httpx.Client(headers=headers1, params=params1) as client:
    headers2 = {'x-custom': 'from-request'}
    params2 = {'request_id': '4321'}
    r1 = client.get(url)
    print(r1.request.headers)
    r2 = client.get(url, headers=headers2, params=params2)
    print(r2.request.headers)
result:
Headers({'host': 'example.com', 'accept': '*/*', 'accept-encoding': 'gzip, deflate', 'connection': 'keep-alive', 'user-agent': 'python-httpx/0.18.2', 'x-auth': 'from-client'})
Headers({'host': 'example.com', 'accept': '*/*', 'accept-encoding': 'gzip, deflate', 'connection': 'keep-alive', 'user-agent': 'python-httpx/0.18.2', 'x-auth': 'from-client', 'x-custom': 'from-request'})
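As shown, r1's request headers contain from-client, while r2's contain the contents of both headers2 and headers1: the client-level and request-level headers are merged to form the final request headers.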
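4.3 HTTP proxies
httpx can route requests through an HTTP proxy via the proxies parameter, and can use different proxies for http and https requests. (Newer httpx releases deprecate proxies in favor of the proxy and mounts parameters, so check the documentation for your installed version.)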
import httpx
proxies = {
    'http://': 'http://localhost:8080',   # proxy 1
    'https://': 'http://localhost:8081',  # proxy 2
}
url = 'https://example.com'
with httpx.Client(proxies=proxies) as client:
    r1 = client.get(url)
    print(r1)
proxies is configured on httpx.Client(); it cannot be passed to an individual client.get() call.
4.4 Timeouts
By default httpx enforces timeouts strictly: a request fails with a timeout error after 5 seconds of network inactivity.
4.4.1 Setting the timeout
Top-level request:
httpx.get('http://example.com/api/v1/example', timeout=10.0)
Client instance:
with httpx.Client() as client:
    client.get("http://example.com/api/v1/example", timeout=10.0)
4.4.2 Disabling timeouts
Top-level request:
httpx.get('http://example.com/api/v1/example', timeout=None)
Client instance:
with httpx.Client() as client:
    client.get("http://example.com/api/v1/example", timeout=None)
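4.5 SSL verification
When requesting an HTTPS URL, the client has to verify the identity of the host, which it does using SSL certificates. To use a custom CA bundle, pass its path via the verify parameter: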
r = httpx.get("https://example.org", verify="path/to/client.pem")
Or disable SSL verification:
r = httpx.get("https://example.org", verify=False)
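5. Asynchronous requests
By default httpx makes standard synchronous requests. For asynchronous requests, use the async client it provides, httpx.AsyncClient.
Async is a concurrency model that is considerably more efficient than multi-threading. It can bring significant performance gains and makes long-lived connections such as WebSockets practical.
5.1 Sending an asynchronous request
Use async/await to make asynchronous requests: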
import httpx
import asyncio
async def main():
    async with httpx.AsyncClient() as client:
        r = await client.get('https://www.example.com/')
        print(r)
if __name__ == '__main__':
    asyncio.run(main())
result:
<Response [200 OK]>
5.2 Comparing asynchronous and synchronous requests
5.2.1 Synchronous requests
import httpx
import time
def main():
    with httpx.Client() as client:
        for i in range(300):
            res = client.get('https://www.example.com/')
            print(f'Request {i}, status_code = {res.status_code}')
if __name__ == '__main__':
    start = time.time()
    main()
    end = time.time()
    print(f'Sent 300 requests synchronously in {end - start} seconds')
result:
Request 0, status_code = 200
Request 1, status_code = 200
Request 2, status_code = 200
......
Request 298, status_code = 200
Request 299, status_code = 200
Sent 300 requests synchronously in 71.6340000629425 seconds
5.2.2 Asynchronous requests
# Async version of the same 300 requests:
import httpx
import time
import asyncio
async def req(client, i):
    res = await client.get('https://www.example.com/')
    print(f'Request {i}, status_code = {res.status_code}')
    return res
async def main():
    async with httpx.AsyncClient() as client:
        task_list = []
        for i in range(300):
            # Schedule each request as a task so they run concurrently
            task = asyncio.create_task(req(client, i))
            task_list.append(task)
        await asyncio.gather(*task_list)
if __name__ == '__main__':
    start = time.time()
    asyncio.run(main())
    end = time.time()
    print(f'Sent 300 requests asynchronously in {end - start} seconds')
result:
Request 6, status_code = 200
Request 10, status_code = 200
Request 4, status_code = 200
......
Request 297, status_code = 200
Request 298, status_code = 200
Request 299, status_code = 200
Sent 300 requests asynchronously in 3.936000108718872 seconds
Because the requests run concurrently, the values of i are printed out of order.