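requests is one of the most widely used HTTP client libraries for Python, known for its concise API and rich feature set. This article covers its core features, advanced usage, and best practices.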
Installation and Import
- Install:
pip install requests
- Import:
import requests
Sending HTTP Requests
Basic Methods
Each common HTTP method has a matching helper function:
response = requests.get("https://api.example.com/data")
response = requests.post("https://api.example.com/post", data={"key": "value"})
response = requests.put("https://api.example.com/put", json={"key": "value"})
response = requests.delete("https://api.example.com/delete")
Controlling Redirects
Redirects are followed by default (except for HEAD requests). Pass allow_redirects=False to disable this:
response = requests.get(url, allow_redirects=False)
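With redirects disabled, the caller has to inspect the 3xx response itself. The following is a minimal sketch of one way to do that. The helper name redirect_target is hypothetical; it relies only on the Response.is_redirect property and the Location header.

```python
import requests


def redirect_target(response):
    """Return the Location header of a redirect response, or None otherwise.

    redirect_target is a hypothetical helper, not part of requests.
    """
    # is_redirect is True only when the status code is a redirect code
    # and the response also carries a Location header.
    if response.is_redirect:
        return response.headers["Location"]
    return None
```

Calling redirect_target(requests.get(url, allow_redirects=False, timeout=5)) would then return the next hop for a redirecting URL, which the caller can validate before deciding whether to follow it.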
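Streaming Requests
For large responses, receive the body in chunks rather than loading it into memory all at once:
with requests.get(url, stream=True) as response:
    for chunk in response.iter_content(chunk_size=8192):
        process(chunk)  # process() is a placeholder for your own handler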
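A common use of streaming is saving a large download to disk. The sketch below shows one way to do it. The function names save_stream and download and the default timeout values are assumptions made for illustration.

```python
import requests


def save_stream(response, path, chunk_size=8192):
    """Write a streamed response body to `path` and return the byte count."""
    written = 0
    with open(path, "wb") as fh:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty chunks
                fh.write(chunk)
                written += len(chunk)
    return written


def download(url, path, timeout=(3.05, 27)):
    """Stream `url` to `path`; the timeout values are illustrative."""
    # The with block releases the connection even if writing fails.
    with requests.get(url, stream=True, timeout=timeout) as response:
        response.raise_for_status()
        return save_stream(response, path)
```

Splitting the file-writing loop from the network call keeps save_stream usable with any object that offers iter_content, which also makes it easy to test without a server.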
Handling Responses
Response Attributes
Attribute/Method | Description | Example |
---|---|---|
response.status_code | HTTP status code (e.g. 200, 404) | if response.status_code == 200: |
response.text | Body decoded to text (encoding taken from the headers or guessed from the content) | print(response.text[:100]) |
response.content | Raw body bytes | with open("image.png", "wb") as f: f.write(response.content) |
response.json() | Body parsed as JSON (usually a dict or list) | data = response.json() |
response.headers | Response headers (case-insensitive dictionary) | content_type = response.headers["Content-Type"] |
response.cookies | Cookies returned by the server | print(response.cookies.get("session_id")) |
response.history | Redirect history | for resp in response.history: print(resp.url) |
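Encoding
If the text comes out garbled, set the encoding manually before reading response.text:
response.encoding = "gbk"  # e.g. for Chinese pages served as GBK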
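Exception Handling
Call raise_for_status() to turn HTTP error statuses into exceptions:
try:
    response.raise_for_status()  # raises HTTPError for 4xx/5xx status codes
except requests.HTTPError as e:
    print(f"Request failed: {e}")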
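HTTPError is only one of several exceptions a request can raise; timeouts and connection failures have their own classes, all derived from requests.RequestException. The sketch below shows how they might be told apart. The helper describe_failure and its category strings are hypothetical.

```python
import requests


def describe_failure(exc):
    """Map a requests exception to a short category string (hypothetical helper)."""
    # Check the most specific classes first: every class below derives
    # from RequestException, and ConnectTimeout is both a Timeout and a
    # ConnectionError.
    if isinstance(exc, requests.Timeout):
        return "timeout"
    if isinstance(exc, requests.ConnectionError):
        return "connection error"
    if isinstance(exc, requests.HTTPError):
        status = exc.response.status_code if exc.response is not None else "unknown"
        return f"HTTP {status}"
    if isinstance(exc, requests.RequestException):
        return "other request error"
    raise TypeError("not a requests exception")
```

In an except requests.RequestException as exc: block, describe_failure(exc) can feed a log message or a retry decision.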
Request Parameters in Detail
URL Parameters
The params argument builds the query string for you:
params = {"page": 2, "sort": "desc"}
response = requests.get("https://api.example.com", params=params)
# The URL becomes: https://api.example.com/?page=2&sort=desc
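To check how params is encoded without sending anything, the request can be prepared locally. This sketch also shows two behaviours worth knowing: list values are expanded into repeated keys, and keys whose value is None are dropped. The /search path and the tag and unused keys are made up for illustration.

```python
import requests

# Prepare the request locally to inspect the final URL; nothing is sent.
request = requests.Request(
    "GET",
    "https://api.example.com/search",  # placeholder endpoint
    params={"page": 2, "sort": "desc", "tag": ["a", "b"], "unused": None},
)
prepared = request.prepare()

print(prepared.url)
# https://api.example.com/search?page=2&sort=desc&tag=a&tag=b
```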
Custom Headers
Set headers to mimic a browser or to pass credentials:
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Authorization": "Bearer YOUR_TOKEN",
    "Accept-Language": "en-US",
}
response = requests.get(url, headers=headers)
Sending Data
Form Data
data = {"username": "admin", "password": "secret"}
response = requests.post(url, data=data) # Content-Type: application/x-www-form-urlencoded
JSON Data
json_data = {"title": "Hello", "body": "World"}
response = requests.post(url, json=json_data) # Content-Type: application/json
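The difference between data= and json= is easiest to see by preparing both requests locally and comparing what would go on the wire. This is a small sketch and nothing is sent; the URL is a placeholder.

```python
import json

import requests

URL = "https://api.example.com/post"  # placeholder endpoint

# data= with a dict is form-encoded.
form = requests.Request("POST", URL, data={"username": "admin"}).prepare()
print(form.headers["Content-Type"])  # application/x-www-form-urlencoded
print(form.body)                     # username=admin

# json= serializes the object and sets the JSON content type.
as_json = requests.Request("POST", URL, json={"title": "Hello"}).prepare()
print(as_json.headers["Content-Type"])  # application/json
print(json.loads(as_json.body))         # {'title': 'Hello'}
```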
File Uploads
Single file:
with open("report.pdf", "rb") as f:
    files = {"document": f}
    response = requests.post(url, files=files)
Multiple files / mixed data:
import json

with open("cat.jpg", "rb") as image:
    files = {
        "image": ("cat.jpg", image, "image/jpeg"),
        "metadata": ("data.json", json.dumps({"tag": "animal"}), "application/json"),
    }
    response = requests.post(url, files=files)
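Each files entry can be a (filename, content, content_type) tuple, and the result is a multipart/form-data body. The sketch below prepares such a request locally to show the outcome. The in-memory bytes stand in for real files, and the /upload URL is a placeholder.

```python
import io
import json

import requests

# In-memory stand-ins for real files so the sketch runs anywhere.
files = {
    "image": ("cat.jpg", io.BytesIO(b"\xff\xd8fake-jpeg"), "image/jpeg"),
    "metadata": ("data.json", json.dumps({"tag": "animal"}), "application/json"),
}
prepared = requests.Request(
    "POST", "https://api.example.com/upload", files=files
).prepare()

# requests generates the boundary, so never set this header by hand.
print(prepared.headers["Content-Type"])  # multipart/form-data; boundary=...
print(b'filename="cat.jpg"' in prepared.body)
```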
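Advanced Features
Session Management (Session)
A Session persists settings across requests and reuses pooled connections:
with requests.Session() as session:
    session.headers.update({"User-Agent": "MyApp/1.0"})
    session.auth = ("user", "pass")
    # Log in once; the session stores any cookies the server sets
    login_resp = session.post(login_url, data=credentials)
    # Later requests send those cookies automatically
    profile_resp = session.get(profile_url)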
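To see how session-level settings combine with per-request ones, Session.prepare_request() can build the final request without sending it. This is a minimal sketch; the /profile URL and the Accept header are assumptions chosen for illustration.

```python
import requests

session = requests.Session()
session.headers.update({"User-Agent": "MyApp/1.0"})
session.auth = ("user", "pass")

# prepare_request() merges session-level settings into one request
# without sending it; per-request headers override session headers.
request = requests.Request(
    "GET",
    "https://api.example.com/profile",  # placeholder endpoint
    headers={"Accept": "application/json"},
)
prepared = session.prepare_request(request)

print(prepared.headers["User-Agent"])     # MyApp/1.0
print(prepared.headers["Accept"])         # application/json
print(prepared.headers["Authorization"])  # Basic dXNlcjpwYXNz
session.close()
```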
Timeouts and Retries
- Timeout (set per request; requests has no global default):
response = requests.get(url, timeout=(3.05, 27))  # 3.05 s connect timeout, 27 s read timeout
- Custom retry policy (requires requests.adapters.HTTPAdapter):
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
retries = Retry(total=3, backoff_factor=1, status_forcelist=[500, 502, 503])
session.mount("https://", HTTPAdapter(max_retries=retries))
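The retry setup above can be wrapped in a small factory so that every caller gets the same policy. This is a sketch under stated assumptions: the function name make_retrying_session and its defaults are hypothetical, and it mounts the adapter on both schemes, whereas the snippet above covers only https://.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def make_retrying_session(total=3, backoff_factor=1, statuses=(500, 502, 503)):
    """Return a Session that retries on http:// and https:// (hypothetical helper)."""
    retries = Retry(
        total=total,
        backoff_factor=backoff_factor,
        status_forcelist=list(statuses),
    )
    adapter = HTTPAdapter(max_retries=retries)
    session = requests.Session()
    # Mounting only "https://" would leave plain-HTTP requests on the
    # default adapter, which does not retry.
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session
```

Two caveats apply. By default urllib3 applies status-based retries only to idempotent methods, so a POST is not retried unless the Retry object is configured to allow it. Retries also do not replace timeouts, so each call still needs its own timeout argument.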
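Proxy Configuration
HTTP, HTTPS, and SOCKS proxies are supported (SOCKS needs the optional extra, installed with pip install "requests[socks]"):
proxies = {
    "http": "http://10.10.1.10:3128",
    "https": "socks5://user:pass@host:port",
}
response = requests.get(url, proxies=proxies)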
SSL/TLS Security
- Disable verification (not recommended):
response = requests.get(url, verify=False)
- Custom CA bundle:
response = requests.get(url, verify="/path/to/ca-bundle.crt")
- Client certificate authentication:
response = requests.get(url, cert=("/path/client.cert", "/path/client.key"))
Authentication
- Basic Auth:
from requests.auth import HTTPBasicAuth
response = requests.get(url, auth=HTTPBasicAuth("user", "pass"))
- Digest Auth:
from requests.auth import HTTPDigestAuth
response = requests.get(url, auth=HTTPDigestAuth("user", "pass"))
- OAuth 1.0 (requires the separate requests-oauthlib package):
from requests_oauthlib import OAuth1
auth = OAuth1("client_key", "client_secret", "token", "token_secret")
response = requests.get(url, auth=auth)
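Schemes that are not built in, such as the bearer token shown earlier in the headers example, can be packaged as a reusable auth object by subclassing requests.auth.AuthBase. The class name BearerAuth below is an assumption for illustration and is not part of requests.

```python
import requests
from requests.auth import AuthBase


class BearerAuth(AuthBase):
    """Attach an 'Authorization: Bearer <token>' header (illustrative class)."""

    def __init__(self, token):
        self.token = token

    def __call__(self, request):
        # requests calls the auth object with the request being prepared.
        request.headers["Authorization"] = f"Bearer {self.token}"
        return request


# Prepare locally to confirm the header; nothing is sent.
prepared = requests.Request(
    "GET", "https://api.example.com/data", auth=BearerAuth("YOUR_TOKEN")
).prepare()
print(prepared.headers["Authorization"])  # Bearer YOUR_TOKEN
```

The same object can be assigned to session.auth so that every request made through the session carries the token.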
Event Hooks
Hooks let you run custom logic whenever a response arrives:
def log_response(resp, *args, **kwargs):
    print(f"Received {len(resp.content)} bytes from {resp.url}")

hooks = {"response": [log_response]}
requests.get("https://api.example.com", hooks=hooks)
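Hooks can also be registered once on a Session so that they apply to every request made through it. The sketch below uses a response hook to enforce status checking everywhere. The names raise_on_error and make_strict_session are hypothetical.

```python
import requests


def raise_on_error(resp, *args, **kwargs):
    """Response hook: raise HTTPError for any 4xx/5xx response."""
    resp.raise_for_status()


def make_strict_session():
    """Return a Session whose responses are all status-checked (hypothetical helper)."""
    session = requests.Session()
    # session.hooks is a dict of lists keyed by event name.
    session.hooks["response"].append(raise_on_error)
    return session
```

With such a session, a call like session.get(url, timeout=5) would raise requests.HTTPError on an error status without an explicit raise_for_status() at each call site.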
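Best Practices
- Timeouts: always pass timeout; without it a request can block indefinitely.
- Connection reuse: in high-throughput scenarios, use a Session so that connections are pooled and reused.
- Secure transport: avoid verify=False in production and prefer HTTPS.
- Resource management: use with statements so that files and sessions are closed properly.
- Error handling:
import logging

try:
    resp = requests.get(url, timeout=5)
    resp.raise_for_status()
except requests.RequestException as e:
    logging.error(f"Request failed: {e}")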
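The practices above can be combined into one small helper. This is a sketch rather than a definitive pattern: the function name fetch_json, the default timeout values, and the choice to return None on failure are all assumptions.

```python
import logging

import requests

logger = logging.getLogger(__name__)


def fetch_json(session, url, timeout=(3.05, 27)):
    """GET `url` through `session` and return parsed JSON, or None on failure."""
    try:
        response = session.get(url, timeout=timeout)
        response.raise_for_status()
        return response.json()
    except (requests.RequestException, ValueError) as exc:
        # ValueError covers JSON decoding errors on older requests versions.
        logger.error("Request to %s failed: %s", url, exc)
        return None
```

A caller would typically open the session with with requests.Session() as session: and pass it in, so that connections are reused across calls and closed at the end.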
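With these features in hand, you can efficiently handle REST API calls, web scraping, file transfers, and similar tasks. The flexibility and ease of use of requests make it a go-to HTTP tool for Python developers.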