The Complete Guide to the Python Requests Library


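requests is one of the most widely used HTTP client libraries for Python, known for its concise API and broad feature set. This article covers its core features, advanced usage, and best practices.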


Contents

  1. Installation and import
  2. Sending HTTP requests
  3. Handling responses
  4. Request parameters in detail
  5. Advanced features
  6. Best practices

Installation and import

  • Install
    pip install requests
    
  • Import
    import requests
    

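Sending HTTP requests

Basic methods

Each common HTTP method has a matching helper function (head, patch, and options are also available):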

response = requests.get("https://api.example.com/data")
response = requests.post("https://api.example.com/post", data={"key": "value"})
response = requests.put("https://api.example.com/put", json={"key": "value"})
response = requests.delete("https://api.example.com/delete")

Controlling redirects

Redirects are followed by default; pass allow_redirects=False to disable this:

response = requests.get(url, allow_redirects=False)

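Streaming requests

For large responses, receive the body chunk by chunk instead of loading it all into memory:

with requests.get(url, stream=True) as response:  # the with-block releases the connection
    for chunk in response.iter_content(chunk_size=8192):
        process(chunk)  # process() stands in for your own handler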
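
As a minimal sketch, the chunked loop above might be used to save a large download to disk without holding the whole body in memory. The helper names save_stream and download, the default timeout, and the target path are illustrative assumptions, not part of requests.

```python
import requests


def save_stream(response, path, chunk_size=8192):
    """Write a streamed response body to `path`; return the bytes written."""
    written = 0
    with open(path, "wb") as f:
        for chunk in response.iter_content(chunk_size=chunk_size):
            f.write(chunk)
            written += len(chunk)
    return written


def download(url, path, timeout=10):
    """Fetch `url` with stream=True and save the body chunk by chunk."""
    # The with-block releases the connection even if writing fails.
    with requests.get(url, stream=True, timeout=timeout) as response:
        response.raise_for_status()
        return save_stream(response, path)
```

A call such as download(url, "archive.zip") would then return the number of bytes saved.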

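Handling responses

Response attributes

| Attribute / method | Description | Example |
| --- | --- | --- |
| response.status_code | HTTP status code (e.g. 200, 404) | if response.status_code == 200: |
| response.text | Body decoded to text (encoding taken from the headers, otherwise guessed) | print(response.text[:100]) |
| response.content | Raw bytes of the body | with open("image.png", "wb") as f: f.write(response.content) |
| response.json() | Parses a JSON body into Python objects (usually a dict or list) | data = response.json() |
| response.headers | Response headers (case-insensitive dict) | content_type = response.headers["Content-Type"] |
| response.cookies | Cookies returned by the server | print(response.cookies.get("session_id")) |
| response.history | Redirect history | for resp in response.history: print(resp.url) |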

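Encoding

If the text comes out garbled, set the encoding manually before reading response.text:

response.encoding = "gbk"  # e.g. for Chinese web pages served as GBK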
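
The effect of response.encoding can be seen offline with a hand-built Response object. This is only an illustrative sketch: constructing requests.Response() directly and attaching a BytesIO as its raw stream is a stand-in for a real server reply, and the sample text is made up.

```python
import io

import requests

# Stand-in for a server reply whose body is GBK-encoded bytes.
resp = requests.Response()
resp.status_code = 200
resp.raw = io.BytesIO("中文网页".encode("gbk"))

# requests falls back to ISO-8859-1 for text/* replies that declare no charset.
resp.encoding = "ISO-8859-1"
garbled = resp.text

# Overriding the encoding changes how the same bytes are decoded.
resp.encoding = "gbk"
fixed = resp.text

print(repr(garbled))
print(fixed)
```

When the right encoding is not known in advance, response.apparent_encoding offers a guess based on the body bytes; treat it as a heuristic rather than a guarantee.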

Exception handling

Turn HTTP error statuses into exceptions:

try:
    response.raise_for_status()  # raises HTTPError for 4xx and 5xx status codes
except requests.HTTPError as e:
    print(f"Request failed: {e}")
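
To make it concrete which status codes raise_for_status() treats as errors, the sketch below checks a few codes without touching the network. The helper is_error is hypothetical, and building bare Response objects by hand is done purely for illustration.

```python
import requests


def is_error(status_code):
    """Return True if raise_for_status() raises for this status code."""
    resp = requests.Response()
    resp.status_code = status_code
    try:
        resp.raise_for_status()
    except requests.HTTPError:
        return True
    return False


for code in (200, 304, 404, 503):
    print(code, is_error(code))
```

Only the 4xx and 5xx codes raise; redirects and other non-2xx successes pass through silently, so check status_code yourself if those matter.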

Request parameters in detail

URL parameters

The query string is built automatically from a dict:

params = {"page": 2, "sort": "desc"}
response = requests.get("https://api.example.com", params=params)
# Final URL: https://api.example.com/?page=2&sort=desc
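
The encoding rules can be inspected without sending anything by preparing a request and reading its url attribute. In this sketch the endpoint and parameter names are made up; it shows that spaces are escaped, list values repeat the key, and None values are dropped.

```python
import requests

params = {
    "page": 2,
    "q": "red shoes",     # the space is percent-encoded as "+"
    "tag": ["a", "b"],    # a list repeats the key once per value
    "cursor": None,       # None values are left out entirely
}
request = requests.Request("GET", "https://api.example.com/items", params=params)
prepared = request.prepare()

print(prepared.url)
```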

Custom headers

Use headers to mimic a browser or to pass credentials:

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Authorization": "Bearer YOUR_TOKEN",
    "Accept-Language": "en-US"
}
response = requests.get(url, headers=headers)

Sending data

Form data

data = {"username": "admin", "password": "secret"}
response = requests.post(url, data=data)  # Content-Type: application/x-www-form-urlencoded

JSON data

json_data = {"title": "Hello", "body": "World"}
response = requests.post(url, json=json_data)  # Content-Type: application/json
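
The difference between data= and json= shows up in the prepared request's body and Content-Type header. A small offline sketch, assuming a placeholder endpoint URL:

```python
import json

import requests

url = "https://api.example.com/post"  # placeholder endpoint

form = requests.Request(
    "POST", url, data={"username": "admin", "password": "secret"}
).prepare()
as_json = requests.Request(
    "POST", url, json={"title": "Hello", "body": "World"}
).prepare()

# data= is URL-encoded like an HTML form submission.
print(form.headers["Content-Type"], form.body)

# json= is serialized to a JSON document.
print(as_json.headers["Content-Type"], json.loads(as_json.body))
```

Passing a dict to data= by mistake when the server expects JSON is a common cause of 400 or 415 replies, so inspecting the prepared request this way can help when debugging.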

File upload

Single file:

with open("report.pdf", "rb") as f:
    files = {"document": f}
    response = requests.post(url, files=files)

Multiple files / mixed data:

import json

# Each value is a (filename, content, content_type) tuple.
with open("cat.jpg", "rb") as image:
    files = {
        "image": ("cat.jpg", image, "image/jpeg"),
        "metadata": ("data.json", json.dumps({"tag": "animal"}), "application/json"),
    }
    response = requests.post(url, files=files)
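
To see what a multipart upload looks like on the wire, the request can be prepared without being sent. This sketch uses an in-memory BytesIO in place of a real image file; the endpoint and the extra "album" form field are assumptions added for illustration.

```python
import io
import json

import requests

files = {
    "image": ("cat.jpg", io.BytesIO(b"fake image bytes"), "image/jpeg"),
    "metadata": ("data.json", json.dumps({"tag": "animal"}), "application/json"),
}
prepared = requests.Request(
    "POST",
    "https://api.example.com/upload",  # placeholder endpoint
    files=files,
    data={"album": "pets"},            # ordinary form fields can ride along
).prepare()

# requests picks the multipart boundary and sets the header itself.
print(prepared.headers["Content-Type"])
print(len(prepared.body), "bytes in the encoded body")
```

Because the boundary is generated by requests, setting Content-Type by hand for a files= upload usually breaks the request.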

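Advanced features

Session management (Session)

A Session persists settings across requests and reuses pooled connections:

with requests.Session() as session:
    session.headers.update({"User-Agent": "MyApp/1.0"})
    session.auth = ("user", "pass")
    # The first login stores the cookies on the session
    login_resp = session.post(login_url, data=credentials)
    # Later requests send those cookies automatically
    profile_resp = session.get(profile_url)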
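
How session-level settings combine with per-request ones can be checked offline with Session.prepare_request. A minimal sketch, with a placeholder URL and made-up credentials:

```python
import requests

session = requests.Session()
session.headers.update({"User-Agent": "MyApp/1.0"})
session.auth = ("user", "pass")

request = requests.Request(
    "GET",
    "https://api.example.com/profile",       # placeholder URL
    headers={"Accept": "application/json"},  # per-request header
)
prepared = session.prepare_request(request)

# Session defaults and per-request headers are merged; request values win on conflict.
for name in ("User-Agent", "Accept", "Authorization"):
    print(name, "=", prepared.headers[name])
```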

Timeouts and retries

  • Per-request timeout (requests has no global default; without timeout a call can wait indefinitely)
    response = requests.get(url, timeout=(3.05, 27))  # 3.05 s to connect, 27 s to read
    
  • Custom retry policy (via requests.adapters.HTTPAdapter):
    from requests.adapters import HTTPAdapter
    from urllib3.util.retry import Retry
    
    session = requests.Session()
    retries = Retry(total=3, backoff_factor=1, status_forcelist=[500, 502, 503])
    session.mount("https://", HTTPAdapter(max_retries=retries))
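
The retry setup above covers only https:// URLs. One way to package it, sketched here under the assumption that both schemes should share the same policy, is a small factory function; the name build_session and its defaults are illustrative, not part of requests.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def build_session(total=3, backoff_factor=1.0, statuses=(500, 502, 503)):
    """Return a Session that retries failed requests on http:// and https://."""
    retry = Retry(
        total=total,
        backoff_factor=backoff_factor,  # waits grow exponentially between attempts
        status_forcelist=list(statuses),
    )
    adapter = HTTPAdapter(max_retries=retry)
    session = requests.Session()
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session


session = build_session()
```

Retries do not replace a timeout, so calls made through this session should still pass timeout=. Note also that urllib3's default policy retries on status codes only for idempotent methods, so a POST that returns 503 is not retried unless the Retry object is configured to allow it.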
    

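Proxy configuration

HTTP, HTTPS, and SOCKS proxies are supported (SOCKS needs the optional extra: pip install "requests[socks]"):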

proxies = {
    "http": "http://10.10.1.10:3128",
    "https": "socks5://user:pass@host:port"
}
response = requests.get(url, proxies=proxies)
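
Keys in the proxies dict are matched against the scheme of the target URL, and a scheme://hostname key can override the proxy for a single host. The sketch below uses requests.utils.select_proxy, a helper in the library's utils module, to show which entry would be chosen; all proxy addresses and hostnames are made up.

```python
from requests.utils import select_proxy

proxies = {
    "http": "http://10.10.1.10:3128",
    "https": "http://10.10.1.10:1080",
    # Host-specific entry: takes precedence over the plain "https" key.
    "https://internal.example.com": "http://10.20.0.5:8080",
}

for url in (
    "http://example.org/page",
    "https://example.org/page",
    "https://internal.example.com/api",
):
    print(url, "->", select_proxy(url, proxies))
```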

SSL/TLS security

  • Disable certificate verification (not recommended)
    response = requests.get(url, verify=False)
    
  • Custom CA bundle
    response = requests.get(url, verify="/path/to/ca-bundle.crt")
    
  • Client certificate authentication
    response = requests.get(url, cert=("/path/client.cert", "/path/client.key"))
    

Authentication

  • Basic Auth
    from requests.auth import HTTPBasicAuth
    response = requests.get(url, auth=HTTPBasicAuth("user", "pass"))
    
  • Digest Auth
    from requests.auth import HTTPDigestAuth
    response = requests.get(url, auth=HTTPDigestAuth("user", "pass"))
    
  • OAuth 1.0 (requires the separate requests-oauthlib package)
    from requests_oauthlib import OAuth1
    auth = OAuth1("client_key", "client_secret", "token", "token_secret")
    response = requests.get(url, auth=auth)
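
For Basic Auth, a plain (user, password) tuple is accepted as shorthand for HTTPBasicAuth. The sketch below prepares both forms against a placeholder URL and compares the resulting Authorization header.

```python
import requests
from requests.auth import HTTPBasicAuth

url = "https://api.example.com/private"  # placeholder URL

explicit = requests.Request("GET", url, auth=HTTPBasicAuth("user", "pass")).prepare()
shorthand = requests.Request("GET", url, auth=("user", "pass")).prepare()

# Both produce the same base64-encoded "user:pass" credential.
print(explicit.headers["Authorization"])
print(explicit.headers["Authorization"] == shorthand.headers["Authorization"])
```

Basic credentials are only base64-encoded, not encrypted, so they should be sent over HTTPS.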
    

Event hooks

Hooks let you run custom logic whenever a response arrives:

def log_response(resp, *args, **kwargs):
    print(f"Received {len(resp.content)} bytes from {resp.url}")

hooks = {"response": [log_response]}
requests.get("https://api.example.com", hooks=hooks)
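
Hooks can also be registered once on a Session so that they apply to every request made through it. As a sketch, the hook below turns any 4xx/5xx reply into an exception; the function name raise_on_error is an illustrative choice.

```python
import requests


def raise_on_error(resp, *args, **kwargs):
    """Response hook: raise HTTPError for 4xx/5xx replies."""
    resp.raise_for_status()


session = requests.Session()
# session.hooks maps event names to lists of callables; "response" is the event used here.
session.hooks["response"].append(raise_on_error)
```

With this in place, callers of session.get() or session.post() no longer need to call raise_for_status() themselves, at the cost of making every error status raise.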

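Best practices

  1. Timeouts: always pass timeout so a stalled server cannot block the program.
  2. Connection reuse: use a Session when making many requests to the same host so connections are pooled.
  3. Secure transport: avoid verify=False in production and prefer HTTPS.
  4. Resource management: use with statements so files and sessions are closed properly.
  5. Error handling: catch requests.RequestException, the base class of the library's exceptions.
    import logging

    try:
        resp = requests.get(url, timeout=5)
        resp.raise_for_status()
    except requests.RequestException as e:
        logging.error("Request failed: %s", e)
    

With these features in hand, you can handle REST API calls, web scraping, file transfers, and similar tasks efficiently. The flexibility and ease of use of requests make it a common first choice of HTTP tool for Python developers.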