Building an HTTP Request with the Requests Library


Request methods

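  • GET: retrieve a resource
  • POST: create a resource
  • PUT: modify a resource
  • DELETE: delete a resource
  • HEAD: retrieve only the response headers
  • OPTIONS: list the request methods available for a resource

Each method maps to a function of the same name: requests.[method](url)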
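As a minimal sketch of how the method names map onto the library, the snippet below builds (but does not send) one request per method with `requests.Request(...).prepare()`. The URL is a placeholder chosen for illustration, and `prepare_all` is a hypothetical helper.

```python
import requests

# Placeholder URL, used for illustration only; nothing is sent over the network.
URL = 'https://api.github.com/user'

METHODS = ['GET', 'POST', 'PUT', 'DELETE', 'HEAD', 'OPTIONS']

def prepare_all(url):
    """Build one prepared (unsent) request per HTTP method."""
    return {m: requests.Request(m, url).prepare() for m in METHODS}

prepared = prepare_all(URL)
for name, req in prepared.items():
    # The module-level shortcut for each method is requests.get, requests.post, ...
    shortcut = getattr(requests, name.lower())
    print(name, req.method, req.url, shortcut.__name__)
```

Preparing without sending is a convenient way to inspect what a call would put on the wire before touching a real API.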

GitHub API example

https://developer.github.com/v3/migrations/users/

  • json.loads() decodes a JSON-encoded string into a Python object
def loads(s, encoding=None, cls=None, object_hook=None, parse_float=None,
        parse_int=None, parse_constant=None, object_pairs_hook=None, **kw):
    """Deserialize ``s`` (a ``str`` instance containing a JSON
    document) to a Python object.
  • json.dumps() encodes a Python object into a JSON string
def dumps(obj, skipkeys=False, ensure_ascii=True, check_circular=True,
        allow_nan=True, cls=None, indent=None, separators=None,
        default=None, sort_keys=False, **kw):
    """Serialize ``obj`` to a JSON formatted ``str``.
  • encode: turn a Python object into a JSON string
  • decode: turn a JSON-encoded string back into a Python object
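The encode/decode round trip can be sketched with the standard library alone. The sample string below is made up for illustration and only mimics the shape of an API response.

```python
import json

# Sample JSON text, invented for illustration
raw = '[{"email": "octocat@github.com", "verified": true, "primary": true}]'

# Decode: JSON text -> Python objects (list, dict, str, bool, ...)
emails = json.loads(raw)
print(type(emails), emails[0]['verified'])

# Encode: Python objects -> JSON text, pretty-printed with a 4-space indent
pretty = json.dumps(emails, indent=4, sort_keys=True)
print(pretty)
```

Note that JSON `true` becomes Python `True` on decoding and is written back as `true` on encoding.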
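import json

import requests

URL = 'https://api.github.com'

def build_uri(endpoint):
    # Join the API root and the endpoint path with a single slash
    return '/'.join([URL, endpoint])

def better_print(json_str):
    """Re-serialize a JSON string with a 4-space indent for readability."""
    return json.dumps(json.loads(json_str), indent=4)

def request_method():
    # ('user', 'psw') are placeholder credentials. GitHub no longer accepts
    # account passwords for API basic auth, so pass a personal access token
    # in place of the password.
    response = requests.get(build_uri('user/emails'), auth=('user', 'psw'))
    print(response.status_code)
    print(better_print(response.text))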
>>[
  {
    "email": "octocat@github.com",
    "verified": true,
    "primary": true,
    "visibility": "public"
  }
]
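Joining with `'/'.join` works as long as the endpoint has no leading slash; `build_uri('/user/emails')` would produce a double slash. A slightly more forgiving variant, sketched here as a hypothetical helper named `build_uri_safe`, strips the slashes before joining.

```python
URL = 'https://api.github.com'

def build_uri_safe(endpoint, base=URL):
    """Join base and endpoint with exactly one slash between them."""
    return base.rstrip('/') + '/' + endpoint.lstrip('/')

print(build_uri_safe('user/emails'))
print(build_uri_safe('/user/emails'))
```

Both calls yield the same URL, so callers do not need to remember which form the helper expects.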

Updating user information on GitHub

  • PATCH /user
  • Note: If your email is set to private and you send an email parameter as part of this request to update your profile, your privacy settings are still enforced: the email address will not be displayed on your public profile or via the API.
  • response = requests.patch(url, auth=('user', 'psw'), json={'name':'123'})
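To see what the `json=` argument does, the sketch below prepares the same PATCH request without sending it and inspects the result. No credentials are attached and nothing goes over the network.

```python
import json

import requests

# Endpoint and payload mirror the PATCH /user call; the request is only prepared.
req = requests.Request('PATCH', 'https://api.github.com/user', json={'name': '123'})
prepped = req.prepare()

print(prepped.method)
print(prepped.headers['Content-Type'])
print(json.loads(prepped.body))
```

Passing `json=` serializes the dict into the request body and sets the `Content-Type` header, so there is no need to call `json.dumps()` by hand.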

Handling request exceptions

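requests.get(url, timeout=timeout)
# timeout=(x1, x2): x1 limits the connect phase and x2 limits the wait for the server's response (read), each on its own
# timeout=x: the same value x is applied to both the connect phase and the read phase; it is not a cap on the total download time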

import requests
from requests import exceptions

def timeout_request():
    try:
        response = requests.get(build_uri('user/emails'), timeout=10)
    except exceptions.Timeout as e:
        # Raised for both connect timeouts and read timeouts
        print(str(e))
    else:
        print(response.text)
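`Timeout` is only one branch of the exception hierarchy in `requests.exceptions`. A slightly fuller sketch, assuming we also want to treat 4xx/5xx responses as failures, might look like this. The `fetch` function and its injectable `get` parameter are hypothetical, added so the logic can be exercised without network access.

```python
import requests
from requests import exceptions

def fetch(url, timeout=(3.05, 10), get=requests.get):
    """Return the response body, or None if the request fails.

    `get` is injectable (an assumption of this sketch) so the function
    can be tried out with a stand-in instead of a real HTTP call.
    """
    try:
        response = get(url, timeout=timeout)
        response.raise_for_status()  # turn 4xx/5xx status codes into HTTPError
    except exceptions.ConnectTimeout as e:
        print('connect phase timed out:', e)
    except exceptions.ReadTimeout as e:
        print('server was too slow to respond:', e)
    except exceptions.HTTPError as e:
        print('bad status code:', e)
    except exceptions.RequestException as e:
        # Base class of everything requests raises
        print('request failed:', e)
    else:
        return response.text
    return None
```

The more specific exceptions are listed first because `ConnectTimeout` and `ReadTimeout` both derive from `Timeout`, which in turn derives from `RequestException`.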

Customizing requests

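def hard_request():
    from requests import Request, Session
    s = Session()
    headers = {'User-Agent': 'fake1.3.4'}
    # Build the request object first, without sending it
    req = Request('GET', build_uri('user/emails'), auth=('user', 'psw'),
                  headers=headers)
    # prepare() produces the exact request (method, URL, headers, body) to be sent
    prepped = req.prepare()
    print(prepped.body)
    print(prepped.headers)
    # Send the prepared request through the session
    resp = s.send(prepped, timeout=5)
    print(resp.status_code)
    print(resp.headers)
    print(resp.text)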
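One detail worth knowing: `Request.prepare()` ignores anything configured on the session, while `Session.prepare_request()` merges the session's headers, cookies, and auth into the request. The sketch below contrasts the two without sending anything; the header value is the same made-up string used in `hard_request`.

```python
from requests import Request, Session

s = Session()
s.headers.update({'User-Agent': 'fake1.3.4'})  # session-wide default header

req = Request('GET', 'https://api.github.com/user/emails')

bare = req.prepare()             # built in isolation, session state is ignored
merged = s.prepare_request(req)  # session headers, cookies, and auth are merged in

print('User-Agent' in bare.headers)
print(merged.headers['User-Agent'])
```

When several requests share the same headers or credentials, setting them once on the session and using `prepare_request()` avoids repeating them on every `Request`.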

About User-Agent

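The User-Agent header tells the web server which tool the visitor is using to make the request. Some sites refuse requests that identify themselves as crawlers and respond normally to requests that look like they come from a user's browser.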
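By default Requests identifies itself with a `python-requests/<version>` string, which is why a server can tell it apart from a browser. The sketch below prints that default and shows a per-request override; the browser-like value is a made-up placeholder, not a real browser's string, and the request is only prepared, never sent.

```python
import requests

# The User-Agent that Requests sends when none is supplied
default_ua = requests.utils.default_user_agent()
print(default_ua)

# Hypothetical browser-like value, for illustration only
custom_ua = 'Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0'

session = requests.Session()
prepped = session.prepare_request(
    requests.Request('GET', 'https://api.github.com',
                     headers={'User-Agent': custom_ua})
)
# A header given on the request takes precedence over the session default
print(prepped.headers['User-Agent'])
```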