What to do when the response status code is 418?


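Preface

While writing crawlers in languages such as Python and Node.js, a request to a website came back with status code 418, so I am recording the cause and the fix here.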

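Cause

The target website has an anti-crawler mechanism. If the request does not carry the expected request headers, the site responds with status code 418.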
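To illustrate what "missing request headers" means in practice, here is a minimal sketch that uses only Python's standard library to build a request without sending it, once with no explicit User-Agent and once with a browser-like one. The helper name `build_request` and the URL `https://example.com/` are assumptions made for illustration, and which headers a particular site actually inspects is not something this post can confirm.

```python
from urllib.request import Request

# Browser-like User-Agent string (the same one used in the solution below).
BROWSER_UA = ("Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36")


def build_request(url, user_agent=None):
    """Build (but do not send) a request, optionally with a User-Agent.

    Hypothetical helper for illustration only.
    """
    headers = {"User-Agent": user_agent} if user_agent else {}
    return Request(url, headers=headers)


bare = build_request("https://example.com/")
browser_like = build_request("https://example.com/", BROWSER_UA)

# urllib stores header names capitalized, so the key is "User-agent".
print(bare.get_header("User-agent"))          # None: nothing was set explicitly
print(browser_like.get_header("User-agent"))  # the browser-like string
```

When no User-Agent is set explicitly, HTTP libraries usually fall back to a default that identifies the library itself, which is easy for an anti-crawler check to spot.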

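The 418 status code

In this context, 418 is the status code returned when the target site's anti-crawler program detects the request as coming from a crawler.

Official meaning: 418 I’m a teapot. The HTTP 418 I’m a teapot client error response code indicates that the server refuses to brew coffee because it is a teapot. This error is a reference to the Hyper Text Coffee Pot Control Protocol, which was an April Fools’ joke in 1998.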

Solution

Add request headers, in particular a browser-like User-Agent:

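import requests


def request_douban(url):
    # Send a browser-like User-Agent; without it the site may answer 418.
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36'}
    try:
        response = requests.get(url, headers=headers)
        print(response)  # e.g. <Response [200]>

        if response.status_code == 200:
            return response.text
    except requests.RequestException:
        return None
    # Any non-200 status (including 418) also yields None.
    return None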
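
The function above returns None both for a 418 response and for a network error, so the caller cannot tell the two apart. As a small sketch of one way to make the difference visible in a crawler log, the helper below maps a status code to a short hint. The name `describe_status` and the message texts are hypothetical and not part of the original solution.

```python
def describe_status(status_code):
    """Map an HTTP status code to a short hint (illustrative only)."""
    if status_code == 200:
        return "ok"
    if status_code == 418:
        # In this post's scenario, 418 means the anti-crawler check fired.
        return "blocked: check request headers such as User-Agent"
    return "unexpected status {}".format(status_code)


print(describe_status(418))
```

A caller could log `describe_status(response.status_code)` before returning, so a blocked request is distinguishable from other failures.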