Using a curl conversion tool for web scraping


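Tool URL: curl.trillworks.com/#

Steps

Step 1

Open the target website, press F12 to open the browser's developer tools, and find the request for the target URL in the Network panel.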


Step 2

Right-click the request you found and copy it as a curl command.


Step 3

Open the tool mentioned above (curl.trillworks.com/#) and paste the copied command into the curl command box.
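To give a rough idea of what such a conversion involves, here is a minimal sketch that pulls the URL and the `-H` headers out of a simple curl command. It is not the tool's actual implementation: the function name `parse_curl` and the sample command are made up for illustration, and flags other than `-H`/`--header` are simply ignored.

```python
import shlex


def parse_curl(command):
    """Split a simple `curl URL -H 'Name: value' ...` command into (url, headers).

    Hypothetical helper for illustration only: it understands the URL and
    -H/--header flags and ignores every other flag.
    """
    # Browsers often copy multi-line commands joined with backslash-newline.
    command = command.replace("\\\n", " ")
    tokens = shlex.split(command)
    if not tokens or tokens[0] != "curl":
        raise ValueError("not a curl command")

    url = None
    headers = {}
    i = 1
    while i < len(tokens):
        token = tokens[i]
        if token in ("-H", "--header") and i + 1 < len(tokens):
            # A header looks like "name: value"; split on the first colon.
            name, _, value = tokens[i + 1].partition(":")
            headers[name.strip().lower()] = value.strip()
            i += 2
            continue
        if not token.startswith("-") and url is None:
            url = token
        i += 1
    return url, headers


# Sample command (shortened, assumed shape of a browser-copied command).
example = (
    "curl 'https://www.pexels.com/zh-tw/' \\\n"
    "  -H 'accept-language: zh-CN,zh;q=0.9' \\\n"
    "  -H 'upgrade-insecure-requests: 1' \\\n"
    "  --compressed"
)
url, headers = parse_curl(example)
print(url)
print(headers)
```

The resulting `url` and `headers` have the same shape as the arguments passed to `requests.get` in Step 4.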


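Step 4

Copy the generated Python requests code into your own script. In this example, everything the browser sent with the request, including the cookie, ends up in the headers dict.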

import requests

headers = {
    'authority': 'www.pexels.com',
    'cache-control': 'max-age=0',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'sec-fetch-site': 'same-origin',
    'sec-fetch-mode': 'navigate',
    'sec-fetch-user': '?1',
    'sec-fetch-dest': 'document',
    'accept-language': 'zh-CN,zh;q=0.9',
    'cookie': 'ab.storage.deviceId.5791d6db-4410-4ace-8814-12c903a548ba=%7B%22g%22%3A%22ee41d5cf-b95c-72e8-47b6-8021191ee1cf%22%2C%22c%22%3A1629607476794%2C%22l%22%3A1629607476794%7D; locale=zh-TW; NEXT_LOCALE=zh-TW; _ga=GA1.2.2086648963.1629607477; _gid=GA1.2.1964460082.1629607477; _fbp=fb.1.1629607477458.1367769568; _hjid=9a7f4759-4fe1-40a6-a71b-00408ce0b67a; __cf_bm=c99322c9f18bb6aece68ccb68776a006535d03fa-1629685925-1800-AT2dEL6HY+3YE5NusuVzUpbFu656vUooeok71yXPChtgSmWdY/z3GFumJ2oABOVA0oKPJJa90disRpgugKz0qdkLUUrfckCYXEAx95XBGFoAb7/CR32Pg0PA604dm48FX4oiBRbwHa6qx2mHTrkQCv0Q1HNU/LTLQK0GV8GXkq/A; _gaexp=GAX1.2.tR3-05irSjCuHGCWvr4mHw.18888.0; _hjIncludedInSessionSample=1; _hjAbsoluteSessionInProgress=0; ab.storage.sessionId.5791d6db-4410-4ace-8814-12c903a548ba=%7B%22g%22%3A%22897973a1-faf2-2186-c4f5-b76e8b7b3b6a%22%2C%22e%22%3A1629687992369%2C%22c%22%3A1629685961891%2C%22l%22%3A1629686192369%7D',
}

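# Send the GET request with the headers captured from the browser.
response = requests.get('https://www.pexels.com/zh-tw/', headers=headers)
print(response.status_code)

# Save the returned HTML to a local file for inspection.
with open("waws.html", "w", encoding="utf-8") as f:
    f.write(response.text)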
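The generated code keeps the whole cookie as one long string inside the headers. If you would rather handle individual cookies, one possible approach is to split that string into a dict first. The sketch below assumes the usual `name=value; name=value` layout; the helper name `cookie_header_to_dict` and the shortened sample string are hypothetical.

```python
def cookie_header_to_dict(cookie_header):
    """Turn a 'name=value; name2=value2' cookie header into a dict.

    Hypothetical helper for illustration; pairs without '=' are skipped.
    """
    cookies = {}
    for pair in cookie_header.split(";"):
        # Split on the first '=' only, since values may contain '=' themselves.
        name, sep, value = pair.strip().partition("=")
        if sep:
            cookies[name] = value
    return cookies


# Shortened sample taken from the cookie string in the example above.
sample = "locale=zh-TW; NEXT_LOCALE=zh-TW; _hjIncludedInSessionSample=1"
cookies = cookie_header_to_dict(sample)
print(cookies)
```

The resulting dict can be passed to `requests.get` through its `cookies=` parameter instead of keeping a `'cookie'` entry in `headers`.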

This is only a demonstration, but the steps give the general workflow.