Using urllib2 (Part 3)


Adding request data with add_data

How to attach data to a Request after it has been created:

import urllib2

# Form parameters, already URL-encoded
data = "first=true&p=1&kd=python"
# Browser-like User-Agent header
headers = {"User-Agent": "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0);"}
request = urllib2.Request("https://www.lagou.com/lbs/getAllCitySearchLabels.json", headers=headers)
# Attaching data turns the request into a POST
request.add_data(data)
print(urllib2.urlopen(request).read())
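
The effect of attaching data can be checked without sending anything. The following is a minimal sketch, not the author's code. It assumes the script may also need to run under Python 3, where urllib2 became urllib.request and add_data is no longer available, so data is passed to the constructor instead. The example.com URL is a placeholder.

```python
try:
    import urllib2 as request_lib          # Python 2, as used in this article
except ImportError:
    import urllib.request as request_lib   # Python 3 equivalent (assumption)

# Placeholder URL (hypothetical); nothing below touches the network.
URL = "https://example.com/search"

# Without data, the request defaults to GET.
get_request = request_lib.Request(URL)

# With data attached (URL-encoded bytes), the same call produces a POST.
post_request = request_lib.Request(URL, data=b"first=true&p=1&kd=python")

print(get_request.get_method())
print(post_request.get_method())
```

get_method() reports GET for the first request and POST for the second, which is the same switch that add_data triggers in the urllib2 code above.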

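The data parameter in POST requests

This example sets the data argument of Request. Note that the parameters are written as a dict, which must be converted into a URL-encoded string with urllib.urlencode before being passed in. Once data is set, the request is sent as a POST.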

import urllib2
import urllib

url = "https://umbra.nascom.nasa.gov/cgi-bin/eit-catalog.cgi"
values = {
    'obs_year': '2011', 'obs_month': 'March', 'obs_day': '8',
    'start_year': '2011', 'start_month': 'March', 'start_day': '8',
    'start_hour': 'All Hours',
    'stop_year': '2011', 'stop_month': 'March', 'stop_day': '8',
    'stop_hour': 'All Hours',
    'xsize': 'All', 'ysize': 'All',
    'wave': 'all', 'filter': 'all', 'object': 'all',
    'xbin': 'all', 'ybin': 'all', 'highc': 'all',
}

data = urllib.urlencode(values)  # dict -> "key=value&key=value" string
request = urllib2.Request(url, data)  # passing data makes this a POST request
print(urllib2.urlopen(request).read())
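
To make the encoding step concrete, here is a small sketch of what urlencode produces for a few of the fields above. It is an illustration under assumptions: the import falls back to urllib.parse so it also runs on Python 3, and a list of pairs is used instead of a dict only to keep the output order fixed.

```python
try:
    from urllib import urlencode          # Python 2
except ImportError:
    from urllib.parse import urlencode    # Python 3 (assumption)

# A subset of the form fields; a list of pairs keeps the order predictable.
fields = [
    ("obs_year", "2011"),
    ("obs_month", "March"),
    ("start_hour", "All Hours"),
]

# urlencode joins the pairs with "&" and escapes unsafe characters;
# a space becomes "+".
encoded = urlencode(fields)
print(encoded)  # obs_year=2011&obs_month=March&start_hour=All+Hours

# Python 3 requires the request body as bytes; in Python 2 this is a no-op.
body = encoded.encode("ascii")
```

The resulting string (or bytes on Python 3) is what gets passed as the data argument of Request.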

AJAX POST requests

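  • The AJAX endpoint is requested with POST.
  • When the request is sent with the Python library's default headers (User-Agent: Python-urllib/x.y), it triggers the site's anti-scraping mechanism and the server returns status code 418.
  • Adding headers that imitate a browser is enough to get a normal response.

import urllib2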
import urllib

url = "https://movie.douban.com/j/chart/top_list?type=10&interval_id=100%3A90&action"
headers = {"User-Agent": "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0);"}
formdata = {"limit":"20","start":"0"}

data = urllib.urlencode(formdata)
# data makes this a POST; headers make it look like a browser request
request = urllib2.Request(url, data=data, headers=headers)
print(urllib2.urlopen(request).read())
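
When a site rejects the request, urlopen raises an HTTPError instead of returning a response, so the 418 only shows up as a traceback. The sketch below is one possible way to build the request and surface the status code. The helper names build_post and fetch are hypothetical, the example.com URL is a placeholder, and the import fallback for Python 3 is an assumption. Nothing is sent until fetch is called.

```python
try:
    import urllib2 as request_lib              # Python 2
    from urllib2 import HTTPError
    from urllib import urlencode
except ImportError:
    import urllib.request as request_lib       # Python 3 (assumption)
    from urllib.error import HTTPError
    from urllib.parse import urlencode


def build_post(url, form, user_agent):
    """Build (but do not send) a POST request with a browser-like User-Agent."""
    body = urlencode(form).encode("ascii")
    return request_lib.Request(url, data=body, headers={"User-Agent": user_agent})


def fetch(request):
    """Send the request and return (status_code, body), even on HTTP errors."""
    try:
        response = request_lib.urlopen(request)
        return response.getcode(), response.read()
    except HTTPError as error:
        # Rejected requests (for example 418) end up here.
        return error.code, error.read()


request = build_post(
    "https://example.com/j/chart/top_list",  # placeholder URL (hypothetical)
    [("limit", "20"), ("start", "0")],
    "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)",
)

# Inspect the request locally before sending it.
print(request.get_method())
print(request.get_header("User-agent"))  # Request stores header names capitalized
```

Calling fetch(request) against a real endpoint would return the status code alongside the body, which makes it easier to see whether the added headers were enough.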