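Preface
The text and images in this article come from the internet and are for learning and discussion only. They are not for any commercial use. If there is a problem, please contact us promptly so we can deal with it.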
Development tools
- Python 3.6.5
- PyCharm
import requests
import parsel
Both modules can be installed with pip (pip install requests parsel).
Analyzing the target page
Request the list page and extract the gallery links
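# url and headers must be defined before this point; the original does not show them
response = requests.get(url=url, headers=headers)
selector = parsel.Selector(response.text)
# Collect the href of every gallery link on the list page
lis = selector.css('#pins li a::attr(href)').getall()
for li in lis:
    # The last path segment of the link is the gallery id
    page_id = li.split('/')[-1]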
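One caveat with `li.split('/')[-1]`: if an href ends with a trailing slash, the last piece is an empty string. Below is a minimal sketch of a more forgiving helper, assuming gallery links look like `https://www.mzitu.com/<id>`. The name `extract_page_id` and the sample hrefs are hypothetical and not part of the original script.

```python
from urllib.parse import urlparse


def extract_page_id(href):
    """Return the last non-empty path segment of a link, or None if there is none."""
    # urlparse separates the path from any query string or fragment
    path = urlparse(href).path
    # Drop empty segments produced by leading, trailing, or doubled slashes
    segments = [part for part in path.split('/') if part]
    return segments[-1] if segments else None


# Made-up sample hrefs, for illustration only
print(extract_page_id('https://www.mzitu.com/12345'))
print(extract_page_id('https://www.mzitu.com/12345/'))
```

Returning None for a link without a path lets the caller skip it instead of building a broken URL later.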
Build the URL of every page in a gallery
def next_url(url, page_id):
    response_2 = requests.get(url=url, headers=headers)
    selector = parsel.Selector(response_2.text)
    # Text of the pager link that holds the last page number
    last_num = selector.css('.pagenavi a:nth-child(7) span::text').get()
    for i in range(1, int(last_num) + 1):
        new_url = 'https://www.mzitu.com/{}/{}'.format(page_id, i)
        download(new_url)  # assumption: each page is handed to download(), defined below
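parsel's `.get()` returns None when the selector matches nothing, in which case `int(last_num)` raises a TypeError. The following is a rough sketch of building the page URLs defensively. The helper name `build_page_urls` and the single-page fallback are assumptions made for illustration, not something the original script does.

```python
def build_page_urls(page_id, last_num):
    """Build the URL of every page in a gallery from the scraped last-page number."""
    try:
        total = int(last_num)
    except (TypeError, ValueError):
        # Assumption: a missing or non-numeric pager text means a single-page gallery
        total = 1
    base = 'https://www.mzitu.com/{}/{}'
    return [base.format(page_id, i) for i in range(1, total + 1)]


# '12345' is a made-up gallery id
for page_url in build_page_urls('12345', '3'):
    print(page_url)
```

Keeping the URL construction in a pure function like this makes it possible to check the pagination logic without sending any requests.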
Save the data
def download(url):
    response = requests.get(url=url, headers=headers)
    selector = parsel.Selector(response.text)
    title = selector.css('body > div.main > div.content > h2::text').get()  # image title
    img_url = selector.css('.main-image p img::attr(src)').get()  # image URL
    # The target folder must already exist, otherwise open() fails
    path = 'D:\\python\\demo\\妹子图\\img\\' + title + '.jpg'
    download_response = requests.get(url=img_url, headers=headers)
    with open(path, mode='wb') as f:
        f.write(download_response.content)
    print(title, img_url)
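Two things can break the save step: a title containing characters that Windows does not allow in file names (such as `?` or `:`), and a target folder that does not exist yet. Here is a minimal sketch of one way to handle both. The helpers `safe_filename` and `save_bytes` are hypothetical names, and the demo writes placeholder bytes to a temporary folder rather than downloading anything.

```python
import re
import tempfile
from pathlib import Path


def safe_filename(title, default='untitled'):
    """Replace characters that Windows forbids in file names with underscores."""
    cleaned = re.sub(r'[\\/:*?"<>|]', '_', title or '').strip()
    # Fall back to a default name when the title is missing or blank
    return cleaned or default


def save_bytes(folder, title, data):
    """Write data to <folder>/<safe title>.jpg, creating the folder if needed."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / (safe_filename(title) + '.jpg')
    path.write_bytes(data)
    return path


# Demo with placeholder bytes in a temporary folder instead of a real download
with tempfile.TemporaryDirectory() as tmp:
    saved = save_bytes(Path(tmp) / 'img', 'a/b: c?', b'not a real image')
    print(saved.name)
```

In the `download` function, `download_response.content` would take the place of the placeholder bytes.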
Go ahead and learn from it! And remember to keep yourself well nourished!