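How It Started
The day before yesterday, my girlfriend, who had come all the way from Shanghai to see me, flew home again. Before leaving, she warned me not to get any ideas about other girls, or she would hit me every time she saw me. To build up my resistance to pretty girls, I decided to put myself through some temptation-resistance training.
She meant business.
The Solution
As the saying goes, the best defense is a good offense. To resist temptation you have to face it, and dive into the vast ocean of pretty girls.
So I ran a search -> 美女 (beautiful women)
Zeus says: once Pandora's box is open, do you really think you can close it again? The girls' radiance washed over me, and I was basking in the spring breeze, a fish in water, utterly entranced.
By then I was in too deep. Seeing the girls sitting all alone on a web page, how could I bear it? I couldn't eat, couldn't sleep, couldn't sit still.
Suddenly a voice in my head said: what is a hard drive for? Why not give the girls a home? Can you really stand to see them so lonely?
No sooner said than done. As a programmer with a big heart, how could I stand by? What are my professional skills for, if not this?
Operation Bring the Girls Home -> Baidu Images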
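Step 1: Analyze the Baidu Images URL
Follow these steps:
- Open Baidu Images
- Enter the keyword "老婆们" (wives), ahem, no -> try again and enter "美女"
- Press F12 to open the developer tools
- Scroll the page to capture a few request URLs
- Analyze the URL parameters
URL: image.baidu.com/search/acjs…
Request parameters:
There are many parameters, but only a few matter:
- word: the search keyword
- pn: the pagination offset (index of the first result on the page)
- rn: the number of results per page
- Others: the defaults are fine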
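To make the three parameters concrete, here is a minimal sketch that builds a request URL from a keyword, a page index, and a page size. The class name BaiduUrlSketch and the method buildUrl are made up for illustration, and the fixed tn and ipn values are simply taken from the full request used later in this post. Whether the service accepts such a trimmed-down parameter set is an assumption, not something verified here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class BaiduUrlSketch {
    // Endpoint and fixed parameters as they appear in the full request later in the post.
    private static final String BASE =
            "https://image.baidu.com/search/acjson?tn=resultjson_com&ipn=rj";

    /** Builds the URL for one page; page is zero-based, so page 0 starts at offset 0. */
    static String buildUrl(String keyword, int page, int pageSize) {
        try {
            // Percent-encode the keyword so non-ASCII text is safe inside a query string.
            String encoded = URLEncoder.encode(keyword, "UTF-8");
            return BASE + "&word=" + encoded
                    + "&pn=" + (page * pageSize)   // offset of the first result
                    + "&rn=" + pageSize;           // results per page
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("美女", 2, 30));
    }
}
```

With a page size of 30, page 2 maps to pn=60, which is the same i*pageSize arithmetic the downloader uses.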
Step 2: Analyze the returned JSON data
{
"queryEnc":"%C3%C0%C5%AE",
"queryExt":"美女",
"listNum":1761,
"displayNum":1089421,
"gsm":"5a",
"bdFmtDispNum":"约1,080,000",
"bdSearchTime":"",
"isNeedAsyncRequest":0,
"bdIsClustered":"1",
"data":[
{
"thumbURL":"https://img1.baidu.com/it/u=2560758776,1992136467&fm=26&fmt=auto",
"type":"jpg",
"fromPageTitle":"青春活力运动服<strong>美女</strong>图片壁纸(5/9)",
"fromPageTitleEnc":"青春活力运动服美女图片壁纸(5/9)",
"simid":"2560758776,1992136467"
}
...
]
}
The image information is mainly stored in data:
- thumbURL: the image URL
- type: the image type
- simid: the image ID (used to avoid downloading duplicates)
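Since simid is what guards against duplicate downloads, a small sketch of that check may help. It uses only the JDK; the Entry holder and the dedupe method are hypothetical names invented for this example, and the sample URLs are placeholders rather than real results.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class SimidDedupSketch {
    /** Hypothetical holder for the three fields read from one entry of data. */
    static final class Entry {
        final String thumbUrl;
        final String type;
        final String simid;

        Entry(String thumbUrl, String type, String simid) {
            this.thumbUrl = thumbUrl;
            this.type = type;
            this.simid = simid;
        }

        /** File name in the form simid.type. */
        String fileName() {
            return simid + "." + type;
        }
    }

    /** Keeps the first entry per simid and skips entries that have no thumbURL. */
    static List<Entry> dedupe(List<Entry> entries) {
        Set<String> seen = new HashSet<>();
        List<Entry> unique = new ArrayList<>();
        for (Entry e : entries) {
            if (e.thumbUrl == null || e.thumbUrl.isEmpty()) {
                continue;
            }
            if (seen.add(e.simid)) { // add() returns false when the id was already seen
                unique.add(e);
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        List<Entry> entries = Arrays.asList(
                new Entry("https://example.com/a", "jpg", "1,2"),
                new Entry("https://example.com/a-copy", "jpg", "1,2"),
                new Entry("", "png", "3,4"));
        for (Entry e : dedupe(entries)) {
            System.out.println(e.fileName());
        }
    }
}
```

The code in this post gets a similar effect on disk by naming each file after its simid and skipping files that already exist.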
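With this figured out, the girls are already halfway through the front door.
Step 3: Write the code
The plan:
- Fetch the JSON returned by the Baidu Images search URL
- Parse the JSON and collect every image URL into a list
- Loop over the list and move the girls in one by one
Sleeves rolled up, let's go!!! I can almost see the girls waving at me already.
Create a new Maven project and add the dependencies: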
<dependencies>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
<version>3.12.0</version>
</dependency>
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>fastjson</artifactId>
<version>1.2.62</version>
</dependency>
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpclient</artifactId>
<version>4.4.1</version>
</dependency>
<dependency>
<groupId>commons-logging</groupId>
<artifactId>commons-logging</artifactId>
<version>1.2</version>
</dependency>
</dependencies>
DownloadBeauty.java
import com.alibaba.fastjson.JSON;
import com.alibaba.fastjson.JSONArray;
import com.alibaba.fastjson.JSONObject;
import com.shadow.study.downImg.model.ImageDownModel;
import org.apache.commons.lang3.StringUtils;
import java.util.ArrayList;
import java.util.List;
/**
* @author trouble-night
*/
public class DownloadBeauty {
/**
* @param startPage page to start downloading from
* @param endPage page to stop at (exclusive)
* @param pageSize number of images per page
* @param keyWord search keyword
* @param homePath the girls' new home (folder to save images in)
*/
public static void downloadImages(int startPage, int endPage, int pageSize, String keyWord, String homePath) {
List<ImageDownModel> imageDownModelList = new ArrayList<>();
for (int i=startPage;i<endPage;i++) {
// Build the image search URL
StringBuilder builder = new StringBuilder("https://image.baidu.com/search/acjson?");
builder.append("&tn=").append("resultjson_com");
builder.append("&logid=").append("11077014820233865934");
builder.append("&ipn=").append("rj");
builder.append("&ct=").append("201326592");
builder.append("&is=");
builder.append("&fp=").append("result");
builder.append("&queryWord=").append(keyWord);
builder.append("&cl=").append("2");
builder.append("&lm=").append("-1");
builder.append("&ie=").append("utf-8");
builder.append("&oe=").append("utf-8");
builder.append("&adpicid=");
builder.append("&st=").append("-1");
builder.append("&z=");
builder.append("&ic=").append("0");
builder.append("&hd=");
builder.append("&latest=");
builder.append("&copyright=");
builder.append("&word=").append(keyWord);
builder.append("&se=");
builder.append("&tab=");
builder.append("&width=");
builder.append("&height=");
builder.append("&face=").append("0");
builder.append("&istype=").append("2");
builder.append("&qc=");
builder.append("&nc=").append("1");
builder.append("&fr=");
builder.append("&expermode=");
builder.append("&nojc=");
builder.append("&pn=").append(i*pageSize);
builder.append("&rn=").append(pageSize);
builder.append("&gsm=").append("1e");
builder.append("&1634642613614=");
// Fetch the JSON, parse out the image URLs, and collect them into the list
String res = DownloadUtils.getUriJson(builder.toString());
JSONArray jsonArray = JSONObject.parseObject(res).getJSONArray("data");
jsonArray.forEach(o -> {
JSONObject jsonObject = JSON.parseObject(JSON.toJSONString(o));
String imageUrl = jsonObject.getString("thumbURL");
String imageType = jsonObject.getString("type");
String imageName = jsonObject.getString("simid");
if (StringUtils.isNotEmpty(imageUrl)) {
imageDownModelList.add(new ImageDownModel(imageUrl, imageName+"."+imageType));
}
});
}
// Download and save the images one by one
DownloadUtils.saveImage(imageDownModelList, keyWord, homePath);
}
}
DownloadUtils.java
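import com.shadow.study.downImg.model.ImageDownModel;
import org.apache.commons.lang3.concurrent.BasicThreadFactory;
import org.apache.http.HttpEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpUriRequest;
import org.apache.http.conn.ssl.SSLConnectionSocketFactory;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.ssl.SSLContextBuilder;
import org.apache.http.ssl.TrustStrategy;
import org.apache.http.util.EntityUtils;
import javax.net.ssl.SSLContext;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.security.GeneralSecurityException;
import java.util.List;
import java.util.Objects;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;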
/**
* @author trouble-night
*/
public class DownloadUtils {
public static String getUriJson(String url) {
HttpGet httpGet = new HttpGet(url);
httpGet.setHeader("Accept","text/plain, */*; q=0.01");
// "br" is left out because HttpClient 4.x cannot decode Brotli-compressed responses
httpGet.setHeader("Accept-Encoding", "gzip, deflate");
httpGet.setHeader("Accept-Language", "zh-CN,zh;q=0.9");
httpGet.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36");
return sendRequest(httpGet, "UTF-8");
}
public static String sendRequest(HttpUriRequest httpRequest, String charset) {
String urlListJson = "";
HttpEntity entity = null;
try(CloseableHttpClient httpclient = "https".equals(httpRequest.getURI().getScheme())? createSslClient(): HttpClients.createDefault()) {
try (CloseableHttpResponse response = httpclient.execute(httpRequest)) {
entity = response.getEntity();
urlListJson = EntityUtils.toString(entity, charset);
} finally {
EntityUtils.consume(entity);
}
} catch (IOException e) {
e.printStackTrace();
}
return urlListJson;
}
private static CloseableHttpClient createSslClient() {
try {
SSLContext sslContext = new SSLContextBuilder().loadTrustMaterial((TrustStrategy) (chain, authType) -> true).build();
SSLConnectionSocketFactory sslFactory = new SSLConnectionSocketFactory(sslContext, (hostname, session) -> true);
return HttpClients.custom().setSSLSocketFactory(sslFactory).build();
} catch (GeneralSecurityException ex) {
throw new RuntimeException(ex);
}
}
public static void saveImage(List<ImageDownModel> imageList, String keyWord, String savePath) {
final AtomicInteger successCount = new AtomicInteger();
final AtomicInteger failCount = new AtomicInteger();
long startDate = System.currentTimeMillis();
ExecutorService executorService = new ThreadPoolExecutor(5, 10,
0L, TimeUnit.MILLISECONDS,
new LinkedBlockingQueue<>(1024),
new BasicThreadFactory.Builder().namingPattern("image-down-pool-%d").daemon(true).build(),
new ThreadPoolExecutor.AbortPolicy());
for (ImageDownModel image : imageList) {
executorService.execute(() -> {
try {
boolean b = downloadImg(image, keyWord, savePath);
if (b) {
System.out.printf("Downloaded image #%s, URL: %s%n", successCount.incrementAndGet(),image.getImageUrl());
} else {
failCount.incrementAndGet();
}
} catch (Exception e) {
failCount.incrementAndGet();
}
});
}
executorService.shutdown();
try {
if (!executorService.awaitTermination(10, TimeUnit.MINUTES)) {
System.out.println("Timed out, cancelling the remaining downloads!!!");
executorService.shutdownNow();
}
File dir = new File(savePath + "/" + keyWord + "/");
int len = Objects.requireNonNull(dir.list()).length;
System.out.printf("Download finished ->\nTasks: %s; succeeded: %s; failed: %s; \nFiles now in folder: %s; \nElapsed: %s seconds",imageList.size(), successCount, failCount, len,(System.currentTimeMillis() - startDate) / 1000.0);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
private static boolean downloadImg(ImageDownModel image, String keyWord, String savePath) throws Exception {
String path = savePath + "/" + keyWord + "/";
File dir = new File(path);
// mkdirs() can lose a race with another pool thread, so re-check that the folder exists
if (!dir.exists() && !dir.mkdirs() && !dir.exists()) {
System.out.println("Failed to create the image folder; download aborted");
return false;
}
String filePath = path + image.getImageName();
File img = new File(filePath);
if(img.exists()){
System.out.printf("Image %s already exists in the folder%n",image.getImageName());
return true;
}
URLConnection con = new URL(image.getImageUrl()).openConnection();
con.setConnectTimeout(5000);
con.setReadTimeout(5000);
// try-with-resources closes both streams even if the copy fails midway
try (InputStream inputStream = con.getInputStream();
FileOutputStream os = new FileOutputStream(img)) {
byte[] bs = new byte[1024];
int len;
while ((len = inputStream.read(bs)) != -1) {
os.write(bs, 0, len);
}
}
return true;
}
}
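downloadImg writes the response stream directly to the final file name and later treats any existing file as already downloaded. One possible refinement, sketched here with only the JDK, is to write to a temporary file first and move it into place once the copy has finished, so an interrupted download does not leave a half-written image behind. The class name SaveStreamSketch, the ".part" suffix, and the in-memory bytes standing in for an HTTP response are all assumptions for illustration, not part of the code above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class SaveStreamSketch {
    /** Writes the stream to a temp file next to target, then moves it into place. */
    static long save(InputStream in, Path target) {
        Path dir = target.toAbsolutePath().getParent();
        Path tmp = null;
        try (InputStream src = in) {
            Files.createDirectories(dir);
            tmp = Files.createTempFile(dir, "img-", ".part");
            // createTempFile already created the file, hence REPLACE_EXISTING.
            long bytes = Files.copy(src, tmp, StandardCopyOption.REPLACE_EXISTING);
            Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
            return bytes;
        } catch (IOException e) {
            if (tmp != null) {
                tmp.toFile().delete(); // best-effort cleanup of the partial file
            }
            throw new UncheckedIOException(e);
        }
    }

    /** Round trip with in-memory bytes standing in for a real response body. */
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("girls-home");
            Path target = dir.resolve("demo.jpg");
            byte[] fake = "fake image bytes".getBytes(StandardCharsets.UTF_8);
            save(new ByteArrayInputStream(fake), target);
            return new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In the downloader, the InputStream would come from con.getInputStream() and the target path from the save folder plus the image name.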
ImageDownModel.java
/**
* @author changzheng.wang
*/
public class ImageDownModel {
private String imageUrl;
private String imageName;
public String getImageUrl() {
return imageUrl;
}
public void setImageUrl(String imageUrl) {
this.imageUrl = imageUrl;
}
public String getImageName() {
return imageName;
}
public void setImageName(String imageName) {
this.imageName = imageName;
}
public ImageDownModel() {
}
public ImageDownModel(String imageUrl, String imageName) {
this.imageUrl = imageUrl;
this.imageName = imageName;
}
@Override
public String toString() {
return "Image{" +
"imageUrl='" + imageUrl + '\'' +
", imageName='" + imageName + '\'' +
'}';
}
}
Step 4: Bring the girls home
public static void main(String[] args) {
String home = "C:\\Users\\changzheng.wang.KINGSTAR\\Desktop\\111";
DownloadBeauty.downloadImages(1,2,50, "美女", home);
}
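Afterthoughts
The girls on Baidu have a home now, but what about Sogou Images, 360 Images...? This sent me deep into thought about how to spread the love evenly.
To borrow a selfless line from Du Fu: "If only I had a mansion with thousands of rooms, to shelter all the poor scholars of the world and bring them joy."
Refactoring
Refactor using the strategy pattern.
Enough talk, the girls are getting impatient.
Strategy interface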
import java.util.HashMap;
import java.util.Map;
/**
* @author trouble-night
*/
public interface DownLoadImageStrategy {
Map<String,DownLoadImageStrategy> STRATEGY = new HashMap<String,DownLoadImageStrategy>(){{
put("BaiDu",new BaiDuStrategy());
put("SouGou", new SouGouStrategy());
put("360", new Three60Strategy());
}};
/**
* Download images.
* @param startPage page to start from
* @param endPage page to end at (inclusive)
* @param pageSize number of images per page
* @param keyWord search keyword
* @param localSavePath folder to save the images in
*/
void downLoadImages(int startPage, int endPage, int pageSize, String keyWord, String localSavePath);
}
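One thing to watch with a map-based registry like STRATEGY is that get() returns null for a name that was never registered, so a typo in the key ends in a NullPointerException. The sketch below shows one way to guard the lookup with Optional. The Strategy interface here is a simplified stand-in for DownLoadImageStrategy that only returns a description string, and all names in it are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class StrategyLookupSketch {
    /** Simplified stand-in for DownLoadImageStrategy; it only describes the job. */
    interface Strategy {
        String describe(int startPage, int endPage, int pageSize, String keyWord);
    }

    private static final Map<String, Strategy> REGISTRY = new HashMap<>();

    static {
        REGISTRY.put("BaiDu", (start, end, size, word) ->
                "BaiDu:" + word + ":" + start + "-" + end + "x" + size);
        REGISTRY.put("SouGou", (start, end, size, word) ->
                "SouGou:" + word + ":" + start + "-" + end + "x" + size);
    }

    /** Returns the strategy registered under name, or an empty Optional. */
    static Optional<Strategy> find(String name) {
        return Optional.ofNullable(REGISTRY.get(name));
    }

    /** Runs the named strategy, or reports that nothing is registered under that name. */
    static String run(String name) {
        return find(name)
                .map(s -> s.describe(1, 5, 50, "美女"))
                .orElse("no strategy registered for " + name);
    }

    public static void main(String[] args) {
        System.out.println(run("BaiDu"));
        System.out.println(run("Bing"));
    }
}
```

The same idea applies to the real interface: look the strategy up once, handle the missing case, and only then call downLoadImages.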
Baidu download strategy
BaiDuStrategy.java
import com.alibaba.fastjson.JSON;
import com.alibaba.fastjson.JSONArray;
import com.alibaba.fastjson.JSONObject;
import org.apache.commons.lang3.StringUtils;
import java.util.ArrayList;
import java.util.List;
/**
* @author trouble-night
*/
public class BaiDuStrategy implements DownLoadImageStrategy {
@Override
public void downLoadImages(int startPage, int endPage, int pageSize, String keyWord, String homePath) {
homePath = homePath + "\\百度";
List<ImageDownModel> imageDownModelList = new ArrayList<>();
for (int i=startPage;i<=endPage;i++) {
StringBuilder builder = new StringBuilder("https://image.baidu.com/search/acjson?");
builder.append("&tn=").append("resultjson_com");
builder.append("&logid=").append("11077014820233865934");
builder.append("&ipn=").append("rj");
builder.append("&ct=").append("201326592");
builder.append("&is=");
builder.append("&fp=").append("result");
builder.append("&queryWord=").append(keyWord);
builder.append("&cl=").append("2");
builder.append("&lm=").append("-1");
builder.append("&ie=").append("utf-8");
builder.append("&oe=").append("utf-8");
builder.append("&adpicid=");
builder.append("&st=").append("-1");
builder.append("&z=");
builder.append("&ic=").append("0");
builder.append("&hd=");
builder.append("&latest=");
builder.append("&copyright=");
builder.append("&word=").append(keyWord);
builder.append("&se=");
builder.append("&tab=");
builder.append("&width=");
builder.append("&height=");
builder.append("&face=").append("0");
builder.append("&istype=").append("2");
builder.append("&qc=");
builder.append("&nc=").append("1");
builder.append("&fr=");
builder.append("&expermode=");
builder.append("&nojc=");
builder.append("&pn=").append(i*pageSize);
builder.append("&rn=").append(pageSize);
builder.append("&gsm=").append("1e");
builder.append("&1634642613614=");
String res = DownloadUtils.getUriJson(builder.toString());
JSONArray jsonArray = JSONObject.parseObject(res).getJSONArray("data");
jsonArray.forEach(o -> {
JSONObject jsonObject = JSON.parseObject(JSON.toJSONString(o));
String imageUrl = jsonObject.getString("thumbURL");
String imageType = jsonObject.getString("type");
String imageName = jsonObject.getString("simid");
if (StringUtils.isNotEmpty(imageUrl)) {
imageDownModelList.add(new ImageDownModel(imageUrl, imageName+"."+imageType));
}
});
}
DownloadUtils.saveImage(imageDownModelList, keyWord, homePath);
}
}
Sogou download strategy
SouGouStrategy.java
import com.alibaba.fastjson.JSON;
import com.alibaba.fastjson.JSONArray;
import com.alibaba.fastjson.JSONObject;
import org.apache.commons.lang3.StringUtils;
import java.util.ArrayList;
import java.util.List;
/**
* @author trouble-night
*/
public class SouGouStrategy implements DownLoadImageStrategy {
@Override
public void downLoadImages(int startPage, int endPage, int pageSize, String keyWord, String homePath) {
homePath = homePath + "\\搜狗";
List<ImageDownModel> imageDownModelList = new ArrayList<>();
for (int i=startPage;i<=endPage;i++) {
StringBuilder builder = new StringBuilder("https://pic.sogou.com/napi/pc/searchList?");
builder.append("&mode=").append("1")
.append("&start=").append(i*pageSize)
.append("&xml_len=").append(pageSize)
.append("&query=").append(keyWord);
String res = DownloadUtils.getUriJson(builder.toString());
JSONArray jsonArray = JSONObject.parseObject(res).getJSONObject("data").getJSONArray("items");
jsonArray.forEach(o -> {
JSONObject jsonObject = JSON.parseObject(JSON.toJSONString(o));
String picUrl = jsonObject.getString("picUrl");
String imageName = jsonObject.getString("docId");
String imageType = jsonObject.getString("type");
if (StringUtils.isNotEmpty(picUrl)) {
imageDownModelList.add(new ImageDownModel(picUrl,imageName+imageType));
}
});
}
DownloadUtils.saveImage(imageDownModelList, keyWord, homePath);
}
}
360 download strategy
Three60Strategy.java
import com.alibaba.fastjson.JSON;
import com.alibaba.fastjson.JSONArray;
import com.alibaba.fastjson.JSONObject;
import org.apache.commons.lang3.StringUtils;
import java.util.ArrayList;
import java.util.List;
/**
* @author trouble-night
*/
public class Three60Strategy implements DownLoadImageStrategy {
@Override
public void downLoadImages(int startPage, int endPage, int pageSize, String keyWord, String homePath) {
homePath = homePath + "\\360";
List<ImageDownModel> imageDownModelList = new ArrayList<>();
for (int i=startPage;i<=endPage;i++) {
StringBuilder builder = new StringBuilder("https://image.so.com/j?");
builder.append("&").append("callback").append("=").append("jQuery183048982013190765383_1634709515409")
.append("&").append("q").append("=").append(keyWord)
.append("&").append("pd").append("=").append("1")
.append("&").append("pn").append("=").append(pageSize)
.append("&").append("correct").append("=").append(keyWord)
.append("&").append("adstar").append("=").append("0")
.append("&").append("tab").append("=").append("all")
.append("&").append("sid").append("=").append("70b0299d677db5f8ef33a361d0750f6a")
.append("&").append("ras").append("=").append("6")
.append("&").append("cn").append("=").append("0")
.append("&").append("gn").append("=").append("0")
.append("&").append("kn").append("=").append("50")
.append("&").append("crn").append("=").append("0")
.append("&").append("bxn").append("=").append("20")
.append("&").append("cuben").append("=").append("0")
.append("&").append("pornn").append("=").append("0")
.append("&").append("manun").append("=").append("43")
.append("&").append("src").append("=").append("srp")
.append("&").append("sn").append("=").append(i*pageSize)
.append("&").append("ps").append("=").append((i+1)*pageSize)
.append("&").append("pc").append("=").append(i)
.append("&").append("_").append("=").append(System.currentTimeMillis());
System.out.println(builder.toString());
String res = DownloadUtils.getUriJson(builder.toString());
res = res.substring(res.indexOf("(")+1, res.lastIndexOf(")"));
JSONArray jsonArray = JSONObject.parseObject(res).getJSONArray("list");
jsonArray.forEach(o -> {
JSONObject jsonObject = JSON.parseObject(JSON.toJSONString(o));
String imageUrl = jsonObject.getString("img");
String imageType = jsonObject.getString("imgtype").toLowerCase();
String imageName = jsonObject.getString("id");
if (StringUtils.isNotEmpty(imageUrl)) {
imageDownModelList.add(new ImageDownModel(imageUrl, imageName+"."+imageType));
}
});
}
DownloadUtils.saveImage(imageDownModelList, keyWord, homePath);
}
}
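The 360 request passes a callback parameter, so the response arrives wrapped as a JSONP call, and the strategy strips the wrapper with indexOf and lastIndexOf before parsing. Here is a slightly more defensive sketch of that unwrapping step, using only the JDK. The class name JsonpSketch and the sample payloads are made up for illustration.

```java
final class JsonpSketch {
    /** Returns the JSON payload inside callback(...), or the text itself if it is already JSON. */
    static String unwrap(String body) {
        String text = body.trim();
        if (text.startsWith("{") || text.startsWith("[")) {
            return text; // plain JSON, nothing to strip
        }
        int open = text.indexOf('(');
        int close = text.lastIndexOf(')');
        if (open < 0 || close < open) {
            throw new IllegalArgumentException("not a JSONP response");
        }
        return text.substring(open + 1, close).trim();
    }

    public static void main(String[] args) {
        System.out.println(unwrap("jQuery183_1634709515409({\"list\":[]});"));
    }
}
```

Failing loudly on an unexpected body makes it easier to notice when the service returns an error page instead of JSONP.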
Test
public static void main(String[] args) {
String localDir = "E:\\小姐姐姐的小房子";
DownLoadImageStrategy.STRATEGY.get("SouGou").downLoadImages(1, 5, 50, "美女", localDir);
DownLoadImageStrategy.STRATEGY.get("BaiDu").downLoadImages(1, 5, 50, "美女", localDir);
DownLoadImageStrategy.STRATEGY.get("360").downLoadImages(1, 5, 50, "美女", localDir);
}
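Summary
In the end, the programmer lad and the girls lived shamelessly ever after.
Then I woke up, sighing at what might have been.
Alas, last night's durian has already been crushed from all the kneeling, and the house is running out of tissues.
I have selflessly handed over the code, so don't forget to like, favorite, and share, my friends. Tencent Images, Bing Images... the girls there are still waiting for you to rescue them.
That's all for now. I'm off to rub my knees and buy another durian.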