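Problem: the product manager has asked you to build a "price comparison tool" that fetches a product's lowest price from several e-commerce sites and returns it to the user.
Approach: fetch the product's most recent lowest selling price from each site, and issue those requests asynchronously. (Each request retrieves one site's price for the product.)
1. Define the shop price class.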
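import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;

public class ShopPrice {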
private String shopName;
public ShopPrice(String shopName) {
this.shopName = shopName;
}
public String getShopName() {
return shopName;
}
public void setShopName(String shopName) {
this.shopName = shopName;
}
public double getPrice(String product) {
return calculatePrice(product);
}
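/**
* Calculates the price. The call itself blocks; callers run it asynchronously.
*
* @param product product name
* @return product price
*/
private double calculatePrice(String product) {
try {
// Sleep for 1 second to simulate the latency of an HTTP request
TimeUnit.SECONDS.sleep(1);
} catch (InterruptedException e) {
// Restore the interrupt flag so that callers can still observe it
Thread.currentThread().interrupt();
}
return random.nextDouble() * product.charAt(0) + product.charAt(1);
}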
private static final Random random = new Random();
// Simulated table of e-commerce sites
static List<ShopPrice> shopList = Arrays.asList(
new ShopPrice("Shop1"),
new ShopPrice("Shop2"),
new ShopPrice("Shop3"),
new ShopPrice("Shop4"),
new ShopPrice("Shop5"),
new ShopPrice("Shop6"),
new ShopPrice("Shop7"),
new ShopPrice("Shop8"),
new ShopPrice("Shop9"),
new ShopPrice("Shop10"),
new ShopPrice("Shop11"),
new ShopPrice("Shop12"),
new ShopPrice("Shop13"),
new ShopPrice("Shop14"),
new ShopPrice("Shop15"),
new ShopPrice("Shop16"),
new ShopPrice("Shop17"),
new ShopPrice("Shop18"),
new ShopPrice("Shop19"),
new ShopPrice("Shop20"));
2. Using a parallel stream:
// Fetch the price of the same product from each e-commerce site
public static List<String> findPriceByProduct(String product) {
return shopList.stream()
.parallel()
.map(shop ->
String.format("%s price is %.2f", shop.getShopName(), shop.getPrice(product)))
.collect(Collectors.toList());
}
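3. Using CompletableFuture
/**
* Uses CompletableFuture with the default executor.
*
* @param product product name
* @return list of price strings produced through CompletableFuture
*/
public static List<String> findPriceByProductAsync(String product) {
// supplyAsync is a CompletableFuture factory method: it computes the value asynchronously and stores it in the future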
List<CompletableFuture<String>> priceFuture = shopList.stream()
.map(shop -> CompletableFuture.supplyAsync(() ->
String.format("%s price is %.2f",
shop.getShopName(), shop.getPrice(product))))
.collect(Collectors.toList());
return priceFuture.stream()
.map(CompletableFuture::join)
.collect(Collectors.toList());
}
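The method above uses two separate stream pipelines: one that creates every future, and one that joins them. The sketch below illustrates why this matters. Streams are lazy and process one element at a time, so creating and joining a future in the same pipeline makes each request wait for the previous one. The class name, the `slowCall` helper, and the 100 ms delay are hypothetical stand-ins, not part of the original code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;

public class TwoPhaseJoinDemo {
    // Hypothetical stand-in for a slow remote call (100 ms instead of 1 s).
    static String slowCall(String shop) {
        try {
            TimeUnit.MILLISECONDS.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return shop + " done";
    }

    // One pipeline: each future is joined before the next one is created,
    // so the calls effectively run one after another.
    static List<String> singlePipeline(List<String> shops, ExecutorService pool) {
        return shops.stream()
                .map(s -> CompletableFuture.supplyAsync(() -> slowCall(s), pool))
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }

    // Two pipelines: start every future first, then wait for all of them.
    static List<String> twoPipelines(List<String> shops, ExecutorService pool) {
        List<CompletableFuture<String>> futures = shops.stream()
                .map(s -> CompletableFuture.supplyAsync(() -> slowCall(s), pool))
                .collect(Collectors.toList());
        return futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }

    // Runs the task and returns the elapsed wall-clock time in milliseconds.
    static long millis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        List<String> shops = Arrays.asList("A", "B", "C", "D");
        ExecutorService pool = Executors.newFixedThreadPool(4);
        System.out.println("single pipeline: " + millis(() -> singlePipeline(shops, pool)) + " ms");
        System.out.println("two pipelines:   " + millis(() -> twoPipelines(shops, pool)) + " ms");
        pool.shutdown();
    }
}
```

With four tasks and four threads, the single pipeline should take roughly four delays and the two-pipeline version roughly one; exact timings depend on the machine.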
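4. Using CompletableFuture with a thread pool
/**
* Uses CompletableFuture with a custom executor.
*
* @param product product name
* @return list of price strings produced through CompletableFuture
*/
public static List<String> findPriceByProductAsyncThread(String product) {
// supplyAsync is a CompletableFuture factory method: it computes the value asynchronously and stores it in the future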
List<CompletableFuture<String>> priceFuture = shopList.stream()
.map(shop -> CompletableFuture.supplyAsync(() ->
String.format("%s price is %.2f", shop.getShopName(), shop.getPrice(product)),
executor))
.collect(Collectors.toList());
return priceFuture.stream()
.map(CompletableFuture::join)
.collect(Collectors.toList());
}
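/**
* Creates a thread pool sized with Math.min: one thread per shop, capped at 100.
* Brian Goetz, in "Java Concurrency in Practice", suggests
* sizing a thread pool with the following formula:
* N(threads)=N(cpu)*U(cpu)*(1 + W/C)
* N(cpu) is the number of processor cores, available via Runtime.getRuntime().availableProcessors()
* U(cpu) is the target CPU utilization, a value between 0 and 1
* W/C is the ratio of wait time to compute time
*/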
static final int corePoolSize = Math.min(shopList.size(), 100);
private static final Executor executor = new ThreadPoolExecutor(
corePoolSize, corePoolSize,
0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>(),
r -> {
Thread t = new Thread(r);
// Use daemon threads so that the pool does not prevent the program from exiting
t.setDaemon(true);
return t;
});
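To make the sizing formula concrete, here is a minimal sketch that evaluates N(threads)=N(cpu)*U(cpu)*(1 + W/C) and then applies an upper bound, much as the Math.min call above does. The class name, the method name, and the W/C value of 99 (about 990 ms waiting for every 10 ms of computation) are assumptions chosen for illustration, not measurements from the original test.

```java
public class PoolSizeEstimate {
    /**
     * Evaluates N(threads) = N(cpu) * U(cpu) * (1 + W/C), rounds up,
     * and clamps the result to the range [1, maxThreads].
     */
    static int estimatePoolSize(int cpuCores, double targetUtilization,
                                double waitToComputeRatio, int maxThreads) {
        double raw = cpuCores * targetUtilization * (1 + waitToComputeRatio);
        int rounded = (int) Math.ceil(raw);
        return Math.max(1, Math.min(rounded, maxThreads));
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // Assumed ratio: a request that waits about 99 times longer than it computes.
        double waitToComputeRatio = 99.0;
        // There is no point in having more threads than shops to query (20 here).
        int size = estimatePoolSize(cores, 1.0, waitToComputeRatio, 20);
        System.out.println("cores=" + cores + ", pool size=" + size);
    }
}
```

For I/O-heavy work like this, the formula gives a number far above the core count, so the cap (the number of shops, or a hard limit such as 100) usually decides the final size.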
5. Test results on my local 4-core Mac, simulating up to 20 shops for comparison
public static void main(String[] args) {
long start = System.nanoTime();
//System.out.println(findPriceByProduct("myIphone11Max"));
//System.out.println(findPriceByProductAsync("myIphone11Max"));
System.out.println(findPriceByProductAsyncThread("myIphone11Max"));
long duration = (System.nanoTime() - start) / 1_000_000;
System.out.println("Duration in " + duration + " msecs");
//[A1 price is 168.99, A2 price is 190.99, A3 price is 183.31, A4 price is 210.65]
//Duration in 4164 msecs -- 4 shop API calls take about 4 seconds
//[A1 price is 184.21, A2 price is 148.43, A3 price is 134.85, A4 price is 221.05, A5 price is 190.88]
//Duration in 5140 msecs -- 5 shop API calls take about 5 seconds
//[A1 price is 229.06, A2 price is 198.69, A3 price is 163.47, A4 price is 135.35]
//Duration in 1180 msecs -- parallel stream, 4 shops
//[A1 price is 153.41, A2 price is 137.70, A3 price is 169.99, A4 price is 212.28, A5 price is 133.78]
//Duration in 2138 msecs -- parallel stream, 5 shops
//[A1 price is 180.93, A2 price is 187.74, A3 price is 206.71, A4 price is 123.72, A5 price is 185.13, A6 price is 201.66, A7 price is 122.12, A8 price is 122.28, A9 price is 133.44]
//Duration in 3139 msecs -- parallel stream, 9 shops, about 3 seconds
//[A1 price is 165.29, A2 price is 194.87, A3 price is 226.72, A4 price is 146.20, A5 price is 164.49, A6 price is 146.97, A7 price is 198.16, A8 price is 213.22, A9 price is 128.42, A10 price is 166.33, A11 price is 186.02, A12 price is 137.38, A13 price is 130.81, A14 price is 212.11, A15 price is 184.66, A16 price is 188.99, A17 price is 222.41, A18 price is 133.20, A19 price is 198.52, A20 price is 192.64]
//Duration in 5154 msecs -- parallel stream, 20 shops, about 5 seconds
//[A1 price is 182.99, A2 price is 163.31, A3 price is 210.66, A4 price is 150.64]
//Duration in 2146 msecs -- CompletableFuture, 4 shops, about 2 seconds
//[A1 price is 203.81, A2 price is 146.71, A3 price is 195.79, A4 price is 148.64, A5 price is 172.47]
//Duration in 2155 msecs -- CompletableFuture, 5 shops, about 2 seconds
//[A1 price is 175.50, A2 price is 227.35, A3 price is 136.42, A4 price is 149.88, A5 price is 227.83, A6 price is 158.69, A7 price is 145.41, A8 price is 199.95, A9 price is 152.76]
//Duration in 3184 msecs -- CompletableFuture, 9 shops, about 3 seconds
//[A1 price is 139.27, A2 price is 227.24, A3 price is 185.27, A4 price is 189.21, A5 price is 212.77, A6 price is 223.50, A7 price is 134.73, A8 price is 161.68, A9 price is 181.64, A10 price is 228.58, A11 price is 190.20, A12 price is 133.67, A13 price is 229.10, A14 price is 148.53, A15 price is 212.89, A16 price is 159.47, A17 price is 170.14, A18 price is 202.56, A19 price is 137.83, A20 price is 128.59]
//Duration in 7165 msecs -- CompletableFuture, 20 shops, about 7 seconds
//System.out.println("CPU cores=" + Runtime.getRuntime().availableProcessors());
//[A1 price is 214.90, A2 price is 192.81, A3 price is 147.93, A4 price is 174.46, A5 price is 208.14, A6 price is 177.70, A7 price is 183.81, A8 price is 161.27, A9 price is 129.07, A10 price is 194.89, A11 price is 174.63, A12 price is 159.22, A13 price is 143.92, A14 price is 141.92, A15 price is 136.10, A16 price is 174.94, A17 price is 137.55, A18 price is 204.08, A19 price is 150.38, A20 price is 147.92]
//Duration in 1041 msecs -- CompletableFuture with the custom thread pool, 20 asynchronous requests, about 1 second
}
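The task statement asks for the lowest price, while the methods above return every shop's price as a formatted string. A minimal sketch of that last step follows, assuming each shop lookup is wrapped in a `Supplier` and returns a small value object. The `Quote` class, the `findLowest` method, and the fixed prices are hypothetical and only illustrate the idea.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;
import java.util.stream.Collectors;

public class LowestPriceDemo {
    // Hypothetical value object pairing a shop with the price it quoted.
    static final class Quote {
        private final String shopName;
        private final double price;

        Quote(String shopName, double price) {
            this.shopName = shopName;
            this.price = price;
        }

        String getShopName() { return shopName; }
        double getPrice() { return price; }
    }

    // Starts every lookup asynchronously, waits for all of them,
    // and returns the cheapest quote (empty if there are no shops).
    static Optional<Quote> findLowest(List<Supplier<Quote>> fetchers) {
        List<CompletableFuture<Quote>> futures = fetchers.stream()
                .map(f -> CompletableFuture.supplyAsync(f))
                .collect(Collectors.toList());
        return futures.stream()
                .map(CompletableFuture::join)
                .min(Comparator.comparingDouble(Quote::getPrice));
    }

    public static void main(String[] args) {
        // Fixed, made-up prices so the result is deterministic.
        List<Supplier<Quote>> fetchers = Arrays.asList(
                () -> new Quote("Shop1", 168.99),
                () -> new Quote("Shop2", 150.64),
                () -> new Quote("Shop3", 210.65));
        findLowest(fetchers).ifPresent(q ->
                System.out.println(String.format("Lowest: %s at %.2f", q.getShopName(), q.getPrice())));
    }
}
```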
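Conclusion:
In these tests, CompletableFuture with the default executor and stream.parallel take a similar amount of time; with 20 shops they differ by about 2 seconds (7165 ms versus 5154 ms). CompletableFuture backed by a custom thread pool is far faster, finishing 20 requests in about 1041 ms.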
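One plausible reading of these numbers: each request blocks for about 1 second, so the total time is roughly the number of "rounds" needed, that is, the task count divided by the number of threads, rounded up. On a 4-core machine the common ForkJoinPool defaults to 3 worker threads (cores minus one); a parallel stream also uses the calling thread, giving 4. The sketch below works through that arithmetic. The class and method names are hypothetical, and the model ignores scheduling overhead.

```java
import java.util.concurrent.ForkJoinPool;

public class RoundsEstimate {
    // Estimated wall-clock time when `tasks` blocking calls of `taskMillis`
    // each are spread over `workers` threads: ceil(tasks / workers) rounds.
    static long estimateMillis(int tasks, int workers, long taskMillis) {
        int rounds = (tasks + workers - 1) / workers;
        return rounds * taskMillis;
    }

    public static void main(String[] args) {
        // Default common-pool size on this machine (usually cores - 1).
        int poolWorkers = Math.max(1, ForkJoinPool.getCommonPoolParallelism());
        // A parallel stream also runs tasks on the calling thread.
        int streamWorkers = poolWorkers + 1;
        int shops = 20;

        System.out.println("parallel stream:           ~"
                + estimateMillis(shops, streamWorkers, 1000) + " ms");
        System.out.println("CompletableFuture default: ~"
                + estimateMillis(shops, poolWorkers, 1000) + " ms");
        System.out.println("custom pool of 20 threads: ~"
                + estimateMillis(shops, shops, 1000) + " ms");
    }
}
```

With 4 and 3 threads this model gives 5000 ms and 7000 ms for 20 shops, close to the measured 5154 ms and 7165 ms, and 1000 ms for a 20-thread pool, close to the measured 1041 ms.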
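Parallelism: streams or CompletableFutures?
There are currently two main ways to run operations over a collection in parallel:
(1) Convert the collection to a parallel stream and process it with stream.map.
(2) Iterate over the elements and submit each one as a CompletableFuture task that runs on an executor's threads. CompletableFuture is more flexible: you can supply a thread pool and tune its size so that the overall computation does not stall while threads wait on I/O.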
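Recommendations:
1. For compute-intensive work with no slow I/O calls, use Stream.parallel: it is the simplest to implement and the most efficient. (If every task is compute-bound, there is no point in creating more threads than processor cores.)
2. If the parallel work units involve a lot of waiting on I/O (for example slow network connections or slow API responses), CompletableFuture is more flexible: you can set the thread count from the W/C ratio in the pool-sizing formula. Another reason to avoid streams in this case: when a pipeline stage waits a long time on I/O, the lazy nature of streams makes it hard to tell when the wait actually begins (and hard to surface an exception), whereas CompletableFuture handles exceptions well.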
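As a rough illustration of the exception-handling point, the sketch below attaches a fallback to each future with `exceptionally`, so one failing shop does not make `join` throw and lose the other results. The `fetch` method, its simulated failure for "Shop2", and the fallback text are hypothetical. Timeouts are not covered here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

public class FutureErrorHandlingDemo {
    // Hypothetical lookup that fails for one shop to simulate a broken endpoint.
    static String fetch(String shop) {
        if (shop.equals("Shop2")) {
            throw new IllegalStateException("simulated failure for " + shop);
        }
        return shop + " price is 100.00";
    }

    // Each future gets its own fallback, so a failure stays local to that shop.
    static List<String> fetchAll(List<String> shops, Executor executor) {
        List<CompletableFuture<String>> futures = shops.stream()
                .map(shop -> CompletableFuture
                        .supplyAsync(() -> fetch(shop), executor)
                        .exceptionally(ex -> shop + " price unavailable"))
                .collect(Collectors.toList());
        return futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        System.out.println(fetchAll(Arrays.asList("Shop1", "Shop2", "Shop3"), pool));
        pool.shutdown();
    }
}
```

Without the `exceptionally` stage, `join` on the failed future would throw a CompletionException and the whole list would be lost.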