Java 8: Powerful Collectors

1. collect

a. Base object

public class Student {
   
   private String name ;
   
   private double score;
   
   public Student() {
   }
   
   public Student(String name, double score) {
      this.name = name;
      this.score = score;
   }
   
   public String getName() {
      return name;
   }
   
   public void setName(String name) {
      this.name = name;
   }
   
   
   public double getScore() {
      return score;
   }
   
   public void setScore(double score) {
      this.score = score;
   }
   
   @Override
   public String toString() {
      return "Student{" +
            "name='" + name + '\'' +
            ", score=" + score +
            '}';
   }
}

b. Building the collection

List<Student> students = Arrays.asList(new Student[]{
      new Student("zhangsan", 89d ),
      new Student("lisi", 89d ),
      new Student("wangwu", 98d ),
});

1. Filtering the list down to students who scored above 90 looks like this:

List<Student> above90 = students.stream().filter(s -> s.getScore() > 90).collect(Collectors.toList());

The final collect call looks like magic: how exactly does it turn a Stream into a List? Start with the definition of the collect method:

<R, A> R collect(Collector<? super T, A, R> collector);

It takes a collector as its argument. The parameter type is Collector, which is an interface; its definition is essentially:

/** @see Stream#collect(Collector)
 * @see Collectors
 *
 * @param <T> the type of input elements to the reduction operation
 * @param <A> the mutable accumulation type of the reduction operation (often
 *            hidden as an implementation detail)
 * @param <R> the result type of the reduction operation
 * @since 1.8
 */
public interface Collector<T, A, R> {
    /**
     * A function that creates and returns a new mutable result container.
     *
     * @return a function which returns a new, mutable result container
     */
    Supplier<A> supplier();
    ......
	
}

In a sequential stream, collect interacts with these interface methods roughly as follows:

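// First call the factory method supplier to create the container that holds the processing state; its type is A
A container = collector.supplier().get();
// For each element t in the stream, call the accumulator with the accumulated state (container) and the current element t
for (T t : data) {
    collector.accumulator().accept(container, t);
}
// Finally call finisher to apply any adjustment or type conversion (A to R) to the accumulated state, and return the result
return collector.finisher().apply(container);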
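The interaction described above can be tried out directly. The following is a minimal sketch, not the JDK's actual stream implementation: `collectSequentially` is a hypothetical helper name introduced here, and it drives a standard `Collector` by hand in the same three steps (supplier, accumulator, finisher).

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class ManualCollect {

    // Hypothetical helper: drives any Collector over a list, roughly the way
    // a sequential stream would (the combiner is never needed here).
    static <T, A, R> R collectSequentially(List<T> data, Collector<? super T, A, R> collector) {
        // 1. Create the mutable container of type A.
        A container = collector.supplier().get();
        // 2. Fold every element into the container.
        for (T t : data) {
            collector.accumulator().accept(container, t);
        }
        // 3. Convert the container (A) into the final result (R).
        return collector.finisher().apply(container);
    }

    public static void main(String[] args) {
        List<Integer> copy = collectSequentially(Arrays.asList(89, 89, 98), Collectors.toList());
        String joined = collectSequentially(Arrays.asList("a", "b", "c"), Collectors.joining(","));
        System.out.println(copy);   // [89, 89, 98]
        System.out.println(joined); // a,b,c
    }
}
```

With `Collectors.joining`, the finisher does real work (it turns the internal accumulation object into a `String`), whereas with `Collectors.toList()` it just hands the container back.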

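The combiner is only used by parallel streams, where it merges partial results. characteristics describes the collector's traits, which callers of the Collector interface can use to apply optimizations.

Characteristics is an enum with three values: CONCURRENT, UNORDERED, and IDENTITY_FINISH. Their meaning is illustrated briefly through examples later; they can be ignored for now.

So what exactly is Collectors.toList()? Here is the code: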

/**
 * Returns a {@code Collector} that accumulates the input elements into a
 * new {@code List}. There are no guarantees on the type, mutability,
 * serializability, or thread-safety of the {@code List} returned; if more
 * control over the returned {@code List} is required, use {@link #toCollection(Supplier)}.
 *
 * @param <T> the type of the input elements
 * @return a {@code Collector} which collects all the input elements into a
 * {@code List}, in encounter order
 */
public static <T>
Collector<T, ?, List<T>> toList() {
    return new CollectorImpl<>((Supplier<List<T>>) ArrayList::new, List::add,
                               (left, right) -> { left.addAll(right); return left; },
                               CH_ID);
}

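Its implementation class is CollectorImpl, a nested class internal to Collectors. The implementation is simple: it mainly defines two constructors that accept functional arguments and assign them to internal fields. For toList:

1) The supplier is ArrayList::new, so an ArrayList is created as the container.

2) The accumulator is List::add, so each element encountered is appended to the list.

3) The third argument is the combiner, which merges partial results.

4) The fourth argument, CH_ID, is a static constant holding the single characteristic IDENTITY_FINISH. It means the finisher has nothing to do and simply returns the accumulated container as-is.

In other words, the pseudocode behind collect(Collectors.toList()) is: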

List<T> container = new ArrayList<>();
for (T t : data) {
    container.add(t);
}
return container;
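To make the four pieces concrete, here is a sketch that assembles a toList-like collector by hand with the public factory `Collector.of`. The name `myToList` is hypothetical and this is not how the JDK defines `toList`; it only mirrors the same supplier, accumulator, and combiner. The three-function overload of `Collector.of` used here marks the result as IDENTITY_FINISH on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Stream;

public class HandRolledToList {

    // Hypothetical helper, not part of the JDK: a toList look-alike.
    static <T> Collector<T, List<T>, List<T>> myToList() {
        return Collector.of(
                ArrayList::new,                                          // supplier: new empty container
                List::add,                                               // accumulator: append one element
                (left, right) -> { left.addAll(right); return left; });  // combiner: merge partial results
    }

    public static void main(String[] args) {
        List<Integer> above90 = Stream.of(89, 89, 98)
                .filter(score -> score > 90)
                .collect(myToList());
        System.out.println(above90);                      // [98]
        System.out.println(myToList().characteristics()); // [IDENTITY_FINISH]
    }
}
```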

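2. Container collectors

1. toSet
  toSet is used just like toList, except that it removes duplicates, so no example is given here. toList is backed by an ArrayList, and toSet is backed by a HashSet.
2. toCollection
  toCollection is a general-purpose container collector that works with any implementation of the Collection interface. It takes a factory Supplier as its argument; the code is: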

/**
 * Returns a {@code Collector} that accumulates the input elements into a
 * new {@code Collection}, in encounter order.  The {@code Collection} is
 * created by the provided factory.
 *
 * @param <T> the type of the input elements
 * @param <C> the type of the resulting {@code Collection}
 * @param collectionFactory a {@code Supplier} which returns a new, empty
 * {@code Collection} of the appropriate type
 * @return a {@code Collector} which collects all the input elements into a
 * {@code Collection}, in encounter order
 */
public static <T, C extends Collection<T>>
Collector<T, ?, C> toCollection(Supplier<C> collectionFactory) {
    return new CollectorImpl<>(collectionFactory, Collection<T>::add,
                               (r1, r2) -> { r1.addAll(r2); return r1; },
                               CH_ID);
}

For example, to remove duplicates while preserving encounter order, use a LinkedHashSet. The Collector can be created like this:

Collectors.toCollection(LinkedHashSet::new)
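As a small illustration of choosing the container through the factory argument, the sketch below deduplicates the same list twice, once into a LinkedHashSet and once into a TreeSet. The class and method names are made up for this example.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

public class DedupDemo {

    // Removes duplicates and keeps the order in which names first appear.
    static Set<String> dedupKeepingOrder(List<String> names) {
        return names.stream().collect(Collectors.toCollection(LinkedHashSet::new));
    }

    // Removes duplicates and sorts the names in natural order.
    static Set<String> dedupSorted(List<String> names) {
        return names.stream().collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("zhangsan", "lisi", "zhangsan", "wangwu");
        System.out.println(dedupKeepingOrder(names)); // [zhangsan, lisi, wangwu]
        System.out.println(dedupSorted(names));       // [lisi, wangwu, zhangsan]
    }
}
```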

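3. toMap

toMap turns a stream of elements into a Map. Since a Map has keys and values, toMap needs at least two function arguments: one that maps an element to its key and one that maps it to its value. The basic definition is: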

public static <T, K, U>
Collector<T, ?, Map<K,U>> toMap(Function<? super T, ? extends K> keyMapper,
                                Function<? super T, ? extends U> valueMapper) {
    return toMap(keyMapper, valueMapper, throwingMerger(), HashMap::new);
}

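The result is a Map<K, U>; keyMapper maps an element to its key and valueMapper maps it to its value. For example, converting the student stream into a Map from student name to score can be written as: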

Map<String, Double> collect = students.stream().collect(Collectors.toMap(Student::getName, Student::getScore));

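In practice, a list of objects often needs to be converted into a Map keyed by primary key so that later lookups by key are fast. The Student class above has no id field, so treating name as the primary key, a Map from key to Student object can be built like this: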

Map<String, Student> collect1 = students.stream().collect(Collectors.toMap(Student::getName, Function.identity()));
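If the objects do carry a dedicated id, the same pattern applies with the id getter as keyMapper. The sketch below assumes a hypothetical `StudentWithId` class (the Student class in this article has no id) and a made-up helper `indexById`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class LookupById {

    // Hypothetical variant of Student that carries an id; not part of the original example.
    static class StudentWithId {
        private final String id;
        private final String name;

        StudentWithId(String id, String name) {
            this.id = id;
            this.name = name;
        }

        String getId() { return id; }
        String getName() { return name; }
    }

    // Builds an id -> student index; assumes ids are unique.
    static Map<String, StudentWithId> indexById(List<StudentWithId> students) {
        return students.stream()
                .collect(Collectors.toMap(StudentWithId::getId, Function.identity()));
    }

    public static void main(String[] args) {
        Map<String, StudentWithId> byId = indexById(Arrays.asList(
                new StudentWithId("s1", "zhangsan"),
                new StudentWithId("s2", "lisi")));
        System.out.println(byId.get("s2").getName()); // lisi
    }
}
```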

The toMap above assumes that keys are unique; if a key is repeated, it throws an IllegalStateException. For example:

Stream.of("abc","hello","abc").collect(Collectors.toMap(Function.identity(),t->t.length()));

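The intent is a Map from each string to its length, but because the string "abc" appears twice, the program throws an exception. If the program should instead tolerate repeated elements, another toMap overload can be used: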

public static <T, K, U>
Collector<T, ?, Map<K,U>> toMap(Function<? super T, ? extends K> keyMapper,
                                Function<? super T, ? extends U> valueMapper,
                                BinaryOperator<U> mergeFunction) {
    return toMap(keyMapper, valueMapper, mergeFunction, HashMap::new);
}

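Compared with the previous toMap, it takes an extra argument, mergeFunction, which resolves conflicts. When a new element is collected and its key already exists, the old value for that key and the new element's value are passed to mergeFunction, and the value it returns is stored under the key.

In the string-length example, the new value and the old value are identical, so either one can be used. The code can be: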

Stream.of("abc","hello","abc").collect(Collectors.toMap(Function.identity(),t->t.length(),(v1,v2)->v2));
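The difference between the two overloads can be checked with a short sketch. The class and method names below are invented for illustration; the third method additionally shows a merge function that combines values (`Integer::sum`) rather than picking one, which counts occurrences.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class DuplicateKeys {

    // Two-argument toMap: a repeated key is rejected with IllegalStateException.
    static boolean throwsOnDuplicate() {
        try {
            Stream.of("abc", "hello", "abc")
                    .collect(Collectors.toMap(Function.identity(), String::length));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Three-argument toMap: on a repeated key, keep the newer value (v2).
    static Map<String, Integer> lengths() {
        return Stream.of("abc", "hello", "abc")
                .collect(Collectors.toMap(Function.identity(), String::length, (v1, v2) -> v2));
    }

    // Merge function that combines values: counts how often each string occurs.
    static Map<String, Integer> counts() {
        return Stream.of("abc", "hello", "abc")
                .collect(Collectors.toMap(Function.identity(), s -> 1, Integer::sum));
    }

    public static void main(String[] args) {
        System.out.println(throwsOnDuplicate());   // true
        System.out.println(lengths().get("abc"));  // 3
        System.out.println(counts().get("abc"));   // 2
    }
}
```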

toMap also has a more general form:

public static <T, K, U, M extends Map<K, U>>
Collector<T, ?, M> toMap(Function<? super T, ? extends K> keyMapper,
                            Function<? super T, ? extends U> valueMapper,
                            BinaryOperator<U> mergeFunction,
                            Supplier<M> mapSupplier) {
    BiConsumer<M, T> accumulator
            = (map, element) -> map.merge(keyMapper.apply(element),
                                          valueMapper.apply(element), mergeFunction);
    return new CollectorImpl<>(mapSupplier, accumulator, mapMerger(mergeFunction), CH_ID);
}

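Compared with the previous toMap, it adds mapSupplier, a factory for the Map. For the two earlier toMap overloads, mapSupplier is simply HashMap::new. HashMap guarantees no ordering, so to preserve the order in which elements appear, substitute LinkedHashMap; to get a result sorted by key, use TreeMap.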
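A brief sketch of the four-argument form, with invented class and method names: the same input is collected once into a LinkedHashMap and once into a TreeMap. The merge function `(first, second) -> first` is an arbitrary choice here, since this overload requires one.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

public class OrderedToMap {

    // Keys come out in the order the words were encountered.
    static Map<String, Integer> lengthsInEncounterOrder(List<String> words) {
        return words.stream().collect(Collectors.toMap(
                Function.identity(), String::length, (first, second) -> first, LinkedHashMap::new));
    }

    // Keys come out sorted in natural (alphabetical) order.
    static Map<String, Integer> lengthsSortedByKey(List<String> words) {
        return words.stream().collect(Collectors.toMap(
                Function.identity(), String::length, (first, second) -> first, TreeMap::new));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("wangwu", "lisi", "zhangsan");
        System.out.println(lengthsInEncounterOrder(words)); // {wangwu=6, lisi=4, zhangsan=8}
        System.out.println(lengthsSortedByKey(words));      // {lisi=4, wangwu=6, zhangsan=8}
    }
}
```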