Guava Notes: Collections

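Immutable collections

Guava provides immutable collections (the ImmutableCollection family). Once such a collection has been created, any attempt to modify it throws UnsupportedOperationException.

Example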

public static final ImmutableSet<String> COLOR_NAMES = ImmutableSet.of(
  "red",
  "orange",
  "yellow",
  "green",
  "blue",
  "purple");

class Foo {
  final ImmutableSet<Bar> bars;
  Foo(Set<Bar> bars) {
    this.bars = ImmutableSet.copyOf(bars); // defensive copy!
  }
}

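Advantages

Safe and reliable to use
Thread-safe
No need to support mutation, which saves time and space
Usable as a constant

The JDK provides the Collections.unmodifiableXXX methods for creating an unmodifiable collection. Compared with Guava's immutable collections, however, these are:

Unwieldy and verbose.

Unsafe. The returned collection is truly immutable only if nobody holds a reference to the original collection. For example: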

public void test() {
    List<Integer> list = new ArrayList<>();
    list.add(1);
    list.add(2);

    List<Integer> unmodifiableList = Collections.unmodifiableList(list);
    System.out.println(unmodifiableList);   // [1, 2]

    list.add(3);
    System.out.println(unmodifiableList);   // [1, 2, 3]: the change shows through the wrapper
}

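Inefficient. The wrapper still carries all the overhead of the underlying mutable collection, such as concurrent-modification checks and extra space in hash tables.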
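The safety point can be made concrete by contrasting a wrapper around the caller's list with a wrapper around a private copy. The following is a minimal sketch that uses only JDK classes; the class and method names (ViewVersusCopy, readOnlyView, readOnlySnapshot) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; only JDK APIs are used.
class ViewVersusCopy {
    /** Wraps the caller's list: later changes to source stay visible through the wrapper. */
    static List<Integer> readOnlyView(List<Integer> source) {
        return Collections.unmodifiableList(source);
    }

    /** Copies first, then wraps: nobody else holds a reference to the copy. */
    static List<Integer> readOnlySnapshot(List<Integer> source) {
        return Collections.unmodifiableList(new ArrayList<>(source));
    }

    public static void main(String[] args) {
        List<Integer> source = new ArrayList<>(Arrays.asList(1, 2));
        List<Integer> view = readOnlyView(source);
        List<Integer> snapshot = readOnlySnapshot(source);

        source.add(3);

        System.out.println(view);     // [1, 2, 3]
        System.out.println(snapshot); // [1, 2]
    }
}
```

The snapshot variant behaves like a defensive copy, at the cost of still wrapping a full ArrayList.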

Internal implementation

JDK unmodifiable classes

Collections.unmodifiableCollection

public static <T> Collection<T> unmodifiableCollection(Collection<? extends T> c) {
    return new UnmodifiableCollection<>(c);
}

UnmodifiableCollection

static class UnmodifiableCollection<E> implements Collection<E>, Serializable {
    private static final long serialVersionUID = 1820017752578914078L;

    final Collection<? extends E> c;

    UnmodifiableCollection(Collection<? extends E> c) {
        if (c==null)
            throw new NullPointerException();
        this.c = c;
    }

    public boolean add(E e) {
        throw new UnsupportedOperationException();
    }
    // ...
}

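The JDK's unmodifiable classes are really just wrappers around the corresponding mutable collection, which is exactly why the safety problem described above arises.

Guava immutable classes

ImmutableList is used as the example here.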

ImmutableList

public abstract class ImmutableList<E> extends ImmutableCollection<E>
    implements List<E>, RandomAccess {

  public static <E> ImmutableList<E> copyOf(Iterable<? extends E> elements) {
    checkNotNull(elements); // TODO(kevinb): is this here only for GWT?
    return (elements instanceof Collection)
        ? copyOf((Collection<? extends E>) elements)
        : copyOf(elements.iterator());
  }

  private static <E> ImmutableList<E> construct(Object... elements) {
    return asImmutableList(checkElementsNotNull(elements));
  }
    
  static <E> ImmutableList<E> asImmutableList(Object[] elements) {
    return asImmutableList(elements, elements.length);
  }
    
  static <E> ImmutableList<E> asImmutableList(@Nullable Object[] elements, int length) {
    switch (length) {
      case 0:
        return of();
      case 1:
        /*
         * requireNonNull is safe because the callers promise to put non-null objects in the first
         * `length` array elements.
         */
        @SuppressWarnings("unchecked") // our callers put only E instances into the array
        E onlyElement = (E) requireNonNull(elements[0]);
        return of(onlyElement);
      default:
        /*
         * The suppression is safe because the callers promise to put non-null objects in the first
         * `length` array elements.
         */
        @SuppressWarnings("nullness")
        Object[] elementsWithoutTrailingNulls =
            length < elements.length ? Arrays.copyOf(elements, length) : elements;
        return new RegularImmutableList<E>(elementsWithoutTrailingNulls);
    }
  }
}

The creation methods all end up in asImmutableList. When there is more than one element, it returns an instance of the subclass RegularImmutableList.

RegularImmutableList

class RegularImmutableList<E> extends ImmutableList<E> {
  static final ImmutableList<Object> EMPTY = new RegularImmutableList<>(new Object[0]);

  @VisibleForTesting final transient Object[] array;

  RegularImmutableList(Object[] array) {
    this.array = array;
  }

  @Override
  public int size() {
    return array.length;
  }

  @Override
  boolean isPartialView() {
    return false;
  }

  @Override
  Object[] internalArray() {
    return array;
  }

  @Override
  int internalArrayStart() {
    return 0;
  }

  @Override
  int internalArrayEnd() {
    return array.length;
  }

  @Override
  int copyIntoArray(@Nullable Object[] dst, int dstOff) {
    System.arraycopy(array, 0, dst, dstOff, array.length);
    return dstOff + array.length;
  }

  // The fake cast to E is safe because the creation methods only allow E's
  @Override
  @SuppressWarnings("unchecked")
  public E get(int index) {
    return (E) array[index];
  }

  @SuppressWarnings("unchecked")
  @Override
  public UnmodifiableListIterator<E> listIterator(int index) {
    // for performance
    // The fake cast to E is safe because the creation methods only allow E's
    return (UnmodifiableListIterator<E>) Iterators.forArray(array, 0, array.length, index);
  }

  @Override
  public Spliterator<E> spliterator() {
    return Spliterators.spliterator(array, SPLITERATOR_CHARACTERISTICS);
  }

  // TODO(lowasser): benchmark optimizations for equals() and see if they're worthwhile
}

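Guava's immutable collections are separate classes of their own. Unlike the JDK's mutable classes, they do not have to support modification, so their internal structure is very simple, which makes them better in terms of memory use and speed.

In addition, every element is null-checked when an immutable collection is created. In other words, ImmutableList rejects null; in fact, all of Guava's immutable collections reject null.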
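To illustrate how little machinery an array-backed immutable list needs, here is a minimal sketch built only on the JDK's AbstractList. TinyImmutableList is a made-up teaching class, not Guava code, and it leaves out everything else ImmutableList does (builders, partial views, serialization).

```java
import java.util.AbstractList;
import java.util.Objects;

// Hypothetical teaching class; not part of Guava.
final class TinyImmutableList<E> extends AbstractList<E> {
    private final Object[] array; // private copy, never written after construction

    private TinyImmutableList(Object[] array) {
        this.array = array;
    }

    @SafeVarargs
    static <E> TinyImmutableList<E> of(E... elements) {
        Object[] copy = new Object[elements.length];
        for (int i = 0; i < elements.length; i++) {
            // Reject null up front, so a bad element fails at creation time.
            copy[i] = Objects.requireNonNull(elements[i], "element at index " + i);
        }
        return new TinyImmutableList<>(copy);
    }

    @Override
    @SuppressWarnings("unchecked") // only E instances are ever stored
    public E get(int index) {
        return (E) array[index];
    }

    @Override
    public int size() {
        return array.length;
    }

    // add, set and remove are inherited from AbstractList,
    // where they throw UnsupportedOperationException.
}
```

Because the only state is one private array, there is no modification counter and no spare capacity to carry around.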

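The Guava team did an exhaustive study of Google's internal code base. It showed that null elements were allowed in collections about 5% of the time, and that the other 95% of cases were best served by failing fast on null. If you need null values, consider the unmodifiable wrappers the JDK provides, such as Collections.unmodifiableCollection.

Usage

Guava supports the following ways of creating an immutable collection:

The copyOf method, for example ImmutableSet.copyOf(set)
The of method, for example ImmutableSet.of("a", "b", "c") or ImmutableMap.of("a", 1, "b", 2)

A Builder, for example: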

public static final ImmutableSet<Color> GOOGLE_COLORS =
   ImmutableSet.<Color>builder()
       .addAll(WEBSAFE_COLORS)
       .add(new Color(0, 191, 255))
       .build();

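Ordering

Except for the sorted collections, the order of the elements is fixed at construction time. For example:

ImmutableSet.of("a", "b", "c", "a", "d", "b")

Iterating over this set yields "a", "b", "c", "d".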
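One way to reason about the expected result is that duplicates are dropped and each element keeps the position of its first occurrence. The JDK's LinkedHashSet behaves the same way, so it serves as an analogy in the sketch below. This uses JDK classes only and says nothing about how Guava implements it; FirstSeenOrder is a made-up name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical demo class; a JDK analogy, not Guava's implementation.
class FirstSeenOrder {
    /** Drops duplicates while keeping each element at the position of its first occurrence. */
    static List<String> distinctInFirstSeenOrder(String... elements) {
        return new ArrayList<>(new LinkedHashSet<>(Arrays.asList(elements)));
    }

    public static void main(String[] args) {
        System.out.println(distinctInFirstSeenOrder("a", "b", "c", "a", "d", "b")); // [a, b, c, d]
    }
}
```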

copyOf

public static <E> ImmutableList<E> copyOf(Collection<? extends E> elements) {
  if (elements instanceof ImmutableCollection) {
    @SuppressWarnings("unchecked") // all supported methods are covariant
    ImmutableList<E> list = ((ImmutableCollection<E>) elements).asList();
    return list.isPartialView() ? ImmutableList.<E>asImmutableList(list.toArray()) : list;
  }
  return construct(elements.toArray());
}

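copyOf includes an optimization: when the collection being copied is already an immutable collection, it can be treated as a constant, so the original instance is reused directly (unless it is a partial view, in which case it is copied). Otherwise copyOf calls construct, which ends up in asImmutableList and creates a new immutable collection.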
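The reuse idea can be sketched without Guava. FrozenList below is a made-up class that copies a mutable source but hands back the same instance when the source is already a FrozenList. It skips null checks and the partial-view case, so treat it as an illustration of the optimization only.

```java
import java.util.AbstractList;
import java.util.Collection;

// Hypothetical teaching class; not part of Guava.
class FrozenList<E> extends AbstractList<E> {
    private final Object[] array;

    private FrozenList(Object[] array) {
        this.array = array;
    }

    @SuppressWarnings("unchecked") // a read-only FrozenList<? extends E> is safe to view as FrozenList<E>
    static <E> FrozenList<E> copyOf(Collection<? extends E> source) {
        if (source instanceof FrozenList) {
            // Already immutable: no copy needed, reuse the instance.
            return (FrozenList<E>) source;
        }
        // toArray() returns a fresh array that the source does not keep a reference to.
        return new FrozenList<>(source.toArray());
    }

    @Override
    @SuppressWarnings("unchecked") // only E instances are ever stored
    public E get(int index) {
        return (E) array[index];
    }

    @Override
    public int size() {
        return array.length;
    }
}
```

Calling FrozenList.copyOf on a mutable list always copies, so later changes to that list are not visible; calling it on the result returns the very same object.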

Mapping between mutable and immutable collections

Interface	JDK or Guava	Immutable Version
`Collection`	JDK	`ImmutableCollection`
`List`	JDK	`ImmutableList`
`Set`	JDK	`ImmutableSet`
`SortedSet`/`NavigableSet`	JDK	`ImmutableSortedSet`
`Map`	JDK	`ImmutableMap`
`SortedMap`	JDK	`ImmutableSortedMap`
`Multiset`	Guava	`ImmutableMultiset`
`SortedMultiset`	Guava	`ImmutableSortedMultiset`
`Multimap`	Guava	`ImmutableMultimap`
`ListMultimap`	Guava	`ImmutableListMultimap`
`SetMultimap`	Guava	`ImmutableSetMultimap`
`BiMap`	Guava	`ImmutableBiMap`
`ClassToInstanceMap`	Guava	`ImmutableClassToInstanceMap`
`Table`	Guava	`ImmutableTable`

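New collection types in Guava

Guava provides several useful collection types that the JDK does not have, and they coexist cleanly with the JDK collections framework.

Multiset

The traditional Java idiom for counting how many times each word occurs in a document looks something like this: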

Map<String, Integer> counts = new HashMap<String, Integer>();
for (String word : words) {
    Integer count = counts.get(word);
    if (count == null) {
        counts.put(word, 1);
    } else {
        counts.put(word, count + 1);
    }
}

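This is awkward and error-prone, and it does not lend itself to collecting other statistics, such as the total number of words.

Guava provides a new collection type, Multiset, which supports adding multiple occurrences of the same element.

Use cases

A Multiset is mainly used in two ways:

Like an ArrayList, elements may repeat, but there is no ordering, or rather the order does not matter;
As a tally of elements and the number of times each one occurs

Basic usage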

Method	Description
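`count(E)`	Returns the number of occurrences of element `E` in the multiset
`elementSet()`	Returns the distinct elements of the `Multiset` as a `Set<E>`
`entrySet()`	Like `Map.entrySet()`, returns a `Set<Multiset.Entry<E>>`, where `Entry` has the methods `getElement()` and `getCount()`
`add(E, int)`	Adds the given number of occurrences of the element
`remove(E, int)`	Removes the given number of occurrences of the element
`setCount(E, int)`	Sets the number of occurrences of the element to the given value
`size()`	Returns the total number of occurrences of all elements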

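**A Multiset is not a Map.**

A Multiset is a Collection and fulfills all of the related interface contracts.

An element's count in a Multiset is always positive; an element whose count is 0 is considered absent from the Multiset.

Multiset.size() returns the total number of occurrences of all elements. For the number of distinct elements, use elementSet().size().

Multiset.iterator() iterates over every occurrence of every element, so the number of iterations equals size().

A Multiset supports adding elements, removing elements, or setting an element's count directly. setCount(elem, 0) is equivalent to removing all occurrences of that element.

Multiset.count(elem) returns 0 when the element is not in the Multiset.

Implementations

Guava provides several Multiset implementations, which correspond roughly to the JDK Map implementations.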

Map	Corresponding Multiset	Supports `null` elements
`HashMap`	`HashMultiset`	Y
`TreeMap`	`TreeMultiset`	Y
`LinkedHashMap`	`LinkedHashMultiset`	Y
`ConcurrentHashMap`	`ConcurrentHashMultiset`	N
`ImmutableMap`	`ImmutableMultiset`	N

Core fields

HashMultiset is used as the example here.

private transient Map<E, Count> backingMap;

private transient long size;

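Here backingMap is the internal data structure, and size is the total number of occurrences of all elements, which is the value that size() reports.

Although it is built on a Map internally, Multiset extends Collection, so it is still fundamentally a Collection.

In other words, HashMultiset uses a single Map internally to serve two purposes: behaving as a collection and keeping a count for each element.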
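The two-field design can be sketched with plain JDK classes. MiniMultiset below is a made-up teaching class that borrows the field names backingMap and size from the snippet above. It stores Integer counts instead of Guava's Count objects, returns a long from size(), and is not a Collection, so it only illustrates the bookkeeping.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical teaching class; not part of Guava.
final class MiniMultiset<E> {
    private final Map<E, Integer> backingMap = new HashMap<>(); // element -> occurrences, always > 0
    private long size;                                          // sum of all occurrences

    void add(E element, int occurrences) {
        if (occurrences < 0) {
            throw new IllegalArgumentException("occurrences must not be negative");
        }
        if (occurrences == 0) {
            return; // nothing to record; a zero count must never be stored
        }
        backingMap.merge(element, occurrences, Integer::sum);
        size += occurrences;
    }

    /** Returns 0 for an element that is not present. */
    int count(E element) {
        return backingMap.getOrDefault(element, 0);
    }

    /** Setting the count to 0 removes the element entirely. */
    void setCount(E element, int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count must not be negative");
        }
        size += count - count(element);
        if (count == 0) {
            backingMap.remove(element);
        } else {
            backingMap.put(element, count);
        }
    }

    long size() {
        return size;
    }

    Set<E> elementSet() {
        return Collections.unmodifiableSet(backingMap.keySet());
    }
}
```

After adding "a", "b" and "a" once each, count("a") is 2, size() is 3 and elementSet() has 2 entries; setCount("a", 0) then brings size() down to 1.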

Multimap

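When one key maps to several values, the usual data structure is Map<K, List<V>> or Map<K, Set<V>>. Guava provides a more convenient collection for this case: Multimap.

The asMap() method of Multimap returns a Map<K, Collection<V>>. Note that a Multimap never contains a key mapped to an empty collection: the collection for a key always holds at least one value, and an empty collection means that the key is not in the Multimap.

Construction

The most direct way to create a Multimap is MultimapBuilder, which lets you configure how the keys and the values are stored.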

// creates a ListMultimap with tree keys and array list values
ListMultimap<String, Integer> treeListMultimap =
    MultimapBuilder.treeKeys().arrayListValues().build();

// creates a SetMultimap with hash keys and enum set values
SetMultimap<Integer, MyEnum> hashEnumMultimap =
    MultimapBuilder.hashKeys().enumSetValues(MyEnum.class).build();

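Alternatively, you can use an implementation class directly (for example through its static create() method) instead of MultimapBuilder.

Modification

put(K, V): adds a value for the key; equivalent to multimap.get(key).add(value)
putAll(K, Iterable<V>): adds each value to the collection for the key; equivalent to Iterables.addAll(multimap.get(key), values)
remove(K, V): removes the given value from the collection for the key and returns whether anything was removed; equivalent to multimap.get(key).remove(value)
removeAll(K): removes the key and returns the values previously associated with it; equivalent to multimap.get(key).clear()
replaceValues(K, Iterable<V>): removes the key's existing values and adds all values from the new collection; equivalent to multimap.get(key).clear(); Iterables.addAll(multimap.get(key), values)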

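Views

asMap(): returns a Map<K, Collection<V>>. The returned map supports remove, but not put or putAll. If you want null rather than an empty collection for an absent key, use asMap().get(key)
entries(): returns a Collection<Map.Entry<K, V>>
keySet(): returns the distinct keys as a Set
keys(): returns the keys as a Multiset, in which each key occurs as many times as it has values; elements can be removed from it but not added
values(): returns all the values of all keys as a single Collection<V>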

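Differences between Multimap and Map

A Multimap<K, V> is not a Map<K, Collection<V>>. The differences are:

Multimap.get(key) always returns a non-null, possibly empty collection. If you prefer null for an absent key, use asMap() to obtain a Map
Multimap.containsKey(key) returns true only if the collection associated with the key is non-empty
entries() returns one entry per key-value pair. For key-to-collection entries, use asMap().entrySet()
size() returns the number of entries in entries(), not the number of distinct keys. For the number of distinct keys, use keySet().size()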
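These rules are easier to see in a small model. MiniListMultimap below is a made-up class built on a JDK Map<K, List<V>>. It mirrors the semantics listed above (non-null get, no empty collections, size counting pairs) but returns read-only lists from get, whereas a real Multimap returns a live view.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical teaching class; not part of Guava.
final class MiniListMultimap<K, V> {
    private final Map<K, List<V>> map = new HashMap<>(); // never holds an empty list
    private int totalSize;                               // number of key-value pairs

    void put(K key, V value) {
        map.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        totalSize++;
    }

    boolean remove(K key, V value) {
        List<V> values = map.get(key);
        if (values == null || !values.remove(value)) {
            return false;
        }
        if (values.isEmpty()) {
            map.remove(key); // an empty list means the key is gone
        }
        totalSize--;
        return true;
    }

    /** Never null: an absent key yields an empty list. */
    List<V> get(K key) {
        List<V> values = map.get(key);
        return values == null ? Collections.<V>emptyList() : Collections.unmodifiableList(values);
    }

    boolean containsKey(K key) {
        return map.containsKey(key);
    }

    /** Counts key-value pairs, not keys. */
    int size() {
        return totalSize;
    }

    int distinctKeys() {
        return map.size();
    }
}
```

With the pairs ("a", 1), ("a", 2) and ("b", 3) stored, size() is 3 while distinctKeys() is 2, and removing ("b", 3) makes containsKey("b") false.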

Implementations

Implementation	Keys behave like	Values behave like
`ArrayListMultimap`	`HashMap`	`ArrayList`
`HashMultimap`	`HashMap`	`HashSet`
`LinkedListMultimap`	`LinkedHashMap`	`LinkedList`
`LinkedHashMultimap`	`LinkedHashMap`	`LinkedHashSet`
`TreeMultimap`	`TreeMap`	`TreeSet`
`ImmutableListMultimap`	`ImmutableMap`	`ImmutableList`
`ImmutableSetMultimap`	`ImmutableMap`	`ImmutableSet`

Core internal fields

private transient Map<K, Collection<V>> map;
private transient int totalSize;

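transient: a modifier for member variables. A field marked transient does not take part in default serialization.

Besides fields marked with this keyword, static fields in Java are never serialized either, whether or not they are marked transient.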
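The effect of transient is easy to observe with a serialization round trip. The sketch below uses only the JDK; TransientDemo and Session are made-up names, and the in-memory byte array stands in for a file or network stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class; only JDK APIs are used.
class TransientDemo {
    static class Session implements Serializable {
        private static final long serialVersionUID = 1L;

        String user;        // written by default serialization
        transient int hits; // skipped; comes back as the default value 0

        Session(String user, int hits) {
            this.user = user;
            this.hits = hits;
        }
    }

    /** Serializes the object into a byte array and reads it back. */
    static Session roundTrip(Session original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Session) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Session copy = roundTrip(new Session("bob", 7));
        System.out.println(copy.user + " " + copy.hits); // bob 0
    }
}
```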

BiMap

To look up a key by its value in a Map, the traditional approach is to maintain two Maps and to keep the value-to-key mapping in sync whenever a key and value are stored. For example:

Map<String, Integer> nameToId = Maps.newHashMap();
Map<Integer, String> idToName = Maps.newHashMap();

nameToId.put("Bob", 42);
idToName.put(42, "Bob");

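This is bug-prone, and it gets confusing when a value is already present in the map.

To handle this situation, Guava provides the BiMap collection.

Differences between BiMap and Map

A BiMap<K, V> is a Map<K, V> with the following additional behavior:

inverse() returns the inverse view, a BiMap<V, K> that maps values back to keys
Values must be unique. Trying to store a value that is already present throws IllegalArgumentException. To drop the existing mapping for that value instead, use BiMap.forcePut(key, value)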
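The uniqueness rule and the difference between put and forcePut can be modeled with two JDK HashMaps kept in sync. MiniBiMap below is a made-up teaching class, not Guava's implementation; it does not support null keys or values and exposes getKey instead of a full inverse() view.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical teaching class; not part of Guava.
final class MiniBiMap<K, V> {
    private final Map<K, V> forward = new HashMap<>();
    private final Map<V, K> backward = new HashMap<>();

    /** Rejects a value that is already bound to a different key. */
    void put(K key, V value) {
        K owner = backward.get(value);
        if (owner != null && !owner.equals(key)) {
            throw new IllegalArgumentException("value already present: " + value);
        }
        forcePut(key, value);
    }

    /** Silently removes any existing entry that conflicts with the new pair. */
    void forcePut(K key, V value) {
        K previousOwner = backward.remove(value);
        if (previousOwner != null) {
            forward.remove(previousOwner);
        }
        V previousValue = forward.put(key, value);
        if (previousValue != null) {
            backward.remove(previousValue);
        }
        backward.put(value, key);
    }

    V get(K key) {
        return forward.get(key);
    }

    /** Reverse lookup: the key currently bound to the value, or null. */
    K getKey(V value) {
        return backward.get(value);
    }

    int size() {
        return forward.size();
    }
}
```

With ("Bob", 42) stored, put("Alice", 42) throws IllegalArgumentException, while forcePut("Alice", 42) replaces the old pair so that 42 maps back to "Alice" and "Bob" is gone.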

Implementations

Key-value Map	Value-key Map	Corresponding BiMap
`HashMap`	`HashMap`	`HashBiMap`
`ImmutableMap`	`ImmutableMap`	`ImmutableBiMap`
`EnumMap`	`EnumMap`	`EnumBiMap`
`EnumMap`	`HashMap`	`EnumHashBiMap`

Table

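When you try to index on more than one key at a time, you usually end up with a structure such as Map<R, Map<C, V>>, which is ugly and awkward to use.

Guava provides a new collection type, Table, which supports this use case for any row type and column type.

rowMap(): returns a Map<R, Map<C, V>> view
rowKeySet(): returns a Set<R>
row(r): returns a non-null Map<C, V>
column(c): returns a non-null Map<R, V>; this is usually somewhat less efficient than row-based access (row(r))
cellSet(): returns the table's cells as a set of Table.Cell<R, C, V>

Implementations

Implementation	Description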
`HashBasedTable`	Essentially backed by `HashMap<R, HashMap<C, V>>`
`TreeBasedTable`	Essentially backed by `TreeMap<R, TreeMap<C, V>>`
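A nested-map layout also suggests why column access tends to cost more than row access. MiniTable below is a made-up sketch on top of JDK HashMaps: a row is one lookup in the outer map, while a column has to be gathered from every row. It returns copies or read-only maps rather than the live views a real Table provides.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical teaching class; not part of Guava.
final class MiniTable<R, C, V> {
    private final Map<R, Map<C, V>> rows = new HashMap<>();

    void put(R rowKey, C columnKey, V value) {
        rows.computeIfAbsent(rowKey, r -> new HashMap<>()).put(columnKey, value);
    }

    /** One lookup in the outer map; never null. */
    Map<C, V> row(R rowKey) {
        Map<C, V> row = rows.get(rowKey);
        return row == null ? Collections.<C, V>emptyMap() : Collections.unmodifiableMap(row);
    }

    /** Visits every row to collect the cells of one column; never null. */
    Map<R, V> column(C columnKey) {
        Map<R, V> result = new HashMap<>();
        for (Map.Entry<R, Map<C, V>> entry : rows.entrySet()) {
            V value = entry.getValue().get(columnKey);
            if (value != null) {
                result.put(entry.getKey(), value);
            }
        }
        return result;
    }
}
```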