ArrayList in Detail

Basic fields

ArrayList stores its elements in an array. The implementation is simple, with only two basic fields.

/**
 * Array that stores the elements; once size equals the array length, the next insertion triggers a resize
 */
transient Object[] elementData;

/**
 * Number of elements stored; in general, size != elementData.length
 */
private int size;

ArrayList provides three constructors.

/**
 * Creates the list with the given initial capacity
 */
public ArrayList(int initialCapacity) {
    if (initialCapacity > 0) {
        this.elementData = new Object[initialCapacity];
    } else if (initialCapacity == 0) {
        this.elementData = EMPTY_ELEMENTDATA;
    } else {
        throw new IllegalArgumentException("Illegal Capacity: "+
                                           initialCapacity);
    }
}

/**
 * The initial array has length 0; the first insertion grows it to 10
 */
public ArrayList() {
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}

/**
 * Constructs the list from another collection
 */
public ArrayList(Collection<? extends E> c) {
    Object[] a = c.toArray();
    if ((size = a.length) != 0) {
        if (c.getClass() == ArrayList.class) {
            elementData = a;
        } else {
            elementData = Arrays.copyOf(a, size, Object[].class);
        }
    } else {
        // replace with empty array.
        elementData = EMPTY_ELEMENTDATA;
    }
}
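
As a small usage sketch of the three constructors listed above (the class and method names ConstructorDemo, build and rejects are made up for illustration):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ConstructorDemo {
    // Builds one list with each of the three constructors.
    static List<List<String>> build() {
        List<String> sized = new ArrayList<>(32);   // backing array allocated up front, size is still 0
        List<String> lazy = new ArrayList<>();      // shared empty array; grows on the first add
        List<String> copied = new ArrayList<>(Arrays.asList("a", "b", "c")); // copies the elements
        return Arrays.asList(sized, lazy, copied);
    }

    // Returns true if the capacity constructor rejects the given value.
    static boolean rejects(int capacity) {
        try {
            new ArrayList<String>(capacity);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        for (List<String> list : build()) {
            System.out.println(list.size() + " " + list);
        }
        System.out.println(rejects(-1)); // negative capacities are rejected
    }
}
```

Note that the capacity passed to the first constructor does not change size: all three lists report only the elements actually added.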

Element operations

Inserting elements

When size == elementData.length at insertion time, the array has to grow, so let's look at growth first.

private Object[] grow(int minCapacity) {
    return elementData = Arrays.copyOf(elementData,
                                       newCapacity(minCapacity));
}

private Object[] grow() {
    return grow(size + 1);
}

private int newCapacity(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if (newCapacity - minCapacity <= 0) {
        if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA)
            return Math.max(DEFAULT_CAPACITY, minCapacity);
        if (minCapacity < 0) // overflow
            throw new OutOfMemoryError();
        return minCapacity;
    }
    return (newCapacity - MAX_ARRAY_SIZE <= 0)
        ? newCapacity
        : hugeCapacity(minCapacity);
}

private static int hugeCapacity(int minCapacity) {
    if (minCapacity < 0) // overflow
        throw new OutOfMemoryError();
    return (minCapacity > MAX_ARRAY_SIZE)
        ? Integer.MAX_VALUE
        : MAX_ARRAY_SIZE;
}
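
To make the growth rule in newCapacity concrete, here is a simplified sketch that replays it for a list created with the no-argument constructor. GrowthDemo, nextCapacity and countGrows are hypothetical names, and the sketch deliberately ignores the overflow and MAX_ARRAY_SIZE handling of the real code.

```java
class GrowthDemo {
    static final int DEFAULT_CAPACITY = 10;

    // Simplified newCapacity: 1.5x growth, with the default-capacity special case
    // for an empty list created by the no-arg constructor. No overflow handling.
    static int nextCapacity(int oldCapacity, int minCapacity) {
        int grown = oldCapacity + (oldCapacity >> 1);
        if (grown < minCapacity) {
            return oldCapacity == 0 ? Math.max(DEFAULT_CAPACITY, minCapacity) : minCapacity;
        }
        return grown;
    }

    // Counts how many reallocations happen while appending n elements one at a time.
    static int countGrows(int n) {
        int capacity = 0;
        int grows = 0;
        for (int size = 0; size < n; size++) {
            if (size == capacity) {
                capacity = nextCapacity(capacity, size + 1);
                grows++;
            }
        }
        return grows;
    }

    public static void main(String[] args) {
        int capacity = 0;
        for (int step = 0; step < 6; step++) {
            capacity = nextCapacity(capacity, capacity + 1);
            System.out.print(capacity + " "); // 10 15 22 33 49 73
        }
        System.out.println();
        System.out.println(countGrows(100)); // 7
    }
}
```

Each reallocation copies every existing element, which is why presizing a list whose final size is known can save work.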

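Growth is done by Arrays.copyOf, which allocates a new array and copies the old contents into it (using System.arraycopy internally). The new array length is chosen as follows:

  1. Normally, the new length is 1.5 times the old one (oldCapacity + (oldCapacity >> 1)); if that is still not larger than minCapacity, minCapacity is used instead.
  2. For a list created with the no-argument constructor, the first insertion grows the array to 10 (the default capacity).
  3. The array length is capped at Integer.MAX_VALUE.

Next, let's look at insertion. ArrayList provides two methods for inserting a single element.
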
private void add(E e, Object[] elementData, int s) {
    if (s == elementData.length)
        elementData = grow();
    elementData[s] = e;
    size = s + 1;
}

/**
 * Appends the element to the end of the array
 */
public boolean add(E e) {
    modCount++;
    add(e, elementData, size);
    return true;
}

/**
 * Inserts the element at the specified position
 */
public void add(int index, E element) {
    rangeCheckForAdd(index);
    modCount++;
    final int s;
    Object[] elementData;
    if ((s = size) == (elementData = this.elementData).length)
        elementData = grow();
    System.arraycopy(elementData, index,
                     elementData, index + 1,
                     s - index);
    elementData[index] = element;
    size = s + 1;
}

private void rangeCheckForAdd(int index) {
    if (index > size || index < 0)
        throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
}
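
A short usage sketch of the two add overloads shown above, including the index range accepted by rangeCheckForAdd (AddDemo, build and rejects are illustrative names):

```java
import java.util.ArrayList;
import java.util.List;

class AddDemo {
    static List<String> build() {
        List<String> list = new ArrayList<>();
        list.add("a");               // append
        list.add("c");               // append
        list.add(1, "b");            // insert at index 1, shifting "c" to the right
        list.add(list.size(), "d");  // index == size is allowed and behaves like append
        return list;
    }

    // Returns true if inserting at the given index into the 4-element list is rejected.
    static boolean rejects(int index) {
        try {
            build().add(index, "x");
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(build());     // [a, b, c, d]
        System.out.println(rejects(5));  // true: 5 > size
        System.out.println(rejects(-1)); // true: negative index
    }
}
```
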
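  • Appending to the end. First check whether the array needs to grow, then assign elementData[size]. The amortized time complexity is O(1).
  • Inserting at a given position. First check that the index lies within [0, size], then check whether the array needs to grow, then move the elements in [index, size - 1] to [index + 1, size]. The move uses System.arraycopy to copy the whole range at once, which is faster than an element-by-element loop. Finally, assign elementData[index].

Inserting in the middle is clearly less efficient than appending, because the elements after the index have to be shifted, so avoid it where possible. ArrayList also provides bulk insertion methods.
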
public boolean addAll(Collection<? extends E> c) {
    Object[] a = c.toArray();
    modCount++;
    int numNew = a.length;
    if (numNew == 0)
        return false;
    Object[] elementData;
    final int s;
    if (numNew > (elementData = this.elementData).length - (s = size))
        elementData = grow(s + numNew);
    System.arraycopy(a, 0, elementData, s, numNew);
    size = s + numNew;
    return true;
}

public boolean addAll(int index, Collection<? extends E> c) {
    rangeCheckForAdd(index);

    Object[] a = c.toArray();
    modCount++;
    int numNew = a.length;
    if (numNew == 0)
        return false;
    Object[] elementData;
    final int s;
    if (numNew > (elementData = this.elementData).length - (s = size))
        elementData = grow(s + numNew);

    int numMoved = s - index;
    if (numMoved > 0)
        System.arraycopy(elementData, index,
                         elementData, index + numNew,
                         numMoved);
    System.arraycopy(a, 0, elementData, index, numNew);
    size = s + numNew;
    return true;
}
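  • Appending at the end. First check whether the array needs to grow; the minimum capacity requested is the current element count plus the number of new elements (s + numNew). Then bulk-copy the new elements to the end of the array.
  • Inserting at a given position. First check that the index lies within [0, size], then check whether the array needs to grow, then use an array copy to move the elements in [index, size) to [index + numNew, size + numNew), and finally copy the new elements into [index, index + numNew).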
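
The two addAll variants described above can be tried with a sketch like the following (AddAllDemo and its methods are illustrative; List.of assumes Java 9 or later):

```java
import java.util.ArrayList;
import java.util.List;

class AddAllDemo {
    static List<Integer> merge() {
        List<Integer> list = new ArrayList<>(List.of(1, 2, 6));
        list.addAll(List.of(7, 8));        // append: [1, 2, 6, 7, 8]
        list.addAll(2, List.of(3, 4, 5));  // insert at index 2, shifting 6, 7, 8 right by three
        return list;
    }

    // addAll returns false when the argument is empty (numNew == 0 in the listing).
    static boolean addEmpty() {
        List<Integer> list = new ArrayList<>(List.of(1));
        List<Integer> nothing = new ArrayList<>();
        return list.addAll(nothing);
    }

    public static void main(String[] args) {
        System.out.println(merge());    // [1, 2, 3, 4, 5, 6, 7, 8]
        System.out.println(addEmpty()); // false
    }
}
```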

Removing elements

/**
 * Removes the element at the specified position
 */
public E remove(int index) {
    Objects.checkIndex(index, size);
    final Object[] es = elementData;

    @SuppressWarnings("unchecked") E oldValue = (E) es[index];
    fastRemove(es, index);

    return oldValue;
}

/**
 * Removes the first occurrence of the specified element
 */
public boolean remove(Object o) {
    final Object[] es = elementData;
    final int size = this.size;
    int i = 0;
    found: {
        if (o == null) {
            for (; i < size; i++)
                if (es[i] == null)
                    break found;
        } else {
            for (; i < size; i++)
                if (o.equals(es[i]))
                    break found;
        }
        return false;
    }
    fastRemove(es, i);
    return true;
}

private void fastRemove(Object[] es, int i) {
    modCount++;
    final int newSize;
    if ((newSize = size - 1) > i)
        System.arraycopy(es, i + 1, es, i, newSize - i);
    es[size = newSize] = null;
}
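
One practical consequence of having both remove(int) and remove(Object) shows up with List<Integer>: an int argument selects the index overload, while an Integer argument selects the element overload. A minimal sketch (RemoveDemo and its method names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

class RemoveDemo {
    static List<Integer> removeByIndex() {
        List<Integer> list = new ArrayList<>(List.of(10, 20, 30, 1));
        list.remove(1);                   // int argument: removes the element at index 1 (20)
        return list;
    }

    static List<Integer> removeByValue() {
        List<Integer> list = new ArrayList<>(List.of(10, 20, 30, 1));
        list.remove(Integer.valueOf(1));  // Integer argument: removes the first element equal to 1
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeByIndex()); // [10, 30, 1]
        System.out.println(removeByValue()); // [10, 20, 30]
    }
}
```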

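When removing a given element, its index is located first and then the removal is performed. To remove the element at a given index, all elements after it are shifted forward by one position, the last slot is set to null, and size is updated.

ArrayList also provides methods for removing elements in bulk.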

/**
 * Removes the elements contained in collection c
 */
public boolean removeAll(Collection<?> c) {
    return batchRemove(c, false, 0, size);
}

/**
 * Retains only the elements contained in collection c and removes all others
 */
public boolean retainAll(Collection<?> c) {
    return batchRemove(c, true, 0, size);
}

boolean batchRemove(Collection<?> c, boolean complement,
                    final int from, final int end) {
    Objects.requireNonNull(c);
    final Object[] es = elementData;
    int r;
    // Optimize for initial run of survivors
    for (r = from;; r++) {
        if (r == end)
            return false;
        if (c.contains(es[r]) != complement)
            break;
    }
    int w = r++;
    try {
        for (Object e; r < end; r++)
            if (c.contains(e = es[r]) == complement)
                es[w++] = e;
    } catch (Throwable ex) {
        // Preserve behavioral compatibility with AbstractCollection,
        // even if c.contains() throws.
        System.arraycopy(es, r, es, w, end - r);
        w += end - r;
        throw ex;
    } finally {
        modCount += end - w;
        shiftTailOverGap(es, w, end);
    }
    return true;
}

private void shiftTailOverGap(Object[] es, int lo, int hi) {
    System.arraycopy(es, hi, es, lo, size - hi);
    for (int to = size, i = (size -= hi - lo); i < to; i++)
        es[i] = null;
}
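
The core of batchRemove is a read/write two-pointer compaction. The following is a simplified sketch of that idea on a plain array, not the JDK code: it omits the initial survivor scan, exception handling and modCount, and CompactDemo and compact are hypothetical names.

```java
import java.util.Arrays;
import java.util.Collection;

class CompactDemo {
    // Keeps the elements of es[0, size) whose membership in c equals keepIfContained,
    // packing them at the front of the array. Returns the new logical size.
    static int compact(Object[] es, int size, Collection<?> c, boolean keepIfContained) {
        int w = 0;                                 // write index: next slot for a survivor
        for (int r = 0; r < size; r++) {           // read index: scans every element once
            if (c.contains(es[r]) == keepIfContained) {
                es[w++] = es[r];
            }
        }
        Arrays.fill(es, w, size, null);            // clear the vacated tail slots
        return w;
    }

    public static void main(String[] args) {
        Object[] a = {1, 2, 3, 4, 5};
        int n = compact(a, 5, Arrays.asList(2, 4), false); // like removeAll
        System.out.println(n + " " + Arrays.toString(a));  // 3 [1, 3, 5, null, null]

        Object[] b = {1, 2, 3, 4, 5};
        int m = compact(b, 5, Arrays.asList(2, 4), true);  // like retainAll
        System.out.println(m + " " + Arrays.toString(b));  // 2 [2, 4, null, null, null]
    }
}
```

Passing false corresponds to removeAll and true to retainAll, mirroring the complement parameter in the listing.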

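Bulk removal first locates the index of the first element to remove (the first loop), then moves the surviving elements of elementData forward according to the condition (the second loop), and finally sets the leftover trailing slots to null (shiftTailOverGap).

Since JDK 8, ArrayList also provides methods to remove the elements that match a given condition and to replace every element according to a given rule.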

/**
 * Removes the elements that satisfy the given predicate
 */
@Override
public boolean removeIf(Predicate<? super E> filter) {
    return removeIf(filter, 0, size);
}

boolean removeIf(Predicate<? super E> filter, int i, final int end) {
    Objects.requireNonNull(filter);
    int expectedModCount = modCount;
    final Object[] es = elementData;
    // Optimize for initial run of survivors
    for (; i < end && !filter.test(elementAt(es, i)); i++)
        ;
    // Tolerate predicates that reentrantly access the collection for
    // read (but writers still get CME), so traverse once to find
    // elements to delete, a second pass to physically expunge.
    if (i < end) {
        final int beg = i;
        final long[] deathRow = nBits(end - beg);
        deathRow[0] = 1L;   // set bit 0
        for (i = beg + 1; i < end; i++)
            if (filter.test(elementAt(es, i)))
                setBit(deathRow, i - beg);
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
        modCount++;
        int w = beg;
        for (i = beg; i < end; i++)
            if (isClear(deathRow, i - beg))
                es[w++] = es[i];
        shiftTailOverGap(es, w, end);
        return true;
    } else {
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
        return false;
    }
}

private static long[] nBits(int n) {
    return new long[((n - 1) >> 6) + 1];
}
private static void setBit(long[] bits, int i) {
    bits[i >> 6] |= 1L << i;
}
private static boolean isClear(long[] bits, int i) {
    return (bits[i >> 6] & (1L << i)) == 0;
}

/**
 * Replaces each element with the result of applying the operator
 */
@Override
public void replaceAll(UnaryOperator<E> operator) {
    replaceAllRange(operator, 0, size);
    modCount++;
}

private void replaceAllRange(UnaryOperator<E> operator, int i, int end) {
    Objects.requireNonNull(operator);
    final int expectedModCount = modCount;
    final Object[] es = elementData;
    for (; modCount == expectedModCount && i < end; i++)
        es[i] = operator.apply(elementAt(es, i));
    if (modCount != expectedModCount)
        throw new ConcurrentModificationException();
}

/**
 * Replaces the element at the specified position
 */
public E set(int index, E element) {
    Objects.checkIndex(index, size);
    E oldValue = elementData(index);
    elementData[index] = element;
    return oldValue;
}
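  • Removing elements that match a condition. First find the index of the first matching element, then traverse the rest of the array and record the indices of the elements to remove in a bitmap. A second traversal then moves the surviving elements forward (similar to bulk removal), and finally the leftover slots are set to null.
  • Replacing all elements according to a rule is simple: traverse the array and replace each element.

Here is an example:
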
public static void main(String[] args) {
    int size = 10;
    List<Integer> list = new ArrayList<>(size);
    for (int i = 0; i < size; i++) {
        list.add(i);
    }

    list.removeIf(i -> i > 5);
    System.out.println(list);
    list.replaceAll(i -> i + 5);
    System.out.println(list);
}

Output:

[0, 1, 2, 3, 4, 5]
[5, 6, 7, 8, 9, 10]
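
The bitmap used by removeIf can also be looked at in isolation. In the sketch below, the three helpers are copied from the listing above, while BitmapDemo and its removeIf over an int[] are simplified, hypothetical stand-ins that only show the mark-then-compact idea.

```java
import java.util.Arrays;
import java.util.function.IntPredicate;

class BitmapDemo {
    // One bit per element, packed into 64-bit words (same helpers as in the listing).
    static long[] nBits(int n) {
        return new long[((n - 1) >> 6) + 1];
    }
    static void setBit(long[] bits, int i) {
        bits[i >> 6] |= 1L << i;   // long shifts use only the low 6 bits of i
    }
    static boolean isClear(long[] bits, int i) {
        return (bits[i >> 6] & (1L << i)) == 0;
    }

    // Pass 1 marks the elements to delete; pass 2 copies the survivors.
    static int[] removeIf(int[] values, IntPredicate filter) {
        long[] deathRow = nBits(values.length);
        for (int i = 0; i < values.length; i++) {
            if (filter.test(values[i])) {
                setBit(deathRow, i);
            }
        }
        int[] out = new int[values.length];
        int w = 0;
        for (int i = 0; i < values.length; i++) {
            if (isClear(deathRow, i)) {
                out[w++] = values[i];
            }
        }
        return Arrays.copyOf(out, w);
    }

    public static void main(String[] args) {
        int[] values = new int[100];
        for (int i = 0; i < values.length; i++) {
            values[i] = i;
        }
        int[] odd = removeIf(values, v -> v % 2 == 0);
        System.out.println(odd.length + " " + odd[0] + " " + odd[49]); // 50 1 99
    }
}
```

Separating the marking pass from the compaction pass is what lets the real removeIf run the predicate on every element before any element is moved.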

Accessing elements

Accessing an element is simple: it is fetched by index, with O(1) time complexity.

public E get(int index) {
    Objects.checkIndex(index, size);
    return elementData(index);
}

E elementData(int index) {
    return (E) elementData[index];
}
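
Because the bounds check is against size rather than the array length, a presized but empty list still rejects get(0). A brief sketch (GetDemo and throwsOnGet are illustrative names):

```java
import java.util.ArrayList;
import java.util.List;

class GetDemo {
    // Returns true if list.get(index) throws IndexOutOfBoundsException.
    static boolean throwsOnGet(List<String> list, int index) {
        try {
            list.get(index);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(10);       // capacity 10, size 0
        System.out.println(throwsOnGet(list, 0));      // true: index 0 is not below size 0
        list.add("a");
        System.out.println(throwsOnGet(list, 0));      // false
        System.out.println(list.set(0, "b"));          // a (set returns the old value)
        System.out.println(list.get(0));               // b
    }
}
```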

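There are also methods for checking whether the list contains an element and for finding an element's index. They are all straightforward linear scans.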

public boolean contains(Object o) {
    return indexOf(o) >= 0;
}
    
public int indexOf(Object o) {
    return indexOfRange(o, 0, size);
}
    
int indexOfRange(Object o, int start, int end) {
    Object[] es = elementData;
    if (o == null) {
        for (int i = start; i < end; i++) {
            if (es[i] == null) {
                return i;
            }
        }
    } else {
        for (int i = start; i < end; i++) {
            if (o.equals(es[i])) {
                return i;
            }
        }
    }
    return -1;
}
    
public int lastIndexOf(Object o) {
    return lastIndexOfRange(o, 0, size);
}

int lastIndexOfRange(Object o, int start, int end) {
    Object[] es = elementData;
    if (o == null) {
        for (int i = end - 1; i >= start; i--) {
            if (es[i] == null) {
                return i;
            }
        }
    } else {
        for (int i = end - 1; i >= start; i--) {
            if (o.equals(es[i])) {
                return i;
            }
        }
    }
    return -1;
}
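
A quick usage sketch of these lookups with duplicates and null elements (SearchDemo and sample are illustrative names):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SearchDemo {
    static List<String> sample() {
        return new ArrayList<>(Arrays.asList("a", null, "b", "a", null));
    }

    public static void main(String[] args) {
        List<String> list = sample();
        System.out.println(list.indexOf("a"));      // 0: first match scanning forward
        System.out.println(list.lastIndexOf("a"));  // 3: first match scanning backward
        System.out.println(list.indexOf(null));     // 1: null is compared with ==
        System.out.println(list.lastIndexOf(null)); // 4
        System.out.println(list.indexOf("z"));      // -1: not found
        System.out.println(list.contains("z"));     // false
    }
}
```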

Miscellaneous

equals

public boolean equals(Object o) {
    if (o == this) {
        return true;
    }

    if (!(o instanceof List)) {
        return false;
    }

    final int expectedModCount = modCount;
    // ArrayList can be subclassed and given arbitrary behavior, but we can
    // still deal with the common case where o is ArrayList precisely
    boolean equal = (o.getClass() == ArrayList.class)
        ? equalsArrayList((ArrayList<?>) o)
        : equalsRange((List<?>) o, 0, size);

    checkForComodification(expectedModCount);
    return equal;
 }

boolean equalsRange(List<?> other, int from, int to) {
    final Object[] es = elementData;
    if (to > es.length) {
        throw new ConcurrentModificationException();
    }
    var oit = other.iterator();
    for (; from < to; from++) {
        if (!oit.hasNext() || !Objects.equals(es[from], oit.next())) {
            return false;
        }
    }
    return !oit.hasNext();
}

private boolean equalsArrayList(ArrayList<?> other) {
    final int otherModCount = other.modCount;
    final int s = size;
    boolean equal;
    if (equal = (s == other.size)) {
        final Object[] otherEs = other.elementData;
        final Object[] es = elementData;
        if (s > es.length || s > otherEs.length) {
            throw new ConcurrentModificationException();
        }
        for (int i = 0; i < s; i++) {
             if (!Objects.equals(es[i], otherEs[i])) {
                equal = false;
                break;
            }
        }
    }
    other.checkForComodification(otherModCount);
    return equal;
}

private void checkForComodification(final int expectedModCount) {
    if (modCount != expectedModCount) {
        throw new ConcurrentModificationException();
    }
}

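As shown, an ArrayList can be compared not only with another ArrayList but with any class that implements the List interface: as long as both hold equal elements in the same order, they are equal. Other List implementations are expected to override equals in line with the same List contract.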
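
A small sketch of equality across List implementations (EqualsDemo and its method names are illustrative):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;

class EqualsDemo {
    static List<Integer> base() {
        return new ArrayList<>(Arrays.asList(1, 2, 3));
    }

    static boolean equalsLinkedList() {
        List<Integer> linked = new LinkedList<>(Arrays.asList(1, 2, 3));
        // Same elements in the same order: equal in both directions.
        return base().equals(linked) && linked.equals(base());
    }

    static boolean equalsReordered() {
        return base().equals(Arrays.asList(3, 2, 1)); // order matters
    }

    static boolean equalsSet() {
        return base().equals(new HashSet<>(base()));  // a Set is not a List
    }

    public static void main(String[] args) {
        System.out.println(equalsLinkedList()); // true
        System.out.println(equalsReordered());  // false
        System.out.println(equalsSet());        // false
    }
}
```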

hashCode

public int hashCode() {
    int expectedModCount = modCount;
    int hash = hashCodeRange(0, size);
    checkForComodification(expectedModCount);
    return hash;
}

int hashCodeRange(int from, int to) {
    final Object[] es = elementData;
    if (to > es.length) {
        throw new ConcurrentModificationException();
    }
    int hashCode = 1;
    for (int i = from; i < to; i++) {
        Object e = es[i];
        hashCode = 31 * hashCode + (e == null ? 0 : e.hashCode());
    }
    return hashCode;
}

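As shown, ArrayList computes its hashCode from the elements it stores, so the value changes whenever the contents change. For that reason, an ArrayList that may still be modified should generally not be used as a HashMap key or a HashSet element.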
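
The following sketch shows what can go wrong when a list is modified after being added to a HashSet (MutableKeyDemo and its method names are illustrative):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class MutableKeyDemo {
    // Hash code of a list holding the given values, computed by ArrayList itself.
    static int hashOf(Integer... values) {
        return new ArrayList<>(List.of(values)).hashCode();
    }

    // Returns whether the set still finds the list after it was modified in place.
    static boolean foundAfterMutation() {
        List<Integer> key = new ArrayList<>(List.of(1));
        Set<List<Integer>> set = new HashSet<>();
        set.add(key);               // stored under the hash code of [1]
        key.add(2);                 // the list's hash code is now that of [1, 2]
        return set.contains(key);   // the lookup uses the new hash code
    }

    public static void main(String[] args) {
        System.out.println(hashOf(1));            // 32  = 31 * 1 + 1
        System.out.println(hashOf(1, 2));         // 994 = 31 * 32 + 2
        System.out.println(foundAfterMutation()); // false
    }
}
```

The set still holds the list, but a lookup can no longer reach it because the stored hash and the current hash differ.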

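Serialization

In ArrayList, the elementData field is marked transient, which means default serialization skips it. However, ArrayList defines its own writeObject and readObject methods, which implement the serialization and deserialization of elementData.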

private void writeObject(java.io.ObjectOutputStream s)
    throws java.io.IOException {
    // Write out element count, and any hidden stuff
    int expectedModCount = modCount;
    s.defaultWriteObject();

    // Write out size as capacity for behavioral compatibility with clone()
    s.writeInt(size);

    // Write out all elements in the proper order.
    for (int i=0; i<size; i++) {
        s.writeObject(elementData[i]);
    }

    if (modCount != expectedModCount) {
        throw new ConcurrentModificationException();
    }
}
    
private void readObject(java.io.ObjectInputStream s)
    throws java.io.IOException, ClassNotFoundException {

    // Read in size, and any hidden stuff
    s.defaultReadObject();

    // Read in capacity
    s.readInt(); // ignored

    if (size > 0) {
        // like clone(), allocate array based upon size not capacity
        SharedSecrets.getJavaObjectInputStreamAccess().checkArray(s, Object[].class, size);
        Object[] elements = new Object[size];

        // Read in all elements in the proper order.
        for (int i = 0; i < size; i++) {
            elements[i] = s.readObject();
        }

        elementData = elements;
    } else if (size == 0) {
        elementData = EMPTY_ELEMENTDATA;
    } else {
        throw new java.io.InvalidObjectException("Invalid size: " + size);
    }
}

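As for why it is customized, my understanding is that elementData is a dynamic array whose unused tail slots (those beyond size) hold null, so the custom logic serializes only the first size elements and avoids writing those unnecessary slots.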
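
One way to observe this is to serialize two lists with the same elements but very different capacities and compare the byte counts. A sketch, assuming in-memory streams (SerializationDemo and its methods are hypothetical helpers):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

class SerializationDemo {
    // Serializes the list to a byte array; checked exceptions are wrapped for brevity.
    static byte[] toBytes(ArrayList<?> list) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(list);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    // Reads a list back from the bytes produced by toBytes.
    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    static ArrayList<String> withCapacity(int capacity) {
        ArrayList<String> list = new ArrayList<>(capacity);
        list.addAll(List.of("a", "b", "c"));
        return list;
    }

    public static void main(String[] args) {
        byte[] small = toBytes(withCapacity(3));
        byte[] large = toBytes(withCapacity(10_000));
        System.out.println(small.length == large.length); // true: unused slots are not written
        System.out.println(fromBytes(large));             // [a, b, c]
    }
}
```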

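Iterators

  • ArrayList defines its own iterators; the one returned by listIterator() can traverse the elements in both directions (next and previous).
  • A foreach loop over an ArrayList actually goes through its iterator, as can be seen by decompiling the compiled code.

// Original code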
List<Integer> list = new ArrayList<>();
for (Integer i : list) {
    System.out.println(i);
}

// Code after compilation (decompiled)
List<Integer> list = new ArrayList();
Iterator var2 = list.iterator();

while(var2.hasNext()) {
    Integer i = (Integer)var2.next();
    System.out.println(i);
}

So do not modify the ArrayList while traversing it with foreach: a structural modification changes modCount, and the iterator then throws ConcurrentModificationException.

public E next() {
    checkForComodification();
    int i = cursor;
    if (i >= size)
        throw new NoSuchElementException();
    Object[] elementData = ArrayList.this.elementData;
    if (i >= elementData.length)
        throw new ConcurrentModificationException();
    cursor = i + 1;
    return (E) elementData[lastRet = i];
}
        
final void checkForComodification() {
    if (modCount != expectedModCount)
        throw new ConcurrentModificationException();
}
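
The sketch below contrasts a modification inside foreach with removal through the iterator itself, and also shows backward traversal with listIterator (IterationDemo and its method names are illustrative):

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;
import java.util.ListIterator;

class IterationDemo {
    // Removing through the list inside foreach changes modCount behind the iterator's back.
    static boolean removeInForEachFails() {
        List<Integer> list = new ArrayList<>(List.of(1, 2, 3, 4));
        try {
            for (Integer i : list) {
                if (i == 1) {
                    list.remove(i);   // remove(Object); the following next() detects the change
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Iterator.remove keeps the iterator's expectedModCount in sync.
    static List<Integer> removeEvensWithIterator() {
        List<Integer> list = new ArrayList<>(List.of(1, 2, 3, 4));
        for (Iterator<Integer> it = list.iterator(); it.hasNext();) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
        return list;
    }

    // Walks the list from the end using previous().
    static List<Integer> reversed(List<Integer> source) {
        List<Integer> out = new ArrayList<>(source.size());
        for (ListIterator<Integer> it = source.listIterator(source.size()); it.hasPrevious();) {
            out.add(it.previous());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(removeInForEachFails());      // true
        System.out.println(removeEvensWithIterator());   // [1, 3]
        System.out.println(reversed(List.of(1, 2, 3)));  // [3, 2, 1]
    }
}
```

For simple conditional removal, removeIf (shown earlier) is usually the shorter option.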

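Summary

  1. ArrayList is essentially a dynamic array. It accepts null elements and is not thread-safe.
  2. The hashCode of an ArrayList changes as the elements it stores change.
  3. Insertions and removals away from the tail are relatively expensive, so use them sparingly; specify the capacity at construction where possible to reduce the number of resizes.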