ArrayList, LinkedList, Vector

Preface

This article walks through the source code of three List implementations.

1. ArrayList

public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable

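ArrayList constructors and fields

ArrayList implements the List interface with a dynamic array. Index-based access (get, set) is O(1), while inserting or removing at an arbitrary position is O(n) because the following elements have to be shifted. Thanks to dynamic resizing, appending with add is amortized O(1).

Let's start with the fields of the class: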

@java.io.Serial
private static final long serialVersionUID = 8683452581122892189L;
private static final int DEFAULT_CAPACITY = 10;
private static final Object[] EMPTY_ELEMENTDATA = {};
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};
transient Object[] elementData;
private int size;

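serialVersionUID
Version identifier of a serializable class: Java Serialization
DEFAULT_CAPACITY
The default initial capacity of the array, 10
EMPTY_ELEMENTDATA
Shared empty array {} used when the requested capacity is 0
DEFAULTCAPACITY_EMPTY_ELEMENTDATA
Shared empty array {} used for default-sized lists, so the first insertion knows to grow to DEFAULT_CAPACITY
elementData
The backing array; when it is the default empty array, the first insertion expands it automatically
size
The number of elements in the list (not the length of the backing array)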


Next, let's look at how some of the constructors are implemented:

// List<Integer> list = new ArrayList<>(20);
public ArrayList(int initialCapacity) {
    if (initialCapacity > 0) {
        this.elementData = new Object[initialCapacity];
    } else if (initialCapacity == 0) {
        this.elementData = EMPTY_ELEMENTDATA;
    } else {
        throw new IllegalArgumentException("Illegal Capacity: " + initialCapacity);
    }
}

elementData is allocated with the requested capacity.
A negative capacity throws IllegalArgumentException.
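
To make the three branches concrete, here is a minimal sketch that probes the constructor from the outside. The class name CapacityArgDemo and the helper rejects are made up for this illustration; only the ArrayList(int) constructor itself comes from the JDK.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class CapacityArgDemo {
    // Returns true if ArrayList refuses the given initial capacity.
    static boolean rejects(int initialCapacity) {
        try {
            List<Integer> list = new ArrayList<>(initialCapacity);
            return list == null; // always false: construction succeeded
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejects(20)); // false: positive capacity
        System.out.println(rejects(0));  // false: zero is allowed
        System.out.println(rejects(-1)); // true: negative capacity
    }
}
```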

public ArrayList(Collection<? extends E> c) {
    Object[] a = c.toArray();
    if ((size = a.length) != 0) {
        if (c.getClass() == ArrayList.class) {
            elementData = a;
        } else {
            elementData = Arrays.copyOf(a, size, Object[].class);
        }
    } else {
        // replace with empty array.
        elementData = EMPTY_ELEMENTDATA;
    }
}

The new list is built in the order returned by the collection's iterator.
This constructor uses Arrays.copyOf() to copy the array.
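
A small usage sketch of the copy constructor, assuming a LinkedHashSet as the source because its iteration order is predictable. CopyConstructorDemo and copyOf are illustrative names only. It shows that the copy keeps the iterator order and is independent of the source afterwards.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class, not part of the JDK.
class CopyConstructorDemo {
    // Builds a new list from any set, in the set's iteration order.
    static List<String> copyOf(Set<String> source) {
        return new ArrayList<>(source);
    }

    public static void main(String[] args) {
        Set<String> source = new LinkedHashSet<>();
        source.add("c");
        source.add("a");
        source.add("b");

        List<String> copy = copyOf(source);
        source.add("d"); // later changes to the source do not reach the copy

        System.out.println(copy); // [c, a, b]
    }
}
```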

ArrayList growth

Without further ado, here are the details of the grow() function:

private Object[] grow(int minCapacity) {
    int oldCapacity = elementData.length;
    if (oldCapacity > 0 || elementData != DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        int newCapacity = ArraysSupport.newLength(oldCapacity,
                minCapacity - oldCapacity, /* minimum growth */
                oldCapacity >> 1           /* preferred growth */);
        return elementData = Arrays.copyOf(elementData, newCapacity);
    } else {
        return elementData = new Object[Math.max(DEFAULT_CAPACITY, minCapacity)];
    }
}

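Growth happens in two situations: the first insertion into a default-initialized list, and regular growth. Regular growth uses ArraysSupport.newLength to compute the new array length. The first insertion into the default empty array simply allocates the larger of DEFAULT_CAPACITY and the requested minCapacity.

grow() itself has no interesting details,
so let's dig further into ArraysSupport.newLength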

public static int newLength(int oldLength, int minGrowth, int prefGrowth) {
   int newLength = Math.max(minGrowth, prefGrowth) + oldLength;
   if (newLength - MAX_ARRAY_LENGTH <= 0) {
       return newLength;
   }
   return hugeLength(oldLength, minGrowth);
}

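Now it is clear: prefGrowth = oldCapacity >> 1, half of the old capacity rounded down.
If minGrowth is larger than that 50%, the array grows by minGrowth.
If minGrowth is smaller, the array grows by oldCapacity >> 1, which is 50% or slightly less for odd capacities.
Finally, if the new length would exceed MAX_ARRAY_LENGTH (Integer.MAX_VALUE - 8),
hugeLength takes over and the length is capped near the integer limit.

The upper bound on ArrayList length is Integer.MAX_VALUE
A regular growth step adds 50%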
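
The arithmetic can be sketched as a simplified model of the regular growth path. GrowthSketch and newCapacity are names invented here, and the overflow handling done by hugeLength is deliberately left out, so treat this as an approximation rather than the JDK code.

```java
// Hypothetical model of ArrayList's regular growth step, not JDK code.
class GrowthSketch {
    // Grow by at least (minCapacity - oldCapacity), preferably by half of
    // the old capacity. Overflow handling is omitted in this sketch.
    static int newCapacity(int oldCapacity, int minCapacity) {
        int minGrowth = minCapacity - oldCapacity;
        int prefGrowth = oldCapacity >> 1;
        return oldCapacity + Math.max(minGrowth, prefGrowth);
    }

    public static void main(String[] args) {
        int capacity = 10;
        for (int i = 0; i < 5; i++) {
            System.out.print(capacity + " ");
            capacity = newCapacity(capacity, capacity + 1); // one more slot needed
        }
        System.out.println(); // prints: 10 15 22 33 49
    }
}
```

Starting from the default capacity of 10, repeated single-element overflows give 15, 22, 33, 49. A bulk request that needs more than 50% wins instead: newCapacity(10, 30) is 30.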

ArrayList add and remove

With the growth mechanism covered, let's look at the add() function in detail

public boolean add(E e) {
    modCount++;
    add(e, elementData, size);
    return true;
}
private void add(E e, Object[] elementData, int s) {
    if (s == elementData.length)
        elementData = grow();
    elementData[s] = e;
    size = s + 1;
}

Nothing special here either:
grow if the array is full, then store the new element E e in the array

Next, the remove() function.
Could it shrink the capacity?

public E remove(int index) {
    Objects.checkIndex(index, size);
    final Object[] es = elementData;
    @SuppressWarnings("unchecked") E oldValue = (E) es[index];
    fastRemove(es, index);
    return oldValue;
}

Well, another unremarkable function.
Note, however, that it calls fastRemove() to perform the deletion:

private void fastRemove(Object[] es, int i) {
    modCount++;
    final int newSize;
    if ((newSize = size - 1) > i)
        System.arraycopy(es, i + 1, es, i, newSize - i);
    es[size = newSize] = null;
}
//arraycopy: Copies an array from the specified source array,
//beginning at the specified position, 
//to the specified position of the destination array.

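As you can see, arraycopy deletes es[i] by overwriting: the elements after it are shifted one slot to the left, and the last slot is then set to null. The array itself is never shrunk.

System.arraycopy works on memory directly and is efficient,
but it cannot allocate new memory: both arrays must already exist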
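
The same overwrite-and-shift idea can be tried on a plain array. This is a sketch modeled on fastRemove; ArrayRemoveSketch and removeAt are illustrative names, and the logical size is passed in explicitly because there is no surrounding list object.

```java
import java.util.Arrays;

// Hypothetical demo class modeled on the shifting logic, not JDK code.
class ArrayRemoveSketch {
    // Removes the element at index i from the first `size` slots of es and
    // returns the new logical size. The array length never changes.
    static int removeAt(Object[] es, int size, int i) {
        int newSize = size - 1;
        if (newSize > i) {
            // Shift everything after i one slot to the left.
            System.arraycopy(es, i + 1, es, i, newSize - i);
        }
        es[newSize] = null; // clear the stale last slot
        return newSize;
    }

    public static void main(String[] args) {
        Object[] es = {"a", "b", "c", "d", null, null};
        int size = removeAt(es, 4, 1);
        System.out.println(Arrays.toString(es)); // [a, c, d, null, null, null]
        System.out.println(size + " of " + es.length); // 3 of 6
    }
}
```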

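Understanding modCount

Browse the source and you will notice that modCount shows up all over the place

Following the reference shows that modCount comes from
public abstract class AbstractList<E>
Being lazy, I will just paste the official explanation: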

/**
* The number of times this list has been <i>structurally modified</i>.
* Structural modifications are those that change the size of the
* list, or otherwise perturb it in such a fashion that iterations in
* progress may yield incorrect results.
*
* <p>This field is used by the iterator and list iterator implementation
* returned by the {@code iterator} and {@code listIterator} methods.
* If the value of this field changes unexpectedly, the iterator (or list
* iterator) will throw a {@code ConcurrentModificationException} in
* response to the {@code next}, {@code remove}, {@code previous},
* {@code set} or {@code add} operations.  This provides
* <i>fail-fast</i> behavior, rather than non-deterministic behavior in
* the face of concurrent modification during iteration.
*/

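In short, it exists for the iterator and list iterator implementations.
When the list is structurally modified behind an iterator's back,
the iterator throws ConcurrentModificationException.
Methods that may throw include next(), remove(), and set()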

modCount makes fail-fast behavior
possible instead of non-deterministic behavior
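
A short demonstration of the fail-fast behavior, written as a sketch with made-up names (FailFastDemo and its two helpers). The first helper modifies the list directly during a for-each loop, the second removes through the iterator, which keeps the iterator's expected modCount in sync.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class FailFastDemo {
    // Removes an element directly from the list while iterating over it.
    static boolean modifyDuringIteration() {
        List<Integer> list = new ArrayList<>(List.of(1, 2, 3, 4));
        try {
            for (Integer n : list) {
                if (n == 1) {
                    list.remove(n); // remove(Object): changes modCount behind the iterator
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // the following next() noticed the mismatch
        }
    }

    // Removes even numbers through the iterator, which is allowed.
    static List<Integer> removeViaIterator() {
        List<Integer> list = new ArrayList<>(List.of(1, 2, 3, 4));
        for (Iterator<Integer> it = list.iterator(); it.hasNext();) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(modifyDuringIteration()); // true
        System.out.println(removeViaIterator());     // [1, 3]
    }
}
```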

2. LinkedList

Now let's talk about how the linked-list implementation of List works.

public class LinkedList<E>
    extends AbstractSequentialList<E>
    implements List<E>, Deque<E>, Cloneable, java.io.Serializable

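LinkedList constructors and fields

LinkedList implements the List interface with a doubly linked list. It also implements Deque, so it provides operations on both ends of the list. Because of the doubly linked structure, inserting or removing at a known node is O(1), while finding an element by traversal is O(n).

As usual, let's first look at how the source implements the linked list: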

    @java.io.Serial
    private static final long serialVersionUID = 876323262645176354L;
    transient int size = 0; 
    transient Node<E> first;
    transient Node<E> last;
    
    private static class Node<E> {
        E item;
        Node<E> next;
        Node<E> prev;

        Node(Node<E> prev, E element, Node<E> next) {
            this.item = element;
            this.next = next;
            this.prev = prev;
        }
    }

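Node is the data structure that makes up the doubly linked list: each node holds an item plus links to the previous and next nodes
size tracks the number of elements

The constructor is just as blunt: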

public LinkedList(Collection<? extends E> c) {
    this();
    addAll(c);
}

LinkedList lookup

Lookups in a LinkedList are generally less efficient than in an ArrayList.
Let's look at the contains() function:

public boolean contains(Object o) {
    return indexOf(o) >= 0;
}
    
public int indexOf(Object o) {
    int index = 0;
    if (o == null) {
        for (Node<E> x = first; x != null; x = x.next) {
            if (x.item == null)
                return index;
            index++;
        }
    } else {
        for (Node<E> x = first; x != null; x = x.next) {
            if (o.equals(x.item))
                return index;
            index++;
        }
    }
    return -1;
}

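contains delegates to indexOf to do the search.
When Object o is null, the list is traversed looking for a null item.
Otherwise the list is traversed comparing with equals; either way the lookup is O(n)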
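
A quick usage sketch of the null branch and the equals branch. NullLookupDemo and sample are names invented for this example; the calls themselves are the public LinkedList API.

```java
import java.util.LinkedList;

// Hypothetical demo class, not part of the JDK.
class NullLookupDemo {
    // Builds the list [a, null, b, null].
    static LinkedList<String> sample() {
        LinkedList<String> list = new LinkedList<>();
        list.add("a");
        list.add(null);
        list.add("b");
        list.add(null);
        return list;
    }

    public static void main(String[] args) {
        LinkedList<String> list = sample();
        System.out.println(list.contains(null)); // true
        System.out.println(list.indexOf(null));  // 1 (first match wins)
        System.out.println(list.indexOf("b"));   // 2
        System.out.println(list.indexOf("z"));   // -1
    }
}
```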

In the same way, let's look at the implementation of get()

public E get(int index) {
    checkElementIndex(index);
    return node(index).item;
}
    
Node<E> node(int index) {
    // assert isElementIndex(index);

    if (index < (size >> 1)) {
        Node<E> x = first;
        for (int i = 0; i < index; i++)
            x = x.next;
        return x;
    } else {
        Node<E> x = last;
        for (int i = size - 1; i > index; i--)
            x = x.prev;
        return x;
    }
}

get() is also an O(n) operation: it walks the list to find the Node.
node() picks the starting point of the walk (head or tail) based on the index
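
To see what the head-or-tail choice buys, the number of link hops can be modeled directly. This is a sketch derived from the two loops in node(); NodeWalkSketch and hops are illustrative names, not JDK members.

```java
// Hypothetical model of the traversal cost of node(index), not JDK code.
class NodeWalkSketch {
    // Number of next/prev hops needed to reach `index` in a list of `size` nodes.
    static int hops(int index, int size) {
        if (index < (size >> 1)) {
            return index;            // walk forward from first
        } else {
            return size - 1 - index; // walk backward from last
        }
    }

    public static void main(String[] args) {
        int size = 10;
        for (int index = 0; index < size; index++) {
            System.out.print(hops(index, size) + " ");
        }
        System.out.println(); // prints: 0 1 2 3 4 4 3 2 1 0
    }
}
```

The worst case is roughly size / 2 hops, which is still O(n), just with a smaller constant than always starting from the head.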

LinkedList add and remove

If we already hold the Node in question, deletion is O(1).
So how does Java implement it?

public boolean remove(Object o) {
    if (o == null) {
        for (Node<E> x = first; x != null; x = x.next) {
            if (x.item == null) {
                unlink(x);
                return true;
            }
        }
    } else {
        for (Node<E> x = first; x != null; x = x.next) {
            if (o.equals(x.item)) {
                unlink(x);
                return true;
            }
        }
    }
    return false;
}

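unlink is a helper that removes a given Node from the doubly linked list.
remove simply traverses the list and then calls unlink on the match,
so removing an element by value is O(n)

Searching a LinkedList, or removing by value, is O(n)
Adding or removing a single element at an already located node is O(1)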
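
One way to actually benefit from the O(1) unlink is to remove through an iterator, which is already positioned on the node. The following is a usage sketch; LinkedRemoveDemo and dropNegatives are names chosen for this example.

```java
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class LinkedRemoveDemo {
    // Drops every negative number in a single pass. Each it.remove() acts on
    // the node the iterator just returned, so no second search is needed.
    static LinkedList<Integer> dropNegatives(List<Integer> input) {
        LinkedList<Integer> list = new LinkedList<>(input);
        for (Iterator<Integer> it = list.iterator(); it.hasNext();) {
            if (it.next() < 0) {
                it.remove();
            }
        }
        return list;
    }

    public static void main(String[] args) {
        LinkedList<Integer> list = dropNegatives(List.of(3, -1, 4, -1, 5));
        System.out.println(list); // [3, 4, 5]

        // Operations at either end never need a traversal.
        list.addFirst(0);
        list.removeLast();
        System.out.println(list); // [0, 3, 4]
    }
}
```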

3. Vector

Vector is a class I rarely use, so I decided to leave it for last.

public class Vector<E>
    extends AbstractList<E>
    implements List<E>, RandomAccess, Cloneable, java.io.Serializable

Vector constructors and fields

Without further ado, let's start with the fields:

protected Object[] elementData;
protected int elementCount;
protected int capacityIncrement;
private static final long serialVersionUID = -2767605614048989439L;

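elementData: the array that stores the data
elementCount: the number of elements in the vector
capacityIncrement: the amount added on each growth step, similar in role to ArrayList's growth factor

Note what the official documentation says:

If the capacity increment <= 0
the capacity of the vector is doubled

In other words, unless configured otherwise,
the vector grows by 100% rather than 50%

Now let's look at the constructors: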

public Vector(int initialCapacity, int capacityIncrement) {
    super();
    if (initialCapacity < 0)
        throw new IllegalArgumentException("IllegalCapacity: "+ initialCapacity);
    this.elementData = new Object[initialCapacity];
    this.capacityIncrement = capacityIncrement;
}

public Vector(int initialCapacity) {
    this(initialCapacity, 0);
}

These two constructors really tell us only one thing:
if you have no custom growth requirement, use ArrayList
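
The effect of capacityIncrement can be observed through the public capacity() method. A minimal sketch, with VectorGrowthDemo and capacityAfterOverflow as made-up names: it fills a vector one element past its initial capacity and reports the capacity afterwards.

```java
import java.util.Vector;

// Hypothetical demo class, not part of the JDK.
class VectorGrowthDemo {
    // Adds one element more than fits and returns the resulting capacity.
    static int capacityAfterOverflow(int initialCapacity, int capacityIncrement) {
        Vector<Integer> v = new Vector<>(initialCapacity, capacityIncrement);
        for (int i = 0; i <= initialCapacity; i++) {
            v.add(i);
        }
        return v.capacity();
    }

    public static void main(String[] args) {
        System.out.println(capacityAfterOverflow(2, 0)); // 4: doubled
        System.out.println(capacityAfterOverflow(2, 3)); // 5: grown by the increment
    }
}
```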

Vector lookup

Since Vector and ArrayList have so much in common,
will their implementations be nearly identical?

public boolean contains(Object o) {
    return indexOf(o, 0) >= 0;
}

That was not quite what I expected, so let me take another look at indexOf

public synchronized int indexOf(Object o, int index){
    if (o == null) {
        for (int i = index ; i < elementCount ; i++)
            if (elementData[i]==null)
                return i;
    } else {
        for (int i = index ; i < elementCount ; i++)
            if (o.equals(elementData[i]))
                return i;
    }
    return -1;
}

Mystery solved:

Vector is the thread-safe (synchronized) version of ArrayList
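
What the synchronized methods buy can be sketched with a few threads appending to the same Vector. VectorThreadDemo and concurrentAdds are hypothetical names. Running the same helper against a plain ArrayList may lose elements or throw, and that outcome is not deterministic, so only the Vector case is shown.

```java
import java.util.List;
import java.util.Vector;

// Hypothetical demo class, not part of the JDK.
class VectorThreadDemo {
    // Starts `threads` workers that each append `perThread` elements,
    // waits for all of them, and returns the final size.
    static int concurrentAdds(List<Integer> target, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    target.add(i);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return target.size();
    }

    public static void main(String[] args) {
        // Every add is synchronized on the vector, so no update is lost.
        System.out.println(concurrentAdds(new Vector<>(), 4, 10_000)); // 40000
    }
}
```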

Vector Enumeration

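Given how similar Vector and ArrayList are,
we close with the traversal mechanism specific to Vector

  1. Traversal can use either an Iterator or an Enumeration
  2. Iterators returned by Vector are fail-fast; Enumerations returned by elements() are not
  3. If the Vector is structurally modified after an Enumeration is created, the results of enumerating are undefined
  4. Enumeration<E> is a legacy interface, exposed directly by legacy classes such as Vector
  5. Enumeration<E> is read-only: it has no remove()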
public Enumeration<E> elements() {
    return new Enumeration<E>() {
        int count = 0;

        public boolean hasMoreElements() {
            return count < elementCount;
        }

        public E nextElement() {
            synchronized (Vector.this) {
                if (count < elementCount) {
                    return elementData(count++);
                }
            }
            throw new NoSuchElementException("Vector Enumeration");
        }
    };
}
for (Enumeration<E> e = v.elements(); e.hasMoreElements();)
     System.out.println(e.nextElement());

In short, working with an Enumeration involves three methods:
elements() to obtain it, then hasMoreElements() and nextElement() to walk it
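
A self-contained version of the traversal loop, as a sketch. EnumerationDemo and sum are illustrative names. The second call uses Collections.enumeration, the standard adapter for obtaining an Enumeration over a non-legacy collection.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.Vector;

// Hypothetical demo class, not part of the JDK.
class EnumerationDemo {
    // Read-only traversal: Enumeration offers no way to remove elements.
    static int sum(Enumeration<Integer> e) {
        int total = 0;
        while (e.hasMoreElements()) {
            total += e.nextElement();
        }
        return total;
    }

    public static void main(String[] args) {
        Vector<Integer> v = new Vector<>(List.of(1, 2, 3));
        System.out.println(sum(v.elements())); // 6

        List<Integer> list = new ArrayList<>(List.of(4, 5, 6));
        System.out.println(sum(Collections.enumeration(list))); // 15
    }
}
```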

4. Comparison and analysis

ArrayList vs Vector

  1. Vector is synchronized, ArrayList is not
  2. Vector grows by 100% by default, ArrayList by 50%
  3. Vector is a legacy class, ArrayList is its modern replacement
  4. Vector supports both Enumeration and Iterator

ArrayList vs LinkedList

  1. ArrayList uses dynamic array,
    LinkedList uses doubly linked list
  2. ArrayList is slower at inserting and removing in the middle (elements are shifted),
    LinkedList is slower at lookups (nodes are traversed)
  3. ArrayList has faster indexed access: O(1) versus O(n) for LinkedList

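Storage semantics
What a collection may store follows from the interface definition!
List -> ordered, allows duplicates, allows multiple nulls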
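
The three properties can be checked in a few lines. A minimal sketch using ArrayList as the List implementation; ListSemanticsDemo and sample are names made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class ListSemanticsDemo {
    // Adds a duplicate value and two nulls in a fixed order.
    static List<String> sample() {
        List<String> list = new ArrayList<>();
        list.add("b");
        list.add("a");
        list.add("b");  // duplicate is kept
        list.add(null);
        list.add(null); // a second null is kept as well
        return list;
    }

    public static void main(String[] args) {
        List<String> list = sample();
        System.out.println(list);        // [b, a, b, null, null]
        System.out.println(list.size()); // 5
    }
}
```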

Next article

Queue family implementations: PriorityQueue