Preface
This article walks through the source code of three List implementations: ArrayList, LinkedList, and Vector
1. ArrayList
public class ArrayList<E> extends AbstractList<E>
implements List<E>, RandomAccess, Cloneable, java.io.Serializable
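ArrayList constructors and fields
ArrayList implements the List interface with a dynamic array. Index-based reads such as get() are O(1), while inserting or removing at an arbitrary position is O(n) because the following elements have to be shifted. Because the backing array grows geometrically, appending with add() is amortized O(1).
Let's start with the fields of the class: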
@java.io.Serial
private static final long serialVersionUID = 8683452581122892189L;
private static final int DEFAULT_CAPACITY = 10;
private static final Object[] EMPTY_ELEMENTDATA = {};
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};
transient Object[] elementData;
private int size;
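serialVersionUID
Version identifier that Java Serialization uses to check that a serialized object matches the class definition
DEFAULT_CAPACITY
Default initial capacity of the backing array: 10
EMPTY_ELEMENTDATA
Shared empty array {} used by empty instances, for example when a capacity of 0 is requested
DEFAULTCAPACITY_EMPTY_ELEMENTDATA
Shared empty array {} used by default-sized empty instances; kept separate from EMPTY_ELEMENTDATA so that the first insert knows to expand to DEFAULT_CAPACITY
elementData
The backing array that stores the elements; when it is the default empty array, the first insert grows it automatically
size
Number of elements in the list (not the length of the backing array)
Next, let's look at some of the constructors: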
// List<Integer> list = new ArrayList<>(20);
public ArrayList(int initialCapacity) {
if (initialCapacity > 0) {
this.elementData = new Object[initialCapacity];
} else if (initialCapacity == 0) {
this.elementData = EMPTY_ELEMENTDATA;
} else {
throw new IllegalArgumentException
("Illegal Capacity: "+ initialCapacity);
}
}
Builds elementData with the requested capacity
Throws IllegalArgumentException if the capacity is negative
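To make the three branches of this constructor concrete, here is a minimal sketch. The class name CapacityArgDemo and the helper rejects() are made up for illustration; only the ArrayList(int) constructor comes from the source above.

```java
import java.util.ArrayList;

class CapacityArgDemo {
    // Hypothetical helper: true if ArrayList refuses the given initial capacity.
    static boolean rejects(int initialCapacity) {
        try {
            new ArrayList<Integer>(initialCapacity);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejects(20)); // false: positive capacity allocates a new array
        System.out.println(rejects(0));  // false: zero reuses the shared empty array
        System.out.println(rejects(-1)); // true: negative capacity is illegal
    }
}
```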
public ArrayList(Collection<? extends E> c) {
Object[] a = c.toArray();
if ((size = a.length) != 0) {
if (c.getClass() == ArrayList.class) {
elementData = a;
} else {
elementData = Arrays.copyOf(a, size, Object[].class);
}
} else {
// replace with empty array.
elementData = EMPTY_ELEMENTDATA;
}
}
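Builds a new list containing the elements of the collection, in the order returned by its iterator
When the source is not itself an ArrayList, Arrays.copyOf() is used to copy the array into a fresh Object[]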
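As a small illustration of the copy semantics, the sketch below builds an ArrayList from a LinkedHashSet and then changes the set. The class and method names are invented for this example; it only assumes the public ArrayList(Collection) constructor.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class CopyConstructorDemo {
    // Hypothetical helper: copies a set into a list, then mutates the set.
    static List<String> copyThenMutateSource() {
        Set<String> source = new LinkedHashSet<>();
        source.add("b");
        source.add("a");
        source.add("c");
        List<String> copy = new ArrayList<>(source); // iterator order: b, a, c
        source.add("d");                             // does not affect the copy
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(copyThenMutateSource()); // [b, a, c]
    }
}
```

The copy keeps the iteration order of the source and owns its own backing array, so later changes to the source are not visible in the list.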
ArrayList growth
Without further ado, here are the details of the grow() function:
private Object[] grow(int minCapacity) {
int oldCapacity = elementData.length;
if (oldCapacity > 0 || elementData != DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
int newCapacity = ArraysSupport.newLength
(oldCapacity,
/* minimum growth */
minCapacity - oldCapacity,
/* preferred growth */
oldCapacity >> 1);
return elementData = Arrays.copyOf(elementData, newCapacity);
} else {
return elementData = new Object[Math.max(DEFAULT_CAPACITY, minCapacity)];
}
}
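Growth happens in two situations: the first insert into a default-constructed list, and regular growth once the array is full. Regular growth uses ArraysSupport.newLength to compute the new array length. The first-insert case simply takes the larger of the default capacity and the requested minimum capacity.
grow() itself holds no surprises
Let's dig deeper into ArraysSupport.newLength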
public static int newLength(int oldLength, int minGrowth, int prefGrowth) {
int newLength = Math.max(minGrowth, prefGrowth) + oldLength;
if (newLength - MAX_ARRAY_LENGTH <= 0) {
return newLength;
}
return hugeLength(oldLength, minGrowth);
}
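That makes things clear: prefGrowth = oldCapacity >> 1, i.e. half of the old capacity, rounded down
If minGrowth is larger than that 50%, the array grows by minGrowth
If minGrowth is smaller than 50%, the array grows by the preferred 50% (slightly less for odd capacities because of the integer shift)
Finally, if the new length would exceed MAX_ARRAY_LENGTH (Integer.MAX_VALUE - 8), hugeLength() takes over
and the length can grow at most to the integer limit
The maximum length of an ArrayList is Integer.MAX_VALUE
A regular growth step adds 50%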
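The growth rule can be replayed outside the JDK with a few lines of arithmetic. This is a simplified sketch, not the JDK code: GrowthSketch and nextCapacity() are hypothetical names, the huge-length branch is left out, and a capacity of 0 is assumed to mean a default-constructed list.

```java
class GrowthSketch {
    static final int DEFAULT_CAPACITY = 10;

    // Simplified model of grow(): ignores overflow and the hugeLength() branch.
    static int nextCapacity(int oldCapacity, int minCapacity) {
        if (oldCapacity == 0) {
            // Assumption: capacity 0 stands for the default-constructed empty list.
            return Math.max(DEFAULT_CAPACITY, minCapacity);
        }
        int minGrowth = minCapacity - oldCapacity; // what the caller needs
        int prefGrowth = oldCapacity >> 1;         // 50%, rounded down
        return oldCapacity + Math.max(minGrowth, prefGrowth);
    }

    public static void main(String[] args) {
        int capacity = 0;
        for (int i = 0; i < 5; i++) {
            capacity = nextCapacity(capacity, capacity + 1); // one more slot needed
            System.out.println(capacity); // 10, 15, 22, 33, 49
        }
    }
}
```

A bulk request overrides the 50% step: under this model nextCapacity(10, 30) returns 30, because minGrowth (20) is larger than prefGrowth (5).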
ArrayList add and remove
With the growth mechanism covered, let's look at the add() function
public boolean add(E e) {
modCount++;
add(e, elementData, size);
return true;
}
private void add(E e, Object[] elementData, int s) {
if (s == elementData.length)
elementData = grow();
elementData[s] = e;
size = s + 1;
}
Nothing special here either
It grows the array if it is already full, then stores the new element E e in the array
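To see why appending stays cheap even though every growth step copies the whole array, the sketch below counts the element copies caused by growth. It is a model under stated assumptions: a start capacity of 10 and a plain 50% step; the class and method names are invented.

```java
class AmortizedAppendSketch {
    // Counts the element copies that growth causes while appending n elements,
    // assuming a start capacity of 10 and a 50% growth step.
    static long copiesFor(int n) {
        int capacity = 10;
        long copies = 0;
        for (int size = 0; size < n; size++) {
            if (size == capacity) {
                copies += size;            // growing moves every existing element
                capacity += capacity >> 1; // grow by 50%, rounded down
            }
        }
        return copies;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        // The ratio stays bounded by a small constant, which is what amortized O(1) means.
        System.out.println((double) copiesFor(n) / n);
    }
}
```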
Next, let's look at the remove() function
Could it involve shrinking the capacity?
public E remove(int index) {
Objects.checkIndex(index, size);
final Object[] es = elementData;
@SuppressWarnings("unchecked") E oldValue = (E) es[index];
fastRemove(es, index);
return oldValue;
}
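Well, another unremarkable function, and the backing array is never shrunk here
Note, however, that it delegates the actual removal to fastRemove():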
private void fastRemove(Object[] es, int i) {
modCount++;
final int newSize;
if ((newSize = size - 1) > i)
System.arraycopy(es, i + 1, es, i, newSize - i);
es[size = newSize] = null;
}
//arraycopy: Copies an array from the specified source array,
//beginning at the specified position,
//to the specified position of the destination array.
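As you can see, arraycopy removes es[i] by shifting the following elements one slot to the left, overwriting it
System.arraycopy operates directly on memory and is therefore fast,
but it only copies between existing arrays and cannot allocate new memory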
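The overwrite-by-shifting trick is easy to reproduce on a plain array. The following is a sketch modelled on fastRemove(); ShiftRemoveSketch and removeAt() are hypothetical names, and the size is passed in explicitly instead of being a field.

```java
import java.util.Arrays;

class ShiftRemoveSketch {
    // Removes the element at index i from the first `size` slots of es
    // and returns the new size.
    static int removeAt(Object[] es, int size, int i) {
        int newSize = size - 1;
        if (newSize > i) {
            // Shift es[i+1 .. size-1] one slot to the left, overwriting es[i].
            System.arraycopy(es, i + 1, es, i, newSize - i);
        }
        es[newSize] = null; // clear the stale last slot so it can be garbage collected
        return newSize;
    }

    public static void main(String[] args) {
        Object[] es = {"a", "b", "c", "d"};
        int size = removeAt(es, 4, 1);
        System.out.println(size);               // 3
        System.out.println(Arrays.toString(es)); // [a, c, d, null]
    }
}
```

The array length stays at 4; only the logical size shrinks, which matches the observation that remove() never reduces capacity.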
modCount explained
Browsing the source, you will notice that modCount shows up all over the place
Jumping to its definition shows that modCount comes from
public abstract class AbstractList<E>
Being lazy, I'll just paste the official explanation:
/**
* The number of times this list has been <i>structurally modified</i>.
* Structural modifications are those that change the size of the
* list, or otherwise perturb it in such a fashion that iterations in
* progress may yield incorrect results.
*
* <p>This field is used by the iterator and list iterator implementation
* returned by the {@code iterator} and {@code listIterator} methods.
* If the value of this field changes unexpectedly, the iterator (or list
* iterator) will throw a {@code ConcurrentModificationException} in
* response to the {@code next}, {@code remove}, {@code previous},
* {@code set} or {@code add} operations. This provides
* <i>fail-fast</i> behavior, rather than non-deterministic behavior in
* the face of concurrent modification during iteration.
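In short, it is designed for the iterator and list iterator implementations
When the list is structurally modified behind an iterator's back,
the iterator throws ConcurrentModificationException
Methods that can throw it include next(), remove(), previous(), set() and add()
modCount makes fail-fast behavior
possible instead of non-deterministic behavior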
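The fail-fast check is easiest to understand by triggering it. The sketch below (class and method names are invented) removes an element in two ways: directly on the list during a for-each loop, which bumps modCount without the iterator knowing, and through Iterator.remove(), which keeps the iterator's expected count in sync.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class FailFastDemo {
    // Structural change that bypasses the iterator: the next call to next() fails.
    static boolean removeDuringForEach() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        try {
            for (Integer x : list) {
                if (x == 2) {
                    list.remove(x); // remove(Object), increments modCount
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removal through the iterator is allowed and does not throw.
    static List<Integer> removeViaIterator() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        for (Iterator<Integer> it = list.iterator(); it.hasNext();) {
            if (it.next() == 2) {
                it.remove();
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeDuringForEach()); // true
        System.out.println(removeViaIterator());   // [1, 3, 4]
    }
}
```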
2. LinkedList
Now let's talk about how the linked-list implementation of List works.
public class LinkedList<E>
extends AbstractSequentialList<E>
implements List<E>, Deque<E>, Cloneable, java.io.Serializable
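LinkedList constructors and fields
LinkedList implements the List interface (and Deque) with a doubly linked list. Thanks to the doubly linked structure, inserting or removing at a known node is O(1), while looking up an element requires a traversal and is O(n).
As usual, let's first look at how the source implements the list: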
@java.io.Serial
private static final long serialVersionUID = 876323262645176354L;
transient int size = 0;
transient Node<E> first;
transient Node<E> last;
private static class Node<E> {
E item;
Node<E> next;
Node<E> prev;
Node(Node<E> prev, E element, Node<E> next) {
this.item = element;
this.next = next;
this.prev = prev;
}
}
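Node is the building block of the doubly linked list: it stores an item plus links to the previous and next nodes
size is the number of elements
The constructor is just as blunt: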
public LinkedList(Collection<? extends E> c) {
this();
addAll(c);
}
LinkedList lookup
Lookups on a LinkedList are less efficient than on an ArrayList
Let's look at the contains() function:
public boolean contains(Object o) {
return indexOf(o) >= 0;
}
public int indexOf(Object o) {
int index = 0;
if (o == null) {
for (Node<E> x = first; x != null; x = x.next) {
if (x.item == null)
return index;
index++;
}
} else {
for (Node<E> x = first; x != null; x = x.next) {
if (o.equals(x.item))
return index;
index++;
}
}
return -1;
}
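contains delegates to indexOf to perform the lookup
When Object o is null, it walks the list looking for a null item
Otherwise it walks the list comparing with equals(); the lookup is O(n)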
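A quick way to see the null branch in action is to store a null and search for it. This is just a usage sketch with an invented class name; it relies only on the public contains() and indexOf() methods shown above.

```java
import java.util.Arrays;
import java.util.LinkedList;

class NullLookupDemo {
    // Hypothetical helper: a small list that contains one null element.
    static LinkedList<String> sample() {
        return new LinkedList<>(Arrays.asList("a", null, "b"));
    }

    public static void main(String[] args) {
        LinkedList<String> list = sample();
        System.out.println(list.indexOf(null));  // 1: found by the x.item == null branch
        System.out.println(list.contains("b"));  // true: found by the equals() branch
        System.out.println(list.indexOf("z"));   // -1: whole list traversed, no match
    }
}
```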
In the same way, let's look at the implementation of get()
public E get(int index) {
checkElementIndex(index);
return node(index).item;
}
Node<E> node(int index) {
// assert isElementIndex(index);
if (index < (size >> 1)) {
Node<E> x = first;
for (int i = 0; i < index; i++)
x = x.next;
return x;
} else {
Node<E> x = last;
for (int i = size - 1; i > index; i--)
x = x.prev;
return x;
}
}
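get() is also an O(n) operation: it walks the list to find the Node
node() picks the starting point of the walk (head or tail) based on the index, so it visits at most about half of the nodes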
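The cost of node(index) can be expressed as the number of link hops it performs. The sketch below mirrors the two loops of node() with plain arithmetic; NodeHopsSketch and hops() are made-up names for illustration.

```java
class NodeHopsSketch {
    // Number of next/prev hops node(index) performs for a list of the given size.
    static int hops(int index, int size) {
        if (index < (size >> 1)) {
            return index;            // walk forward from first
        }
        return size - 1 - index;     // walk backward from last
    }

    public static void main(String[] args) {
        System.out.println(hops(0, 10)); // 0: first node, no hop needed
        System.out.println(hops(4, 10)); // 4: forward from the head
        System.out.println(hops(5, 10)); // 4: backward from the tail
        System.out.println(hops(9, 10)); // 0: last node, no hop needed
    }
}
```

The worst case sits in the middle of the list, which is why get() remains O(n) even with the two-ended walk.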
LinkedList add and remove
If we already hold the current Node of the list, removal is O(1)
So how does Java implement it?
public boolean remove(Object o) {
if (o == null) {
for (Node<E> x = first; x != null; x = x.next) {
if (x.item == null) {
unlink(x);
return true;
}
}
} else {
for (Node<E> x = first; x != null; x = x.next) {
if (o.equals(x.item)) {
unlink(x);
return true;
}
}
}
return false;
}
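unlink is a helper that removes the given Node from the doubly linked list
We can see that remove walks the list and then calls unlink directly
Removing an element by value is therefore O(n)
Lookup and removal by value in a LinkedList are both O(n)
Adding or removing a single element at a known node, such as either end, is O(1)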
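The O(1) cases are the ones where no search is needed: the two ends of the list, or a position an iterator is already standing on. A minimal usage sketch, with an invented class name, using only public LinkedList and ListIterator methods:

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

class LinkedListEndsDemo {
    static List<String> run() {
        LinkedList<String> list = new LinkedList<>(Arrays.asList("b", "c", "d"));
        list.addFirst("a"); // O(1): relinks the head        -> [a, b, c, d]
        list.removeLast();  // O(1): unlinks the tail        -> [a, b, c]
        ListIterator<String> it = list.listIterator();
        while (it.hasNext()) {
            if (it.next().equals("b")) {
                it.remove(); // O(1): the iterator already holds the node -> [a, c]
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [a, c]
    }
}
```

Finding "b" still costs a traversal; only the unlink step itself is constant time.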
3. Vector
Vector is a class I rarely use, so I decided to leave it for last.
public class Vector<E>
extends AbstractList<E>
implements List<E>, RandomAccess, Cloneable, java.io.Serializable
Vector constructors and fields
Without further ado, let's start with the fields:
protected Object[] elementData;
protected int elementCount;
protected int capacityIncrement;
private static final long serialVersionUID = -2767605614048989439L;
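elementData is the array that stores the elements
elementCount is the number of elements in the vector
capacityIncrement plays a role similar to ArrayList's growth factor: it is the amount by which the capacity grows when the array is full
Note that the official documentation states:
If the capacity increment <= 0
the capacity of the vector is doubled
In other words, unless configured otherwise,
a growth step is 100% rather than 50%
Now let's look at the constructors: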
public Vector(int initialCapacity, int capacityIncrement) {
super();
if (initialCapacity < 0)
throw new IllegalArgumentException("IllegalCapacity: "+ initialCapacity);
this.elementData = new Object[initialCapacity];
this.capacityIncrement = capacityIncrement;
}
public Vector(int initialCapacity) {
this(initialCapacity, 0);
}
These two constructors really tell us only one thing:
if you don't need a custom capacity increment, use ArrayList
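Unlike ArrayList, Vector exposes its current capacity through the public capacity() method, so the effect of capacityIncrement can be observed directly. The class and helper names below are invented for this sketch.

```java
import java.util.Vector;

class VectorGrowthDemo {
    // Fills the vector one element past its initial capacity and reports the new capacity.
    static int capacityAfterOverflow(int initialCapacity, int capacityIncrement) {
        Vector<Integer> v = new Vector<>(initialCapacity, capacityIncrement);
        for (int i = 0; i <= initialCapacity; i++) {
            v.add(i); // the last add forces exactly one growth step
        }
        return v.capacity();
    }

    public static void main(String[] args) {
        System.out.println(capacityAfterOverflow(4, 0)); // 8: increment <= 0 doubles the capacity
        System.out.println(capacityAfterOverflow(4, 3)); // 7: grows by the fixed increment
    }
}
```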
Vector lookup
Since Vector and ArrayList have so much in common,
are their implementations more or less the same?
public boolean contains(Object o) {
return indexOf(o, 0) >= 0;
}
Not quite what I expected; let me take another look at indexOf
public synchronized int indexOf(Object o, int index){
if (o == null) {
for (int i = index ; i < elementCount ; i++)
if (elementData[i]==null)
return i;
} else {
for (int i = index ; i < elementCount ; i++)
if (o.equals(elementData[i]))
return i;
}
return -1;
}
Mystery solved:
Vector is the thread-safe version of ArrayList: its methods are synchronized
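What the synchronized methods buy you can be shown with a few threads appending to one shared Vector. This is a minimal sketch with invented names; it only demonstrates that no add() is lost, not that Vector is the right tool for concurrent code.

```java
import java.util.Vector;

class VectorThreadsDemo {
    // Starts several threads that each append perThread elements to one shared Vector.
    static int concurrentAdds(int threads, int perThread) {
        Vector<Integer> v = new Vector<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    v.add(i); // add() is synchronized on the vector
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return v.size();
    }

    public static void main(String[] args) {
        System.out.println(concurrentAdds(4, 10_000)); // 40000
    }
}
```

Running the same loop against a plain ArrayList gives no such guarantee: elements may be lost or an exception may be thrown, depending on timing.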
Vector Enumeration
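Given how similar Vector and ArrayList are,
let's finish with the traversal mechanism that is specific to Vector
- A Vector can be traversed with either an iterator or an Enumeration
- Iterators are fail-fast; Enumerations are not: if the Vector is structurally modified after the Enumeration is created, the results of enumerating are undefined
- Enumeration<E> is exposed mainly by legacy classes such as Vector
- Enumeration<E> is read-only: it has no remove()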
public Enumeration<E> elements() {
return new Enumeration<E>() {
int count = 0;
public boolean hasMoreElements() {
return count < elementCount;
}
public E nextElement() {
synchronized (Vector.this) {
if (count < elementCount) {
return elementData(count++);
}
}
throw new NoSuchElementException("Vector Enumeration");
}
};
}
for (Enumeration<E> e = v.elements(); e.hasMoreElements();)
System.out.println(e.nextElement());
In short, traversing with an Enumeration involves three methods:
elements(), hasMoreElements(), nextElement()
4. Comparison and analysis
ArrayList vs Vector
- Vector is synchronized, ArrayList is not
- Vector grows by 100% by default, ArrayList by 50%
- Vector is a legacy class, ArrayList is the newer one
- Vector can be traversed with both Enumeration and Iterator
ArrayList vs LinkedList
- ArrayList uses a dynamic array, LinkedList uses a doubly linked list
- ArrayList is slower on data manipulation, LinkedList is slower on data query
- ArrayList has faster access than LinkedList
Storage semantics
Storage semantics follow from the interface definition!
List -> ordered, allows duplicates, allows multiple nulls
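The three List properties can be checked with a few inserts. A minimal sketch with an invented class name, using ArrayList as the example implementation:

```java
import java.util.ArrayList;
import java.util.List;

class ListSemanticsDemo {
    static List<String> build() {
        List<String> list = new ArrayList<>();
        list.add("b");
        list.add("a");
        list.add("b");  // duplicates are kept
        list.add(null);
        list.add(null); // more than one null is allowed
        return list;
    }

    public static void main(String[] args) {
        List<String> list = build();
        System.out.println(list);        // [b, a, b, null, null]: insertion order preserved
        System.out.println(list.size()); // 5
    }
}
```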
Next article
Queue family implementations: PriorityQueue