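1 Introduction to ArrayMap
ArrayMap is an Android-specific Map data structure in the android.util package. It is implemented with two arrays: one stores the key hashes in ascending order and, unlike SparseArray's layout, the keys and values are stored together in a second array. Binary search over the hash array locates an entry in O(log n); inserting or deleting additionally shifts the elements that follow the entry. As shown in the figure above, ArrayMap has the following characteristics: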
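- SparseArray only supports keys of primitive types, whereas ArrayMap supports non-primitive (object) keys
- Its space overhead is lower than HashMap's, mainly because (1) entries do not need to be wrapped in Entry objects; (2) the arrays grow and shrink automatically; (3) small arrays are cached and reused
- It is implemented with arrays: entries are kept sorted by key hash and located by binary search, unlike HashMap's array plus linked list/red-black tree. A lookup therefore costs O(log n) instead of HashMap's expected O(1), so ArrayMap trades some speed for memory
- Some methods do not validate the index, so guard against IndexOutOfBoundsException when calling them, for example keyAt(int index)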
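To make the two-array layout described above concrete, here is a minimal sketch. It is not the AOSP source: the class name `TwoArrayMapSketch` and all of its members are hypothetical, it does not handle null keys, and it grows by simple doubling rather than by ArrayMap's own policy. It only illustrates sorted hashes, the interleaved key/value array, and the `~insertionPoint` return convention.

```java
import java.util.Arrays;

/** Minimal two-array map sketch (hypothetical class, not the AOSP implementation). */
final class TwoArrayMapSketch {
    // Sorted key hashes; entry i lives at keysValues[2*i] (key) and keysValues[2*i + 1] (value).
    private int[] hashes = new int[4];
    private Object[] keysValues = new Object[8];
    private int size;

    /** Returns the entry index, or ~insertionPoint (a negative number) when the key is absent. */
    int indexOf(Object key) {
        int hash = key.hashCode(); // assumption: null keys are not supported in this sketch
        int index = Arrays.binarySearch(hashes, 0, size, hash);
        if (index < 0) {
            return index; // binarySearch returns -(insertionPoint) - 1, which equals ~insertionPoint
        }
        // Several keys may share one hash: scan the run of equal hashes in both directions.
        for (int i = index; i < size && hashes[i] == hash; i++) {
            if (key.equals(keysValues[i << 1])) return i;
        }
        for (int i = index - 1; i >= 0 && hashes[i] == hash; i--) {
            if (key.equals(keysValues[i << 1])) return i;
        }
        return ~index; // hash present but key absent: insert next to the equal hashes
    }

    /** Inserts or replaces; returns the previous value or null. */
    Object put(Object key, Object value) {
        int index = indexOf(key);
        if (index >= 0) {
            Object old = keysValues[(index << 1) + 1];
            keysValues[(index << 1) + 1] = value;
            return old;
        }
        index = ~index;
        if (size == hashes.length) { // simple doubling, unlike ArrayMap's growth rule
            hashes = Arrays.copyOf(hashes, size * 2);
            keysValues = Arrays.copyOf(keysValues, size * 4);
        }
        // Shift the tail one slot to the right to keep the hashes sorted.
        System.arraycopy(hashes, index, hashes, index + 1, size - index);
        System.arraycopy(keysValues, index << 1, keysValues, (index + 1) << 1, (size - index) << 1);
        hashes[index] = key.hashCode();
        keysValues[index << 1] = key;
        keysValues[(index << 1) + 1] = value;
        size++;
        return null;
    }

    Object get(Object key) {
        int index = indexOf(key);
        return index >= 0 ? keysValues[(index << 1) + 1] : null;
    }

    int size() {
        return size;
    }
}
```

The lookup itself is logarithmic, but every insertion in the middle pays for the `System.arraycopy` shift, which is why this layout suits small maps.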
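2 ArrayMap Source Code Analysis
2.1 API Overview
/** Resets the backing arrays to empty arrays */
public void clear();
/** Only sets the data to null; the array objects are kept */
public void erase();
/** Adds a key-value pair */
public V put(K key, V value);
/** Appends a key-value pair after the existing elements; efficient */
public void append(K key, V value);
/** Puts all entries of the given map into this one */
public void putAll(ArrayMap<? extends K, ? extends V> array);
/** Removes the key-value pair for the given key */
public V remove(Object key);
/** Removes the key-value pair at the given index */
public V removeAt(int index);
/** Returns a string containing all entries */
public String toString();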
2.2 Source Code Walkthrough
a. Member variables
public final class ArrayMap<K, V> implements Map<K, V> {
/**
* The minimum amount by which the capacity of a ArrayMap will increase.
* This is tuned to be relatively space-efficient.
* Minimum capacity
*/
private static final int BASE_SIZE = 4;
/**
* Maximum number of entries to have in array caches.
* Maximum number of cached arrays
*/
private static final int CACHE_SIZE = 10;
/**
* Special hash array value that indicates the container is immutable.
*/
static final int[] EMPTY_IMMUTABLE_INTS = new int[0];
/**
* @hide Special immutable empty ArrayMap.
*/
public static final ArrayMap EMPTY = new ArrayMap<>(-1);
/**
* Caches of small array objects to avoid spamming garbage. The cache
* Object[] variable is a pointer to a linked list of array objects.
* The first entry in the array is a pointer to the next array in the
* list; the second entry is a pointer to the int[] hash code array for it.
*/
/** Cache of arrays with capacity BASE_SIZE */
static Object[] mBaseCache;
/** Number of cached arrays with capacity BASE_SIZE */
static int mBaseCacheSize;
/** Cache of arrays with capacity BASE_SIZE * 2 */
static Object[] mTwiceBaseCache;
/** Number of cached arrays with capacity BASE_SIZE * 2 */
static int mTwiceBaseCacheSize;
/** Selects how the key hash is computed: mIdentityHashCode ? System.identityHashCode(key) : key.hashCode() */
final boolean mIdentityHashCode;
/** Array that stores the hashes */
int[] mHashes;
/** Array that stores the keys and values; twice the length of mHashes */
Object[] mArray;
/** Number of entries */
int mSize;
MapCollections<K, V> mCollections;
}
b. Constructors
public ArrayMap() {
this(0, false);
}
public ArrayMap(int capacity) {
this(capacity, false);
}
public ArrayMap(int capacity, boolean identityHashCode) {
mIdentityHashCode = identityHashCode;
// If this is immutable, use the sentinel EMPTY_IMMUTABLE_INTS
// instance instead of the usual EmptyArray.INT. The reference
// is checked later to see if the array is allowed to grow.
if (capacity < 0) {
mHashes = EMPTY_IMMUTABLE_INTS;
mArray = EmptyArray.OBJECT;
} else if (capacity == 0) {
mHashes = EmptyArray.INT;
mArray = EmptyArray.OBJECT;
} else {
// Allocate the arrays, preferring the cache; the caching mechanism is analyzed below
allocArrays(capacity);
}
mSize = 0;
}
public ArrayMap(ArrayMap<K, V> map) {
this();
if (map != null) {
putAll(map);
}
}
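c. Caching mechanism
ArrayMap keeps two global cache pools to avoid the cost of frequently allocating array objects: one for arrays of capacity 4 and one for arrays of capacity 8, each holding at most 10 cached arrays. It is worth asking why only capacities 4 and 8 are cached. (My guess: (1) ArrayMap's capacity grows as 4, 8, then 1.5n, and in most scenarios the stored data is small, so capacities 4 and 8 are enough; for large capacities ArrayMap is not recommended anyway because of its time complexity. (2) Caching larger arrays would hold on to too much memory, so a trade-off between time and space is needed; a better approach would be to support an extensible memory pool.)
The recycling and allocation logic is fairly abstract; the figure below shows how the cache pool for capacity-4 arrays is stored
/** Allocates the array objects */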
private void allocArrays(final int size) {
if (mHashes == EMPTY_IMMUTABLE_INTS) {
throw new UnsupportedOperationException("ArrayMap is immutable");
}
// If the requested size qualifies, try the cache pool first
if (size == (BASE_SIZE*2)) { // size is twice BASE_SIZE
synchronized (ArrayMap.class) {
if (mTwiceBaseCache != null) {
final Object[] array = mTwiceBaseCache;
mArray = array;
mTwiceBaseCache = (Object[])array[0];
mHashes = (int[])array[1];
array[0] = array[1] = null;
mTwiceBaseCacheSize--;
if (DEBUG) Log.d(TAG, "Retrieving 2x cache " + mHashes
+ " now have " + mTwiceBaseCacheSize + " entries");
return;
}
}
} else if (size == BASE_SIZE) { // size is BASE_SIZE
synchronized (ArrayMap.class) {
if (mBaseCache != null) {
final Object[] array = mBaseCache;
mArray = array;
mBaseCache = (Object[])array[0];
mHashes = (int[])array[1];
array[0] = array[1] = null;
mBaseCacheSize--;
if (DEBUG) Log.d(TAG, "Retrieving 1x cache " + mHashes
+ " now have " + mBaseCacheSize + " entries");
return;
}
}
}
// Nothing was obtained from the cache pool, so allocate new arrays
mHashes = new int[size];
mArray = new Object[size<<1];
}
/** Returns the array objects to the cache */
private static void freeArrays(final int[] hashes, final Object[] array, final int size) {
if (hashes.length == (BASE_SIZE*2)) {
synchronized (ArrayMap.class) {
// The number of cached arrays is below the pool's maximum capacity
if (mTwiceBaseCacheSize < CACHE_SIZE) {
array[0] = mTwiceBaseCache;
array[1] = hashes;
for (int i=(size<<1)-1; i>=2; i--) {
array[i] = null;
}
mTwiceBaseCache = array;
mTwiceBaseCacheSize++;
if (DEBUG) Log.d(TAG, "Storing 2x cache " + array
+ " now have " + mTwiceBaseCacheSize + " entries");
}
}
} else if (hashes.length == BASE_SIZE) {
synchronized (ArrayMap.class) {
if (mBaseCacheSize < CACHE_SIZE) {
array[0] = mBaseCache;
array[1] = hashes;
for (int i=(size<<1)-1; i>=2; i--) {
array[i] = null;
}
mBaseCache = array;
mBaseCacheSize++;
if (DEBUG) Log.d(TAG, "Storing 1x cache " + array
+ " now have " + mBaseCacheSize + " entries");
}
}
}
}
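The pool above is an intrusive singly linked list: slot 0 of each cached `Object[]` points to the next cached array and slot 1 holds its paired `int[]`. The following standalone sketch isolates that idea for the capacity-4 pool only. `ArrayPoolSketch`, `Pair`, `acquire`, `release`, and `pooledCount` are hypothetical names introduced for illustration, and the `Pair` holder exists only because the sketch has no map instance to assign the arrays to.

```java
import java.util.Arrays;

/** Sketch of the linked-list array pool for capacity BASE_SIZE (hypothetical, simplified). */
final class ArrayPoolSketch {
    static final int BASE_SIZE = 4;
    static final int CACHE_SIZE = 10;

    // Head of the free list: head[0] -> next pooled Object[], head[1] -> its paired int[].
    private static Object[] head;
    private static int pooled;

    /** Holder for a hash array and its key/value array. */
    static final class Pair {
        final int[] hashes;
        final Object[] array;

        Pair(int[] hashes, Object[] array) {
            this.hashes = hashes;
            this.array = array;
        }
    }

    /** Pops a pooled pair if one exists, otherwise allocates fresh arrays. */
    static synchronized Pair acquire() {
        if (head != null) {
            Object[] array = head;
            head = (Object[]) array[0];   // the next pooled array becomes the new head
            int[] hashes = (int[]) array[1];
            array[0] = array[1] = null;   // clear the bookkeeping slots before reuse
            pooled--;
            return new Pair(hashes, array);
        }
        return new Pair(new int[BASE_SIZE], new Object[BASE_SIZE << 1]);
    }

    /** Pushes a pair onto the pool; returns false if the size does not match or the pool is full. */
    static synchronized boolean release(int[] hashes, Object[] array) {
        if (hashes.length != BASE_SIZE || pooled >= CACHE_SIZE) {
            return false;
        }
        Arrays.fill(array, null);         // drop references to old keys and values
        array[0] = head;                  // link to the previous head
        array[1] = hashes;
        head = array;
        pooled++;
        return true;
    }

    static synchronized int pooledCount() {
        return pooled;
    }
}
```

Because arrays are pushed and popped at the head, the most recently released pair is the first one handed out again, and no extra node objects are allocated for the list itself.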
d. The put method
public V put(K key, V value) {
final int hash;
int index;
// A null key is supported
if (key == null) {
hash = 0;
index = indexOfNull();
} else {
hash = mIdentityHashCode ? System.identityHashCode(key) : key.hashCode();
// index >= 0 means the key exists; index < 0 means it does not
index = indexOf(key, hash);
}
if (index >= 0) {
// Replace the value directly
index = (index<<1) + 1;
final V old = (V)mArray[index];
mArray[index] = value;
return old;
}
index = ~index;
// The arrays are full and need to grow
if (mSize >= mHashes.length) {
// Compute the new array capacity
final int n = mSize >= (BASE_SIZE*2) ? (mSize+(mSize>>1))
: (mSize >= BASE_SIZE ? (BASE_SIZE*2) : BASE_SIZE);
if (DEBUG) Log.d(TAG, "put: grow from " + mHashes.length + " to " + n);
final int[] ohashes = mHashes;
final Object[] oarray = mArray;
// Reallocate the arrays
allocArrays(n);
// Copy the old data into the new arrays
if (mHashes.length > 0) {
if (DEBUG) Log.d(TAG, "put: copy 0-" + mSize + " to 0");
System.arraycopy(ohashes, 0, mHashes, 0, ohashes.length);
System.arraycopy(oarray, 0, mArray, 0, oarray.length);
}
// Return the old arrays to the cache pool
freeArrays(ohashes, oarray, mSize);
}
// Shift the element at the insertion position and everything after it back by one slot
if (index < mSize) {
if (DEBUG) Log.d(TAG, "put: move " + index + "-" + (mSize-index)
+ " to " + (index+1));
System.arraycopy(mHashes, index, mHashes, index + 1, mSize - index);
System.arraycopy(mArray, index << 1, mArray, (index + 1) << 1, (mSize - index) << 1);
}
// Insert the element
mHashes[index] = hash;
mArray[index<<1] = key;
mArray[(index<<1)+1] = value;
mSize++;
return null;
}
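The capacity expression inside put is compact, so a worked example may help. The sketch below copies only that ternary into a standalone helper; `GrowthSketch`, `grownCapacity`, and `growthSequence` are hypothetical names, and the sequence assumes a map that starts empty and grows each time it fills up.

```java
/** Worked example of the growth rule used in put (hypothetical helper class). */
final class GrowthSketch {
    static final int BASE_SIZE = 4;

    /** Capacity that put would allocate when `size` entries already fill the arrays. */
    static int grownCapacity(int size) {
        return size >= (BASE_SIZE * 2) ? (size + (size >> 1))
                : (size >= BASE_SIZE ? (BASE_SIZE * 2) : BASE_SIZE);
    }

    /** Capacities reached by successive grow steps, starting from an empty map. */
    static int[] growthSequence(int steps) {
        int[] result = new int[steps];
        int capacity = 0;
        for (int i = 0; i < steps; i++) {
            // put only grows when mSize has reached the current capacity
            capacity = grownCapacity(capacity);
            result[i] = capacity;
        }
        return result;
    }
}
```

`GrowthSketch.growthSequence(6)` yields 4, 8, 12, 18, 27, 40: the first two steps hit exactly the sizes served by the cache pools, and every later step grows by half.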
e. The remove method
public V remove(Object key) {
// Find the index of this key via binary search
final int index = indexOfKey(key);
if (index >= 0) {
return removeAt(index);
}
return null;
}
public V removeAt(int index) {
final Object old = mArray[(index << 1) + 1];
if (mSize <= 1) { // with mSize == 1, removing the last element leaves the map empty
// Now empty.
if (DEBUG) Log.d(TAG, "remove: shrink from " + mHashes.length + " to 0");
freeArrays(mHashes, mArray, mSize);
mHashes = EmptyArray.INT;
mArray = EmptyArray.OBJECT;
mSize = 0;
} else {
// Shrink the arrays when fewer than 1/3 of the slots are in use
if (mHashes.length > (BASE_SIZE*2) && mSize < mHashes.length/3) {
// Shrunk enough to reduce size of arrays. We don't allow it to
// shrink smaller than (BASE_SIZE*2) to avoid flapping between
// that and BASE_SIZE.
final int n = mSize > (BASE_SIZE*2) ? (mSize + (mSize>>1)) : (BASE_SIZE*2);
if (DEBUG) Log.d(TAG, "remove: shrink from " + mHashes.length + " to " + n);
final int[] ohashes = mHashes;
final Object[] oarray = mArray;
allocArrays(n);
mSize--;
if (index > 0) {
if (DEBUG) Log.d(TAG, "remove: copy from 0-" + index + " to 0");
System.arraycopy(ohashes, 0, mHashes, 0, index);
System.arraycopy(oarray, 0, mArray, 0, index << 1);
}
if (index < mSize) {
if (DEBUG) Log.d(TAG, "remove: copy from " + (index+1) + "-" + mSize
+ " to " + index);
System.arraycopy(ohashes, index + 1, mHashes, index, mSize - index);
System.arraycopy(oarray, (index + 1) << 1, mArray, index << 1,
(mSize - index) << 1);
}
} else {
mSize--;
if (index < mSize) {
if (DEBUG) Log.d(TAG, "remove: move " + (index+1) + "-" + mSize
+ " to " + index);
System.arraycopy(mHashes, index + 1, mHashes, index, mSize - index);
System.arraycopy(mArray, (index + 1) << 1, mArray, index << 1,
(mSize - index) << 1);
}
mArray[mSize << 1] = null;
mArray[(mSize << 1) + 1] = null;
}
}
return (V)old;
}
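The shrink condition in removeAt can be checked in isolation as well. This sketch lifts the condition and the capacity expression into a standalone helper; `ShrinkSketch` and `shrunkCapacity` are hypothetical names, and `sizeBefore` stands for mSize before the removal (greater than 1, since the single-element case takes the other branch).

```java
/** Worked example of the shrink rule used in removeAt (hypothetical helper class). */
final class ShrinkSketch {
    static final int BASE_SIZE = 4;

    /**
     * Returns the capacity removeAt would reallocate to, or -1 when the arrays are kept.
     * hashesLength is mHashes.length; sizeBefore is mSize before the element is removed.
     */
    static int shrunkCapacity(int hashesLength, int sizeBefore) {
        if (hashesLength > (BASE_SIZE * 2) && sizeBefore < hashesLength / 3) {
            // Never shrink below BASE_SIZE * 2, to avoid flapping between the two cached sizes.
            return sizeBefore > (BASE_SIZE * 2) ? (sizeBefore + (sizeBefore >> 1)) : (BASE_SIZE * 2);
        }
        return -1;
    }
}
```

For example, with a capacity of 40 the arrays are kept at 13 entries (13 is not below 40/3 = 13) but reallocated to 18 once a removal starts from 12 entries, and a capacity of 8 is never shrunk.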
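3 ArrayMap Summary
- The two-array approach to implementing a map: ArrayMap offers methods tailored to its data layout to improve efficiency, such as append; when the keys to insert are known to arrive in ascending hash order, insertion is very efficient
- Pooling: to use memory better, ArrayMap keeps cache pools for arrays of capacity 4 and 8, and it grows and shrinks its arrays automatically
- The SparseArray family supports map operations with primitive keys, and ArrayMap supports non-primitive keys. Because both rely on binary search, neither is suitable for large data sets; for large data sets, HashMap is still the better choice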
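The point about append in the first bullet can be illustrated on a bare sorted hash array. This is a simplified sketch under stated assumptions, not ArrayMap's code: `AppendSketch` and its members are hypothetical, values are omitted, and the `shifted` counter exists only to compare how many elements each path moves.

```java
import java.util.Arrays;

/** Sketch comparing a sorted insert with an append fast path (hypothetical class). */
final class AppendSketch {
    private int[] hashes = new int[8];
    private int size;
    private int shifted; // total number of elements moved by inserts

    /** General path: binary search for the position, then shift the tail. */
    void insert(int hash) {
        ensureCapacity();
        int index = Arrays.binarySearch(hashes, 0, size, hash);
        if (index < 0) {
            index = ~index;
        }
        System.arraycopy(hashes, index, hashes, index + 1, size - index);
        shifted += size - index;
        hashes[index] = hash;
        size++;
    }

    /** Fast path: when the hash is not smaller than the last one, write it at the end. */
    void append(int hash) {
        if (size > 0 && hashes[size - 1] > hash) {
            insert(hash); // out of order: fall back to the general path
            return;
        }
        ensureCapacity();
        hashes[size++] = hash;
    }

    private void ensureCapacity() {
        if (size == hashes.length) {
            hashes = Arrays.copyOf(hashes, size * 2);
        }
    }

    int shiftedCount() {
        return shifted;
    }

    int[] toArray() {
        return Arrays.copyOf(hashes, size);
    }
}
```

Appending hashes that arrive in ascending order needs neither a search nor a shift, while an out-of-order hash pays the usual insertion cost.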