JDK 1.8 HashMap: how resizing works, tree degeneration, and the node-migration optimization


What happens during resize()

This is the Javadoc for the resize() method:

/**
 * Initializes or doubles table size.  If null, allocates in
 * accord with initial capacity target held in field threshold.
 * Otherwise, because we are using power-of-two expansion, the
 * elements from each bin must either stay at same index, or move
 * with a power of two offset in the new table.
 *
 * @return the table
 */
final Node<K,V>[] resize() {

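According to its Javadoc, resize() has two jobs: it initializes the table, and it expands the table. After an expansion, every element has to be redistributed to its hash slot (array index) in the new table. The new slot is either the same index as before, or the old index plus the old table's capacity.
The reason is simple. Suppose the old table has a capacity of 16 and two elements a and b have hash values 3 and 19. Both land in slot 3 (3 & 15 = 3 and 19 & 15 = 3). After expanding to a capacity of 32, a and b land in slots 3 and 19: a stays where it was, while b moves to (old slot + old capacity) = 3 + 16.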
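
The arithmetic in this example can be checked with a minimal sketch. The class name SlotDemo and the helper indexFor are made up for illustration; they only reproduce the hash & (capacity - 1) index computation that appears in the resize() code, and they assume the capacity is a power of two.

```java
public class SlotDemo {
    // Hypothetical helper: the slot of a hash in a power-of-two table,
    // computed the same way resize() does it: hash & (capacity - 1).
    static int indexFor(int hash, int capacity) {
        return hash & (capacity - 1);
    }

    public static void main(String[] args) {
        // Capacity 16: both hashes collide in slot 3.
        System.out.println(indexFor(3, 16));   // 3
        System.out.println(indexFor(19, 16));  // 3
        // Capacity 32: a stays in slot 3, b moves to 3 + 16 = 19.
        System.out.println(indexFor(3, 32));   // 3
        System.out.println(indexFor(19, 32));  // 19
    }
}
```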

First, the sizing part of the code:

final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    // On first initialization the table is null, so oldCap = 0
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
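        // Already at MAXIMUM_CAPACITY (1 << 30): the table cannot grow any further,
        // so only the threshold is raised to Integer.MAX_VALUE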
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
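        // Below the maximum: double the capacity, and double the threshold too
        // when the old capacity is at least DEFAULT_INITIAL_CAPACITY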
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    }
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else {               // zero initial threshold signifies using defaults
        // First initialization with defaults: set the capacity and the threshold that triggers the next resize
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    // Allocate the new array
    @SuppressWarnings({"rawtypes","unchecked"})
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;

    // Migrate the old elements into the new array, see below ...

    // Return the new array
    return newTab;
}
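
To see how the capacity and threshold evolve, here is a rough sketch that pulls only the sizing branches out of resize() into a standalone function. The class ResizeSizing and the method nextSizing are hypothetical names; the constants are re-declared locally with the values HashMap uses (16, 0.75f, 1 << 30), and nothing here touches a real HashMap.

```java
import java.util.Arrays;

public class ResizeSizing {
    static final int MAXIMUM_CAPACITY = 1 << 30;
    static final int DEFAULT_INITIAL_CAPACITY = 16;
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    // Hypothetical helper: returns {newCap, newThr} for a given old state,
    // following the same branches as the sizing part of resize().
    static int[] nextSizing(int oldCap, int oldThr, float loadFactor) {
        int newCap, newThr = 0;
        if (oldCap > 0) {
            if (oldCap >= MAXIMUM_CAPACITY)
                return new int[] {oldCap, Integer.MAX_VALUE}; // cannot grow further
            newCap = oldCap << 1;
            if (newCap < MAXIMUM_CAPACITY && oldCap >= DEFAULT_INITIAL_CAPACITY)
                newThr = oldThr << 1; // double threshold
        } else if (oldThr > 0) {
            newCap = oldThr; // initial capacity was placed in threshold
        } else {
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int) (DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
            float ft = (float) newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float) MAXIMUM_CAPACITY)
                    ? (int) ft : Integer.MAX_VALUE;
        }
        return new int[] {newCap, newThr};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(nextSizing(0, 0, 0.75f)));   // [16, 12]
        System.out.println(Arrays.toString(nextSizing(16, 12, 0.75f))); // [32, 24]
        System.out.println(Arrays.toString(nextSizing(0, 8, 0.75f)));   // [8, 6]
        System.out.println(Arrays.toString(nextSizing(8, 6, 0.75f)));   // [16, 12]
    }
}
```

The last call shows why the newThr == 0 fallback exists: with an old capacity of 8 (below 16) the threshold is not doubled by shifting, so it is recomputed from the new capacity and the load factor.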

The code that migrates old elements into the new array follows; the optimization in it is worth studying.

if (oldTab != null) {
    // Walk every slot of the old array
    for (int j = 0; j < oldCap; ++j) {
        Node<K,V> e;
        if ((e = oldTab[j]) != null) {
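            oldTab[j] = null; // clear the slot in the old array
            if (e.next == null)
                // Only one node in this slot: recompute its index against the new capacity and place it in the new table.
                newTab[e.hash & (newCap - 1)] = e;
            else if (e instanceof TreeNode)
                // The bin has been treeified: handled by split(), discussed below
                ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
            else { // preserve order
                // The bin is a linked list, so every node needs a slot in the new table.
                // Two lists are kept: lo for nodes that stay at the original slot,
                // hi for nodes that move to original slot + oldCap. This is the optimization:
                // nodes are not re-inserted one at a time; they are first linked into the two
                // candidate lists, and each list is then attached to its new slot in one step.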
                Node<K,V> loHead = null, loTail = null;
                Node<K,V> hiHead = null, hiTail = null;
                Node<K,V> next;
                do {
                    next = e.next;
                    // If the hash bit selected by oldCap is 0, the node keeps its old index
                    if ((e.hash & oldCap) == 0) {
                        if (loTail == null)
                            loHead = e;
                        else
                            loTail.next = e;
                        loTail = e;
                    }
                    // Otherwise the node moves to the new slot (old index + oldCap)
                    else {
                        if (hiTail == null)
                            hiHead = e;
                        else
                            hiTail.next = e;
                        hiTail = e;
                    }
                } while ((e = next) != null);
                if (loTail != null) {
                    loTail.next = null;
                    // Attach the whole lo list at the old index
                    newTab[j] = loHead;
                }
                if (hiTail != null) {
                    hiTail.next = null;
                    // Attach the whole hi list at the new index
                    newTab[j + oldCap] = hiHead;
                }
            }
        }
    }
}

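As shown here, the test that decides which slot of the new table a node moves to is if ((e.hash & oldCap) == 0). Because oldCap is a power of two, it selects exactly the one extra hash bit that the larger table's mask takes into account, which matches the earlier explanation.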
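
The claim that testing a single bit gives the same slot as recomputing the full index can be checked empirically. The sketch below is illustrative only: SplitBitDemo, viaSplitBit, viaMask and agree are made-up names, and the check assumes oldCap is a power of two, as it always is inside HashMap.

```java
import java.util.Random;

public class SplitBitDemo {
    // Index chosen the way the list migration does it: keep j, or move to j + oldCap.
    static int viaSplitBit(int hash, int oldCap) {
        int j = hash & (oldCap - 1);                    // slot in the old table
        return (hash & oldCap) == 0 ? j : j + oldCap;
    }

    // Index chosen by masking against the doubled capacity directly.
    static int viaMask(int hash, int oldCap) {
        int newCap = oldCap << 1;
        return hash & (newCap - 1);
    }

    // Compares both computations on pseudo-random hashes (negative values included).
    static boolean agree(int oldCap, int samples) {
        Random rnd = new Random(42);
        for (int i = 0; i < samples; i++) {
            int h = rnd.nextInt();
            if (viaSplitBit(h, oldCap) != viaMask(h, oldCap))
                return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agree(16, 100_000));   // true
        System.out.println(agree(1024, 100_000)); // true
    }
}
```

The two agree because the new mask is the old mask plus the single bit oldCap, and those bit ranges do not overlap, so OR-ing them in is the same as adding oldCap.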

Tree degeneration + migration

The code above contains a branch that migrates tree nodes:

else if (e instanceof TreeNode)
    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);

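Here is the corresponding code. The migration is essentially the same as the linked-list migration above. As the comment says, a tree bin is split into a lower and an upper tree bin; if a resulting bin is too small (6 nodes or fewer), it is converted back into a linked list. This method is called only from resize().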

/**
 * Splits nodes in a tree bin into lower and upper tree bins,
 * or untreeifies if now too small. Called only from resize;
 * see above discussion about split bits and indices.
 *
 * @param map the map
 * @param tab the table for recording bin heads
 * @param index the index of the table being split
 * @param bit the bit of hash to split on
 */
final void split(HashMap<K,V> map, Node<K,V>[] tab, int index, int bit) {
    TreeNode<K,V> b = this;
    // Relink into lo and hi lists, preserving order
    TreeNode<K,V> loHead = null, loTail = null;
    TreeNode<K,V> hiHead = null, hiTail = null;
    int lc = 0, hc = 0;
    for (TreeNode<K,V> e = b, next; e != null; e = next) {
        next = (TreeNode<K,V>)e.next;
        e.next = null;
        // Split the nodes into lo and hi lists according to their index in the new table
        if ((e.hash & bit) == 0) {
            if ((e.prev = loTail) == null)
                loHead = e;
            else
                loTail.next = e;
            loTail = e;
            ++lc;
        }
        else {
            if ((e.prev = hiTail) == null)
                hiHead = e;
            else
                hiTail.next = e;
            hiTail = e;
            ++hc;
        }
    }

    // If a resulting bin is too small, convert it straight back to a linked list via untreeify(map)
    if (loHead != null) {
        // 6 nodes or fewer: convert back to a linked list
        if (lc <= UNTREEIFY_THRESHOLD)
            tab[index] = loHead.untreeify(map);
        else {
            tab[index] = loHead;
            if (hiHead != null) // (else is already treeified)
                loHead.treeify(tab);
        }
    }
    if (hiHead != null) {
        if (hc <= UNTREEIFY_THRESHOLD)
            tab[index + bit] = hiHead.untreeify(map);
        else {
            tab[index + bit] = hiHead;
            if (loHead != null)
                hiHead.treeify(tab);
        }
    }
}
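
The decision made at the end of split() can be illustrated without any tree code. The following is a simplified sketch, not the JDK implementation: SplitOutcome, formOf and split are hypothetical names, the bins are represented only by their node counts, and UNTREEIFY_THRESHOLD is re-declared with the value 6 that HashMap uses.

```java
public class SplitOutcome {
    static final int UNTREEIFY_THRESHOLD = 6; // same value as in HashMap

    // Hypothetical helper: the shape a bin ends up in, given its node count.
    static String formOf(int count) {
        if (count == 0) return "empty";
        return count <= UNTREEIFY_THRESHOLD ? "list" : "tree";
    }

    // Counts how many hashes go to the lo and hi bins for the given split bit
    // and returns {loForm, hiForm}.
    static String[] split(int[] hashes, int bit) {
        int lc = 0, hc = 0;
        for (int h : hashes) {
            if ((h & bit) == 0) ++lc; else ++hc;
        }
        return new String[] {formOf(lc), formOf(hc)};
    }

    public static void main(String[] args) {
        // Nine hashes that all share slot 3 in a 16-slot table; 5 go lo, 4 go hi.
        int[] mixed = {3, 19, 35, 51, 67, 83, 99, 115, 131};
        String[] a = split(mixed, 16);
        System.out.println(a[0] + " / " + a[1]); // list / list

        // Nine hashes whose bit 16 is always 0: everything stays in the lo bin.
        int[] allLo = {3, 35, 67, 99, 131, 163, 195, 227, 259};
        String[] b = split(allLo, 16);
        System.out.println(b[0] + " / " + b[1]); // tree / empty
    }
}
```

The second case corresponds to the "(else is already treeified)" comment in split(): when the hi list is empty, every node stayed in the lo bin, so the existing tree is reused as-is instead of being rebuilt.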