开篇简介

最近学习了HashMap在多线程场景下死循环的问题，但是发现大多博主对于这个问题复现的方案都是：模拟多线程模拟并发，对问题的复现基本是靠运气，理论无法与现实切合。作为严谨的程序员，靠运气显然是不可靠的，所以也就有了本文。

本文将分为以下三部分

HashMap死循环问题的源码级讲解
基于IntelliJ IDEA的多线程Debug
100%复现Hashmap死循环问题

推荐大家在学习完1，2两点后先自己尝试复现下。

HashMap的死循环问题(JDK1.7)

问题简介：在JDK1.7版本下，多线程调用HashMap的put方法后，会导致get方法死循环，从而导致cpu100%占用。

HashMap基础结构

 static class Entry<K,V> implements Map.Entry<K,V> {
        final K key;
        V value;
        Entry<K,V> next;
        int hash;

        /**
         * Creates new entry.
         */
        Entry(int h, K k, V v, Entry<K,V> n) {
            value = v;
            next = n;
            key = k;
            hash = h;
        }

put方法源码简析

在利用多线程Debug复现问题前 , 我们先通过阅读源码的形式了解下主要相关的几个方法.

put方法主要逻辑

我们从HashMap.put方法作为入口 , put方法的主要逻辑分为5大步.

散列表若为空 , 初始化散列表
处理key为null的节点
计算hash && 散列位置
已存在的key , 替换旧值
未存在的key , 新增节点

public V put(K key, V value) {
    // 散列表为空 , 进行初始化操作
        if (table == EMPTY_TABLE) {
            inflateTable(threshold);
        }
    // key = null 特殊处理
        if (key == null){
            return putForNullKey(value);
        }
    // 根据key计算hash值
        int hash = hash(key);
    // 根据hash计算数组位置
        int i = indexFor(hash, table.length);
    // 循环链表
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            // 若当前key已存在,则更新value , 返回oldValue
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }

        modCount++;
    // 添加新节点
        addEntry(hash, key, value, i);
        return null;
    }

addEntry新增节点逻辑

逻辑上分为2步:

若需要扩容 , 则进行扩容 , 扩容后散列表发生变化 , 重新计算节点hash & 散列位置
不需要扩容 || 扩容完成 , 则使用头插法新增节点.

 void addEntry(int hash, K key, V value, int bucketIndex) {
     // 判断是否需要扩容
        if ((size >= threshold) && (null != table[bucketIndex])) {
            // 进行扩容
            resize(2 * table.length);
            // 重新计算hash 与 散列数组中的位置
            hash = (null != key) ? hash(key) : 0;
            bucketIndex = indexFor(hash, table.length);
        }
     // 添加新节点
        createEntry(hash, key, value, bucketIndex);
    }

resize扩容方法逻辑

逻辑上分为3步:

创建新散列表 , 若当前散列表已达到最大值 , 则重置阀值后 , 直接返回
迁移原有散列表数据
重新计算扩容阀值

void resize(int newCapacity) {
        Entry[] oldTable = table;
        int oldCapacity = oldTable.length;
    // 散列数组最大值限制 , 若已达到最大则直接返回
    // MAXIMUM_CAPACITY = 1 << 30;
        if (oldCapacity == MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return;
        }
// 创建新散列数组
        Entry[] newTable = new Entry[newCapacity];
    // 将原有散列数组数据迁移至 新数组
        transfer(newTable, initHashSeedAsNeeded(newCapacity));
        table = newTable;
    // 重新计算 扩容阀值
        threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }

transfer迁移历史数据方法逻辑

终于来到了 "案发现场" , transfer方法主要的目的就是将原有散列表中的数据 , 搬迁至新散列表中 . 在多线程场景下 , 因为使用了头插法则产生了成环的风险.

主要逻辑为:

循环原有散列表中的节点
利用头插法循环插入新散列表中

 void transfer(Entry[] newTable, boolean rehash) {
        int newCapacity = newTable.length;
     // 循环old散列数组
        for (Entry<K,V> e : table) {
            // 循环当前链表
            while(null != e) {
                Entry<K,V> next = e.next;
                // 若需要重新hash 则进行rehash
                if (rehash) {
                    e.hash = null == e.key ? 0 : hash(e.key);
                }
                // 计算新散列数组位置
                int i = indexFor(e.hash, newCapacity);
                // 采用头插法进行数据搬迁
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            }
        }
    }

死循环案例分析

当前数据场景：

hashMap 容量：4，扩容因子：0.75（默认）
当前已存在数据1，9，16
线程1与线程2，同时进行put操作触发扩容，从而触发数据迁移

第一步：线程2进入transfer逻辑循环，e=16，e.next=9，变量赋值后，线程2执行时间片耗尽，线程1执行数据迁移前，内存分布如下

第二步：线程1完成数据迁移，但未替换原有table，因采用头插法数据迁移后倒序，内存分布如下

第三步：线程2执行，执行前引用状态：e=16，e.next=9，table[1] = null。进入第一次循环后，内存分布如下

第四步：线程2继续执行，执行前引用状态：e=9，e.next=16（线程1迁移导致），table[1] = 16。进入第二次循环后，内存分布如下

第五步：线程2继续执行，执行前引用状态：e=16，e.next=null，table[1] = 9。进入第三次循环后引用状态：e=null，e.next=null , table[1]=16，16.next=9，9.next=16（循环引用成环）内存分布如下

多线程Debug

入门

多线程的断点设置也非常简单

设置断点
打开设置面板
选择需要进行多线程debug的断点
设置断点停顿级别为: 线程
生效

试一下是否有效 , 此时我们就可以对不同线程分别进行Debug的控制了

/**
测试环境: JDK1.7
开发工具: IDEA 2018.3
*/
public class Main {
    public static void main(String[] args) {
        Runnable testRunnable = new Runnable() {
            @Override
            public void run() {
                System.out.println("当前线程:" + Thread.currentThread().getName());
            }
        };
        Thread testThread1 = new Thread(testRunnable);
        testThread1.start();

        Thread testThread2 = new Thread(testRunnable);
        testThread2.start();
    }
}

100%复现Hashmap死循环

/**
用于Debug的测试方法
*/
public class Main {
    public static void main(String[] args) {
        final HashMap testMap = new HashMap(4);
        testMap.put(1, 1);
        testMap.put(9, 9);
        testMap.put(16, 16);
        Runnable testRunnable = new Runnable() {
            public void run() {
                System.out.println(Thread.currentThread().getName());
                testMap.put(13, 13);

            }
        };
        Thread testThread1 = new Thread(testRunnable,"Test-Thread-1");
        testThread1.start();
        Thread testThread2 = new Thread(testRunnable, "Test-Thread-2");
        testThread2.start();
    }
    
    // 此方法计算可成链的key值
    public static void computeNum() {
        for (int i = 0; i < 10000; i++) {
            int hashCode = hash(i);

            if (hashCode % 4 == 1 && hashCode % 4 == hashCode % 8) {
                System.out.printf("hash:%d , i:%d", hashCode, i);
                System.out.println();
            }
        }
    }

    /**
     * 从1.7的HashMap中拷贝的hash方法
     * @param k
     * @return
     */
    public static final int hash(Object k) {
        int h = 0;
        if (0 != h && k instanceof String) {
            return sun.misc.Hashing.stringHash32((String) k);
        }

        h ^= k.hashCode();

        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

}