Hashtable Source Code Walkthrough

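Today's topic is Hashtable, a class that rarely shows up in day-to-day work.

In everyday code we reach for HashMap far more often than Hashtable. HashMap is not thread-safe, though, so concurrent code needs a different data structure, and Hashtable is one option. ConcurrentHashMap is also thread-safe and generally performs better than Hashtable. This article covers only Hashtable; ConcurrentHashMap is out of scope.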

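This article is based on Java 11. Implementations differ slightly between versions, so treat your own version as authoritative. Note that the listings below use the name HashtableEntry and carry an "Android-changed" comment, which suggests they come from Android's copy of the class; OpenJDK names the entry class Entry.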

Key fields

public class Hashtable<K,V>
    extends Dictionary<K,V>
    implements Map<K,V>, Cloneable, java.io.Serializable {
    
    // The bucket array that holds the table's entries
    private transient HashtableEntry<?,?>[] table;
    
    // Number of key-value pairs stored
    private transient int count;
    
    // Rehash threshold: threshold = capacity * loadFactor
    private int threshold;
    
    // Load factor; 0.75 by default
    private float loadFactor;
    
}

Constructors

// Default capacity is 11, default load factor is 0.75
public Hashtable() {
    this(11, 0.75f);
}

public Hashtable(int initialCapacity) {
    this(initialCapacity, 0.75f);
}

// Validates the arguments and allocates the HashtableEntry array
public Hashtable(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal Capacity: "+
                                           initialCapacity);
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal Load: "+loadFactor);

    if (initialCapacity==0)
        initialCapacity = 1;
    this.loadFactor = loadFactor;
    table = new HashtableEntry<?,?>[initialCapacity];
    // Android-changed: Ignore loadFactor when calculating threshold from initialCapacity
    // threshold = (int)Math.min(initialCapacity * loadFactor, MAX_ARRAY_SIZE + 1);
    threshold = (int)Math.min(initialCapacity, MAX_ARRAY_SIZE + 1);
}
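
The argument checks in the constructor can be observed from the outside. The following is a minimal sketch against the public java.util.Hashtable API; the class name HashtableConstructorDemo and the helper isRejected are made up for illustration.

```java
import java.util.Hashtable;

public class HashtableConstructorDemo {
    // Hypothetical helper: returns true if the constructor rejects these arguments.
    static boolean isRejected(int initialCapacity, float loadFactor) {
        try {
            new Hashtable<String, String>(initialCapacity, loadFactor);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(isRejected(-1, 0.75f));     // negative capacity
        System.out.println(isRejected(11, 0f));        // non-positive load factor
        System.out.println(isRejected(11, Float.NaN)); // NaN load factor
        System.out.println(isRejected(0, 0.75f));      // zero capacity is accepted (bumped to 1 internally)
    }
}
```

Only the first three calls are rejected; a capacity of 0 is silently replaced by 1, as the listing above shows.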

`HashtableEntry` structure

private static class HashtableEntry<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    V value;
    HashtableEntry<K,V> next;

    protected HashtableEntry(int hash, K key, V value, HashtableEntry<K,V> next) {
        this.hash = hash;
        this.key =  key;
        this.value = value;
        this.next = next;
    }
}

This is similar to HashMap's Node: each entry stores the key, the value, the hash, and a reference to the next entry in the bucket's linked list.

put

// synchronized on the method, which makes the call thread-safe
public synchronized V put(K key, V value) {
    // A null value throws NullPointerException
    if (value == null) {
        throw new NullPointerException();
    }

    HashtableEntry<?,?> tab[] = table;
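    // Hash code of the key (a null key throws NullPointerException here)
    int hash = key.hashCode();
    // Index of the bucket for this key
    int index = (hash & 0x7FFFFFFF) % tab.length;
    @SuppressWarnings("unchecked")
    HashtableEntry<K,V> entry = (HashtableEntry<K,V>)tab[index];
    // Walk the bucket's chain; if an entry has the same hash and an equal key, replace its value and return the old one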
    for(; entry != null ; entry = entry.next) {
        if ((entry.hash == hash) && entry.key.equals(key)) {
            V old = entry.value;
            entry.value = value;
            return old;
        }
    }

    addEntry(hash, key, value, index);
    return null;
}

When the key is not already present, put inserts the entry through addEntry and returns null.
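
The return values and null handling of put can be checked with a short sketch against the public API. The class name HashtablePutDemo and the helper putThrowsNpe are hypothetical names used only for this example.

```java
import java.util.Hashtable;

public class HashtablePutDemo {
    // Hypothetical helper: returns true if put(key, value) throws NullPointerException.
    static boolean putThrowsNpe(String key, Integer value) {
        try {
            new Hashtable<String, Integer>().put(key, value);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Hashtable<String, Integer> table = new Hashtable<>();
        Integer first = table.put("a", 1);   // new key: inserted, returns null
        Integer second = table.put("a", 2);  // existing key: value replaced, old value returned
        System.out.println(first + " " + second + " " + table.get("a"));
        System.out.println(putThrowsNpe("k", null)); // explicit null check on the value
        System.out.println(putThrowsNpe(null, 1));   // key.hashCode() on a null key
    }
}
```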

private void addEntry(int hash, K key, V value, int index) {
    modCount++;

    HashtableEntry<?,?> tab[] = table;
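    // The entry count has reached the threshold, so grow the table
    if (count >= threshold) {
        // Grow the array and redistribute the existing entries
        rehash();

        // After growing, recompute the hash and index, because the index depends on the table length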
        tab = table;
        hash = key.hashCode();
        index = (hash & 0x7FFFFFFF) % tab.length;
    }
    
    // Insert the new entry at the head of the bucket's chain
    HashtableEntry<K,V> e = (HashtableEntry<K,V>) tab[index];
    tab[index] = new HashtableEntry<>(hash, key, value, e);
    count++;
}
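
The index expression (hash & 0x7FFFFFFF) % tab.length appears in almost every method, so it is worth seeing why the mask is there. In Java the remainder operator keeps the sign of the dividend, and hashCode() may be negative; clearing the sign bit first guarantees a valid array index. A small sketch, where BucketIndexDemo and indexFor are illustrative names rather than part of Hashtable:

```java
public class BucketIndexDemo {
    // Same expression as in the listings: clear the sign bit, then take the remainder.
    static int indexFor(int hash, int tableLength) {
        return (hash & 0x7FFFFFFF) % tableLength;
    }

    public static void main(String[] args) {
        int length = 11; // default capacity from the constructor section
        System.out.println(-7 % length);                          // plain remainder is negative here
        System.out.println(indexFor(-7, length));                 // masked version stays in [0, length)
        System.out.println(indexFor(Integer.MIN_VALUE, length));  // only the sign bit is set, so the mask leaves 0
        System.out.println(Math.abs(Integer.MIN_VALUE) < 0);      // Math.abs cannot fix this edge case
    }
}
```

The mask is also cheaper than a branch and, unlike Math.abs, has no edge case at Integer.MIN_VALUE.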

get

// synchronized on the method, which makes the call thread-safe
public synchronized V get(Object key) {
    HashtableEntry<?,?> tab[] = table;
    // Hash code of the key (a null key throws NullPointerException here)
    int hash = key.hashCode();
    // Index of the bucket for this key
    int index = (hash & 0x7FFFFFFF) % tab.length;
    // Walk the bucket's chain
    for (HashtableEntry<?,?> e = tab[index] ; e != null ; e = e.next) {
        // Return the value only when both the hash and the key match
        if ((e.hash == hash) && e.key.equals(key)) {
            return (V)e.value;
        }
    }
    // No entry for this key: return null
    return null;
}
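
A brief usage sketch of get, assuming nothing beyond the public API. HashtableGetDemo and getRejectsNullKey are hypothetical names for this example.

```java
import java.util.Hashtable;

public class HashtableGetDemo {
    // Hypothetical helper: returns true if get(null) throws NullPointerException.
    static boolean getRejectsNullKey(Hashtable<String, Integer> table) {
        try {
            table.get(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("a", 1);
        System.out.println(table.get("a"));           // value stored under "a"
        System.out.println(table.get("missing"));     // null: no entry for this key
        System.out.println(getRejectsNullKey(table)); // key.hashCode() fails on a null key
    }
}
```

Because values can never be null in a Hashtable, a null result from get always means the key is absent.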

rehash

protected void rehash() {
    int oldCapacity = table.length;
    HashtableEntry<?,?>[] oldMap = table;

    // New capacity = old capacity shifted left by 1 bit, plus 1 = 2 * oldCapacity + 1
    int newCapacity = (oldCapacity << 1) + 1;
    // Capped at MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8
    if (newCapacity - MAX_ARRAY_SIZE > 0) {
        if (oldCapacity == MAX_ARRAY_SIZE)
            return;
        newCapacity = MAX_ARRAY_SIZE;
    }
    HashtableEntry<?,?>[] newMap = new HashtableEntry<?,?>[newCapacity];

    modCount++;
    // Recompute the threshold
    threshold = (int)Math.min(newCapacity * loadFactor, MAX_ARRAY_SIZE + 1);
    table = newMap;

    // Move the entries from the old array into the new one
    for (int i = oldCapacity ; i-- > 0 ;) {
        for (HashtableEntry<K,V> old = (HashtableEntry<K,V>)oldMap[i] ; old != null ; ) {
            HashtableEntry<K,V> e = old;
            old = old.next;

            // Recompute the index, because it depends on the capacity
            int index = (e.hash & 0x7FFFFFFF) % newCapacity;
            // Head-insert into the new bucket's chain
            e.next = (HashtableEntry<K,V>)newMap[index];
            newMap[index] = e;
        }
    }
}
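
To make the growth rule concrete, the sketch below re-implements only the capacity and threshold arithmetic from rehash as standalone functions. It does not touch a real Hashtable; RehashGrowthDemo, nextCapacity, and thresholdFor are illustrative names, and the constant mirrors the MAX_ARRAY_SIZE value quoted in the listing.

```java
public class RehashGrowthDemo {
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    // Capacity after one rehash: 2 * old + 1, capped at MAX_ARRAY_SIZE.
    // The subtraction-based comparison stays correct even if the shift overflows.
    static int nextCapacity(int oldCapacity) {
        int newCapacity = (oldCapacity << 1) + 1;
        if (newCapacity - MAX_ARRAY_SIZE > 0) {
            newCapacity = MAX_ARRAY_SIZE;
        }
        return newCapacity;
    }

    // Threshold as computed inside rehash.
    static int thresholdFor(int capacity, float loadFactor) {
        return (int) Math.min(capacity * loadFactor, MAX_ARRAY_SIZE + 1);
    }

    public static void main(String[] args) {
        int capacity = 11; // default initial capacity
        for (int i = 0; i < 4; i++) {
            System.out.println(capacity + " -> threshold " + thresholdFor(capacity, 0.75f));
            capacity = nextCapacity(capacity);
        }
    }
}
```

Starting from 11, the capacities run 11, 23, 47, 95: always odd, unlike HashMap's powers of two.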

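Those are Hashtable's three most important methods: put (store), get (retrieve), and rehash (grow). The sections below look at how some of its other methods are implemented.

Other methods

This section covers contains, containsKey, putIfAbsent, remove, and replace.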

contains

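This method checks whether the Hashtable contains a given value.

// synchronized on the method, which makes the call thread-safe
public synchronized boolean contains(Object value) {
    // A null value throws NullPointerException, because a Hashtable never stores null values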
    if (value == null) {
        throw new NullPointerException();
    }

    HashtableEntry<?,?> tab[] = table;
    // Scan every bucket in the array
    for (int i = tab.length ; i-- > 0 ;) {
        // Walk the bucket's chain
        for (HashtableEntry<?,?> e = tab[i] ; e != null ; e = e.next) {
            // Found an entry with an equal value: return true
            if (e.value.equals(value)) {
                return true;
            }
        }
    }
    return false;
}

containsKey

This method checks whether the Hashtable contains a given key.

// synchronized on the method, which makes the call thread-safe
public synchronized boolean containsKey(Object key) {
    HashtableEntry<?,?> tab[] = table;
    // Compute the hash and the bucket index
    int hash = key.hashCode();
    int index = (hash & 0x7FFFFFFF) % tab.length;
    for (HashtableEntry<?,?> e = tab[index] ; e != null ; e = e.next) {
        // Found an entry with the same hash and an equal key: return true
        if ((e.hash == hash) && e.key.equals(key)) {
            return true;
        }
    }
    return false;
}
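
The two methods are easy to mix up because contains searches values, not keys. It also has to scan the whole table, whereas containsKey only walks one bucket. A short sketch using the public API; HashtableContainsDemo and sample are hypothetical names.

```java
import java.util.Hashtable;

public class HashtableContainsDemo {
    // Hypothetical helper: builds a small table for the demo.
    static Hashtable<String, Integer> sample() {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("one", 1);
        table.put("two", 2);
        return table;
    }

    public static void main(String[] args) {
        Hashtable<String, Integer> table = sample();
        System.out.println(table.contains(1));        // searches values
        System.out.println(table.containsValue(1));   // Map-interface equivalent of contains
        System.out.println(table.containsKey("one")); // searches keys
        System.out.println(table.contains("one"));    // "one" is a key, not a value
    }
}
```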

putIfAbsent

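This works much like put, except that when the key already has a mapping in the Hashtable, the existing value is not replaced.

// synchronized on the method, which makes the call thread-safe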
public synchronized V putIfAbsent(K key, V value) {
    Objects.requireNonNull(value);

    HashtableEntry<?,?> tab[] = table;
    int hash = key.hashCode();
    int index = (hash & 0x7FFFFFFF) % tab.length;
    @SuppressWarnings("unchecked")
    HashtableEntry<K,V> entry = (HashtableEntry<K,V>)tab[index];
    for (; entry != null; entry = entry.next) {
        if ((entry.hash == hash) && entry.key.equals(key)) {
            V old = entry.value;
            // This is where it differs from put: the value is written only if the old one is null
            if (old == null) {
                entry.value = value;
            }
            return old;
        }
    }

    addEntry(hash, key, value, index);
    return null;
}
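
A minimal sketch of the difference between put and putIfAbsent as seen by a caller. HashtablePutIfAbsentDemo and sample are hypothetical names for this example.

```java
import java.util.Hashtable;

public class HashtablePutIfAbsentDemo {
    // Hypothetical helper: a table that already maps "a" to 1.
    static Hashtable<String, Integer> sample() {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("a", 1);
        return table;
    }

    public static void main(String[] args) {
        Hashtable<String, Integer> table = sample();
        Integer existing = table.putIfAbsent("a", 99); // key present: returns the current value, keeps it
        Integer absent = table.putIfAbsent("b", 2);    // key absent: inserts the pair, returns null
        System.out.println(existing + " " + table.get("a"));
        System.out.println(absent + " " + table.get("b"));
    }
}
```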

remove

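Removes a key-value pair from the Hashtable. If an entry was removed, the method returns the value that was mapped to the key; otherwise it returns null.

// synchronized on the method, which makes the call thread-safe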
public synchronized V remove(Object key) {
    HashtableEntry<?,?> tab[] = table;
    int hash = key.hashCode();
    int index = (hash & 0x7FFFFFFF) % tab.length;
    @SuppressWarnings("unchecked")
    // Head of the bucket at position index
    HashtableEntry<K,V> e = (HashtableEntry<K,V>)tab[index];
    // Walk the chain while tracking the previous entry, because unlinking requires rewiring the pointers
    for(HashtableEntry<K,V> prev = null ; e != null ; prev = e, e = e.next) {
        // Found the entry in the chain
        if ((e.hash == hash) && e.key.equals(key)) {
            modCount++;
            // The entry is not the head: point the previous entry at this entry's successor
            if (prev != null) {
                prev.next = e.next;
            } else {
                // The entry is the head of the chain: the bucket now starts at its successor
                tab[index] = e.next;
            }
            count--;
            V oldValue = e.value;
            e.value = null;
            return oldValue;
        }
    }
    // No entry for this key in the table
    return null;
}
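
From the caller's side, remove behaves as in the following sketch. HashtableRemoveDemo and sample are hypothetical names.

```java
import java.util.Hashtable;

public class HashtableRemoveDemo {
    // Hypothetical helper: a table with two entries.
    static Hashtable<String, Integer> sample() {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("a", 1);
        table.put("b", 2);
        return table;
    }

    public static void main(String[] args) {
        Hashtable<String, Integer> table = sample();
        Integer removed = table.remove("a"); // entry exists: unlinked, old value returned
        Integer again = table.remove("a");   // entry already gone: returns null
        System.out.println(removed + " " + again + " " + table.size());
    }
}
```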

replace

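As the name suggests, this replaces the value mapped to a key in the Hashtable. If the key is present, the old value is returned; otherwise the method returns null and the table is left unchanged.

// synchronized on the method, which makes the call thread-safe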
public synchronized V replace(K key, V value) {
    Objects.requireNonNull(value);
    HashtableEntry<?,?> tab[] = table;
    int hash = key.hashCode();
    int index = (hash & 0x7FFFFFFF) % tab.length;
    @SuppressWarnings("unchecked")
    HashtableEntry<K,V> e = (HashtableEntry<K,V>)tab[index];
    for (; e != null; e = e.next) {
        // Found the entry for this key
        if ((e.hash == hash) && e.key.equals(key)) {
            V oldValue = e.value;
            e.value = value;
            return oldValue;
        }
    }
    return null;
}
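
Unlike put, replace never creates a new entry. A short sketch of that behavior; HashtableReplaceDemo and sample are hypothetical names.

```java
import java.util.Hashtable;

public class HashtableReplaceDemo {
    // Hypothetical helper: a table that maps "a" to 1.
    static Hashtable<String, Integer> sample() {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("a", 1);
        return table;
    }

    public static void main(String[] args) {
        Hashtable<String, Integer> table = sample();
        Integer old = table.replace("a", 10);      // key present: value swapped, old value returned
        Integer none = table.replace("zzz", 5);    // key absent: returns null, nothing is inserted
        System.out.println(old + " " + table.get("a"));
        System.out.println(none + " " + table.containsKey("zzz"));
    }
}
```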

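Differences between `Hashtable` and `HashMap`

Hashtable and HashMap are both key-value storage structures. Hashtable is backed by an array plus linked lists. HashMap used the same layout in Java 7; since Java 8 it can convert long bucket chains into red-black trees to speed up lookups.

HashMap allows both null keys and null values, and a null key hashes to 0; Hashtable allows neither, and a null key or value throws NullPointerException.
HashMap is not thread-safe, while Hashtable is: most of Hashtable's methods are marked synchronized, which is also why its performance is comparatively poor.
Hashtable extends Dictionary, whereas HashMap extends AbstractMap.
HashMap's default capacity is 16 and its capacity is always a power of two; Hashtable's default capacity is 11.
When it grows, HashMap doubles its capacity; Hashtable grows to twice the old capacity plus 1.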
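
The difference in null handling is the one most likely to surface at runtime, so here is a small sketch that probes both classes through the Map interface. NullHandlingDemo, rejectsNullKey, and rejectsNullValue are hypothetical names for this example.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

public class NullHandlingDemo {
    // Hypothetical helper: true if the map throws NullPointerException for a null key.
    static boolean rejectsNullKey(Map<String, String> map) {
        try {
            map.put(null, "v");
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Hypothetical helper: true if the map throws NullPointerException for a null value.
    static boolean rejectsNullValue(Map<String, String> map) {
        try {
            map.put("k", null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsNullKey(new HashMap<>()));     // HashMap accepts a null key
        System.out.println(rejectsNullValue(new HashMap<>()));   // HashMap accepts a null value
        System.out.println(rejectsNullKey(new Hashtable<>()));   // Hashtable rejects a null key
        System.out.println(rejectsNullValue(new Hashtable<>())); // Hashtable rejects a null value
    }
}
```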
