HashMap源码分析

1,920 阅读6分钟

本文针对HashMap源码中的一些重要方法做讲解。

Android中的HashMap与java中HashMap实现有差异,这里以Android的源码为例进行讲解。

成员变量说明

	1. int DEFAULT_INITIAL_CAPACITY = 4//默认初始化的容量(6.0上为4,后续又修改为16)
	2. int MAXIMUM_CAPACITY = 1 << 30; //HashMap最大的容量
	3. float DEFAULT_LOAD_FACTOR = 0.75f; //判断是否扩容时用的计算因子
	4. int threshold ; //用于比较是否该扩容,其值为 capacity * load factor
	5. int size; //当前存储的数据容量
	6. HashMapEntry<K,V>[] table = (HashMapEntry<K,V>[]) EMPTY_TABLE;//数据存储在这

成员变量table

HashMap内部实现的是Map.Entry<K,V> 的,数据以数组形式保存的链表。

	transient HashMapEntry<K,V>[] table = (HashMapEntry<K,V>[]) EMPTY_TABLE;

看一下HashMapEntry里面的代码,很简单:

	 /** @hide */  // Android added.
    static class HashMapEntry<K,V> implements Map.Entry<K,V> {
        final K key;
        V value;
        HashMapEntry<K,V> next;
        int hash;
        
        ...
	}

保存了数据的key、value、hash以及以及下一个数据,就是典型的链表储存形式。至于put进来的数据,如何确定保存在table中的位置,请看下文。

构造函数

构造函数主要对初始化容量以及加载因子做了参数校验,以符合后续的操作

    /**
     * Constructs an empty <tt>HashMap</tt> with the specified initial
     * capacity and load factor.
     *
     * @param  initialCapacity the initial capacity
     * @param  loadFactor      the load factor
     * @throws IllegalArgumentException if the initial capacity is negative
     *         or the load factor is nonpositive
     */
    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
                                               
        //初始化容量                                       
        if (initialCapacity > MAXIMUM_CAPACITY) {
            initialCapacity = MAXIMUM_CAPACITY;
        } else if (initialCapacity < DEFAULT_INITIAL_CAPACITY) {
            initialCapacity = DEFAULT_INITIAL_CAPACITY;
        }
			
		//校验加载因子的准确性
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        // Android-Note: We always use the default load factor of 0.75f.

        // This might appear wrong but it's just awkward design. We always call
        // inflateTable() when table == EMPTY_TABLE. That method will take "threshold"
        // to mean "capacity" and then replace it with the real threshold (i.e, multiplied with
        // the load factor).
        
        //默认扩容的阀值就是初始的容量
        threshold = initialCapacity;
        init();
    }

    /**
     * Constructs an empty <tt>HashMap</tt> with the specified initial
     * capacity and the default load factor (0.75).
     *
     * @param  initialCapacity the initial capacity.
     * @throws IllegalArgumentException if the initial capacity is negative.
     */
    public HashMap(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }

    /**
     * Constructs an empty <tt>HashMap</tt> with the default initial capacity
     * (16) and the default load factor (0.75).
     */
    public HashMap() {
        this(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR);
    }

数据put操作

最基本的数据保存的方法,下面细分讲解:

	public V put(K key, V value) {
		if (table == EMPTY_TABLE) {
            inflateTable(threshold);
        }
        if (key == null)
            return putForNullKey(value);
        int hash = sun.misc.Hashing.singleWordWangJenkinsHash(key);
        int i = indexFor(hash, table.length);
        for (HashMapEntry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }

        modCount++;
        addEntry(hash, key, value, i);
        return null;
	}

操作流程如下:

  1. 内部回判断当前table是否为空,为空则初始化table,
  2. table不为空,则查询当前的key是否在table中有保存,有则用新value替换旧value,并返回旧value
  3. 若当前key未保存数据,则创建新的entry进行保存(中间有一步扩容的操作)

table初始化

	    /**
     * Inflates the table.
     */
    private void inflateTable(int toSize) {
        // Find a power of 2 >= toSize
        
        //取最靠近(大于等于)toSize的2的整数倍值
        int capacity = roundUpToPowerOf2(toSize);

        // Android-changed: Replace usage of Math.min() here because this method is
        // called from the <clinit> of runtime, at which point the native libraries
        // needed by Float.* might not be loaded.
        float thresholdFloat = capacity * loadFactor;
        if (thresholdFloat > MAXIMUM_CAPACITY + 1) {
            thresholdFloat = MAXIMUM_CAPACITY + 1;
        }

		//真正初始化临界值
        threshold = (int) thresholdFloat;
        table = new HashMapEntry[capacity];
    }

看一下里面取2整数倍的方法

      private static int roundUpToPowerOf2(int number) {
        // assert number >= 0 : "number must be non-negative";
        int rounded = number >= MAXIMUM_CAPACITY
                ? MAXIMUM_CAPACITY
                : (rounded = Integer.highestOneBit(number)) != 0//最高位为0,直接返回1
                    ? (Integer.bitCount(number) > 1) ? rounded << 1 : rounded//如果高位1的个数大于1,肯定不是2的整数倍,rounded扩大一倍,保证比number大
                    : 1;

        return rounded;
    }

对于里面Integer.bitCount()不理解的同学可以参考这篇文章

Integer.bitCount实现过程

value存储

对key进行判断,若key为null则存储到table的第一位,否则查找应该存储的位置

	int hash = sun.misc.Hashing.singleWordWangJenkinsHash(key);
	int i = indexFor(hash, table.length);

每次put以及get都会用到上面2个方法,hash值用于确定entry在table中的位置

看一下indexFor()

	/**
     * Returns index for hash code h.
     */
    static int indexFor(int h, int length) {
        // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
        
        //上文可以知道table的length一定为2的倍数,这里length-1转换成二进制的时候可以肯定所有的位数都为1,
        //h & (length-1)可以保证不同的hash值在table里分布
        return h & (length-1);
    }

获取到index之后,去查找对应的链表,并循环遍历是否有保存的value,有则替换,否则插入新数据

	for (HashMapEntry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }

来看一下插入数据的操作

	    /**
     * Adds a new entry with the specified key, value and hash code to
     * the specified bucket.  It is the responsibility of this
     * method to resize the table if appropriate.
     *
     * Subclass overrides this to alter the behavior of put method.
     */
    void addEntry(int hash, K key, V value, int bucketIndex) {
    
    	//这里用到了上面提到的threshold,到达临界值,同时当前插入的位置上有链表后,进行扩容,并将数据重新存入(length改变后,同一个hash值通过indexFor()计算得到的位置可能会变)
        if ((size >= threshold) && (null != table[bucketIndex])) {
            resize(2 * table.length);
            hash = (null != key) ? sun.misc.Hashing.singleWordWangJenkinsHash(key) : 0;
            bucketIndex = indexFor(hash, table.length);
        }

        createEntry(hash, key, value, bucketIndex);
    }

数据get操作

相对put操作,get就比较简单了

	    /**
     * Returns the entry associated with the specified key in the
     * HashMap.  Returns null if the HashMap contains no mapping
     * for the key.
     */
    final Entry<K,V> getEntry(Object key) {
        if (size == 0) {
            return null;
        }

	// 找到table中的链表位置后遍历循环比较取出数据即可
        int hash = (key == null) ? 0 : sun.misc.Hashing.singleWordWangJenkinsHash(key);
        for (HashMapEntry<K,V> e = table[indexFor(hash, table.length)];
             e != null;
             e = e.next) {
            Object k;
            if (e.hash == hash &&
                ((k = e.key) == key || (key != null && key.equals(k))))
                return e;
        }
        return null;
    }

遍历

遍历的方法主要有3个

  1. keySet();
  2. values();
  3. entrySet();

这三个方法内部都是用的各自实现的HashIterator来进行遍历

	    private abstract class HashIterator<E> implements Iterator<E> {
        HashMapEntry<K,V> next;        // next entry to return
        int expectedModCount;   // For fast-fail
        int index;              // current slot
        HashMapEntry<K,V> current;     // current entry

        HashIterator() {
            expectedModCount = modCount;
            if (size > 0) { // advance to first entry
                HashMapEntry[] t = table;
                //找到第一个不为null的HashMapEntry
                while (index < t.length && (next = t[index++]) == null)
                    ;
            }
        }

        public final boolean hasNext() {
            return next != null;
        }

        final Entry<K,V> nextEntry() {
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
            HashMapEntry<K,V> e = next;
            if (e == null)
                throw new NoSuchElementException();

            if ((next = e.next) == null) {
            //如果当前链表遍历完了,则查找下一个不为null的链表的位置
                HashMapEntry[] t = table;
                while (index < t.length && (next = t[index++]) == null)
                    ;
            }
            current = e;
            return e;
        }

        public void remove() {
            if (current == null)
                throw new IllegalStateException();
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
            Object k = current.key;
            current = null;
            HashMap.this.removeEntryForKey(k);
            expectedModCount = modCount;
        }
    }

总结

  1. HashMap插入数据不同步,所以是线程不安全的
  2. 内部使用链表 + 数组的形式保存所有的数据
  3. 每次扩容都是当前容量的2倍