The hash Method in HashMap

Preface

A few reading notes. Why doesn't HashMap use an object's hashCode directly, and instead pass it through the hash method before locating the slot?

Details

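HashMap contains a hash function. When first thinking about slot location, it is tempting to assume that the hashCode is simply combined with the length of the Node[] table (initial value 16) through a bitwise AND, that is hashCode & (length - 1), to get the slot. The problem with that approach is that the result is not random enough: only the lowest bits of the hashCode influence the slot.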

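An object's hashCode is 32 bits wide. With a table of length 16, only the lowest 4 bits would take part in the AND, which wastes the remaining bits and raises the collision probability considerably. So HashMap first XORs the high 16 bits of the hashCode into its low 16 bits to obtain a new hash value, and then ANDs that hash value with (table length - 1) to locate the slot.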

    /**
     * Computes key.hashCode() and spreads (XORs) higher bits of hash
     * to lower.  Because the table uses power-of-two masking, sets of
     * hashes that vary only in bits above the current mask will
     * always collide. (Among known examples are sets of Float keys
     * holding consecutive whole numbers in small tables.)  So we
     * apply a transform that spreads the impact of higher bits
     * downward. There is a tradeoff between speed, utility, and
     * quality of bit-spreading. Because many common sets of hashes
     * are already reasonably distributed (so don't benefit from
     * spreading), and because we use trees to handle large sets of
     * collisions in bins, we just XOR some shifted bits in the
     * cheapest possible way to reduce systematic lossage, as well as
     * to incorporate impact of the highest bits that would otherwise
     * never be used in index calculations because of table bounds.
     */
    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }
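
To make the effect concrete, here is a minimal sketch that compares slot selection with and without the spreading step. The class name HashSpreadDemo and the helpers spread, indexFor and distinctSlots are made up for this illustration and are not part of the JDK; spread repeats the transform from the hash method quoted above, and indexFor assumes the table length n is a power of two. The keys are Integer values of the form i << 16, chosen on purpose so that their hash codes differ only above bit 15.

```java
import java.util.HashSet;
import java.util.Set;

public class HashSpreadDemo {
    // Same transform as the HashMap.hash(Object) method quoted above.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Slot index for a table whose length n is a power of two (assumption).
    static int indexFor(int hash, int n) {
        return hash & (n - 1);
    }

    // Counts how many distinct slots 16 hand-picked keys land in.
    static int distinctSlots(boolean useSpread, int n) {
        Set<Integer> slots = new HashSet<>();
        for (int i = 0; i < 16; i++) {
            Integer key = i << 16; // Integer.hashCode() is the value itself
            int h = useSpread ? spread(key) : key.hashCode();
            slots.add(indexFor(h, n));
        }
        return slots.size();
    }

    public static void main(String[] args) {
        System.out.println("raw hashCode: " + distinctSlots(false, 16) + " slot(s)");
        System.out.println("with spread:  " + distinctSlots(true, 16) + " slot(s)");
    }
}
```

With a table of length 16, the raw hash codes all fall into slot 0, because their low 4 bits are zero. After spreading, the low 4 bits equal i, so the 16 keys occupy 16 different slots. This is a deliberately extreme input; for hash codes that are already well distributed, the extra XOR changes little, which is the tradeoff the Javadoc describes.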