Environment: JDK 1.8

What is a hash

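A hash, or hash value, is the fixed-length output produced by passing an input of arbitrary length through a hash function. The transformation is a compressing mapping: the hash is usually much shorter than the input and can be viewed as a digest or fingerprint of it. Because of that, different inputs may produce the same hash.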
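
As a small illustration of the two points above (fixed-length output, and different inputs sharing one hash), here is a minimal sketch using String.hashCode(). The class name HashBasics and the helper collide are made up for this example.

```java
import java.util.Collections;

public class HashBasics {
    // Hypothetical helper: true when two different strings share a hash code.
    static boolean collide(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        String shortInput = "a";
        // A 40,000-character input built by repeating "hash" 10,000 times.
        String longInput = String.join("", Collections.nCopies(10_000, "hash"));

        // Both results are 32-bit ints, regardless of the input length.
        System.out.println(shortInput.hashCode());
        System.out.println(longInput.hashCode());

        // "Aa" and "BB" are different inputs with the same hash code (2112).
        System.out.println(collide("Aa", "BB")); // true
    }
}
```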

Properties of a hash

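Irreversible: you can compute the hash from the input, but you cannot recover the original input from the hash. Fast to compute: hashing is a single cheap pass over the input, so a 1 GB video and a 1 KB text are both processed quickly for their size. In other words, no matter how fat the pig or how hard the bones, turning it into sausage takes only the blink of an eye.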

Applications of hashing

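In data storage, a hash function is commonly used to spread elements evenly over a fixed range, which improves the utilization of the storage space.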
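
A minimal sketch of that idea: map each key's hash code to one of a fixed number of buckets and count how many keys land in each. The class BucketDemo, the helper bucketIndex and the "key-N" strings are assumptions for illustration; this is not how java.util.HashMap computes its index.

```java
import java.util.Arrays;

public class BucketDemo {
    // Hypothetical helper: map any hash code (including negative ones)
    // to an index in [0, bucketCount).
    static int bucketIndex(Object key, int bucketCount) {
        return Math.floorMod(key.hashCode(), bucketCount);
    }

    public static void main(String[] args) {
        int[] counts = new int[8];
        for (int i = 0; i < 8000; i++) {
            counts[bucketIndex("key-" + i, counts.length)]++;
        }
        // Prints how many of the 8000 keys fell into each of the 8 buckets.
        System.out.println(Arrays.toString(counts));
    }
}
```

Math.floorMod is used instead of % because hashCode() can be negative, and % would then yield a negative index.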

Implementing a hash

A hash algorithm is an idea rather than one specific formula; different implementations use different functions.

The hashCode method of Object in Java

The hashCode() method returns an int.

    public native int hashCode();
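
Since the method is native, there is no Java source to read. One way to observe the default behavior is System.identityHashCode, which is documented to return the same value the default Object.hashCode() would. The sketch below assumes a made-up class Plain that does not override hashCode().

```java
public class IdentityHashDemo {
    // Hypothetical class that does NOT override hashCode().
    static class Plain { }

    // True when the object's hashCode() is the default (identity) hash.
    static boolean usesDefaultHash(Object o) {
        return o.hashCode() == System.identityHashCode(o);
    }

    public static void main(String[] args) {
        System.out.println(usesDefaultHash(new Plain())); // true

        // String overrides hashCode(), so the two values normally differ.
        String s = "hello";
        System.out.println(s.hashCode());
        System.out.println(System.identityHashCode(s));
    }
}
```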

The hashCode implementation of String in Java

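Note the formula in the Javadoc comment: s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]. The last term is really s[n-1]*31^(n-n), so written out in full the formula is s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]*31^(n-n).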

Here s is the string's char array (the value field in JDK 1.8, not a byte array), and n is the length of that array.

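The formula is simple and has no deeper meaning: it just aims to produce an int spread as evenly as possible over the range [-2^31, 2^31-1], that is, [-2147483648, 2147483647]. Because it is evaluated with int arithmetic, any overflow simply wraps around within that range.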

    /**
     * Returns a hash code for this string. The hash code for a
     * {@code String} object is computed as
     * <blockquote><pre>
     * s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
     * </pre></blockquote>
     * using {@code int} arithmetic, where {@code s[i]} is the
     * <i>i</i>th character of the string, {@code n} is the length of
     * the string, and {@code ^} indicates exponentiation.
     * (The hash value of the empty string is zero.)
     *
     * @return  a hash code value for this object.
     */
    public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            char val[] = value;

            for (int i = 0; i < value.length; i++) {
                h = 31 * h + val[i];
            }
            hash = h;
        }
        return h;
    }
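
The loop above never computes a power of 31 explicitly; it evaluates the polynomial from the Javadoc with Horner's rule (h = 31 * h + val[i]). The following sketch checks that a literal, term-by-term evaluation of the formula gives the same result. The class StringHashFormula and its method names are made up for this example.

```java
public class StringHashFormula {
    // Literal evaluation of s[0]*31^(n-1) + ... + s[n-1]*31^0 in int arithmetic.
    static int byFormula(String s) {
        int n = s.length();
        int sum = 0;
        for (int i = 0; i < n; i++) {
            int power = 1;
            for (int k = 0; k < n - 1 - i; k++) {
                power *= 31; // overflow wraps around, as in String.hashCode()
            }
            sum += s.charAt(i) * power;
        }
        return sum;
    }

    // The same polynomial evaluated with Horner's rule, like the JDK loop.
    static int byHorner(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        String s = "hello";
        System.out.println(byFormula(s));  // 99162322
        System.out.println(byHorner(s));   // 99162322
        System.out.println(s.hashCode());  // 99162322
    }
}
```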

Why the formula uses 31 as the multiplier

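31 is used because it is a prime of suitable size, and the resulting values spread across the whole int range [-2^31, 2^31-1]. A prime is chosen to reduce hash collisions (I don't fully understand this point myself). In one experiment, hashCode() was computed for more than 50,000 English words, and with 31 as the multiplier there were fewer than 7 hash collisions.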
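
The kind of experiment described above can be sketched as follows: hash every word with a given multiplier and count how many hashes were already seen. This does not reproduce the 50,000-word result; the tiny word list, the class MultiplierCollisions and its methods are assumptions for illustration only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MultiplierCollisions {
    // Same shape as String.hashCode(), but with a configurable multiplier.
    static int hash(String s, int multiplier) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = multiplier * h + s.charAt(i);
        }
        return h;
    }

    // Counts words whose hash was already produced by an earlier word.
    // Assumes the words themselves are distinct.
    static int countCollisions(List<String> words, int multiplier) {
        Set<Integer> seen = new HashSet<>();
        int collisions = 0;
        for (String w : words) {
            if (!seen.add(hash(w, multiplier))) {
                collisions++;
            }
        }
        return collisions;
    }

    public static void main(String[] args) {
        // Placeholder list; a real test would load a dictionary file.
        List<String> words = Arrays.asList("Aa", "BB", "hash", "code", "java");
        for (int m : new int[] {2, 31, 33}) {
            System.out.println(m + " -> " + countCollisions(words, m));
        }
    }
}
```

Even 31 is not collision-free: "Aa" and "BB" collide under it, so the interesting number is the collision count over a large, realistic word list.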

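Another point is that 31 is very close to 2^5 = 32: since 31 = 32 - 1, i * 31 can be rewritten as the shift-and-subtract (i << 5) - i for better performance. Modern JVMs can perform this optimization automatically.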
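
A quick way to convince yourself of the identity i * 31 == (i << 5) - i, including values where int arithmetic overflows, is the sketch below. The class ShiftMultiply and the method times31 are made up for this example.

```java
public class ShiftMultiply {
    // Multiply by 31 using a shift and a subtraction: 32*i - i.
    static int times31(int i) {
        return (i << 5) - i;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 12345, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int i : samples) {
            // Both sides wrap around identically on overflow.
            System.out.println(i + ": " + (times31(i) == 31 * i)); // true for every sample
        }
    }
}
```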

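A quick note on shift operations: the left shift a << b multiplies a by 2^b (as long as the result does not overflow); the right shift a >> b divides a by 2^b, rounding toward negative infinity.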
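
A few concrete values make the shift rules easier to remember. The class ShiftBasics and its two helper methods are made up for this example.

```java
public class ShiftBasics {
    static int leftShift(int a, int b) {
        return a << b;  // a * 2^b, if the result fits in an int
    }

    static int rightShift(int a, int b) {
        return a >> b;  // a / 2^b, rounded toward negative infinity
    }

    public static void main(String[] args) {
        System.out.println(leftShift(3, 2));    // 12  (3 * 4)
        System.out.println(rightShift(20, 2));  // 5   (20 / 4)
        System.out.println(rightShift(-7, 1));  // -4  (floor of -3.5)
        System.out.println(-7 / 2);             // -3  (integer division truncates toward zero)
        System.out.println(leftShift(1, 31));   // -2147483648 (overflow into the sign bit)
    }
}
```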

If you want to know how 31 was validated, see the article "String hashCode()方法为什么选择数字31作为乘数" (Why String hashCode() chooses 31 as its multiplier).
