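This wiki session looks at two questions:
1. How is a hashCode generated, and by what rules?
2. Why does the hash value of the same object never change?
Let's start with a simple example: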
package hash;

/**
 * @Author: sunsuhai
 * @Date: 2018/12/24 17:56
 */
public class hashTest {

    static class SunShuai {
        private String name;

        public SunShuai(String name) {
            this.name = name;
        }

        public String getName() {
            return name;
        }

        public void setName(String name) {
            this.name = name;
        }
    }

    public static void main(String[] args) {
        SunShuai sunShuai = new SunShuai("sunshuai");
        // hashCode() returns an int, so store it as an int
        int hashCode = sunShuai.hashCode();
        System.out.println(hashCode);
    }
}
The output is:
41359092
Process finished with exit code 0
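Before diving into the JVM, the second question can be checked from plain Java. The sketch below is only an illustration: the class names IdentityHashDemo and Person are made up here, and it relies on the documented behaviour that System.identityHashCode returns the same value as the default Object.hashCode.

```java
final class IdentityHashDemo {

    // Hypothetical class that does NOT override hashCode(), like SunShuai above
    static final class Person {
        private String name;

        Person(String name) {
            this.name = name;
        }

        String getName() {
            return name;
        }

        void setName(String name) {
            this.name = name;
        }
    }

    public static void main(String[] args) {
        Person p = new Person("sunshuai");
        int first = p.hashCode();
        p.setName("renamed"); // changing a field does not affect the default hash
        int second = p.hashCode();

        System.out.println(first == second);                     // true
        System.out.println(first == System.identityHashCode(p)); // true
    }
}
```

The concrete number differs from run to run and from JVM to JVM; only the stability within one run is guaranteed.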
Let's step into the hashCode method:
public native int hashCode();
As we can see, it is a native method. Next, we open the JVM source code and find where this method is registered:
#include "java_lang_Object.h"
static JNINativeMethod methods[] = {
    {"hashCode",    "()I",                    (void *)&JVM_IHashCode},
    {"wait",        "(J)V",                   (void *)&JVM_MonitorWait},
    {"notify",      "()V",                    (void *)&JVM_MonitorNotify},
    {"notifyAll",   "()V",                    (void *)&JVM_MonitorNotifyAll},
    {"clone",       "()Ljava/lang/Object;",   (void *)&JVM_Clone},
};
Here we can see that hashCode is registered against the JVM_IHashCode function. Next, let's look at JVM_IHashCode:
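The second column of the registration table holds JVM method descriptors such as "()I" and "(J)V". As a small aside, they can be decoded from Java with java.lang.invoke.MethodType; the class DescriptorDemo and its describe helper are hypothetical names used only for this sketch.

```java
import java.lang.invoke.MethodType;

final class DescriptorDemo {

    // Turns a JVM method descriptor into "returnType [parameterTypes]"
    static String describe(String descriptor) {
        // A null loader means the system class loader is used to resolve class names
        MethodType type = MethodType.fromMethodDescriptorString(descriptor, null);
        return type.returnType() + " " + type.parameterList();
    }

    public static void main(String[] args) {
        System.out.println(describe("()I"));                  // hashCode
        System.out.println(describe("(J)V"));                 // wait
        System.out.println(describe("()Ljava/lang/Object;")); // clone
    }
}
```

So "()I" reads as "no parameters, returns int", which matches the Java declaration public native int hashCode().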
JVM_ENTRY(jint, JVM_IHashCode(JNIEnv* env, jobject handle))
  JVMWrapper("JVM_IHashCode");
  // as implemented in the classic virtual machine; return 0 if object is NULL
  return handle == NULL ? 0 : ObjectSynchronizer::FastHashCode (THREAD, JNIHandles::resolve_non_null(handle)) ;
JVM_END
There is a check here: if the handle passed in is NULL, it returns 0; otherwise it calls FastHashCode. Here is the key part of FastHashCode:
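The "0 for NULL" rule is visible from Java through System.identityHashCode, which is documented to return zero for the null reference. A minimal sketch (NullHashDemo is a made-up class name):

```java
final class NullHashDemo {

    public static void main(String[] args) {
        Object nothing = null;

        // Calling nothing.hashCode() would throw a NullPointerException in Java
        // before any native code runs, so the null case is shown via identityHashCode.
        System.out.println(System.identityHashCode(nothing)); // 0
    }
}
```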
  ObjectMonitor* monitor = NULL;
  markOop temp, test;
  intptr_t hash;
  markOop mark = ReadStableMark (obj);

  // object should remain ineligible for biased locking
  assert (!mark->has_bias_pattern(), "invariant") ;

  if (mark->is_neutral()) {
    hash = mark->hash();              // this is a normal header
    if (hash) {                       // if it has hash, just return it
      return hash;
    }
    hash = get_next_hash(Self, obj);  // allocate a new hash code
    temp = mark->copy_set_hash(hash); // merge the hash code into header
    // use (machine word version) atomic operation to install the hash
    test = (markOop) Atomic::cmpxchg_ptr(temp, obj->mark_addr(), mark);
    if (test == mark) {
      return hash;
    }
    // If atomic operation failed, we must inflate the header
    // into heavy weight monitor. We could add more code here
    // for fast path, but it does not worth the complexity.
  } else if (mark->has_monitor()) {
    monitor = mark->monitor();
    temp = monitor->header();
    assert (temp->is_neutral(), "invariant") ;
    hash = temp->hash();
    if (hash) {
      return hash;
    }
    // Skip to the following code to reduce code size
  } else if (Self->is_lock_owned((address)mark->locker())) {
    temp = mark->displaced_mark_helper(); // this is a lightweight monitor owned
    assert (temp->is_neutral(), "invariant") ;
    hash = temp->hash();              // by current thread, check if the displaced
    if (hash) {                       // header contains hash code
      return hash;
    }
    // WARNING:
    //   The displaced header is strictly immutable.
    // It can NOT be changed in ANY cases. So we have
    // to inflate the header into heavyweight monitor
    // even the current thread owns the lock. The reason
    // is the BasicLock (stack slot) will be asynchronously
    // read by other threads during the inflate() function.
    // Any change to stack may not propagate to other threads
    // correctly.
  }

  // Inflate the monitor to set hash code
  monitor = ObjectSynchronizer::inflate(Self, obj);
  // Load displaced header and check it has hash code
  mark = monitor->header();
  assert (mark->is_neutral(), "invariant") ;
  hash = mark->hash();
  if (hash == 0) {
    hash = get_next_hash(Self, obj);
    temp = mark->copy_set_hash(hash); // merge hash code into header
    assert (temp->is_neutral(), "invariant") ;
    test = (markOop) Atomic::cmpxchg_ptr(temp, monitor, mark);
    if (test != mark) {
      // The only update to the header in the monitor (outside GC)
      // is install the hash code. If someone add new usage of
      // displaced header, please update this code
      hash = test->hash();
      assert (test->is_neutral(), "invariant") ;
      assert (hash != 0, "Trivial unexpected object/monitor header usage.");
    }
  }
  // We finally get the hash
  return hash;
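The core idea of the fast path in FastHashCode (read the hash from the header; if it is zero, generate one and install it with a compare-and-swap) can be sketched in Java. This is only an illustration under assumptions: the class LazyHashHeader, its 8-bit shift, the initial header value, and the random nextHash() stand-in for get_next_hash are all hypothetical. Unlike HotSpot, which inflates the lock into a monitor when the CAS fails, this sketch simply retries.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLong;

final class LazyHashHeader {

    // Assumed layout: the low 8 bits hold other header state, the next 31 bits hold the hash
    static final int HASH_SHIFT = 8;
    static final long HASH_MASK = (1L << 31) - 1;

    // Simulated mark word: some low state bits set, hash field still empty
    private final AtomicLong mark = new AtomicLong(0x01L);

    int hash() {
        while (true) {
            long current = mark.get();
            int existing = (int) ((current >>> HASH_SHIFT) & HASH_MASK);
            if (existing != 0) {
                return existing; // already installed: return the stored value
            }
            int candidate = nextHash();
            long updated = current | ((long) candidate << HASH_SHIFT);
            if (mark.compareAndSet(current, updated)) {
                return candidate; // this thread published the hash
            }
            // Another thread changed the header first; re-read and try again
        }
    }

    // Stand-in for get_next_hash: any non-zero value that fits in 31 bits
    private static int nextHash() {
        return ThreadLocalRandom.current().nextInt(1, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        LazyHashHeader header = new LazyHashHeader();
        System.out.println(header.hash() == header.hash()); // true
    }
}
```

Once the hash is stored in the header, every later call takes the early-return branch, which is why repeated calls on the same object agree.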
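The FastHashCode code above makes heavy use of markOop, so to read it we first need to understand what markOop is for. As we know, an object's memory layout in the JVM has three regions: the object header, the instance data, and the alignment padding. markOop is the data structure for the mark word in the object header. Let's see what it contains: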
  // Constants
  enum { age_bits                 = 4,
         lock_bits                = 2,
         biased_lock_bits         = 1,
         max_hash_bits            = BitsPerWord - age_bits - lock_bits - biased_lock_bits,
         hash_bits                = max_hash_bits > 31 ? 31 : max_hash_bits,
         cms_bits                 = LP64_ONLY(1) NOT_LP64(0),
         epoch_bits               = 2
  };
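These are some basic constants. The ones we care about define the width of the hash code. Its maximum width is not fixed: it is the number of bits in a machine word (BitsPerWord) minus three fixed bit counts (age_bits, lock_bits, and biased_lock_bits). Even so, the width is bounded, because the ternary expression caps hash_bits at 31.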
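The arithmetic behind hash_bits can be replayed in Java. This is a minimal sketch that copies the three constants from the enum above; the class name HashBits and the method hashBits are made up for the example.

```java
final class HashBits {

    static final int AGE_BITS = 4;
    static final int LOCK_BITS = 2;
    static final int BIASED_LOCK_BITS = 1;

    // Mirrors: hash_bits = max_hash_bits > 31 ? 31 : max_hash_bits
    static int hashBits(int bitsPerWord) {
        int maxHashBits = bitsPerWord - AGE_BITS - LOCK_BITS - BIASED_LOCK_BITS;
        return Math.min(maxHashBits, 31);
    }

    public static void main(String[] args) {
        System.out.println(hashBits(32)); // 25
        System.out.println(hashBits(64)); // 31
    }
}
```

On a 32-bit word the hash gets 25 bits; on a 64-bit word 57 bits would be available, but the cap keeps it at 31 so the value always fits in a non-negative Java int.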
,hash_mask = right_n_bits(hash_bits),
hash_mask_in_place = (address_word)hash_mask << hash_shift
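These are two more constants: hash_mask is the bit mask for the hash (its low hash_bits bits are set), and hash_mask_in_place is that same mask shifted left by hash_shift, to the position the hash occupies in the mark word.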
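To make the two masks concrete, here is a small Java sketch. The excerpt does not show hash_shift, so the value 8 used below is an assumption (the sum of lock, biased-lock, age, and CMS bits on a 64-bit build), and rightNBits is a hypothetical re-implementation of right_n_bits.

```java
final class HashMasks {

    static final int HASH_BITS = 31;
    static final int HASH_SHIFT = 8; // assumption: not shown in the excerpt above

    static final long HASH_MASK = rightNBits(HASH_BITS);
    static final long HASH_MASK_IN_PLACE = HASH_MASK << HASH_SHIFT;

    // A value whose low n bits are all 1
    static long rightNBits(int n) {
        return (1L << n) - 1;
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(HASH_MASK));          // 7fffffff
        System.out.println(Long.toHexString(HASH_MASK_IN_PLACE)); // 7fffffff00
    }
}
```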
Back in FastHashCode, we can see that it first reads a hash value from mark's hash() method. Where does that value come from?
hash = mark->hash();
Let's look at the hash() function:
  intptr_t hash() const {
    return mask_bits(value() >> hash_shift, hash_mask);
  }
And the value() method:
uintptr_t value() const { return (uintptr_t) this; }
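So value() casts this to a pointer-width integer (uintptr_t). A markOop is not dereferenced as a real pointer here; its bits are the mark word itself. That value is shifted right by hash_shift and then ANDed with hash_mask to yield the hash. This raises a question: under what circumstances is the returned hash 0?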
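To experiment with that question, the shift-and-mask step can be sketched in Java on a plain long. Everything here is illustrative: MarkWordHash is a made-up class, HASH_SHIFT = 8 is an assumed value not shown in the excerpt, and copySetHash is a guess at what copy_set_hash does based on its name and its use in FastHashCode.

```java
final class MarkWordHash {

    static final int HASH_SHIFT = 8; // assumption: not shown in the excerpt above
    static final long HASH_MASK = (1L << 31) - 1;
    static final long HASH_MASK_IN_PLACE = HASH_MASK << HASH_SHIFT;

    // Mirrors: mask_bits(value() >> hash_shift, hash_mask)
    static long hash(long markWord) {
        return (markWord >>> HASH_SHIFT) & HASH_MASK;
    }

    // Returns a copy of the mark word with the hash field replaced
    static long copySetHash(long markWord, long hash) {
        long cleared = markWord & ~HASH_MASK_IN_PLACE;
        return cleared | ((hash & HASH_MASK) << HASH_SHIFT);
    }

    public static void main(String[] args) {
        long fresh = 0x01L; // only low state bits set, hash field empty
        System.out.println(hash(fresh));    // 0

        long withHash = copySetHash(fresh, 41359092L);
        System.out.println(hash(withHash)); // 41359092
        System.out.println(withHash & 0xFF); // 1, the low bits are untouched
    }
}
```

In this model, hash() reads 0 exactly when nothing has been written into the hash field yet.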