Object.hashCode


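This wiki post looks at two questions:

1. How is hashCode generated, and by what rule?
2. Why does the hash value of the same object never change?

Let's start with a simple example: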

package hash;
/**
 * @Author: sunsuhai
 * @Date: 2018/12/24 17:56
 */
public class hashTest {

    static class SunShuai{
        private String name;

        public SunShuai(String name) {
            this.name = name;
        }

        public String getName() {
            return name;
        }

        public void setName(String name) {
            this.name = name;
        }
    }
    public static void main(String[] args) {
        SunShuai sunShuai = new SunShuai("sunshuai");

        int hashCode = sunShuai.hashCode(); // hashCode() returns int
        System.out.println(hashCode);

    }
}

The output is:

41359092

Process finished with exit code 0
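
Before reading the JVM source, the second question can be checked empirically. The following is a minimal sketch, assuming a class that does not override hashCode(); the names `IdentityHashDemo` and `Person` are hypothetical and stand in for the example class above. It mutates a field and compares the hash before and after.

```java
class IdentityHashDemo {
    // Hypothetical stand-in for the example class; it does not override hashCode().
    static class Person {
        String name;
        Person(String name) { this.name = name; }
    }

    // Returns true if the hash code is unchanged after a field of the object is mutated.
    static boolean stableAfterMutation() {
        Person p = new Person("sunshuai");
        int before = p.hashCode();
        p.name = "changed";            // changing a field does not touch the identity hash
        int after = p.hashCode();
        // Without an override, Object.hashCode() agrees with System.identityHashCode().
        return before == after && after == System.identityHashCode(p);
    }

    public static void main(String[] args) {
        System.out.println(stableAfterMutation()); // prints true
    }
}
```

The concrete number differs from run to run and from JVM to JVM; only its stability for a given object is guaranteed.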

Let's step into the hashCode method:

public native int hashCode();

As we can see, it is a native method. Next we open the JVM source and find where this method is registered:

#include "java_lang_Object.h"

static JNINativeMethod methods[] = {
    {"hashCode",    "()I",                    (void *)&JVM_IHashCode},
    {"wait",        "(J)V",                   (void *)&JVM_MonitorWait},
    {"notify",      "()V",                    (void *)&JVM_MonitorNotify},
    {"notifyAll",   "()V",                    (void *)&JVM_MonitorNotifyAll},
    {"clone",       "()Ljava/lang/Object;",   (void *)&JVM_Clone},
};

Here hashCode is registered against the JVM_IHashCode function. Let's open JVM_IHashCode next:


JVM_ENTRY(jint, JVM_IHashCode(JNIEnv* env, jobject handle))
  JVMWrapper("JVM_IHashCode");
  // as implemented in the classic virtual machine; return 0 if object is NULL
  return handle == NULL ? 0 : ObjectSynchronizer::FastHashCode (THREAD, JNIHandles::resolve_non_null(handle)) ;
JVM_END
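
The ternary in JVM_IHashCode returns 0 for a NULL handle. Calling hashCode() on a null reference from Java throws a NullPointerException before the native code is reached, but the zero result can be observed through System.identityHashCode, whose documentation states that the hash code of the null reference is zero. That System.identityHashCode goes through the same JVM_IHashCode entry point is an assumption about the HotSpot version at hand; the class name `NullHashDemo` is hypothetical.

```java
class NullHashDemo {
    // System.identityHashCode is documented to return zero for the null reference.
    static int hashOfNull() {
        return System.identityHashCode(null);
    }

    public static void main(String[] args) {
        System.out.println(hashOfNull()); // prints 0
    }
}
```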

There is a check here: if the object passed in is null, it returns 0; otherwise it calls FastHashCode. Here is the key code of FastHashCode:

ObjectMonitor* monitor = NULL;
  markOop temp, test;
  intptr_t hash;
  markOop mark = ReadStableMark (obj);

  // object should remain ineligible for biased locking
  assert (!mark->has_bias_pattern(), "invariant") ;

  if (mark->is_neutral()) {
    hash = mark->hash();              // this is a normal header
    if (hash) {                       // if it has hash, just return it
      return hash;
    }
    hash = get_next_hash(Self, obj);  // allocate a new hash code
    temp = mark->copy_set_hash(hash); // merge the hash code into header
    // use (machine word version) atomic operation to install the hash
    test = (markOop) Atomic::cmpxchg_ptr(temp, obj->mark_addr(), mark);
    if (test == mark) {
      return hash;
    }
    // If atomic operation failed, we must inflate the header
    // into heavy weight monitor. We could add more code here
    // for fast path, but it does not worth the complexity.
  } else if (mark->has_monitor()) {
    monitor = mark->monitor();
    temp = monitor->header();
    assert (temp->is_neutral(), "invariant") ;
    hash = temp->hash();
    if (hash) {
      return hash;
    }
    // Skip to the following code to reduce code size
  } else if (Self->is_lock_owned((address)mark->locker())) {
    temp = mark->displaced_mark_helper(); // this is a lightweight monitor owned
    assert (temp->is_neutral(), "invariant") ;
    hash = temp->hash();              // by current thread, check if the displaced
    if (hash) {                       // header contains hash code
      return hash;
    }
    // WARNING:
    //   The displaced header is strictly immutable.
    // It can NOT be changed in ANY cases. So we have
    // to inflate the header into heavyweight monitor
    // even the current thread owns the lock. The reason
    // is the BasicLock (stack slot) will be asynchronously
    // read by other threads during the inflate() function.
    // Any change to stack may not propagate to other threads
    // correctly.
  }

  // Inflate the monitor to set hash code
  monitor = ObjectSynchronizer::inflate(Self, obj);
  // Load displaced header and check it has hash code
  mark = monitor->header();
  assert (mark->is_neutral(), "invariant") ;
  hash = mark->hash();
  if (hash == 0) {
    hash = get_next_hash(Self, obj);
    temp = mark->copy_set_hash(hash); // merge hash code into header
    assert (temp->is_neutral(), "invariant") ;
    test = (markOop) Atomic::cmpxchg_ptr(temp, monitor, mark);
    if (test != mark) {
      // The only update to the header in the monitor (outside GC)
      // is install the hash code. If someone add new usage of
      // displaced header, please update this code
      hash = test->hash();
      assert (test->is_neutral(), "invariant") ;
      assert (hash != 0, "Trivial unexpected object/monitor header usage.");
    }
  }
  // We finally get the hash
  return hash;
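
The neutral-header branch above follows a "read, return if present, otherwise install with compare-and-swap" pattern. Here is a rough Java sketch of that pattern only, not of HotSpot itself. Everything in it is an assumption made for illustration: the class `HashInstallSketch`, the use of an AtomicLong as the header, the 31-bit hash field starting at bit 8, and the retry loop (HotSpot inflates the monitor when the CAS fails instead of retrying).

```java
import java.util.concurrent.atomic.AtomicLong;

class HashInstallSketch {
    // Assumed layout, for illustration only: a 31-bit hash field starting at bit 8.
    static final int HASH_SHIFT = 8;
    static final long HASH_MASK = (1L << 31) - 1;

    // Stand-in for the object's mark word; the low bits play the role of lock/age state.
    final AtomicLong header = new AtomicLong(0x01L);

    static long hashOf(long mark) {
        return (mark >>> HASH_SHIFT) & HASH_MASK;
    }

    // Returns the stored hash if there is one, otherwise tries to install `candidate`.
    long fastHashCode(long candidate) {
        while (true) {
            long mark = header.get();
            long hash = hashOf(mark);
            if (hash != 0) {
                return hash;                       // already installed: just return it
            }
            long updated = (mark & ~(HASH_MASK << HASH_SHIFT))
                         | ((candidate & HASH_MASK) << HASH_SHIFT);
            if (header.compareAndSet(mark, updated)) {
                return candidate & HASH_MASK;      // we won the race and installed the hash
            }
            // CAS lost: this sketch retries; HotSpot inflates the monitor at this point.
        }
    }

    public static void main(String[] args) {
        HashInstallSketch s = new HashInstallSketch();
        System.out.println(s.fastHashCode(12345L)); // prints 12345
        System.out.println(s.fastHashCode(99999L)); // prints 12345, the first value sticks
    }
}
```

Because the hash is written into the header once and every later call reads it back, the same object keeps returning the same value.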

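First, this code uses markOop heavily, so to read it we have to understand what markOop is for. As is well known, an object's memory layout in the JVM has three regions: the object header, the instance data, and alignment padding. markOop describes the mark word, the first part of the object header. Let's see what it contains: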


  // Constants
  enum { age_bits                 = 4,
         lock_bits                = 2,
         biased_lock_bits         = 1,
         max_hash_bits            = BitsPerWord - age_bits - lock_bits - biased_lock_bits,
         hash_bits                = max_hash_bits > 31 ? 31 : max_hash_bits,
         cms_bits                 = LP64_ONLY(1) NOT_LP64(0),
         epoch_bits               = 2
  };

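These are some basic constants. The part we care about is the width of the hash code. Its maximum width is not fixed: it is the number of bits in a machine word (BitsPerWord) minus three fixed bit counts (age_bits, lock_bits, and biased_lock_bits). Even so, the width is capped: the ternary expression limits hash_bits to at most 31.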
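
The arithmetic in that enum can be replayed in a few lines. This is a small sketch with a hypothetical class name, `HashBitsDemo`, that evaluates the same expression for a 32-bit and a 64-bit word.

```java
class HashBitsDemo {
    static final int AGE_BITS = 4;
    static final int LOCK_BITS = 2;
    static final int BIASED_LOCK_BITS = 1;

    // Same expression as the enum: max_hash_bits > 31 ? 31 : max_hash_bits
    static int hashBits(int bitsPerWord) {
        int maxHashBits = bitsPerWord - AGE_BITS - LOCK_BITS - BIASED_LOCK_BITS;
        return Math.min(maxHashBits, 31);
    }

    public static void main(String[] args) {
        System.out.println(hashBits(32)); // prints 25
        System.out.println(hashBits(64)); // prints 31
    }
}
```

On a 32-bit word only 25 bits are left for the hash, while on a 64-bit word the 57 available bits are capped at 31.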

         ,hash_mask               = right_n_bits(hash_bits),
         hash_mask_in_place       = (address_word)hash_mask << hash_shift

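Two more constants: hash_mask is the mask for the hash bits, and hash_mask_in_place is that same mask shifted to the position the hash occupies in the mark word.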
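
To make the two masks concrete, here is a short sketch for the 64-bit case. The value of hash_shift is not shown in the excerpt above, so the `HASH_SHIFT = 8` used here is an assumption; `rightNBits` mirrors what right_n_bits is understood to compute, and the class name `HashMaskDemo` is hypothetical.

```java
class HashMaskDemo {
    static final int HASH_BITS = 31;   // from the enum above, 64-bit case
    static final int HASH_SHIFT = 8;   // ASSUMPTION: not shown in the quoted source

    // A word with the lowest n bits set, which is what right_n_bits is taken to mean.
    static long rightNBits(int n) {
        return (1L << n) - 1;
    }

    static long hashMask() {
        return rightNBits(HASH_BITS);
    }

    static long hashMaskInPlace() {
        return hashMask() << HASH_SHIFT;
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(hashMask()));        // prints 7fffffff
        System.out.println(Long.toHexString(hashMaskInPlace())); // prints 7fffffff00
    }
}
```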

Back in FastHashCode, we saw that it fetches the hash value through mark's hash() method. Where does that value come from?

hash = mark->hash();

Let's look at the hash() function:

  intptr_t hash() const {
    return mask_bits(value() >> hash_shift, hash_mask);
  }

And the value() method:

 uintptr_t value() const { return (uintptr_t) this; }

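So: this is converted to a pointer-width integer (uintptr_t), shifted right by hash_shift, and then masked with hash_mask to yield the hash value. That raises the question: under what circumstances is the returned hash 0?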
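
One case follows directly from the shift-and-mask itself: if no hash has been stored in the mark word yet, the extracted bits are all zero. The sketch below replays hash() on two hand-made 64-bit values. The `HASH_SHIFT = 8`, the sample header values, and the class name `MarkWordHashDemo` are assumptions for illustration and are not taken from the source.

```java
class MarkWordHashDemo {
    static final int HASH_SHIFT = 8;               // ASSUMPTION: not shown in the quoted source
    static final long HASH_MASK = (1L << 31) - 1;  // 31 hash bits, 64-bit case

    // Java rendering of mask_bits(value() >> hash_shift, hash_mask).
    // >>> is used because the C++ shift operates on an unsigned uintptr_t.
    static long hash(long markValue) {
        return (markValue >>> HASH_SHIFT) & HASH_MASK;
    }

    public static void main(String[] args) {
        long fresh = 0x01L;                               // only low state bits set, no hash stored
        long hashed = (0x1234ABCDL << HASH_SHIFT) | 0x01L; // same state bits plus a stored hash

        System.out.println(hash(fresh));                    // prints 0
        System.out.println(Long.toHexString(hash(hashed))); // prints 1234abcd
    }
}
```

This matches the `if (hash)` checks in FastHashCode, which treat 0 as "no hash assigned yet".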