Whispers with String, Part 2

I. Three public methods of String

1. The intern() method. Let's look at the source first (JDK 1.8):

/** jdk 1.8
 * Returns a canonical representation for the string object.
 * <p>
 * A pool of strings, initially empty, is maintained privately by the
 * class {@code String}.
 * <p>
 * When the intern method is invoked, if the pool already contains a
 * string equal to this {@code String} object as determined by
 * the {@link #equals(Object)} method, then the string from the pool is
 * returned. Otherwise, this {@code String} object is added to the
 * pool and a reference to this {@code String} object is returned.
 * <p>
 * It follows that for any two strings {@code s} and {@code t},
 * {@code s.intern() == t.intern()} is {@code true}
 * if and only if {@code s.equals(t)} is {@code true}.
 * <p>
 * All literal strings and string-valued constant expressions are
 * interned. String literals are defined in section 3.10.5 of the
 * <cite>The Java&trade; Language Specification</cite>.
 *
 * @return  a string that has the same contents as this string, but is
 *          guaranteed to be from a pool of unique strings.
 */
public native String intern();

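As we know, this is a native method. When the string pool already contains an equal string, intern() returns the reference from the pool; this behaves the same across JDK versions. What differs is what happens when the pool does not contain the string: in JDK 6 and earlier, the string is copied into the pool (which lived in the permanent generation) and a reference to that copy is returned, while from JDK 7 onward the pool lives in the heap and simply records a reference to the existing heap object, which is then returned.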
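The version-independent part of this behavior can be checked with a small sketch. The class name InternDemo and the helper samePooledInstance are made up for illustration; the sketch relies only on what the Javadoc above guarantees (literals are interned, and s.intern() == t.intern() exactly when s.equals(t)).

```java
class InternDemo {
    // Hypothetical helper: true when both strings resolve to the same pooled instance.
    static boolean samePooledInstance(String s, String t) {
        return s.intern() == t.intern();
    }

    public static void main(String[] args) {
        String literal = "hello";              // string literals are interned automatically
        String heapCopy = new String("hello"); // a distinct object created on the heap

        System.out.println(literal == heapCopy);          // false: two different objects
        System.out.println(literal == heapCopy.intern()); // true: intern() returns the pooled instance
        System.out.println(samePooledInstance(heapCopy, new String("hello"))); // true: equal contents
        System.out.println(samePooledInstance("hello", "world"));              // false: different contents
    }
}
```

The example deliberately avoids cases whose result depends on the JDK version, such as comparing a concatenated string with its own intern() result.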

2. The hashCode() method. String overrides hashCode(); the source is as follows:

/**
 * Returns a hash code for this string. The hash code for a
 * {@code String} object is computed as
 * <blockquote><pre>
 * s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
 * </pre></blockquote>
 * using {@code int} arithmetic, where {@code s[i]} is the
 * <i>i</i>th character of the string, {@code n} is the length of
 * the string, and {@code ^} indicates exponentiation.
 * (The hash value of the empty string is zero.)
 *
 * @return  a hash code value for this object.
 */
public int hashCode() {
    int h = hash; // #1 hash for what
    if (h == 0 && value.length > 0) {
        char val[] = value;

        for (int i = 0; i < value.length; i++) {
            h = 31 * h + val[i]; // #2 why 31
        }
        hash = h;
    }
    return h;
}

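After reading the source above, you may have two questions:

1. What is hash for?

hash is an int field that caches the string's hash code, so the loop normally runs only once per string. (If the computed hash happens to be 0, it is recomputed on every call.)

2. Why is 31 used at #2?

Here is the original explanation from Effective Java: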

The value 31 was chosen because it is an odd prime. 
If it were even and the multiplication overflowed, 
information would be lost, as multiplication by 2 is equivalent to shifting. 
The advantage of using a prime is less clear, but it is traditional.
A nice property of 31 is that the multiplication can be replaced by a shift 
and a subtraction for better performance: 31 * i == (i << 5) - i. 
Modern VMs do this sort of optimization automatically.
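In short: 31 is an odd prime. With an even multiplier, any overflow would lose information, because multiplying by 2 is the same as shifting left by one bit. The benefit of a prime is less clear, but it is the traditional choice. 31 also has the nice property that the multiplication can be replaced by a shift and a subtraction, 31 * i == (i << 5) - i, an optimization that modern Java virtual machines perform automatically.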
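Both points can be verified with a short sketch. HashDemo, polynomialHash, and times31 are illustrative names rather than JDK APIs; the loop mirrors the JDK 8 source shown earlier but leaves out the caching field.

```java
class HashDemo {
    // Recomputes s[0]*31^(n-1) + ... + s[n-1] with int arithmetic, without caching.
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    // Multiplication by 31 expressed as a shift and a subtraction.
    static int times31(int i) {
        return (i << 5) - i;
    }

    public static void main(String[] args) {
        System.out.println(polynomialHash("abc")); // 96354
        System.out.println("abc".hashCode());      // 96354
        System.out.println(polynomialHash(""));    // 0: the empty string hashes to zero
        System.out.println(times31(7) == 31 * 7);  // true
        // The identity also holds when the multiplication overflows.
        System.out.println(times31(Integer.MAX_VALUE) == 31 * Integer.MAX_VALUE); // true
    }
}
```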

3. The equals() method. String also overrides Object's equals method; the source, with comments, is as follows:

public boolean equals(Object anObject) {
    if (this == anObject) {
        // same object: return true
        return true;
    }
    if (anObject instanceof String) {
        String anotherString = (String)anObject;
        int n = value.length;
        if (n == anotherString.value.length) {
            char v1[] = value;
            char v2[] = anotherString.value;
            int i = 0;
            while (n-- != 0) {
                if (v1[i] != v2[i])
                    return false;
                i++;
            }
            // same length and same characters: return true
            return true;
        }
    }
    return false;
}
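A brief sketch of how this implementation behaves from the caller's side. EqualsDemo and its helper are hypothetical names; the point is that equals() compares characters only when the argument is itself a String, so another CharSequence with the same characters is not equal.

```java
class EqualsDemo {
    // Hypothetical helper that forwards to String.equals(Object).
    static boolean sameAsString(String s, Object other) {
        return s.equals(other);
    }

    public static void main(String[] args) {
        String a = "abc";
        String b = new String("abc");
        StringBuilder sb = new StringBuilder("abc");

        System.out.println(a == b);                // false: different objects
        System.out.println(sameAsString(a, b));    // true: same length and characters
        System.out.println(sameAsString(a, sb));   // false: a StringBuilder is not a String
        System.out.println(a.contentEquals(sb));   // true: compares against any CharSequence
        System.out.println(sameAsString(a, null)); // false: null fails the instanceof check
    }
}
```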

II. Why is String designed to be immutable?

First, look at the class definition in the source:

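public final class String // #1 the class is final, so it cannot be subclassed
    implements java.io.Serializable, Comparable<String>, CharSequence {
    /** The value is used for character storage. */
    private final char value[]; // #2 private and final: the reference cannot be reassigned, and no public method modifies the array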

    /** Cache the hash code for the string */
    private int hash; // Default to 0

    /** use serialVersionUID from JDK 1.0.2 for interoperability */
    private static final long serialVersionUID = -6849794470754667710L;

    // region other methods omitted
    // *************
    // endregion
}

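There are three main reasons for making it immutable:

1. The string pool requires it

Immutability is a necessary condition for the string pool: a pooled instance is shared by many references, so it must never change underneath them.

2. A stable hash code

Because the contents never change, the hash code never changes either and can be cached, which makes String a safe and convenient key for HashMap and other Map containers.

3. Security

If String were mutable, any code holding a reference could modify it. For content that does need to change, consider StringBuilder or StringBuffer.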
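The second reason is the easiest to demonstrate. The sketch below uses a made-up MutableKey class (not part of the JDK) to show what goes wrong when a key's hash code changes after it has been put into a HashMap, and contrasts it with a String key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class MutableKeyDemo {
    // Hypothetical mutable key, used only to illustrate the problem.
    static final class MutableKey {
        String value;
        MutableKey(String value) { this.value = value; }
        @Override public boolean equals(Object o) {
            return o instanceof MutableKey && Objects.equals(value, ((MutableKey) o).value);
        }
        @Override public int hashCode() { return Objects.hashCode(value); }
    }

    // True if the entry can still be found after its key has been mutated.
    static boolean mutableKeyStillFound() {
        Map<MutableKey, Integer> map = new HashMap<>();
        MutableKey key = new MutableKey("a");
        map.put(key, 1);
        key.value = "b"; // the hash code changes after insertion
        return map.containsKey(key) || map.containsKey(new MutableKey("a"));
    }

    // True if a String key is still found after "changing" it.
    static boolean stringKeyStillFound() {
        Map<String, Integer> map = new HashMap<>();
        String key = "a";
        map.put(key, 1);
        String changed = key.concat("b"); // creates a new String; key itself is untouched
        return map.containsKey(key) && !map.containsKey(changed);
    }

    public static void main(String[] args) {
        System.out.println(mutableKeyStillFound()); // false: the entry is stranded in the map
        System.out.println(stringKeyStillFound());  // true
    }
}
```

With the mutable key, the entry remains in the map but can no longer be reached through either the old or the new value, which is exactly the situation an immutable String rules out.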

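III. Differences from and connections with StringBuilder and StringBuffer

Summary: having explored these little secrets of String, next we will look at containers in Java, such as the design and practical use of data structures like List, Set, and Map.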