Java String: A Fresh Look Through Source Code Analysis


About String

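Strings are unavoidable in any programming language, and Java is no exception. The String class lives in the java.lang package and is one of the cornerstones of the language. It is declared final, which means outside code cannot subclass it and override its methods to change its behavior. Compared with strings in other languages, Java strings also have some peculiarities of their own.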

This article takes a closer look at Java strings. The main topics are:

  • String initialization

  • String immutability

  • The “+” operator

  • The String class source code

String Initialization

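First, it must be stressed that String is not a primitive type in Java; it is an object like any other. At the source level, a String can be initialized in several different ways.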

Literal form:


String a = "abc";

String b = "hello world";

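With this form, the JVM first looks in the constant pool for a string object with the same value. If one exists, its address is assigned directly to the reference variable. If not, a new string object is first created in the constant pool, and its address is then assigned to the reference variable.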

Constructor form:


String a = new String("abc");

String b = new String("hello world");

The constructors of the String class include:


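public String() {} // builds an empty string (note that this is not the same as null)

public String(String original) {} // builds a new string object from another string

public String(char value[]) {} // builds a string from a char array

public String(char value[], int offset, int count){} // builds a string from part of a char array, given an offset and a count

public String(int[] codePoints, int offset, int count) {} // builds a string from part of an array of Unicode code points, given an offset and a count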
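The constructors listed above can be exercised with a short sketch like the following. The class name StringConstructorsDemo and the helper build() are made up for illustration; only the String constructors themselves come from the JDK.

```java
class StringConstructorsDemo {

    // Builds one string with each constructor listed above, in the same order.
    // (Hypothetical helper, only here so the results can be inspected.)
    static String[] build() {
        char[] chars = {'h', 'e', 'l', 'l', 'o'};
        int[] codePoints = {0x4A, 0x61, 0x76, 0x61}; // Unicode code points for J, a, v, a

        String empty = new String();                          // empty string, not null
        String copy = new String("abc");                      // new object, same content
        String fromChars = new String(chars);                 // whole array
        String slice = new String(chars, 1, 3);               // offset 1, count 3
        String fromCodePoints = new String(codePoints, 0, 4); // offset 0, count 4

        // The constructors copy the array, so this change does not affect fromChars.
        chars[0] = 'j';
        return new String[] {empty, copy, fromChars, slice, fromCodePoints};
    }

    public static void main(String[] args) {
        String[] built = build();
        System.out.println(built[0].isEmpty()); // true
        System.out.println(built[1]);           // abc
        System.out.println(built[2]);           // hello
        System.out.println(built[3]);           // ell
        System.out.println(built[4]);           // Java
    }
}
```

Note that the char array is copied on construction, so later changes to the array never show through in the string.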

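This form of initialization works exactly like the initialization of any ordinary object. Unlike the literal form, every constructor call creates a new string object on the heap. The following example shows the difference clearly: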


public class StringEqual {

public static void main(String[] args) {

String a1 = "123";

String b1 = "123";

System.out.println(a1 == b1); // true

String a2 = new String("123");

String b2 = new String("123");

System.out.println(a2 == b2); // false

}

}

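Note: the constant pool referred to here is, more precisely, the string constant pool (the JVM's table of interned strings). In JDK 1.6 and earlier it was stored in the permanent generation, HotSpot's implementation of the method area; since JDK 1.7 it has been stored in the heap. It holds the string objects created for the literals that appear in compiled classes.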

a1 == b1 being true shows that a1 and b1 point to the same object, whereas a2 and b2 each point to a different object.

String Immutability

Why was String designed to be immutable?

The main reasons are:

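  1. The needs of the string pool. As described earlier, when a Java string is initialized from a literal, it is stored in the constant pool. If another string variable with the same value is defined, it points directly at the object created earlier. If strings were mutable, changing the string through one variable would also change the value seen through every other variable that shares the object.

  2. Caching the string's hash code. The hash code of a string is used very often. Immutability guarantees that it never changes, so it can be computed once and reused instead of being recalculated every time it is needed, which is very efficient.

  3. Security. Strings are often used as parameters for network connections, database connections, and the like. Because a string cannot be modified after it is created, such a parameter cannot be tampered with once it has been checked.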
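The points above can be observed with a small sketch. ImmutabilityDemo and afterOperations are hypothetical names used only for this illustration; the port number is an arbitrary sample value.

```java
import java.util.HashMap;
import java.util.Map;

class ImmutabilityDemo {

    // Calls several "modifying" methods and discards their results.
    // Each call returns a new String; the receiver itself never changes.
    static String afterOperations(String s) {
        s.toUpperCase();
        s.concat("!");
        s.replace('a', 'z');
        return s;
    }

    public static void main(String[] args) {
        String a = "abc";
        String b = "abc"; // same pooled object as a

        String upper = a.toUpperCase();         // a new object
        System.out.println(afterOperations(a)); // abc
        System.out.println(b);                  // abc, unaffected by anything done through a
        System.out.println(upper);              // ABC
        System.out.println(a == b);             // true

        // A stable hash code makes strings safe to use as map keys.
        Map<String, Integer> ports = new HashMap<>();
        ports.put(a, 8080);
        System.out.println(ports.get("abc"));   // 8080
    }
}
```

Because no method can alter the shared object, the lookup by an equal literal always finds the entry that was stored under a.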

The “+” Operator

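  • The “+” operator is actually implemented with StringBuilder. The Javadoc in the source says: String concatenation is implemented through the StringBuilder(or StringBuffer) class and its append method. This describes the JDK 8 era source discussed here; since JDK 9, javac compiles concatenation to an invokedynamic call instead (JEP 280).

  • The results of + and substring are not shared through the pool, because what they produce is not a literal but a String object created at run time. The exception is a + between compile-time constants, such as "a" + "b", which the compiler folds into a single literal.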
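A minimal sketch of both points, assuming the StringBuilder translation described in the Javadoc. ConcatDemo, viaPlus and viaBuilder are hypothetical names; viaBuilder spells out by hand roughly what an older compiler generates for viaPlus.

```java
class ConcatDemo {

    // Run-time concatenation with the + operator.
    static String viaPlus(String left, String right) {
        return left + right;
    }

    // The hand-written StringBuilder equivalent.
    static String viaBuilder(String left, String right) {
        return new StringBuilder().append(left).append(right).toString();
    }

    public static void main(String[] args) {
        String literal = "helloworld";
        String hello = "hello";

        String plus = viaPlus(hello, "world");
        String built = viaBuilder(hello, "world");

        System.out.println(plus.equals(built)); // true: same content
        System.out.println(plus == literal);    // false: a new object created at run time
        // A concatenation of two literals is a constant expression,
        // so the compiler folds it and the result comes from the pool.
        System.out.println(("hello" + "world") == literal); // true
    }
}
```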

The String Class Source Code

The Javadoc comment at the top of the String source reads:


The String class represents character strings. All string literals in Java programs, such as "abc", are implemented as instances of this class.

Strings are constant; their values cannot be changed after they are created. String buffers support mutable strings. Because String objects are immutable they can be shared. For example:

String str = "abc";

is equivalent to:

char data[] = {'a', 'b', 'c'};

String str = new String(data);

Here are some more examples of how strings can be used:

System.out.println("abc");

String cde = "cde";

System.out.println("abc" + cde);

String c = "abc".substring(2,3);

String d = cde.substring(1, 2);

The class String includes methods for examining individual characters of the sequence, for comparing strings, for searching strings, for extracting substrings, and for creating a copy of a string with all characters translated to uppercase or to lowercase. Case mapping is based on the Unicode Standard version specified by the Character class.

The Java language provides special support for the string concatenation operator ( + ), and for conversion of other objects to strings. String concatenation is implemented through the StringBuilder(or StringBuffer) class and its append method. String conversions are implemented through the method toString, defined by Object and inherited by all classes in Java. For additional information on string concatenation and conversion, see Gosling, Joy, and Steele, The Java Language Specification.

Unless otherwise noted, passing a null argument to a constructor or method in this class will cause a NullPointerException to be thrown.

A String represents a string in the UTF-16 format in which supplementary characters are represented by surrogate pairs (see the section Unicode Character Representations in the Character class for more information). Index values refer to char code units, so a supplementary character uses two positions in a String.
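The last paragraph of the Javadoc, about surrogate pairs, can be illustrated with a short sketch. SupplementaryDemo and fromCodePoint are hypothetical names; U+1F600 is just one sample code point outside the Basic Multilingual Plane.

```java
class SupplementaryDemo {

    // Builds a one-code-point string with the int[] constructor of String.
    static String fromCodePoint(int codePoint) {
        return new String(new int[] {codePoint}, 0, 1);
    }

    public static void main(String[] args) {
        String s = fromCodePoint(0x1F600); // a supplementary character

        System.out.println(s.length());                             // 2: two char code units
        System.out.println(s.codePointCount(0, s.length()));        // 1: one code point
        System.out.println(Character.isHighSurrogate(s.charAt(0))); // true
        System.out.println(Character.isLowSurrogate(s.charAt(1)));  // true
    }
}
```

Since indexes count char code units, methods such as charAt and substring can split a surrogate pair if the indexes are chosen carelessly.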

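The String class has one important member variable (shown here as it appears in JDK 8 and earlier; since JDK 9 the backing array is a byte[]):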


private final char value[];

Many String operations are built on this value character array. One example is substring, which extracts part of a string:


public String substring(int beginIndex) {

if (beginIndex < 0) {

throw new StringIndexOutOfBoundsException(beginIndex);

}

int subLen = value.length - beginIndex;

if (subLen < 0) {

throw new StringIndexOutOfBoundsException(subLen);

}

return (beginIndex == 0) ? this : new String(value, beginIndex, subLen);

}
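The branches of the substring source above can be observed from the outside with a sketch like this. SubstringDemo and throwsFor are hypothetical names. The identity result for substring(0) follows from the source shown here and should be treated as an implementation detail rather than a documented guarantee.

```java
class SubstringDemo {

    // Returns true if s.substring(beginIndex) rejects the index.
    static boolean throwsFor(String s, int beginIndex) {
        try {
            s.substring(beginIndex);
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String s = "hello world"; // length 11

        System.out.println(s.substring(6));            // world
        System.out.println(s.substring(0) == s);       // true: beginIndex 0 returns this
        System.out.println(s.substring(11).isEmpty()); // true: subLen is 0, which is allowed
        System.out.println(throwsFor(s, -1));          // true: negative index
        System.out.println(throwsFor(s, 12));          // true: subLen would be negative
    }
}
```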

The reverse method of StringBuilder is likewise built on an internal char[] array, using the two-pointer technique to reverse the string in place.
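The two-pointer idea can be sketched as follows. This is a simplified illustration, not the JDK code: ReverseSketch is a hypothetical class, and unlike the real StringBuilder.reverse it does not keep surrogate pairs in order.

```java
class ReverseSketch {

    // Reverses a string by swapping characters from both ends toward the middle.
    static String reverse(String s) {
        char[] chars = s.toCharArray(); // a copy; the original string is untouched
        int left = 0;
        int right = chars.length - 1;
        while (left < right) {
            char tmp = chars[left];
            chars[left] = chars[right];
            chars[right] = tmp;
            left++;
            right--;
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(reverse("hello")); // olleh
        // For text without supplementary characters the result matches StringBuilder.
        System.out.println(reverse("hello").equals(
                new StringBuilder("hello").reverse().toString())); // true
    }
}
```

Each swap moves both pointers one step inward, so the loop runs about half the length of the array and needs no second buffer.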
