Guide to Java String Pool

Last modified: November 21, 2017


1. Overview


String is the most commonly used class in the Java language.


In this quick article, we’ll explore the Java String Pool — the special memory region where Strings are stored by the JVM.


2. String Interning


Thanks to the immutability of Strings in Java, the JVM can optimize the amount of memory allocated for them by storing only one copy of each literal String in the pool. This process is called interning.


When we create a String variable and assign a value to it, the JVM searches the pool for a String of equal value.


If found, the JVM will simply return a reference to the pooled String, without allocating additional memory.


If not found, it’ll be added to the pool (interned) and its reference will be returned.


Let’s write a small test to verify this:


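String constantString1 = "Baeldung";
String constantString2 = "Baeldung";
// Both variables refer to the same pooled instance
System.out.println(constantString1 == constantString2); // true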
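
The lookup described above (search by value, return the existing reference if found, otherwise add and return) can be sketched with a plain map. This is only a toy model to illustrate the idea; the class name ToyStringPool and its methods are hypothetical, and the real JVM pool is a native hash table, not a HashMap.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of an intern pool. NOT the JVM's real implementation.
class ToyStringPool {
    private final Map<String, String> pool = new HashMap<>();

    // Returns the canonical instance for the given value, adding it on first sight.
    String intern(String value) {
        String existing = pool.get(value);   // search the pool by value
        if (existing != null) {
            return existing;                 // found: hand back the pooled reference
        }
        pool.put(value, value);              // not found: store it, then return it
        return value;
    }

    int size() {
        return pool.size();
    }

    public static void main(String[] args) {
        ToyStringPool pool = new ToyStringPool();
        String a = pool.intern(new String("Baeldung"));
        String b = pool.intern(new String("Baeldung"));
        System.out.println(a == b);      // true
        System.out.println(pool.size()); // 1
    }
}
```

Two distinct heap objects with equal content go in, and the same canonical reference comes out both times, which is the property the real pool guarantees for literals.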

3. Strings Allocated Using the Constructor


When we create a String via the new operator, the JVM will create a new object at runtime and store it in the heap, outside the String pool.


Every String created like this will point to a different memory region with its own address.


Let’s see how this is different from the previous case:


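String constantString = "Baeldung";
String newString = new String("Baeldung");
// The literal lives in the pool; new String(...) always allocates a separate object
System.out.println(constantString == newString); // false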

4. String Literal vs String Object


When we create a String object using the new operator, it always creates a new object in heap memory. On the other hand, if we create an object using String literal syntax, e.g. “Baeldung”, the JVM returns the existing object from the String pool if there is one. Otherwise, it creates a new String object and puts it in the String pool for future reuse.


At a high level, both are String objects. The main difference is that the new operator always creates a new String object, whereas a String created from a literal is interned.


This becomes clearer when we compare String objects created using literals and the new operator:


String first = "Baeldung";
String second = "Baeldung";
System.out.println(first == second); // true

In this example, the String objects will have the same reference.


Next, let’s create two different objects using new and check that they have different references:


String third = new String("Baeldung");
String fourth = new String("Baeldung");
System.out.println(third == fourth); // false

Similarly, when we use the == operator to compare a String literal with a String object created using the new operator, it returns false:


String fifth = "Baeldung";
String sixth = new String("Baeldung");
System.out.println(fifth == sixth); // false

In general, we should use the String literal notation when possible. It is easier to read and it gives the compiler a chance to optimize our code.
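
The comparisons in this section use ==, which checks reference identity only. As a small runnable sketch (the class and helper names are made up for illustration), the following contrasts that with equals(), which compares content and is therefore unaffected by whether a String came from the pool:

```java
class StringComparisonDemo {
    // Reference identity: true only if both variables point to the same object
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    // Content equality: true if both Strings hold the same characters
    static boolean sameContent(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String literal = "Baeldung";
        String created = new String("Baeldung");
        System.out.println(sameReference(literal, created)); // false
        System.out.println(sameContent(literal, created));   // true
    }
}
```

In application code, equals() is usually the right choice for comparing String values; == is shown here only to observe pooling.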


5. Manual Interning


We can manually intern a String in the Java String Pool by calling the intern() method on the object we want to intern.


Manually interning the String will store its reference in the pool, and the JVM will return this reference when needed.


Let’s create a test case for this:


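String constantString = "interned Baeldung";
String newString = new String("interned Baeldung");
System.out.println(constantString == newString); // false

String internedString = newString.intern();
System.out.println(constantString == internedString); // true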
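
Manual interning is mostly relevant for Strings assembled at runtime, since those are not pooled automatically. The sketch below (class and method names are hypothetical) builds the same text with a StringBuilder and shows that only the interned result shares a reference with the literal:

```java
class RuntimeInternDemo {
    // Built at runtime, so the result is a fresh heap object, not the pooled literal
    static String build() {
        return new StringBuilder("interned ").append("Baeldung").toString();
    }

    public static void main(String[] args) {
        String literal = "interned Baeldung"; // already in the pool
        String built = build();

        System.out.println(literal == built);          // false
        System.out.println(literal == built.intern()); // true
    }
}
```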


6. Garbage Collection


Before Java 7, the JVM placed the Java String Pool in the PermGen space, which has a fixed size — it can’t be expanded at runtime and is not eligible for garbage collection.


The risk of interning Strings in the PermGen (instead of the Heap) is that the JVM can throw an OutOfMemoryError if we intern too many Strings.


From Java 7 onwards, the Java String Pool is stored in the Heap space, which is garbage collected by the JVM. The advantage of this approach is a reduced risk of OutOfMemoryError, because unreferenced Strings will be removed from the pool, thereby releasing memory.


7. Performance and Optimizations


In Java 6, the only optimization we can perform is increasing the PermGen space when launching the program, using the MaxPermSize JVM option:

-XX:MaxPermSize=1g



From Java 7 onwards, we have more detailed options to examine and expand or reduce the pool size. Let’s see the two options for viewing the pool size:

-XX:+PrintFlagsFinal
-XX:+PrintStringTableStatistics



If we want to increase the pool size in terms of buckets, we can use the StringTableSize JVM option:

-XX:StringTableSize=4901



Prior to Java 7u40, the default pool size was 1009 buckets, but this value has changed in more recent Java versions. To be precise, the default pool size from Java 7u40 until Java 11 was 60013, and since Java 11 it is 65536.


Note that increasing the pool size will consume more memory but has the advantage of reducing the time required to insert the Strings into the table.
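
Besides the command-line flags, the effective table size can also be read from inside a running program. The following is a minimal sketch that assumes a HotSpot-based JVM, where the com.sun.management.HotSpotDiagnosticMXBean is available; the class name StringTableSizeProbe and the -1 fallback are choices made for this illustration only.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

class StringTableSizeProbe {
    // Returns the effective StringTableSize, or -1 if it can't be read on this JVM
    static long stringTableSize() {
        try {
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return -1; // assumption: treat a missing bean as "unknown"
            }
            return Long.parseLong(bean.getVMOption("StringTableSize").getValue());
        } catch (RuntimeException e) {
            return -1; // option not exposed, or value not numeric
        }
    }

    public static void main(String[] args) {
        System.out.println("StringTableSize = " + stringTableSize());
    }
}
```

Running it with and without -XX:StringTableSize on the command line is a quick way to confirm that the option took effect.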


8. A Note About Java 9


Up to and including Java 8, Strings were internally represented as an array of characters (char[]) encoded in UTF-16, so that every character used two bytes of memory.


Java 9 introduced a new representation called Compact Strings. A String is now backed by a byte[] together with an encoding flag, and its content is stored as Latin-1 (one byte per character) when every character fits, or as UTF-16 otherwise.


Since the new String representation uses the UTF-16 encoding only when necessary, Strings that fit in Latin-1 need roughly half the heap memory for their character data, which in turn causes less Garbage Collector overhead on the JVM.
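
To get a feel for the size difference, we can encode the same text both ways and compare the byte counts. Note that this sketch does not inspect the String's internal array; it only encodes copies with the two charsets to illustrate the one-byte versus two-byte cost per character. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class CompactStringEstimate {
    // Bytes needed when every character fits in Latin-1 (one byte each)
    static int latin1Bytes(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    // Bytes needed in UTF-16 (two bytes per char, no byte-order mark)
    static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String text = "Baeldung";
        System.out.println(latin1Bytes(text)); // 8
        System.out.println(utf16Bytes(text));  // 16
    }
}
```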


9. Conclusion


In this guide, we showed how the JVM and the Java compiler optimize memory allocations for String objects via the Java String Pool.


All code samples used in the article are available over on GitHub.
