Why String is Immutable in Java? – 为什么Java中的字符串是不可变的?

最后修改: 2018年 8月 8日

中文/混合/英文(键盘快捷键:t)

1. Introduction

1.介绍

In Java, Strings are immutable. An obvious question that is quite prevalent in interviews is “Why Strings are designed as immutable in Java?”

在Java中,字符串是不可变的。在采访中,一个很明显的问题是:”为什么Java中的字符串被设计成不可变的?”

James Gosling, the creator of Java, was once asked in an interview when should one use immutables, to which he answers:

Java的创造者James Gosling,曾经在一次采访中被问到什么时候应该使用不可变量,他回答说。

I would use an immutable whenever I can.

只要有可能,我就会使用不可变的。

He further supports his argument stating features that immutability provides, such as caching, security, easy reuse without replication, etc.

他进一步支持他的论点,指出不可更改性所提供的功能,如缓存、安全、无需复制即可轻松重用等。

In this tutorial, we’ll further explore why the Java language designers decided to keep String immutable.

在本教程中,我们将进一步探讨为什么Java语言设计者决定保持String不可变。

2. What Is an Immutable Object?

2.什么是不可改变的对象?

An immutable object is an object whose internal state remains constant after it has been entirely created. This means that once the object has been assigned to a variable, we can neither update the reference nor mutate the internal state by any means.

不可变的对象是一个对象,其内部状态在完全创建后保持不变。这意味着,一旦该对象被分配给一个变量,我们既不能更新引用,也不能通过任何方式改变内部状态。

We have a separate article that discusses immutable objects in detail. For more information, read the Immutable Objects in Java article.

我们有一篇单独的文章,详细讨论了不可变型对象。欲了解更多信息,请阅读Java中的不可变对象文章。

3. Why Is String Immutable in Java?

3.为什么String在Java中是不可变的?

The key benefits of keeping this class as immutable are caching, security, synchronization, and performance.

将这个类保持为不可变的关键好处是缓存、安全、同步和性能。

Let’s discuss how these things work.

让我们讨论一下这些东西是如何工作的。

3.1. Introduce to String Pool

3.1.引入String Pool

The String is the most widely used data structure. Caching the String literals and reusing them saves a lot of heap space because different String variables refer to the same object in the String pool. String intern pool serves exactly this purpose.

String是使用最广泛的数据结构。缓存String字面意义并重用它们可以节省大量的堆空间,因为不同的String变量会引用String池中的同一个对象。String实习池正是为此服务的。

Java String Pool is the special memory region where Strings are stored by the JVM. Since Strings are immutable in Java, the JVM optimizes the amount of memory allocated for them by storing only one copy of each literal String in the pool. This process is called interning:

Java字符串池是JVM存储字符串的特殊内存区域。由于字符串在Java中是不可改变的,JVM通过在池中只存储每个字面字符串的一个副本来优化分配给它们的内存量。这个过程被称为interning。

String s1 = "Hello World";
String s2 = "Hello World";
         
assertThat(s1 == s2).isTrue();

Because of the presence of the String pool in the preceding example, two different variables are pointing to same String object from the pool, thus saving crucial memory resource.

由于前面的例子中存在String池,两个不同的变量都指向池中的同一个String对象,从而节省了关键的内存资源。

Why String Is Immutable In Java

We have a separate article dedicated to Java String Pool. For more information, head on over to that article.

我们有一篇单独的文章专门介绍Java String Pool。欲了解更多信息,请前往该文章

3.2. Security

3.2.安全

The String is widely used in Java applications to store sensitive pieces of information like usernames, passwords, connection URLs, network connections, etc. It’s also used extensively by JVM class loaders while loading classes.

String在Java应用程序中被广泛用于存储敏感信息,如用户名、密码、连接URL、网络连接等。它也被JVM类加载器在加载类时广泛使用。

Hence securing String class is crucial regarding the security of the whole application in general. For example, consider this simple code snippet:

因此,确保String类的安全对于整个应用程序的安全至关重要。例如,考虑这个简单的代码片段。

void criticalMethod(String userName) {
    // perform security checks
    if (!isAlphaNumeric(userName)) {
        throw new SecurityException(); 
    }
	
    // do some secondary tasks
    initializeDatabase();
	
    // critical task
    connection.executeUpdate("UPDATE Customers SET Status = 'Active' " +
      " WHERE UserName = '" + userName + "'");
}

In the above code snippet, let’s say that we received a String object from an untrustworthy source. We’re doing all necessary security checks initially to check if the String is only alphanumeric, followed by some more operations.

在上面的代码片段中,假设我们从一个不可信的来源收到一个String对象。我们首先进行所有必要的安全检查,以检查String是否只是字母数字,然后再进行一些操作。

Remember that our unreliable source caller method still has reference to this userName object.

记住,我们不可靠的源调用者方法仍然有对这个userName对象的引用。

If Strings were mutable, then by the time we execute the update, we can’t be sure that the String we received, even after performing security checks, would be safe. The untrustworthy caller method still has the reference and can change the String between integrity checks. Thus making our query prone to SQL injections in this case. So mutable Strings could lead to degradation of security over time.

如果字符串是可变的,那么当我们执行更新时,我们无法确定我们收到的字符串,即使在执行安全检查后,也是安全的。不值得信任的调用者方法仍然拥有引用,并且可以在完整性检查之间改变字符串。因此,在这种情况下,我们的查询很容易被SQL注入。因此,易变的String可能会导致安全性随着时间的推移而降低。

It could also happen that the String userName is visible to another thread, which could then change its value after the integrity check.

也可能发生的情况是,String userName对另一个线程是可见的,然后它可能在完整性检查后改变其值。

In general, immutability comes to our rescue in this case because it’s easier to operate with sensitive code when values don’t change because there are fewer interleavings of operations that might affect the result.

一般来说,在这种情况下,不可变性会对我们有所帮助,因为当值不发生变化时,对敏感代码的操作会更容易,因为可能影响结果的操作交错较少。

3.3. Synchronization

3.3.同步化

Being immutable automatically makes the String thread safe since they won’t be changed when accessed from multiple threads.

不可变的特性自动使String线程安全,因为它们在被多个线程访问时不会被改变。

Hence immutable objects, in general, can be shared across multiple threads running simultaneously. They’re also thread-safe because if a thread changes the value, then instead of modifying the same, a new String would be created in the String pool. Hence, Strings are safe for multi-threading.

因此,不可变的对象,一般来说,可以在同时运行的多个线程中共享。它们也是线程安全的,因为如果一个线程改变了值,那么将在String池中创建一个新的String,而不是修改同一个值。因此,字符串对于多线程是安全的。

3.4. Hashcode Caching

3.4.哈希码缓存

Since String objects are abundantly used as a data structure, they are also widely used in hash implementations like HashMap, HashTable, HashSet, etc. When operating upon these hash implementations, hashCode() method is called quite frequently for bucketing.

由于String对象被大量用作数据结构,它们也被广泛用于散列实现,如HashMapHashTableHashSet,等等。当对这些哈希实现进行操作时,hashCode()方法会被频繁调用,以进行分桶。

The immutability guarantees Strings that their value won’t change. So the hashCode() method is overridden in String class to facilitate caching, such that the hash is calculated and cached during the first hashCode() call and the same value is returned ever since.

不变性保证了字符串的值不会改变。因此,hashCode()方法在String类中被重载,以方便缓存,这样,在第一次调用hashCode()时,哈希值被计算和缓存,此后返回相同的值。

This, in turn, improves the performance of collections that uses hash implementations when operated with String objects.

这反过来又提高了使用哈希实现的集合在操作String对象时的性能。

On the other hand, mutable Strings would produce two different hashcodes at the time of insertion and retrieval if contents of String was modified after the operation, potentially losing the value object in the Map.

另一方面,如果String的内容在操作后被修改,可变的String将在插入和检索时产生两个不同的哈希代码,从而可能丢失Map中的值对象。

3.5. Performance

3.5.性能

As we saw previously, String pool exists because Strings are immutable. In turn, it enhances the performance by saving heap memory and faster access of hash implementations when operated with Strings.

正如我们之前看到的,String池的存在是因为String是不可变的。反过来,当使用字符串操作时,它通过节省堆内存和更快地访问哈希实现来增强性能。

Since String is the most widely used data structure, improving the performance of String have a considerable effect on improving the performance of the whole application in general.

由于String是最广泛使用的数据结构,改善String的性能对改善整个应用程序的性能有相当大的影响。

4. Conclusion

4.结论

Through this article, we can conclude that Strings are immutable precisely so that their references can be treated as a normal variable and one can pass them around, between methods and across threads, without worrying about whether the actual String object it’s pointing to will change.

通过这篇文章,我们可以得出这样的结论:字符串是不可变的,正是为了使它们的引用可以被视为一个普通的变量,人们可以在方法之间和跨线程之间传递它们,而不必担心它所指向的实际字符串对象是否会改变。

We also learned as what might be the other reasons that prompted the Java language designers to make this class as immutable.

我们还了解到,促使Java语言设计者将这个类作为不可变的其他原因。