Common Concurrency Pitfalls in Java

Last modified: November 23, 2019

1. Introduction

In this tutorial, we’ll look at some of the most common concurrency problems in Java. We’ll also learn what causes them and how to avoid them.

2. Using Thread-Safe Objects

2.1. Sharing Objects

Threads communicate primarily by sharing access to the same objects. So, reading from an object while it changes can give unexpected results. Also, concurrently changing an object can leave it in a corrupted or inconsistent state.

The main way we can avoid such concurrency issues and build reliable code is to work with immutable objects. This is because their state cannot be modified by the interference of multiple threads.
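
As a minimal sketch of what such an object can look like, the class below keeps all of its fields final and exposes no setters. The ImmutablePoint name and its withX method are hypothetical and only serve as an illustration.

```java
// Hypothetical example class: every field is final and there are no setters,
// so the state of an instance can never change after construction.
final class ImmutablePoint {
    private final int x;
    private final int y;

    ImmutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int getX() {
        return x;
    }

    int getY() {
        return y;
    }

    // A "modification" returns a new instance and leaves this one untouched.
    ImmutablePoint withX(int newX) {
        return new ImmutablePoint(newX, y);
    }
}
```

Because no thread can ever change an existing ImmutablePoint, instances can be shared freely without locking.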

However, we can’t always work with immutable objects. In these cases, we have to find ways to make our mutable objects thread-safe.

2.2. Making Collections Thread-Safe

Like any other object, collections maintain state internally. This could be altered by multiple threads changing the collection concurrently. So, one way we can safely work with collections in a multithreaded environment is to synchronize them:

Map<String, String> map = Collections.synchronizedMap(new HashMap<>());
List<Integer> list = Collections.synchronizedList(new ArrayList<>());

In general, synchronization helps us to achieve mutual exclusion. More specifically, these collections can be accessed by only one thread at a time. Thus, we can avoid leaving collections in an inconsistent state.
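
To see the effect, here is a small sketch in which several threads add to one synchronized list. The SynchronizedListDemo class, the fillConcurrently method, and the thread and element counts are all assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SynchronizedListDemo {

    // Starts several threads that each add the same number of elements
    // to one shared synchronized list, then returns the final size.
    static int fillConcurrently(int threads, int addsPerThread) {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < addsPerThread; i++) {
                    list.add(i); // each add() runs while holding the wrapper's lock
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(fillConcurrently(4, 10_000)); // prints 40000
    }
}
```

With a plain ArrayList in place of the synchronized wrapper, the same code could lose elements or throw an exception.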

2.3. Specialist Multithreaded Collections

Now let’s consider a scenario where we need more reads than writes. By using a synchronized collection, our application can suffer major performance consequences. If two threads want to read the collection at the same time, one has to wait until the other finishes.

For this reason, Java provides concurrent collections such as CopyOnWriteArrayList and ConcurrentHashMap that can be accessed simultaneously by multiple threads:

CopyOnWriteArrayList<String> list = new CopyOnWriteArrayList<>();
Map<String, String> map = new ConcurrentHashMap<>();

The CopyOnWriteArrayList achieves thread-safety by creating a separate copy of the underlying array for mutative operations like add or remove. Although it has a poorer performance for write operations than a Collections.synchronizedList, it provides us with better performance when we need significantly more reads than writes.
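
The copy-on-write behavior can be observed with a short sketch like the one below, which adds to the list while iterating over it. The CopyOnWriteDemo class and its method name are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class CopyOnWriteDemo {

    // Iterates over the list while adding to it. The iterator works on the
    // array snapshot taken when iteration started, so it does not see the
    // new elements and does not throw ConcurrentModificationException.
    // Returns {elements seen by the iterator, final list size}.
    static int[] iterateWhileAdding() {
        List<String> list = new CopyOnWriteArrayList<>(Arrays.asList("a", "b", "c"));
        int seen = 0;
        for (String item : list) {
            list.add(item + "-copy"); // each write goes to a fresh copy of the array
            seen++;
        }
        return new int[] {seen, list.size()};
    }

    public static void main(String[] args) {
        int[] result = iterateWhileAdding();
        System.out.println(result[0] + " seen, " + result[1] + " stored"); // prints "3 seen, 6 stored"
    }
}
```

Every add copies the whole array, which is why this collection only pays off when reads clearly dominate writes.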

ConcurrentHashMap is designed to be thread-safe and usually performs better than the Collections.synchronizedMap wrapper around a non-thread-safe Map. Instead of guarding the whole map with a single lock, it locks only small parts of its internal table (segments in older Java versions, individual hash bins since Java 8), so operations on different parts of the map can proceed at the same time.

2.4. Working with Non-Thread-Safe Types

We often use built-in objects like SimpleDateFormat to parse and format date objects. The SimpleDateFormat class mutates its internal state while doing its operations.

We need to be very careful with them because they are not thread-safe. Their state can become inconsistent in a multithreaded application due to things like race conditions.

So, how can we use the SimpleDateFormat safely? We have several options:

  • Create a new instance of SimpleDateFormat every time it’s used
  • Restrict the number of objects created by using a ThreadLocal<SimpleDateFormat> object. It guarantees that each thread will have its own instance of SimpleDateFormat
  • Synchronize concurrent access by multiple threads with the synchronized keyword or a lock

SimpleDateFormat is just one example of this. We can use these techniques with any non-thread-safe type.
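
The ThreadLocal option from the list above could look roughly like this. The DateFormatting class, the FORMATTER field, and the date pattern are assumptions chosen for the sketch.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

class DateFormatting {

    // One SimpleDateFormat per thread, created lazily on that thread's first use.
    private static final ThreadLocal<SimpleDateFormat> FORMATTER =
        ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd"));

    // Safe to call from any thread: each thread only ever touches its own formatter.
    static String format(Date date) {
        return FORMATTER.get().format(date);
    }

    public static void main(String[] args) {
        System.out.println(format(new Date()));
    }
}
```

Compared with creating a new formatter on every call, this creates at most one instance per thread, at the cost of keeping that instance alive for as long as the thread lives.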

3. Race Conditions

A race condition occurs when two or more threads access shared data and at least one of them tries to change it at the same time. The outcome then depends on how the threads happen to interleave, so race conditions can cause runtime errors or unexpected results.

3.1. Race Condition Example

Let’s consider the following code:

class Counter {
    private int counter = 0;

    public void increment() {
        counter++;
    }

    public int getValue() {
        return counter;
    }
}

The Counter class is designed so that each invocation of the increment method will add 1 to the counter. However, if a Counter object is referenced from multiple threads, the interference between threads may prevent this from happening as expected.

We can decompose the counter++ statement into 3 steps:

  • Retrieve the current value of counter
  • Increment the retrieved value by 1
  • Store the incremented value back in counter

Now, let’s suppose two threads, thread1 and thread2, invoke the increment method at the same time. Their interleaved actions might follow this sequence:

  • thread1 reads the current value of counter; 0
  • thread2 reads the current value of counter; 0
  • thread1 increments the retrieved value; the result is 1
  • thread2 increments the retrieved value; the result is 1
  • thread1 stores the result in counter; counter is now 1
  • thread2 stores the result in counter; counter is still 1

We expected the value of the counter to be 2, but it was 1.
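
The lost update described above can be provoked with a sketch like the following, in which two threads hammer the same unsynchronized counter. The RaceConditionDemo wrapper, the runTwoThreads method, and the iteration count are assumptions; the nested Counter mirrors the class from the text.

```java
class RaceConditionDemo {

    // Same unsynchronized counter as in the example above.
    static class Counter {
        private int counter = 0;

        public void increment() {
            counter++;
        }

        public int getValue() {
            return counter;
        }
    }

    // Runs two threads that each call increment() the given number of times
    // and returns the value the counter ends up with.
    static int runTwoThreads(int incrementsPerThread) {
        Counter counter = new Counter();
        Runnable task = () -> {
            for (int i = 0; i < incrementsPerThread; i++) {
                counter.increment(); // unsynchronized read-modify-write
            }
        };
        Thread thread1 = new Thread(task);
        Thread thread2 = new Thread(task);
        thread1.start();
        thread2.start();
        try {
            thread1.join();
            thread2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.getValue();
    }

    public static void main(String[] args) {
        // Expected 200000, but lost updates usually make the result smaller.
        System.out.println(runTwoThreads(100_000));
    }
}
```

The exact result varies from run to run, and an occasional run may even produce the expected total, which is part of what makes race conditions hard to spot in testing.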

3.2. A Synchronized-Based Solution

We can fix the inconsistency by synchronizing the critical code:

class SynchronizedCounter {
    private int counter = 0;

    public synchronized void increment() {
        counter++;
    }

    public synchronized int getValue() {
        return counter;
    }
}

Only one thread is allowed to use the synchronized methods of an object at any one time, so this forces consistency in the reading and writing of the counter.

3.3. A Built-In Solution

We can replace the above code with a built-in AtomicInteger object. Among other things, this class offers atomic methods for incrementing an integer, and it’s a better solution than writing our own code. We can call its methods directly without any additional synchronization:

AtomicInteger atomicInteger = new AtomicInteger(3);
atomicInteger.incrementAndGet();

In this case, the JDK solves the problem for us. Where no suitable built-in class exists, we can write our own code and encapsulate the critical sections in a custom thread-safe class. This approach helps us minimize the complexity and maximize the reusability of our code.
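
As a rough check of the atomic counter under contention, the sketch below shares one AtomicInteger between two threads. The AtomicCounterDemo class, the runTwoThreads method, and the iteration count are assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;

class AtomicCounterDemo {

    // Two threads increment one shared AtomicInteger; no update is lost.
    static int runTwoThreads(int incrementsPerThread) {
        AtomicInteger counter = new AtomicInteger(0);
        Runnable task = () -> {
            for (int i = 0; i < incrementsPerThread; i++) {
                counter.incrementAndGet(); // atomic read-modify-write
            }
        };
        Thread thread1 = new Thread(task);
        Thread thread2 = new Thread(task);
        thread1.start();
        thread2.start();
        try {
            thread1.join();
            thread2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runTwoThreads(100_000)); // prints 200000
    }
}
```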

4. Race Conditions Around Collections

4.1. The Problem

Another pitfall we can fall into is to think that synchronized collections offer us more protection than they actually do.

Let’s examine the code below:

List<String> list = Collections.synchronizedList(new ArrayList<>());
if(!list.contains("foo")) {
    list.add("foo");
}

Each individual operation on our list is synchronized, but a sequence of several method calls is not. More specifically, another thread can modify our collection between the contains and add calls, leading to undesired results.

For example, two threads could enter the if block at the same time and then update the list, each thread adding the foo value to the list.

4.2. A Solution for Lists

We can protect the code from being accessed by more than one thread at a time using synchronization:

synchronized (list) {
    if (!list.contains("foo")) {
        list.add("foo");
    }
}

Rather than adding the synchronized keyword to a method, we’ve created a critical section guarded by the list object, which allows only one thread at a time to perform this operation.

We should note that we can use synchronized(list) on other operations on our list object, to provide a guarantee that only one thread at a time can perform any of our operations on this object.
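
One way to package this check-then-act sequence is a small helper that takes the list's lock for the whole operation. The AddIfAbsentDemo class, its method names, and the thread count below are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class AddIfAbsentDemo {

    // Performs contains() and add() as one critical section,
    // guarded by the same lock the synchronized list uses internally.
    static boolean addIfAbsent(List<String> list, String value) {
        synchronized (list) {
            if (!list.contains(value)) {
                return list.add(value);
            }
            return false;
        }
    }

    // Lets several threads race to add the same value and returns the final size.
    static int raceToAdd(int threads) {
        List<String> list = Collections.synchronizedList(new ArrayList<>());
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> addIfAbsent(list, "foo"));
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(raceToAdd(8)); // prints 1
    }
}
```

This only works if every compound operation on the list goes through the same lock; a caller that skips the helper and uses contains and add directly reintroduces the race.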

4.3. A Built-In Solution for ConcurrentHashMap

Now, let’s consider using a map for the same reason, namely adding an entry only if it’s not present.

The ConcurrentHashMap offers a better solution for this type of problem. We can use its atomic putIfAbsent method:

Map<String, String> map = new ConcurrentHashMap<>();
map.putIfAbsent("foo", "bar");

Or, if we want to compute the value, its atomic computeIfAbsent method:

map.computeIfAbsent("foo", key -> key + "bar");

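We should note that these methods are declared on the Map interface, where they offer a convenient way to avoid writing conditional logic around insertion. However, only concurrent implementations such as ConcurrentHashMap guarantee that they execute atomically; the default Map implementations make no such promise. That guarantee is what really helps us when several threads perform the same check-then-insert call.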
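
The return values of both methods tell us whether our call actually inserted anything. The sketch below illustrates this; the PutIfAbsentDemo class and the keys and values it uses are made up for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class PutIfAbsentDemo {

    // Returns the results of four calls so that they can be inspected.
    static String[] demo() {
        ConcurrentMap<String, String> map = new ConcurrentHashMap<>();

        // No previous mapping: "bar" is stored and null is returned.
        String first = map.putIfAbsent("foo", "bar");
        // Mapping exists: "baz" is ignored and the current value "bar" is returned.
        String second = map.putIfAbsent("foo", "baz");

        // Key absent: the function runs and its result "keybar" is stored and returned.
        String computed = map.computeIfAbsent("key", k -> k + "bar");
        // Key present: the function is not applied and "keybar" is returned again.
        String again = map.computeIfAbsent("key", k -> "ignored");

        return new String[] {first, second, computed, again};
    }

    public static void main(String[] args) {
        System.out.println(String.join(", ", demo())); // prints "null, bar, keybar, keybar"
    }
}
```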

5. Memory Consistency Issues

Memory consistency issues occur when multiple threads have inconsistent views of what should be the same data.

In addition to the main memory, most modern computer architectures are using a hierarchy of caches (L1, L2, and L3 caches) to improve the overall performance. Thus, any thread may cache variables because it provides faster access compared to the main memory.

5.1. The Problem

Let’s recall our Counter example:

class Counter {
    private int counter = 0;

    public void increment() {
        counter++;
    }

    public int getValue() {
        return counter;
    }
}

Let’s consider the scenario where thread1 increments the counter and then thread2 reads its value. The following sequence of events might happen:

  • thread1 reads the counter value from its own cache; counter is 0
  • thread1 increments the counter and writes it back to its own cache; counter is 1
  • thread2 reads the counter value from its own cache; counter is 0

Of course, the expected sequence of events could happen too, and thread2 would read the correct value (1). However, there is no guarantee that changes made by one thread will be visible to other threads every time.

5.2. The Solution

In order to avoid memory consistency errors, we need to establish a happens-before relationship. This relationship is simply a guarantee that memory updates by one specific statement are visible to another specific statement.

There are several strategies that create happens-before relationships. One of them is synchronization, which we’ve already looked at.

Synchronization ensures both mutual exclusion and memory consistency. However, this comes with a performance cost.

We can also avoid memory consistency problems by using the volatile keyword. Simply put, every change to a volatile variable is always visible to other threads.
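
A common way to see this visibility guarantee at work is a stop flag shared between two threads. The sketch below is only an illustration; the VolatileFlagDemo class, the running field, and the timings are assumptions.

```java
class VolatileFlagDemo {

    // Without volatile, the worker would not be guaranteed to see the update
    // and might keep spinning on a stale value.
    private static volatile boolean running = true;

    // Starts a spinning worker, clears the flag, and reports whether the worker stopped.
    static boolean stopWorker() {
        running = true;
        Thread worker = new Thread(() -> {
            while (running) {
                // busy-wait until another thread clears the flag
            }
        });
        worker.setDaemon(true); // never keep the JVM alive because of this demo thread
        worker.start();
        try {
            Thread.sleep(50);     // let the worker spin for a moment
            running = false;      // volatile write, visible to the worker's next read
            worker.join(2_000);   // wait up to two seconds for the worker to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(stopWorker()); // prints true
    }
}
```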

Let’s rewrite our Counter example using volatile:

class SynchronizedCounter {
    private volatile int counter = 0;

    public synchronized void increment() {
        counter++;
    }

    public int getValue() {
        return counter;
    }
}

We should note that we still need to synchronize the increment operation because volatile doesn’t give us mutual exclusion, and counter++ is not atomic. The getValue method, on the other hand, can stay unsynchronized: a plain read of a volatile variable is atomic and is more efficient than accessing the variable through synchronized code.

5.3. Non-Atomic long and double Values

So, if we read a variable without proper synchronization, we may see a stale value. For long and double values, quite surprisingly, it’s even possible to see completely random values in addition to stale ones.

According to chapter 17 of the Java Language Specification (JLS §17.7), a JVM may treat a write to a non-volatile long or double as two separate 32-bit writes. Therefore, when reading a long or double value, it’s possible to read an updated 32-bit half along with a stale 32-bit half. Consequently, we may observe random-looking long or double values in concurrent contexts.

On the other hand, writes and reads of volatile long and double values are always atomic.

6. Misusing Synchronize

The synchronization mechanism is a powerful tool for achieving thread-safety. It relies on intrinsic and extrinsic locks. Let’s also remember that every object has its own intrinsic lock and that only one thread can hold a given lock at a time.

However, if we don’t pay attention and carefully choose the right locks for our critical code, unexpected behavior can occur.

6.1. Synchronizing on this Reference

Method-level synchronization solves many concurrency issues. However, it can also cause other concurrency issues if it’s overused. For instance methods, this approach uses the intrinsic lock of the this reference.

We can see in the following examples how a method-level synchronization can be translated into a block-level synchronization with the this reference as a lock.

These methods are equivalent:

public synchronized void foo() {
    //...
}
public void foo() {
    synchronized(this) {
      //...
    }
}

While a thread is executing such a method, no other thread can execute any synchronized method or synchronized(this) block on the same object. This can hurt performance under contention because the synchronized work ends up running one thread at a time. This approach is especially wasteful when an object is read more often than it is updated.

Moreover, a client of our code might also acquire the this lock. In the worst-case scenario, this operation can lead to a deadlock.
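
One common way to keep clients away from our lock, which goes slightly beyond what the text above covers, is to synchronize on a private object instead of this. The PrivateLockCounter class below is a hypothetical sketch of that idea.

```java
class PrivateLockCounter {

    // Clients cannot reach this object, so they cannot acquire its lock
    // and interfere with the class's own synchronization.
    private final Object lock = new Object();
    private int counter = 0;

    public void increment() {
        synchronized (lock) {
            counter++;
        }
    }

    public int getValue() {
        synchronized (lock) {
            return counter;
        }
    }
}
```

A client that writes synchronized(counterInstance) now locks a different monitor than the one used inside the class, so it can neither block these methods nor take part in a deadlock through them.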

6.2. Deadlock

Deadlock describes a situation where two or more threads block each other, each waiting to acquire a resource held by some other thread.

Let’s consider the example:

public class DeadlockExample {

    private static final Object lock1 = new Object();
    private static final Object lock2 = new Object();

    public static void main(String[] args) {
        Thread threadA = new Thread(() -> {
            synchronized (lock1) {
                System.out.println("ThreadA: Holding lock 1...");
                sleep();
                System.out.println("ThreadA: Waiting for lock 2...");

                synchronized (lock2) {
                    System.out.println("ThreadA: Holding lock 1 & 2...");
                }
            }
        });
        Thread threadB = new Thread(() -> {
            synchronized (lock2) {
                System.out.println("ThreadB: Holding lock 2...");
                sleep();
                System.out.println("ThreadB: Waiting for lock 1...");

                synchronized (lock1) {
                    System.out.println("ThreadB: Holding lock 1 & 2...");
                }
            }
        });
        threadA.start();
        threadB.start();
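    }

    // Pauses briefly so that each thread takes its first lock before requesting the second.
    private static void sleep() {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }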
}

In the above code, threadA first acquires lock1 and threadB acquires lock2. Then threadA tries to get lock2, which is already held by threadB, and threadB tries to get lock1, which is already held by threadA. Neither of them can proceed, so they are in a deadlock.

We can fix this issue by making both threads acquire the locks in the same order, for example by changing threadB so that it takes lock1 before lock2.
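
A minimal sketch of the fixed version follows, assuming both threads can share the same task. The LockOrderingExample class, the runBothThreads and pause methods, and the timeouts are hypothetical names and values for this illustration.

```java
class LockOrderingExample {

    private static final Object lock1 = new Object();
    private static final Object lock2 = new Object();

    // Both threads run the same task, so both take lock1 first and lock2 second.
    // Returns true if both threads finished within the timeout.
    static boolean runBothThreads() {
        Runnable task = () -> {
            synchronized (lock1) {
                pause();
                synchronized (lock2) {
                    System.out.println(Thread.currentThread().getName() + ": Holding lock 1 & 2...");
                }
            }
        };
        Thread threadA = new Thread(task, "ThreadA");
        Thread threadB = new Thread(task, "ThreadB");
        threadA.start();
        threadB.start();
        try {
            threadA.join(5_000);
            threadB.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !threadA.isAlive() && !threadB.isAlive();
    }

    // Short delay that would have triggered the deadlock with inconsistent ordering.
    private static void pause() {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(runBothThreads()); // prints true after both threads report
    }
}
```

With a single global ordering, a thread that holds lock1 can always go on to obtain lock2, so the circular wait from the original example cannot form.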

We should note that this is just one example, and there are many others that can lead to a deadlock.

7. Conclusion

In this article, we explored several examples of concurrency issues that we’re likely to encounter in our multithreaded applications.

First, we learned that we should opt for objects or operations that are either immutable or thread-safe.

Then, we saw several examples of race conditions and how we can avoid them using the synchronization mechanism. Furthermore, we learned about memory-related race conditions and how to avoid them.

Although the synchronization mechanism helps us avoid many concurrency issues, we can easily misuse it and create other issues. For this reason, we examined several problems we might face when this mechanism is used incorrectly.

As usual, all the examples used in this article are available over on GitHub.
