A Guide to ConcurrentMap – 并行地图指南

最后修改: 2017年 2月 2日

中文/混合/英文(键盘快捷键:t)

1. Overview

1.概述

Maps are naturally one of the most widely style of Java collection.

Maps自然是Java集合中最广泛的风格之一。

And, importantly, HashMap is not a thread-safe implementation, while Hashtable does provide thread-safety by synchronizing operations.

而且,重要的是,HashMap并不是一个线程安全的实现,而Hashtable确实通过同步操作提供了线程安全。

Even though Hashtable is thread safe, it is not very efficient. Another fully synchronized Map, Collections.synchronizedMap, does not exhibit great efficiency either. If we want thread-safety with high throughput under high concurrency, these implementations aren’t the way to go.

即使Hashtable是线程安全的,它的效率也不是很高。另一个完全同步的Map,Collections.synchronizedMap,也没有表现出很高的效率。如果我们想在高并发情况下获得高吞吐量的线程安全,那么这些实现并不是我们要走的路。

To solve the problem, the Java Collections Framework introduced ConcurrentMap in Java 1.5.

为了解决这个问题,Java集合框架 Java 1.5中引入了ConcurrentMap

The following discussions are based on Java 1.8.

下面的讨论是基于Java 1.8

2. ConcurrentMap

2.ConcurrentMap

ConcurrentMap is an extension of the Map interface. It aims to provides a structure and guidance to solving the problem of reconciling throughput with thread-safety.

ConcurrentMapMap接口的一个扩展。它的目的是为解决吞吐量与线程安全之间的协调问题提供一个结构和指导。

By overriding several interface default methods, ConcurrentMap gives guidelines for valid implementations to provide thread-safety and memory-consistent atomic operations.

通过重写几个接口的默认方法,ConcurrentMap为有效的实现提供了指导方针,以提供线程安全和内存一致的原子操作。

Several default implementations are overridden, disabling the null key/value support:

几个默认的实现被重写,禁用null键/值支持。

  • getOrDefault
  • forEach
  • replaceAll
  • computeIfAbsent
  • computeIfPresent
  • compute
  • merge

The following APIs are also overridden to support atomicity, without a default interface implementation:

以下APIs也被重载以支持原子性,没有默认的接口实现。

  • putIfAbsent
  • remove
  • replace(key, oldValue, newValue)
  • replace(key, value)

The rest of actions are directly inherited with basically consistent with Map.

其余的动作是直接继承的,与Map基本一致。

3. ConcurrentHashMap

3.ConcurrentHashMap

ConcurrentHashMap is the out-of-box ready ConcurrentMap implementation.

ConcurrentHashMap是开箱即用的ConcurrentMap实现。

For better performance, it consists of an array of nodes as table buckets (used to be table segments prior to Java 8) under the hood, and mainly uses CAS operations during updating.

为了获得更好的性能,它由一个节点数组作为表桶(在Java 8之前曾是表段)组成,在更新时主要使用CAS操作。

The table buckets are initialized lazily, upon the first insertion. Each bucket can be independently locked by locking the very first node in the bucket. Read operations do not block, and update contentions are minimized.

表桶在第一次插入时被懒散地初始化。每个表桶都可以通过锁定表桶中的第一个节点而被独立锁定。读取操作不会阻塞,更新争论也最小化。

The number of segments required is relative to the number of threads accessing the table so that the update in progress per segment would be no more than one most of time.

所需的段数是相对于访问表的线程数而言的,因此每个段的更新进度在大部分时间内不会超过一个。

Before Java 8, the number of “segments” required was relative to the number of threads accessing the table so that the update in progress per segment would be no more than one most of time.

Java 8之前,所需的 “段 “的数量与访问该表的线程数量相对应,因此每个段中的更新在大多数情况下不会超过一个。

That’s why constructors, compared to HashMap, provides the extra concurrencyLevel argument to control the number of estimated threads to use:

这就是为什么与HashMap相比,构造函数提供了额外的concurrencyLevel参数来控制要使用的估计线程数。

public ConcurrentHashMap(
public ConcurrentHashMap(
 int initialCapacity, float loadFactor, int concurrencyLevel)

The other two arguments: initialCapacity and loadFactor worked quite the same as HashMap.

其他两个参数。initialCapacityloadFactor的工作原理与HashMap相当。

However, since Java 8, the constructors are only present for backward compatibility: the parameters can only affect the initial size of the map.

然而,从Java 8开始,构造函数的出现只是为了向后兼容:参数只能影响地图的初始大小

3.1. Thread-Safety

3.1.线程安全

ConcurrentMap guarantees memory consistency on key/value operations in a multi-threading environment.

ConcurrentMap保证了多线程环境下键/值操作的内存一致性。

Actions in a thread prior to placing an object into a ConcurrentMap as a key or value happen-before actions subsequent to the access or removal of that object in another thread.

在一个线程中,将一个对象作为键或值放入ConcurrentMap之前的操作发生在另一个线程中访问或移除该对象之后的操作。

To confirm, let’s have a look at a memory inconsistent case:

为了证实这一点,让我们看看一个内存不一致的案例。

@Test
public void givenHashMap_whenSumParallel_thenError() throws Exception {
    Map<String, Integer> map = new HashMap<>();
    List<Integer> sumList = parallelSum100(map, 100);

    assertNotEquals(1, sumList
      .stream()
      .distinct()
      .count());
    long wrongResultCount = sumList
      .stream()
      .filter(num -> num != 100)
      .count();
    
    assertTrue(wrongResultCount > 0);
}

private List<Integer> parallelSum100(Map<String, Integer> map, 
  int executionTimes) throws InterruptedException {
    List<Integer> sumList = new ArrayList<>(1000);
    for (int i = 0; i < executionTimes; i++) {
        map.put("test", 0);
        ExecutorService executorService = 
          Executors.newFixedThreadPool(4);
        for (int j = 0; j < 10; j++) {
            executorService.execute(() -> {
                for (int k = 0; k < 10; k++)
                    map.computeIfPresent(
                      "test", 
                      (key, value) -> value + 1
                    );
            });
        }
        executorService.shutdown();
        executorService.awaitTermination(5, TimeUnit.SECONDS);
        sumList.add(map.get("test"));
    }
    return sumList;
}

For each map.computeIfPresent action in parallel, HashMap does not provide a consistent view of what should be the present integer value, leading to inconsistent and undesirable results.

对于每个map.computeIfPresent的并行操作,HashMap并没有提供一个一致的观点,即什么应该是现在的整数值,从而导致不一致和不理想的结果。

As for ConcurrentHashMap, we can get a consistent and correct result:

至于ConcurrentHashMap,我们可以得到一个一致且正确的结果。

@Test
public void givenConcurrentMap_whenSumParallel_thenCorrect() 
  throws Exception {
    Map<String, Integer> map = new ConcurrentHashMap<>();
    List<Integer> sumList = parallelSum100(map, 1000);

    assertEquals(1, sumList
      .stream()
      .distinct()
      .count());
    long wrongResultCount = sumList
      .stream()
      .filter(num -> num != 100)
      .count();
    
    assertEquals(0, wrongResultCount);
}

3.2. Null Key/Value

3.2.Null键/值

Most APIs provided by ConcurrentMap does not allow null key or value, for example:

例如,大多数由ConcurrentMap提供的API不允许null键或值。

@Test(expected = NullPointerException.class)
public void givenConcurrentHashMap_whenPutWithNullKey_thenThrowsNPE() {
    concurrentMap.put(null, new Object());
}

@Test(expected = NullPointerException.class)
public void givenConcurrentHashMap_whenPutNullValue_thenThrowsNPE() {
    concurrentMap.put("test", null);
}

However, for compute* and merge actions, the computed value can be null, which indicates the key-value mapping is removed if present or remains absent if previously absent.

然而,对于compute*merge操作,计算值可以是null,这表明如果存在键值映射,则将其删除,如果以前不存在,则保持不存在

@Test
public void givenKeyPresent_whenComputeRemappingNull_thenMappingRemoved() {
    Object oldValue = new Object();
    concurrentMap.put("test", oldValue);
    concurrentMap.compute("test", (s, o) -> null);

    assertNull(concurrentMap.get("test"));
}

3.3. Stream Support

3.3.流支持

Java 8 provides Stream support in the ConcurrentHashMap as well.

Java 8ConcurrentHashMap中也提供了Stream支持。

Unlike most stream methods, the bulk (sequential and parallel) operations allow concurrent modification safely. ConcurrentModificationException won’t be thrown, which also applies to its iterators. Relevant to streams, several forEach*, search, and reduce* methods are also added to support richer traversal and map-reduce operations.

与大多数流方法不同,批量(顺序和平行)操作允许安全地进行并发修改。ConcurrentModificationException不会被抛出,这也适用于其迭代器。与流相关的几个forEach*searchreduce*方法也被添加,以支持更丰富的遍历和map-reduce操作。

3.4. Performance

3.4.性能

Under the hood, ConcurrentHashMap is somewhat similar to HashMap, with data access and update based on a hash table (though more complex).

在引擎盖下,ConcurrentHashMapHashMap有些相似,数据访问和更新都是基于哈希表(尽管更复杂)。

And of course, the ConcurrentHashMap should yield much better performance in most concurrent cases for data retrieval and update.

当然,ConcurrentHashMap在大多数数据检索和更新的并发情况下应该产生更好的性能。

Let’s write a quick micro-benchmark for get and put performance and compare that to Hashtable and Collections.synchronizedMap, running both operations for 500,000 times in 4 threads.

让我们为getput的性能写一个快速的微观测试,并与HashtableCollections.synchronizedMap进行比较,在4个线程中运行这两个操作50万次。

@Test
public void givenMaps_whenGetPut500KTimes_thenConcurrentMapFaster() 
  throws Exception {
    Map<String, Object> hashtable = new Hashtable<>();
    Map<String, Object> synchronizedHashMap = 
      Collections.synchronizedMap(new HashMap<>());
    Map<String, Object> concurrentHashMap = new ConcurrentHashMap<>();

    long hashtableAvgRuntime = timeElapseForGetPut(hashtable);
    long syncHashMapAvgRuntime = 
      timeElapseForGetPut(synchronizedHashMap);
    long concurrentHashMapAvgRuntime = 
      timeElapseForGetPut(concurrentHashMap);

    assertTrue(hashtableAvgRuntime > concurrentHashMapAvgRuntime);
    assertTrue(syncHashMapAvgRuntime > concurrentHashMapAvgRuntime);
}

private long timeElapseForGetPut(Map<String, Object> map) 
  throws InterruptedException {
    ExecutorService executorService = 
      Executors.newFixedThreadPool(4);
    long startTime = System.nanoTime();
    for (int i = 0; i < 4; i++) {
        executorService.execute(() -> {
            for (int j = 0; j < 500_000; j++) {
                int value = ThreadLocalRandom
                  .current()
                  .nextInt(10000);
                String key = String.valueOf(value);
                map.put(key, value);
                map.get(key);
            }
        });
    }
    executorService.shutdown();
    executorService.awaitTermination(1, TimeUnit.MINUTES);
    return (System.nanoTime() - startTime) / 500_000;
}

Keep in mind micro-benchmarks are only looking at a single scenario and aren’t always a good reflection of real world performance.

请记住,微型基准测试只关注单一场景,并不总是对现实世界性能的良好反映。

That being said, on an OS X system with an average dev system, we’re seeing an average sample result for 100 consecutive runs (in nanoseconds):

也就是说,在一个具有平均开发系统的OS X系统上,我们看到的是连续100次运行的平均样本结果(以纳秒为单位)。

Hashtable: 1142.45
SynchronizedHashMap: 1273.89
ConcurrentHashMap: 230.2

In a multi-threading environment, where multiple threads are expected to access a common Map, the ConcurrentHashMap is clearly preferable.

在多线程环境中,当多个线程要访问一个共同的Map时,ConcurrentHashMap显然是更好的。

However, when the Map is only accessible to a single thread, HashMap can be a better choice for its simplicity and solid performance.

然而,当Map只能被单个线程访问时,HashMap因其简单性和稳固的性能而成为更好的选择。

3.5. Pitfalls

3.5.陷阱

Retrieval operations generally do not block in ConcurrentHashMap and could overlap with update operations. So for better performance, they only reflect the results of the most recently completed update operations, as stated in the official Javadoc.

检索操作在ConcurrentHashMap中一般不会阻塞,并且可能与更新操作重叠。因此,为了提高性能,它们只反映最近完成的更新操作的结果,正如官方Javadoc中所述。

There are several other facts to bear in mind:

还有其他几个事实需要牢记。

  • results of aggregate status methods including size, isEmpty, and containsValue are typically useful only when a map is not undergoing concurrent updates in other threads:
@Test
public void givenConcurrentMap_whenUpdatingAndGetSize_thenError() 
  throws InterruptedException {
    Runnable collectMapSizes = () -> {
        for (int i = 0; i < MAX_SIZE; i++) {
            mapSizes.add(concurrentMap.size());
        }
    };
    Runnable updateMapData = () -> {
        for (int i = 0; i < MAX_SIZE; i++) {
            concurrentMap.put(String.valueOf(i), i);
        }
    };
    executorService.execute(updateMapData);
    executorService.execute(collectMapSizes);
    executorService.shutdown();
    executorService.awaitTermination(1, TimeUnit.MINUTES);

    assertNotEquals(MAX_SIZE, mapSizes.get(MAX_SIZE - 1).intValue());
    assertEquals(MAX_SIZE, concurrentMap.size());
}

If concurrent updates are under strict control, aggregate status would still be reliable.

如果并发更新受到严格控制,聚合状态仍将是可靠的。

Although these aggregate status methods do not guarantee the real-time accuracy, they may be adequate for monitoring or estimation purposes.

尽管这些集合状态方法不能保证实时准确性,但对于监测或估计的目的来说,它们可能是足够的

Note that usage of size() of ConcurrentHashMap should be replaced by mappingCount(), for the latter method returns a long count, although deep down they are based on the same estimation.

请注意,ConcurrentHashMapsize()的使用应该被mappingCount()所取代,因为后者的方法会返回一个long计数,尽管在内心深处它们是基于相同的估计。

  • hashCode matters: note that using many keys with exactly the same hashCode() is a sure way to slow down a performance of any hash table.

To ameliorate impact when keys are Comparable, ConcurrentHashMap may use comparison order among keys to help break ties. Still, we should avoid using the same hashCode() as much as we can.

为了改善键值为可比较时的影响,ConcurrentHashMap可以使用键值之间的比较顺序来帮助打破联系。尽管如此,我们还是应该尽可能地避免使用相同的hashCode()

  • iterators are only designed to use in a single thread as they provide weak consistency rather than fast-fail traversal, and they will never throw ConcurrentModificationException.
  • the default initial table capacity is 16, and it’s adjusted by the specified concurrency level:
public ConcurrentHashMap(
  int initialCapacity, float loadFactor, int concurrencyLevel) {
 
    //...
    if (initialCapacity < concurrencyLevel) {
        initialCapacity = concurrencyLevel;
    }
    //...
}
  • caution on remapping functions: though we can do remapping operations with provided compute and merge* methods, we should keep them fast, short and simple, and focus on the current mapping to avoid unexpected blocking.
  • keys in ConcurrentHashMap are not in sorted order, so for cases when ordering is required, ConcurrentSkipListMap is a suitable choice.

4. ConcurrentNavigableMap

4. ConcurrentNavigableMap

For cases when ordering of keys is required, we can use ConcurrentSkipListMap, a concurrent version of TreeMap.

对于需要对键进行排序的情况,我们可以使用ConcurrentSkipListMap,一个TreeMap的并发版本。

As a supplement for ConcurrentMap, ConcurrentNavigableMap supports total ordering of its keys (in ascending order by default) and is concurrently navigable. Methods that return views of the map are overridden for concurrency compatibility:

作为ConcurrentMap的补充,ConcurrentNavigableMap支持其键的总排序(默认为升序)并可并发导航。返回地图视图的方法被重写以实现并发性。

  • subMap
  • headMap
  • tailMap
  • subMap
  • headMap
  • tailMap
  • descendingMap

keySet() views’ iterators and spliterators are enhanced with weak-memory-consistency:

keySet()视图的迭代器和拆分器得到了弱内存一致性的增强。

  • navigableKeySet
  • keySet
  • descendingKeySet

5. ConcurrentSkipListMap

5.ConcurrentSkipListMap

Previously, we have covered NavigableMap interface and its implementation TreeMap. ConcurrentSkipListMap can be seen a scalable concurrent version of TreeMap.

之前,我们已经介绍了NavigableMap接口及其实现TreeMapConcurrentSkipListMap可以看作是TreeMap的一个可扩展的并发版本。

In practice, there’s no concurrent implementation of the red-black tree in Java. A concurrent variant of SkipLists is implemented in ConcurrentSkipListMap, providing an expected average log(n) time cost for the containsKey, get, put and remove operations and their variants.

在实践中,Java中没有红黑树的并发实现。SkipLists的并发变体在ConcurrentSkipListMap中实现,为containsKeygetputremove操作及其变体提供了预期的平均log(n)时间成本。

In addition to TreeMap‘s features, key insertion, removal, update and access operations are guaranteed with thread-safety. Here’s a comparison to TreeMap when navigating concurrently:

除了TreeMap的特性外,键的插入、移除、更新和访问操作都有线程安全保证。下面是与TreeMap进行并发导航时的比较。

@Test
public void givenSkipListMap_whenNavConcurrently_thenCountCorrect() 
  throws InterruptedException {
    NavigableMap<Integer, Integer> skipListMap
      = new ConcurrentSkipListMap<>();
    int count = countMapElementByPollingFirstEntry(skipListMap, 10000, 4);
 
    assertEquals(10000 * 4, count);
}

@Test
public void givenTreeMap_whenNavConcurrently_thenCountError() 
  throws InterruptedException {
    NavigableMap<Integer, Integer> treeMap = new TreeMap<>();
    int count = countMapElementByPollingFirstEntry(treeMap, 10000, 4);
 
    assertNotEquals(10000 * 4, count);
}

private int countMapElementByPollingFirstEntry(
  NavigableMap<Integer, Integer> navigableMap, 
  int elementCount, 
  int concurrencyLevel) throws InterruptedException {
 
    for (int i = 0; i < elementCount * concurrencyLevel; i++) {
        navigableMap.put(i, i);
    }
    
    AtomicInteger counter = new AtomicInteger(0);
    ExecutorService executorService
      = Executors.newFixedThreadPool(concurrencyLevel);
    for (int j = 0; j < concurrencyLevel; j++) {
        executorService.execute(() -> {
            for (int i = 0; i < elementCount; i++) {
                if (navigableMap.pollFirstEntry() != null) {
                    counter.incrementAndGet();
                }
            }
        });
    }
    executorService.shutdown();
    executorService.awaitTermination(1, TimeUnit.MINUTES);
    return counter.get();
}

A full explanation of the performance concerns behind the scenes is beyond the scope of this article. The details can be found in ConcurrentSkipListMap’s Javadoc, which is located under java/util/concurrent in the src.zip file.

对幕后性能问题的全面解释超出了本文的范围。细节可以在ConcurrentSkipListMap的Javadoc中找到,它位于java/util/concurrent中的src.zip文件。

6. Conclusion

6.结论

In this article, we mainly introduced the ConcurrentMap interface and the features of ConcurrentHashMap and covered on ConcurrentNavigableMap being key-ordering required.

在这篇文章中,我们主要介绍了ConcurrentMap接口和ConcurrentHashMap的特点,并介绍了ConcurrentNavigableMap需要键排序。

The full source code for all the examples used in this article can be found in the GitHub project.

本文中使用的所有示例的完整源代码可以在GitHub项目中找到