1. Overview
When managing key-value pairs in a Java application, we often find ourselves considering two main options: Hashtable and ConcurrentHashMap.
While both collections are thread-safe, their underlying architectures and capabilities differ significantly. Whether we’re maintaining a legacy system or building modern, microservices-based cloud applications, understanding these differences is critical for making the right choice.
In this tutorial, we’ll dissect the differences between Hashtable and ConcurrentHashMap, delving into their performance metrics, synchronization features, and various other aspects to help us make an informed decision.
2. Hashtable
Hashtable is one of the oldest collection classes in Java and has been present since JDK 1.0. It provides key-value storage and retrieval APIs:
Hashtable<String, String> hashtable = new Hashtable<>();
hashtable.put("Key1", "1");
hashtable.put("Key2", "2");
hashtable.putIfAbsent("Key3", "3");
String value = hashtable.get("Key2");
The primary selling point of Hashtable is thread safety, which is achieved through method-level synchronization.
Methods like put(), putIfAbsent(), get(), and remove() are synchronized on the Hashtable instance itself. As a result, only one thread at a time can execute any of these methods on a given Hashtable, which keeps individual operations consistent but also serializes all access to the table.
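To make the effect of method-level synchronization concrete, here is a minimal sketch. The class HashtableSyncSketch and its method name are hypothetical and aren't part of the article's code. Two threads write disjoint keys into one shared Hashtable; because every put() takes the table's lock, no update is lost:

```java
import java.util.Hashtable;

// Hypothetical demo class, introduced only for illustration.
class HashtableSyncSketch {

    // Two writer threads insert disjoint key ranges into one shared Hashtable
    // and the method returns the final number of entries.
    static int fillFromTwoThreads(int perThread) {
        Hashtable<String, Integer> table = new Hashtable<>();

        Thread first = new Thread(() -> {
            for (int i = 0; i < perThread; i++) {
                table.put("a" + i, i); // put() is synchronized on the table
            }
        });
        Thread second = new Thread(() -> {
            for (int i = 0; i < perThread; i++) {
                table.put("b" + i, i);
            }
        });

        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return table.size();
    }

    public static void main(String[] args) {
        System.out.println(fillFromTwoThreads(1_000)); // prints 2000
    }
}
```

The same lock also means the two writers never run put() at the same moment, which is the cost discussed in the following sections.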
3. ConcurrentHashMap
ConcurrentHashMap is a more modern alternative, introduced in Java 5 as part of the java.util.concurrent package.
Both Hashtable and ConcurrentHashMap implement the Map interface, which accounts for the similarity in method signatures:
ConcurrentHashMap<String, String> concurrentHashMap = new ConcurrentHashMap<>();
concurrentHashMap.put("Key1", "1");
concurrentHashMap.put("Key2", "2");
concurrentHashMap.putIfAbsent("Key3", "3");
String value = concurrentHashMap.get("Key2");
4. Differences
In this section, we’ll examine key aspects that set Hashtable and ConcurrentHashMap apart, including concurrency, performance, and memory usage.
4.1. Concurrency
As we discussed earlier, Hashtable achieves thread safety through method-level synchronization.
ConcurrentHashMap, on the other hand, provides thread safety with a higher level of concurrency. Reads generally proceed without locking, and a write locks only a small part of the table rather than the entire data structure, so multiple threads can read and write simultaneously as long as they don’t update the same part of the table. This is especially useful in applications that have more read operations than write operations.
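As an illustration of concurrent updates, the sketch below has several threads increment one shared counter with ConcurrentHashMap.merge(), which performs the read-modify-write atomically. The class name HitCounterSketch, the "page" key, and the thread counts are assumptions made up for this example:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, introduced only for illustration.
class HitCounterSketch {

    // Each worker increments the same counter hitsPerThread times.
    static int countHits(int threads, int hitsPerThread) {
        ConcurrentHashMap<String, Integer> hits = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);

        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < hitsPerThread; i++) {
                    // merge() applies the remapping function atomically per key
                    hits.merge("page", 1, Integer::sum);
                }
            });
        }

        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return hits.getOrDefault("page", 0);
    }

    public static void main(String[] args) {
        System.out.println(countHits(4, 1_000)); // prints 4000
    }
}
```

Writing the update as a separate get() followed by put() would not be atomic on either map; merge() is what keeps the count exact here.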
4.2. Performance
While both Hashtable and ConcurrentHashMap guarantee thread safety, they differ in performance due to their underlying synchronization mechanisms.
Hashtable locks the entire table for every synchronized operation, reads included, so a thread that is in the middle of a put() or get() blocks all other readers and writers. This can become a bottleneck in a high-concurrency environment.
ConcurrentHashMap, however, allows concurrent reads and limited concurrent writes, making it more scalable and often faster in practice.
Differences in performance numbers may not be noticeable for small datasets. However, ConcurrentHashMap often shows its strength with larger datasets and higher levels of concurrency.
To compare the two in practice, let’s run benchmark tests using JMH (the Java Microbenchmark Harness). The benchmark uses 10 threads to simulate concurrent activity and performs three warm-up iterations followed by five measurement iterations. It reports the average execution time of each benchmark method:
@Benchmark
@Group("hashtable")
public void benchmarkHashtablePut() {
    for (int i = 0; i < 10000; i++) {
        hashTable.put(String.valueOf(i), i);
    }
}

@Benchmark
@Group("hashtable")
public void benchmarkHashtableGet(Blackhole blackhole) {
    for (int i = 0; i < 10000; i++) {
        Integer value = hashTable.get(String.valueOf(i));
        blackhole.consume(value);
    }
}

@Benchmark
@Group("concurrentHashMap")
public void benchmarkConcurrentHashMapPut() {
    for (int i = 0; i < 10000; i++) {
        concurrentHashMap.put(String.valueOf(i), i);
    }
}

@Benchmark
@Group("concurrentHashMap")
public void benchmarkConcurrentHashMapGet(Blackhole blackhole) {
    for (int i = 0; i < 10000; i++) {
        Integer value = concurrentHashMap.get(String.valueOf(i));
        blackhole.consume(value);
    }
}
Here are the test results:
Benchmark                                                         Mode  Cnt   Score    Error
BenchMarkRunner.concurrentHashMap                                 avgt    5   1.788 ±  0.406
BenchMarkRunner.concurrentHashMap:benchmarkConcurrentHashMapGet   avgt    5   1.157 ±  0.185
BenchMarkRunner.concurrentHashMap:benchmarkConcurrentHashMapPut   avgt    5   2.419 ±  0.629
BenchMarkRunner.hashtable                                         avgt    5  10.744 ±  0.873
BenchMarkRunner.hashtable:benchmarkHashtableGet                   avgt    5  10.810 ±  1.208
BenchMarkRunner.hashtable:benchmarkHashtablePut                   avgt    5  10.677 ±  0.541
The scores are the average execution times of each benchmark method, so lower scores indicate better performance. In this run, ConcurrentHashMap was roughly nine times faster than Hashtable for get() (1.157 vs. 10.810) and about four times faster for put() (2.419 vs. 10.677).
4.3. Hashtable Iterators
Hashtable iterators are “fail-fast”: if the Hashtable is structurally modified after an iterator has been created, in any way other than through the iterator’s own remove() method, the iterator throws a ConcurrentModificationException. This mechanism helps prevent unpredictable behavior by failing quickly when a concurrent modification is detected. Note that it is a best-effort check intended for detecting bugs, not a guarantee, and that the Enumerations returned by the legacy keys() and elements() methods are not fail-fast.
In the example below, we have a Hashtable containing three key-value pairs, and we initiate two threads:
- iteratorThread: iterates through the Hashtable keys, pausing for 100 milliseconds after each one
- modifierThread: waits for 50 milliseconds and then adds a new key-value pair to the Hashtable
When modifierThread adds a new key-value pair to the Hashtable, iteratorThread throws a ConcurrentModificationException, indicating that the Hashtable structure was modified while the iteration was in progress:
Hashtable<String, Integer> hashtable = new Hashtable<>();
hashtable.put("Key1", 1);
hashtable.put("Key2", 2);
hashtable.put("Key3", 3);
AtomicBoolean exceptionCaught = new AtomicBoolean(false);

Thread iteratorThread = new Thread(() -> {
    Iterator<String> it = hashtable.keySet().iterator();
    try {
        while (it.hasNext()) {
            it.next();
            Thread.sleep(100);
        }
    } catch (ConcurrentModificationException e) {
        exceptionCaught.set(true);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
});

Thread modifierThread = new Thread(() -> {
    try {
        Thread.sleep(50);
        hashtable.put("Key4", 4);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
});

iteratorThread.start();
modifierThread.start();

iteratorThread.join();
modifierThread.join();

assertTrue(exceptionCaught.get());
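The fail-fast check doesn't actually require a second thread. As a minimal single-threaded sketch (the class FailFastSketch is a hypothetical name used only here), adding a new key between two calls to next() is enough to trigger the exception, which removes the dependence on sleep timings:

```java
import java.util.ConcurrentModificationException;
import java.util.Hashtable;
import java.util.Iterator;

// Hypothetical demo class, introduced only for illustration.
class FailFastSketch {

    // Returns true if the iterator detects the structural modification.
    static boolean failsFast() {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("Key1", 1);
        table.put("Key2", 2);
        table.put("Key3", 3);

        Iterator<String> it = table.keySet().iterator();
        it.next();             // first element is returned normally
        table.put("Key4", 4);  // new key: a structural modification

        try {
            it.next();         // the iterator notices the changed modification count
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsFast()); // prints true
    }
}
```

Overwriting the value of an existing key would not count as a structural modification, which is why the sketch inserts a key that wasn't there before.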
4.4. ConcurrentHashMap Iterators
In contrast to Hashtable, which uses “fail-fast” iterators, ConcurrentHashMap employs “weakly consistent” iterators.
These iterators can withstand concurrent modifications to the original map, reflecting the state of the map at the time the iterator was created. They might also reflect further changes but aren’t guaranteed to do so. Therefore, we can modify ConcurrentHashMap in one thread while iterating over it in another without getting a ConcurrentModificationException.
The example below demonstrates the weakly consistent nature of iterators in ConcurrentHashMap:
- iteratorThread: iterates through the ConcurrentHashMap keys, pausing for 100 milliseconds after each one
- modifierThread: waits for 50 milliseconds and then adds a new key-value pair to the ConcurrentHashMap
Unlike Hashtable “fail-fast” iterators, the weakly consistent iterator here doesn’t throw a ConcurrentModificationException. The iterator in iteratorThread continues without any issues, showcasing how ConcurrentHashMap is designed for high-concurrency scenarios:
ConcurrentHashMap<String, Integer> concurrentHashMap = new ConcurrentHashMap<>();
concurrentHashMap.put("Key1", 1);
concurrentHashMap.put("Key2", 2);
concurrentHashMap.put("Key3", 3);
AtomicBoolean exceptionCaught = new AtomicBoolean(false);

Thread iteratorThread = new Thread(() -> {
    Iterator<String> it = concurrentHashMap.keySet().iterator();
    try {
        while (it.hasNext()) {
            it.next();
            Thread.sleep(100);
        }
    } catch (ConcurrentModificationException e) {
        exceptionCaught.set(true);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
});

Thread modifierThread = new Thread(() -> {
    try {
        Thread.sleep(50);
        concurrentHashMap.put("Key4", 4);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
});

iteratorThread.start();
modifierThread.start();

iteratorThread.join();
modifierThread.join();

assertFalse(exceptionCaught.get());
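The same behavior can be seen in a single thread. The sketch below (WeaklyConsistentSketch is a hypothetical name, not from the article's code) adds a key while iterating and counts how many keys the iterator visits. The three original keys are always visited; whether the new key shows up is not guaranteed, so the result may be 3 or 4:

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class, introduced only for illustration.
class WeaklyConsistentSketch {

    // Modifies the map during iteration and returns the number of keys visited.
    static int iterateWhileAdding() {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("Key1", 1);
        map.put("Key2", 2);
        map.put("Key3", 3);

        int seen = 0;
        for (String key : map.keySet()) {
            map.put("Key4", 4); // no ConcurrentModificationException is thrown
            seen++;
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("keys visited: " + iterateWhileAdding());
    }
}
```

Code that needs an exact, point-in-time view of the entries should copy them first, for example into a new HashMap, rather than rely on what a weakly consistent iterator happens to observe.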
4.5. Memory
Hashtable uses a simple data structure, essentially an array of linked lists. Each bucket in this array holds a chain of entries whose keys hash to that bucket, so the main overhead is the array itself plus one node per entry. Apart from a few bookkeeping fields such as the entry count, resize threshold, and load factor, there are no additional internal data structures for managing concurrency, because all synchronization happens on the Hashtable object itself. Thus Hashtable tends to have a slightly smaller memory footprint.
ConcurrentHashMap is more complex. In Java 7 and earlier, it consists of an array of segments, each of which is essentially a separate, independently locked hash table. This allows operations on different segments to run concurrently, but the segment objects consume additional memory.
Each segment maintains its own bookkeeping, such as count, threshold, and load factor, which increases the memory footprint. The number of segments is fixed when the map is created, and each segment resizes its own table as it fills up. Since Java 8, the segments have been replaced by a single array of bins with finer-grained, per-bin synchronization. That design still carries some extra state, for example tree-based bins for heavily collided buckets and striped counters for tracking the size, so ConcurrentHashMap can still use somewhat more memory than Hashtable.
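Both classes expose their sizing parameters through constructors, which is one place where this memory trade-off can be influenced. The following is only a sketch: the class SizingSketch, its method names, and the 0.75 load factor are assumptions chosen for illustration. Hashtable accepts an initial capacity and a load factor, while ConcurrentHashMap additionally accepts a concurrencyLevel, which recent JDK versions treat as a sizing hint rather than a fixed segment count:

```java
import java.util.Hashtable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class, introduced only for illustration.
class SizingSketch {

    // Pre-sizes a Hashtable so expectedEntries fit without an early rehash.
    static Map<String, Integer> presizedHashtable(int expectedEntries) {
        int capacity = (int) (expectedEntries / 0.75f) + 1;
        return new Hashtable<>(capacity, 0.75f);
    }

    // Pre-sizes a ConcurrentHashMap; writerThreads must be greater than zero.
    static Map<String, Integer> presizedConcurrentMap(int expectedEntries, int writerThreads) {
        return new ConcurrentHashMap<>(expectedEntries, 0.75f, writerThreads);
    }

    public static void main(String[] args) {
        Map<String, Integer> table = presizedHashtable(1_000);
        Map<String, Integer> map = presizedConcurrentMap(1_000, 8);
        table.put("Key1", 1);
        map.put("Key1", 1);
        System.out.println(table.size() + " " + map.size()); // prints 1 1
    }
}
```

Pre-sizing avoids intermediate resizes when the approximate number of entries is known in advance; it doesn't change the per-entry overhead of either class.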
5. Conclusion
In this article, we learned the differences between Hashtable and ConcurrentHashMap.
Both Hashtable and ConcurrentHashMap serve the purpose of storing key-value pairs in a thread-safe manner. However, we saw that ConcurrentHashMap usually has the upper hand in terms of performance and scalability because its finer-grained synchronization lets multiple threads work on the map at the same time.
Hashtable is still useful and might be preferable in legacy systems or scenarios where method-level synchronization is explicitly required. Understanding the specific needs of our application can help us make a more informed decision between these two.
As always, the source for the examples is available over on GitHub.