1. Introduction
In this tutorial, we’ll look at several ways to increment a numerical value associated with a key in a Map. The Map interface, part of the Collections framework in Java, represents a collection of key-value pairs. Some common Map implementations include HashMap, TreeMap, and LinkedHashMap.
2. Problem Statement
Let’s see an example where we have a sentence as a String input and store the frequencies of each character that appears in the sentence in a Map. Here’s an example of the problem and the output:
sample sentence:
"the quick brown fox jumps over the lazy dog"
character frequency:
t: 2 times
h: 2 times
e: 3 times
q: 1 time
u: 2 times
...... and so on
The solution will involve storing the character frequencies in a Map where the key is a character, and the value is the total number of times the character appears in the given String. As there can be repeated characters in a String, we’ll need to update the value associated with a key, or rather increment the current value of a key many times.
3. Solutions
3.1. Using containsKey()
The simplest solution to our problem is to walk through the String and, at each step, check whether the character already exists in the map using the containsKey() method. If the key exists, we increment its value by 1; otherwise, we put a new entry in the map with the character as the key and 1 as the value:
public Map<Character, Integer> charFrequencyUsingContainsKey(String sentence) {
    Map<Character, Integer> charMap = new HashMap<>();
    for (int c = 0; c < sentence.length(); c++) {
        // Start from 0 and pick up the existing count only if the key is present
        int count = 0;
        if (charMap.containsKey(sentence.charAt(c))) {
            count = charMap.get(sentence.charAt(c));
        }
        charMap.put(sentence.charAt(c), count + 1);
    }
    return charMap;
}
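To connect the method back to the sample sentence from the problem statement, here is a minimal standalone sketch. The class name CharFrequencyDemo is hypothetical, and the method is made static only so that the demo can run on its own. One detail worth noticing is that the space character is counted as a key too:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; the counting logic mirrors the containsKey() approach above.
class CharFrequencyDemo {

    static Map<Character, Integer> charFrequencyUsingContainsKey(String sentence) {
        Map<Character, Integer> charMap = new HashMap<>();
        for (int c = 0; c < sentence.length(); c++) {
            char ch = sentence.charAt(c);
            int count = 0;
            if (charMap.containsKey(ch)) {
                count = charMap.get(ch);
            }
            charMap.put(ch, count + 1);
        }
        return charMap;
    }

    public static void main(String[] args) {
        Map<Character, Integer> freq =
            charFrequencyUsingContainsKey("the quick brown fox jumps over the lazy dog");
        System.out.println("t: " + freq.get('t'));           // t: 2
        System.out.println("e: " + freq.get('e'));           // e: 3
        System.out.println("spaces: " + freq.get(' '));      // spaces: 8
        System.out.println("distinct keys: " + freq.size()); // distinct keys: 27
    }
}
```

The 27 distinct keys are the 26 letters of the pangram plus the space character. If whitespace shouldn't be counted, the loop would need to skip it explicitly.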
3.2. Using getOrDefault()
Java 8 introduced the getOrDefault() method in Map as a simple way to retrieve the value associated with a key in a Map or a default preset value if the key doesn’t exist:
V getOrDefault(Object key, V defaultValue);
We’ll use this method to fetch the value associated with the current character of the String (our key) and increment the value by 1. This is a simpler and less verbose alternative to containsKey():
public Map<Character, Integer> charFrequencyUsingGetOrDefault(String sentence) {
    Map<Character, Integer> charMap = new HashMap<>();
    for (int c = 0; c < sentence.length(); c++) {
        // A missing key is treated as a count of 0
        charMap.put(sentence.charAt(c),
          charMap.getOrDefault(sentence.charAt(c), 0) + 1);
    }
    return charMap;
}
3.3. Using merge()
The merge() method is provided by Java 8 as a way to override the values associated with a specific key with an updated value in a Map. The method takes in a key, a value, and a remapping function, which is used to compute the new updated value that will replace the existing value in the Map:
default V merge(K key, V value, BiFunction<? super V,? super V,? extends V> remappingFunction)
The remapping function is a BiFunction, a functional interface introduced in Java 8, so we can pass it inline as a lambda expression or a method reference. The merge() method invokes this function when it needs to combine the existing value with the one we pass in.
The value argument plays two roles. If the key is absent (or mapped to null), merge() stores that value as is, without calling the remapping function. Otherwise, it calls the remapping function with the existing value and the given value and stores the result. For counting, we pass 1 as the value and write the remapping function as the sum of the existing value and 1:
(a, b) -> a + b
We can also express the same function with the method reference Integer::sum:
public Map<Character, Integer> charFrequencyUsingMerge(String sentence) {
    Map<Character, Integer> charMap = new HashMap<>();
    for (int c = 0; c < sentence.length(); c++) {
        // Stores 1 for a new key, otherwise stores (existing value + 1)
        charMap.merge(sentence.charAt(c), 1, Integer::sum);
    }
    return charMap;
}
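To see how merge() treats a missing key versus an existing one, the following minimal sketch prints the value that merge() returns on successive calls. The class MergeDemo and its increment() helper are hypothetical names used only for this illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class showing what merge() returns for absent and present keys.
class MergeDemo {

    // Increments the counter for the key and returns the new count reported by merge().
    static Integer increment(Map<Character, Integer> map, char key) {
        return map.merge(key, 1, Integer::sum);
    }

    public static void main(String[] args) {
        Map<Character, Integer> charMap = new HashMap<>();

        // Key absent: the given value (1) is stored directly; Integer::sum is not called.
        System.out.println(increment(charMap, 'a')); // 1

        // Key present: Integer::sum(existingValue, 1) is stored.
        System.out.println(increment(charMap, 'a')); // 2

        // A different key starts again from the given value.
        System.out.println(increment(charMap, 'b')); // 1
    }
}
```

Because merge() returns the new value, a caller that needs the updated count doesn't have to follow up with a separate get().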
3.4. Using compute()
The compute() method differs from the merge() method in that it doesn’t take a value argument. Instead, the remapping function receives the key and its current value, which is null when the key is missing, and decides the new value. If the function returns null, the mapping is removed (or stays absent):
V compute(K key, BiFunction<? super K, ? super V, ? extends V> remappingFunction)
We’ll need to handle the case where the key is missing, in which case the value passed to the function is null, and set the value to the default of 1:
public Map<Character, Integer> charFrequencyUsingCompute(String sentence) {
    Map<Character, Integer> charMap = new HashMap<>();
    for (int c = 0; c < sentence.length(); c++) {
        // value is null for a key that is not in the map yet
        charMap.compute(sentence.charAt(c), (key, value) -> (value == null) ? 1 : value + 1);
    }
    return charMap;
}
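The null handling in compute() works in both directions: the function sees null for a missing key, and returning null removes the entry. The sketch below illustrates both, using a hypothetical ComputeDemo class. The decrement() helper is an assumption added purely to show the removal behavior and isn't part of the character-counting solution:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class illustrating how compute() handles null values.
class ComputeDemo {

    // A missing key arrives as value == null, so we start the count at 1.
    static Integer increment(Map<Character, Integer> map, char key) {
        return map.compute(key, (k, value) -> (value == null) ? 1 : value + 1);
    }

    // Returning null from the remapping function removes the entry from the map.
    static Integer decrement(Map<Character, Integer> map, char key) {
        return map.compute(key, (k, value) -> (value == null || value <= 1) ? null : value - 1);
    }

    public static void main(String[] args) {
        Map<Character, Integer> charMap = new HashMap<>();
        System.out.println(increment(charMap, 'x'));   // 1
        System.out.println(increment(charMap, 'x'));   // 2
        System.out.println(decrement(charMap, 'x'));   // 1
        System.out.println(decrement(charMap, 'x'));   // null
        System.out.println(charMap.containsKey('x'));  // false
    }
}
```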
3.5. Using incrementAndGet() of AtomicInteger
We can use the AtomicInteger type to store the frequencies of the characters instead of the regular Integer wrapper class. An AtomicInteger is a mutable counter, so we can update it in place with its incrementAndGet() method rather than putting a new Integer into the map. This method performs an atomic increment on the value, equivalent to an atomic ++i. Note that only the increment of an individual counter is atomic; the HashMap holding the counters is still not thread-safe.
Additionally, we should make sure that the key exists in the map before incrementing, using the putIfAbsent() method of Map:
public Map<Character, AtomicInteger> charFrequencyWithGetAndIncrement(String sentence) {
    Map<Character, AtomicInteger> charMap = new HashMap<>();
    for (int c = 0; c < sentence.length(); c++) {
        // Insert a zeroed counter only if the key is missing, then bump it in place
        charMap.putIfAbsent(sentence.charAt(c), new AtomicInteger(0));
        charMap.get(sentence.charAt(c)).incrementAndGet();
    }
    return charMap;
}
Notice that the return type of this method uses AtomicInteger.
Furthermore, we can rewrite the same code in a slightly more compact way by using computeIfAbsent() and passing it a mapping function that creates the counter only when the key is missing:
public Map<Character, AtomicInteger> charFrequencyWithGetAndIncrementComputeIfAbsent(String sentence) {
    Map<Character, AtomicInteger> charMap = new HashMap<>();
    for (int c = 0; c < sentence.length(); c++) {
        // computeIfAbsent() returns the existing or newly created counter
        charMap.computeIfAbsent(sentence.charAt(c), k -> new AtomicInteger(0)).incrementAndGet();
    }
    return charMap;
}
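Since these variants return AtomicInteger values, callers that expect plain Integer counts need a conversion step. The following is a minimal sketch of one way to do that. The class AtomicCounterDemo and the helper toIntegerMap() are hypothetical names introduced for this illustration:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class: counts with AtomicInteger, then copies into an Integer map.
class AtomicCounterDemo {

    static Map<Character, AtomicInteger> count(String sentence) {
        Map<Character, AtomicInteger> charMap = new HashMap<>();
        for (int c = 0; c < sentence.length(); c++) {
            charMap.computeIfAbsent(sentence.charAt(c), k -> new AtomicInteger(0)).incrementAndGet();
        }
        return charMap;
    }

    // Takes a snapshot of each counter's current value.
    static Map<Character, Integer> toIntegerMap(Map<Character, AtomicInteger> counters) {
        Map<Character, Integer> result = new HashMap<>();
        counters.forEach((key, counter) -> result.put(key, counter.get()));
        return result;
    }

    public static void main(String[] args) {
        Map<Character, Integer> freq = toIntegerMap(count("hello"));
        System.out.println(freq.get('l')); // 2
        System.out.println(freq.get('h')); // 1
    }
}
```

The copy is a snapshot: later increments on the AtomicInteger counters aren't reflected in the returned Integer map.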
3.6. Using Guava
We can use the Guava library to solve the problem as well. To use Guava, we should first add the library as a Maven dependency:
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>32.1.3-jre</version>
</dependency>
Guava provides an AtomicLongMap class with a built-in getAndIncrement() method that fits our use case:
public Map<Character, Long> charFrequencyUsingAtomicMap(String sentence) {
    AtomicLongMap<Character> map = AtomicLongMap.create();
    for (int c = 0; c < sentence.length(); c++) {
        // A missing key is treated as 0 before the increment
        map.getAndIncrement(sentence.charAt(c));
    }
    return map.asMap();
}
4. Benchmarking the Approaches
JMH is a tool at our disposal to benchmark the different approaches mentioned above. To get started, we include the relevant dependencies:
<dependency>
    <groupId>org.openjdk.jmh</groupId>
    <artifactId>jmh-core</artifactId>
    <version>1.37</version>
</dependency>
<dependency>
    <groupId>org.openjdk.jmh</groupId>
    <artifactId>jmh-generator-annprocess</artifactId>
    <version>1.37</version>
</dependency>
The latest versions of the JMH Core and JMH Annotation Processor can be found in Maven Central.
We wrap each approach in a method annotated with @Benchmark and configure it with a few additional annotations:
@Benchmark
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Fork(value = 1, warmups = 1)
public void benchContainsKeyMap() {
    IncrementMapValueWays im = new IncrementMapValueWays();
    im.charFrequencyUsingContainsKey(getString());
}

@Benchmark
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Fork(value = 1, warmups = 1)
public void benchMarkComputeMethod() {
    IncrementMapValueWays im = new IncrementMapValueWays();
    im.charFrequencyUsingCompute(getString());
}

// and so on
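After running the benchmark tests, we see that the compute() and merge() methods had the lowest average times of the approaches shown (in Mode.AverageTime, a lower score is better). The containsKey() result carries a large error margin, so its gap to compute() and merge() isn’t conclusive, while the Guava AtomicLongMap was clearly slower: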
Benchmark                                      Mode  Cnt       Score       Error  Units
BenchmarkMapMethodsJMH.benchContainsKeyMap     avgt    5   50697.511 ± 25054.056  ns/op
BenchmarkMapMethodsJMH.benchMarkComputeMethod  avgt    5   45124.359 ±   377.541  ns/op
BenchmarkMapMethodsJMH.benchMarkGuavaMap       avgt    5  121372.968 ±   853.782  ns/op
BenchmarkMapMethodsJMH.benchMarkMergeMethod    avgt    5   46185.990 ±  5446.775  ns/op
We should also note that these results vary between runs, and the differences may be unnoticeable when the number of distinct keys is small. As we’re considering only English letters and some special characters in our example, the number of keys is capped at a few hundred. Performance could become a bigger concern in scenarios where the number of keys is large.
5. Multithreading Considerations
The approaches we discussed above are not thread-safe. If multiple threads were reading the input String and updating a shared HashMap, concurrent increments could interfere with each other and updates could be lost. This would lead to inconsistent counts in our final result.
One of the ways we can solve this is by using a ConcurrentHashMap, a thread-safe alternative to our regular HashMap. A ConcurrentHashMap provides support for concurrent operations and does not need external synchronization.
Map<Character, Integer> charMap = new ConcurrentHashMap<>();
In our solution, we should also make sure that the increment itself is atomic. A separate get() followed by a put() is a two-step update, and another thread can modify the entry in between, even on a ConcurrentHashMap. To avoid this, we should use methods that ConcurrentHashMap performs atomically, such as compute() and merge().
Let’s write a test case to validate our assertion. We’ll create two threads with a shared instance of a concurrent hashmap. Each thread takes a portion of the String input and performs the same operation:
@Test
public void givenString_whenUsingConcurrentMapCompute_thenReturnFreqMap() throws InterruptedException {
    Map<Character, Integer> charMap = new ConcurrentHashMap<>();

    // Each thread counts its own portion of the sentence into the shared map
    Thread thread1 = new Thread(() -> {
        IncrementMapValueWays ic = new IncrementMapValueWays();
        ic.charFrequencyWithConcurrentMap("the quick brown", charMap);
    });
    Thread thread2 = new Thread(() -> {
        IncrementMapValueWays ic = new IncrementMapValueWays();
        ic.charFrequencyWithConcurrentMap(" fox jumps over the lazy dog", charMap);
    });

    thread1.start();
    thread2.start();
    thread1.join();
    thread2.join();

    Map<Character, Integer> expectedMap = getExpectedMap();
    Assert.assertEquals(expectedMap, charMap);
}
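The test above calls charFrequencyWithConcurrentMap(), whose body isn’t listed in this article. A minimal sketch of what such a method might look like follows, assuming it simply applies merge() to the shared map. The class ConcurrentCountDemo and the countInTwoThreads() helper are hypothetical and exist only to make the example runnable on its own:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; the real helper used by the test may differ.
class ConcurrentCountDemo {

    // Assumed implementation: merge() is performed atomically by ConcurrentHashMap.
    static void charFrequencyWithConcurrentMap(String sentence, Map<Character, Integer> charMap) {
        for (int c = 0; c < sentence.length(); c++) {
            charMap.merge(sentence.charAt(c), 1, Integer::sum);
        }
    }

    // Counts two portions of a sentence in separate threads sharing one map.
    static Map<Character, Integer> countInTwoThreads(String first, String second) {
        Map<Character, Integer> charMap = new ConcurrentHashMap<>();
        Thread t1 = new Thread(() -> charFrequencyWithConcurrentMap(first, charMap));
        Thread t2 = new Thread(() -> charFrequencyWithConcurrentMap(second, charMap));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while counting", e);
        }
        return charMap;
    }

    public static void main(String[] args) {
        Map<Character, Integer> freq =
            countInTwoThreads("the quick brown", " fox jumps over the lazy dog");
        System.out.println(freq.get('o')); // 4
        System.out.println(freq.get(' ')); // 8
    }
}
```

Replacing merge() here with a put() of getOrDefault() + 1 would reintroduce the two-step update, so counts for characters that appear in both portions could be lost under contention.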
6. Conclusion
In this article, we looked at different ways to increment the value of a Map entry. We benchmarked the execution speeds of the approaches and looked at how we could write a thread-safe solution as well.