1. Overview
1.概述
In this article, we’ll see how to use HashMap in Java, and we’ll look at how it works internally.
在这篇文章中,我们将看到如何在Java中使用HashMap,我们将看看它的内部工作原理。
A class very similar to HashMap is Hashtable. Please refer to a couple of our other articles to learn more about the java.util.Hashtable class itself and the differences between HashMap and Hashtable.
与HashMap非常相似的一个类是Hashtable。请参考我们的其他几篇文章,以了解更多关于java.util.Hashtable类本身以及HashMap和Hashtable之间的区别。
2. Basic Usage
2.基本使用方法
Let’s first look at what it means that HashMap is a map. A map is a key-value mapping, which means that every key is mapped to exactly one value and that we can use the key to retrieve the corresponding value from a map.
我们先来看看HashMap是一个地图是什么意思。地图是一种键值映射,这意味着每个键都精确地映射到一个值,我们可以使用键从地图中检索出相应的值。
One might ask why not simply add the value to a list. Why do we need a HashMap? The simple reason is performance. If we want to find a specific element in a list, the time complexity is O(n) and if the list is sorted, it will be O(log n) using, for example, a binary search.
有人可能会问,为什么不简单地将该值添加到一个列表中。为什么我们需要一个HashMap?简单的原因是性能。如果我们想在一个列表中找到一个特定的元素,时间复杂度是O(n),如果列表被排序,使用例如二进制搜索,时间复杂度是O(log n)。
The advantage of a HashMap is that the time complexity to insert and retrieve a value is O(1) on average. We’ll look at how that can be achieved later. Let’s first look at how to use HashMap.
HashMap的优势在于,插入和检索一个值的时间复杂度平均为O(1)。我们将在后面看一下如何实现这一目标。首先让我们看看如何使用HashMap。
2.1. Setup
2.1.设置
Let’s create a simple class that we’ll use throughout the article:
让我们创建一个简单的类,我们将在整个文章中使用。
public class Product {
private String name;
private String description;
private List<String> tags;
// standard getters/setters/constructors
public Product addTagsOfOtherProduct(Product product) {
this.tags.addAll(product.getTags());
return this;
}
}
2.2. Put
2.2 放
We can now create a HashMap with the key of type String and elements of type Product:
我们现在可以创建一个HashMap,键的类型为String,元素的类型为Product。
Map<String, Product> productsByName = new HashMap<>();
And add products to our HashMap:
并将产品添加到我们的HashMap。
Product eBike = new Product("E-Bike", "A bike with a battery");
Product roadBike = new Product("Road bike", "A bike for competition");
productsByName.put(eBike.getName(), eBike);
productsByName.put(roadBike.getName(), roadBike);
2.3. Get
2.3. 获得
We can retrieve a value from the map by its key:
我们可以通过键从地图中检索一个值。
Product nextPurchase = productsByName.get("E-Bike");
assertEquals("A bike with a battery", nextPurchase.getDescription());
If we try to find a value for a key that doesn’t exist in the map, we’ll get a null value:
如果我们试图为一个不存在于地图中的键找到一个值,我们会得到一个null值。
Product nextPurchase = productsByName.get("Car");
assertNull(nextPurchase);
And if we insert a second value with the same key, we’ll only get the last inserted value for that key:
如果我们插入第二个相同键的值,我们将只得到该键的最后一个插入值。
Product newEBike = new Product("E-Bike", "A bike with a better battery");
productsByName.put(newEBike.getName(), newEBike);
assertEquals("A bike with a better battery", productsByName.get("E-Bike").getDescription());
2.4. Null as the Key
2.4.以空为键
HashMap also allows us to have null as a key:
HashMap也允许我们将null作为一个键。
Product defaultProduct = new Product("Chocolate", "At least buy chocolate");
productsByName.put(null, defaultProduct);
Product nextPurchase = productsByName.get(null);
assertEquals("At least buy chocolate", nextPurchase.getDescription());
2.5. Values with the Same Key
2.5.具有相同键的值
Furthermore, we can insert the same object twice with a different key:
此外,我们可以用不同的键插入同一个对象两次。
productsByName.put(defaultProduct.getName(), defaultProduct);
assertSame(productsByName.get(null), productsByName.get("Chocolate"));
2.6. Remove a Value
2.6.删除一个值
We can remove a key-value mapping from the HashMap:
我们可以从HashMap中删除一个键值映射。
productsByName.remove("E-Bike");
assertNull(productsByName.get("E-Bike"));
2.7. Check If a Key or Value Exists in the Map
2.7.检查一个键或值是否存在于地图中
To check if a key is present in the map, we can use the containsKey() method:
为了检查一个键是否存在于地图中,我们可以使用containsKey()方法。
productsByName.containsKey("E-Bike");
Or, to check if a value is present in the map, we can use the containsValue() method:
或者,为了检查一个值是否存在于地图中,我们可以使用containsValue()方法。
productsByName.containsValue(eBike);
Both method calls will return true in our example. Though they look very similar, there is an important difference in performance between these two method calls. The complexity to check if a key exists is O(1), while the complexity to check for an element is O(n), as it’s necessary to loop over all the elements in the map.
在我们的例子中,两个方法调用都将返回true。虽然它们看起来非常相似,但这两个方法调用在性能上有一个重要的区别。检查一个键是否存在的复杂度是O(1),而检查一个元素的复杂度是O(n),因为它需要在map中的所有元素上循环。
2.8. Iterating Over a HashMap
2.8.在HashMap上进行迭代
There are three basic ways to iterate over all key-value pairs in a HashMap.
有三种基本方法来遍历HashMap中的所有键值对。
We can iterate over the set of all keys:
我们可以在所有键的集合上进行迭代。
for(String key : productsByName.keySet()) {
Product product = productsByName.get(key);
}
Or we can iterate over the set of all entries:
或者我们可以在所有条目的集合上进行迭代。
for(Map.Entry<String, Product> entry : productsByName.entrySet()) {
Product product = entry.getValue();
String key = entry.getKey();
//do something with the key and value
}
Fiinally, we can iterate over all values:
最后,我们可以对所有的值进行迭代。
List<Product> products = new ArrayList<>(productsByName.values());
3. The Key
3.钥匙
We can use any class as the key in our HashMap. However, for the map to work properly, we need to provide an implementation for equals() and hashCode(). Let’s say we want to have a map with the product as the key and the price as the value:
我们可以使用任何类作为我们的HashMap中的键。然而,为了让地图正常工作,我们需要为equals()和hashCode()提供一个实现。假设我们想有一个以产品为键、以价格为值的地图。
HashMap<Product, Integer> priceByProduct = new HashMap<>();
priceByProduct.put(eBike, 900);
Let’s implement the equals() and hashCode() methods:
我们来实现equals()和hashCode()方法。
@Override
public boolean equals(Object o) {
if (this == o) {
return true;
}
if (o == null || getClass() != o.getClass()) {
return false;
}
Product product = (Product) o;
return Objects.equals(name, product.name) &&
Objects.equals(description, product.description);
}
@Override
public int hashCode() {
return Objects.hash(name, description);
}
Note that hashCode() and equals() need to be overridden only for classes that we want to use as map keys, not for classes that are only used as values in a map. We’ll see why this is necessary in section 5 of this article.
注意,hashCode()和equals()只需要对我们想用作地图键的类进行重写,而不是对只用作地图中的值的类进行重写。我们将在本文的第5节看到为什么要这样做。
4. Additional Methods as of Java 8
4.Java 8中的其他方法
Java 8 added several functional-style methods to HashMap. In this section, we’ll look at some of these methods.
Java 8为HashMap增加了几个函数式方法。在本节中,我们将看看其中的一些方法。
For each method, we’ll look at two examples. The first example shows how to use the new method, and the second example shows how to achieve the same in earlier versions of Java.
对于每个方法,我们将看两个例子。第一个例子显示了如何使用新方法,第二个例子显示了如何在早期版本的Java中实现同样的目的。
As these methods are quite straightforward, we won’t look at more detailed examples.
由于这些方法相当直接,我们不会看更详细的例子。
4.1. forEach()
4.1.forEach()
The forEach method is the functional-style way to iterate over all elements in the map:
forEach方法是函数式的方式,用于遍历地图中的所有元素。
productsByName.forEach( (key, product) -> {
System.out.println("Key: " + key + " Product:" + product.getDescription());
//do something with the key and value
});
Prior to Java 8:
在Java 8之前。
for(Map.Entry<String, Product> entry : productsByName.entrySet()) {
Product product = entry.getValue();
String key = entry.getKey();
//do something with the key and value
}
Our article Guide to the Java 8 forEach covers the forEach loop in greater detail.
我们的文章Guide to Java 8 forEach更详细地介绍了forEach循环。
4.2. getOrDefault()
4.2.getOrDefault()
Using the getOrDefault() method, we can get a value from the map or return a default element in case there is no mapping for the given key:
使用getOrDefault()方法,我们可以从映射中获得一个值,或者在没有给定键的映射时返回一个默认元素。
Product chocolate = new Product("chocolate", "something sweet");
Product defaultProduct = productsByName.getOrDefault("horse carriage", chocolate);
Product bike = productsByName.getOrDefault("E-Bike", chocolate);
Prior to Java 8:
在Java 8之前。
Product bike2 = productsByName.containsKey("E-Bike")
? productsByName.get("E-Bike")
: chocolate;
Product defaultProduct2 = productsByName.containsKey("horse carriage")
? productsByName.get("horse carriage")
: chocolate;
4.3. putIfAbsent()
4.3.putIfAbsent()
With this method, we can add a new mapping, but only if there is not yet a mapping for the given key:
通过这个方法,我们可以添加一个新的映射,但前提是还没有一个针对给定键的映射。
productsByName.putIfAbsent("E-Bike", chocolate);
Prior to Java 8:
在Java 8之前。
if(productsByName.containsKey("E-Bike")) {
productsByName.put("E-Bike", chocolate);
}
Our article Merging Two Maps with Java 8 takes a closer look at this method.
我们的文章用Java 8合并两个地图对这个方法进行了仔细的研究。
4.4. merge()
4.4.merge()
And with merge(), we can modify the value for a given key if a mapping exists, or add a new value otherwise:
通过merge(),我们可以在存在映射的情况下修改给定键的值,或者添加一个新的值。
Product eBike2 = new Product("E-Bike", "A bike with a battery");
eBike2.getTags().add("sport");
productsByName.merge("E-Bike", eBike2, Product::addTagsOfOtherProduct);
Prior to Java 8:
在Java 8之前。
if(productsByName.containsKey("E-Bike")) {
productsByName.get("E-Bike").addTagsOfOtherProduct(eBike2);
} else {
productsByName.put("E-Bike", eBike2);
}
4.5. compute()
4.5.compute()
With the compute() method, we can compute the value for a given key:
通过compute()方法,我们可以计算出一个给定键的值。
productsByName.compute("E-Bike", (k,v) -> {
if(v != null) {
return v.addTagsOfOtherProduct(eBike2);
} else {
return eBike2;
}
});
Prior to Java 8:
在Java 8之前。
if(productsByName.containsKey("E-Bike")) {
productsByName.get("E-Bike").addTagsOfOtherProduct(eBike2);
} else {
productsByName.put("E-Bike", eBike2);
}
It’s worth noting that the methods merge() and compute() are quite similar. The compute() method accepts two arguments: the key and a BiFunction for the remapping. And merge() accepts three parameters: the key, a default value to add to the map if the key doesn’t exist yet, and a BiFunction for the remapping.
值得注意的是,方法merge()和compute()非常相似。 compute()方法接受两个参数:key和一个用于重映射的BiFunction。而merge()接受三个参数:key、一个default value(如果key还不存在的话)和一个用于重映射的BiFunction。
5. HashMap Internals
5.HashMap内部程序
In this section, we’ll look at how HashMap works internally and what are the benefits of using HashMap instead of a simple list, for example.
在这一节中,我们将看看HashMap是如何在内部工作的,以及使用HashMap而不是简单的列表等有哪些好处。
As we’ve seen, we can retrieve an element from a HashMap by its key. One approach would be to use a list, iterate over all elements, and return when we find an element for which the key matches. Both the time and space complexity of this approach would be O(n).
正如我们所见,我们可以通过键从HashMap中检索一个元素。一种方法是使用一个列表,遍历所有的元素,当我们找到一个与键相匹配的元素时返回。这种方法的时间和空间复杂性都是O(n)。
With HashMap, we can achieve an average time complexity of O(1) for the put and get operations and space complexity of O(n). Let’s see how that works.
通过HashMap,我们可以实现put和get操作的平均时间复杂度为O(1),空间复杂度为O(n)。让我们看看效果如何。
5.1. The Hash Code and Equals
5.1.哈希代码和等价物
Instead of iterating over all its elements, HashMap attempts to calculate the position of a value based on its key.
HashMap没有对其所有元素进行迭代,而是试图根据一个值的键来计算其位置。
The naive approach would be to have a list that can contain as many elements as there are keys possible. As an example, let’s say our key is a lower-case character. Then it’s sufficient to have a list of size 26, and if we want to access the element with key ‘c’, we’d know that it’s the one at position 3, and we can retrieve it directly.
天真的做法是有一个列表,可以包含尽可能多的元素,就像有可能存在的键一样。举个例子,假设我们的键是一个小写的字符。那么有一个大小为26的列表就足够了,如果我们想访问键值为’c’的元素,我们就知道它在第3位,我们可以直接检索到它。
However, this approach would not be very effective if we have a much bigger keyspace. For example, let’s say our key was an integer. In this case, the size of the list would have to be 2,147,483,647. In most cases, we would also have far fewer elements, so a big part of the allocated memory would remain unused.
然而,如果我们有一个更大的密钥空间,这种方法就不太有效了。例如,假设我们的密钥是一个整数。在这种情况下,列表的大小必须是2,147,483,647。在大多数情况下,我们的元素也会少得多,所以分配的内存有很大一部分是未使用的。
HashMap stores elements in so-called buckets and the number of buckets is called capacity.
HashMap在所谓的桶中存储元素,桶的数量被称为capacity。
When we put a value in the map, the key’s hashCode() method is used to determine the bucket in which the value will be stored.
当我们把一个值放到map中时,键的hashCode()方法被用来确定该值将被存储在哪个桶中。
To retrieve the value, HashMap calculates the bucket in the same way – using hashCode(). Then it iterates through the objects found in that bucket and use key’s equals() method to find the exact match.
为了检索值,HashMap以同样的方式计算桶 – 使用hashCode()。然后,它遍历在该桶中找到的对象,并使用key的equals()方法来找到精确的匹配。
5.2. Keys’ Immutability
5.2.密钥的不可更改性
In most cases, we should use immutable keys. Or at least, we must be aware of the consequences of using mutable keys.
在大多数情况下,我们应该使用不可变的键。或者至少,我们必须意识到使用可变键的后果。。
Let’s see what happens when our key changes after we used it to store a value in a map.
让我们看看当我们的键在我们用它在地图中存储一个值之后发生了什么变化。
For this example, we’ll create the MutableKey:
对于这个例子,我们将创建MutableKey。
public class MutableKey {
private String name;
// standard constructor, getter and setter
@Override
public boolean equals(Object o) {
if (this == o) {
return true;
}
if (o == null || getClass() != o.getClass()) {
return false;
}
MutableKey that = (MutableKey) o;
return Objects.equals(name, that.name);
}
@Override
public int hashCode() {
return Objects.hash(name);
}
}
And here goes the test:
下面开始测试。
MutableKey key = new MutableKey("initial");
Map<MutableKey, String> items = new HashMap<>();
items.put(key, "success");
key.setName("changed");
assertNull(items.get(key));
As we can see, we’re no longer able to get the corresponding value once the key has changed, instead, null is returned. This is because HashMap is searching in the wrong bucket.
我们可以看到,一旦键发生变化,我们就无法再获得相应的值,相反,null被返回。这是因为HashMap在错误的桶中搜索。
The above test case may be surprising if we don’t have a good understanding of how HashMap works internally.
如果我们没有很好地理解HashMap的内部工作原理,上述测试案例可能会让人吃惊。
5.3. Collisions
5.3.碰撞
For this to work correctly, equal keys must have the same hash, however, different keys can have the same hash. If two different keys have the same hash, the two values belonging to them will be stored in the same bucket. Inside a bucket, values are stored in a list and retrieved by looping over all elements. The cost of this is O(n).
为了使其正常工作,相等的键必须有相同的哈希值,然而,不同的键可以有相同的哈希值。如果两个不同的键有相同的哈希值,属于它们的两个值将被存储在同一个桶中。在一个桶内,值被存储在一个列表中,并通过在所有元素上循环来检索。这样做的成本是O(n)。
As of Java 8 (see JEP 180), the data structure in which the values inside one bucket are stored is changed from a list to a balanced tree if a bucket contains 8 or more values, and it’s changed back to a list if, at some point, only 6 values are left in the bucket. This improves the performance to be O(log n).
从Java 8开始(见JEP 180),如果一个桶中包含8个或更多的值,那么存储一个桶内的值的数据结构将从列表变为平衡树,如果在某一时刻,桶中只剩下6个值,那么它又将变为列表。这使得性能提高到O(log n)。
5.4. Capacity and Load Factor
5.4.容量和负载率
To avoid having many buckets with multiple values, the capacity is doubled if 75% (the load factor) of the buckets become non-empty. The default value for the load factor is 75%, and the default initial capacity is 16. Both can be set in the constructor.
为了避免许多桶有多个值,如果75%(负载系数)的桶成为非空的,容量就会翻倍。负载因子的默认值是75%,默认的初始容量是16。两者都可以在构造函数中设置。
5.5. Summary of put and get Operations
5.5.put和get操作的摘要
Let’s summarize how the put and get operations work.
让我们总结一下put和get操作的工作原理。
When we add an element to the map, HashMap calculates the bucket. If the bucket already contains a value, the value is added to the list (or tree) belonging to that bucket. If the load factor becomes bigger than the maximum load factor of the map, the capacity is doubled.
当我们向地图添加一个元素时,HashMap会计算桶。如果该桶已经包含了一个值,该值就会被添加到属于该桶的列表(或树)中。如果负载因子变得比地图的最大负载因子大,那么容量就会增加一倍。
When we want to get a value from the map, HashMap calculates the bucket and gets the value with the same key from the list (or tree).
当我们想从地图中获取一个值时,HashMap计算桶并从列表(或树)中获取具有相同键的值。
6. Conclusion
6.结语
In this article, we saw how to use a HashMap and how it works internally. Along with ArrayList, HashMap is one of the most frequently used data structures in Java, so it’s very handy to have good knowledge of how to use it and how it works under the hood. Our article The Java HashMap Under the Hood covers the internals of HashMap in more detail.
在这篇文章中,我们看到了如何使用HashMap以及它的内部工作方式。与ArrayList一样,HashMap是Java中最经常使用的数据结构之一,因此,掌握如何使用它以及它在内部如何工作的知识是非常方便的。我们的文章引擎盖下的Java HashMap更详细地介绍了HashMap的内部情况。
As usual, the complete source code is available over on Github.
像往常一样,完整的源代码可在Github上获得。