1. Introduction
In this tutorial, we’ll learn how to use a byte array as a key in a HashMap. Because of how HashMap works, we unfortunately can’t do that directly. We’ll investigate why that is and look at several ways to solve the problem.
2. Designing a Good Key for HashMap
2.1. How HashMap Works
HashMap uses hashing to store and retrieve its values. When we invoke the put(key, value) method, HashMap calculates a hash from the key’s hashCode() method. This hash identifies the bucket in which the entry is finally stored. The listing below comes from an older JDK implementation; newer versions differ in detail but follow the same principle:
public V put(K key, V value) {
    if (key == null)
        return putForNullKey(value);
    int hash = hash(key.hashCode());
    int i = indexFor(hash, table.length);
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    modCount++;
    addEntry(hash, key, value, i);
    return null;
}
When we retrieve a value using the get(key) method, a similar process takes place. The key’s hash code is computed and used to find the bucket. Then each entry in that bucket is compared with the key using the equals() method. Finally, the value of the matching entry is returned:
public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry<K,V> e = table[indexFor(hash, table.length)]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}
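To make the bucket lookup more concrete, here’s a minimal sketch of how a hash code can be turned into a bucket index. The class and method names (BucketIndexSketch, bucketIndex) are our own and purely illustrative. The bit-mixing step is a simplified stand-in for HashMap’s internal logic rather than a copy of it, and we assume the table length is a power of two:

```java
public class BucketIndexSketch {

    // Hypothetical helper: maps a key to a bucket index.
    // Assumes tableLength is a power of two, so masking keeps the index in range.
    public static int bucketIndex(Object key, int tableLength) {
        int h = (key == null) ? 0 : key.hashCode();
        int spread = h ^ (h >>> 16);        // mix the high bits into the low bits
        return spread & (tableLength - 1);  // keep only the low bits as the index
    }

    public static void main(String[] args) {
        String a = "byte-array";
        String b = new String("byte-array"); // distinct object, equal content

        // Equal keys return equal hash codes, so they land in the same bucket.
        System.out.println(bucketIndex(a, 16) == bucketIndex(b, 16)); // true
    }
}
```

Whatever the exact mixing function is, the important property is that the index depends only on the key’s hashCode(), so two keys with different hash codes may never even be compared with equals().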
2.2. Contract Between equals() and hashCode()
Both the equals and hashCode methods have contracts that should be observed. In the context of HashMap, one aspect is especially important: objects that are equal to each other must return the same hashCode. However, objects that return the same hashCode don’t need to be equal to each other. That’s why a single bucket can hold several entries, and why HashMap still calls equals() to tell them apart.
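As a quick illustration of the second half of the contract, the strings "Aa" and "BB" are a well-known pair that share a hash code without being equal. The sketch below (the class name HashCollisionSketch is ours) stores both in one map:

```java
import java.util.HashMap;
import java.util.Map;

public class HashCollisionSketch {

    // Builds a map whose two keys collide on hashCode() but are not equal.
    public static Map<String, String> buildMap() {
        Map<String, String> map = new HashMap<>();
        map.put("Aa", "first");
        map.put("BB", "second");
        return map;
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        System.out.println("Aa".equals("BB"));                  // false

        // Both entries survive because equals() distinguishes the keys.
        Map<String, String> map = buildMap();
        System.out.println(map.size());    // 2
        System.out.println(map.get("Aa")); // first
        System.out.println(map.get("BB")); // second
    }
}
```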
2.3. Immutability
The hashCode of a key should not change while the key is stored in a HashMap. While it’s not mandatory, it’s highly recommended for keys to be immutable. If an object is immutable, its hashCode won’t have an opportunity to change, regardless of the implementation of the hashCode method.
The default Object#hashCode is identity-based and doesn’t depend on the object’s fields. A typical content-based implementation, however, computes the hash from all fields of the object. If we would like to have a mutable key with content-based equality, we’d need to override the hashCode method to ensure that mutable fields aren’t used in its computation. To maintain the contract, we would also need to change the equals method accordingly.
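To see what can go wrong with a mutable key, here’s a small sketch that uses an ArrayList as a key and modifies it after insertion. The class and method names are hypothetical and only serve the illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MutableKeySketch {

    // Puts an entry under a mutable key, mutates the key, then looks it up again.
    public static String lookupAfterMutation() {
        List<Integer> key = new ArrayList<>(Arrays.asList(1, 2));
        Map<List<Integer>, String> map = new HashMap<>();
        map.put(key, "value");

        key.add(3); // changes the key's hashCode while it sits in the map

        // The lookup now uses the new hash, which no longer matches the stored entry.
        return map.get(key);
    }

    public static void main(String[] args) {
        System.out.println(lookupAfterMutation()); // null
    }
}
```

The entry is still in the map, but it has effectively become unreachable through normal lookups, which is exactly the situation immutable keys prevent.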
2.4. Meaningful Equality
To be able to successfully retrieve values from the map, equality must be meaningful. In most cases, we need to be able to create a new key object that will be equal to some existing key in the map. For that reason, object identity isn’t very useful in this context.
This is also the main reason why using a primitive byte array isn’t really an option. Arrays in Java inherit equals() and hashCode() from Object, so they use object identity to determine equality. If we create a HashMap with a byte array as the key, we’ll be able to retrieve a value only by using exactly the same array object.
Let’s create a naive implementation with a byte array as a key:
byte[] key1 = {1, 2, 3};
byte[] key2 = {1, 2, 3};
Map<byte[], String> map = new HashMap<>();
map.put(key1, "value1");
map.put(key2, "value2");
Not only do we end up with two entries for what is effectively the same key, but we also can’t retrieve anything using a newly created array with the same values:
String retrievedValue1 = map.get(key1);
String retrievedValue2 = map.get(key2);
String retrievedValue3 = map.get(new byte[]{1, 2, 3});
assertThat(retrievedValue1).isEqualTo("value1");
assertThat(retrievedValue2).isEqualTo("value2");
assertThat(retrievedValue3).isNull();
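The difference between identity-based and content-based comparison can be shown directly. In the sketch below, the helper names are our own; Arrays.equals and Arrays.hashCode are the standard JDK utilities:

```java
import java.util.Arrays;

public class ArrayEqualitySketch {

    // Identity-based: this is what HashMap ends up calling for array keys.
    public static boolean identityEquals(byte[] a, byte[] b) {
        return a.equals(b);
    }

    // Content-based: compares length and every element.
    public static boolean contentEquals(byte[] a, byte[] b) {
        return Arrays.equals(a, b);
    }

    public static void main(String[] args) {
        byte[] key1 = {1, 2, 3};
        byte[] key2 = {1, 2, 3};

        System.out.println(identityEquals(key1, key2)); // false
        System.out.println(contentEquals(key1, key2));  // true

        // Arrays.hashCode is derived from the content, so it matches as well.
        System.out.println(Arrays.hashCode(key1) == Arrays.hashCode(key2)); // true
    }
}
```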
3. Using Existing Containers
Instead of the byte array, we can use existing classes whose equality implementation is based on content, not object identity.
3.1. String
String equality is based on the content of the underlying character array, as this listing from an older JDK version shows:
public boolean equals(Object anObject) {
    if (this == anObject) {
        return true;
    }
    if (anObject instanceof String) {
        String anotherString = (String)anObject;
        int n = count;
        if (n == anotherString.count) {
            char v1[] = value;
            char v2[] = anotherString.value;
            int i = offset;
            int j = anotherString.offset;
            while (n-- != 0) {
                if (v1[i++] != v2[j++])
                    return false;
            }
            return true;
        }
    }
    return false;
}
Strings are also immutable, and creating a String from a byte array is fairly straightforward. We can easily encode a byte array into a String, and decode it back, using the Base64 scheme:
String key1 = Base64.getEncoder().encodeToString(new byte[]{1, 2, 3});
String key2 = Base64.getEncoder().encodeToString(new byte[]{1, 2, 3});
Now we can create a HashMap with Strings as keys instead of byte arrays. We’ll put values into the Map in a manner similar to the previous example:
Map<String, String> map = new HashMap<>();
map.put(key1, "value1");
map.put(key2, "value2");
Then we can retrieve values from the map. Because the two keys are equal, the second put() replaced the first value, so both keys return the same, second value. We can also check that the keys are truly equal to each other:
String retrievedValue1 = map.get(key1);
String retrievedValue2 = map.get(key2);
assertThat(key1).isEqualTo(key2);
assertThat(retrievedValue1).isEqualTo("value2");
assertThat(retrievedValue2).isEqualTo("value2");
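If we later need the original bytes back, the String key can be decoded again. Here’s a minimal round-trip sketch; the helper names toKey and fromKey are hypothetical, while Base64.getEncoder() and Base64.getDecoder() are the standard java.util.Base64 API:

```java
import java.util.Arrays;
import java.util.Base64;

public class Base64KeySketch {

    // Encodes raw bytes into a String that can serve as a map key.
    public static String toKey(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    // Recovers the original bytes from a key produced by toKey.
    public static byte[] fromKey(String key) {
        return Base64.getDecoder().decode(key);
    }

    public static void main(String[] args) {
        byte[] original = {1, 2, 3};
        String key = toKey(original);

        System.out.println(key);                                    // AQID
        System.out.println(Arrays.equals(original, fromKey(key)));  // true
    }
}
```

Base64 is lossless for arbitrary byte values, which makes it a safer choice here than decoding the bytes with a character set, where invalid sequences may be replaced.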
3.2. Lists
Similarly to String, the List#equals method checks the equality of each of its elements. If these elements have a sensible equals() method and are immutable, a List will work correctly as a HashMap key. We only need to make sure we’re using an immutable List implementation, for example, Guava’s ImmutableList:
List<Byte> key1 = ImmutableList.of((byte)1, (byte)2, (byte)3);
List<Byte> key2 = ImmutableList.of((byte)1, (byte)2, (byte)3);
Map<List<Byte>, String> map = new HashMap<>();
map.put(key1, "value1");
map.put(key2, "value2");
assertThat(map.get(key1)).isEqualTo(map.get(key2));
Keep in mind that a List of Byte objects takes a lot more memory than an array of byte primitives, since each element is stored as an object reference rather than a single byte. So this solution, while convenient, isn’t viable for most scenarios.
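If we don’t want to depend on a third-party collection library, a similar key can be built with the JDK alone. The following is only a sketch under that assumption: the helper toKey is hypothetical and wraps a boxed copy of the array with Collections.unmodifiableList:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ListKeySketch {

    // Copies the bytes into a boxed list and returns an unmodifiable view of it.
    // No reference to the inner list escapes, so the key can't be changed later.
    public static List<Byte> toKey(byte[] bytes) {
        List<Byte> boxed = new ArrayList<>(bytes.length);
        for (byte b : bytes) {
            boxed.add(b); // autoboxes each byte into a Byte
        }
        return Collections.unmodifiableList(boxed);
    }

    public static void main(String[] args) {
        Map<List<Byte>, String> map = new HashMap<>();
        map.put(toKey(new byte[]{1, 2, 3}), "value");

        // A key built from a different array with the same content finds the entry.
        System.out.println(map.get(toKey(new byte[]{1, 2, 3}))); // value
    }
}
```

The memory overhead mentioned above applies to this variant too, since every element is still boxed.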
4. Implementing a Custom Container
We can also implement our own wrapper to take full control of hash code computation and equality. That way we can make sure the solution is fast and doesn’t have a big memory footprint.
Let’s make a class with one final, private byte array field. It’ll have no setter, and both its constructor and its getter will make a defensive copy to ensure full immutability. Without the copy in the constructor, the caller could still modify the array it passed in and silently change the key:
public final class BytesKey {
    private final byte[] array;

    public BytesKey(byte[] array) {
        // Copy the input so later changes to the caller's array don't affect the key
        this.array = array.clone();
    }

    public byte[] getArray() {
        return array.clone();
    }
}
We also need to implement our own equals and hashCode methods. Fortunately, we can use the Arrays utility class for both of these tasks:
@Override
public boolean equals(Object o) {
    if (this == o) return true;
    if (o == null || getClass() != o.getClass()) return false;
    BytesKey bytesKey = (BytesKey) o;
    return Arrays.equals(array, bytesKey.array);
}

@Override
public int hashCode() {
    return Arrays.hashCode(array);
}
Finally, we can use our wrapper as a key in a HashMap:
BytesKey key1 = new BytesKey(new byte[]{1, 2, 3});
BytesKey key2 = new BytesKey(new byte[]{1, 2, 3});
Map<BytesKey, String> map = new HashMap<>();
map.put(key1, "value1");
map.put(key2, "value2");
Then we can retrieve the second value using either of the declared keys, or a key created on the fly:
String retrievedValue1 = map.get(key1);
String retrievedValue2 = map.get(key2);
String retrievedValue3 = map.get(new BytesKey(new byte[]{1, 2, 3}));
assertThat(retrievedValue1).isEqualTo("value2");
assertThat(retrievedValue2).isEqualTo("value2");
assertThat(retrievedValue3).isEqualTo("value2");
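For reference, here’s a self-contained sketch that assembles the wrapper into one runnable unit. It’s one possible arrangement rather than the definitive implementation: we nest the key inside a hypothetical BytesKeySketch class and copy the array in the constructor as well, so that changes to the caller’s array can’t alter a key that is already in the map:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class BytesKeySketch {

    // Immutable wrapper around a byte array with content-based equality.
    public static final class BytesKey {
        private final byte[] array;

        public BytesKey(byte[] array) {
            // Defensive copy: the key owns its bytes from here on.
            this.array = array.clone();
        }

        public byte[] getArray() {
            return array.clone();
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            return Arrays.equals(array, ((BytesKey) o).array);
        }

        @Override
        public int hashCode() {
            return Arrays.hashCode(array);
        }
    }

    public static void main(String[] args) {
        byte[] source = {1, 2, 3};
        Map<BytesKey, String> map = new HashMap<>();
        map.put(new BytesKey(source), "value");

        source[0] = 42; // mutate the original array after the key was stored

        // The key kept its own copy, so a fresh key with the old content still matches.
        System.out.println(map.get(new BytesKey(new byte[]{1, 2, 3}))); // value
    }
}
```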
5. Conclusion
In this tutorial, we looked at the problems of using a byte array as a key in a HashMap and at several solutions. First, we investigated why we can’t use arrays as keys directly. Then we used some built-in containers to work around that problem and, finally, implemented our own wrapper.
As usual, the source code for this tutorial can be found over on GitHub.