1. Overview
Hashtable is the oldest implementation of a hash table data structure in Java. The HashMap is the second implementation, which was introduced in JDK 1.2.
Both classes provide similar functionality, but there are also small differences, which we’ll explore in this tutorial.
2. When to Use Hashtable
Let’s say we have a dictionary, where each word has its definition. Also, we need to get, insert and remove words from the dictionary quickly.
Hence, Hashtable (or HashMap) makes sense. Words will be the keys in the Hashtable, as they are supposed to be unique. Definitions, on the other hand, will be the values.
3. Example of Use
Let’s continue with the dictionary example. We’ll model Word as a key:
public class Word {
private String name;
public Word(String name) {
this.name = name;
}
// ...
}
Let’s say the values are Strings. Now we can create a Hashtable:
Hashtable<Word, String> table = new Hashtable<>();
First, let’s add an entry:
Word word = new Word("cat");
table.put(word, "an animal");
Next, let’s get an entry:
String definition = table.get(word);
Finally, let’s remove an entry:
definition = table.remove(word);
There are many more methods in the class, and we’ll describe some of them later.
But first, let’s talk about some requirements for the key object.
4. The Importance of hashCode()
To be used as a key in a Hashtable, an object mustn’t violate the hashCode() contract. In short, equal objects must return the same hash code. To understand why, let’s look at how the hash table is organized.
Hashtable uses an array. Each position in the array is a “bucket”, which can be either null or contain one or more key-value pairs. The index of each pair is calculated from its key.
But why not store elements sequentially, adding new elements to the end of the array?
The point is that finding an element by index is much quicker than iterating through the elements sequentially and comparing each one. Hence, we need a function that maps keys to indexes.
4.1. Direct Address Table
The simplest example of such mapping is the direct-address table. Here keys are used as indexes:
index(k)=k,
where k is a key
Keys are unique, so each bucket contains at most one key-value pair. This technique works well for integer keys when their possible range is reasonably small.
But we have two problems here:
- First, our keys are not integers, but Word objects
- Second, even if they were integers, nothing would guarantee that they are small. Imagine that the keys are 1, 2 and 1000000. We’d need an array with more than a million slots to hold only three elements, and the rest would be wasted space
The hashCode() method solves the first problem.
The logic for data manipulation in the Hashtable solves the second problem.
Let’s discuss this in depth.
4.2. hashCode() Method
Every Java object inherits the hashCode() method, which returns an int value. In the default implementation, this value is typically derived from the object’s identity (on some JVMs, its memory address), and the exact scheme is implementation-dependent. As far as is reasonably practical, the default hashCode() returns distinct integers for distinct objects, but this is not guaranteed.
Thus any key object can be converted to an integer using hashCode(). But this integer may be big.
4.3. Reducing the Range
The get(), put() and remove() methods contain the code that solves the second problem: reducing the range of possible integers.
The formula calculates an index for the key:
int index = (hash & 0x7FFFFFFF) % tab.length;
Here, tab.length is the array size, and hash is the number returned by the key’s hashCode() method.
As we can see, index is the remainder of dividing hash by the array size. The & 0x7FFFFFFF mask clears the sign bit first, so the index is never negative. Note that equal hash codes produce the same index.
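As a rough illustration of the formula above, the following sketch computes bucket indexes for a few String keys. The class name IndexDemo, the helper indexFor() and the capacity of 11 are assumptions made for this example; they are not part of the Hashtable API.

```java
class IndexDemo {

    // Mirrors the formula above: clear the sign bit, then take the remainder.
    static int indexFor(Object key, int capacity) {
        int hash = key.hashCode();
        return (hash & 0x7FFFFFFF) % capacity;
    }

    public static void main(String[] args) {
        int capacity = 11; // assumed array size for this example
        for (String key : new String[] { "cat", "dog", "bird" }) {
            System.out.println(key + " -> bucket " + indexFor(key, capacity));
        }
    }
}
```

Whatever the hash code is, the result always falls between 0 and capacity - 1, so it can be used directly as an array index.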
4.4. Collisions
Furthermore, even different hash codes can produce the same index. We refer to this as a collision. To resolve collisions, Hashtable stores a linked list of key-value pairs in each bucket (a chain of internal entry nodes, not a java.util.LinkedList).
Such a data structure is called a hash table with chaining.
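To make the idea of chaining more concrete, here is a minimal sketch of a chained hash table. It is not the actual Hashtable source: the class ChainedTable, its Node type and the fixed capacity are hypothetical, and resizing and synchronization are left out on purpose.

```java
import java.util.Objects;

class ChainedTable<K, V> {

    // One node in a bucket's chain.
    private static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node<K, V>[] buckets;

    @SuppressWarnings("unchecked")
    ChainedTable(int capacity) {
        buckets = (Node<K, V>[]) new Node[capacity];
    }

    private int indexFor(Object key) {
        return (key.hashCode() & 0x7FFFFFFF) % buckets.length;
    }

    // Returns the previous value, or null if the key was absent.
    public V put(K key, V value) {
        Objects.requireNonNull(key);
        int index = indexFor(key);
        for (Node<K, V> n = buckets[index]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                V old = n.value;
                n.value = value;
                return old;
            }
        }
        // Key not found in the chain: link a new node at the head of the bucket.
        buckets[index] = new Node<>(key, value, buckets[index]);
        return null;
    }

    // Walks the chain of the key's bucket and compares keys with equals().
    public V get(Object key) {
        for (Node<K, V> n = buckets[indexFor(key)]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                return n.value;
            }
        }
        return null;
    }
}
```

With a capacity of 4, the Integer keys 1 and 5 land in the same bucket, yet both remain retrievable because get() compares keys along the chain.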
4.5. Load Factor
It’s easy to guess that collisions slow down operations on elements. To get an entry, it’s not enough to know its index: we also need to go through the bucket’s list and compare the key with each item.
Therefore, it’s important to reduce the number of collisions. The bigger the array, the smaller the chance of a collision. The load factor determines the balance between the array size and the performance. By default, it’s 0.75, which means that the array grows once the number of entries exceeds 75% of its capacity. In the OpenJDK implementation, the new capacity is twice the old one plus one. This operation is executed by the rehash() method.
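The arithmetic behind the growth rule can be sketched as follows. This example assumes the behaviour of the OpenJDK implementation (a threshold of capacity × load factor and a new capacity of 2 × old + 1); other implementations may differ. The class LoadFactorDemo and its helper methods are hypothetical names used only for illustration.

```java
class LoadFactorDemo {

    // Number of entries at which the table grows, for a given capacity.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    // Growth rule assumed from OpenJDK's Hashtable.rehash(): double, then add one.
    static int nextCapacity(int capacity) {
        return capacity * 2 + 1;
    }

    public static void main(String[] args) {
        int capacity = 11;        // documented default capacity of Hashtable
        float loadFactor = 0.75f; // documented default load factor
        for (int i = 0; i < 3; i++) {
            System.out.println("capacity=" + capacity
                + ", threshold=" + threshold(capacity, loadFactor));
            capacity = nextCapacity(capacity);
        }
    }
}
```

If we know roughly how many entries we’ll store, we can pass an initial capacity and load factor to the Hashtable(int, float) constructor to avoid repeated rehashing.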
But let’s return to the keys.
4.6. Overriding equals() and hashCode()
When we put an entry into a Hashtable and later get it out, we expect to obtain the value not only with the same instance of the key but also with an equal key:
Word word = new Word("cat");
table.put(word, "an animal");
String extracted = table.get(new Word("cat"));
To set the rules of equality, we override the key’s equals() method:
public boolean equals(Object o) {
if (o == this)
return true;
if (!(o instanceof Word))
return false;
Word word = (Word) o;
return word.getName().equals(this.name);
}
But if we don’t override hashCode() when overriding equals(), then two equal keys may end up in different buckets, because Hashtable calculates the key’s index using its hash code.
Let’s take a close look at the above example. What happens if we don’t override hashCode()?
- Two instances of Word are involved here: the first is for putting the entry and the second is for getting the entry. Although these instances are equal, their hashCode() methods return different numbers
- The index for each key is calculated by the formula from section 4.3. According to this formula, different hash codes may produce different indexes
- This means that we put the entry into one bucket and then try to get it out of another bucket, so the lookup fails to find it
Equal keys must return equal hash codes. That’s why we override the hashCode() method:
public int hashCode() {
return name.hashCode();
}
Note that it’s also recommended that unequal keys return different hash codes wherever possible; otherwise, they end up in the same bucket. This hurts performance and loses some of the advantages of a Hashtable.
Also, note that we don’t need to worry about keys of type String, Integer, Long or another wrapper type. The equals() and hashCode() methods are already overridden in String and in the wrapper classes.
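Putting the pieces of this section together, a complete key class might look like the sketch below. The wrapper class WordKeyDemo and the lookup() helper are hypothetical; the nested Word class combines the equals() and hashCode() overrides shown above and adds the getName() accessor that the snippets rely on.

```java
import java.util.Hashtable;
import java.util.Objects;

class WordKeyDemo {

    static final class Word {
        private final String name;

        Word(String name) {
            this.name = Objects.requireNonNull(name);
        }

        String getName() {
            return name;
        }

        @Override
        public boolean equals(Object o) {
            if (o == this) return true;
            if (!(o instanceof Word)) return false;
            return name.equals(((Word) o).name);
        }

        // Equal names give equal hash codes, so equal keys share a bucket.
        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    static String lookup() {
        Hashtable<Word, String> table = new Hashtable<>();
        table.put(new Word("cat"), "an animal");
        // A different but equal instance finds the same entry.
        return table.get(new Word("cat"));
    }

    public static void main(String[] args) {
        System.out.println(lookup()); // prints "an animal"
    }
}
```

If we removed the hashCode() override, the lookup with the second instance would most likely return null, because the two instances would usually get different default hash codes.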
5. Iterating Hashtables
There are a few ways to iterate Hashtables. In this section, we’ll talk about them and explain some of the implications.
5.1. Fail Fast: Iteration
Fail-fast iteration means that if a Hashtable is modified after its Iterator is created (other than through the iterator’s own remove() method), a ConcurrentModificationException will be thrown. Let’s demonstrate this.
First, we’ll create a Hashtable and add entries to it:
Hashtable<Word, String> table = new Hashtable<Word, String>();
table.put(new Word("cat"), "an animal");
table.put(new Word("dog"), "another animal");
Second, we’ll create an Iterator:
Iterator<Word> it = table.keySet().iterator();
And third, we’ll modify the table:
table.remove(new Word("dog"));
Now if we try to iterate through the table, we’ll get a ConcurrentModificationException:
while (it.hasNext()) {
Word key = it.next();
}
java.util.ConcurrentModificationException
at java.util.Hashtable$Enumerator.next(Hashtable.java:1378)
ConcurrentModificationException helps to find bugs and thus avoid unpredictable behavior, when, for example, one thread is iterating through the table, and another one is trying to modify it at the same time.
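For reference, the steps above can be combined into one self-contained sketch. To keep it short, it uses String keys instead of Word, which is an assumption of this example, and it also shows that removing through the iterator itself is allowed. The class and method names are hypothetical.

```java
import java.util.ConcurrentModificationException;
import java.util.Hashtable;
import java.util.Iterator;

class FailFastDemo {

    private static Hashtable<String, String> newTable() {
        Hashtable<String, String> table = new Hashtable<>();
        table.put("cat", "an animal");
        table.put("dog", "another animal");
        return table;
    }

    // Modifies the table directly after the iterator has been created.
    static boolean throwsOnDirectModification() {
        Hashtable<String, String> table = newTable();
        Iterator<String> it = table.keySet().iterator();
        table.remove("dog");
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes through the iterator itself, which keeps the iterator valid.
    static int sizeAfterIteratorRemove() {
        Hashtable<String, String> table = newTable();
        Iterator<String> it = table.keySet().iterator();
        while (it.hasNext()) {
            if (it.next().equals("dog")) {
                it.remove();
            }
        }
        return table.size();
    }

    public static void main(String[] args) {
        System.out.println(throwsOnDirectModification()); // prints "true"
        System.out.println(sizeAfterIteratorRemove());    // prints "1"
    }
}
```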
5.2. Not Fail Fast: Enumeration
Enumeration in a Hashtable is not fail-fast. Let’s look at an example.
First, let’s create a Hashtable and add entries to it:
Hashtable<Word, String> table = new Hashtable<Word, String>();
table.put(new Word("1"), "one");
table.put(new Word("2"), "two");
Second, let’s create an Enumeration:
Enumeration<Word> enumKey = table.keys();
Third, let’s modify the table:
table.remove(new Word("1"));
Now if we iterate through the table it won’t throw an exception:
while (enumKey.hasMoreElements()) {
Word key = enumKey.nextElement();
}
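The same experiment can be written as a single runnable sketch. Again, String keys replace Word for brevity, and the class EnumerationDemo and its method are hypothetical. The point is only that no exception is thrown; which elements the Enumeration still sees after the modification should not be relied upon.

```java
import java.util.Enumeration;
import java.util.Hashtable;

class EnumerationDemo {

    // Returns how many keys the Enumeration yields after the table was modified.
    static int keysSeenAfterRemoval() {
        Hashtable<String, String> table = new Hashtable<>();
        table.put("1", "one");
        table.put("2", "two");

        Enumeration<String> keys = table.keys();
        table.remove("1"); // modification after the Enumeration was created

        int seen = 0;
        while (keys.hasMoreElements()) {
            keys.nextElement(); // no ConcurrentModificationException here
            seen++;
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("keys seen: " + keysSeenAfterRemoval());
    }
}
```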
5.3. Unpredictable Iteration Order
Also, note that iteration order in a Hashtable is unpredictable and does not match the order in which the entries were added.
This is understandable, because Hashtable calculates each index using the key’s hash code. Moreover, rehashing takes place from time to time, rearranging the entries in the data structure.
Hence, let’s add some entries and check the output:
Hashtable<Word, String> table = new Hashtable<Word, String>();
table.put(new Word("1"), "one");
table.put(new Word("2"), "two");
// ...
table.put(new Word("8"), "eight");
Iterator<Map.Entry<Word, String>> it = table.entrySet().iterator();
while (it.hasNext()) {
    Map.Entry<Word, String> entry = it.next();
    // ...
}
five
four
three
two
one
eight
seven
6. Hashtable vs. HashMap
Hashtable and HashMap provide very similar functionality.
Both of them provide:
- Fail-fast iteration
- Unpredictable iteration order
But there are some differences too:
- HashMap doesn’t provide any Enumeration, while Hashtable provides an Enumeration that is not fail-fast
- Hashtable doesn’t allow null keys or null values, while HashMap allows one null key and any number of null values
- Hashtable’s methods are synchronized, while HashMap’s methods are not
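The difference in null handling is easy to check. The following sketch uses a hypothetical helper, rejectsNullKey(), that tries to insert a null key into any Map and reports whether a NullPointerException was thrown.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class NullKeyDemo {

    // Returns true if the given map refuses a null key.
    static boolean rejectsNullKey(Map<String, String> map) {
        try {
            map.put(null, "value");
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsNullKey(new Hashtable<>())); // prints "true"
        System.out.println(rejectsNullKey(new HashMap<>()));   // prints "false"
    }
}
```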
7. Hashtable API in Java 8
Java 8 has introduced new methods which help make our code cleaner. In particular, we can get rid of some if blocks. Let’s demonstrate this.
7.1. getOrDefault()
Let’s say we need to get the definition of the word “dog” and assign it to a variable if it’s in the table. Otherwise, we assign “not found” to the variable.
Before Java 8:
Word key = new Word("dog");
String definition;
if (table.containsKey(key)) {
definition = table.get(key);
} else {
definition = "not found";
}
After Java 8:
definition = table.getOrDefault(key, "not found");
7.2. putIfAbsent()
Let’s say we need to put the word “cat” only if it’s not in the dictionary yet.
Before Java 8:
if (!table.containsKey(new Word("cat"))) {
table.put(new Word("cat"), definition);
}
After Java 8:
table.putIfAbsent(new Word("cat"), definition);
7.3. boolean remove()
Let’s say we need to remove the word “cat”, but only if its definition is “an animal”.
Before Java 8:
// Calling equals() on the literal avoids a NullPointerException when the key is absent
if ("an animal".equals(table.get(new Word("cat")))) {
    table.remove(new Word("cat"));
}
After Java 8:
boolean result = table.remove(new Word("cat"), "an animal");
Finally, while the old remove() method returns the value, the new method returns a boolean.
7.4. replace()
Let’s say we need to replace a definition of “cat”, but only if its old definition is “a small domesticated carnivorous mammal”.
Before Java 8:
if (table.containsKey(new Word("cat"))
&& table.get(new Word("cat")).equals("a small domesticated carnivorous mammal")) {
table.put(new Word("cat"), definition);
}
After Java 8:
table.replace(new Word("cat"), "a small domesticated carnivorous mammal", definition);
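The return values of these conditional methods are worth a quick look. The sketch below exercises getOrDefault(), putIfAbsent(), the two-argument remove() and the three-argument replace() on a small table. It uses String keys for brevity, and the class ConditionalOpsDemo and its newTable() helper are hypothetical.

```java
import java.util.Hashtable;

class ConditionalOpsDemo {

    static Hashtable<String, String> newTable() {
        Hashtable<String, String> table = new Hashtable<>();
        table.put("cat", "an animal");
        return table;
    }

    public static void main(String[] args) {
        Hashtable<String, String> table = newTable();

        // "dog" is absent, so the default is returned.
        System.out.println(table.getOrDefault("dog", "not found")); // prints "not found"

        // "cat" is present, so nothing is stored and the existing value is returned.
        System.out.println(table.putIfAbsent("cat", "a pet")); // prints "an animal"

        // The value doesn't match, so nothing is removed.
        System.out.println(table.remove("cat", "a pet")); // prints "false"

        // The old value matches, so the definition is replaced.
        System.out.println(table.replace("cat", "an animal",
            "a small domesticated carnivorous mammal")); // prints "true"
    }
}
```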
7.5. computeIfAbsent()
This method is similar to putIfAbsent(). But putIfAbsent() takes the value directly, while computeIfAbsent() takes a mapping function. It calculates the value only after it has checked the key and found it absent, which is more efficient, especially if the value is expensive to obtain.
table.computeIfAbsent(new Word("cat"), key -> "an animal");
Hence, the above line is equivalent to:
if (!table.containsKey(new Word("cat"))) {
    String definition = "an animal"; // note that the calculation takes place inside the if block
    table.put(new Word("cat"), definition);
}
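To see the lazy evaluation in action, we can count how often the mapping function actually runs. In this sketch, the class ComputeIfAbsentDemo, the method mappingCalls() and the use of String keys are assumptions made for the example.

```java
import java.util.Hashtable;
import java.util.concurrent.atomic.AtomicInteger;

class ComputeIfAbsentDemo {

    // Calls computeIfAbsent() three times for the same key and
    // returns how many times the mapping function was invoked.
    static int mappingCalls() {
        Hashtable<String, String> table = new Hashtable<>();
        AtomicInteger calls = new AtomicInteger();
        for (int i = 0; i < 3; i++) {
            table.computeIfAbsent("cat", key -> {
                calls.incrementAndGet();
                return "an animal";
            });
        }
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println(mappingCalls()); // prints "1"
    }
}
```

The function runs only on the first call, when the key is still absent. With putIfAbsent(), the value would have to be computed before every call, whether or not it ends up being stored.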
7.6. computeIfPresent()
This method is similar to the replace() method. But, again, replace() takes the value directly, while computeIfPresent() takes a mapping function. It calculates the new value only if the key is present, i.e. inside the if block, which is why it’s more efficient.
Let’s say we need to change the definition:
Word cat = new Word("cat");
table.computeIfPresent(cat, (key, value) -> key.getName() + " - " + value);
Hence, the above line is equivalent to:
if (table.containsKey(cat)) {
    String concatenation = cat.getName() + " - " + table.get(cat);
    table.put(cat, concatenation);
}
7.7. compute()
Now we’ll solve another task. Let’s say we have an array of Strings whose elements are not unique, and we want to count how many times each String occurs in the array. Here is the array:
String[] animals = { "cat", "dog", "dog", "cat", "bird", "mouse", "mouse" };
Also, we want to create a Hashtable which contains an animal as a key and the number of its occurrences as a value.
Here is a solution:
Hashtable<String, Integer> table = new Hashtable<String, Integer>();
for (String animal : animals) {
table.compute(animal,
(key, value) -> (value == null ? 1 : value + 1));
}
Finally, let’s make sure that the table contains two cats, two dogs, one bird and two mice:
assertThat(table.values(), hasItems(2, 2, 2, 1));
7.8. merge()
There is another way to solve the above task:
for (String animal : animals) {
table.merge(animal, 1, (oldValue, value) -> (oldValue + value));
}
The second argument, 1, is the value that is mapped to the key if the key is not yet in the table. If the key is already in the table, then the new value is calculated as oldValue + 1.
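As a self-contained variant of the counting task, the sketch below wraps the merge() loop in a method and uses the Integer::sum method reference in place of the lambda, which is equivalent here. The class MergeCountDemo and the method count() are hypothetical names.

```java
import java.util.Hashtable;
import java.util.Map;

class MergeCountDemo {

    // Counts how many times each word occurs in the array.
    static Map<String, Integer> count(String[] words) {
        Hashtable<String, Integer> table = new Hashtable<>();
        for (String word : words) {
            // Absent key: store 1. Present key: store oldValue + 1.
            table.merge(word, 1, Integer::sum);
        }
        return table;
    }

    public static void main(String[] args) {
        String[] animals = { "cat", "dog", "dog", "cat", "bird", "mouse", "mouse" };
        Map<String, Integer> counts = count(animals);
        System.out.println("cat: " + counts.get("cat"));   // prints "cat: 2"
        System.out.println("bird: " + counts.get("bird")); // prints "bird: 1"
    }
}
```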
7.9. forEach()
This is a new way to iterate through the entries. Let’s print all the entries:
table.forEach((k, v) -> System.out.println(k.getName() + " - " + v));
7.10. replaceAll()
Additionally, we can replace all the values without iteration:
table.replaceAll((k, v) -> k.getName() + " - " + v);
8. Conclusion
In this article, we’ve described the purpose of the hash table structure and shown how to extend a direct-address table structure to get it.
Additionally, we’ve covered what collisions are and what a load factor is in a Hashtable. Also, we’ve learned why to override equals() and hashCode() for key objects.
Finally, we’ve talked about Hashtable’s properties and the Java 8-specific API.
As usual, the complete source code is available on GitHub.