Java TreeMap vs HashMap – Java TreeMap vs HashMap

最后修改: 2018年 1月 8日

中文/混合/英文(键盘快捷键:t)

1. Introduction

1.介绍

In this article, we’re going to compare two Map implementations: TreeMap and HashMap.

在这篇文章中,我们将比较两种Map的实现。TreeMapHashMap

Both implementations form an integral part of the Java Collections Framework and store data as key-value pairs.

这两种实现都是JavaCollections框架的一个组成部分,并以键-值对的形式存储数据。

2. Differences

2.差异

2.1. Implementation

2.1.实施

We’ll first talk about the HashMap which is a hashtable-based implementation. It extends the AbstractMap class and implements the Map interface. A HashMap works on the principle of hashing.

我们将首先讨论HashMap,它是一个基于哈希图的实现。它扩展了AbstractMap类并实现了Map接口。HashMap的工作原理是hashing

This Map implementation usually acts as a bucketed hash table, but when buckets get too large, they get transformed into nodes of TreeNodes, each structured similarly to those in java.util.TreeMap.

这个Map实现通常作为一个桶状的哈希表,但当桶变得太大,它们被转化为TreeNodes的节点,每个节点的结构类似于java.util.TreeMap.中的节点。

You can find more on the HashMap’s internals in the article focused on it.

你可以在专注于它的文章中找到更多关于HashMap的内部信息。

On the other hand, TreeMap extends AbstractMap class and implements NavigableMap interface. A TreeMap stores map elements in a Red-Black tree, which is a Self-Balancing Binary Search Tree.

另一方面,TreeMap扩展了AbstractMap类并实现了NavigableMap接口。TreeMapRed-Black树中存储地图元素,它是一个自我平衡的二进制搜索树

And, you can also find more on the TreeMap’s internals in the article focused on it here.

而且,你还可以在这里重点介绍的文章中找到更多关于TreeMap的内部信息。

2.2. Order

2.2.命令

HashMap doesn’t provide any guarantee over the way the elements are arranged in the Map.

HashMap并没有对元素在Map中的排列方式提供任何保证。

It means, we can’t assume any order while iterating over keys and values of a HashMap:

这意味着,我们在迭代HashMap时不能假设任何顺序:

@Test
public void whenInsertObjectsHashMap_thenRandomOrder() {
    Map<Integer, String> hashmap = new HashMap<>();
    hashmap.put(3, "TreeMap");
    hashmap.put(2, "vs");
    hashmap.put(1, "HashMap");
    
    assertThat(hashmap.keySet(), containsInAnyOrder(1, 2, 3));
}

However, items in a TreeMap are sorted according to their natural order.

然而,TreeMap中的项目是按照其自然顺序排序的

If TreeMap objects cannot be sorted according to natural order then we may make use of a Comparator or Comparable to define the order in which the elements are arranged within the Map:

如果TreeMap对象不能按照自然顺序排序,那么我们可以使用ComparatorComparable来定义元素在Map中的排列顺序:

@Test
public void whenInsertObjectsTreeMap_thenNaturalOrder() {
    Map<Integer, String> treemap = new TreeMap<>();
    treemap.put(3, "TreeMap");
    treemap.put(2, "vs");
    treemap.put(1, "HashMap");
    
    assertThat(treemap.keySet(), contains(1, 2, 3));
}

2.3. Null Values

2.3.空值

HashMap allows storing at most one null key and many null values.

HashMap允许最多存储一个null key和许多null值。

Let’s see an example:

让我们看一个例子。

@Test
public void whenInsertNullInHashMap_thenInsertsNull() {
    Map<Integer, String> hashmap = new HashMap<>();
    hashmap.put(null, null);
    
    assertNull(hashmap.get(null));
}

However, TreeMap doesn’t allow a null key but may contain many null values.

然而,TreeMap不允许有null key,但可以包含许多null值。

A null key isn’t allowed because the compareTo() or the compare() method throws a NullPointerException:

由于compareTo()compare()方法会抛出NullPointerException:,所以不允许使用nullkey。

@Test(expected = NullPointerException.class)
public void whenInsertNullInTreeMap_thenException() {
    Map<Integer, String> treemap = new TreeMap<>();
    treemap.put(null, "NullPointerException");
}

If we’re using a TreeMap with a user-defined Comparator, then it depends on the implementation of the compare() method how null values get handled.

如果我们使用带有用户定义的ComparatorTreeMap,那么如何处理null值就取决于compare()方法的实现了。

3. Performance Analysis

3.性能分析

Performance is the most critical metric that helps us understand the suitability of a data-structure given a use-case.

性能是最关键的指标,可以帮助我们了解一个数据结构在使用情况下的适用性。

In this section, we’ll provide a comprehensive analysis of performance for HashMap and TreeMap.

在本节中,我们将对HashMapTreeMap的性能进行全面分析。

3.1. HashMap

3.1.HashMap

HashMap, being a hashtable-based implementation, internally uses an array-based data structure to organize its elements according to the hash function.

HashMap,是一个基于哈希的实现,内部使用一个基于数组的数据结构,根据哈希函数组织其元素。

HashMap provides expected constant-time performance O(1) for most operations like add(), remove() and contains(). Therefore, it’s significantly faster than a TreeMap.

HashMapadd()remove()contains()等大多数操作提供了预期的恒定时间性能O(1)TreeMap快。

The average time to search for an element under the reasonable assumption, in a hash table is O(1). But, an improper implementation of the hash function may lead to a poor distribution of values in buckets which results in:

在合理的假设下,在哈希表中搜索一个元素的平均时间是O(1)。但是,对哈希函数的不恰当实现可能会导致值在桶中的分布不佳,从而导致。

  • Memory Overhead – many buckets remain unused
  • Performance Degradationthe higher the number of collisions, the lower the performance

Before Java 8, Separate Chaining was the only preferred way to handle collisions. It’s usually implemented using linked lists, i.e., if there is any collision or two different elements have same hash value then store both the items in the same linked list.

在Java 8之前,Separate Chaining是处理碰撞的唯一首选方式。它通常使用链表来实现,,如果有任何碰撞或两个不同的元素具有相同的哈希值,那么将这两个项目存储在同一个链表中。

Therefore, searching for an element in a HashMap, in the worst case could have taken as long as searching for an element in a linked list i.e. O(n) time.

因此,在HashMap中搜索一个元素,在最坏的情况下,可能需要和在链接列表中搜索一个元素一样长的时间i.e.O(n)时间。

However, with JEP 180 coming into the picture, there’s been a subtle change in the implementation of the way the elements are arranged in a HashMap.

然而,随着JEP 180的出现,元素在HashMap中的排列方式的实现发生了微妙的变化。

According to the specification, when buckets get too large and contain enough nodes they get transformed into modes of TreeNodes, each structured similarly to those in TreeMap.

根据规范,当桶变得太大并且包含足够多的节点时,它们会被转化为TreeNodes的模式,每个模式的结构与TreeMap中的模式类似。

Hence, in the event of high hash collisions, the worst-case performance will improve from O(n) to O(log n).

因此,在高哈希碰撞的情况下,最坏情况下的性能将从O(n)提高到O(log n)。

The code performing this transformation has been illustrated below:

执行这种转换的代码如下所示。

if(binCount >= TREEIFY_THRESHOLD - 1) {
    treeifyBin(tab, hash);
}

The value for TREEIFY_THRESHOLD is eight which effectively denotes the threshold count for using a tree rather than a linked list for a bucket.

TREEIFY_THRESHOLD的值是8,它有效地表示了使用树而不是链接列表的桶的阈值计数。

It is evident that:

可见,。

  • A HashMap requires way more memory than is needed to hold its data
  • A HashMap shouldn’t be more than 70% – 75% full. If it gets close, it gets resized and entries rehashed
  • Rehashing requires n operations which is costly wherein our constant time insert becomes of order O(n)
  • It’s the hashing algorithm which determines the order of inserting the objects in the HashMap

The performance of a HashMap can be tuned by setting the custom initial capacity and the load factor, at the time of HashMap object creation itself.

在创建HashMap对象时,可以通过设置自定义的初始容量负载因子来调整HashMap的性能。

However, we should choose a HashMap if:

然而,在以下情况下,我们应该选择一个HashMap

  • we know approximately how many items to maintain in our collection
  • we don’t want to extract items in a natural order

Under the above circumstances, HashMap is our best choice because it offers constant time insertion, search, and deletion.

在上述情况下,HashMap是我们的最佳选择,因为它提供了恒定时间的插入、搜索和删除。

3.2. TreeMap

3.2.TreeMap

A TreeMap stores its data in a hierarchical tree with the ability to sort the elements with the help of a custom Comparator.

一个TreeMap将其数据存储在一个分层的树中,并能够在一个自定义Comparator.的帮助下对元素进行排序。

A summary of its performance:

对其表现的总结。

  • TreeMap provides a performance of O(log(n)) for most operations like add(), remove() and contains()
  • A Treemap can save memory (in comparison to HashMap) because it only uses the amount of memory needed to hold its items, unlike a HashMap which uses contiguous region of memory
  • A tree should maintain its balance in order to keep its intended performance, this requires a considerable amount of effort, hence complicates the implementation

We should go for a TreeMap whenever:

无论何时,我们都应该选择TreeMap

  • memory limitations have to be taken into consideration
  • we don’t know how many items have to be stored in memory
  • we want to extract objects in a natural order
  • if items will be consistently added and removed
  • we’re willing to accept O(log n) search time

4. Similarities

4.相似之处

4.1. Unique Elements

4.1.独特的元素

Both TreeMap and HashMap don’t support duplicate keys. If added, it overrides the previous element (without an error or an exception):

TreeMapHashMap都不支持重复键。如果添加,它将覆盖前一个元素(没有错误或异常)。

@Test
public void givenHashMapAndTreeMap_whenputDuplicates_thenOnlyUnique() {
    Map<Integer, String> treeMap = new HashMap<>();
    treeMap.put(1, "Baeldung");
    treeMap.put(1, "Baeldung");

    assertTrue(treeMap.size() == 1);

    Map<Integer, String> treeMap2 = new TreeMap<>();
    treeMap2.put(1, "Baeldung");
    treeMap2.put(1, "Baeldung");

    assertTrue(treeMap2.size() == 1);
}

4.2. Concurrent Access

4.2.并行访问

Both Map implementations aren’t synchronized and we need to manage concurrent access on our own.

两个Map实现都不是同步的,我们需要自己管理并发访问。

Both must be synchronized externally whenever multiple threads access them concurrently and at least one of the threads modifies them.

只要有多个线程同时访问它们,并且至少有一个线程修改了它们,就必须对这两者进行外部同步。

We have to explicitly use Collections.synchronizedMap(mapName) to obtain a synchronized view of a provided map.

我们必须明确地使用Collections.synchronizedMap(mapName)来获得一个所提供的地图的同步视图。

4.3. Fail-Fast Iterators

4.3.失败快速的迭代器

The Iterator throws a ConcurrentModificationException if the Map gets modified in any way and at any time once the iterator has been created.

如果Map在迭代器创建后的任何时候以任何方式被修改,Iterator会抛出ConcurrentModificationException

Additionally, we can use the iterator’s remove method to alter the Map during iteration.

此外,我们可以使用迭代器的移除方法来改变迭代过程中的Map

Let’s see an example:

让我们看一个例子。

@Test
public void whenModifyMapDuringIteration_thenThrowExecption() {
    Map<Integer, String> hashmap = new HashMap<>();
    hashmap.put(1, "One");
    hashmap.put(2, "Two");
    
    Executable executable = () -> hashmap
      .forEach((key,value) -> hashmap.remove(1));
 
    assertThrows(ConcurrentModificationException.class, executable);
}

5. Which Implementation to Use?

5.使用哪种实施方式?

In general, both implementations have their respective pros and cons, however, it’s about understanding the underlying expectation and requirement which must govern our choice regarding the same.

一般来说,这两种实施方式都有各自的优点和缺点,然而,它是关于理解潜在的期望和要求,这必须指导我们关于相同的选择。

Summarizing:

归纳总结。

  • We should use a TreeMap if we want to keep our entries sorted
  • We should use a HashMap if we prioritize performance over memory consumption
  • Since a TreeMap has a more significant locality, we might consider it if we want to access objects that are relatively close to each other according to their natural ordering
  • HashMap can be tuned using the initialCapacity and loadFactor, which isn’t possible for the TreeMap
  • We can use the LinkedHashMap if we want to preserve insertion order while benefiting from constant time access

6. Conclusion

6.结论

In this article, we showed the differences and similarities between TreeMap and HashMap.

在这篇文章中,我们展示了TreeMapHashMap之间的异同。

As always, the code examples for this article are available over on GitHub.

像往常一样,本文的代码示例可在GitHub上获得