Finding the Differences Between Two Lists in Java

Last modified: August 12, 2020

1. Overview

Finding differences between collections of objects of the same data type is a common programming task. As an example, imagine we have a list of students who applied for an exam, and another list of students who passed it. The difference between those two lists would give us the students who didn’t pass the exam.

In Java, there’s no explicit way of finding the differences between two lists in the List API, though there are some helper methods that come close.

In this quick tutorial, we’ll learn how to find the differences between two lists. We’ll try a few different approaches, including plain Java (with and without Streams) and third-party libraries such as Guava and Apache Commons Collections.

2. Test Setup

Let’s start by defining two lists, which we’ll use to test out our examples:

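import java.util.Arrays;
import java.util.List;

public class FindDifferencesBetweenListsUnitTest {

    // "Jack" appears twice in listOne, which matters when handling duplicates
    private static final List<String> listOne = Arrays.asList("Jack", "Tom", "Sam", "John", "James", "Jack");
    private static final List<String> listTwo = Arrays.asList("Jack", "Daniel", "Sam", "Alan", "James", "George");

}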

3. Using the Java List API

We can create a copy of one list and then remove all the elements common with the other using the List method removeAll():

List<String> differences = new ArrayList<>(listOne);
differences.removeAll(listTwo);
assertEquals(2, differences.size());
assertThat(differences).containsExactly("Tom", "John");

Let’s reverse this to find the differences the other way around:

List<String> differences = new ArrayList<>(listTwo);
differences.removeAll(listOne);
assertEquals(3, differences.size());
assertThat(differences).containsExactly("Daniel", "Alan", "George");
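
The copy in the snippets above is more than a precaution. Arrays.asList returns a fixed-size list, so calling removeAll directly on listOne would be expected to fail with an UnsupportedOperationException as soon as it tries to remove a shared element. The following self-contained sketch shows both directions together with that failure mode. The class name ListApiDifferenceSketch and the helpers difference and directRemovalFails are illustrative names, not part of the original test class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ListApiDifferenceSketch {

    // Hypothetical helper: returns the elements of 'source' that are not in 'other',
    // leaving both arguments untouched.
    static <T> List<T> difference(List<T> source, List<T> other) {
        List<T> result = new ArrayList<>(source); // mutable copy
        result.removeAll(other);                  // drops every occurrence of a shared element
        return result;
    }

    // Returns true if removing an element directly from a fixed-size list fails.
    static boolean directRemovalFails() {
        List<String> fixedSize = Arrays.asList("Jack", "Tom");
        try {
            fixedSize.removeAll(Arrays.asList("Jack"));
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> listOne = Arrays.asList("Jack", "Tom", "Sam", "John", "James", "Jack");
        List<String> listTwo = Arrays.asList("Jack", "Daniel", "Sam", "Alan", "James", "George");

        System.out.println(difference(listOne, listTwo)); // [Tom, John]
        System.out.println(difference(listTwo, listOne)); // [Daniel, Alan, George]
        System.out.println(directRemovalFails());         // true
    }
}
```

Note that removeAll drops both occurrences of "Jack" from the first list, because it removes every element that the other list contains at least once.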

We should also note that if we want to find the common elements between the two lists, List also contains a retainAll method.
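
As a minimal sketch of the retainAll approach, the example below copies one list and keeps only the elements that the other list also contains. The class name CommonElementsSketch and the helper common are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CommonElementsSketch {

    // Hypothetical helper: returns the elements of 'source' that also occur in 'other'.
    static <T> List<T> common(List<T> source, List<T> other) {
        List<T> result = new ArrayList<>(source); // mutable copy, the inputs stay untouched
        result.retainAll(other);                  // keeps only elements contained in 'other'
        return result;
    }

    public static void main(String[] args) {
        List<String> listOne = Arrays.asList("Jack", "Tom", "Sam", "John", "James", "Jack");
        List<String> listTwo = Arrays.asList("Jack", "Daniel", "Sam", "Alan", "James", "George");

        System.out.println(common(listOne, listTwo)); // [Jack, Sam, James, Jack]
        System.out.println(common(listTwo, listOne)); // [Jack, Sam, James]
    }
}
```

The result keeps the order and the duplicates of the list that was copied, which is why "Jack" shows up twice in the first call.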

4. Using the Streams API

A Java Stream can be used for performing sequential operations on data from collections, which includes filtering the differences between lists:

List<String> differences = listOne.stream()
    .filter(element -> !listTwo.contains(element))
    .collect(Collectors.toList());
assertEquals(2, differences.size());
assertThat(differences).containsExactly("Tom", "John");

As in our first example, we can switch the order of the lists to find the elements that appear only in the second list:

List<String> differences = listTwo.stream()
    .filter(element -> !listOne.contains(element))
    .collect(Collectors.toList());
assertEquals(3, differences.size());
assertThat(differences).containsExactly("Daniel", "Alan", "George");

We should note that each call to List.contains() scans the list from the start, so calling it once per streamed element can become costly for larger lists.
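
One common way to reduce that cost is to copy the list being checked into a HashSet once and test membership against the set. The sketch below assumes the elements implement equals and hashCode consistently, as String does. The class name StreamDifferenceSketch and the helper difference are illustrative names only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class StreamDifferenceSketch {

    // Hypothetical helper: same result as filtering with other.contains(...),
    // but each membership check goes against a hash set instead of a list.
    static <T> List<T> difference(List<T> source, List<T> other) {
        Set<T> lookup = new HashSet<>(other); // built once, before streaming
        return source.stream()
            .filter(element -> !lookup.contains(element))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> listOne = Arrays.asList("Jack", "Tom", "Sam", "John", "James", "Jack");
        List<String> listTwo = Arrays.asList("Jack", "Daniel", "Sam", "Alan", "James", "George");

        System.out.println(difference(listOne, listTwo)); // [Tom, John]
        System.out.println(difference(listTwo, listOne)); // [Daniel, Alan, George]
    }
}
```

Only the lookup side is converted to a set, so the order of the streamed list is preserved in the result.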

5. Using Third-Party Libraries

5.1. Using Google Guava

Guava contains a handy Sets.difference method, but to use it, we first need to convert both of our lists to sets:

List<String> differences = new ArrayList<>(Sets.difference(Sets.newHashSet(listOne), Sets.newHashSet(listTwo)));
assertEquals(2, differences.size());
assertThat(differences).containsExactlyInAnyOrder("Tom", "John");

We should note that converting the List to a Set removes duplicate elements and, with a HashSet, no longer guarantees the original order. This is why the assertion above uses containsExactlyInAnyOrder.
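
To make the effect of the conversion concrete without Guava on the classpath, the sketch below uses plain java.util sets as a stand-in for Sets.difference. It is not Guava's implementation. It swaps in a LinkedHashSet so that the insertion order is kept while duplicates are still dropped. The class name SetDifferenceSketch and the helper difference are hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class SetDifferenceSketch {

    // Hypothetical helper: set difference that keeps the first-seen order of 'source'.
    static <T> Set<T> difference(Collection<T> source, Collection<T> other) {
        Set<T> result = new LinkedHashSet<>(source); // duplicates collapse here
        result.removeAll(new HashSet<>(other));
        return result;
    }

    public static void main(String[] args) {
        List<String> listOne = Arrays.asList("Jack", "Tom", "Sam", "John", "James", "Jack");
        List<String> listTwo = Arrays.asList("Jack", "Daniel", "Sam", "Alan", "James", "George");

        // Six list elements become five set elements: the second "Jack" is gone.
        System.out.println(new LinkedHashSet<>(listOne)); // [Jack, Tom, Sam, John, James]
        System.out.println(difference(listOne, listTwo)); // [Tom, John]
    }
}
```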

5.2. Using Apache Commons Collections

The CollectionUtils class from Apache Commons Collections contains a removeAll method.

This method does the same as List.removeAll, but returns the result as a new collection instead of modifying the original:

List<String> differences = new ArrayList<>(CollectionUtils.removeAll(listOne, listTwo));
assertEquals(2, differences.size());
assertThat(differences).containsExactly("Tom", "John");

6. Handling Duplicate Values

Now let’s look at finding the differences when two lists contain duplicated values.

To achieve this, we need to remove each element from the first list exactly as many times as it occurs in the second list.

In our example, the value “Jack” appears twice in the first list and only once in the second list, so one occurrence should remain in the result:

List<String> differences = new ArrayList<>(listOne);
listTwo.forEach(differences::remove);
assertThat(differences).containsExactly("Tom", "John", "Jack");

We can also achieve this using the subtract method from Apache Commons Collections:

List<String> differences = new ArrayList<>(CollectionUtils.subtract(listOne, listTwo));
assertEquals(3, differences.size());
assertThat(differences).containsExactly("Tom", "John", "Jack");
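
For larger inputs, the same duplicate-aware subtraction can be sketched in plain Java with a map of occurrence counts, which avoids scanning the working list once per removed element. This is an alternative technique, not the method used by the snippets above or by the library. The class name DuplicateAwareDifferenceSketch and the helper subtract are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DuplicateAwareDifferenceSketch {

    // Hypothetical helper: removes each element of 'other' from 'source'
    // exactly as many times as it occurs in 'other', keeping the order of 'source'.
    static <T> List<T> subtract(List<T> source, List<T> other) {
        // How many occurrences of each element are still left to cancel out.
        Map<T, Integer> remaining = new HashMap<>();
        for (T element : other) {
            remaining.merge(element, 1, Integer::sum);
        }

        List<T> result = new ArrayList<>();
        for (T element : source) {
            int count = remaining.getOrDefault(element, 0);
            if (count > 0) {
                remaining.put(element, count - 1); // cancel one occurrence
            } else {
                result.add(element);               // nothing left to cancel, keep it
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> listOne = Arrays.asList("Jack", "Tom", "Sam", "John", "James", "Jack");
        List<String> listTwo = Arrays.asList("Jack", "Daniel", "Sam", "Alan", "James", "George");

        System.out.println(subtract(listOne, listTwo)); // [Tom, John, Jack]
        System.out.println(subtract(listTwo, listOne)); // [Daniel, Alan, George]
    }
}
```

The first "Jack" cancels against the single "Jack" in the second list, so only the later occurrence survives, which matches the order asserted above.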

7. Conclusion

In this article, we explored a few ways to find the differences between lists. We covered a basic Java solution, a solution using the Streams API, and solutions using third-party libraries, like Google Guava and Apache Commons Collections.

We also discussed how to handle duplicate values.

As always, the complete source code is available over on GitHub.
