Intersection Between two Integer Arrays – 两个整数数组之间的交集

最后修改: 2018年 11月 11日

中文/混合/英文(键盘快捷键:t)

1. Overview

1.概述

In this quick tutorial, we’ll have a look at how to compute the intersection between two Integer arrays ‘a’ and ‘b’.

在这个快速教程中,我们将看看如何计算两个整数数组的交集‘a’‘b’

We’ll also focus on how to handle duplicate entries.

我们还将重点讨论如何处理重复的条目。

For the implementation, we’ll use Streams.

对于实现,我们将使用Streams.

2. Membership Predicate for an Array

2.数组的成员谓词

The intersection of two sets is by definition a set with all values from one, which are also part of the second set.

根据定义,两个集合的交集是一个集合,其中一个集合的所有数值也是第二个集合的一部分。

Therefore we need a Function or rather a Predicate to decide the membership in the second array. Since List provides such a method out of the box, we’ll transform this to a List:

因此,我们需要一个Function或者说一个Predicate来决定第二个数组中的成员。由于List提供了这样的方法,我们将其转换为List

Predicate isContainedInB = Arrays.asList(b)::contains;

3. Building the Intersection

3.建立交叉点

To build up the resulting array, we’ll consider the elements of the first set sequentially and verify if they’re also contained in the second array. Then we’ll create a new array based on this.

为了建立所产生的数组,我们将依次考虑第一组的元素,并验证它们是否也包含在第二个数组中。然后我们将在此基础上创建一个新的数组。

The Stream API provides us with the needed methods. First, we’ll create a Stream, then filter with the membership-Predicate and finally we’ll create a new array:

StreamAPI为我们提供了所需的方法。首先,我们将创建一个Stream,然后用成员-Predicate过滤,最后我们将创建一个新的数组:

public static Integer[] intersectionSimple(Integer[] a, Integer[] b){
    return Stream.of(a)
      .filter(Arrays.asList(b)::contains)
      .toArray(Integer[]::new);
}

4. Duplicate Entries

4.重复的条目

Since arrays in Java are no Set implementation, we face the issue of duplicate entries in the input and then in the result. Notice that the number of occurrences in the result depends on the occurrences in the first parameter.

由于Java中的数组没有Set实现,我们面临着输入和结果中的重复条目问题。注意,结果中的出现次数取决于第一个参数中的出现次数。

But for sets, elements must not occur multiple times. We can archive this by using the distinct() method:

但是对于集合来说,元素不能多次出现。我们可以通过使用distinct()方法来存档:

public static Integer[] intersectionSet(Integer[] a, Integer[] b){
    return Stream.of(a)
      .filter(Arrays.asList(b)::contain)
      .distinct()
      .toArray(Integer[]::new);
}

So the length of the intersection no longer depends on the parameter order.

因此,相交的长度不再取决于参数的顺序。

However, the intersection of an array with itself may not be the array again since we remove double entries.

然而,数组与自身的交集可能不再是数组了,因为我们删除了双项。

5. Multiset Intersection

5.多组交集

A more general notion, which allows multiple equal entries, are multisets. For them, the intersection is then defined by the minimal number of input occurrences. So our membership-Predicate must keep score how often we add an element to the result.

一个更普遍的概念是多集,它允许多个相等的条目。对于它们来说,相交是由最小的输入出现次数来定义的。所以我们的成员-谓词必须记下我们在结果中添加一个元素的频率。

The remove() method can be used for this, which returns the membership and consumes the elements. So after all equal elements in ‘b’ are consumed, no more equal elements are added to the result:

remove()方法可用于此,它返回成员资格并消耗元素。因此,在‘b’中的所有相等元素被消耗后,没有更多的相等元素被添加到结果中。

public static Integer[] intersectionSet(Integer[] a, Integer[] b){
    return Stream.of(a)
      .filter(new LinkedList<>(Arrays.asList(b))::remove)
      .toArray(Integer[]::new);
}

Since the Arrays API only returns an immutable List, we have to generate a dedicate mutable one.

由于Arrays API只返回一个不可变的List,我们必须生成一个专用的可变的。

6. Conclusion

6.结语

In this article, we’ve seen how to use the contains and remove methods to implement an intersection for two arrays in Java.

在这篇文章中,我们已经看到了如何使用containsremove方法来实现Java中两个数组的相交。

All the implementation, code snippets, and tests can be found in our GitHub repository – this is a Maven-based project, so it should be easy to import and run as it is.

所有的实现、代码片段和测试都可以在我们的GitHub仓库中找到–这是一个基于Maven的项目,所以应该很容易导入并按原样运行。