Filter Java Stream to 1 and Only 1 Element

Last modified: July 29, 2022

1. Overview

In this article, we’ll use two methods from Collectors to retrieve the unique element which matches a certain predicate in a given stream of elements.

For both approaches, we’ll define two methods according to the following standard:

  • the get method expects to have a unique result. Otherwise, it throws an Exception
  • the find method accepts that the result can be missing and returns an Optional with the value if it exists

2. Retrieve the Unique Result Using Reduction

Collectors.reducing performs a reduction of its input elements by applying a function specified as a BinaryOperator. Its single-argument overload returns the result as an Optional, which is exactly what our find method needs.

In our case, if there are two or more elements after filtering, we just need to discard the result. We do this by having the operator return null, which leaves the resulting Optional empty:

public static <T> Optional<T> findUniqueElementMatchingPredicate_WithReduction(Stream<T> elements, Predicate<T> predicate) {
    return elements.filter(predicate)
      .collect(Collectors.reducing((a, b) -> null));
}
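
As a quick illustration, here is a minimal sketch of how this find method behaves for one, zero, and several matches. The class name FindWithReductionExample and the short method name findUnique are hypothetical and only used for this example; the body is the same as above.

```java
import java.util.Optional;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class FindWithReductionExample {

    // Same logic as the find method above, under a shorter, hypothetical name.
    static <T> Optional<T> findUnique(Stream<T> elements, Predicate<T> predicate) {
        return elements.filter(predicate)
          .collect(Collectors.reducing((a, b) -> null));
    }

    public static void main(String[] args) {
        // Exactly one match: the Optional holds it.
        System.out.println(findUnique(Stream.of(1, 2, 3), i -> i == 2));
        // No match: the Optional is empty.
        System.out.println(findUnique(Stream.of(1, 2, 3), i -> i > 5));
        // Several matches: the operator returns null, so the Optional is empty too.
        System.out.println(findUnique(Stream.of(1, 2, 3), i -> i > 1));
    }
}
```

Note that the caller can't tell "no match" apart from "too many matches" here: both end up as an empty Optional.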

To write the get method, we’ll need to make the following changes:

  • if we detect two elements, we throw an exception directly instead of returning null
  • at the end, we unwrap the Optional and also throw if it’s empty

Furthermore, in this case, we can directly apply the reducing operation on the Stream:

public static <T> T getUniqueElementMatchingPredicate_WithReduction(Stream<T> elements, Predicate<T> predicate) {
    return elements.filter(predicate)
      .reduce((a, b) -> {
          throw new IllegalStateException("Too many elements match the predicate");
      })
      .orElseThrow(() -> new IllegalStateException("No element matches the predicate"));
}
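
To make the three outcomes concrete, here is a small sketch that calls this get method and reports either the value or the exception message. The class name GetWithReductionExample, the short name getUnique, and the describe helper are hypothetical and exist only for this illustration.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;
import java.util.stream.Stream;

public class GetWithReductionExample {

    // Same logic as the get method above, under a shorter, hypothetical name.
    static <T> T getUnique(Stream<T> elements, Predicate<T> predicate) {
        return elements.filter(predicate)
          .reduce((a, b) -> {
              throw new IllegalStateException("Too many elements match the predicate");
          })
          .orElseThrow(() -> new IllegalStateException("No element matches the predicate"));
    }

    // Hypothetical helper: returns the value as text, or the exception message.
    static String describe(Supplier<Object> call) {
        try {
            return String.valueOf(call.get());
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Exactly one match: the value is returned.
        System.out.println(describe(() -> getUnique(Stream.of(1, 2, 3), i -> i == 2)));
        // No match: the Optional is empty, so orElseThrow throws.
        System.out.println(describe(() -> getUnique(Stream.of(1, 2, 3), i -> i > 5)));
        // Several matches: the operator throws on the second matching element.
        System.out.println(describe(() -> getUnique(Stream.of(1, 2, 3), i -> i > 1)));
    }
}
```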

3. Retrieve the Unique Result Using Collectors.collectingAndThen

Collectors.collectingAndThen applies a finishing function to the result of a collecting operation. Here, we’ll collect the filtered elements into a List and apply the function to that List.

Hence, to define the find method, we’ll need to take the List and:

  • if the List has zero or more than one element, return null
  • if the List has exactly one element, return it

Here is the code for this operation:

private static <T> T findUniqueElement(List<T> elements) {
    if (elements.size() == 1) {
        return elements.get(0);
    }
    return null;
}

As a result, the find method reads:

public static <T> Optional<T> findUniqueElementMatchingPredicate_WithCollectingAndThen(Stream<T> elements, Predicate<T> predicate) {
    return elements.filter(predicate)
      .collect(Collectors.collectingAndThen(Collectors.toList(), list -> Optional.ofNullable(findUniqueElement(list))));
}

In order to adapt our private method for the get case, we’ll need to throw if the number of retrieved elements is not exactly 1. Let’s be precise and distinguish the cases where there is no result and too many results, as we did with reduction:

private static <T> T getUniqueElement(List<T> elements) {
    if (elements.size() > 1) {
        throw new IllegalStateException("Too many elements match the predicate");
    } else if (elements.size() == 0) {
        throw new IllegalStateException("No element matches the predicate");
    }
    return elements.get(0);
}

In the end, given that we named our class FilterUtils, we can write the get method:

public static <T> T getUniqueElementMatchingPredicate_WithCollectingAndThen(Stream<T> elements, Predicate<T> predicate) {
    return elements.filter(predicate)
      .collect(Collectors.collectingAndThen(Collectors.toList(), FilterUtils::getUniqueElement));
}
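
The following self-contained sketch puts the two collectingAndThen variants side by side. The class name CollectingAndThenExample and the short names findUnique and getUnique are hypothetical; the helper bodies mirror the ones defined above.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CollectingAndThenExample {

    // Returns the single element, or null if there are zero or several.
    static <T> T findUniqueElement(List<T> elements) {
        return elements.size() == 1 ? elements.get(0) : null;
    }

    // Returns the single element, or throws if there are zero or several.
    static <T> T getUniqueElement(List<T> elements) {
        if (elements.size() > 1) {
            throw new IllegalStateException("Too many elements match the predicate");
        } else if (elements.isEmpty()) {
            throw new IllegalStateException("No element matches the predicate");
        }
        return elements.get(0);
    }

    static <T> Optional<T> findUnique(Stream<T> elements, Predicate<T> predicate) {
        return elements.filter(predicate)
          .collect(Collectors.collectingAndThen(Collectors.toList(),
            list -> Optional.ofNullable(findUniqueElement(list))));
    }

    static <T> T getUnique(Stream<T> elements, Predicate<T> predicate) {
        return elements.filter(predicate)
          .collect(Collectors.collectingAndThen(Collectors.toList(),
            CollectingAndThenExample::getUniqueElement));
    }

    public static void main(String[] args) {
        System.out.println(findUnique(Stream.of(1, 2, 3), i -> i == 2)); // one match
        System.out.println(findUnique(Stream.of(1, 2, 3), i -> i > 1));  // several matches
        try {
            getUnique(Stream.of(1, 2, 3), i -> i > 1);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In both variants, every matching element is first gathered into the List, and the size check only happens once the whole Stream has been consumed.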

4. Performance Benchmark

Let’s use JMH to run a quick performance comparison between the different methods.

First, let’s apply our methods to the integers from 1 to 999,999. In this case, the Predicate matches exactly one element. Let’s have a look at the definition of the benchmark:

@State(Scope.Benchmark)
public static class MyState {
    // Built once, so that every benchmark invocation can open a fresh Stream on it
    final List<Integer> INTEGERS = IntStream.range(1, 1000000).boxed().collect(Collectors.toList());

    final Predicate<Integer> PREDICATE = i -> i == 751879;
}

@Benchmark
public void evaluateFindUniqueElementMatchingPredicate_WithReduction(Blackhole blackhole, MyState state) {
    blackhole.consume(FilterUtils.findUniqueElementMatchingPredicate_WithReduction(state.INTEGERS.stream(), state.PREDICATE));
}

@Benchmark
public void evaluateFindUniqueElementMatchingPredicate_WithCollectingAndThen(Blackhole blackhole, MyState state) {
    blackhole.consume(FilterUtils.findUniqueElementMatchingPredicate_WithCollectingAndThen(state.INTEGERS.stream(), state.PREDICATE));
}

@Benchmark
public void evaluateGetUniqueElementMatchingPredicate_WithReduction(Blackhole blackhole, MyState state) {
    try {
        FilterUtils.getUniqueElementMatchingPredicate_WithReduction(state.INTEGERS.stream(), state.PREDICATE);
    } catch (IllegalStateException exception) {
        blackhole.consume(exception);
    }
}

@Benchmark
public void evaluateGetUniqueElementMatchingPredicate_WithCollectingAndThen(Blackhole blackhole, MyState state) {
    try {
        FilterUtils.getUniqueElementMatchingPredicate_WithCollectingAndThen(state.INTEGERS.stream(), state.PREDICATE);
    } catch (IllegalStateException exception) {
        blackhole.consume(exception);
    }
}

Let’s run it. We’re measuring the number of operations per second. The higher, the better:

Benchmark                                                                          Mode  Cnt    Score    Error  Units
BenchmarkRunner.evaluateFindUniqueElementMatchingPredicate_WithCollectingAndThen  thrpt   25  140.581 ± 28.793  ops/s
BenchmarkRunner.evaluateFindUniqueElementMatchingPredicate_WithReduction          thrpt   25  100.171 ± 36.796  ops/s
BenchmarkRunner.evaluateGetUniqueElementMatchingPredicate_WithCollectingAndThen   thrpt   25  145.568 ±  5.333  ops/s
BenchmarkRunner.evaluateGetUniqueElementMatchingPredicate_WithReduction           thrpt   25  144.616 ± 12.917  ops/s

As we can see, in this case, the different methods perform very similarly.

Let’s change our Predicate to check if an element of the Stream is equal to 0. This condition is false for all elements of the List. We can now run the benchmark again:

Benchmark                                                                          Mode  Cnt    Score    Error  Units
BenchmarkRunner.evaluateFindUniqueElementMatchingPredicate_WithCollectingAndThen  thrpt   25  165.751 ± 19.816  ops/s
BenchmarkRunner.evaluateFindUniqueElementMatchingPredicate_WithReduction          thrpt   25  174.667 ± 20.909  ops/s
BenchmarkRunner.evaluateGetUniqueElementMatchingPredicate_WithCollectingAndThen   thrpt   25  188.293 ± 18.348  ops/s
BenchmarkRunner.evaluateGetUniqueElementMatchingPredicate_WithReduction           thrpt   25  196.689 ±  4.155  ops/s

Here again, the results are quite balanced.

Lastly, let’s check out what happens if we use a Predicate that returns true for values greater than 751879: a large number of elements of the List (248,120 of them) match this Predicate. This leads to the following benchmark:

Benchmark                                                                          Mode  Cnt    Score    Error  Units
BenchmarkRunner.evaluateFindUniqueElementMatchingPredicate_WithCollectingAndThen  thrpt   25   70.879 ±  6.205  ops/s
BenchmarkRunner.evaluateFindUniqueElementMatchingPredicate_WithReduction          thrpt   25  210.142 ± 23.680  ops/s
BenchmarkRunner.evaluateGetUniqueElementMatchingPredicate_WithCollectingAndThen   thrpt   25   83.927 ±  1.812  ops/s
BenchmarkRunner.evaluateGetUniqueElementMatchingPredicate_WithReduction           thrpt   25  252.881 ±  2.710  ops/s

As we can see, the variants with reduction are more efficient. Moreover, using reduce directly on the filtered Stream shines because the Exception is thrown straight after two matching values have been found.
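
The early exit can be observed with a small sketch that counts how many matching elements each approach visits. It assumes a sequential Stream and uses a smaller range than the benchmark; the class name EarlyExitExample and both method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class EarlyExitExample {

    // Counts the matching elements visited before the throwing reduce aborts.
    static int visitedByReduce() {
        AtomicInteger visited = new AtomicInteger();
        try {
            IntStream.range(1, 1000).boxed()
              .filter(i -> i > 500)
              .peek(i -> visited.incrementAndGet())
              .reduce((a, b) -> {
                  throw new IllegalStateException("Too many elements match the predicate");
              });
        } catch (IllegalStateException expected) {
            // The exception stops the traversal of the Stream.
        }
        return visited.get();
    }

    // Counts the matching elements visited when they are all collected into a List.
    static int visitedByToList() {
        AtomicInteger visited = new AtomicInteger();
        IntStream.range(1, 1000).boxed()
          .filter(i -> i > 500)
          .peek(i -> visited.incrementAndGet())
          .collect(Collectors.toList());
        return visited.get();
    }

    public static void main(String[] args) {
        System.out.println(visitedByReduce()); // 2
        System.out.println(visitedByToList()); // 499
    }
}
```

The reduce variant stops at the second matching element, while collecting into a List visits all 499 matches before any size check can run.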

In a nutshell, if performance matters:

  • Using reduction should be favored
  • If we expect a lot of potential matching values to be found, the get method that reduces the Stream is much faster

5. Conclusion

In this tutorial, we saw different methods to retrieve a unique result after filtering a Stream, then compared their efficiency.

As always, the code is available over on GitHub.
