DistinctBy in the Java Stream API

Last modified: August 22, 2017

1. Overview

Finding the distinct elements in a list is a common task for programmers. Since Java 8, the Stream API has let us process data in a functional style.

In this article, we'll show several ways to filter a collection so that it keeps only the elements that are distinct by a particular attribute.

2. Using the Stream API

The Stream API provides the distinct() method, which returns the distinct elements of a stream according to their equals() method.

However, distinct() is less flexible when we want to filter by a specific attribute. One alternative is to write a filter that maintains state.
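
To see why distinct() alone is not enough, here is a minimal sketch. The class names and sample data are assumptions made for illustration, and the Person class here deliberately does not override equals() or hashCode().

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class DistinctEqualsDemo {

    // Hypothetical minimal class; it does NOT override equals() or hashCode()
    static class Person {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Strings compare by value, so the duplicate "Ann" is dropped
    static List<String> distinctNames() {
        return Arrays.asList("Ann", "Bob", "Ann").stream()
          .distinct()
          .collect(Collectors.toList());
    }

    // Person inherits identity-based equals() from Object, so distinct()
    // keeps all three instances even though two of them share a name
    static List<Person> distinctPeople() {
        List<Person> people = Arrays.asList(
          new Person("Ann", 30), new Person("Bob", 25), new Person("Ann", 41));
        return people.stream()
          .distinct()
          .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(distinctNames());         // [Ann, Bob]
        System.out.println(distinctPeople().size()); // 3
    }
}
```

Overriding equals() and hashCode() to compare by name would change the second result, but it would also fix a single notion of equality for the whole class, which is the inflexibility described above.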

2.1. Using a Stateful Filter

One of the possible solutions would be to implement a stateful Predicate:

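// requires java.util.Map, java.util.concurrent.ConcurrentHashMap,
// java.util.function.Function and java.util.function.Predicate
public static <T> Predicate<T> distinctByKey(
    Function<? super T, ?> keyExtractor) {

    // keys seen so far; note that ConcurrentHashMap rejects null keys
    Map<Object, Boolean> seen = new ConcurrentHashMap<>();
    // putIfAbsent returns null only the first time a key is added
    return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
}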
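
Because the returned predicate remembers every key it has seen, it should be created afresh for each stream pipeline. The following sketch illustrates this; the class name, the word list, and the choice of String::length as the key are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class StatefulPredicateDemo {

    // Same stateful predicate as in the article
    static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
        Map<Object, Boolean> seen = new ConcurrentHashMap<>();
        return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
    }

    // Illustrative data: word lengths are 3, 3, 5, 4, 4
    static final List<String> WORDS =
      Arrays.asList("one", "two", "three", "four", "five");

    // A fresh predicate per pipeline: the first word of each length is kept
    static List<String> freshPredicate() {
        return WORDS.stream()
          .filter(distinctByKey(String::length))
          .collect(Collectors.toList());
    }

    // One predicate shared by two pipelines: by the second pass every key
    // is already in the map, so nothing gets through
    static List<String> reusedPredicate() {
        Predicate<String> byLength = distinctByKey(String::length);
        WORDS.stream().filter(byLength).collect(Collectors.toList());
        return WORDS.stream().filter(byLength).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(freshPredicate());  // [one, three, four]
        System.out.println(reusedPredicate()); // []
    }
}
```

On a sequential, ordered stream the first element seen for each key is the one kept. On a parallel stream the filter stays thread-safe thanks to ConcurrentHashMap, but which element wins for a given key is not guaranteed.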

To test it, we’ll use the following Person class that has the attributes age, email, and name:

public class Person { 
    private int age; 
    private String name; 
    private String email; 
    // standard getters and setters 
}

To get a new list filtered by name, we can use:

List<Person> personListFiltered = personList.stream() 
  .filter(distinctByKey(p -> p.getName())) 
  .collect(Collectors.toList());
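
Putting the pieces together, a self-contained version might look like the sketch below. The constructor, the sample people, the example.com addresses, and the class name DistinctByNameDemo are assumptions added so the example can run on its own.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class DistinctByNameDemo {

    // Trimmed-down Person; the constructor is an assumption for brevity
    static class Person {
        private final int age;
        private final String name;
        private final String email;

        Person(int age, String name, String email) {
            this.age = age;
            this.name = name;
            this.email = email;
        }

        int getAge() { return age; }
        String getName() { return name; }
        String getEmail() { return email; }
    }

    static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
        Map<Object, Boolean> seen = new ConcurrentHashMap<>();
        return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
    }

    // Keeps the first Person encountered for each distinct name
    static List<Person> filterByName() {
        List<Person> personList = Arrays.asList(
          new Person(30, "Ann", "ann@example.com"),
          new Person(25, "Bob", "bob@example.com"),
          new Person(41, "Ann", "ann.other@example.com"));

        return personList.stream()
          .filter(distinctByKey(p -> p.getName()))
          .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        for (Person p : filterByName()) {
            System.out.println(p.getName() + " " + p.getAge());
        }
        // prints "Ann 30" then "Bob 25"
    }
}
```

The second "Ann" is dropped because her name was already recorded when the first one passed through the filter.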

3. Using Eclipse Collections

Eclipse Collections is a library that provides additional methods for processing Streams and collections in Java.

3.1. Using ListIterate.distinct()

The ListIterate.distinct() method allows us to filter a List using various HashingStrategies. These strategies can be defined using lambda expressions or method references.

If we want to filter by the Person’s name:

List<Person> personListFiltered = ListIterate
  .distinct(personList, HashingStrategies.fromFunction(Person::getName));

Or, if the attribute we are going to use is primitive (int, long, double), we can use a specialized function like this:

List<Person> personListFiltered = ListIterate.distinct(
  personList, HashingStrategies.fromIntFunction(Person::getAge));

3.2. Maven Dependency

We need to add the following dependency to our pom.xml to use Eclipse Collections in our project:

<dependency> 
    <groupId>org.eclipse.collections</groupId> 
    <artifactId>eclipse-collections</artifactId> 
    <version>8.2.0</version> 
</dependency>

You can find the latest version of the Eclipse Collections library in the Maven Central repository.

To learn more about this library we can go to this article.

4. Using Vavr (Javaslang)

Vavr, formerly known as Javaslang, is a functional library for Java 8 that provides immutable data types and functional control structures.

4.1. Using List.distinctBy

To filter lists, this library provides its own List class, which has a distinctBy() method that allows us to filter by an attribute of the objects it contains:

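// Vavr's List shares its simple name with java.util.List,
// so the two types are fully qualified here to avoid a clash
java.util.List<Person> personListFiltered = io.vavr.collection.List.ofAll(personList)
  .distinctBy(Person::getName)
  .toJavaList();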

4.2. Maven Dependency

We'll add the following dependency to our pom.xml to use Vavr in our project:

<dependency> 
    <groupId>io.vavr</groupId> 
    <artifactId>vavr</artifactId> 
    <version>0.9.0</version>  
</dependency>

You can find the latest version of the Vavr library in the Maven Central repository.

To learn more about this library we can go to this article.

5. Using StreamEx

This library provides useful classes and methods for Java 8 streams processing.

5.1. Using StreamEx.distinct

Among the classes it provides is StreamEx, whose distinct() method accepts a key extractor, such as a reference to the getter of the attribute we want the elements to be distinct by:

List<Person> personListFiltered = StreamEx.of(personList)
  .distinct(Person::getName)
  .toList();

5.2. Maven Dependency

We'll add the following dependency to our pom.xml to use StreamEx in our project:

<dependency> 
    <groupId>one.util</groupId> 
    <artifactId>streamex</artifactId> 
    <version>0.6.5</version> 
</dependency>

You can find the latest version of the StreamEx library in the Maven Central repository.

6. Conclusion

In this quick tutorial, we explored how to get the distinct elements of a Stream based on an attribute, using the standard Java 8 API as well as alternatives from other libraries.

As always, the complete code is available over on GitHub.
