Returning Stream vs. Collection

Last modified: March 26, 2021

1. Overview

The Java 8 Stream API offers an efficient alternative to Java Collections for rendering or processing a result set. However, deciding which one to use in a given situation is a common dilemma.

In this article, we’ll explore Stream and Collection and discuss various scenarios that suit their respective uses.

2. Collection vs. Stream

Java Collections offer efficient mechanisms to store and process data by providing data structures like List, Set, and Map.

The Stream API, by contrast, performs operations on data without needing intermediate storage. A Stream therefore works much like reading elements directly from the underlying source, such as a collection or an I/O resource.

Additionally, collections are primarily concerned with providing access to data and ways to modify it, whereas streams are concerned with transmitting data efficiently.

Although Java allows easy conversion from Collection to Stream and vice-versa, it’s handy to know which is the best possible mechanism to render/process a result set.

For instance, we can convert a Collection into a Stream using the stream and parallelStream methods:

public static Stream<String> userNames() {
    ArrayList<String> userNameSource = new ArrayList<>();
    userNameSource.add("john");
    userNameSource.add("smith");
    userNameSource.add("tom");
    // Expose the names collected above as a sequential Stream
    return userNameSource.stream();
}
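
The parallelStream method mentioned above is used the same way from the caller's side. As a minimal sketch (the class ParallelUserNames and the helper upperCaseNames are hypothetical names chosen for illustration, not part of the article's code), it might look like this:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ParallelUserNames {

    // Hypothetical helper: upper-cases every name using a parallel stream.
    static List<String> upperCaseNames(List<String> names) {
        return names.parallelStream()
            .map(String::toUpperCase)
            // Collecting to a List keeps the encounter order of the source list
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(upperCaseNames(Arrays.asList("john", "smith", "tom")));
    }
}
```

For a source this small, a parallel stream is unlikely to be faster than a sequential one; the sketch only shows that the conversion itself is a one-method change.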

Similarly, we can convert a Stream into a Collection using the collect method of the Stream API:

public List<String> userNameList() {
    return userNames().collect(Collectors.toList());
}

Here, we’ve converted a Stream into a List using the Collectors.toList() method. Similarly, we can convert a Stream into a Set or into a Map:

public static Set<String> userNameSet() {
    return userNames().collect(Collectors.toSet());
}

public static Map<String, String> userNameMap() {
    // Each name serves as both the key and the value
    return userNames().collect(Collectors.toMap(u1 -> u1, u1 -> u1));
}

3. When to Return a Stream?

3.1. High Materialization Cost

The Stream API offers lazy execution and on-the-fly filtering of results, two effective ways to lower the materialization cost.

For instance, the readAllLines method in the Java NIO Files class reads all the lines of a file into a List, so the JVM has to hold the entire file contents in memory. This method therefore has a high materialization cost when returning the list of lines.

However, the Files class also provides the lines method, which returns a Stream. We can use it to read all the lines or, better still, restrict the size of the result set with the limit method, both with lazy execution:

Files.lines(path).limit(10).collect(Collectors.toList());
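
Because the Stream returned by Files.lines keeps the file open, the JDK documentation recommends closing it with try-with-resources. The following is a minimal sketch of that pattern; the class FirstLines, the helpers firstLines and demo, and the temporary file contents are assumptions made for illustration:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FirstLines {

    // Hypothetical helper: returns at most maxLines lines of the file.
    static List<String> firstLines(Path path, long maxLines) {
        // try-with-resources closes the underlying file when the block exits
        try (Stream<String> lines = Files.lines(path)) {
            return lines.limit(maxLines).collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes a small temporary file and reads back only its first two lines.
    static List<String> demo() {
        try {
            Path tmp = Files.createTempFile("user-names", ".txt");
            try {
                Files.write(tmp, Arrays.asList("john", "smith", "tom"));
                return firstLines(tmp, 2);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```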

Also, a Stream doesn’t perform the intermediate operations until we invoke terminal operations like forEach over it:

userNames().filter(i -> i.length() >= 4).forEach(System.out::println);

Therefore, a Stream avoids the costs associated with premature materialization.
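
To make the laziness visible, we can count how often the filter predicate runs before and after a terminal operation. This is only a sketch; the class LazyStreamDemo and the method countFilterCalls are hypothetical names:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyStreamDemo {

    // Returns {filter calls before the terminal operation, filter calls after it}.
    static int[] countFilterCalls() {
        AtomicInteger calls = new AtomicInteger();

        // Building the pipeline does not evaluate the predicate
        Stream<String> pipeline = Stream.of("john", "smith", "tom")
            .filter(name -> {
                calls.incrementAndGet();
                return name.length() >= 4;
            });
        int before = calls.get();

        // The terminal operation pulls every element through the filter
        List<String> matches = pipeline.collect(Collectors.toList());
        int after = calls.get();

        System.out.println(matches + " after " + after + " predicate calls");
        return new int[] {before, after};
    }

    public static void main(String[] args) {
        countFilterCalls();
    }
}
```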

3.2. Large or Infinite Result

Streams evaluate their elements lazily, so they can handle large or even infinite results without materializing them in full. A Stream is therefore usually the better choice for such use cases.

Also, in the case of infinite results, we usually don't process the entire result set. So, the Stream API's built-in operations like filter and limit come in handy for processing only the desired part of the result set, making the Stream a preferable choice.
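
As an illustration of filter and limit on an infinite source, here is a short sketch built on Stream.iterate. The class InfiniteStreamDemo and the method firstEvenSquares are hypothetical and not taken from the article's code:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class InfiniteStreamDemo {

    // Hypothetical helper: the first howMany even squares of 1, 2, 3, ...
    static List<Integer> firstEvenSquares(int howMany) {
        return Stream.iterate(1, n -> n + 1)   // infinite: 1, 2, 3, ...
            .map(n -> n * n)                   // 1, 4, 9, 16, ...
            .filter(square -> square % 2 == 0) // keep the even squares
            .limit(howMany)                    // stop once enough elements were found
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstEvenSquares(3));
    }
}
```

Without the limit call, the terminal operation would never finish, which is why a Collection cannot stand in for this kind of result.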

3.3. Flexibility

Streams are flexible: they let the consumer process the results in any form or order.

A Stream is a natural choice when we don't want to impose a fixed result set on the consumer and would rather let the consumer decide how to shape the results.

For instance, we can filter/order/limit the results using various operations available on the Stream API:

public static Stream<String> filterUserNames() {
    return userNames().filter(i -> i.length() >= 4);
}

public static Stream<String> sortUserNames() {
    return userNames().sorted();
}

public static Stream<String> limitUserNames() {
    return userNames().limit(3);
}
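
The same flexibility is available to the caller. In the sketch below, two consumers shape the same returned Stream differently; the class UserNameConsumers and its methods are hypothetical, and userNames is a stand-in for the method defined earlier:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class UserNameConsumers {

    // Stand-in for the article's userNames() method
    static Stream<String> userNames() {
        return Stream.of("john", "smith", "tom");
    }

    // One consumer wants only the longer names, sorted
    static List<String> longNamesSorted() {
        return userNames()
            .filter(name -> name.length() >= 4)
            .sorted()
            .collect(Collectors.toList());
    }

    // Another consumer wants a single comma-separated string
    static String joinedNames() {
        return userNames().collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        System.out.println(longNamesSorted());
        System.out.println(joinedNames());
    }
}
```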

3.4. Functional Behavior

A Stream is functional: its operations don't modify the source, however the stream is processed. Therefore, it's a preferred choice for exposing a result set that the consumer shouldn't alter.

For instance, let’s filter and limit a set of results received from the primary Stream:

userNames().filter(i -> i.length() >= 4).limit(3).forEach(System.out::println);

Here, operations like filter and limit on the Stream return a new Stream every time and don’t modify the source Stream provided by the userNames method.
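
A quick way to check this behavior is to compare the backing list before and after running a pipeline over it. This is a sketch under assumed names (SourceUnchangedDemo and run are hypothetical):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class SourceUnchangedDemo {

    // Returns the source list and the filtered result, in that order.
    static List<List<String>> run() {
        List<String> source = new ArrayList<>(Arrays.asList("john", "smith", "tom"));

        // filter and limit produce new streams; the list itself is only read
        List<String> filtered = source.stream()
            .filter(name -> name.length() >= 4)
            .limit(3)
            .collect(Collectors.toList());

        return Arrays.asList(source, filtered);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```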

4. When to Return a Collection?

4.1. Low Materialization Cost

We can choose collections over streams when rendering or processing results that involve a low materialization cost.

Java constructs a Collection eagerly, computing all of its elements up front. Hence, a Collection holding a large result set puts a lot of pressure on heap memory when it's materialized.

Therefore, we should consider a Collection for result sets whose materialization doesn't put much pressure on heap memory.

4.2. Fixed Format

We can use a Collection to enforce a consistent result set for the user. For instance, Collections like TreeSet and TreeMap return naturally ordered results.

In other words, with the use of the Collection, we can ensure each consumer receives and processes the same result set in identical order.
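
For example, returning a TreeSet fixes the iteration order for every consumer. A minimal sketch, assuming a hypothetical class SortedResultDemo and method sortedUserNames:

```java
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SortedResultDemo {

    // Hypothetical helper: the names arrive unsorted, the TreeSet orders them.
    static SortedSet<String> sortedUserNames() {
        return Stream.of("tom", "john", "smith")
            .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        // Every traversal sees the same natural (alphabetical) order
        sortedUserNames().forEach(System.out::println);
    }
}
```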

4.3. Reusable Result

When a result is returned in the form of a Collection, it can be easily traversed multiple times. However, a Stream is considered consumed once traversed and throws IllegalStateException when reused:

public static void tryStreamTraversal() {
    Stream<String> userNameStream = userNames();
    userNameStream.forEach(System.out::println);
    
    try {
        userNameStream.forEach(System.out::println);
    } catch(IllegalStateException e) {
        System.out.println("stream has already been operated upon or closed");
    }
}

Therefore, returning a Collection is a better choice when it’s obvious that a consumer will traverse the result multiple times.
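
By contrast, the following sketch traverses the same List twice without any error. The class ReusableResultDemo and the method summarize are hypothetical names used only for this illustration:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class ReusableResultDemo {

    // Hypothetical helper: walks the same list twice to build a summary.
    static String summarize(List<String> names) {
        // First traversal: count the elements
        long count = names.stream().count();

        // Second traversal: find the longest name; each call to stream() starts afresh
        String longest = names.stream()
            .max(Comparator.comparingInt(String::length))
            .orElse("");

        return count + " names, longest: " + longest;
    }

    public static void main(String[] args) {
        System.out.println(summarize(Arrays.asList("john", "smith", "tom")));
    }
}
```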

4.4. Modification

A Collection, unlike a Stream, allows modifications such as adding or removing elements. Hence, we can return a Collection when the consumer needs to modify the result set.

For example, we can modify an ArrayList using add/remove methods:

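// Copy into an ArrayList so that every call modifies the same, mutable list
List<String> userNameList = new ArrayList<>(userNameList());
userNameList.add("bob");
userNameList.add("pepper");
userNameList.remove(2);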

Similarly, methods like put and remove allow modification on a map:

Map<String, String> userNameMap = userNameMap();
userNameMap.put("bob", "bob");
userNameMap.remove("alfred");

4.5. In-Memory Result

Additionally, a Collection is an obvious choice when the materialized result already exists in memory as a collection.
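
As a rough sketch of this case, a class that already holds its data in a List can simply hand that list out. The class UserDirectory and its accessor are hypothetical, and wrapping the list with Collections.unmodifiableList is an assumed design choice to protect internal state, not something the article prescribes:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class UserDirectory {

    // The result is already materialized in memory
    private final List<String> userNames =
        new ArrayList<>(Arrays.asList("john", "smith", "tom"));

    // Returns a read-only view of the existing list; no copying or streaming needed
    List<String> getUserNames() {
        return Collections.unmodifiableList(userNames);
    }

    public static void main(String[] args) {
        System.out.println(new UserDirectory().getUserNames());
    }
}
```

If the consumer is expected to modify the result, as in section 4.4, returning a copy such as new ArrayList<>(userNames) would be the alternative.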

5. Conclusion

In this article, we compared Stream vs. Collection and examined various scenarios that suit them.

We can conclude that a Stream is a great candidate for rendering large or infinite result sets, with benefits like lazy execution, flexibility, and functional behavior.

However, when we require a consistent form of the results, or when the materialization cost is low, we should choose a Collection over a Stream.

As usual, the source code is available over on GitHub.
