Java Streams vs Vavr Streams

Last modified: May 2, 2018

1. Introduction

In this article, we’ll be looking at how Stream implementations differ in Java and Vavr.

This article assumes familiarity with the basics of both the Java Stream API and the Vavr library.

2. Comparison

Both implementations represent the same concept of lazy sequences but differ in details.

Java Streams were built with robust parallelism in mind, providing easy support for parallelization. Vavr’s implementation, on the other hand, favors convenient work with sequences of data and provides no native support for parallelism (although it can be achieved by converting an instance to a Java stream).

This is why Java Streams are backed by Spliterator instances, an upgrade to the much older Iterator, while Vavr’s implementation is backed by the aforementioned Iterator (at least in one of the latest implementations).
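
The practical difference between the two abstractions can be sketched with the JDK alone. In the example below (class and method names are illustrative, not taken from the original article), a Spliterator hands off part of its elements through trySplit(), which is what makes parallel traversal possible, while an Iterator can only walk forward one element at a time:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;

public class SpliteratorSketch {

    // Sums everything still reachable from the given Spliterator.
    static int sum(Spliterator<Integer> spliterator) {
        int[] total = {0};
        spliterator.forEachRemaining(value -> total[0] += value);
        return total[0];
    }

    // Splits the source once and sums each part separately.
    // Returns {sum of the split-off part, sum of the remaining part}.
    public static int[] splitSums(List<Integer> values) {
        Spliterator<Integer> remaining = new ArrayList<>(values).spliterator();
        // trySplit() may return null when the source cannot be split further.
        Spliterator<Integer> splitOff = remaining.trySplit();
        int splitOffSum = (splitOff == null) ? 0 : sum(splitOff);
        return new int[] {splitOffSum, sum(remaining)};
    }

    // An Iterator offers no way to divide the work: it is strictly sequential.
    public static int iteratorSum(List<Integer> values) {
        int total = 0;
        Iterator<Integer> iterator = values.iterator();
        while (iterator.hasNext()) {
            total += iterator.next();
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);
        int[] parts = splitSums(numbers);
        System.out.println("split-off part: " + parts[0] + ", remaining part: " + parts[1]);
        System.out.println("iterator total: " + iteratorSum(numbers));
    }
}
```

The two partial sums could be computed on different threads and then combined, which is roughly what a parallel Java stream does internally.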

Both implementations are loosely tied to their backing data structures and are essentially facades over the data source that the stream traverses. However, since Vavr’s implementation is Iterator-based, it doesn’t tolerate concurrent modifications of the source collection.

Java’s handling of stream sources makes it possible for well-behaved stream sources to be modified before the terminal stream operation is executed.

The fundamental design difference notwithstanding, Vavr provides a very robust API for converting its streams (and other data structures) to their Java counterparts.

3. Additional Functionality

The approach to dealing with streams and their elements leads to interesting differences in the ways we can work with them in Java and Vavr.

3.1. Random Element Access

Providing a convenient API and access methods for elements is one area where Vavr truly outshines the Java API. For example, Vavr has several methods that provide random element access:

  • get() provides index-based access to elements of a stream.
  • indexOf() provides the same index location functionality as in the standard Java List.
  • insert() adds an element to a stream at a specified position.
  • intersperse() inserts the provided argument between all the elements of the stream.
  • find() locates and returns the first item matching a predicate, wrapped in an Option. Java’s anyMatch() and noneMatch(), by contrast, only check whether a matching element exists.
  • update() replaces the element at a given index. An overload accepts a function that computes the replacement.
  • search() locates an item in a sorted stream (unsorted streams yield an undefined result).

It’s important to remember that this functionality is still backed by a data structure with linear-time search performance.
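
For contrast, the closest JDK-only counterparts of get(), find() and indexOf() have to be composed by hand from skip(), filter() and findFirst(). The following is a rough sketch under that assumption; the class and helper names are hypothetical and are not part of either library:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.Optional;
import java.util.function.Predicate;
import java.util.stream.IntStream;

public class JdkElementAccessSketch {

    // Stand-in for index-based access: skip ahead, then take one element.
    public static <T> Optional<T> elementAt(List<T> source, int index) {
        return source.stream().skip(index).findFirst();
    }

    // Stand-in for find(): filter by the predicate, then take the first match.
    public static <T> Optional<T> find(List<T> source, Predicate<? super T> predicate) {
        return source.stream().filter(predicate).findFirst();
    }

    // Stand-in for indexOf(): stream over the indices instead of the elements.
    public static <T> int indexOf(List<T> source, T element) {
        return IntStream.range(0, source.size())
            .filter(i -> Objects.equals(source.get(i), element))
            .findFirst()
            .orElse(-1);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("foo", "bar", "baz");
        System.out.println(elementAt(words, 1));
        System.out.println(find(words, w -> w.endsWith("z")));
        System.out.println(indexOf(words, "baz"));
        // anyMatch() only reports whether a match exists, not which element matched.
        System.out.println(words.stream().anyMatch(w -> w.startsWith("b")));
    }
}
```

Each helper traverses the source from the start, so, like the Vavr methods, they run in linear time.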

3.2. Parallelism and Concurrent Modification

While Vavr’s streams don’t natively support parallelism the way Java’s parallel() method does, the toJavaParallelStream() method provides a parallelized Java-based copy of the source Vavr stream.
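
On the Java side, switching a pipeline to parallel execution is a single call. The JDK-only sketch below (class and method names are illustrative) runs the same computation sequentially and in parallel; a stream obtained from a Vavr conversion is assumed to behave like any other parallel Java stream from that point on:

```java
import java.util.stream.IntStream;

public class ParallelSketch {

    // Sums the squares of 1..upTo, optionally on a parallel stream.
    public static long sumOfSquares(int upTo, boolean parallel) {
        IntStream numbers = IntStream.rangeClosed(1, upTo);
        if (parallel) {
            // parallel() only flags the pipeline; the work is split
            // when the terminal operation runs.
            numbers = numbers.parallel();
        }
        return numbers.mapToLong(i -> (long) i * i).sum();
    }

    public static void main(String[] args) {
        System.out.println("sequential: " + sumOfSquares(1_000, false));
        System.out.println("parallel:   " + sumOfSquares(1_000, true));
        System.out.println("is parallel: " + IntStream.range(0, 10).parallel().isParallel());
    }
}
```

Because the reduction is associative and stateless, both variants return the same value; only the execution strategy differs.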

One area of relative weakness in Vavr streams is the principle of non-interference.

Simply put, Java streams allow us to modify the underlying data source right up until a terminal operation is called. As long as a terminal operation hasn’t been called on a given Java stream, the stream can pick up any changes to the underlying data source:

List<Integer> intList = new ArrayList<>();
intList.add(1);
intList.add(2);
intList.add(3);
Stream<Integer> intStream = intList.stream(); // form the stream
intList.add(5); // modify the underlying list before the terminal operation
intStream.forEach(i -> System.out.println("in a Java stream: " + i));

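Modifying the source from inside the pipeline is a different matter: it counts as interference and, for an ArrayList, typically fails with a ConcurrentModificationException. For a change made before the terminal operation, as here, the last addition is reflected in the output from the stream: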

in a Java stream: 1
in a Java stream: 2
in a Java stream: 3
in a Java stream: 5
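
The difference between modifying the source before the terminal operation and modifying it during the terminal operation can be sketched with the JDK alone. The class and method names below are illustrative; the behavior shown is that of an ArrayList source:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class NonInterferenceSketch {

    // A change made BEFORE the terminal operation is picked up,
    // because the stream binds to its source late.
    public static List<Integer> modifyBeforeTerminalOperation() {
        List<Integer> source = new ArrayList<>(Arrays.asList(1, 2, 3));
        Stream<Integer> stream = source.stream();
        source.add(5);
        return stream.collect(Collectors.toList());
    }

    // A change made DURING the terminal operation is interference.
    // Returns true if the JDK detected it.
    public static boolean modifyDuringTerminalOperation() {
        List<Integer> source = new ArrayList<>(Arrays.asList(1, 2, 3));
        try {
            source.stream().forEach(i -> source.add(i * 10));
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("modified before: " + modifyBeforeTerminalOperation());
        System.out.println("interference detected: " + modifyDuringTerminalOperation());
    }
}
```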

We find that a Vavr stream won’t tolerate this:

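// Stream here is io.vavr.collection.Stream, not java.util.stream.Stream
Stream<Integer> vavrStream = Stream.ofAll(intList);
intList.add(5); // modify the source after the Vavr stream has been formed
vavrStream.forEach(i -> System.out.println("in a Vavr Stream: " + i));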

What we get:

Exception in thread "main" java.util.ConcurrentModificationException
  at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901)
  at java.util.ArrayList$Itr.next(ArrayList.java:851)
  at io.vavr.collection.StreamModule$StreamFactory.create(Stream.java:2078)

By Java standards, Vavr streams are not “well-behaved”. Vavr behaves better when the backing data structure is a primitive array:

int[] aStream = new int[]{1, 2, 4};
Stream<Integer> wrapped = Stream.ofAll(aStream);

aStream[2] = 5;
wrapped.forEach(i -> System.out.println("Vavr looped " + i));

Giving us:

Vavr looped 1
Vavr looped 2
Vavr looped 5

3.3. Short-circuiting Operations and flatMap()

Like map, flatMap is an intermediate operation in stream processing. Both implementations follow the contract of intermediate stream operations: processing of the underlying data structure shouldn’t occur until a terminal operation has been called.

JDK 8 and 9, however, contain a bug that causes the flatMap implementation to break this contract and evaluate eagerly when combined with short-circuiting operations such as findFirst or limit.

A simple example:

Stream.of(42)
  .flatMap(i -> Stream.generate(() -> { 
      System.out.println("nested call"); 
      return 42; 
  }))
  .findAny();

In the above snippet, on an affected JDK we will never get a result from findAny, because flatMap is evaluated eagerly over the infinite nested Stream instead of simply taking a single element from it.

A fix for this bug was provided in Java 10.
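
One way to observe the difference without hanging on an infinite stream is to count how many elements of a finite nested stream are pulled before findFirst returns. The sketch below uses only the JDK; the class name and the counter are illustrative. On JDK 10 or later the count is expected to be 1, while the affected versions would be expected to consume the whole nested stream:

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class FlatMapLazinessSketch {

    // Returns how many nested elements were consumed to answer findFirst().
    public static int countPulledElements() {
        AtomicInteger pulled = new AtomicInteger();

        Optional<Integer> first = Stream.of(1)
            .flatMap(i -> IntStream.range(0, 1_000)
                .boxed()
                // peek() fires once for every nested element that is actually consumed
                .peek(n -> pulled.incrementAndGet()))
            .findFirst();

        if (!first.isPresent()) {
            throw new IllegalStateException("expected an element from the nested stream");
        }
        return pulled.get();
    }

    public static void main(String[] args) {
        System.out.println("Java version: " + System.getProperty("java.version"));
        System.out.println("nested elements pulled: " + countPulledElements());
    }
}
```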

Vavr’s flatMap doesn’t have the same problem and a functionally similar operation completes in O(1):

Stream.of(42)
  .flatMap(i -> Stream.continually(() -> { 
      System.out.println("nested call"); 
      return 42; 
  }))
  .get(0);

3.4. Core Vavr Functionality

In some areas, there just isn’t a one-to-one comparison between Java and Vavr; Vavr enhances the streaming experience with functionality that has no direct counterpart in Java (or at least requires a fair amount of manual work):

  • zip() pairs up items in the stream with those from a supplied Iterable. This operation was available in early JDK 8 builds but was removed after build 93.
  • partition() splits the content of a stream into two streams, given a predicate.
  • permutations(), as named, computes the permutations (all possible unique orderings) of the elements of the stream.
  • combinations() gives the combinations (i.e. possible selections of items) of the stream.
  • groupBy() returns a Map of streams containing elements from the original stream, categorized by a supplied classifier.
  • distinctBy() improves on Java’s distinct() by accepting a Comparator or a key-extractor function.
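
To give a feel for the manual work involved on the Java side, here is a rough JDK-only sketch of the nearest equivalents of partition(), groupBy() and zip(). The class and helper names are hypothetical, and the zip stand-in simply joins each pair into a String because the JDK has no tuple type:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class JdkCollectorsSketch {

    // Nearest analogue of partition(): a Map keyed by true/false.
    public static Map<Boolean, List<Integer>> partitionEvens(List<Integer> numbers) {
        return numbers.stream().collect(Collectors.partitioningBy(n -> n % 2 == 0));
    }

    // Nearest analogue of groupBy(): a Map keyed by the classifier's result.
    public static Map<Integer, List<String>> groupByLength(List<String> words) {
        return words.stream().collect(Collectors.groupingBy(String::length));
    }

    // No built-in zip: pair elements up by index, stopping at the shorter list.
    public static List<String> zip(List<String> left, List<Integer> right) {
        return IntStream.range(0, Math.min(left.size(), right.size()))
            .mapToObj(i -> left.get(i) + "=" + right.get(i))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(partitionEvens(Arrays.asList(1, 2, 3, 4)));
        System.out.println(groupByLength(Arrays.asList("a", "bb", "cc")));
        System.out.println(zip(Arrays.asList("a", "b", "c"), Arrays.asList(1, 2)));
    }
}
```

Unlike the Vavr operations, these collectors are terminal: they consume the stream and return ordinary collections rather than new streams.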

While the support for advanced functionality is somewhat uninspired in Java SE streams, Expression Language 3.0 oddly provides support for far more functionality than standard JDK streams.

4. Stream Manipulation

Vavr allows direct manipulation of the content of a stream:

  • Insert into an existing Vavr stream
Stream<String> vavredStream = Stream.of("foo", "bar", "baz");
vavredStream.forEach(item -> System.out.println("List items: " + item));
Stream<String> vavredStream2 = vavredStream.insert(2, "buzz");
vavredStream2.forEach(item -> System.out.println("List items: " + item));
  • Remove an item from a stream
Stream<String> removed = vavredStream2.remove("buzz"); // vavredStream2 is the stream produced by the insert above
  • Queue-Based Operations 

Because Vavr’s stream is backed by a queue, it provides constant-time prepend and append operations.

However, changes made to the Vavr stream don’t propagate back to the data source that the stream was created from.

5. Conclusion

Vavr and Java both have their strengths, and we’ve demonstrated each library’s commitment to its design objectives: Java to cheap parallelism and Vavr to convenient stream operations.

With Vavr’s support for converting back and forth between its own stream and Java’s, one can derive the benefits of both libraries in the same project without a lot of overhead.

The source code for this tutorial is available on GitHub.
