Modifying Objects Within Stream While Iterating

Last modified: November 7, 2023

1. Overview

The Java Stream API provides various methods that let us modify stream elements. However, the actions passed to these methods must be non-interfering and stateless. Otherwise, the pipeline may behave incorrectly or produce wrong output.

In this tutorial, we’ll discuss the common mistakes made while modifying the elements in a Java Stream and the correct way to do it.

2. Change the State of a Stream Element

Let’s take a list of Person objects as an example:

public class Person {
    private String name;
    private String email;

    public Person(String name, String email) {
        this.name = name;
        this.email = email;
    }
    //standard getters and setters..
}

We’ll modify the email ID of the Person elements inside a stream and convert it to uppercase.
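
The tests in the following sections rely on a personList fixture that isn't shown here. As a minimal sketch, assuming four entries of which exactly one is named John (the names, the addresses, and the samplePersonList() helper are hypothetical), it could be set up like this:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical fixture: the names and addresses below are illustrative assumptions.
class PersonFixture {

    // Same shape as the article's Person class, with getters and setters written out.
    static class Person {
        private String name;
        private String email;

        Person(String name, String email) {
            this.name = name;
            this.email = email;
        }

        String getName() { return name; }
        String getEmail() { return email; }
        void setName(String name) { this.name = name; }
        void setEmail(String email) { this.email = email; }
    }

    // Returns a fresh, mutable list on every call so that one test can't affect another.
    static List<Person> samplePersonList() {
        List<Person> people = new ArrayList<>();
        people.add(new Person("Alice", "alice@example.com"));
        people.add(new Person("John", "john@example.com"));
        people.add(new Person("Bob", "bob@example.com"));
        people.add(new Person("Carol", "carol@example.com"));
        return people;
    }

    public static void main(String[] args) {
        samplePersonList()
          .forEach(p -> System.out.println(p.getName() + " <" + p.getEmail() + ">"));
    }
}
```

Using a mutable ArrayList matters for the removal examples later on, since an immutable list would reject remove() outright.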

2.1. Modify With forEach() Method

Let’s start with the most basic approach, simply iterating over the list using the forEach() method:

@Test
void givenPersonList_whenUpdatePersonEmailByInterferingWithForEach_thenPersonEmailUpdated() {
    personList.stream().forEach(e -> e.setEmail(e.getEmail().toUpperCase()));

    personList.stream().forEach(e -> assertEquals(e.getEmail().toUpperCase(), e.getEmail()));
}

In the above method, we convert the email of each Person to uppercase while iterating over the list. It looks legitimate, but it violates the principle of non-interference: in a stream pipeline, we should never modify the original source.

Unless the stream source is concurrent, modifying a stream’s data source during the execution of a stream pipeline can cause exceptions, incorrect answers, or nonconformant behavior.

2.2. Modify With peek() Method

Let’s now look at the peek() method. We’re often tempted to use it for modifying the properties of the elements inside a stream:

@Test
void givenPersonList_whenUpdatePersonEmailByInterferingWithPeek_thenPersonEmailUpdated() {
    personList.stream()
      .peek(e -> e.setEmail(e.getEmail().toUpperCase()))
      .collect(Collectors.toList());

    personList.forEach(e -> assertEquals(e.getEmail().toUpperCase(), e.getEmail()));
}

Again, by updating the elements of the source personList, we’re repeating the mistake described in the previous section.
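
There is a second reason to be wary of peek(): it's an intermediate operation, so its action runs only when a terminal operation actually pulls elements through the pipeline, and the Stream Javadoc describes it as existing mainly to support debugging. The following sketch (the class and method names are hypothetical) counts how often the peek() action is invoked with and without a terminal operation:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

class PeekLazinessDemo {

    static final List<String> EMAILS = List.of("john@example.com", "alice@example.com");

    // Builds a pipeline with peek() but never attaches a terminal operation,
    // then reports how many times the peek() action ran.
    static int peekCallsWithoutTerminalOperation() {
        AtomicInteger calls = new AtomicInteger();
        Stream<String> pipeline = EMAILS.stream()
          .peek(email -> calls.incrementAndGet());
        // No terminal operation is invoked on 'pipeline', so nothing is traversed.
        return calls.get();
    }

    // Same pipeline, this time consumed by forEach(), so the peek() action runs per element.
    static int peekCallsWithForEach() {
        AtomicInteger calls = new AtomicInteger();
        EMAILS.stream()
          .peek(email -> calls.incrementAndGet())
          .forEach(email -> { });
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println("without terminal operation: " + peekCallsWithoutTerminalOperation());
        System.out.println("with forEach(): " + peekCallsWithForEach());
    }
}
```

A mutation hidden inside peek() therefore depends on how the rest of the pipeline is written, which makes it easy to break by accident.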

2.3. Modify With map() Method

The method forEach() is a terminal operation in a stream pipeline. However, map(), like peek(), is an intermediate operation that returns a Stream. In map(), we’ll create a new Person object with the email in uppercase and then collect the results into a new list:

@Test
void givenPersonList_whenUpdatePersonEmailWithMapMethod_thenPersonEmailUpdated() {
    List<Person> newPersonList = personList.stream()
      .map(e -> new Person(e.getName(), e.getEmail().toUpperCase()))
      .collect(Collectors.toList());

    newPersonList.forEach(e -> assertEquals(e.getEmail().toUpperCase(), e.getEmail()));
}

In the above method, we didn’t modify the original list. Instead, we created a new list, newPersonList, from it, so the operation is non-interfering. It’s also stateless because the result for one element doesn’t depend on any other element or on state that changes while the pipeline runs. We should follow these principles regardless of whether the stream is processed sequentially or in parallel.
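
To make the difference from the forEach() and peek() variants concrete, here is a small self-contained sketch (the MapCopyDemo class and the withUpperCaseEmails() helper are hypothetical names) that checks both lists after mapping:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class MapCopyDemo {

    // Reduced version of the article's Person class, just enough for this example.
    static class Person {
        private final String name;
        private final String email;

        Person(String name, String email) {
            this.name = name;
            this.email = email;
        }

        String getName() { return name; }
        String getEmail() { return email; }
    }

    // Builds new Person objects instead of touching the ones in 'source'.
    static List<Person> withUpperCaseEmails(List<Person> source) {
        return source.stream()
          .map(p -> new Person(p.getName(), p.getEmail().toUpperCase()))
          .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> source = new ArrayList<>();
        source.add(new Person("John", "john@example.com"));

        List<Person> copy = withUpperCaseEmails(source);

        System.out.println("source: " + source.get(0).getEmail());
        System.out.println("copy:   " + copy.get(0).getEmail());
    }
}
```

The source keeps its lowercase email while the copy carries the uppercase one, so any other code holding a reference to the original list is unaffected.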

Since immutability is a core principle of functional programming, we can go a step further and create an immutable Person class:

public class ImmutablePerson {

    private final String name;
    private final String email;

    public ImmutablePerson(String name, String email) {
        this.name = name;
        this.email = email;
    }

    public ImmutablePerson withEmail(String email) {
        return new ImmutablePerson(this.name, email);
    }
    //Standard getters
}

The ImmutablePerson class doesn’t have any setter methods. Instead, it provides a withEmail() method that returns a new ImmutablePerson with the given email and leaves the original instance unchanged.
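
On Java 16 or later, the same idea can be sketched with a record. This is an alternative to the class above, not a replacement for it, and note that a record exposes name() and email() accessors rather than getName() and getEmail():

```java
class ImmutablePersonRecordDemo {

    // Hypothetical record-based variant of ImmutablePerson (requires Java 16+).
    // The compiler generates final fields, the constructor, and the accessors.
    record ImmutablePerson(String name, String email) {

        // Returns a copy carrying the given email; 'this' is never modified.
        ImmutablePerson withEmail(String newEmail) {
            return new ImmutablePerson(name, newEmail);
        }
    }

    public static void main(String[] args) {
        ImmutablePerson original = new ImmutablePerson("John", "john@example.com");
        ImmutablePerson updated = original.withEmail(original.email().toUpperCase());

        System.out.println("original: " + original.email());
        System.out.println("updated:  " + updated.email());
    }
}
```

The uppercase conversion happens at the call site; withEmail() itself just stores whatever value it receives.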

Now, let’s use it while modifying the elements in the stream:

@Test
void givenPersonList_whenUpdateImmutablePersonEmailWithMapMethod_thenPersonEmailUpdated() {
    List<ImmutablePerson> newImmutablePersonList = immutablePersonList.stream()
      .map(e -> e.withEmail(e.getEmail().toUpperCase()))
      .collect(Collectors.toList());

    newImmutablePersonList.forEach(e -> assertEquals(e.getEmail().toUpperCase(), e.getEmail()));
}

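Because ImmutablePerson exposes no setters, the pipeline can’t mutate the source elements. With this, we’re enforcing non-interference through the design of the class itself.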

3. Remove Element From a Stream

Making structural changes to a stream’s source is even trickier. Removal is a costlier operation than modifying an element, so if we aren’t careful, it can lead to inconsistent and undesirable outcomes. Let’s explore this in detail.

3.1. Remove Element With forEach() Method

What if we want to remove a few elements from a stream? For example, let’s remove the person named John from the list:

@Test
void givenPersonList_whenRemoveWhileIterating_thenThrowException() {
    assertThrows(NullPointerException.class, () -> {
        personList.stream().forEach(e -> {
            if(e.getName().equals("John")) {
                personList.remove(e);
            }
        });
    });
}

We tried to modify the structure of the list inside forEach() while iterating over its stream. Surprisingly, this results in a NullPointerException, unlike calling forEach() directly on an ArrayList, which throws a ConcurrentModificationException:

@Test
void givenPersonList_whenRemoveWhileIteratingWithForEach_thenThrowException() {
    assertThrows(ConcurrentModificationException.class, () -> {
        personList.forEach(e -> {
            if(e.getName().equals("John")) {
                personList.remove(e);
            }
        });
    });
}
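
The difference appears to come from how the two traversals detect modification. In the OpenJDK implementation, as far as I can tell, ArrayList.forEach() checks the modification count on every iteration, whereas the stream's spliterator walks up to the size captured at the start and checks only afterwards. After a removal, the last slot of the backing array is null, so the lambda dereferences null before that check is reached. The sketch below (hypothetical class and method names, using plain strings instead of Person, and assuming John is not the last element) reports which exception each variant throws:

```java
import java.util.ArrayList;
import java.util.List;

class RemoveWhileIteratingDemo {

    // Fresh mutable list per call; "John" is deliberately not the last element.
    static List<String> names() {
        return new ArrayList<>(List.of("Alice", "John", "Bob", "Carol"));
    }

    // Runs the action and returns the simple class name of the exception it throws, or "none".
    static String exceptionThrownBy(Runnable action) {
        try {
            action.run();
            return "none";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    // Removes from the list while traversing its stream.
    static String removeViaStreamForEach() {
        List<String> list = names();
        return exceptionThrownBy(() -> list.stream().forEach(n -> {
            if (n.equals("John")) {
                list.remove(n);
            }
        }));
    }

    // Removes from the list while traversing it with ArrayList.forEach().
    static String removeViaListForEach() {
        List<String> list = names();
        return exceptionThrownBy(() -> list.forEach(n -> {
            if (n.equals("John")) {
                list.remove(n);
            }
        }));
    }

    public static void main(String[] args) {
        System.out.println("stream().forEach(): " + removeViaStreamForEach());
        System.out.println("list.forEach():     " + removeViaListForEach());
    }
}
```

Either way, the exact exception is an implementation detail and shouldn't be relied upon; the point is that removing from a non-concurrent source during traversal is unsafe.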

3.2. Remove Element With CopyOnWriteArrayList

CopyOnWriteArrayList is a thread-safe variant of ArrayList. Its iterators work on a snapshot of the underlying array, so elements can be removed from the list while we iterate over it:

@Test
void givenPersonList_whenRemoveWhileIterating_thenPersonRemoved() {
    assertEquals(4, personList.size());
    
    CopyOnWriteArrayList<Person> cps = new CopyOnWriteArrayList<>(personList);
    cps.stream().forEach(e -> {
        if(e.getName().equals("John")) {
            cps.remove(e);
        }
    });

    assertEquals(3, cps.size());
}

This prevents interference, even among multiple threads, but it’s costly because every write operation copies the entire underlying array.
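
The snapshot behavior can be observed directly. In this sketch (hypothetical class and method names, with strings standing in for Person objects), the traversal still visits every element that was present when the stream was created, even though one of them is removed along the way:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

class CopyOnWriteSnapshotDemo {

    // Returns { number of elements visited, list size after the traversal }.
    static int[] removeJohnWhileStreaming() {
        CopyOnWriteArrayList<String> names =
          new CopyOnWriteArrayList<>(List.of("Alice", "John", "Bob", "Carol"));
        AtomicInteger visited = new AtomicInteger();

        names.stream().forEach(n -> {
            visited.incrementAndGet();
            if (n.equals("John")) {
                // Allocates a new backing array; the stream keeps reading the old one.
                names.remove(n);
            }
        });

        return new int[] { visited.get(), names.size() };
    }

    public static void main(String[] args) {
        int[] result = removeJohnWhileStreaming();
        System.out.println("visited: " + result[0] + ", size afterwards: " + result[1]);
    }
}
```

The traversal never sees its own removal, which is why no exception is thrown, and each remove() call pays for a full array copy.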

3.3. Remove Element With filter() Method

The Java Stream API provides the method filter() to remove elements in a more elegant way:

@Test
void givenPersonList_whenRemovePersonWithFilter_thenPersonRemoved() {
    assertEquals(4, personList.size());

    List<Person> newPersonList = personList.stream()
      .filter(e -> !e.getName().equals("John"))
      .collect(Collectors.toList());

    assertEquals(3, newPersonList.size());
}

In the above method, filter() lets only those Person objects whose name isn’t John move forward in the pipeline. Again, the predicate passed to filter() should be non-interfering and stateless. This approach is also simpler and easier to understand and troubleshoot.

4. Conclusion

In this article, we’ve looked at the correct way of modifying the elements in a stream. The operations in a pipeline should be non-interfering and stateless; otherwise, they can produce unexpected results.

As usual, the code used in this article can be found over on GitHub.
