Benchmark JDK Collections vs Eclipse Collections

Last modified: November 20, 2019

1. Introduction

In this tutorial, we’re going to compare the performance of traditional JDK collections with Eclipse Collections. We’ll create different scenarios and explore the results.

2. Configuration

First, note that for this article, we’ll use the default configuration to run the tests. No flags or other parameters will be set on our benchmark.

We’ll use the following hardware and libraries:

The easiest way to create our project is via the command-line:

mvn archetype:generate \
  -DinteractiveMode=false \
  -DarchetypeGroupId=org.openjdk.jmh \
  -DarchetypeArtifactId=jmh-java-benchmark-archetype \
  -DgroupId=com.baeldung \
  -DartifactId=benchmark \
  -Dversion=1.0

After that, we can open the project using our favorite IDE and edit the pom.xml to add the Eclipse Collections dependencies:

<dependency>
    <groupId>org.eclipse.collections</groupId>
    <artifactId>eclipse-collections</artifactId>
    <version>10.0.0</version>
</dependency>
<dependency>
    <groupId>org.eclipse.collections</groupId>
    <artifactId>eclipse-collections-api</artifactId>
    <version>10.0.0</version>
</dependency>

3. First Benchmark

Our first benchmark is simple. We want to calculate the sum of a previously created List of Integers.

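We’ll test six combinations: three kinds of list (a JDK List, an Eclipse Collections MutableList, and an Eclipse Collections primitive IntList), each summed serially and in parallel: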

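import java.util.ArrayList;
import java.util.List;
import java.util.PrimitiveIterator;
import java.util.Random;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.eclipse.collections.api.list.MutableList;
import org.eclipse.collections.api.list.primitive.IntList;
import org.eclipse.collections.impl.list.mutable.FastList;
import org.eclipse.collections.impl.list.mutable.primitive.IntArrayList;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

// JMH needs a @State class to hold the fields shared by the benchmarks.
// Scope.Thread is an assumption; the original listing omits the class declaration.
@State(Scope.Thread)
public class IntegerListSum {

    private List<Integer> jdkIntList;
    private MutableList<Integer> ecMutableList;
    private ExecutorService executor;
    private IntList ecIntList;

    @Setup
    public void setup() {
        // Fixed seed, so every run sums the same one million values in [-10000, 10000).
        PrimitiveIterator.OfInt iterator = new Random(1L).ints(-10000, 10000).iterator();
        ecMutableList = FastList.newWithNValues(1_000_000, iterator::nextInt);
        jdkIntList = new ArrayList<>(1_000_000);
        jdkIntList.addAll(ecMutableList);
        // Same values again, stored as primitive ints (no Integer wrappers).
        ecIntList = ecMutableList.collectInt(i -> i, new IntArrayList(1_000_000));
        executor = Executors.newWorkStealingPool();
    }

    @Benchmark
    public long jdkList() {
        return jdkIntList.stream().mapToLong(i -> i).sum();
    }

    @Benchmark
    public long ecMutableList() {
        return ecMutableList.sumOfInt(i -> i);
    }

    @Benchmark
    public long jdkListParallel() {
        return jdkIntList.parallelStream().mapToLong(i -> i).sum();
    }

    @Benchmark
    public long ecMutableListParallel() {
        // Splits the list into batches of 100,000 elements run on the executor.
        return ecMutableList.asParallel(executor, 100_000).sumOfInt(i -> i);
    }

    @Benchmark
    public long ecPrimitive() {
        return this.ecIntList.sum();
    }

    @Benchmark
    public long ecPrimitiveParallel() {
        return this.ecIntList.primitiveParallelStream().sum();
    }
}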

To run our first benchmark we need to execute:

mvn clean install
java -jar target/benchmarks.jar IntegerListSum -rf json

This runs the benchmarks in our IntegerListSum class and saves the result to a JSON file.

We’ll measure throughput, that is, the number of operations per second, so higher is better:

Benchmark                              Mode  Cnt     Score       Error  Units
IntegerListSum.ecMutableList          thrpt   10   573.016 ±    35.865  ops/s
IntegerListSum.ecMutableListParallel  thrpt   10  1251.353 ±   705.196  ops/s
IntegerListSum.ecPrimitive            thrpt   10  4067.901 ±   258.574  ops/s
IntegerListSum.ecPrimitiveParallel    thrpt   10  8827.092 ± 11143.823  ops/s
IntegerListSum.jdkList                thrpt   10   568.696 ±     7.951  ops/s
IntegerListSum.jdkListParallel        thrpt   10   918.512 ±    27.487  ops/s
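
When reading this table, the Error column deserves as much attention as the Score column. The following is a minimal sketch, assuming that Score ± Error can be read as a simple interval; the class and method names are hypothetical and not part of the benchmark project. It checks whether two such intervals overlap, using values copied from the table above.

```java
class IntervalSketch {

    // Returns true when [scoreA - errA, scoreA + errA] and
    // [scoreB - errB, scoreB + errB] share at least one point.
    public static boolean overlaps(double scoreA, double errA, double scoreB, double errB) {
        return scoreA - errA <= scoreB + errB && scoreB - errB <= scoreA + errA;
    }

    public static void main(String[] args) {
        // ecPrimitiveParallel vs jdkListParallel: the first interval is very wide.
        System.out.println(overlaps(8827.092, 11143.823, 918.512, 27.487));
        // ecPrimitive vs jdkList: both intervals are narrow.
        System.out.println(overlaps(4067.901, 258.574, 568.696, 7.951));
    }
}
```

Under this reading, the parallel primitive score has an interval wide enough to overlap the JDK parallel score, while the serial primitive and serial JDK scores are clearly separated.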

According to our tests, Eclipse Collections’ parallel list of primitives had the highest throughput of all, almost 10x that of the JDK list also running in parallel.

Of course, a portion of that can be explained by the fact that when working with primitive lists, we don’t have the cost associated with boxing and unboxing.
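
To make the boxing cost concrete, here is a minimal JDK-only sketch. The class and method names are hypothetical and it is not one of the benchmarks; it only shows where unboxing happens when summing a List of Integers, compared with summing a primitive array.

```java
import java.util.ArrayList;
import java.util.List;

class BoxingSketch {

    // Every element is an Integer object; the mapper unboxes it (Integer -> int -> long).
    public static long sumBoxed(List<Integer> values) {
        return values.stream().mapToLong(i -> i).sum();
    }

    // The values are stored directly as ints, so no wrapper objects are involved.
    public static long sumPrimitive(int[] values) {
        long total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] primitives = {1, 2, 3, 4, 5};
        List<Integer> boxed = new ArrayList<>();
        for (int v : primitives) {
            boxed.add(v); // autoboxing: int -> Integer
        }
        System.out.println(sumBoxed(boxed));
        System.out.println(sumPrimitive(primitives));
    }
}
```

Both methods return the same total; the difference is only in how the values are stored and read, which is the part a primitive list such as IntList avoids paying for.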

We can use JMH Visualizer to analyze our results. The chart below shows a better visualization:

4. Filtering

Next, we’ll filter our list to get all elements that are multiples of 5. We’ll reuse a big portion of our previous benchmark and add a filter function:

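import java.util.ArrayList;
import java.util.List;
import java.util.PrimitiveIterator;
import java.util.Random;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

import org.eclipse.collections.api.list.MutableList;
import org.eclipse.collections.api.list.primitive.IntList;
import org.eclipse.collections.api.list.primitive.MutableIntList;
import org.eclipse.collections.impl.factory.primitive.IntLists;
import org.eclipse.collections.impl.list.mutable.FastList;
import org.eclipse.collections.impl.list.mutable.primitive.IntArrayList;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

// JMH needs a @State class to hold the fields shared by the benchmarks.
// Scope.Thread is an assumption; the original listing omits the class declaration.
@State(Scope.Thread)
public class IntegerListFilter {

    private List<Integer> jdkIntList;
    private MutableList<Integer> ecMutableList;
    private IntList ecIntList;
    private ExecutorService executor;

    @Setup
    public void setup() {
        // Same data as in the sum benchmark: one million seeded values in [-10000, 10000).
        PrimitiveIterator.OfInt iterator = new Random(1L).ints(-10000, 10000).iterator();
        ecMutableList = FastList.newWithNValues(1_000_000, iterator::nextInt);
        jdkIntList = new ArrayList<>(1_000_000);
        jdkIntList.addAll(ecMutableList);
        ecIntList = ecMutableList.collectInt(i -> i, new IntArrayList(1_000_000));
        executor = Executors.newWorkStealingPool();
    }

    @Benchmark
    public List<Integer> jdkList() {
        return jdkIntList.stream().filter(i -> i % 5 == 0).collect(Collectors.toList());
    }

    @Benchmark
    public MutableList<Integer> ecMutableList() {
        return ecMutableList.select(i -> i % 5 == 0);
    }

    @Benchmark
    public List<Integer> jdkListParallel() {
        return jdkIntList.parallelStream().filter(i -> i % 5 == 0).collect(Collectors.toList());
    }

    @Benchmark
    public MutableList<Integer> ecMutableListParallel() {
        return ecMutableList.asParallel(executor, 100_000).select(i -> i % 5 == 0).toList();
    }

    @Benchmark
    public IntList ecPrimitive() {
        return this.ecIntList.select(i -> i % 5 == 0);
    }

    @Benchmark
    public IntList ecPrimitiveParallel() {
        // Collects the matching ints into a MutableIntList without boxing them.
        return this.ecIntList.primitiveParallelStream()
          .filter(i -> i % 5 == 0)
          .collect(IntLists.mutable::empty, MutableIntList::add, MutableIntList::addAll);
    }
}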
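
Every variant above applies the same predicate, i % 5 == 0. As a small JDK-only illustration (the class and method names are hypothetical and not part of the benchmark), the sketch below shows that the predicate also matches negative multiples of 5 and zero, which matters because the test data ranges from -10000 to 9999, and that the serial and parallel streams collect the same list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FilterSketch {

    // Keeps the multiples of 5, using either a serial or a parallel stream.
    public static List<Integer> multiplesOfFive(List<Integer> values, boolean parallel) {
        return (parallel ? values.parallelStream() : values.stream())
            .filter(i -> i % 5 == 0)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(-10, -3, 0, 4, 5, 12, 15);
        System.out.println(multiplesOfFive(values, false));
        System.out.println(multiplesOfFive(values, true));
    }
}
```

Because the source list is ordered, Collectors.toList() keeps the encounter order in the parallel case too, so the two calls return equal lists.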

We’ll execute the test just like before:

mvn clean install
java -jar target/benchmarks.jar IntegerListFilter -rf json

And the results:

Benchmark                                 Mode  Cnt     Score    Error  Units
IntegerListFilter.ecMutableList          thrpt   10   145.733 ±  7.000  ops/s
IntegerListFilter.ecMutableListParallel  thrpt   10   603.191 ± 24.799  ops/s
IntegerListFilter.ecPrimitive            thrpt   10   232.873 ±  8.032  ops/s
IntegerListFilter.ecPrimitiveParallel    thrpt   10  1029.481 ± 50.570  ops/s
IntegerListFilter.jdkList                thrpt   10   155.284 ±  4.562  ops/s
IntegerListFilter.jdkListParallel        thrpt   10   445.737 ± 23.685  ops/s

As we can see, the Eclipse Collections primitive list running in parallel was the winner again, with a throughput more than 2x that of the JDK parallel list.

Note that for filtering, the effect of parallel processing is more visible. Summing is a cheap operation for the CPU, so parallelism gains less there and we don’t see the same difference between serial and parallel runs.
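
The size of that effect can be checked directly from the two result tables. The sketch below is only arithmetic on the reported scores; the class and method names are hypothetical. It divides each parallel score by the matching serial score.

```java
class SpeedupSketch {

    // Ratio of parallel to serial throughput, both given in ops/s.
    public static double speedup(double serialOps, double parallelOps) {
        return parallelOps / serialOps;
    }

    public static void main(String[] args) {
        // Scores copied from the IntegerListSum results.
        System.out.printf("sum    jdkList        %.2fx%n", speedup(568.696, 918.512));
        System.out.printf("sum    ecMutableList  %.2fx%n", speedup(573.016, 1251.353));
        System.out.printf("sum    ecPrimitive    %.2fx%n", speedup(4067.901, 8827.092));
        // Scores copied from the IntegerListFilter results.
        System.out.printf("filter jdkList        %.2fx%n", speedup(155.284, 445.737));
        System.out.printf("filter ecMutableList  %.2fx%n", speedup(145.733, 603.191));
        System.out.printf("filter ecPrimitive    %.2fx%n", speedup(232.873, 1029.481));
    }
}
```

On these numbers, going parallel gives roughly 1.6x to 2.2x for summing and roughly 2.9x to 4.4x for filtering, ignoring the error margins.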

Also, the performance boost that Eclipse Collections primitive lists got earlier begins to evaporate as the work done on each element begins to outweigh the cost of boxing and unboxing.

Finally, we can see that operations on primitives are faster than operations on objects.

5. Conclusion

In this article, we created a couple of benchmarks to compare Java Collections with Eclipse Collections. We used JMH to try to minimize environmental bias.

As always, the source code is available over on GitHub.
