1. Introduction
In this tutorial, we’re going to compare the performance of traditional JDK collections with Eclipse Collections. We’ll create different scenarios and explore the results.
2. Configuration
First, note that for this article, we’ll use the default configuration to run the tests. No flags or other parameters will be set on our benchmark.
We’ll use the following hardware and libraries:
- JDK 11.0.3, Java HotSpot(TM) 64-Bit Server VM, 11.0.3+12-LTS.
- MacPro 2.6GHz 6-core i7 with 16GB DDR4.
- Eclipse Collections 10.0.0 (latest available at the time of writing)
- JMH (Java Microbenchmark Harness) to run our benchmarks
- JMH Visualizer to generate charts from JMH results
The easiest way to create our project is via the command line:
mvn archetype:generate \
-DinteractiveMode=false \
-DarchetypeGroupId=org.openjdk.jmh \
-DarchetypeArtifactId=jmh-java-benchmark-archetype \
-DgroupId=com.baeldung \
-DartifactId=benchmark \
-Dversion=1.0
After that, we can open the project using our favorite IDE and edit the pom.xml to add the Eclipse Collections dependencies:
<dependency>
    <groupId>org.eclipse.collections</groupId>
    <artifactId>eclipse-collections</artifactId>
    <version>10.0.0</version>
</dependency>
<dependency>
    <groupId>org.eclipse.collections</groupId>
    <artifactId>eclipse-collections-api</artifactId>
    <version>10.0.0</version>
</dependency>
3. First Benchmark
Our first benchmark is simple. We want to calculate the sum of a previously created List of Integers.
We’ll test six variants: three kinds of list (a JDK List, an Eclipse Collections MutableList, and an Eclipse Collections primitive IntList), each processed serially and in parallel:
// Members of the IntegerListSum benchmark class (class declaration and imports omitted)
private List<Integer> jdkIntList;
private MutableList<Integer> ecMutableList;
private ExecutorService executor;
private IntList ecIntList;

@Setup
public void setup() {
    // Fixed seed, so every run sums the same one million values
    PrimitiveIterator.OfInt iterator = new Random(1L).ints(-10000, 10000).iterator();
    ecMutableList = FastList.newWithNValues(1_000_000, iterator::nextInt);
    jdkIntList = new ArrayList<>(1_000_000);
    jdkIntList.addAll(ecMutableList);
    // Primitive copy of the same data, stored without boxing
    ecIntList = ecMutableList.collectInt(i -> i, new IntArrayList(1_000_000));
    executor = Executors.newWorkStealingPool();
}

// JDK list of boxed Integers, serial stream
@Benchmark
public long jdkList() {
    return jdkIntList.stream().mapToLong(i -> i).sum();
}

// Eclipse Collections list of boxed Integers, serial
@Benchmark
public long ecMutableList() {
    return ecMutableList.sumOfInt(i -> i);
}

// JDK list of boxed Integers, parallel stream
@Benchmark
public long jdkListParallel() {
    return jdkIntList.parallelStream().mapToLong(i -> i).sum();
}

// Eclipse Collections list of boxed Integers, parallel in batches of 100,000
@Benchmark
public long ecMutableListParallel() {
    return ecMutableList.asParallel(executor, 100_000).sumOfInt(i -> i);
}

// Eclipse Collections primitive list, serial
@Benchmark
public long ecPrimitive() {
    return this.ecIntList.sum();
}

// Eclipse Collections primitive list, parallel stream
@Benchmark
public long ecPrimitiveParallel() {
    return this.ecIntList.primitiveParallelStream().sum();
}
To run our first benchmark we need to execute:
mvn clean install
java -jar target/benchmarks.jar IntegerListSum -rf json
This runs the benchmarks in our IntegerListSum class and saves the results to a JSON file.
We’ll measure throughput, that is, the number of operations per second, so the higher the better:
Benchmark                              Mode  Cnt     Score       Error  Units
IntegerListSum.ecMutableList          thrpt   10   573.016 ±    35.865  ops/s
IntegerListSum.ecMutableListParallel  thrpt   10  1251.353 ±   705.196  ops/s
IntegerListSum.ecPrimitive            thrpt   10  4067.901 ±   258.574  ops/s
IntegerListSum.ecPrimitiveParallel    thrpt   10  8827.092 ± 11143.823  ops/s
IntegerListSum.jdkList                thrpt   10   568.696 ±     7.951  ops/s
IntegerListSum.jdkListParallel        thrpt   10   918.512 ±    27.487  ops/s
According to our tests, Eclipse Collections’ parallel list of primitives had the highest throughput of all, almost 10x that of the JDK list also running in parallel (8827.092 vs. 918.512 ops/s). Note, however, that the error margin reported for that measurement (± 11143.823 ops/s) is larger than the score itself, so this particular result should be read with caution.
Of course, a portion of that can be explained by the fact that when working with primitive lists, we don’t have the cost associated with boxing and unboxing.
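To make the boxing cost concrete, here is a minimal JDK-only sketch (it does not use Eclipse Collections or JMH, and the class and method names are hypothetical). It sums the same values once from a List<Integer>, where every element must be unboxed, and once from an int[], where no conversion is needed. It only illustrates where boxing and unboxing happen; it is not a benchmark.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class BoxingSketch {

    // Sums a List<Integer>: each element is an Integer object that has to be
    // unboxed to an int before it can be added.
    public static long sumBoxed(List<Integer> values) {
        long total = 0;
        for (Integer value : values) {
            total += value; // implicit unboxing
        }
        return total;
    }

    // Sums an int[]: the values are stored directly in the array,
    // so no boxing or unboxing takes place.
    public static long sumPrimitive(int[] values) {
        long total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        int size = 1_000_000;
        // Same seed and bounds as the benchmark setup
        int[] primitives = new Random(1L).ints(size, -10000, 10000).toArray();

        List<Integer> boxed = new ArrayList<>(size);
        for (int value : primitives) {
            boxed.add(value); // implicit boxing
        }

        // Both representations hold the same numbers, so the sums match
        System.out.println(sumBoxed(boxed) == sumPrimitive(primitives));
    }
}
```

Both methods return the same total; the difference is only in how much work the JVM does per element to get at the int value.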
We can use JMH Visualizer to analyze our results. The chart below shows a better visualization:
4. Filtering
Next, we’ll filter our list to get all elements that are multiples of 5. We’ll reuse a big portion of our previous benchmark and add a filter function:
// Members of the IntegerListFilter benchmark class (class declaration and imports omitted)
private List<Integer> jdkIntList;
private MutableList<Integer> ecMutableList;
private IntList ecIntList;
private ExecutorService executor;

@Setup
public void setup() {
    // Same data as in the sum benchmark: one million values from a fixed seed
    PrimitiveIterator.OfInt iterator = new Random(1L).ints(-10000, 10000).iterator();
    ecMutableList = FastList.newWithNValues(1_000_000, iterator::nextInt);
    jdkIntList = new ArrayList<>(1_000_000);
    jdkIntList.addAll(ecMutableList);
    ecIntList = ecMutableList.collectInt(i -> i, new IntArrayList(1_000_000));
    executor = Executors.newWorkStealingPool();
}

// JDK list of boxed Integers, serial stream
@Benchmark
public List<Integer> jdkList() {
    return jdkIntList.stream().filter(i -> i % 5 == 0).collect(Collectors.toList());
}

// Eclipse Collections list of boxed Integers, serial
@Benchmark
public MutableList<Integer> ecMutableList() {
    return ecMutableList.select(i -> i % 5 == 0);
}

// JDK list of boxed Integers, parallel stream
@Benchmark
public List<Integer> jdkListParallel() {
    return jdkIntList.parallelStream().filter(i -> i % 5 == 0).collect(Collectors.toList());
}

// Eclipse Collections list of boxed Integers, parallel in batches of 100,000
@Benchmark
public MutableList<Integer> ecMutableListParallel() {
    return ecMutableList.asParallel(executor, 100_000).select(i -> i % 5 == 0).toList();
}

// Eclipse Collections primitive list, serial
@Benchmark
public IntList ecPrimitive() {
    return this.ecIntList.select(i -> i % 5 == 0);
}

// Eclipse Collections primitive list, parallel stream collected into a MutableIntList
@Benchmark
public IntList ecPrimitiveParallel() {
    return this.ecIntList.primitiveParallelStream()
        .filter(i -> i % 5 == 0)
        .collect(IntLists.mutable::empty, MutableIntList::add, MutableIntList::addAll);
}
We’ll execute the test just like before:
mvn clean install
java -jar target/benchmarks.jar IntegerListFilter -rf json
And the results:
Benchmark                                 Mode  Cnt     Score    Error  Units
IntegerListFilter.ecMutableList          thrpt   10   145.733 ±  7.000  ops/s
IntegerListFilter.ecMutableListParallel  thrpt   10   603.191 ± 24.799  ops/s
IntegerListFilter.ecPrimitive            thrpt   10   232.873 ±  8.032  ops/s
IntegerListFilter.ecPrimitiveParallel    thrpt   10  1029.481 ± 50.570  ops/s
IntegerListFilter.jdkList                thrpt   10   155.284 ±  4.562  ops/s
IntegerListFilter.jdkListParallel        thrpt   10   445.737 ± 23.685  ops/s
As we can see, the Eclipse Collections primitive list was the winner again, with a throughput more than 2x that of the JDK parallel list.
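The speedup figures quoted in this article can be checked directly from the Score columns of the two result tables. The short sketch below does that arithmetic; the class and method names are hypothetical, and the numbers are copied from the results shown earlier.

```java
public class SpeedupCheck {

    // Ratio of two throughput scores measured in the same unit (ops/s).
    public static double speedup(double candidateOpsPerSec, double baselineOpsPerSec) {
        return candidateOpsPerSec / baselineOpsPerSec;
    }

    public static void main(String[] args) {
        // IntegerListFilter: ecPrimitiveParallel vs. jdkListParallel
        double filterSpeedup = speedup(1029.481, 445.737);

        // IntegerListSum: ecPrimitiveParallel vs. jdkListParallel
        double sumSpeedup = speedup(8827.092, 918.512);

        System.out.println("filter speedup: " + filterSpeedup);
        System.out.println("sum speedup: " + sumSpeedup);
    }
}
```

The ratios come out at about 2.31 for filtering and about 9.61 for summing, which matches the "more than 2x" and "almost 10x" descriptions. These are ratios of mean scores only and do not take the reported error margins into account.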
Note that for filtering, the effect of parallel processing is more visible. Summing is a cheap operation for the CPU, so there we don’t see the same difference between serial and parallel execution.
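As a JDK-only illustration of the serial and parallel variants being compared, the sketch below runs the same multiple-of-5 filter through a sequential and a parallel stream. The class and method names are hypothetical and the code is not part of the benchmark; it only shows that the two modes differ in how the work is scheduled, not in the result.

```java
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;

public class FilterSketch {

    // Keeps the multiples of 5, using either a sequential or a parallel stream.
    public static List<Integer> multiplesOfFive(List<Integer> values, boolean parallel) {
        return (parallel ? values.parallelStream() : values.stream())
                .filter(i -> i % 5 == 0)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Same seed and bounds as the benchmark setup
        List<Integer> values = new Random(1L).ints(1_000_000, -10000, 10000)
                .boxed()
                .collect(Collectors.toList());

        List<Integer> serial = multiplesOfFive(values, false);
        List<Integer> parallel = multiplesOfFive(values, true);

        // A list is an ordered source, so the collected results keep the
        // same encounter order in both modes and compare as equal.
        System.out.println(serial.equals(parallel)); // prints "true"
    }
}
```

Any throughput difference between the two calls therefore comes from splitting the work across threads, which is what the parallel benchmark variants measure.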
Also, the performance boost that Eclipse Collections primitive lists got earlier begins to evaporate as the work done on each element begins to outweigh the cost of boxing and unboxing.
Finally, we can see that operations on primitives are faster than operations on objects:
5. Conclusion
In this article, we created a couple of benchmarks to compare the JDK collections with Eclipse Collections. We used JMH to try to minimize environmental bias.
As always, the source code is available over on GitHub.