1. Overview
We’ve all used Arrays.sort() to sort an array of objects or primitives. JDK 8 enhanced the API with a new method: Arrays.parallelSort().
In this tutorial, we’ll draw a comparison between the sort() and parallelSort() methods.
2. Arrays.sort()
The Arrays.sort() method sorts an array of objects or primitives. For primitive arrays, it uses Dual-Pivot Quicksort, a variant of the Quicksort algorithm tuned for better performance. For object arrays, it uses a stable merge sort (TimSort).
This method is single-threaded and there are two variants:
- sort(array) – sorts the full array into ascending order
- sort(array, fromIndex, toIndex) – sorts only the elements from fromIndex (inclusive) to toIndex (exclusive)
Let’s see an example of both variants:
```java
@Test
public void givenArrayOfIntegers_whenUsingArraysSortMethod_thenSortFullArrayInAscendingOrder() {
    int[] array = { 10, 4, 6, 2, 1, 9, 7, 8, 3, 5 };
    int[] expected = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

    Arrays.sort(array);

    assertArrayEquals(expected, array);
}

@Test
public void givenArrayOfIntegers_whenUsingArraysSortWithRange_thenSortRangeOfArrayAsc() {
    int[] array = { 10, 4, 6, 2, 1, 9, 7, 8, 3, 5 };
    int[] expected = { 10, 4, 1, 2, 6, 7, 8, 9, 3, 5 };

    // Sorts indices 2 (inclusive) through 8 (exclusive)
    Arrays.sort(array, 2, 8);

    assertArrayEquals(expected, array);
}
```
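The tests above use an int[]. As a minimal sketch of the same two variants on an object array, the example below sorts a String[] with a Comparator and then sorts only a range. The class and method names (ObjectSortExample, sortByLength, sortRange) are hypothetical and exist only for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;

public class ObjectSortExample {

    // Sorts a copy of the input by string length. Arrays.sort on object
    // arrays is documented as stable, so equal-length strings keep their order.
    static String[] sortByLength(String[] words) {
        String[] copy = Arrays.copyOf(words, words.length);
        Arrays.sort(copy, Comparator.comparingInt(String::length));
        return copy;
    }

    // Sorts only the indices from (inclusive) to (exclusive) of a copy,
    // using the natural (alphabetical) ordering of String.
    static String[] sortRange(String[] words, int from, int to) {
        String[] copy = Arrays.copyOf(words, words.length);
        Arrays.sort(copy, from, to);
        return copy;
    }

    public static void main(String[] args) {
        String[] words = { "pear", "fig", "banana", "kiwi", "apple" };
        System.out.println(Arrays.toString(sortByLength(words)));
        System.out.println(Arrays.toString(sortRange(words, 1, 4)));
    }
}
```

Both helpers work on a copy so the caller's array stays untouched, which makes the before/after comparison easier to follow.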
Let’s summarize the pros and cons of this approach:
| PROS | CONS |
|---|---|
| Works fast on smaller data sets | Performance degrades for large data sets |
| | Multiple cores of the system aren’t utilized |
3. Arrays.parallelSort()
This method also sorts an array of objects or primitives. Like sort(), it has two variants: one for the full array and one for a range of it:
```java
@Test
public void givenArrayOfIntegers_whenUsingArraysParallelSortMethod_thenSortFullArrayInAscendingOrder() {
    int[] array = { 10, 4, 6, 2, 1, 9, 7, 8, 3, 5 };
    int[] expected = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

    Arrays.parallelSort(array);

    assertArrayEquals(expected, array);
}

@Test
public void givenArrayOfIntegers_whenUsingArraysParallelSortWithRange_thenSortRangeOfArrayAsc() {
    int[] array = { 10, 4, 6, 2, 1, 9, 7, 8, 3, 5 };
    int[] expected = { 10, 4, 1, 2, 6, 7, 8, 9, 3, 5 };

    // Sorts indices 2 (inclusive) through 8 (exclusive)
    Arrays.parallelSort(array, 2, 8);

    assertArrayEquals(expected, array);
}
```
parallelSort() produces the same result as sort() but works differently internally. Unlike sort(), which sorts data sequentially on a single thread, it uses a parallel sort-merge algorithm: it breaks the array into sub-arrays, sorts each of them, and then merges the sorted sub-arrays.
To execute the parallel tasks, it uses the common ForkJoin pool (ForkJoinPool.commonPool()).
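To make the split, sort, and merge idea concrete, here is a simplified sketch built on RecursiveAction. It is not the JDK's actual implementation: the class name ForkJoinSortSketch, the SortTask helper, and the THRESHOLD value are all assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

public class ForkJoinSortSketch {

    // Hypothetical cut-off: chunks at or below this size are sorted sequentially.
    static final int THRESHOLD = 1_000;

    static class SortTask extends RecursiveAction {
        private final int[] array;
        private final int from; // inclusive
        private final int to;   // exclusive

        SortTask(int[] array, int from, int to) {
            this.array = array;
            this.from = from;
            this.to = to;
        }

        @Override
        protected void compute() {
            if (to - from <= THRESHOLD) {
                // Small chunk: sort it directly on the current thread.
                Arrays.sort(array, from, to);
                return;
            }
            int mid = (from + to) >>> 1;
            // Sort both halves as parallel subtasks and wait for both to finish.
            invokeAll(new SortTask(array, from, mid), new SortTask(array, mid, to));
            merge(array, from, mid, to);
        }
    }

    // Merges the sorted ranges [from, mid) and [mid, to) in place,
    // using a temporary copy of the left range.
    static void merge(int[] array, int from, int mid, int to) {
        int[] left = Arrays.copyOfRange(array, from, mid);
        int i = 0;
        int j = mid;
        int k = from;
        while (i < left.length && j < to) {
            array[k++] = (left[i] <= array[j]) ? left[i++] : array[j++];
        }
        while (i < left.length) {
            array[k++] = left[i++];
        }
        // Any remaining right-hand elements are already in their final position.
    }

    static void sort(int[] array) {
        ForkJoinPool.commonPool().invoke(new SortTask(array, 0, array.length));
    }

    public static void main(String[] args) {
        int[] array = { 10, 4, 6, 2, 1, 9, 7, 8, 3, 5 };
        sort(array);
        System.out.println(Arrays.toString(array));
    }
}
```

In practice, Arrays.parallelSort() is the call to use; the sketch only shows the shape of the divide-and-merge work that gets scheduled on the pool.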
Note, however, that it sorts in parallel only when certain conditions are met. In JDK 8, if the array length is less than or equal to 8192, or the ForkJoin common pool has a parallelism of 1 (as on a single-core machine), it falls back to the sequential sort (Dual-Pivot Quicksort for primitive arrays). Otherwise, it uses a parallel sort.
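The conditions above can be written as a small predicate. This is a sketch under the assumption that the JDK 8 behavior applies; the names ParallelSortThreshold, SEQUENTIAL_THRESHOLD, and wouldRunInParallel are hypothetical, and the single-core condition is modeled through ForkJoinPool.getCommonPoolParallelism(), which is also an assumption of this sketch.

```java
import java.util.concurrent.ForkJoinPool;

public class ParallelSortThreshold {

    // Hypothetical constant mirroring the 8192-element limit described above.
    static final int SEQUENTIAL_THRESHOLD = 1 << 13; // 8192

    // Returns true only when the array is longer than the threshold
    // and more than one worker is available to share the work.
    static boolean wouldRunInParallel(int length, int parallelism) {
        return length > SEQUENTIAL_THRESHOLD && parallelism > 1;
    }

    public static void main(String[] args) {
        int parallelism = ForkJoinPool.getCommonPoolParallelism();
        for (int length : new int[] { 1_000, 8_192, 8_193, 1_000_000 }) {
            System.out.printf("length=%d parallelism=%d parallel=%b%n",
                length, parallelism, wouldRunInParallel(length, parallelism));
        }
    }
}
```

The printed values depend on the machine, since the common pool's parallelism is derived from the number of available processors unless it is overridden.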
Let’s summarize the advantages and disadvantages of using it:
| PROS | CONS |
|---|---|
| Offers better performance for large data sets | Slower for smaller arrays |
| Utilizes multiple cores of the system | |
4. Comparison
Let’s now see how both methods perform with data sets of different sizes. The numbers below were obtained with JMH benchmarks on an AMD A10 PRO 2.1 GHz quad-core processor running JDK 1.8.0_221:
Array Size | Arrays.sort() | Arrays.parallelSort() |
---|---|---|
1000 | 0.048 | 0.054 |
10000 | 0.847 | 0.425 |
100000 | 7.570 | 4.395 |
1000000 | 65.301 | 37.998 |
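The JMH harness behind these numbers isn't shown here. As a rough stand-in, the sketch below times both methods with System.nanoTime() on the same array sizes. It is far less rigorous than JMH (a few warm-up runs, one measured run, no statistics), and the names NaiveSortTiming, randomArray, and timeMillis are hypothetical. Results will vary by machine and JDK.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.Consumer;

public class NaiveSortTiming {

    // Builds a reproducible array of random ints for the given seed.
    static int[] randomArray(int size, long seed) {
        return new Random(seed).ints(size).toArray();
    }

    // Runs the sorter once on a fresh copy of the data and
    // returns the elapsed time in milliseconds.
    static double timeMillis(int[] data, Consumer<int[]> sorter) {
        int[] copy = Arrays.copyOf(data, data.length);
        long start = System.nanoTime();
        sorter.accept(copy);
        return (System.nanoTime() - start) / 1_000_000.0;
    }

    public static void main(String[] args) {
        for (int size : new int[] { 1_000, 10_000, 100_000, 1_000_000 }) {
            int[] data = randomArray(size, 42L);

            // A few warm-up runs so the JIT has a chance to compile both paths.
            for (int i = 0; i < 5; i++) {
                timeMillis(data, a -> Arrays.sort(a));
                timeMillis(data, a -> Arrays.parallelSort(a));
            }

            double sequential = timeMillis(data, a -> Arrays.sort(a));
            double parallel = timeMillis(data, a -> Arrays.parallelSort(a));
            System.out.printf("%d | %.3f ms | %.3f ms%n", size, sequential, parallel);
        }
    }
}
```

Each measurement sorts a fresh copy, so neither method ever receives already-sorted input.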
5. Conclusion
In this quick article, we saw how sort() and parallelSort() differ.
Based on the performance results, we can conclude that parallelSort() may be a better choice when we have a large data set to sort. For smaller arrays, however, it’s better to go with sort(), since it offers better performance.
As always, the complete source code is available over on GitHub.