Time Comparison of Arrays.sort(Object[]) and Arrays.sort(int[]) – Arrays.sort(Object[])和Arrays.sort(int[])的时间比较

最后修改: 2019年 3月 6日

中文/混合/英文(键盘快捷键:t)

1. Overview

1.概述

In this quick tutorial, we’ll compare the two Arrays.sort(Object[]) and Arrays.sort(int[]) sorting operations.

在这个快速教程中,我们将比较两个Arrays.sort(Object[])Arrays.sort(int[]) sorting操作

First, we’ll describe each method separately. After that, we’ll write performance tests to measure their running times.

首先,我们将分别描述每种方法。之后,我们将编写性能测试来测量它们的运行时间。

2. Arrays.sort(Object[])

2.Arrays.sort(Object[])

Before we move ahead, it’s important to keep in mind that Arrays.sort() works for both primitive and reference type arrays.

在我们继续前进之前,重要的是要记住,Arrays.sort() 对于原始和引用类型的数组都有效。

Arrays.sort(Object[]) accepts reference types.

Arrays.sort(Object[]) 接受引用类型

For example, we have an array of Integer objects:

例如,我们有一个Integer对象的数组。

Integer[] numbers = {5, 22, 10, 0};

To sort the array, we can simply use:

为了给数组排序,我们可以简单地使用。

Arrays.sort(numbers);

Now, the numbers array has all its elements in ascending order:

现在,数字数组的所有元素都以升序排列。

[0, 5, 10, 22]

Arrays.sort(Object[]) is based on the TimSort algorithm, giving us a time complexity of O(n log(n)). In short, TimSort makes use of the Insertion sort and the MergeSort algorithms. However, it is still slower compared to other sorting algorithms like some of the QuickSort implementations.

Arrays.sort(Object[]) 基于TimSort算法,为我们提供了O(n log(n))的时间复杂度。简而言之,TimSort利用了插入排序合并排序算法。然而,与其他排序算法(如一些QuickSort的实现)相比,它仍然较慢。

3. Arrays.sort(int[])

3.Arrays.sort(int[])

On the other hand, Arrays.sort(int[]) works with primitive int arrays.

另一方面,Arrays.sort(int[])适用于原始int数组。

Similarly, we can define an int[] array of primitives:

同样地,我们可以定义一个int[]基元数组。

int[] primitives = {5, 22, 10, 0};

And sort it with another implementation of Arrays.sort(int[]). This time, accepting an array of primitives:

并用另一个Arrays.sort(int[])的实现对其进行排序。这一次,接受了一个基元数组。

Arrays.sort(primitives);

The result of this operation will be no different from the previous example. And the items in the primitives array will look like:

这个操作的结果将与前面的例子没有什么不同。而primitives数组中的项目将看起来像。

[0, 5, 10, 22]

Under the hood, it uses a Dual-Pivot Quicksort algorithm. Its internal implementation from the JDK 10 is typically faster than traditional one-pivot Quicksort.

在引擎盖下,它使用双枢轴Quicksort算法。其来自JDK 10的内部实现通常比传统的单支点Quicksort更快。

This algorithm offers O(n log(n)) average time complexity. That’s a great average sorting time for many collections to have. Moreover, it has the advantage of being completely in place, so it does not require any additional storage.

这个算法提供了O(n log(n))平均时间复杂性。对于许多集合来说,这是一个很好的平均分拣时间。此外,它的优点是完全到位,所以它不需要任何额外的存储。

Though, in the worst case, its time complexity is O(n2)

虽然,在最坏的情况下,其时间复杂度为O(n2)

4. Time Comparison

4.时间比较

So, which algorithm is faster and why? Let’s first do some theory, and then we’ll run some concrete tests with JMH.

那么,哪种算法更快,为什么?让我们先做一些理论研究,然后用JMH进行一些具体测试。

4.1. Qualitative Analysis

4.1.定性分析

Arrays.sort(Object[]) is typically slower compared to Arrays.sort(int[]) for a few different reasons.

Arrays.sort(Object[])Arrays.sort(int[])相比,通常会慢一些,原因有几个。

The first is the different algorithms. QuickSort is often faster than Timsort.

首先是算法的不同。QuickSort通常比Timsort快。

Second is how each method compares the values.

第二是每种方法如何比较数值。

See, since Arrays.sort(Object[]) needs to compare one object against another, it needs to call each element’s compareTo method. At the very least, this requires a method lookup and pushing a call onto the stack in addition to whatever the comparison operation actually is.

看,由于Arrays.sort(Object[])需要将一个对象与另一个对象进行比较,它需要调用每个元素的compareTo方法。至少,除了实际的比较操作外,这还需要进行方法查找并将调用推入栈中。

On the other hand, Arrays.sort(int[]) can simply use primitive relational operators like < and >, which are single bytecode instructions.

另一方面,Arrays.sort(int[])可以简单地使用原始的关系运算符,如<>,它们是单一的字节码指令。

4.2. JMH Parameters

4.2.JMH参数

Finally, let’s find out which sorting method runs faster with actual data. For that, we’ll use the JMH (Java Microbenchmark Harness) tool to write our benchmark tests.

最后,让我们找出哪种排序方法在实际数据中运行得更快。为此,我们将使用JMH(Java Microbenchmark Harness)工具来编写我们的基准测试。

So, we are just going to do a very simple benchmark here. It’s not comprehensive but will give us an idea of how we can approach comparing Arrays.sort(int[]) and Arrays.sort(Integer[]) sorting methods.

所以,我们在这里只是要做一个非常简单的基准测试。它并不全面,但会给我们一个概念,让我们知道如何去比较Arrays.sort(int[])Arrays.sort(Integer[])排序方法。

In our benchmark class we’ll use configuration annotations:

在我们的基准类中,我们将使用配置注解。

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Measurement(batchSize = 100000, iterations = 10)
@Warmup(batchSize = 100000, iterations = 10)
public class ArraySortBenchmark {
}

Here, we want to measure the average time for a single operation (Mode.AverageTime) and display our results in milliseconds (TimeUnit.MILLISECONDS). Furthermore, with the batchSize parameter, we’re telling JMH to perform 100,000 iterations to make sure our results have high precision.

在这里,我们想测量单个操作的平均时间(Mode.AverageTime)并以毫秒为单位显示我们的结果(TimeUnit.MILLISECONDS)。此外,通过batchSize参数,我们要告诉JMH执行100,000次迭代,以确保我们的结果具有较高的精度。

4.3. Benchmark Tests

4.3.基准测试

Before running the tests, we need to define the data containers which we want to sort:

在运行测试之前,我们需要定义我们想要排序的数据容器。

@State(Scope.Thread)
public static class Initialize {
    Integer[] numbers = {-769214442, -1283881723, 1504158300, -1260321086, -1800976432, 1278262737, 
      1863224321, 1895424914, 2062768552, -1051922993, 751605209, -1500919212, 2094856518, 
      -1014488489, -931226326, -1677121986, -2080561705, 562424208, -1233745158, 41308167 };
    int[] primitives = {-769214442, -1283881723, 1504158300, -1260321086, -1800976432, 1278262737, 
      1863224321, 1895424914, 2062768552, -1051922993, 751605209, -1500919212, 2094856518, 
      -1014488489, -931226326, -1677121986, -2080561705, 562424208, -1233745158, 41308167};
}

Let’s choose the Integer[] numbers and the int[] primitives array of primitive elements. The @State annotation indicates that the variables declared in the class won’t be the part of running benchmark tests. However, we can then use them in our benchmark methods.

让我们选择Integer[] 数字int[] primitives数组的原始元素。@State注解表明,在类中声明的变量不会成为运行基准测试的一部分。然而,我们可以在我们的基准测试方法中使用它们。

Now, we’re ready to add the first micro-benchmark for Arrays.sort(Integer[]):

现在,我们准备为Arrays.sort(Integer[])添加第一个微型基准。

@Benchmark
public Integer[] benchmarkArraysIntegerSort(ArraySortBenchmark.Initialize state) {
    Arrays.sort(state.numbers);
    return state.numbers;
}

Next, for Arrays.sort(int[]):

接下来,对于Arrays.sort(int[])

@Benchmark
public int[] benchmarkArraysIntSort(ArraySortBenchmark.Initialize state) {
    Arrays.sort(state.primitives);
    return state.primitives;
}

4.4. Test Results

4.4.测试结果

Finally, we run our tests and compare the results:

最后,我们运行我们的测试并比较结果。

Benchmark                   Mode  Cnt  Score   Error  Units
benchmarkArraysIntSort      avgt   10  1.095 ± 0.022  ms/op
benchmarkArraysIntegerSort  avgt   10  3.858 ± 0.060  ms/op

From the results, we can see that Arrays.sort(int[]) method performed better than to Arrays.sort(Object[]) in our test, likely for the earlier reasons we identified.

从结果中,我们可以看到,在我们的测试中,Arrays.sort(int[])方法比Arrays.sort(Object[])表现更好,可能是因为我们之前确定的原因。

And even though the numbers appear to support our theory, though we’d need to do testing with a greater variety of inputs to get a better idea.

尽管这些数字似乎支持我们的理论,但我们需要用更多种类的输入做测试,以获得更好的想法。

Also, keep in mind that the numbers we present here are just JMH benchmark results – so we should always test in the scope of our own system and runtime.

此外,请记住,我们在这里提出的数字只是JMH的基准结果 – 所以我们应该始终在自己的系统和运行时间范围内进行测试。

4.5. Why Timsort Then?

4.5.那为什么是Timsort?

We should probably ask ourselves a question, then. If QuickSort is faster, why not use it for both implementations?

那么,我们也许应该问自己一个问题。如果QuickSort的速度更快,为什么不把它用于两种实现方式?

See, QuickSort isn’t stable, so we can’t use it to sort Objects. Basically, if two ints are equal, it doesn’t matter that their relative order stays the same since one is no different from another 2. With objects though, we can sort by one attribute and then another, making the starting order matter.

看,QuickSort不是稳定的,所以我们不能用它来给Objects排序。基本上,如果两个ints是相等的,它们的相对顺序保持不变并不重要,因为一个2和另一个2>没有什么不同。但是对于对象,我们可以通过一个属性来排序,然后再通过另一个属性来排序,这使得起始顺序很重要。

5. Conclusion

5.结论

In this article, we compared two sorting methods available in Java: Arrays.sort(int[]) and Arrays.sort(Integer[]). Additionally, we discussed the sorting algorithms used in their implementations.

在这篇文章中,我们比较了Java中可用的两种排序方法。Arrays.sort(int[])Arrays.sort(Integer[])。此外,我们还讨论了在其实现中使用的排序算法。

Finally, with the help of benchmark performance tests, we showed a sample run time of each sorting option.

最后,在基准性能测试的帮助下,我们展示了每个排序选项的样本运行时间。

As usual, the complete code for this article is available over on GitHub.

像往常一样,本文的完整代码可以在GitHub上找到