Radix Sort in Java – 在Java中进行Radix排序

最后修改: 2019年 9月 11日

中文/混合/英文(键盘快捷键:t)

1. Introduction

1.绪论

In this tutorial, we’ll learn about Radix Sort, analyze its performance, and take a look at its implementation.

在本教程中,我们将了解Radix Sort,分析其性能,并看看其实现。

Here we focus on using Radix Sort to sort integers, but it’s not limited to just numbers. We can use it to sort other types such as String, too.

这里我们主要是使用Radix Sort来对整数进行排序,但它并不局限于数字。我们也可以用它来对其他类型进行排序,比如String,

In order to keep it simple, we’re gonna focus on the decimal system in which the numbers are expressed in base (radix) 10.

为了保持简单,我们将专注于十进制系统,其中的数字以基数(小数)10表示。

2. Algorithm Overview

2.算法概述

Radix sort is a sorting algorithm that sorts numbers based on the positions of their digits. Basically, it uses the place value of the digits in a number. Unlike most of the other sorting algorithms, such as Merge Sort, Insertion Sort, Bubble Sort, it doesn’t compare the numbers.

径向排序是一种排序算法,根据数字的位置对数字进行排序。基本上,它使用数字中的位数值。与其他大多数排序算法不同,例如合并排序插入排序泡沫排序,它不会比较数字。

Radix sort uses a stable sorting algorithm as a subroutine to sort the digits. We’ve used a variation of counting sort as a subroutine here that uses the radix to sort the digits in every position. Counting sort is a stable sorting algorithm and it works well in practice.

径向排序使用稳定的排序算法作为子程序来排序数字。我们在这里使用了计数排序的一个变体作为子程序,它使用弧度来对每个位置的数字进行排序。计数排序是一种稳定的排序算法,它在实践中运行良好。

Radix sort works by sorting digits from the Least Significant Digit (LSD) to the Most Significant Digit (MSD). We can also implement Radix sort to process digits from MSD.

径向排序的工作原理是将数字从最小有效数字(LSD)排序到最大有效数字(MSD)。我们也可以实现Radix排序来处理从MSD开始的数字。

3. A Quick Example

3.一个快速的例子

Let’s see how it works with an example. Let’s consider the following array:

让我们通过一个例子来看看它是如何工作的。让我们考虑以下数组。

Array

Array

3.1. Iteration 1

3.1.迭代1

We’ll sort this array by processing digits from LSD and moving towards MSD.

我们将对这个数组进行排序,从LSD开始处理数字,向MSD移动。

So let’s start with the digits in ones place:

因此,让我们从1位数开始。

Before Iteration 1

Before Iteration 1

After the first iteration, the array now looks like:

经过第一次迭代,现在的数组看起来像。

After Iteration 1

After Iteration 1

Note that the numbers have been sorted according to the digits in ones place.

请注意,这些数字是根据1位数来排序的。

3.2. Iteration 2

3.2.迭代2

Let’s move on to the digits in tens place:

让我们继续讨论十位数的问题。

Before Iteration 2

Before Iteration 2

Now the array looks like:

现在数组看起来像。

After Iteration 2

After Iteration 2

We see that the number 7 has occupied the first position in the array since it doesn’t have any digit in the tens place. We could also think of this as having a 0 in the tens place.

我们看到,数字7占据了数组中的第一个位置,因为它在十位上没有任何数字。我们也可以认为这是在十位上有一个0。

3.3. Iteration 3

3.3.迭代3

Let’s move on to the digits in the hundreds position:

让我们继续讨论百位数的问题。

Before Iteration 3

Before Iteration 3

After this iteration, the array looks like:

经过这次迭代,数组看起来像。

After Iteration 3

After Iteration 3

And the algorithm stops here, with all elements sorted.

算法到此为止,所有元素都已排序。

4. Implementation

4.实施

Let’s now look at the implementation.

现在让我们来看看实施情况。

void sort(int[] numbers) {
    int maximumNumber = findMaximumNumberIn(numbers);
    int numberOfDigits = calculateNumberOfDigitsIn(maximumNumber);
    int placeValue = 1;
    while (numberOfDigits-- > 0) {
        applyCountingSortOn(numbers, placeValue);
        placeValue *= 10;
    }
}

The algorithm works by finding out the maximum number in the array and then calculating its length. This step helps us to ensure that we execute the subroutine for every place value.

该算法的工作原理是找出数组中的最大数字,然后计算其长度。这一步可以帮助我们确保对每个位值都执行子程序。

For example, in the array, [7, 37, 68, 123, 134, 221, 387, 468, 769], the maximum number is 769 and its length is 3.

例如,在数组中,[7, 37, 68, 123, 134, 221, 387, 468, 769],最大数字为769,其长度为3。

So, we iterate and apply the subroutine thrice on the digits in every position:

因此,我们对每个位置的数字进行三次迭代和应用子程序。

void applyCountingSortOn(int[] numbers, int placeValue) {

    int range = 10 // decimal system, numbers from 0-9

    // ...

    // calculate the frequency of digits
    for (int i = 0; i < length; i++) {
        int digit = (numbers[i] / placeValue) % range;
        frequency[digit]++;
    }

    for (int i = 1; i < range; i++) {
        frequency[i] += frequency[i - 1];
    }

    for (int i = length - 1; i >= 0; i--) {
        int digit = (numbers[i] / placeValue) % range;
        sortedValues[frequency[digit] - 1] = numbers[i];
        frequency[digit]--;
    }

    System.arraycopy(result, 0, numbers, 0, length); 

}

In the subroutine, we’ve used the radix (range) to count the occurrence of each digit and increment its frequency. So, each bin in the range from 0 to 9 will have some value based on the frequency of digits. We then use the frequency to position each element in the array. This also helps us to minimize the space required to sort the array.

在子程序中,我们用弧度(range)来计算每个数字的出现,并递增其频率。因此,从0到9的范围内的每个bin将有一些基于数字频率的值。然后我们用频率来定位数组中的每个元素。这也有助于我们尽量减少数组排序所需的空间。

Now let’s test our method:

现在让我们测试一下我们的方法。

@Test
public void givenUnsortedArray_whenRadixSort_thenArraySorted() {
    int[] numbers = {387, 468, 134, 123, 68, 221, 769, 37, 7};
    RadixSort.sort(numbers);
    int[] numbersSorted = {7, 37, 68, 123, 134, 221, 387, 468, 769};
    assertArrayEquals(numbersSorted, numbers); 
}

5. Radix Sort vs Counting Sort

5.弧形排序与计数排序

In the subroutine, the length of the frequency array is 10 (0-9). In the case of Counting Sort, we don’t use the range. The length of the frequency array will be the maximum number in the array + 1. So we don’t divide them into bins whereas Radix Sort uses the bins to sort.

在子程序中,frequency数组的长度是10(0-9)。在计数排序的情况下,我们不使用范围频率数组的长度将是数组中的最大数字+1。因此,我们不把它们分成仓,而Radix Sort则使用仓来排序。

Counting Sort is quite efficient when the length of the array is not much smaller than the maximum value in the array whereas Radix Sort allows for larger values in the array.

当数组的长度不比数组中的最大值小多少时,计数排序是相当有效的,而拉德克斯排序则允许数组中的数值更大。

6. Complexity

6.复杂性

The performance of Radix Sort depends on the stable sorting algorithm chosen to sort the digits.

Radix排序的性能取决于选择的稳定的排序算法来对数字进行排序。

Here we’ve used the Radix Sort to sort an array of n numbers in base b. In our case, the base is 10. We’ve applied the Counting Sort d times where d stands for the number of digits. So the time complexity of Radix Sort becomes O(d * (n + b)).

在这里,我们使用Radix Sort来对一个以b为基数的n数组进行排序。在我们的例子中,基数是10。我们已经应用了计数排序d次,其中d代表数字的数量。因此,Radix排序的时间复杂度变为O(d * (n + b))

The space complexity is O(n + b) since we have used a variation of Counting Sort as a subroutine here.

空间复杂度为O(n + b),因为我们在这里使用了计数排序的一个变体作为子程序。

7. Conclusion

7.结语

In this article, we described the Radix sort algorithm and illustrated how to implement it.

在这篇文章中,我们描述了Radix排序算法并说明了如何实现它。

As usual, the code implementations are available over on Github.

像往常一样,代码实现可在Github上获得over。