How to Find the Kth Largest Element in Java

Last modified: January 8, 2018

1. Introduction

In this article, we’ll present various solutions for finding the kth largest element in a sequence of unique numbers. We’ll use an array of integers for our examples.

We’ll also talk about each algorithm’s average and worst-case time complexity.

2. Solutions

Now let’s explore a few possible solutions: one using a plain sort, and two using the QuickSelect algorithm, which is derived from QuickSort.

2.1. Sorting

When we think about the problem, perhaps the most obvious solution that comes to mind is to sort the array.

Let’s define the steps required:

  • Sort the array in ascending order
  • Since the last element of the sorted array is the largest, the kth largest element is at index x, where x = length(array) - k

As we can see, the solution is straightforward but requires sorting the entire array. Hence, the time complexity is O(n log n):

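// Requires: import java.util.Arrays;
// Note: sorts arr in place; k is 1-based (k = 1 returns the largest element).
public int findKthLargestBySorting(Integer[] arr, int k) {
    Arrays.sort(arr);
    int targetIndex = arr.length - k;
    return arr[targetIndex];
}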

An alternative approach is to sort the array in descending order and simply return the element at index k - 1:

// Requires: import java.util.Arrays; import java.util.Collections;
public int findKthLargestBySortingDesc(Integer[] arr, int k) {
    Arrays.sort(arr, Collections.reverseOrder());
    return arr[k-1];
}
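
Both methods above sort the caller's array in place. If that side effect is unwanted, one option is to sort a copy instead. The following is only a sketch under that assumption: the class name KthLargestBySortingSketch, the method kthLargest, and the range check on k are illustrative and not part of the original code.

```java
import java.util.Arrays;

final class KthLargestBySortingSketch {

    // Returns the kth largest value (k is 1-based) without modifying the input.
    static int kthLargest(int[] values, int k) {
        if (k < 1 || k > values.length) {
            throw new IllegalArgumentException("k must be between 1 and " + values.length);
        }
        int[] sorted = Arrays.copyOf(values, values.length); // work on a copy
        Arrays.sort(sorted);                                  // ascending order
        return sorted[sorted.length - k];
    }

    public static void main(String[] args) {
        int[] values = {3, 7, 1, 2, 4, 10, 20, 8, 9, 5};
        System.out.println(kthLargest(values, 3)); // prints 9
    }
}
```

The copy costs O(n) extra space, which does not change the overall O(n log n) time complexity.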

2.2. QuickSelect

This can be considered an optimization of the previous approach, with QuickSort as the sorting algorithm. Analyzing the problem statement, we realize that we don’t actually need to sort the entire array. We only need to rearrange its contents so that the element at the kth position is the kth largest or smallest.

In QuickSort, we pick a pivot element and move it to its correct position. We also partition the array around it. In QuickSelect, the idea is to stop at the point where the pivot itself is the kth largest element.

We can optimize the algorithm further by not recursing into both the left and right sides of the pivot. We only need to recurse into one of them, depending on the position of the pivot.

Let’s look at the basic ideas of the QuickSelect algorithm:

  • Pick a pivot element and partition the array accordingly
    • Pick the rightmost element as pivot
    • Rearrange the array so that the pivot ends up at its final position: all elements less than the pivot are placed at lower indexes, and all elements greater than the pivot at higher indexes
  • If the pivot is placed at position k in the array, exit the process, as the pivot is the element we’re looking for
  • If the pivot position is greater than k, continue the process with the left subarray; otherwise, continue with the right subarray
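
The steps above can be sketched compactly with a loop instead of recursion. This is a minimal illustration rather than the article's implementation: it works on a primitive int[], treats k as an absolute zero-based index into the ascending order, and the names QuickSelectSketch, select, and partition are chosen only for this example.

```java
final class QuickSelectSketch {

    // Returns the element that would sit at zero-based index k
    // if values were sorted in ascending order. Reorders values in place.
    static int select(int[] values, int k) {
        if (k < 0 || k >= values.length) {
            throw new IllegalArgumentException("k is out of range: " + k);
        }
        int left = 0;
        int right = values.length - 1;
        while (true) {
            int pos = partition(values, left, right);
            if (pos == k) {
                return values[pos];   // the pivot landed on the target index
            } else if (pos > k) {
                right = pos - 1;      // the target is in the left subarray
            } else {
                left = pos + 1;       // the target is in the right subarray
            }
        }
    }

    // Uses the rightmost element as the pivot and returns its final index.
    static int partition(int[] values, int left, int right) {
        int pivot = values[right];
        int i = left;
        for (int j = left; j < right; j++) {
            if (values[j] <= pivot) {
                int tmp = values[i];
                values[i] = values[j];
                values[j] = tmp;
                i++;
            }
        }
        int tmp = values[i];
        values[i] = values[right];
        values[right] = tmp;
        return i;
    }

    public static void main(String[] args) {
        int[] values = {3, 7, 1, 2, 4, 10, 20, 8, 9, 5};
        System.out.println(select(values, 0)); // prints 1, the smallest element
    }
}
```

Because k always stays inside the range [left, right], the loop ends at the latest when the range shrinks to a single element.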

We can write generic logic which can be used to find the kth smallest element as well. We’ll define a method findKthElementByQuickSelect() which will return the kth element in the sorted array.

Here, k is a zero-based index into the array sorted in ascending order, so k = 0 yields the smallest element. To find the kth largest element, we pass length(array) - k as the index.
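
The index arithmetic is easy to get wrong by one, so it can help to isolate it. The helper below is a hypothetical sketch (the names KthIndexSketch, indexOfKthLargest, and indexOfKthSmallest are not from the original code), assuming the rank k is 1-based and the returned index is zero-based.

```java
final class KthIndexSketch {

    // Zero-based index, in ascending order, of the kth largest element (k is 1-based).
    static int indexOfKthLargest(int length, int k) {
        checkRank(length, k);
        return length - k;
    }

    // Zero-based index, in ascending order, of the kth smallest element (k is 1-based).
    static int indexOfKthSmallest(int length, int k) {
        checkRank(length, k);
        return k - 1;
    }

    private static void checkRank(int length, int k) {
        if (k < 1 || k > length) {
            throw new IllegalArgumentException("k must be between 1 and " + length);
        }
    }

    public static void main(String[] args) {
        System.out.println(indexOfKthLargest(10, 1));  // prints 9, the last index
        System.out.println(indexOfKthSmallest(10, 1)); // prints 0, the first index
    }
}
```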

Let’s implement this solution:

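public int 
  findKthElementByQuickSelect(Integer[] arr, int left, int right, int k) {
    // k is a zero-based offset within arr[left..right]
    if (k >= 0 && k <= right - left) {
        int pos = partition(arr, left, right);
        if (pos - left == k) {
            return arr[pos];
        }
        if (pos - left > k) {
            // the target lies to the left of the pivot
            return findKthElementByQuickSelect(arr, left, pos - 1, k);
        }
        // the target lies to the right; skip the pos - left + 1 elements up to the pivot
        return findKthElementByQuickSelect(arr, pos + 1,
          right, k - pos + left - 1);
    }
    // an out-of-range k is rejected instead of returning an ambiguous 0
    throw new IllegalArgumentException("k is out of range: " + k);
}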

Now let’s implement the partition method. It picks the rightmost element as the pivot, puts it at the appropriate index, and partitions the array so that elements at lower indexes are less than the pivot and elements at higher indexes are greater than it:

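// Requires: import java.util.stream.IntStream;
// Assumes unique values: elements equal to the pivot are not copied back.
public int partition(Integer[] arr, int left, int right) {
    int pivot = arr[right];
    Integer[] leftArr;
    Integer[] rightArr;

    // elements smaller than the pivot, in their original order
    leftArr = IntStream.range(left, right)
      .filter(i -> arr[i] < pivot)
      .map(i -> arr[i])
      .boxed()
      .toArray(Integer[]::new);

    // elements greater than the pivot, in their original order
    rightArr = IntStream.range(left, right)
      .filter(i -> arr[i] > pivot)
      .map(i -> arr[i])
      .boxed()
      .toArray(Integer[]::new);

    // write back: smaller elements, then the pivot, then greater elements
    int leftArraySize = leftArr.length;
    System.arraycopy(leftArr, 0, arr, left, leftArraySize);
    arr[leftArraySize+left] = pivot;
    System.arraycopy(rightArr, 0, arr, left + leftArraySize + 1,
      rightArr.length);

    // final index of the pivot
    return left + leftArraySize;
}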

There’s a simpler, iterative approach to achieve the partitioning:

public int partitionIterative(Integer[] arr, int left, int right) {
    int pivot = arr[right], i = left;
    for (int j = left; j <= right - 1; j++) {
        if (arr[j] <= pivot) {
            swap(arr, i, j);
            i++;
        }
    }
    swap(arr, i, right);
    return i;
}

public void swap(Integer[] arr, int n1, int n2) {
    int temp = arr[n2];
    arr[n2] = arr[n1];
    arr[n1] = temp;
}
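
To see what the iterative partitioning does to a concrete array, here is a small self-contained sketch. It repeats the same logic on a primitive int[] and adds a check of the partition property; the class name PartitionDemo and the helper isPartitioned are assumptions made for this illustration.

```java
import java.util.Arrays;

final class PartitionDemo {

    // Same scheme as above: rightmost element as pivot, returns the pivot's final index.
    static int partition(int[] a, int left, int right) {
        int pivot = a[right];
        int i = left;
        for (int j = left; j < right; j++) {
            if (a[j] <= pivot) {
                swap(a, i, j);
                i++;
            }
        }
        swap(a, i, right);
        return i;
    }

    static void swap(int[] a, int x, int y) {
        int tmp = a[x];
        a[x] = a[y];
        a[y] = tmp;
    }

    // True if nothing left of pos is greater than a[pos] and nothing right of it is smaller.
    static boolean isPartitioned(int[] a, int left, int right, int pos) {
        for (int i = left; i < pos; i++) {
            if (a[i] > a[pos]) return false;
        }
        for (int i = pos + 1; i <= right; i++) {
            if (a[i] < a[pos]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        int[] values = {3, 5, 2, 7, 4};
        int pos = partition(values, 0, values.length - 1);
        System.out.println(pos + " " + Arrays.toString(values)); // prints 2 [3, 2, 4, 7, 5]
        System.out.println(isPartitioned(values, 0, values.length - 1, pos));
    }
}
```

The pivot 4 ends up at index 2, which is exactly where it would be in the fully sorted array, even though the elements on either side are not sorted.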

This solution works in O(n) time on average. However, in the worst case, the time complexity will be O(n^2).
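
The quadratic worst case can be made visible by counting comparisons. The sketch below is an illustration under stated assumptions: it uses an already sorted int[] as input, asks for the smallest element, and the names QuickSelectWorstCaseDemo, comparisons, and countComparisons are hypothetical.

```java
final class QuickSelectWorstCaseDemo {

    static long comparisons; // number of element-to-pivot comparisons

    // Iterative QuickSelect with the rightmost element as pivot; k is a zero-based index.
    static int select(int[] a, int k) {
        int left = 0;
        int right = a.length - 1;
        while (true) {
            int pos = partition(a, left, right);
            if (pos == k) {
                return a[pos];
            } else if (pos > k) {
                right = pos - 1;
            } else {
                left = pos + 1;
            }
        }
    }

    static int partition(int[] a, int left, int right) {
        int pivot = a[right];
        int i = left;
        for (int j = left; j < right; j++) {
            comparisons++;
            if (a[j] <= pivot) {
                int tmp = a[i];
                a[i] = a[j];
                a[j] = tmp;
                i++;
            }
        }
        int tmp = a[i];
        a[i] = a[right];
        a[right] = tmp;
        return i;
    }

    // Counts comparisons needed to find the smallest element of a sorted array of size n.
    static long countComparisons(int n) {
        int[] sorted = new int[n];
        for (int i = 0; i < n; i++) {
            sorted[i] = i;
        }
        comparisons = 0;
        select(sorted, 0);
        return comparisons;
    }

    public static void main(String[] args) {
        System.out.println(countComparisons(100)); // prints 4950
        System.out.println(countComparisons(200)); // prints 19900
    }
}
```

On sorted input, each partition step only removes the pivot itself, so the count is n(n-1)/2: doubling n roughly quadruples the work.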

2.3. QuickSelect With Randomized Partition

This approach is a slight modification of the previous one. If the array is almost or fully sorted and we pick the rightmost element as the pivot, the left and right subarrays will be highly uneven in size.

This method suggests picking the initial pivot element in a random manner. We don’t need to change the partitioning logic though.

Instead of calling partition, we call the randomPartition method, which picks a random element and swaps it with the rightmost element before finally invoking the partition method.

Let’s implement the randomPartition method:

public int randomPartition(Integer[] arr, int left, int right) {
    int n = right - left + 1;
    // cast the product, not Math.random() itself, which would always truncate to 0
    int pivot = (int) (Math.random() * n);
    swap(arr, left + pivot, right);
    return partition(arr, left, right);
}
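
As an alternative way to draw the random index, ThreadLocalRandom.current().nextInt(origin, bound) returns an int in a half-open range directly, which avoids the cast altogether. The following self-contained sketch assumes a primitive int[] and an absolute zero-based k; the class name RandomizedQuickSelectSketch and its method names are illustrative only.

```java
import java.util.concurrent.ThreadLocalRandom;

final class RandomizedQuickSelectSketch {

    // Returns the element at zero-based index k of the ascending order. Reorders values.
    static int select(int[] values, int k) {
        if (k < 0 || k >= values.length) {
            throw new IllegalArgumentException("k is out of range: " + k);
        }
        int left = 0;
        int right = values.length - 1;
        while (true) {
            int pos = randomPartition(values, left, right);
            if (pos == k) {
                return values[pos];
            } else if (pos > k) {
                right = pos - 1;
            } else {
                left = pos + 1;
            }
        }
    }

    // Moves a randomly chosen element to the right end, then partitions as usual.
    static int randomPartition(int[] values, int left, int right) {
        // nextInt's upper bound is exclusive, so right + 1 makes index right eligible
        int pivotIndex = ThreadLocalRandom.current().nextInt(left, right + 1);
        swap(values, pivotIndex, right);
        return partition(values, left, right);
    }

    static int partition(int[] values, int left, int right) {
        int pivot = values[right];
        int i = left;
        for (int j = left; j < right; j++) {
            if (values[j] <= pivot) {
                swap(values, i, j);
                i++;
            }
        }
        swap(values, i, right);
        return i;
    }

    static void swap(int[] values, int x, int y) {
        int tmp = values[x];
        values[x] = values[y];
        values[y] = tmp;
    }

    public static void main(String[] args) {
        int[] values = {3, 7, 1, 2, 4, 10, 20, 8, 9, 5};
        int k = 3; // third largest
        System.out.println(select(values, values.length - k)); // prints 9
    }
}
```

The result is deterministic even though the pivots are random: only the number of steps varies between runs.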

With a random pivot, sorted or nearly sorted input no longer consistently produces uneven partitions, so this solution performs better than the previous one in most cases.

The expected time complexity of randomized QuickSelect is O(n).

However, the worst-case time complexity remains O(n^2).

3. Conclusion

In this article, we discussed different solutions for finding the kth largest (or smallest) element in an array of unique numbers. The simplest solution is to sort the array and return the kth element. This solution has a time complexity of O(n log n).

We also discussed two variations of QuickSelect. This algorithm is less straightforward, but it has an average time complexity of O(n).

As always, the complete code for the algorithm can be found over on GitHub.
