1. Overview
In this tutorial, we’ll look at several algorithms for finding the smallest missing positive integer in an array.
First, we’ll explain the problem. Then, we’ll walk through three algorithms that solve it. Finally, we’ll discuss their complexities.
2. Problem Explanation
First, let’s explain what the goal of the algorithm is. We want to find the smallest integer missing from an array of non-negative integers. That is, in an array of x elements, find the smallest integer between 0 and x – 1 that is not in the array. If the array contains them all, then the solution is x, the array size. Note that the range starts at 0, so strictly speaking we’re looking for the smallest missing non-negative integer.
For example, let’s consider the following array: [0, 1, 3, 5, 6]. It has 5 elements. That means we’re searching for the smallest integer between 0 and 4 that is not in this array. In this specific case, it’s 2.
Now, let’s imagine another array: [0, 1, 2, 3]. As it has 4 elements, we’re searching for an integer between 0 and 3. None is missing, thus the smallest integer that is not in the array is 4.
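The definition above can be restated directly as code. The following is a minimal sketch, not one of the algorithms discussed later: the class and method names (DefinitionSketch, smallestMissing, contains) are hypothetical, and the nested loops make it quadratic, so it’s only useful as a reference to check other implementations against.

```java
// Hypothetical reference implementation that follows the problem definition literally.
class DefinitionSketch {

    // Returns the smallest integer in [0, input.length) that is absent from input,
    // or input.length when every integer of that range is present.
    static int smallestMissing(int[] input) {
        for (int candidate = 0; candidate < input.length; candidate++) {
            if (!contains(input, candidate)) {
                return candidate;
            }
        }
        return input.length;
    }

    // Linear scan: true if value occurs anywhere in input.
    private static boolean contains(int[] input, int value) {
        for (int element : input) {
            if (element == value) {
                return true;
            }
        }
        return false;
    }
}
```

Applied to the two arrays above, this sketch yields 2 for [0, 1, 3, 5, 6] and 4 for [0, 1, 2, 3].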
3. Sorted Array
Now, let’s see how to find the smallest missing number in a sorted array. In a sorted array of distinct non-negative integers, the smallest missing integer is the first index whose value differs from the index itself. This relies on the values being distinct: a duplicate would shift the later values and make the index comparison unreliable.
Let’s consider the following sorted array: [0, 1, 3, 4, 6, 7]. Now, let’s see which value matches which index:
Index: 0 1 2 3 4 5
Value: 0 1 3 4 6 7
As we can see, index 2 doesn’t hold the value 2 (it holds 3), therefore 2 is the smallest missing integer in the array.
How about implementing this algorithm in Java? Let’s first create a class SmallestMissingPositiveInteger with a method searchInSortedArray():
public class SmallestMissingPositiveInteger {
    public static int searchInSortedArray(int[] input) {
        // ...
    }
}
Now, we can iterate over the array and search for the first index that doesn’t contain itself as a value and return it as the result:
for (int i = 0; i < input.length; i++) {
    if (i != input[i]) {
        return i;
    }
}
Finally, if we complete the loop without finding a missing element, we must return the next integer, which is the array length, as we start at index 0:
return input.length;
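Putting the fragments above together gives roughly the following method. This is a sketch assembled from the snippets in this section; the class is declared without the public modifier only so that it compiles in any file, and it assumes the input is sorted and holds distinct non-negative integers.

```java
// Sketch assembled from the fragments above.
// Assumption: input is sorted in ascending order and contains distinct non-negative integers.
class SmallestMissingPositiveInteger {

    static int searchInSortedArray(int[] input) {
        // The first index that doesn't hold itself as a value is the missing integer
        for (int i = 0; i < input.length; i++) {
            if (i != input[i]) {
                return i;
            }
        }
        // Nothing is missing in [0, input.length), so the answer is the array length
        return input.length;
    }
}
```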
Let’s check that this all works as expected. Imagine an array of integers from 0 to 5, with the number 3 missing:
int[] input = new int[] {0, 1, 2, 4, 5};
Then, if we search for the first missing integer, 3 should be returned:
int result = SmallestMissingPositiveInteger.searchInSortedArray(input);
assertThat(result).isEqualTo(3);
But, if we search for a missing number in an array without any missing integer:
int[] input = new int[] {0, 1, 2, 3, 4, 5};
We’ll find that the first missing integer is 6, which is the length of the array:
int result = SmallestMissingPositiveInteger.searchInSortedArray(input);
assertThat(result).isEqualTo(input.length);
Next, we’ll see how to handle unsorted arrays.
4. Unsorted Array
So, what about finding the smallest missing integer in an unsorted array? There are multiple solutions. The first one is to simply sort the array first and then reuse our previous algorithm. Another approach would be to use another array to flag the integers that are present and then traverse that array to find the first one missing.
4.1. Sorting the Array First
Let’s start with the first solution and create a new searchInUnsortedArraySortingFirst() method.
So, we’ll be reusing our algorithm, but first, we need to sort our input array. In order to do that, we’ll make use of Arrays.sort():
Arrays.sort(input);
That method sorts its input in place, in ascending numerical order, so the caller’s array is modified. There are more details about sorting algorithms in our article on sorting arrays in Java.
After that, we can call our algorithm with the now sorted input:
return searchInSortedArray(input);
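Assembled, the method might look like the sketch below. Because Arrays.sort() reorders the array passed to it, the sketch also includes a variant that sorts a copy instead; that variant, its name searchWithoutMutatingInput, and the class name SortFirstSketch are assumptions added for illustration, not part of the original code.

```java
import java.util.Arrays;

// Sketch of the sort-first approach; class name is hypothetical.
class SortFirstSketch {

    // Same search as in the sorted-array section
    static int searchInSortedArray(int[] input) {
        for (int i = 0; i < input.length; i++) {
            if (i != input[i]) {
                return i;
            }
        }
        return input.length;
    }

    // As described in the text: sorts the caller's array in place, then searches it
    static int searchInUnsortedArraySortingFirst(int[] input) {
        Arrays.sort(input);
        return searchInSortedArray(input);
    }

    // Hypothetical variant: sorts a copy so the caller's array keeps its order,
    // at the cost of O(n) extra space for the copy
    static int searchWithoutMutatingInput(int[] input) {
        int[] copy = Arrays.copyOf(input, input.length);
        Arrays.sort(copy);
        return searchInSortedArray(copy);
    }
}
```

Which variant fits better depends on whether the caller still needs the original order after the call.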
That’s it, we can now check that everything works as expected. Let’s imagine the following array with unsorted integers and missing numbers 1 and 3:
int[] input = new int[] {4, 2, 0, 5};
As 1 is the smallest missing integer, we expect it to be the result of calling our method:
int result = SmallestMissingPositiveInteger.searchInUnsortedArraySortingFirst(input);
assertThat(result).isEqualTo(1);
Now, let’s try it on an array with no missing number:
int[] input = new int[] {4, 5, 1, 3, 0, 2};
int result = SmallestMissingPositiveInteger.searchInUnsortedArraySortingFirst(input);
assertThat(result).isEqualTo(input.length);
As expected, the algorithm returns 6, which is the array length.
4.2. Using a Boolean Array
Another possibility is to use another array – having the same length as the input array – that holds boolean values telling if the integer matching an index has been found in the input array or not.
First, let’s create a third method, searchInUnsortedArrayBooleanArray().
After that, let’s create the boolean array, flags, and for each integer in the input array that matches an index of the boolean array, we set the corresponding value to true:
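boolean[] flags = new boolean[input.length];
for (int number : input) {
    // Skip values outside [0, input.length): they can't be the answer,
    // and a negative value would throw ArrayIndexOutOfBoundsException
    if (number >= 0 && number < flags.length) {
        flags[number] = true;
    }
}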
Now, our flags array holds true at each index whose value is present in the input array, and false at every other index. Then, we can iterate over the flags array and return the first index holding false. If there is none, we return the array length:
for (int i = 0; i < flags.length; i++) {
    if (!flags[i]) {
        return i;
    }
}
return flags.length;
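Assembled into a single method, the two loops above could look like the following sketch. The class name BooleanArraySketch is hypothetical, and the lower-bound check on number is a defensive assumption in case the input contains negative values.

```java
// Sketch of the boolean-array approach; class name is hypothetical.
class BooleanArraySketch {

    static int searchInUnsortedArrayBooleanArray(int[] input) {
        // flags[i] becomes true once the value i has been seen in the input
        boolean[] flags = new boolean[input.length];
        for (int number : input) {
            // Values outside [0, input.length) can't be the answer, so they are skipped
            if (number >= 0 && number < flags.length) {
                flags[number] = true;
            }
        }
        // The first index never flagged is the smallest missing integer
        for (int i = 0; i < flags.length; i++) {
            if (!flags[i]) {
                return i;
            }
        }
        return flags.length;
    }
}
```

Unlike the index comparison used on sorted arrays, flagging doesn’t depend on the values being distinct, so duplicates in the input don’t affect the result.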
Again, let’s try this algorithm with our examples. We’ll first reuse the array missing 1 and 3:
int[] input = new int[] {4, 2, 0, 5};
Then, when searching for the smallest missing integer with our new algorithm, the answer is still 1:
int result = SmallestMissingPositiveInteger.searchInUnsortedArrayBooleanArray(input);
assertThat(result).isEqualTo(1);
And for the complete array, the answer doesn’t change either and is still 6:
int[] input = new int[] {4, 5, 1, 3, 0, 2};
int result = SmallestMissingPositiveInteger.searchInUnsortedArrayBooleanArray(input);
assertThat(result).isEqualTo(input.length);
5. Complexities
Now that we’ve covered the algorithms, let’s talk about their complexities, using Big O notation.
5.1. Sorted Array
Let’s start with the first algorithm, for which the input is already sorted. In this case, the worst-case scenario is not finding a missing integer and, therefore, traversing the entire array. This means we have linear time complexity, written O(n), where n is the length of our input. The algorithm uses no extra storage, so its space complexity is O(1).
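To make the worst case concrete, here is a small sketch that counts how many loop iterations the sorted-array search performs. The class LoopCountSketch and its iterations() method are hypothetical instrumentation written for this illustration only.

```java
// Hypothetical instrumentation: counts the iterations of the sorted-array search loop.
class LoopCountSketch {

    static int iterations(int[] input) {
        int count = 0;
        for (int i = 0; i < input.length; i++) {
            count++;
            if (i != input[i]) {
                break; // the search would return here
            }
        }
        return count;
    }
}
```

For the complete array {0, 1, 2, 3, 4, 5}, the loop runs 6 times, once per element, whereas for {1, 2, 3} it stops after a single iteration because index 0 already mismatches.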
5.2. Unsorted Array with Sorting Algorithm
Now, let’s consider our second algorithm. In this case, the input array is not sorted, and we sort it before applying the first algorithm. Because the two steps run one after the other, the overall complexity is the greater of the sorting step’s complexity and that of the search itself.
As of Java 11, the Arrays.sort() method uses a dual-pivot quick-sort algorithm to sort arrays of primitives such as int[]. The complexity of this sorting algorithm is, in general, O(n log(n)), though it could degrade up to O(n²). Since sorting dominates the O(n) search, the complexity of our algorithm will be O(n log(n)) in general and can also degrade up to a quadratic complexity of O(n²).
That’s for time complexity, but let’s not forget about space. Although the search algorithm doesn’t take extra space, the sorting algorithm does. The quick-sort algorithm takes up to O(log(n)) space to execute. That’s something we may want to consider when choosing an algorithm for large arrays.
5.3. Unsorted Array with Boolean Array
Finally, let’s see how our third and last algorithm performs. For this one, we don’t sort the input array, which means we avoid the cost of sorting. As a matter of fact, we only traverse two arrays, both of the same size. That means our time complexity is O(2n), which simplifies to O(n). That’s better than the previous algorithm.
But, when it comes to space complexity, we’re creating a second array of the same size as the input. That means we have O(n) space complexity, which is worse than the previous algorithm.
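If the extra memory is a concern, one possible variation (not covered in this article) is to swap the boolean array for a java.util.BitSet, which stores one bit per flag. The sketch below is an illustration under that assumption; the class and method names are hypothetical. The space complexity remains O(n), only with a smaller constant factor.

```java
import java.util.BitSet;

// Hypothetical variation of the boolean-array approach using java.util.BitSet.
class BitSetSketch {

    static int searchWithBitSet(int[] input) {
        BitSet seen = new BitSet(input.length);
        for (int number : input) {
            // Only values in [0, input.length) can influence the result
            if (number >= 0 && number < input.length) {
                seen.set(number);
            }
        }
        // nextClearBit(0) returns the lowest index whose bit is still false;
        // when bits 0..n-1 are all set, that index is n, the array length
        return seen.nextClearBit(0);
    }
}
```

Here, nextClearBit() replaces the second loop, so the method body stays short while following the same flag-then-scan idea.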
Knowing all that, it’s up to us to choose an algorithm that best suits our needs, depending on the conditions in which it’ll be used.
6. Conclusion
In this article, we’ve looked at algorithms for finding the smallest missing positive integer in an array. We’ve seen how to achieve that in a sorted array, as well as in an unsorted array. We also discussed the time and space complexities of the different algorithms, allowing us to choose one wisely according to our needs.
As usual, the complete code examples shown in this article are available over on GitHub.