1. Introduction
In this tutorial, we’ll explore the problem of finding an array’s middle element(s). An array is a data structure that stores data elements of the same type.
The elements of the array are stored in a contiguous fashion in memory and are associated with an index. The array has a fixed length.
2. Problem Statement
Given an array of n elements, we need to return a new array containing the array's middle element(s). If the input array has an odd length, it has exactly one middle element. If the input array has an even length, it has two middle elements.
Our code should therefore return an array of length 1 or 2, depending on the input array.
Let’s see some examples:
- Given an input array of 5 elements: [1, 2, 3, 4, 5], the output is [3]. As the array’s length is 5, which is an odd number, we can say that a single middle element exists, in our case 3.
- Given an input array of 6 elements: [1, 2, 3, 4, 5, 6], the output is [3, 4]. The array’s length in this case is 6, which is even. Here, both 3 and 4 are the middle elements of the array.
We should also consider a few edge cases for our problem. For an empty input array, there is no middle element, and hence an empty array is the correct output. The array itself is the output for arrays of lengths 1 and 2.
3. Middle Element(s) Using Array Operations
An array's length tells us the number of elements it contains: an array of length n contains n elements. The elements can be accessed with a 0-based index.
3.1. Middle Element of Arrays of Odd Length
Given an array of length n, where n is an odd number, we can say that the array’s first index is always 0, and the last index of the array is n-1. An array of length 99 has indices running from 0 through 98, with index 49 being the middle index.
We know that the middle point between two values, a and b, is (a + b) / 2. In our case, with a = 0 and b = n - 1 = 98, the middle index is (0 + 98) / 2 = 49. For odd n, this equals n / 2 under integer division, since 99 / 2 = 49. Hence, accessing the element at index n / 2 gives us our desired output:
int[] middleOfArray(int[] array) {
    int n = array.length;
    int mid = n / 2; // integer division; for odd n this is the middle index
    return new int[] { array[mid] };
}
Note that n is always an integer, as it is the array's length, and a length cannot be fractional. Hence, when we evaluate n / 2, Java performs integer division and discards the fractional part. So, in our previous example of 99 elements, the middle index is 99 / 2 = 49, and not 49.5 or 50.
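To make the truncation concrete, here is a minimal sketch that contrasts integer division with floating-point division. The class name IntegerDivisionDemo and the helper middleIndex are illustrative names and not part of the original code.

```java
class IntegerDivisionDemo {

    // Middle index of an odd-length array of length n.
    // Dividing an int by an int truncates toward zero.
    static int middleIndex(int n) {
        return n / 2;
    }

    public static void main(String[] args) {
        System.out.println(middleIndex(99)); // 49
        System.out.println(99 / 2.0);        // 49.5, shown only for contrast
    }
}
```

Because both operands are of type int, no cast or rounding call is needed to obtain a valid array index.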
3.2. Middle Elements of Arrays of Even Length
Now that we know how to find the middle element of an odd-length array, let’s extend the solution to arrays of even length.
An array of even length has no single middle element. An array of length 100, with indices starting at 0, has its middle elements at indices 49 and 50. Hence, the middle elements of an array of length n, where n is even, are the elements at indices (n / 2) - 1 and n / 2. As our output depends on the length of the input array, let's combine both cases into a single method:
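int[] middleOfArray(int[] array) {
    // ObjectUtils.isEmpty covers null and zero-length arrays;
    // arrays of length 1 or 2 are their own middle
    if (ObjectUtils.isEmpty(array) || array.length < 3) {
        return array;
    }
    int n = array.length;
    int mid = n / 2;
    if (n % 2 == 0) {
        // even length: two middle elements at (n / 2) - 1 and n / 2
        int mid2 = mid - 1;
        return new int[] { array[mid2], array[mid] };
    } else {
        // odd length: single middle element at n / 2
        return new int[] { array[mid] };
    }
}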
Let's also add a small test to verify that our solution works for arrays of both even and odd length:
int[] array = new int[100];
for (int i = 0; i < array.length; i++) {
    array[i] = i + 1;
}

// even length (100 elements): two middle elements
int[] expectedMidArray = { 50, 51 };
MiddleOfArray middleOfArray = new MiddleOfArray();
Assert.assertArrayEquals(expectedMidArray, middleOfArray.middleOfArray(array));

// odd length (first 99 elements): a single middle element
int[] expectedMidArrayForOddLength = { 50 };
Assert.assertArrayEquals(expectedMidArrayForOddLength, middleOfArray.middleOfArray(Arrays.copyOfRange(array, 0, 99)));
3.3. Middle Element of an Array Between Two Points
In the previous sections, we took the entire array as our input and calculated its middle. We might also need the middle element(s) of only a portion of the array, given by a start and an end index.
We can no longer use n, the length of the array, to calculate the middle point. Instead of substituting start = 0 and end = n as we did before, we use the provided values as they are and find the middle point: middle = (start + end) / 2. As in that substitution, start is inclusive and end is exclusive, so the range contains end - start elements:
int[] middleOfArrayWithStartEnd(int[] array, int start, int end) {
    int mid = (start + end) / 2;
    int n = end - start;
    if (n % 2 == 0) {
        int mid2 = mid - 1;
        return new int[] { array[mid2], array[mid] };
    } else {
        return new int[] { array[mid] };
    }
}
However, this approach has a major drawback.
Consider an array whose size is on the order of Integer.MAX_VALUE, which is 2147483647. We are required to find the middle element of the array between indices 100 and 2147483647.
So in our example, start = 100 and end = Integer.MAX_VALUE. When we apply the formula to find the midpoint, start + end should be 2147483747. This value is greater than Integer.MAX_VALUE and therefore overflows. When we evaluate start + end in Java, we get -2147483549, which confirms the overflow.
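The overflow can be reproduced with a few lines of Java. The sketch below is only an illustration: the class MidpointOverflowDemo and its helpers wrappedSum and naiveMid are made-up names and not part of the original code.

```java
class MidpointOverflowDemo {

    // int addition silently wraps around when the result exceeds Integer.MAX_VALUE
    static int wrappedSum(int start, int end) {
        return start + end;
    }

    // the naive midpoint formula, which inherits the wrapped sum
    static int naiveMid(int start, int end) {
        return (start + end) / 2;
    }

    public static void main(String[] args) {
        int start = 100;
        int end = Integer.MAX_VALUE;

        System.out.println(wrappedSum(start, end)); // -2147483549
        System.out.println(naiveMid(start, end));   // -1073741774
        System.out.println((long) start + end);     // 2147483747, the true sum computed as a long
    }
}
```

A negative midpoint is not a valid array index, so using it would fail with an ArrayIndexOutOfBoundsException.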
The fix for this is rather simple. We first compute the difference between the two values, end - start, and then add half of it to start. So, mid = start + (end - start) / 2. As long as 0 <= start <= end, the difference end - start fits in an int, so this avoids the overflow:
int[] middleOfArrayWithStartEnd(int[] array, int start, int end) {
    int mid = start + (end - start) / 2;
    int n = end - start;
    if (n % 2 == 0) {
        int mid2 = mid - 1;
        return new int[] { array[mid2], array[mid] };
    } else {
        return new int[] { array[mid] };
    }
}
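As a quick check, the sketch below applies the overflow-safe formula to the values from the example and to a small range. It is a sketch under assumptions: the class SafeMidpointDemo, the helpers safeMid and middleOfRange, and the argument validation are all additions for illustration (the original method does not validate its inputs). It treats start as inclusive and end as exclusive.

```java
import java.util.Arrays;

class SafeMidpointDemo {

    // overflow-safe midpoint for 0 <= start <= end
    static int safeMid(int start, int end) {
        return start + (end - start) / 2;
    }

    // middle element(s) of array[start..end), with end exclusive
    static int[] middleOfRange(int[] array, int start, int end) {
        // assumption: reject empty or out-of-bounds ranges explicitly
        if (start < 0 || end > array.length || start >= end) {
            throw new IllegalArgumentException("invalid range: [" + start + ", " + end + ")");
        }
        int mid = safeMid(start, end);
        int n = end - start;
        if (n % 2 == 0) {
            return new int[] { array[mid - 1], array[mid] };
        }
        return new int[] { array[mid] };
    }

    public static void main(String[] args) {
        System.out.println(safeMid(100, Integer.MAX_VALUE)); // 1073741873

        int[] array = { 10, 20, 30, 40, 50, 60 };
        System.out.println(Arrays.toString(middleOfRange(array, 1, 5))); // [30, 40]
        System.out.println(Arrays.toString(middleOfRange(array, 1, 4))); // [30]
    }
}
```

The explicit range check is a design choice for this sketch: without it, an empty range (start equal to end) would be treated as an even-length range and read the element at index start - 1.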
3.4. Performance of Array Operations to Find the Middle Elements
We know that accessing an element in an array is an O(1) operation. As array elements are placed in contiguous blocks in memory, jumping to a specific index is a constant time operation. Hence, we can say that all the above operations are constant time O(1) operations.
4. Middle Element(s) Using Bitwise Operations
We can use bitwise operations as an alternative way to find the middle elements of an array. Bitwise operations work on the binary digits (bits) of their input values. There are several categories of bitwise operators, such as bitwise logical operators and bitwise shift operators.
Here we’ll use a specific type of shift operator called the unsigned right shift operator, i.e. >>>.
The unsigned right shift operator, as the name suggests, shifts all the bits of the input value to the right and fills the vacated positions on the left with 0. As a result, shifting by at least one position always yields a non-negative value.
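Shifting a non-negative number right by n positions divides it by 2 to the power of n, so a >>> n is equivalent to a / 2^n for a >= 0 (here ^ denotes exponentiation, not Java's XOR operator). Moreover, for non-negative start and end, (start + end) >>> 1 yields the correct midpoint even when the int sum overflows, because the shift treats the wrapped result as an unsigned 32-bit value. We use this fact to find the middle element(s) between start and end: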
int[] middleOfArrayWithStartEndBitwise(int[] array, int start, int end) {
    int mid = (start + end) >>> 1;
    int n = end - start;
    if (n % 2 == 0) {
        int mid2 = mid - 1;
        return new int[] { array[mid2], array[mid] };
    } else {
        return new int[] { array[mid] };
    }
}
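The following sketch shows how the unsigned shift behaves on an overflowing sum and how it differs from the signed shift and from division for negative inputs. The class UnsignedShiftDemo and the helper shiftMid are illustrative names only.

```java
class UnsignedShiftDemo {

    // midpoint via unsigned right shift; correct for non-negative start and end,
    // even when start + end wraps around to a negative int
    static int shiftMid(int start, int end) {
        return (start + end) >>> 1;
    }

    public static void main(String[] args) {
        System.out.println(shiftMid(0, 100));                 // 50
        System.out.println(shiftMid(100, Integer.MAX_VALUE)); // 1073741873

        // for negative values, >>> is not a division by 2
        System.out.println(-8 / 2);   // -4
        System.out.println(-8 >> 1);  // -4 (signed shift keeps the sign bit)
        System.out.println(-8 >>> 1); // 2147483644 (sign bit replaced by 0)
    }
}
```

Since array indices are never negative, the last case does not arise when start and end are valid indices.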
Shifts are cheap, single-instruction operations on modern CPUs. In practice, JIT compilers usually optimize division by a power of two as well, so the main benefit here is that the unsigned shift tolerates an overflowing sum, rather than raw speed.
5. Median of an Array
So far, we haven't considered the nature of the elements or their order. A special case arises if the elements of the array are all numeric and sorted.
The middle element of a sorted data set is called the median of the data set and is of great importance in mathematics and statistics. The median is a measure of the central tendency of a data set and provides insight into what a typical value of the data set could be.
For an array of even length, the median is typically computed by finding the average of the two middle elements:
int medianOfArray(int[] array, int start, int end) {
    // for safety; sorts the whole array in place and can be skipped if it is already sorted
    Arrays.sort(array);
    int mid = (start + end) >>> 1;
    int n = end - start;
    if (n % 2 == 0) {
        int mid2 = mid - 1;
        // integer division truncates: (2 + 5) / 2 is 3, not 3.5
        return (array[mid2] + array[mid]) / 2;
    } else {
        return array[mid];
    }
}
The median value expects the data set to be in sorted order to be correct. So if we are unsure of the array’s nature, we should first sort the array in ascending or descending order and then find the middle value using any of the previous methods.
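When a fractional median matters, or the caller's array must stay untouched, a variant along the following lines may be preferable. This is a sketch under assumptions rather than the article's method: the class MedianSketch, the double return type, the defensive copy, and the rejection of empty input are all choices made for illustration.

```java
import java.util.Arrays;

class MedianSketch {

    // median of all values, computed on a sorted copy so the input is not modified
    static double median(int[] values) {
        if (values == null || values.length == 0) {
            throw new IllegalArgumentException("median of an empty array is undefined");
        }
        int[] sorted = Arrays.copyOf(values, values.length);
        Arrays.sort(sorted);

        int n = sorted.length;
        int mid = n >>> 1;
        if (n % 2 == 0) {
            // widen to long before adding so that two large ints cannot overflow
            return ((long) sorted[mid - 1] + sorted[mid]) / 2.0;
        }
        return sorted[mid];
    }

    public static void main(String[] args) {
        System.out.println(median(new int[] { 5, 1, 4, 2 })); // 3.0
        System.out.println(median(new int[] { 3, 1, 2 }));    // 2.0
        System.out.println(median(new int[] { 1, 2 }));       // 1.5
    }
}
```

Sorting a copy costs O(n) extra memory; sorting in place avoids that at the price of reordering the caller's data.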
Consider a problem statement where we are required to find the median house price of a country. Given the nature of the problem, we can assume that the input data will be too large to fit in the available memory of a conventional computer. If the JVM is not able to load the entire array in memory at a time, it would be difficult to apply the methods mentioned above to find the median.
In such cases, where the data set is too large to fit in memory, we can treat the input as a stream rather than a conventional array. We can then find the median of the data stream using additional data structures, such as heaps.
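One common way to use heaps here is to keep the smaller half of the values in a max-heap and the larger half in a min-heap, so that the median is always available at the tops of the heaps. The sketch below illustrates that idea with java.util.PriorityQueue. The class name StreamingMedian and its methods add and median are hypothetical. Note that this version still keeps every value it has seen in memory, so it shows how to maintain a median as values arrive one at a time, not how to bound memory usage.

```java
import java.util.Collections;
import java.util.PriorityQueue;

class StreamingMedian {

    // max-heap holding the smaller half of the values
    private final PriorityQueue<Integer> lower = new PriorityQueue<>(Collections.reverseOrder());
    // min-heap holding the larger half of the values
    private final PriorityQueue<Integer> upper = new PriorityQueue<>();

    void add(int value) {
        if (lower.isEmpty() || value <= lower.peek()) {
            lower.add(value);
        } else {
            upper.add(value);
        }
        // rebalance: lower may hold at most one element more than upper
        if (lower.size() > upper.size() + 1) {
            upper.add(lower.poll());
        } else if (upper.size() > lower.size()) {
            lower.add(upper.poll());
        }
    }

    double median() {
        if (lower.isEmpty()) {
            throw new IllegalStateException("no values have been added yet");
        }
        if (lower.size() > upper.size()) {
            return lower.peek(); // odd count: the top of the lower half is the median
        }
        long low = lower.peek();
        long high = upper.peek();
        return (low + high) / 2.0; // even count: average of the two middle values
    }

    public static void main(String[] args) {
        StreamingMedian prices = new StreamingMedian();
        for (int price : new int[] { 5, 1, 4, 2 }) {
            prices.add(price);
            System.out.println(prices.median()); // 5.0, 3.0, 4.0, 3.0
        }
    }
}
```

Each insertion costs O(log n) and reading the median is O(1), compared with re-sorting the data every time a new value arrives.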
6. Conclusion
In this article, we looked at several approaches to finding the middle elements of an array. We also talked about how this solution can help us find the median of an array.
As usual, all code samples can be found over on GitHub.