1. Overview
In this mini-article, we’ll provide a brief explanation of what checksums are and show how to use some of Java’s built-in features for calculating checksums.
2. Checksums and Common Algorithms
Essentially, a checksum is a small, fixed-size value computed from a stream of binary data.
Checksums are commonly used in network programming to check that a complete message has been received. When a message arrives, the receiver recomputes the checksum and compares it with the checksum sent alongside the message to confirm that no bits have been lost or altered. They are also useful for file management, for instance, to compare files or to detect changes.
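The recompute-and-compare step can be sketched as follows. This is a minimal illustration rather than a real network protocol: the class and method names (MessageIntegrityDemo, checksumOf, isIntact) are made up for the example, and it uses the java.util.zip.CRC32 class that the rest of this article covers.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class MessageIntegrityDemo {

    // Computes the CRC32 checksum of a payload.
    public static long checksumOf(byte[] payload) {
        CRC32 crc32 = new CRC32();
        crc32.update(payload, 0, payload.length);
        return crc32.getValue();
    }

    // Receiver side: recompute the checksum and compare it with the one that was sent.
    public static boolean isIntact(byte[] receivedPayload, long receivedChecksum) {
        return checksumOf(receivedPayload) == receivedChecksum;
    }

    public static void main(String[] args) {
        byte[] sent = "hello, world".getBytes(StandardCharsets.UTF_8);
        long sentChecksum = checksumOf(sent);

        // Simulate corruption in transit by flipping one bit in a copy of the payload.
        byte[] corrupted = sent.clone();
        corrupted[0] ^= 0x01;

        System.out.println("intact:    " + isIntact(sent, sentChecksum));      // true
        System.out.println("corrupted: " + isIntact(corrupted, sentChecksum)); // false
    }
}
```

A match only shows that no error was detected; it does not prove the data is identical, and it offers no protection against deliberate tampering.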
There are several common checksum algorithms, such as Adler32 and CRC32. Both reduce an arbitrary sequence of bytes to a 32-bit value, which is often displayed as a short hexadecimal string. They are designed so that common accidental errors, such as a single flipped bit, produce a different checksum, although two different inputs can still share the same value.
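To make this concrete, here is a small sketch that runs both algorithms over two strings that differ in a single character. Both Adler32 and CRC32 live in java.util.zip and implement the Checksum interface; the class name AlgorithmComparison, the helper method, and the sample strings are assumptions chosen for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Adler32;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

public class AlgorithmComparison {

    // Runs any java.util.zip.Checksum implementation over a string's UTF-8 bytes.
    public static long checksum(Checksum algorithm, String input) {
        byte[] bytes = input.getBytes(StandardCharsets.UTF_8);
        algorithm.reset();
        algorithm.update(bytes, 0, bytes.length);
        return algorithm.getValue();
    }

    public static void main(String[] args) {
        // The two inputs differ only in their last character.
        for (String input : new String[] {"checksum", "checksun"}) {
            System.out.printf("%-8s  CRC32=%08x  Adler32=%08x%n",
                input, checksum(new CRC32(), input), checksum(new Adler32(), input));
        }
    }
}
```

Each line of output shows two 32-bit values in hexadecimal, and both values change when the input changes.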
Let’s take a look at Java’s support for CRC32. Note that while CRC32 may be useful for checksums, it’s not recommended for secure operations, like hashing a password.
3. Checksum From a String or Byte Array
The first thing we need to do is to obtain the input to the checksum algorithm.
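If we’re starting with a String, we can use the getBytes() method to get a byte array from it. Note that the no-argument getBytes() encodes with the platform’s default charset; the overload that takes a Charset, such as StandardCharsets.UTF_8, gives the same bytes on every machine. The simplest form looks like this: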
String test = "test";
byte[] bytes = test.getBytes();
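Because the checksum is computed over bytes, not characters, the charset used to encode a String matters. The sketch below, with an assumed class name and sample text, encodes the same string with two charsets and prints the checksum of each encoding. For pure ASCII text such as "test" the two encodings would be identical; the difference appears only with non-ASCII characters.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class CharsetChecksumDemo {

    // CRC32 checksum of a byte array.
    public static long crc32Of(byte[] bytes) {
        CRC32 crc32 = new CRC32();
        crc32.update(bytes, 0, bytes.length);
        return crc32.getValue();
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // four characters, the last one non-ASCII

        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);        // 5 bytes
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1); // 4 bytes

        System.out.println("UTF-8:      " + utf8.length + " bytes, CRC32 = " + crc32Of(utf8));
        System.out.println("ISO-8859-1: " + latin1.length + " bytes, CRC32 = " + crc32Of(latin1));
    }
}
```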
Next, we can calculate the checksum using the byte array:
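import java.util.zip.CRC32;
import java.util.zip.Checksum;

// Returns the CRC32 checksum of the given bytes as an unsigned 32-bit value held in a long.
public static long getCRC32Checksum(byte[] bytes) {
    Checksum crc32 = new CRC32();
    crc32.update(bytes, 0, bytes.length);
    return crc32.getValue();
}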
Here, we are using Java’s built-in CRC32 class. Once the class is instantiated, we use the update method to update the Checksum instance with the bytes from the input.
Simply put, the update method feeds bytes into the running checksum; it does not replace what came before. Several update calls in a row therefore produce the checksum of all the bytes supplied so far, and calling reset() returns the instance to its initial state so it can be reused for unrelated input without creating a new Checksum. The CRC32 class provides a few overloaded update methods that accept a single byte, a whole byte array, or a range of bytes within an array.
Finally, after supplying the bytes, we read the checksum with the getValue method.
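A short sketch of how update behaves across calls: feeding the input in two pieces gives the same checksum as feeding it all at once, and reset() clears the running state. The class and method names are assumptions for illustration; the input "123456789" is the conventional CRC test string, whose CRC32 value is cbf43926.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class IncrementalUpdateDemo {

    // Checksum of the whole array in a single update call.
    public static long oneShot(byte[] bytes) {
        CRC32 crc32 = new CRC32();
        crc32.update(bytes, 0, bytes.length);
        return crc32.getValue();
    }

    // Checksum of the same array fed in two pieces, split at the given index.
    public static long inTwoParts(byte[] bytes, int split) {
        CRC32 crc32 = new CRC32();
        crc32.update(bytes, 0, split);
        crc32.update(bytes, split, bytes.length - split);
        return crc32.getValue();
    }

    public static void main(String[] args) {
        byte[] bytes = "123456789".getBytes(StandardCharsets.US_ASCII);
        System.out.println(Long.toHexString(oneShot(bytes)));       // cbf43926
        System.out.println(Long.toHexString(inTwoParts(bytes, 4))); // cbf43926

        CRC32 reused = new CRC32();
        reused.update(bytes, 0, bytes.length);
        reused.reset(); // discard the running state before unrelated input
        System.out.println(reused.getValue()); // 0
    }
}
```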
4. Checksum From an InputStream
When dealing with large amounts of binary data, the above approach is not memory-efficient, because every byte must be loaded into memory before the checksum is calculated.
When we have an InputStream, we may opt to use CheckedInputStream to create our checksum. By using this approach, we can define how many bytes are processed at any one time.
In this example, we read a fixed number of bytes at a time until we reach the end of the stream.
The checksum value is then available from the CheckedInputStream:
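import java.io.IOException;
import java.io.InputStream;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;

public static long getChecksumCRC32(InputStream stream, int bufferSize)
  throws IOException {
    CheckedInputStream checkedInputStream = new CheckedInputStream(stream, new CRC32());
    byte[] buffer = new byte[bufferSize];
    // Reading drives the checksum update; the bytes themselves are discarded.
    while (checkedInputStream.read(buffer, 0, buffer.length) >= 0) {}
    return checkedInputStream.getChecksum().getValue();
}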
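As a usage sketch, the variant below follows the same approach as getChecksumCRC32 and adds two things that are assumptions of this example, not part of the original method: a try-with-resources block that closes the stream, and a guard against a non-positive buffer size (a zero-length buffer would make read return 0 on every call, so the loop would never end). An in-memory ByteArrayInputStream stands in for a file or network stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;

public class StreamChecksumDemo {

    // Reads the stream to its end in bufferSize chunks and returns the CRC32 of everything read.
    public static long crc32Of(InputStream stream, int bufferSize) throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        try (CheckedInputStream checked = new CheckedInputStream(stream, new CRC32())) {
            byte[] buffer = new byte[bufferSize];
            while (checked.read(buffer, 0, buffer.length) >= 0) {
                // Nothing to do: the wrapper updates the checksum as bytes pass through.
            }
            return checked.getChecksum().getValue();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);

        // The buffer size affects memory use, not the result.
        System.out.println(Long.toHexString(crc32Of(new ByteArrayInputStream(data), 4)));    // cbf43926
        System.out.println(Long.toHexString(crc32Of(new ByteArrayInputStream(data), 1024))); // cbf43926
    }
}
```

For a file, the same method could be given a stream from Files.newInputStream, in which case only bufferSize bytes are held in memory at a time.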
5. Conclusion
In this tutorial, we looked at how to generate checksums from byte arrays and InputStreams using Java’s CRC32 support.
As always, the code is available over on GitHub.