Detect EOF in Java

Last modified: September 21, 2023

1. Introduction

EOF (End of File) is the condition in which a read operation has reached the end of a file and no more data is available. Understanding EOF detection is essential because many applications need to read configuration files, process data, or validate files. In Java, there are several ways to detect EOF.

In this tutorial, we’ll explore several methods for EOF detection in Java.

2. Example Setup

Before we continue, let’s first create a sample text file containing dummy data for testing:

@Test
@Order(0)
public void prepareFileForTest() {
    File file = new File(pathToFile);

    if (!file.exists()) {
        // FileWriter creates the file; try-with-resources closes it even if write() fails
        try (FileWriter writer = new FileWriter(file)) {
            writer.write(LOREM_IPSUM);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

This method must run before the other tests because it ensures that the test file exists, so we annotate it with @Order(0). Assuming JUnit 5, @Order only takes effect when the test class is also annotated with @TestMethodOrder(MethodOrderer.OrderAnnotation.class).

3. Detect EOF Using FileInputStream

In the first approach, we’ll use FileInputStream, which is a subclass of InputStream.

Its read() method reads the data one byte at a time and returns each byte as an int between 0 and 255. When it reaches EOF, it returns -1 instead.

Let’s read our test file to the end of the file and store the data in a ByteArrayOutputStream object:

String readWithFileInputStream(String pathFile) throws IOException {
    try (FileInputStream fis = new FileInputStream(pathFile);
        ByteArrayOutputStream baos = new ByteArrayOutputStream()) {
        int data;
        while ((data = fis.read()) != -1) {
            baos.write(data);
        }
        return baos.toString();
    }
}
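
One detail of this loop is worth spelling out: the result of read() is held in an int, not a byte. The following minimal sketch illustrates why, using an in-memory ByteArrayInputStream in place of a file; the class and method names (EofSentinelDemo, countBytes, firstRead) are hypothetical and exist only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class EofSentinelDemo {

    // Counts the bytes in a stream by reading until read() reports EOF with -1.
    static int countBytes(InputStream in) throws IOException {
        int count = 0;
        int data; // must be an int: a byte variable could not tell 0xFF apart from -1
        while ((data = in.read()) != -1) {
            count++;
        }
        return count;
    }

    // Returns the first value produced by read(): 0..255 for data, -1 for EOF.
    static int firstRead(byte[] content) throws IOException {
        try (InputStream in = new ByteArrayInputStream(content)) {
            return in.read();
        }
    }
}
```

Here, firstRead(new byte[] {(byte) 0xFF}) yields 255, while firstRead(new byte[0]) yields -1, so a data byte can never be mistaken for the EOF marker.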

Now let’s create a unit test and make sure the test passes:

@Test
@Order(1)
public void givenDummyText_whenReadWithFileInputStream_thenReturnText() {
    try {
        String actualText = eofDetection.readWithFileInputStream(pathToFile);
        assertEquals(LOREM_IPSUM, actualText);
    } catch (IOException e) {
        fail(e.getMessage());
    }
}

The advantage of FileInputStream is that it works directly on raw bytes with no decoding overhead. Note, however, that calling read() one byte at a time, as we do here, is slow for large files because each call goes to the underlying file; reading into a byte array or wrapping the stream in a BufferedInputStream is usually much faster. There’s also no method to read text line by line, so when reading a text file, we must convert the bytes to characters ourselves.

So, this approach is suitable for reading binary data and provides flexibility in byte-by-byte processing. However, it requires more conversion code if we want to read text data in a structured format.
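
As a variation on the byte-by-byte loop, the same EOF check works when reading into a byte array, because read(byte[]) also returns -1 at the end of the stream. The sketch below is only an illustration: the names ChunkedStreamReader and readAll and the chunkSize parameter are assumptions, not part of the original example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class ChunkedStreamReader {

    // Reads the whole stream in chunks; read(byte[]) returns -1 once EOF is reached.
    static byte[] readAll(InputStream in, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            // A zero-length array would make read() return 0 forever instead of -1.
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        byte[] chunk = new byte[chunkSize];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int bytesRead;
        while ((bytesRead = in.read(chunk)) != -1) {
            // Only the first bytesRead bytes of the chunk are valid.
            out.write(chunk, 0, bytesRead);
        }
        return out.toByteArray();
    }
}
```

For a file, we could pass a FileInputStream as the in argument; any InputStream behaves the same way with respect to the -1 return value.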

4. Detect EOF Using BufferedReader

BufferedReader is a class in the java.io package that reads text from a character input stream. It works by buffering the data, that is, temporarily storing it in memory, so that individual reads don’t each have to go to the underlying source.

BufferedReader provides a readLine() method that reads the file line by line and returns null when it reaches EOF:

String readWithBufferedReader(String pathFile) throws IOException {
    try (FileInputStream fis = new FileInputStream(pathFile);
        InputStreamReader isr = new InputStreamReader(fis);
        BufferedReader reader = new BufferedReader(isr)) {
        StringBuilder actualContent = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            actualContent.append(line);
        }
        return actualContent.toString();
    }
}

Here, the readLine() method reads the contents of the file line by line, and each line is appended to the actualContent variable until readLine() returns null, which indicates EOF. Note that readLine() strips the line terminator from each line, so for a file with multiple lines, the returned string no longer contains the line breaks.
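
If the line structure matters, one option is to collect the lines separately and re-join them with an explicit separator. The following is a minimal sketch under that assumption; LineCollector, readLines, and readJoined are hypothetical names, and a StringReader can stand in for the file in a quick experiment.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

class LineCollector {

    // Collects every line until readLine() returns null, which signals EOF.
    static List<String> readLines(Reader source) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line); // the line terminator has already been removed
            }
        }
        return lines;
    }

    // Re-joins the lines with "\n", since readLine() drops the original terminators.
    static String readJoined(Reader source) throws IOException {
        return String.join("\n", readLines(source));
    }
}
```

Joining with "\n" normalizes Windows-style "\r\n" terminators and omits a trailing newline, so the result is not necessarily byte-for-byte identical to the file.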

Next, let’s do a test to ensure the accuracy of the results:

@Test
@Order(2)
public void givenDummyText_whenReadWithBufferedReader_thenReturnText() {
    try {
        String actualText = eofDetection.readWithBufferedReader(pathToFile);
        assertEquals(LOREM_IPSUM, actualText);
    } catch (IOException e) {
        fail(e.getMessage());
    }
}

Since we have a readLine() method, this technique is great for reading text data in a structured format like CSV. However, it’s not suitable for reading binary data.

5. Detect EOF Using Scanner

Scanner is a class in the java.util package that can read input as various types of data, such as text, integers, and others.

Scanner provides a hasNext() method that checks whether any input remains. We keep reading while it returns true; once it returns false, we’ve reached EOF:

String readWithScanner(String pathFile) throws IOException {
    StringBuilder actualContent = new StringBuilder();
    File file = new File(pathFile);
    // try-with-resources closes the scanner and the underlying file
    try (Scanner scanner = new Scanner(file)) {
        while (scanner.hasNext()) {
            String line = scanner.nextLine();
            actualContent.append(line);
        }
    }
    return actualContent.toString();
}

The scanner keeps reading the file as long as hasNext() evaluates to true. This means we can retrieve String values with the nextLine() method until hasNext() evaluates to false, indicating that we’ve reached EOF. Note that hasNext() looks for the next token, not the next line, so it also returns false when only whitespace remains; hasNextLine() is the check that pairs naturally with nextLine().
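
The difference between the two checks shows up when the input ends with blank lines. The sketch below compares them on an in-memory string; the class and method names (ScannerEofDemo, readWhileHasNext, readWhileHasNextLine) are hypothetical and only serve the comparison.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerEofDemo {

    // Stops as soon as no further token exists, even if blank lines remain.
    static List<String> readWhileHasNext(String input) {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            while (scanner.hasNext()) {
                lines.add(scanner.nextLine());
            }
        }
        return lines;
    }

    // Stops only when no further line exists, so trailing blank lines are kept.
    static List<String> readWhileHasNextLine(String input) {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            while (scanner.hasNextLine()) {
                lines.add(scanner.nextLine());
            }
        }
        return lines;
    }
}
```

For the input "alpha\n\n", the first method returns one line and the second returns two, the second being empty. For input without trailing blank lines, such as "alpha\nbeta", both return the same result.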

Let’s test to make sure the method works correctly:

@Test
@Order(3)
public void givenDummyText_whenReadWithScanner_thenReturnText() {
    try {
        String actualText = eofDetection.readWithScanner(pathToFile);
        assertEquals(LOREM_IPSUM, actualText);
    } catch (IOException e) {
        fail(e.getMessage());
    }
}

The advantage of this method is that it’s very flexible and can easily read various types of data. However, its performance can be slightly slower than that of BufferedReader, and it isn’t suitable for reading binary data.

6. Detect EOF Using FileChannel and ByteBuffer

FileChannel and ByteBuffer are classes in Java NIO (New I/O), which improves on traditional I/O.

FileChannel handles file input and output operations, while ByteBuffer efficiently holds binary data in a byte buffer.

For EOF detection, we’ll use both classes: FileChannel to read the file and ByteBuffer to store the results. The channel’s read() method returns the number of bytes read, or -1 once the channel has reached the end of the file, so a return value of -1 is our EOF signal:

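String readFileWithFileChannelAndByteBuffer(String pathFile) throws IOException {
    try (FileInputStream fis = new FileInputStream(pathFile);
        FileChannel channel = fis.getChannel()) {
        ByteBuffer buffer = ByteBuffer.allocate((int) channel.size());
        // read() returns -1 at EOF; we also stop once the buffer is full,
        // because a full buffer cannot receive bytes and read() would not signal EOF
        while (buffer.hasRemaining() && channel.read(buffer) != -1) {
            // keep filling the buffer
        }
        buffer.flip(); // switch the buffer from writing to reading
        return StandardCharsets.UTF_8.decode(buffer).toString();
    }
}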

This time, we don’t need a StringBuilder because we obtain the text by decoding the ByteBuffer.
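
For files too large to fit comfortably in a single buffer, a common alternative is to reuse a small fixed-size buffer and rely on the -1 return value alone. The following is a sketch of that variation, not the method used above; ChunkedChannelReader, readInChunks, and the chunkSize parameter are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ChunkedChannelReader {

    // Reads a file through a small reusable buffer until read() returns -1 (EOF).
    static String readInChunks(Path path, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            // With a zero-capacity buffer, read() could never make progress.
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(chunkSize);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            while (channel.read(buffer) != -1) {
                buffer.flip();  // switch to reading what was just written
                out.write(buffer.array(), buffer.position(), buffer.remaining());
                buffer.clear(); // make the whole buffer writable again
            }
            // Decode once at the end so multi-byte characters split across chunks stay intact.
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }
}
```

Because the buffer is drained and cleared after every read, it always has free space, so the loop can only end when the channel reports EOF.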

Let’s again test to ensure the method works:

@Test
@Order(4)
public void givenDummyText_whenReadWithFileChannelAndByteBuffer_thenReturnText() {
    try {
        String actualText = eofDetection.readFileWithFileChannelAndByteBuffer(pathToFile);
        assertEquals(LOREM_IPSUM, actualText);
    } catch (IOException e) {
        fail(e.getMessage());
    }
}

This approach provides high performance when reading from or writing to files, is suitable for random access, and supports MappedByteBuffer. However, it’s more intricate to use and demands careful buffer management.

It’s particularly well suited to reading binary data and to applications that require random file access.
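
EOF detection also applies to random access: the positional overload read(ByteBuffer, long) returns -1 when the requested position is at or beyond the end of the file. The sketch below illustrates this; PositionalReadDemo and readAt are hypothetical names chosen for the example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class PositionalReadDemo {

    // Attempts to read up to `length` bytes starting at `position`.
    // Returns the number of bytes read, or -1 if `position` is at or past EOF.
    static int readAt(Path path, long position, int length) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(length);
            // The positional read does not change the channel's own position.
            return channel.read(buffer, position);
        }
    }
}
```

For a five-byte file, readAt(path, 5, 4) returns -1, whereas readAt(path, 0, 4) returns a positive byte count.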

7. FileInputStream vs. BufferedReader vs. Scanner vs. FileChannel and ByteBuffer

The following table summarizes the comparison between the four approaches, each of which has advantages and disadvantages:

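Feature      | FileInputStream | BufferedReader  | Scanner         | FileChannel and ByteBuffer
-------------+-----------------+-----------------+-----------------+---------------------------
Data Type    | Binary          | Structured text | Structured text | Binary
Performance  | Good            | Good            | Good            | Excellent
Flexibility  | High            | Medium          | High            | Low
Ease of use  | Low             | High            | High            | Low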

8. Conclusion

In this article, we learned four ways of EOF detection in Java.

Each approach has its advantages and disadvantages. The right choice depends on the specific needs of our application, whether it involves reading structured text data or binary data, and how critical performance is in our use case.

As always, the full source code is available over on GitHub.
