Validate String as Filename in Java

Last modified: August 15, 2021

1. Overview

In this tutorial, we’ll discuss different ways to validate whether a given String is a valid filename for the OS, using Java. We want to check the value against restricted characters and length limits.

Through examples, we’ll just focus on core solutions, without using any external dependencies. We’ll check the SDK’s java.io and NIO2 packages, and finally implement our own solutions.

2. Using java.io.File

Let’s start with the very first example, using the java.io.File class. In this solution, we need to create a File instance with a given string and then create a file on the local disk:

public static boolean validateStringFilenameUsingIO(String filename) throws IOException {
    File file = new File(filename);
    boolean created = false;
    try {
        created = file.createNewFile();
        return created;
    } finally {
        if (created) {
            file.delete();
        }
    }
}

When the given filename is invalid, the method throws an IOException. Note that, because it actually creates a file, this method requires that no file with the given name already exists: for an existing file, createNewFile() returns false.

We know that different file systems have their own filename limitations. Thus, by using java.io.File methods, we don’t need to specify the rules per OS, because Java automatically takes care of it for us.

However, we need to create a dummy file, and when we succeed, we must remember to delete it afterward. Moreover, we must ensure that we have proper permissions to perform those actions. Such unrelated failures might also cause an IOException, so it’s better to check the error message as well:

assertThatThrownBy(() -> validateStringFilenameUsingIO("baeldung?.txt"))
  .isInstanceOf(IOException.class)
  .hasMessageContaining("Invalid file path");
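
Because createNewFile() returns false for a name that already exists and reports every other failure as an exception, callers may prefer a plain boolean answer. The following is a minimal sketch under stated assumptions, not part of the original example: the class name IoFilenameCheck, the method isCreatable, and the explicit directory parameter are all hypothetical.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper for illustration; names are assumptions, not from the article.
class IoFilenameCheck {

    // Returns true if a file with this name exists or can be created in the directory.
    static boolean isCreatable(File directory, String filename) {
        if (filename == null || filename.isEmpty()) {
            return false;
        }
        File file = new File(directory, filename);
        if (file.isFile()) {
            // An existing regular file shows the file system already accepts the name.
            return true;
        }
        boolean created = false;
        try {
            created = file.createNewFile();
            return created;
        } catch (IOException e) {
            // Covers invalid names, but also missing permissions or a missing directory.
            return false;
        } finally {
            if (created) {
                file.delete(); // remove the probe file again
            }
        }
    }
}
```

Note that a false result still can't distinguish an invalid name from a permission problem, and a name containing a separator is resolved as a subpath rather than rejected.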

3. Using NIO2 API

As we know, the java.io package has many drawbacks because it dates back to the first versions of Java. The NIO2 API, its successor, brings many improvements, which also greatly simplify our previous solution:

public static boolean validateStringFilenameUsingNIO2(String filename) {
    Paths.get(filename);
    return true;
}

Our function is now streamlined, so it’s the fastest way to perform such a test. We don’t create any files, so we don’t need any disk permissions or cleanup after the test.

An invalid filename causes an InvalidPathException, which extends RuntimeException. The error message also contains more details than the previous one:

assertThatThrownBy(() -> validateStringFilenameUsingNIO2(filename))
  .isInstanceOf(InvalidPathException.class)
  .hasMessageContaining("character not allowed");
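
Since InvalidPathException is unchecked, a caller that only wants a yes/no answer has to catch it explicitly. A minimal sketch of such a wrapper follows; the class name Nio2FilenameCheck and the method isParsablePath are assumptions made for illustration.

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Paths;

// Hypothetical wrapper for illustration; names are assumptions, not from the article.
class Nio2FilenameCheck {

    // Returns true if the default file system can parse the string as a path.
    static boolean isParsablePath(String filename) {
        if (filename == null) {
            return false;
        }
        try {
            Paths.get(filename);
            return true;
        } catch (InvalidPathException e) {
            // Thrown when the string contains characters the file system rejects.
            return false;
        }
    }
}
```

Which strings are rejected depends on the platform's default file system, so the same input may give different results on Windows and UNIX.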

This solution has one serious drawback connected with file system limitations. The Path class may represent a file path with subdirectories. Unlike the first example, this method doesn’t check the filename length limit. Let’s check it against a five-hundred-character random String generated using the randomAlphabetic() method from Apache Commons Lang:

String filename = RandomStringUtils.randomAlphabetic(500);
assertThatThrownBy(() -> validateStringFilenameUsingIO(filename))
  .isInstanceOf(IOException.class)
  .hasMessageContaining("File name too long");

assertThat(validateStringFilenameUsingNIO2(filename)).isTrue();

To fix that, we should, as previously, create a file and check the result.
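
One way to combine both ideas with the NIO2 API is sketched below: parse the name, make sure it stays a single element inside a chosen directory, and then let the file system accept or reject it. This is an illustrative sketch only; the class name Nio2CreateCheck, the method canCreate, and the directory parameter are assumptions.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;

// Hypothetical helper for illustration; names are assumptions, not from the article.
class Nio2CreateCheck {

    // Returns true if the name is a single path element that the file system accepts.
    static boolean canCreate(Path directory, String filename) {
        if (filename == null || filename.isEmpty()) {
            return false;
        }
        Path candidate;
        try {
            candidate = directory.resolve(filename);
        } catch (InvalidPathException e) {
            return false; // the name contains characters the file system rejects
        }
        // Reject absolute paths and names with subdirectories.
        if (!directory.equals(candidate.getParent())) {
            return false;
        }
        try {
            Files.createFile(candidate);
        } catch (FileAlreadyExistsException e) {
            // An existing regular file shows the name is valid; a directory does not.
            return Files.isRegularFile(candidate);
        } catch (IOException e) {
            return false; // too long, no permission, missing directory, ...
        }
        try {
            Files.deleteIfExists(candidate); // remove the probe file again
        } catch (IOException ignored) {
            // The name was valid even if cleanup failed.
        }
        return true;
    }
}
```

As with the java.io variant, this needs write permission in the given directory.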

4. Custom Implementations

Finally, let’s try to implement our own custom function to test filenames. We’ll also try to avoid any I/O functionalities and use only core Java methods.

These kinds of solutions give more control and allow us to implement our own rules. However, we must consider many additional limitations for different systems.

4.1. Using String.contains

We can use the String.contains() method to check if the given String holds any of the forbidden characters. First of all, we need to manually specify some example values:

public static final Character[] INVALID_WINDOWS_SPECIFIC_CHARS = {'"', '*', '<', '>', '?', '|'};
public static final Character[] INVALID_UNIX_SPECIFIC_CHARS = {'\000'};

In our example, let’s focus only on these two operating systems. As we know, Windows filenames are more restricted than UNIX ones. Also, some whitespace characters might be problematic.

After defining the restricted character sets, let’s determine the current OS:

public static Character[] getInvalidCharsByOS() {
    String os = System.getProperty("os.name").toLowerCase();
    if (os.contains("win")) {
        return INVALID_WINDOWS_SPECIFIC_CHARS;
    } else if (os.contains("nix") || os.contains("nux") || os.contains("mac")) {
        return INVALID_UNIX_SPECIFIC_CHARS;
    } else {
        return new Character[]{};
    }
}

And now we can use it to test the given value:

public static boolean validateStringFilenameUsingContains(String filename) {
    if (filename == null || filename.isEmpty() || filename.length() > 255) {
        return false;
    }
    return Arrays.stream(getInvalidCharsByOS())
      .noneMatch(ch -> filename.contains(ch.toString()));
}

The noneMatch() predicate returns true only if none of our defined characters appears in the given filename. Additionally, we reject null, empty, and overly long values.
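
Forbidden characters are only one kind of rule. As an example of the additional per-system limitations a custom check has to cover, Windows also reserves device names such as CON or NUL (with or without an extension) and doesn't allow names that end with a space or a period. The sketch below illustrates such a check; the class name WindowsNameRules and the method compliesWithWindowsNameRules are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration; names are assumptions, not from the article.
class WindowsNameRules {

    // Device names that Windows reserves, regardless of case or extension.
    private static final Set<String> RESERVED =
      new HashSet<>(Arrays.asList("CON", "PRN", "AUX", "NUL"));

    static {
        for (int i = 1; i <= 9; i++) {
            RESERVED.add("COM" + i);
            RESERVED.add("LPT" + i);
        }
    }

    static boolean compliesWithWindowsNameRules(String filename) {
        if (filename == null || filename.isEmpty()) {
            return false;
        }
        // Names must not end with a space or a period.
        if (filename.endsWith(" ") || filename.endsWith(".")) {
            return false;
        }
        // Compare the part before the first dot against the reserved names.
        int dot = filename.indexOf('.');
        String base = dot >= 0 ? filename.substring(0, dot) : filename;
        return !RESERVED.contains(base.toUpperCase(Locale.ROOT));
    }
}
```

This check complements the character test rather than replacing it, and it only makes sense when the target OS is Windows.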

4.2. Regex Pattern Matching

We can also use regular expressions directly on the given String. Let’s implement a pattern accepting only alphanumeric and dot characters, with the length not larger than 255:

public static final String REGEX_PATTERN = "^[A-Za-z0-9.]{1,255}$";

public static boolean validateStringFilenameUsingRegex(String filename) {
    if (filename == null) {
        return false;
    }
    return filename.matches(REGEX_PATTERN);
}

Now, we can test the given value against the previously prepared pattern. We can also easily modify the pattern. We skipped the OS check feature in this example.
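
String.matches() compiles the regular expression on every call, so a check that runs often may benefit from a precompiled Pattern. The following is a minimal sketch of that variant; the class name RegexFilenameCheck and the method isValid are assumptions. Note the explicit A-Z and a-z ranges: a single A-z range would also admit the punctuation characters that sit between Z and a in ASCII, such as the underscore.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; names are assumptions, not from the article.
class RegexFilenameCheck {

    // Compiled once; matches() tests the whole input, so no ^ or $ anchors are needed.
    private static final Pattern FILENAME = Pattern.compile("[A-Za-z0-9.]{1,255}");

    static boolean isValid(String filename) {
        return filename != null && FILENAME.matcher(filename).matches();
    }
}
```

For example, isValid("baeldung.txt") is accepted, while "bael_dung.txt" is rejected because the underscore is outside the allowed set.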

5. Conclusion

In this article, we focused on filenames and their limitations. We introduced different algorithms to detect an invalid filename using Java.

We started with the java.io package, which takes care of most of the system limitations for us but performs additional I/O actions and might require some permissions. Then we checked the NIO2 API, which is the fastest solution but doesn’t check the filename length.

Finally, we implemented our own methods, without using any I/O API, but requiring the custom implementation of file system rules.

You can find all the examples with additional tests over on GitHub.
