Find Files that Match Wildcard Strings in Java

Last modified: May 19, 2022

1. Overview

In this tutorial, we’ll learn how to find files using wildcard strings in Java.

2. Introduction

In programming, a glob is a pattern with wildcards that is used to match filenames. In our example, we’ll use glob patterns to filter a list of filenames, relying on the common wildcards “*” and “?”. Java has supported this feature since Java SE 7.

Java provides the getPathMatcher() method in its FileSystem class. The method accepts a string of the form syntax:pattern, where the syntax is either “regex” for a regular expression or “glob” for a glob pattern. We’ll use glob patterns in this example because wildcards are simpler to apply than regex.

Let’s see an example of using this method with a glob pattern:

String pattern = "myCustomPattern";
PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + pattern);
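
As a minimal sketch of the two syntaxes that getPathMatcher() accepts, the following compares a glob pattern with an equivalent regular expression. The class name SyntaxPrefixDemo, the helper matches(), and the sample file names are hypothetical and only serve the illustration:

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

class SyntaxPrefixDemo {

    // Hypothetical helper: syntaxAndPattern has the form "glob:..." or "regex:..."
    static boolean matches(String syntaxAndPattern, String fileName) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher(syntaxAndPattern);
        return matcher.matches(Paths.get(fileName));
    }

    public static void main(String[] args) {
        // The glob and the regex below express the same rule: the name ends in ".java"
        System.out.println(matches("glob:*.java", "Main.java"));      // true
        System.out.println(matches("regex:.*\\.java", "Main.java"));  // true
        System.out.println(matches("glob:*.java", "Main.class"));     // false
    }
}
```

The glob version needs no escaping of the dot, which is the simplicity the paragraph above refers to.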

Here are some examples of glob patterns in Java:

Glob              Description
*.java            Matches all files with the extension “java”
*.{java,class}    Matches all files with the extension “java” or “class”
*.*               Matches all files with a “.” somewhere in the name
????              Matches all files with exactly four characters in the name
[test].docx       Matches all files with the one-character name ‘t’, ‘e’, or ‘s’ and the extension “docx”
[0-4].csv         Matches all files with the one-character name ‘0’, ‘1’, ‘2’, ‘3’, or ‘4’ and the extension “csv”
C:\\temp\\*       Matches all files in the “C:\temp” directory on Windows systems
src/test/*        Matches all files in the “src/test/” directory on Unix-based systems
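
To make a few of these patterns concrete, here is a small sketch that checks them against sample names. The class GlobTableDemo, the helper globMatches(), and the file names are assumptions made up for this illustration; no files need to exist, because the matcher only inspects the path string:

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

class GlobTableDemo {

    // Hypothetical helper: prepends the "glob:" prefix and matches a relative path
    static boolean globMatches(String glob, String path) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(path));
    }

    public static void main(String[] args) {
        System.out.println(globMatches("*.{java,class}", "Foo.class")); // true
        System.out.println(globMatches("????", "abcd"));                // true
        System.out.println(globMatches("????", "abcde"));               // false: five characters
        System.out.println(globMatches("[test].docx", "t.docx"));       // true
        System.out.println(globMatches("[test].docx", "test.docx"));    // false: brackets match one character
        System.out.println(globMatches("[0-4].csv", "3.csv"));          // true
        System.out.println(globMatches("[0-4].csv", "5.csv"));          // false
        System.out.println(globMatches("*.*", "README"));               // false: no "." in the name
        // "*" does not cross directory boundaries
        System.out.println(globMatches("*.java", "src/Main.java"));     // false
    }
}
```

The last check shows why it helps to match against the file name alone, rather than the full path, when searching a directory tree.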

3. Implementation

Let’s get into the details of implementing this solution. There are two steps to complete this task.

First, we create a method that takes two arguments: a root directory to search within and a wildcard pattern to look for. This method contains the logic for visiting every file and directory, applying the glob pattern, and finally returning a list of matching file names.

Second, we use the walkFileTree method from Java’s Files class to invoke our search process.

To start, let’s create our SearchFileByWildcard class with a searchWithWc() method, which takes a Path and String pattern as parameters:

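import java.io.IOException;
import java.nio.file.*;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

class SearchFileByWildcard {
    // Shared across calls, so it is cleared at the start of every search
    static List<String> matchesList = new ArrayList<String>();

    // pattern must include the syntax prefix, for example "glob:*.txt"
    List<String> searchWithWc(Path rootDir, String pattern) throws IOException {
        matchesList.clear();
        FileVisitor<Path> matcherVisitor = new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attribs) throws IOException {
                FileSystem fs = FileSystems.getDefault();
                PathMatcher matcher = fs.getPathMatcher(pattern);
                // Match against the file name only, not the full path
                Path name = file.getFileName();
                if (matcher.matches(name)) {
                    matchesList.add(name.toString());
                }
                return FileVisitResult.CONTINUE;
            }
        };
        Files.walkFileTree(rootDir, matcherVisitor);
        return matchesList;
    }
}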

To visit the files in rootDir, we use the FileVisitor interface. Once we obtain a FileSystem instance by invoking FileSystems.getDefault(), we call its getPathMatcher() method. This is where we apply the glob pattern to the individual file names within rootDir.

In our case, we use the resulting PathMatcher to collect the matching filenames in an ArrayList.

Finally, we call the walkFileTree method from the NIO Files class. File traversal starts at rootDir, and each node in the tree is visited recursively in a depth-first manner. matcherVisitor overrides the visitFile method of the SimpleFileVisitor class.
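
The traversal can also be sketched with a result list that is local to each call and a matcher that is built once per search. This is only an illustrative variant, not the tutorial's implementation: the class WildcardSearchSketch, the methods search() and demo(), and the temporary file names are all hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileSystems;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class WildcardSearchSketch {

    // Hypothetical variant: no shared static state, matcher created once per search
    static List<String> search(Path rootDir, String syntaxAndPattern) throws IOException {
        List<String> matches = new ArrayList<>();
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher(syntaxAndPattern);
        Files.walkFileTree(rootDir, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                Path name = file.getFileName();
                if (matcher.matches(name)) {
                    matches.add(name.toString());
                }
                return FileVisitResult.CONTINUE;
            }
        });
        return matches;
    }

    // Builds a throwaway directory tree with made-up file names and searches it.
    // The temporary files are left on disk; real code should clean them up.
    static List<String> demo() {
        try {
            Path root = Files.createTempDirectory("wildcard-demo");
            Path nested = Files.createDirectories(root.resolve("nested"));
            Files.createFile(root.resolve("a.txt"));
            Files.createFile(root.resolve("b.csv"));
            Files.createFile(nested.resolve("c.docx"));
            List<String> found = search(root, "glob:*.{txt,docx}");
            Collections.sort(found); // traversal order is not guaranteed
            return found;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [a.txt, c.docx]
    }
}
```

Because walkFileTree descends into subdirectories, c.docx in the nested directory is found as well, while b.csv is skipped by the pattern.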

Now that we’ve discussed implementing a wildcard-based file search, let’s look at some sample output. We’ll use the following file structure for our examples:

[Figure: directory tree of the sample files (fileStructureUnix)]

If we pass a String with the “glob:*.{txt,docx}” pattern, our code outputs the three filenames with the extension “txt” and one filename with the extension “docx”:

SearchFileByWildcard sfbw = new SearchFileByWildcard();
List<String> actual = sfbw.searchWithWc(Paths.get("src/test/resources/sfbw"), "glob:*.{txt,docx}");

assertEquals(new HashSet<>(Arrays.asList("six.txt", "three.txt", "two.docx", "one.txt")), 
  new HashSet<>(actual));

If we pass a String with the “glob:????.{csv}” pattern, our code outputs the one filename that has exactly four characters before the “.csv” extension:

SearchFileByWildcard sfbw = new SearchFileByWildcard();
List<String> actual = sfbw.searchWithWc(Paths.get("src/test/resources/sfbw"), "glob:????.{csv}");

assertEquals(new HashSet<>(Arrays.asList("five.csv")), new HashSet<>(actual));

4. Conclusion

In this tutorial, we learned how to search for files using wildcard patterns in Java.

The source code is available over on GitHub.
