1. Overview
When we work with files in Java, we often need to extract the filename from a given absolute path.
In this tutorial, we’ll explore several ways to extract the filename, using both the standard library and Apache Commons IO.
2. Introduction to the Problem
The problem is pretty straightforward. Imagine we’re given an absolute file path string, and we want to extract the filename from it. A couple of examples may explain the problem quickly:
String PATH_LINUX = "/root/with space/subDir/myFile.linux";
String EXPECTED_FILENAME_LINUX = "myFile.linux";
String PATH_WIN = "C:\\root\\with space\\subDir\\myFile.win";
String EXPECTED_FILENAME_WIN = "myFile.win";
As we’ve seen, different filesystems may use different file separators. Therefore, in this tutorial, we’ll look at some platform-independent solutions. In other words, the same implementation will work on both *nix and Windows systems.
For simplicity, we’ll use unit test assertions to verify that the solutions work as expected.
Next, let’s see them in action.
3. Parsing the Absolute Path as a String
First of all, filesystems don’t allow filenames to contain file separators. So, for example, we cannot create a file whose name contains “/” on Linux’s Ext2, Ext3, or Ext4 filesystems:
$ touch "a/b.txt"
touch: cannot touch 'a/b.txt': No such file or directory
In the example above, the filesystem treats “a/” as a directory. Based on this rule, one way to solve the problem is to take the substring that follows the last file separator in the path.
String’s lastIndexOf() method returns the index of the last occurrence of a substring within the string. Then, we can get the filename by calling absolutePath.substring(lastIndex + 1).
As we can see, the implementation is straightforward. However, to keep our solution system-independent, we shouldn’t hard-code the file separator as “\\” for Windows or “/” for *nix systems. Instead, let’s use File.separator in our code so that our program automatically adapts to the system it’s running on:
int index = PATH_LINUX.lastIndexOf(File.separator);
String filenameLinux = PATH_LINUX.substring(index + 1);
assertEquals(EXPECTED_FILENAME_LINUX, filenameLinux);
The test above passes if we run it on a Linux machine. Similarly, the test below passes on a Windows machine:
int index = PATH_WIN.lastIndexOf(File.separator);
String filenameWin = PATH_WIN.substring(index + 1);
assertEquals(EXPECTED_FILENAME_WIN, filenameWin);
As we can see, the same implementation works on both systems, provided the path uses the separator of the system the test runs on.
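Two edge cases of this approach are worth a quick look: a path without any separator and a path that ends with one. The following is a minimal sketch, not part of the original examples; LastSeparatorDemo and filenameOf() are hypothetical names, and the separator is passed in as a parameter only so that the behavior can be checked on any platform:

```java
import java.io.File;

class LastSeparatorDemo {

    // Hypothetical helper: returns whatever follows the last occurrence of the separator.
    static String filenameOf(String path, String separator) {
        int index = path.lastIndexOf(separator);
        // lastIndexOf() returns -1 when the separator is absent,
        // so substring(0) yields the whole input unchanged.
        return path.substring(index + 1);
    }

    public static void main(String[] args) {
        // Build a path using the separator of the platform we're running on.
        String path = String.join(File.separator, "root", "with space", "subDir", "myFile.txt");
        System.out.println(filenameOf(path, File.separator));

        // No separator at all: the input itself is treated as the filename.
        System.out.println(filenameOf("myFile.txt", File.separator));

        // Trailing separator: nothing follows it, so the result is an empty string.
        System.out.println(filenameOf("subDir" + File.separator, File.separator).isEmpty());
    }
}
```

In real code, we'd normally pass File.separator, as in the tests above. Whether an empty result for a trailing separator is acceptable depends on the caller.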
Apart from parsing the absolute path as a string, we can use the standard File class to solve the problem.
4. Using the File.getName() Method
The File class provides the getName() method to get the filename directly. To use it, we first construct a File object from the given absolute path string.
Let’s first test it on a Linux system:
File fileLinux = new File(PATH_LINUX);
assertEquals(EXPECTED_FILENAME_LINUX, fileLinux.getName());
The test passes when we run it. Since File relies on the platform’s file separator internally, the same solution passes on a Windows system as well:
File fileWin = new File(PATH_WIN);
assertEquals(EXPECTED_FILENAME_WIN, fileWin.getName());
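It may help to note that getName() works on the path string alone, so the file doesn't need to exist. The following is a minimal sketch that illustrates this along with two edge cases. FileGetNameDemo and nameOf() are hypothetical names introduced only for illustration, and the paths are built with File.separator so that they're valid on whichever platform runs the code:

```java
import java.io.File;

class FileGetNameDemo {

    // Hypothetical helper: wraps the constructor call and getName() in one step.
    static String nameOf(String path) {
        return new File(path).getName();
    }

    public static void main(String[] args) {
        String sep = File.separator;

        // The file doesn't have to exist; only the path string is inspected.
        System.out.println(nameOf("reports" + sep + "summary.txt"));

        // File normalizes the path and drops the trailing separator,
        // so the last directory name is returned.
        System.out.println(nameOf("reports" + sep + "archive" + sep));

        // An empty path has an empty name sequence, so getName() returns "".
        System.out.println(nameOf("").isEmpty());
    }
}
```

Unlike the plain substring approach, a trailing separator here yields the last directory name rather than an empty string.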
5. Using the Path.getFileName() Method
File is a standard class from the java.io package. Since Java 7, the newer java.nio.file package has provided the Path interface.
Once we have a Path object, we can get the filename by calling the Path.getFileName() method. Unlike File, which we instantiate through its constructor, we create a Path instance using the static Paths.get() method.
Next, let’s create a Path instance from the given PATH_LINUX string and test the solution on Linux:
Path pathLinux = Paths.get(PATH_LINUX);
assertEquals(EXPECTED_FILENAME_LINUX, pathLinux.getFileName().toString());
When we execute the test, it passes. It’s worth mentioning that Path.getFileName() returns a Path object. Therefore, we call the toString() method explicitly to convert it into a string.
The same implementation works on a Windows system with PATH_WIN as the path string, too. This is because Paths.get() parses the string according to the default FileSystem of the platform it’s running on:
Path pathWin = Paths.get(PATH_WIN);
assertEquals(EXPECTED_FILENAME_WIN, pathWin.getFileName().toString());
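One detail to watch for: getFileName() returns null when the path has no name elements, for example when it denotes a root directory, so chaining toString() directly can throw a NullPointerException. A minimal null-safe sketch might look like the following; PathFileNameDemo and fileNameOrEmpty() are hypothetical names, and falling back to an empty string is only one possible convention:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class PathFileNameDemo {

    // Hypothetical helper: converts getFileName() to a String, falling back to ""
    // when the path has no name elements (for example, a root directory).
    static String fileNameOrEmpty(Path path) {
        Path name = path.getFileName();
        return name == null ? "" : name.toString();
    }

    public static void main(String[] args) {
        // A relative path built from name elements; valid on any platform.
        Path file = Paths.get("subDir", "myFile.txt");
        System.out.println(fileNameOrEmpty(file));

        // The root of the current working directory has no name elements.
        Path root = Paths.get("").toAbsolutePath().getRoot();
        System.out.println(fileNameOrEmpty(root).isEmpty());
    }
}
```

Returning an Optional<String> instead of an empty string would be a reasonable alternative if callers need to tell the two cases apart.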
6. Using FilenameUtils.getName() From Apache Commons IO
So far, we’ve looked at three solutions for extracting the filename from an absolute path. As we’ve mentioned, they’re platform-independent. However, all three work correctly only if the given absolute path matches the system the program is running on. For instance, our program can only handle Windows paths if it runs on Windows.
6.1. The Intelligent FilenameUtils.getName() Method
In practice, we rarely need to parse a path in another system’s format. However, Apache Commons IO’s FilenameUtils class can “intelligently” extract the filename from different path formats. So if our program runs on Windows, it can also handle Linux file paths, and vice versa.
Next, let’s create a test:
String filenameLinux = FilenameUtils.getName(PATH_LINUX);
assertEquals(EXPECTED_FILENAME_LINUX, filenameLinux);
String filenameWin = FilenameUtils.getName(PATH_WIN);
assertEquals(EXPECTED_FILENAME_WIN, filenameWin);
As we can see, the test above parses both PATH_LINUX and PATH_WIN, and it passes whether we run it on Linux or Windows.
So next, we may want to know how FilenameUtils can automatically handle paths of different systems.
6.2. How FilenameUtils.getName() Works
If we have a look at FilenameUtils.getName()’s implementation, its logic is similar to our “lastIndexOf” file separator approach. The difference is that FilenameUtils calls the lastIndexOf() method twice, once with the *nix separator (/) and once with the Windows file separator (\). Finally, it takes the greater index as the “lastIndex”:
...
final int lastUnixPos = fileName.lastIndexOf(UNIX_SEPARATOR); // UNIX_SEPARATOR = '/'
final int lastWindowsPos = fileName.lastIndexOf(WINDOWS_SEPARATOR); // WINDOWS_SEPARATOR = '\\'
return Math.max(lastUnixPos, lastWindowsPos);
Therefore, FilenameUtils.getName() doesn’t check the current filesystem or the system’s file separator. Instead, it finds the index of the last file separator, no matter which system it belongs to, and then returns the substring that follows this index as the final result.
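To make the idea concrete, here’s a simplified re-creation of that logic in plain Java, with no dependency on Commons IO. It’s a sketch of the technique described above rather than the library’s actual code: TwoSeparatorDemo and nameOf() are hypothetical names, and null handling or any other checks the real method may perform are left out:

```java
class TwoSeparatorDemo {

    private static final char UNIX_SEPARATOR = '/';
    private static final char WINDOWS_SEPARATOR = '\\';

    // Index of the last separator of either kind, or -1 if neither is present.
    static int indexOfLastSeparator(String path) {
        int lastUnixPos = path.lastIndexOf(UNIX_SEPARATOR);
        int lastWindowsPos = path.lastIndexOf(WINDOWS_SEPARATOR);
        return Math.max(lastUnixPos, lastWindowsPos);
    }

    // Everything after the last separator, regardless of the platform we run on.
    static String nameOf(String path) {
        return path.substring(indexOfLastSeparator(path) + 1);
    }

    public static void main(String[] args) {
        System.out.println(nameOf("/root/with space/subDir/myFile.linux"));
        System.out.println(nameOf("C:\\root\\with space\\subDir\\myFile.win"));
    }
}
```

Because neither call consults File.separator, the result is the same on every platform. The cost is that both characters are always treated as separators.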
6.3. An Edge Case That Makes FilenameUtils.getName() Fail
Now we understand how FilenameUtils.getName() works. It’s indeed a clever solution, and it works in most cases. However, many Linux-supported filesystems allow a filename to contain backslashes (‘\’):
$ echo 'Hi there!' > 'my\file.txt'
$ ls -l my*
-rw-r--r-- 1 kent kent 10 Sep 13 23:55 'my\file.txt'
$ cat 'my\file.txt'
Hi there!
If the filename in the given Linux file path contains backslashes, FilenameUtils.getName() returns the wrong result. A test shows this clearly:
String filenameToBreak = FilenameUtils.getName("/root/somedir/magic\\file.txt");
assertNotEquals("magic\\file.txt", filenameToBreak); // <-- filenameToBreak = "file.txt", but we expect: magic\file.txt
We should keep this case in mind when we use this method.
7. Conclusion
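In this article, we’ve learned how to extract the filename from a given absolute path string by parsing the string ourselves, by using File.getName() and Path.getFileName(), and by using FilenameUtils.getName() from Apache Commons IO.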
As always, the full source code of the example is available over on GitHub.