1. Overview
A MIME type is a label that specifies the type and format of data on the internet. A single MIME type can be associated with multiple file extensions. For instance, the “image/jpeg” MIME type covers extensions such as “.jpg”, “.jpeg”, and “.jpe”.
In this tutorial, we’ll explore different methods for determining the file extension for a particular MIME type in Java. We’ll focus on four major approaches to solve the problem.
Some of our implementations return the extension with a leading dot, and others return it without one. For example, for the MIME type “image/jpeg”, either the string “jpg” or “.jpg” will be returned as the file’s extension.
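Because the libraries differ on the leading dot, it can help to normalize extensions before comparing results across them. The following is a minimal sketch of such a helper; the class name ExtensionFormat and its two methods are hypothetical and not part of any library covered here:

```java
// Hypothetical helper (not from any library in this article) for
// normalizing extensions that may or may not carry a leading dot.
final class ExtensionFormat {

    private ExtensionFormat() {
    }

    // Adds a leading dot if it is missing: "jpg" -> ".jpg", ".jpg" -> ".jpg"
    static String withDot(String extension) {
        return extension.startsWith(".") ? extension : "." + extension;
    }

    // Removes a single leading dot if present: ".jpg" -> "jpg", "jpg" -> "jpg"
    static String withoutDot(String extension) {
        return extension.startsWith(".") ? extension.substring(1) : extension;
    }

    public static void main(String[] args) {
        System.out.println(withDot("jpg"));
        System.out.println(withoutDot(".jpg"));
    }
}
```

With a helper like this, results from a library that returns “.jpg” and one that returns “jpg” can be compared in a single, consistent form.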
2. Using Apache Tika
Apache Tika is a toolkit that detects and extracts metadata and text from various files. It includes a rich and powerful API that can be used to detect file extensions for a MIME type.
Let’s begin by configuring the Maven dependency:
<dependency>
    <groupId>org.apache.tika</groupId>
    <artifactId>tika-core</artifactId>
    <version>2.9.0</version>
</dependency>
As mentioned before, a single MIME type can have multiple extensions. To handle this, the MimeType class provides two distinct methods: getExtension() and getExtensions().
The getExtension() method returns the preferred file extension, while getExtensions() returns the list of all known file extensions for that MIME type.
Next, we’ll use both methods of the MimeType class to retrieve the extensions:
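import org.apache.tika.mime.MimeType;
import org.apache.tika.mime.MimeTypeException;
import org.apache.tika.mime.MimeTypes;

@Test
public void whenUsingTika_thenGetFileExtension() throws MimeTypeException {
    List<String> expectedExtensions = Arrays.asList(".jpg", ".jpeg", ".jpe", ".jif", ".jfif", ".jfi");
    // Look up the MIME type in Tika's default registry; forName() throws a checked MimeTypeException
    MimeTypes allTypes = MimeTypes.getDefaultMimeTypes();
    MimeType type = allTypes.forName("image/jpeg");
    // The preferred extension includes the leading dot
    String primaryExtension = type.getExtension();
    assertEquals(".jpg", primaryExtension);
    // All extensions known for this MIME type
    List<String> detectedExtensions = type.getExtensions();
    assertThat(detectedExtensions).containsExactlyElementsOf(expectedExtensions);
}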
3. Using Jodd Util
We can alternatively use the Jodd Util library, which contains a utility to find file extensions for a MIME type.
Let’s begin by adding the Maven dependency:
<dependency>
    <groupId>org.jodd</groupId>
    <artifactId>jodd-util</artifactId>
    <version>6.2.1</version>
</dependency>
Next, we’ll use the findExtensionsByMimeTypes() method to get all the supported file extensions:
import jodd.net.MimeTypes;

@Test
public void whenUsingJodd_thenGetFileExtension() {
    List<String> expectedExtensions = Arrays.asList("jpeg", "jpg", "jpe");
    // false disables wildcard mode, so only the exact MIME type is looked up
    String[] detectedExtensions = MimeTypes.findExtensionsByMimeTypes("image/jpeg", false);
    assertThat(detectedExtensions).containsExactlyElementsOf(expectedExtensions);
}
Jodd Util provides a limited set of recognized file types and extensions. It prioritizes simplicity over comprehensive coverage.
In the findExtensionsByMimeTypes() method, we can activate wildcard mode by setting the second boolean parameter to true. When we pass a wildcard pattern as the MIME type, we get the extensions of all MIME types that match that pattern.
For instance, when we set the MIME type as image/* and enable wildcard mode, we obtain extensions for all MIME types within the image category.
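To make the idea of wildcard mode concrete, here is a rough sketch in plain Java. It isn't Jodd’s implementation: the WildcardLookup class, its tiny lookup table, and the restriction to a trailing “/*” pattern are all assumptions made for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of wildcard lookup; not Jodd's actual code.
final class WildcardLookup {

    // Tiny sample table; a real library ships a much larger one.
    private static final Map<String, List<String>> TABLE = new LinkedHashMap<>();

    static {
        TABLE.put("image/jpeg", Arrays.asList("jpeg", "jpg", "jpe"));
        TABLE.put("image/png", Arrays.asList("png"));
        TABLE.put("text/plain", Arrays.asList("txt"));
    }

    private WildcardLookup() {
    }

    // With useWildcard set, a pattern ending in "/*" matches every subtype
    // of that top-level type; otherwise only an exact match is returned.
    static List<String> findExtensions(String mimeType, boolean useWildcard) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : TABLE.entrySet()) {
            boolean matches;
            if (useWildcard && mimeType.endsWith("/*")) {
                // "image/*" -> prefix "image/"
                String prefix = mimeType.substring(0, mimeType.length() - 1);
                matches = entry.getKey().startsWith(prefix);
            } else {
                matches = entry.getKey().equals(mimeType);
            }
            if (matches) {
                result.addAll(entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(findExtensions("image/*", true));
    }
}
```

In this sketch, passing “image/*” with wildcard mode enabled collects the extensions of both sample image types, while the same pattern with wildcard mode disabled matches nothing.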
4. Using SimpleMagic
SimpleMagic is a utility package whose primary use is MIME type detection for files. It also contains a way to convert a MIME type to a file extension.
Let’s start by adding the Maven dependency:
<dependency>
    <groupId>com.j256.simplemagic</groupId>
    <artifactId>simplemagic</artifactId>
    <version>1.17</version>
</dependency>
Now, we’ll use the getFileExtensions() method of the ContentType enum to get all the supported file extensions:
import com.j256.simplemagic.ContentType;

@Test
public void whenUsingSimpleMagic_thenGetFileExtension() {
    List<String> expectedExtensions = Arrays.asList("jpeg", "jpg", "jpe");
    // Resolve the enum constant for the MIME type, then read the extensions registered for it
    String[] detectedExtensions = ContentType.fromMimeType("image/jpeg").getFileExtensions();
    assertThat(detectedExtensions).containsExactlyElementsOf(expectedExtensions);
}
The SimpleMagic library defines a ContentType enum that maps MIME types to their corresponding file extensions and simple names. The fromMimeType() method looks up the enum constant for the provided MIME type, and getFileExtensions() returns the file extensions registered for it.
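The enum-based approach can be pictured with a small stand-in. The SampleContentType enum below is hypothetical and far smaller than SimpleMagic’s real ContentType; it only sketches how an enum constant can carry a MIME type together with its extensions:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for an enum-backed MIME table; not SimpleMagic's source.
enum SampleContentType {
    JPEG("image/jpeg", "jpeg", "jpg", "jpe"),
    PNG("image/png", "png"),
    PLAIN_TEXT("text/plain", "txt");

    private final String mimeType;
    private final List<String> fileExtensions;

    SampleContentType(String mimeType, String... fileExtensions) {
        this.mimeType = mimeType;
        this.fileExtensions = Collections.unmodifiableList(Arrays.asList(fileExtensions));
    }

    List<String> getFileExtensions() {
        return fileExtensions;
    }

    // Scans the constants for a matching MIME type, ignoring case;
    // returns an empty Optional when the type is unknown.
    static Optional<SampleContentType> fromMimeType(String mimeType) {
        return Arrays.stream(values())
            .filter(type -> type.mimeType.equalsIgnoreCase(mimeType))
            .findFirst();
    }
}
```

Returning an Optional here is a design choice of the sketch, so that an unknown MIME type is handled explicitly by the caller.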
5. Using a Custom Map of MIME Type to Extensions
We can also obtain a file extension from a MIME type without depending on external libraries. We’ll create a custom mapping of MIME types to file extensions to do this.
Let’s create a HashMap named mimeToExtensionMap to associate MIME types with their corresponding file extensions. The get() method allows us to look up the preconfigured file extensions for the provided MIME type in the map and return them:
@Test
public void whenUsingCustomMap_thenGetFileExtension() {
    Map<String, Set<String>> mimeToExtensionMap = new HashMap<>();
    List<String> expectedExtensions = Arrays.asList(".jpg", ".jpe", ".jpeg");
    addMimeExtensions(mimeToExtensionMap, "image/jpeg", ".jpg");
    addMimeExtensions(mimeToExtensionMap, "image/jpeg", ".jpe");
    addMimeExtensions(mimeToExtensionMap, "image/jpeg", ".jpeg");
    Set<String> detectedExtensions = mimeToExtensionMap.get("image/jpeg");
    // A HashSet has no guaranteed iteration order, so we compare without regard to order
    assertThat(detectedExtensions).containsExactlyInAnyOrderElementsOf(expectedExtensions);
}

// Adds one extension to the set kept for the given MIME type, creating the set on first use
void addMimeExtensions(Map<String, Set<String>> map, String mimeType, String extension) {
    map.computeIfAbsent(mimeType, k -> new HashSet<>()).add(extension);
}
The sample map contains only a few entries, but we can easily extend it by adding more mappings as needed.
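If the custom mapping grows, it may be worth wrapping it in a small class. The sketch below is one possible shape, not a definitive implementation: the MimeExtensionRegistry name and its methods are hypothetical. It uses a LinkedHashSet so that the first registered extension can act as the preferred one, and it lower-cases MIME types because they’re case-insensitive:

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical wrapper around a custom MIME-type-to-extensions map.
final class MimeExtensionRegistry {

    private final Map<String, Set<String>> extensionsByMime = new LinkedHashMap<>();

    // Registers extensions in the given order; duplicates are ignored.
    MimeExtensionRegistry register(String mimeType, String... extensions) {
        Set<String> known = extensionsByMime.computeIfAbsent(normalize(mimeType), k -> new LinkedHashSet<>());
        Collections.addAll(known, extensions);
        return this;
    }

    // Returns all registered extensions, or an empty set for unknown types.
    Set<String> extensionsFor(String mimeType) {
        return Collections.unmodifiableSet(
            extensionsByMime.getOrDefault(normalize(mimeType), Collections.emptySet()));
    }

    // Treats the first registered extension as the preferred one.
    Optional<String> primaryExtension(String mimeType) {
        return extensionsFor(mimeType).stream().findFirst();
    }

    private static String normalize(String mimeType) {
        return mimeType.trim().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        MimeExtensionRegistry registry = new MimeExtensionRegistry()
            .register("image/jpeg", ".jpg", ".jpe", ".jpeg");
        System.out.println(registry.primaryExtension("image/jpeg").orElse("(none)"));
    }
}
```

Because insertion order is preserved, a caller can rely on the order of the returned extensions, which a plain HashSet doesn’t guarantee.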
6. Conclusion
In this article, we explored different methods for extracting file extensions from MIME types. We examined two distinct approaches: leveraging existing libraries and crafting custom logic tailored to our needs.
When dealing with a limited set of MIME types, custom logic is an option, though the mappings must be maintained by hand. Conversely, libraries such as Apache Tika or Jodd Util cover many more MIME types out of the box and are easy to use, making them a reliable choice for handling a wide array of MIME types.
As always, the source code used in this article is available over on GitHub.