Getting a File’s MIME Type in Java

Last modified: July 22, 2018

1. Overview

In this tutorial, we’ll take a look at various strategies for getting the MIME type of a file. Where applicable, we’ll also look at ways to extend the MIME types available to each strategy.

We’ll also point out where we should favor one strategy over the other.

2. Using Java 7

Let’s start with Java 7 – which provides the method Files.probeContentType(path) for resolving the MIME type:

@Test
public void whenUsingJava7_thenSuccess() throws IOException {
    Path path = new File("product.png").toPath();
    // Returns null if no installed detector recognizes the file
    String mimeType = Files.probeContentType(path);

    assertEquals("image/png", mimeType);
}

This method makes use of the installed FileTypeDetector implementations to probe the MIME type. It invokes the probeContentType of each implementation to resolve the type.

Now, if the file is recognized by any of the implementations, the content type is returned. However, if that doesn’t happen, a system-default file type detector is invoked.
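
To illustrate how the set of recognized types could be extended here, below is a minimal sketch of a custom FileTypeDetector. The class name MarkdownFileTypeDetector and the ".md" to "text/markdown" mapping are our own assumptions, chosen only for illustration:

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.spi.FileTypeDetector;
import java.util.Locale;

// Hypothetical detector: maps the ".md" extension to "text/markdown"
final class MarkdownFileTypeDetector extends FileTypeDetector {

    @Override
    public String probeContentType(Path path) {
        Path fileName = path.getFileName();
        if (fileName != null
                && fileName.toString().toLowerCase(Locale.ROOT).endsWith(".md")) {
            return "text/markdown";
        }
        // null means "not recognized", so other detectors get a chance
        return null;
    }

    public static void main(String[] args) {
        // Calls the detector directly, without any service registration
        MarkdownFileTypeDetector detector = new MarkdownFileTypeDetector();
        System.out.println(detector.probeContentType(Paths.get("notes.md")));
    }
}
```

For Files.probeContentType() to pick up such a detector, it would have to be installed as a service provider: typically a public class whose fully qualified name is listed in a META-INF/services/java.nio.file.spi.FileTypeDetector file on the classpath.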

However, the default implementations are OS specific and might fail depending on the OS that we are using.
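
Since Files.probeContentType() returns null when the type can’t be determined, one way to cope with OS-specific gaps is to fall back to a generic type. The following is a small sketch under that assumption; the helper name probeOrDefault and the choice of application/octet-stream as the fallback are ours:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class ProbeWithFallback {

    // Assumed fallback for undetermined types; adjust to the application's needs
    static final String DEFAULT_TYPE = "application/octet-stream";

    static String probeOrDefault(Path path) {
        try {
            String mimeType = Files.probeContentType(path);
            return mimeType != null ? mimeType : DEFAULT_TYPE;
        } catch (IOException e) {
            // Rethrown unchecked to keep the helper easy to call
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(probeOrDefault(Paths.get("product.png")));
        System.out.println(probeOrDefault(Paths.get("file-without-extension")));
    }
}
```

The output of the first call depends on the detectors available on the platform, which is exactly the limitation described above.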

Also note that the method returns null, rather than throwing an exception, when it can’t determine the type. Depending on the detector in use, this can happen when the file isn’t present in the filesystem or when it has no extension.

3. Using URLConnection

URLConnection provides several APIs for detecting the MIME type of a file. Let’s briefly explore each of them.

3.1. Using getContentType()

We can use the getContentType() method of URLConnection to retrieve a file’s MIME type:

@Test
public void whenUsingGetContentType_thenSuccess() throws IOException {
    File file = new File("product.png");
    // File.toURL() is deprecated, so we go through toURI() instead
    URLConnection connection = file.toURI().toURL().openConnection();
    String mimeType = connection.getContentType();

    assertEquals("image/png", mimeType);
}

However, a major drawback of this approach is that it’s very slow.

3.2. Using guessContentTypeFromName()

Next, let’s see how we can make use of guessContentTypeFromName() for this purpose:

@Test
public void whenUsingGuessContentTypeFromName_thenSuccess() {
    File file = new File("product.png");
    // Only the name is inspected; the file itself is never opened
    String mimeType = URLConnection.guessContentTypeFromName(file.getName());

    assertEquals("image/png", mimeType);
}

This method makes use of the internal FileNameMap to resolve the MIME type from the extension.

We also have the option of using guessContentTypeFromStream() instead, which inspects the first few bytes of the input stream to determine the type. The stream has to support marks for this to work.
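
As a rough sketch of the stream-based variant, the example below feeds the PNG signature to guessContentTypeFromStream() through a ByteArrayInputStream. The class and method names are our own, and the zero padding merely stands in for real image data:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URLConnection;

final class StreamSniffing {

    static String guessFromBytes(byte[] data) {
        // ByteArrayInputStream supports mark/reset, which the method relies on
        try (InputStream in = new ByteArrayInputStream(data)) {
            return URLConnection.guessContentTypeFromStream(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The 8-byte PNG signature, zero-padded to 16 bytes
        byte[] pngHeader = {
            (byte) 0x89, 'P', 'N', 'G', '\r', '\n', 0x1A, '\n',
            0, 0, 0, 0, 0, 0, 0, 0
        };
        System.out.println(guessFromBytes(pngHeader));
    }
}
```

When reading from a file, a plain FileInputStream doesn’t support marks, so we’d wrap it in a BufferedInputStream first.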

3.3. Using getFileNameMap()

A faster way to obtain the MIME type using URLConnection is using the getFileNameMap() method:

@Test
public void whenUsingGetFileNameMap_thenSuccess() {
    File file = new File("product.png");
    // FileNameMap lives in java.net, next to URLConnection
    FileNameMap fileNameMap = URLConnection.getFileNameMap();
    String mimeType = fileNameMap.getContentTypeFor(file.getName());

    assertEquals("image/png", mimeType);
}

The method returns the table of MIME types used by all instances of URLConnection. This table is then used to resolve the input file type.

The built-in table of MIME types is very limited when it comes to URLConnection.

By default, the class uses the content-types.properties file in JRE_HOME/lib. We can, however, extend it by specifying a user-specific table using the content.types.user.table property:

System.setProperty("content.types.user.table","<path-to-file>");
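
Besides the property, another possible way to extend the table is to wrap the default FileNameMap in our own implementation and, if desired, install it with URLConnection.setFileNameMap(). The sketch below assumes a hypothetical ExtendedFileNameMap class and a single illustrative ".md" entry:

```java
import java.net.FileNameMap;
import java.net.URLConnection;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical wrapper: consults extra entries first, then the default table
final class ExtendedFileNameMap implements FileNameMap {

    private final FileNameMap delegate;
    private final Map<String, String> extraTypes = new HashMap<>();

    ExtendedFileNameMap(FileNameMap delegate) {
        this.delegate = delegate;
        // Illustrative entry only
        extraTypes.put("md", "text/markdown");
    }

    @Override
    public String getContentTypeFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot >= 0) {
            String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
            String type = extraTypes.get(extension);
            if (type != null) {
                return type;
            }
        }
        return delegate.getContentTypeFor(fileName);
    }

    public static void main(String[] args) {
        FileNameMap map = new ExtendedFileNameMap(URLConnection.getFileNameMap());
        // To make it the JVM-wide table: URLConnection.setFileNameMap(map);
        System.out.println(map.getContentTypeFor("notes.md"));
        System.out.println(map.getContentTypeFor("product.png"));
    }
}
```

Unlike the property, this approach keeps the extra entries in code, which may or may not be preferable depending on how the application is configured.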

4. Using MimetypesFileTypeMap

MimetypesFileTypeMap resolves MIME types by using the file’s extension. The class shipped with Java 6 as part of the javax.activation package, so it comes in handy when we’re working with JDK 1.6. Note that the package was removed from the JDK in Java 11, so newer versions need the activation library as a separate dependency.

Now let’s see how to use it:

@Test
public void whenUsingMimeTypesFileTypeMap_thenSuccess() {
    File file = new File("product.png");
    // MimetypesFileTypeMap comes from the javax.activation package
    MimetypesFileTypeMap fileTypeMap = new MimetypesFileTypeMap();
    String mimeType = fileTypeMap.getContentType(file.getName());

    assertEquals("image/png", mimeType);
}

Here, we can pass either the name of the file or the File instance itself as the parameter. However, the overload that takes a File instance simply calls the one that accepts the filename.

Internally, this method looks up a file called mime.types for the type resolution. It’s very important to note that the method searches for the file in a specific order:

  1. Programmatically added entries to the MimetypesFileTypeMap instance
  2. .mime.types in the user’s home directory
  3. <java.home>/lib/mime.types
  4. resources named META-INF/mime.types
  5. resource named META-INF/mimetypes.default (usually found only in the activation.jar file)

However, if no matching entry is found, the method returns application/octet-stream.
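
To make the lookup more tangible, here is a simplified, standard-library-only sketch of an extension table in the plain mime.types format (a type followed by its extensions on each line). The SimpleMimeTypes class is hypothetical and is not how MimetypesFileTypeMap is actually implemented; it only mirrors the idea of an extension lookup with application/octet-stream as the default:

```java
import java.util.HashMap;
import java.util.Map;

// Simplified illustration of an extension lookup; not the library's code
final class SimpleMimeTypes {

    private static final String DEFAULT_TYPE = "application/octet-stream";
    private final Map<String, String> typesByExtension = new HashMap<>();

    // Parses lines such as "image/png png"; lines starting with '#' are comments
    SimpleMimeTypes(String mimeTypesContent) {
        for (String rawLine : mimeTypesContent.split("\\R")) {
            String line = rawLine.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            String[] tokens = line.split("\\s+");
            for (int i = 1; i < tokens.length; i++) {
                typesByExtension.put(tokens[i], tokens[0]);
            }
        }
    }

    String getContentType(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            return DEFAULT_TYPE;
        }
        return typesByExtension.getOrDefault(fileName.substring(dot + 1), DEFAULT_TYPE);
    }

    public static void main(String[] args) {
        SimpleMimeTypes types = new SimpleMimeTypes(
            "# sample entries\nimage/png png\ntext/markdown md markdown");
        System.out.println(types.getContentType("product.png"));
        System.out.println(types.getContentType("archive.xyz"));
    }
}
```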

5. Using jMimeMagic

jMimeMagic is a restrictively licensed library that we can use to obtain the MIME type of a file.

Let’s start by configuring the Maven dependency:

<dependency>
    <groupId>net.sf.jmimemagic</groupId>
    <artifactId>jmimemagic</artifactId>
    <version>0.1.5</version>
</dependency>

We can find the latest version of this library on Maven Central.

Next, we’ll explore how to work with the library:

@Test
public void whenUsingJmimeMagic_thenSuccess() throws Exception {
    File file = new File("product.png");
    // getMagicMatch() is static and declares checked exceptions;
    // false disables extension hints
    MagicMatch match = Magic.getMagicMatch(file, false);

    assertEquals("image/png", match.getMimeType());
}

This library can work with a stream of data and hence doesn’t require the file to be present in the file system.

6. Using Apache Tika

Apache Tika is a toolset that detects and extracts metadata and text from a variety of files. It has a rich and powerful API and comes with tika-core, which we can use to detect the MIME type of a file.

Let’s begin by configuring the Maven dependency:

<dependency>
    <groupId>org.apache.tika</groupId>
    <artifactId>tika-core</artifactId>
    <version>1.18</version>
</dependency>

Next, we’ll make use of the detect() method to resolve the type:

@Test
public void whenUsingTika_thenSuccess() throws IOException {
    File file = new File("product.png");
    // Tika is the facade class from org.apache.tika in tika-core
    Tika tika = new Tika();
    String mimeType = tika.detect(file);

    assertEquals("image/png", mimeType);
}

The library relies on magic markers in the stream prefix for type resolution.
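
To show what matching on magic markers means in practice, here is a hand-rolled sketch that compares the first bytes of some data against a few well-known file signatures. It is not Tika’s implementation; the MagicNumberSniffer class and its tiny signature list are assumptions made purely for illustration:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hand-rolled illustration of magic-marker matching; not Tika's code
final class MagicNumberSniffer {

    private static final byte[] PNG = {(byte) 0x89, 'P', 'N', 'G', '\r', '\n', 0x1A, '\n'};
    private static final byte[] GIF = {'G', 'I', 'F', '8'};
    private static final byte[] PDF = {'%', 'P', 'D', 'F', '-'};
    private static final byte[] JPEG = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};

    static String detect(byte[] prefix) {
        if (startsWith(prefix, PNG)) return "image/png";
        if (startsWith(prefix, GIF)) return "image/gif";
        if (startsWith(prefix, PDF)) return "application/pdf";
        if (startsWith(prefix, JPEG)) return "image/jpeg";
        // Nothing matched, so fall back to a generic binary type
        return "application/octet-stream";
    }

    private static boolean startsWith(byte[] data, byte[] signature) {
        if (data.length < signature.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(data, signature.length), signature);
    }

    public static void main(String[] args) {
        byte[] pdfStart = "%PDF-1.7".getBytes(StandardCharsets.US_ASCII);
        System.out.println(detect(pdfStart));
        System.out.println(detect(new byte[] {1, 2, 3}));
    }
}
```

A real library maintains a far larger signature database and usually combines it with other hints, such as the file name.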

7. Conclusion

In this article, we’ve looked at various strategies for obtaining the MIME type of a file. We’ve also analyzed the tradeoffs of each approach and pointed out the scenarios where we should favor one strategy over another.

The full source code that is used in this article is available over at GitHub, as always.
