Convert XML to HTML in Java

Last modified: September 19, 2019

1. Introduction

In this tutorial, we’ll describe how to convert XML to HTML using common Java libraries and template engines: JAXP, StAX, FreeMarker, and Mustache.

2. An XML to Unmarshal

Let’s start off with a simple XML document that we’ll unmarshal into a suitable Java representation before we convert it into HTML. We’ll bear in mind a few key goals:

  1. Keep the same XML for all of our samples
  2. Create a syntactically and semantically valid HTML5 document at the end
  3. Convert all XML elements into text

Let’s use a simple Jenkins notification as our sample XML:

<?xml version="1.0" encoding="UTF-8"?>
<notification>
    <from>builds@baeldung.com</from>
    <heading>Build #7 passed</heading>
    <content>Success: The Jenkins CI build passed</content>
</notification>

It’s pretty straightforward: a root element with a few nested elements.

When we create our HTML file, we’ll aim to strip out the XML tags and print their contents as key-value pairs.

3. JAXP

The Java API for XML Processing (JAXP) is the standard Java API for parsing and transforming XML. It covers SAX and DOM parsing as well as XSLT transformations, and it ships with the JDK, so the explicit dependency below is often unnecessary. Mapping XML to and from POJOs is the job of JAXB rather than JAXP, so here we’ll read the values ourselves using the built-in DOM helpers.

Let’s add the Maven dependency for JAXP to our project:

<dependency>
    <groupId>javax.xml</groupId>
    <artifactId>jaxp-api</artifactId>
    <version>1.4.2</version>
</dependency>

3.1. Unmarshalling Using DOM Builder

Let’s begin by first unmarshalling our XML file into a Java Element object:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);

Document input = factory
  .newDocumentBuilder()
  .parse(resourcePath);
Element xml = input.getDocumentElement();

3.2. Extracting the XML File Contents in a Map

Now, let’s build a Map with the relevant contents of our XML file:

Map<String, String> map = new HashMap<>();
map.put("heading", 
  xml.getElementsByTagName("heading")
    .item(0)
    .getTextContent());
map.put("from", String.format("from: %s",
  xml.getElementsByTagName("from")
    .item(0)
    .getTextContent()));
map.put("content", 
  xml.getElementsByTagName("content")
    .item(0)
    .getTextContent());
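
To try the two steps above end to end without a file on disk, here’s a minimal sketch that parses the sample notification from a String. The class name DomNotificationReader, the SAMPLE constant, and the firstText helper are our own assumptions for illustration; they aren’t part of JAXP. The helper also returns an empty string when an element is missing, which the snippet above doesn’t guard against.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; the name and the in-memory input are assumptions.
class DomNotificationReader {

    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
      + "<notification>"
      + "<from>builds@baeldung.com</from>"
      + "<heading>Build #7 passed</heading>"
      + "<content>Success: The Jenkins CI build passed</content>"
      + "</notification>";

    // Returns the text of the first <tag> element, or "" when it is absent.
    static String firstText(Element root, String tag) {
        NodeList nodes = root.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent();
    }

    // Parses the XML with the same hardened factory settings as the article
    // and collects the three values we care about.
    static Map<String, String> toMap(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Element root = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();

            Map<String, String> map = new LinkedHashMap<>();
            map.put("heading", firstText(root, "heading"));
            map.put("from", String.format("from: %s", firstText(root, "from")));
            map.put("content", firstText(root, "content"));
            return map;
        } catch (Exception e) {
            // Wrapped so callers don't have to handle three checked exception types.
            throw new IllegalStateException("Could not parse notification XML", e);
        }
    }

    public static void main(String[] args) {
        toMap(SAMPLE).forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

We use a LinkedHashMap here only so that the entries print in insertion order; a HashMap works just as well for the later steps.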

3.3. Marshalling Using DOM Builder

Marshalling our XML into an HTML file is a little more involved.

Let’s prepare a transfer Document that we’ll use to write out the HTML:

Document doc = factory
  .newDocumentBuilder()
  .newDocument();

Next, we’ll fill the Document with Elements built from the values in our map:

Element html = doc.createElement("html");
html.setAttribute("lang", "en");

Element head = doc.createElement("head");

Element title = doc.createElement("title");
title.setTextContent(map.get("heading"));

head.appendChild(title);
html.appendChild(head);

Element body = doc.createElement("body");

Element from = doc.createElement("p");
from.setTextContent(map.get("from"));

Element success = doc.createElement("p");
success.setTextContent(map.get("content"));

body.appendChild(from);
body.appendChild(success);

html.appendChild(body);
doc.appendChild(html);

Finally, let’s marshal our Document object using a TransformerFactory:

TransformerFactory transformerFactory = TransformerFactory.newInstance();
transformerFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
transformerFactory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
transformerFactory.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");

try (Writer output = new StringWriter()) {
    Transformer transformer = transformerFactory.newTransformer();
    transformer.transform(new DOMSource(doc), new StreamResult(output));
}

If we call output.toString() inside the try block, while output is still in scope, we’ll get the HTML representation.
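
The steps in this section can be pulled together into one method that returns the HTML as a String. What follows is a sketch under a few assumptions: DomHtmlWriter and its append helper are hypothetical names of ours, and we set the output method to "html" explicitly instead of relying on the transformer’s default for a document whose root element is html.

```java
import java.io.StringWriter;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class, not part of JAXP.
class DomHtmlWriter {

    // Creates <tag>, sets its text when one is given, and appends it to parent.
    static Element append(Document doc, Element parent, String tag, String text) {
        Element child = doc.createElement(tag);
        if (text != null) {
            child.setTextContent(text);
        }
        parent.appendChild(child);
        return child;
    }

    static String toHtml(Map<String, String> map) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();

            Element html = doc.createElement("html");
            html.setAttribute("lang", "en");
            doc.appendChild(html);

            Element head = append(doc, html, "head", null);
            append(doc, head, "title", map.get("heading"));

            Element body = append(doc, html, "body", null);
            append(doc, body, "p", map.get("from"));
            append(doc, body, "p", map.get("content"));

            TransformerFactory transformerFactory = TransformerFactory.newInstance();
            transformerFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            transformerFactory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
            transformerFactory.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");

            Transformer transformer = transformerFactory.newTransformer();
            // Assumption: request HTML serialization explicitly.
            transformer.setOutputProperty(OutputKeys.METHOD, "html");

            StringWriter output = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(output));
            // Read the buffer here, while the writer is still in scope.
            return output.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize the HTML document", e);
        }
    }
}
```

Calling DomHtmlWriter.toHtml(map) with the map from section 3.2 gives us the serialized page. Whitespace and any extra head content depend on the transformer implementation, so it’s safer to check for individual elements than to compare the whole string.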

Note that some of the extra features and attributes we set on our factory were taken from the recommendations of the OWASP project to avoid XXE injection.
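
To see what those settings buy us, here’s a small sketch that feeds a document with a DOCTYPE and an external entity to a factory hardened the same way as in section 3.1. DoctypeRejectionDemo and the MALICIOUS sample are hypothetical names of ours, and we assume the parser bundled with the JDK, which recognizes the disallow-doctype-decl feature.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class showing the effect of the hardened factory settings.
class DoctypeRejectionDemo {

    // A typical XXE payload: the entity points at a local file.
    static final String MALICIOUS =
        "<?xml version=\"1.0\"?>"
      + "<!DOCTYPE notification [<!ENTITY xxe SYSTEM \"file:///etc/hostname\">]>"
      + "<notification><from>&xxe;</from></notification>";

    static final String BENIGN =
        "<notification><from>builds@baeldung.com</from></notification>";

    // Returns true when the hardened parser refuses to parse the document.
    static boolean isRejected(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (SAXException e) {
            // The parser reports the forbidden DOCTYPE as a fatal parse error.
            return true;
        } catch (Exception e) {
            throw new IllegalStateException("Unexpected parser failure", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("malicious rejected: " + isRejected(MALICIOUS));
        System.out.println("benign rejected: " + isRejected(BENIGN));
    }
}
```

Because the DOCTYPE itself is refused, the entity is never resolved and the file is never read.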

4. StAX

Another library we can use is the Streaming API for XML (StAX). Like JAXP, StAX has been around for a long time, since 2004.

JAXP’s DOM parser loads the entire document into memory before we can query it. That’s great for simple tasks or projects but less so when we need to iterate over a document or have explicit, fine-grained control over element parsing itself. That’s where StAX comes in handy.

Let’s add the Maven dependency for the StAX API to our project:

<dependency>
    <groupId>javax.xml.stream</groupId>
    <artifactId>stax-api</artifactId>
    <version>1.0-2</version>
</dependency>

4.1. Unmarshalling Using StAX

We’ll use a simple iteration control flow to store XML values into our Map:

XMLInputFactory factory = XMLInputFactory.newInstance();
factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE);
factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
XMLStreamReader input = null;
try (FileInputStream file = new FileInputStream(resourcePath)) {
    input = factory.createXMLStreamReader(file);

    Map<String, String> map = new HashMap<>();
    while (input.hasNext()) {
        input.next();
        if (input.isStartElement()) {
            if (input.getLocalName().equals("heading")) {
                map.put("heading", input.getElementText());
            }
            if (input.getLocalName().equals("from")) {
                map.put("from", String.format("from: %s", input.getElementText()));
            }
            if (input.getLocalName().equals("content")) {
                map.put("content", input.getElementText());
            }
        }
    }
} finally {
    if (input != null) {
        input.close();
    }
}
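
The same loop can be wrapped in a method that reads from a String, which makes it easy to run on its own. This is a sketch, with StaxNotificationReader as a hypothetical class name; it swaps the chain of if statements for a switch on the element name and keeps the explicit close() in a finally block, since XMLStreamReader doesn’t implement AutoCloseable.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, not part of StAX.
class StaxNotificationReader {

    static Map<String, String> toMap(String xml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE);
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);

        Map<String, String> map = new LinkedHashMap<>();
        XMLStreamReader input = null;
        try {
            input = factory.createXMLStreamReader(new StringReader(xml));
            while (input.hasNext()) {
                // Skip everything that isn't the start of an element.
                if (input.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                switch (input.getLocalName()) {
                    case "heading":
                        // getElementText() reads up to the matching end tag.
                        map.put("heading", input.getElementText());
                        break;
                    case "from":
                        map.put("from", String.format("from: %s", input.getElementText()));
                        break;
                    case "content":
                        map.put("content", input.getElementText());
                        break;
                    default:
                        // <notification> and any unknown elements are ignored.
                        break;
                }
            }
            return map;
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not read notification XML", e);
        } finally {
            if (input != null) {
                try {
                    input.close();
                } catch (XMLStreamException ignored) {
                    // Nothing useful to do if closing the reader fails.
                }
            }
        }
    }
}
```

For our sample notification, StaxNotificationReader.toMap(xml) should produce the same three entries as the DOM version.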

4.2. Marshalling Using StAX

Now, let’s use our map and write out the HTML:

try (Writer output = new StringWriter()) {
    XMLStreamWriter writer = XMLOutputFactory
      .newInstance()
      .createXMLStreamWriter(output);

    writer.writeDTD("<!DOCTYPE html>");
    writer.writeStartElement("html");
    writer.writeAttribute("lang", "en");
    writer.writeStartElement("head");
    writer.writeDTD("<META http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\">");
    writer.writeStartElement("title");
    writer.writeCharacters(map.get("heading"));
    writer.writeEndElement();
    writer.writeEndElement();

    writer.writeStartElement("body");

    writer.writeStartElement("p");
    writer.writeCharacters(map.get("from"));
    writer.writeEndElement();

    writer.writeStartElement("p");
    writer.writeCharacters(map.get("content"));
    writer.writeEndElement();

    writer.writeEndElement();
    writer.writeEndDocument();
    writer.flush();
}

As in the JAXP example, we can call output.toString() inside the try block to get the HTML representation.
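
Most of the writer calls above come in start/characters/end triples, so a small helper can cut down the repetition. Here’s a sketch of that idea; StaxHtmlWriter and textElement are hypothetical names of ours, and we leave out the META line to keep the example short.

```java
import java.io.StringWriter;
import java.util.Map;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper class, not part of StAX.
class StaxHtmlWriter {

    // Writes <tag>text</tag>; writeCharacters escapes markup characters such as < and &.
    static void textElement(XMLStreamWriter writer, String tag, String text)
            throws XMLStreamException {
        writer.writeStartElement(tag);
        writer.writeCharacters(text == null ? "" : text);
        writer.writeEndElement();
    }

    static String toHtml(Map<String, String> map) {
        StringWriter output = new StringWriter();
        try {
            XMLStreamWriter writer = XMLOutputFactory
                .newInstance()
                .createXMLStreamWriter(output);

            writer.writeDTD("<!DOCTYPE html>");
            writer.writeStartElement("html");
            writer.writeAttribute("lang", "en");

            writer.writeStartElement("head");
            textElement(writer, "title", map.get("heading"));
            writer.writeEndElement(); // closes head

            writer.writeStartElement("body");
            textElement(writer, "p", map.get("from"));
            textElement(writer, "p", map.get("content"));
            writer.writeEndElement(); // closes body

            writer.writeEndElement(); // closes html
            writer.writeEndDocument();
            writer.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write the HTML document", e);
        }
        return output.toString();
    }
}
```

Since the values go through writeCharacters, a heading such as "a < b" ends up escaped in the output instead of breaking the markup.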

5. Using Template Engines

As an alternative to writing the HTML representation by hand, we can use template engines. There are multiple options in the Java ecosystem. Let’s explore some of them.

5.1. Using Apache FreeMarker

Apache FreeMarker is a Java-based template engine for generating text output (HTML web pages, e-mails, configuration files, source code, etc.) based on templates and changing data.

In order to use it, we’ll need to add the freemarker dependency to our Maven project:

<dependency>
    <groupId>org.freemarker</groupId>
    <artifactId>freemarker</artifactId>
    <version>2.3.29</version>
</dependency>

First, let’s create a template using the FreeMarker syntax:

<!DOCTYPE html>
<html lang="en">
<head>
    <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <title>${heading}</title>
</head>
<body>
<p>${from}</p>
<p>${content}</p>
</body>
</html>

Now, let’s reuse our map and fill the gaps in the template:

Configuration cfg = new Configuration(Configuration.VERSION_2_3_29);
cfg.setDirectoryForTemplateLoading(new File(templateDirectory));
cfg.setDefaultEncoding(StandardCharsets.UTF_8.toString());
cfg.setTemplateExceptionHandler(TemplateExceptionHandler.RETHROW_HANDLER);
cfg.setLogTemplateExceptions(false);
cfg.setWrapUncheckedExceptions(true);
cfg.setFallbackOnNullLoopVariable(false);
Template temp = cfg.getTemplate(templateFile);
try (Writer output = new StringWriter()) {
    temp.process(map, output); // map is the one we filled in the earlier sections
}

5.2. Using Mustache

Mustache is a logic-less template engine. It can be used for HTML, config files, source code, and pretty much anything else. It works by expanding tags in a template using values provided in a hash or object.
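
To make the idea of expanding tags concrete, here’s a toy sketch in plain Java. It is not Mustache and not how the library is implemented: TinyTagExpander is a hypothetical class of ours that only handles simple {{name}} tags, HTML-escapes the values, and renders missing keys as empty strings, which mirrors Mustache’s default behavior for plain variable tags.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy stand-in for illustration only; the real engine supports far more syntax.
class TinyTagExpander {

    // Matches {{name}} with optional whitespace around the name.
    private static final Pattern TAG = Pattern.compile("\\{\\{\\s*(\\w+)\\s*\\}\\}");

    // Escapes the characters that would otherwise be read as HTML markup.
    static String escapeHtml(String value) {
        return value
            .replace("&", "&amp;")
            .replace("<", "&lt;")
            .replace(">", "&gt;")
            .replace("\"", "&quot;");
    }

    // Replaces every {{name}} tag with the escaped value stored under "name".
    static String render(String template, Map<String, String> values) {
        Matcher matcher = TAG.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String value = values.getOrDefault(matcher.group(1), "");
            // quoteReplacement keeps '$' and '\' in the value literal.
            matcher.appendReplacement(out, Matcher.quoteReplacement(escapeHtml(value)));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = Map.of("from", "from: builds@baeldung.com");
        System.out.println(render("<p>{{from}}</p>", values));
    }
}
```

The real library adds sections, partials, and lookups on arbitrary objects, so this sketch is only meant to show the substitution step.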

To use it, we’ll need to add the mustache dependency to our Maven project:

<dependency>
    <groupId>com.github.spullara.mustache.java</groupId>
    <artifactId>compiler</artifactId>
    <version>0.9.6</version>
</dependency>

Let’s start by creating a template using the Mustache syntax:

<!DOCTYPE html>
<html lang="en">
<head>
    <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <title>{{heading}}</title>
</head>
<body>
<p>{{from}}</p>
<p>{{content}}</p>
</body>
</html>

Now, let’s fill the template with our map:

MustacheFactory mf = new DefaultMustacheFactory();
Mustache mustache = mf.compile(templateFile);
try (Writer output = new StringWriter()) {
    mustache.execute(output, map); // map is the one we filled in the earlier sections
    output.flush();
}

6. The Resulting HTML

In the end, with all our code samples, we’ll get the same HTML output:

<!DOCTYPE html>
<html lang="en">
<head>
    <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <title>Build #7 passed</title>
</head>
<body>
<p>from: builds@baeldung.com</p>
<p>Success: The Jenkins CI build passed</p>
</body>
</html>

7. Conclusion

In this tutorial, we’ve learned the basics of using JAXP, StAX, FreeMarker, and Mustache to convert XML into HTML.

As always, the complete code samples seen here are available over on GitHub.
