Pretty-Print XML in Java

Last modified: March 12, 2022


1. Overview


When we need to read an XML file manually, we usually want the content in a pretty-printed format. Many text editors and IDEs can reformat XML documents, and on Linux, we can pretty-print XML files from the command line.


However, sometimes, we have requirements to convert a raw XML string to the pretty-printed format in our Java program. For example, we may want to show a pretty-printed XML document in the user interface for better visual comprehension.


In this tutorial, we’ll explore how to pretty-print XML in Java.


2. Introduction to the Problem


For simplicity, we’ll take a non-formatted emails.xml file as the input:


<emails> <email> <from>Kai</from> <to>Amanda</to> <time>2018-03-05</time>
<subject>I am flying to you</subject></email> <email>
<from>Jerry</from> <to>Tom</to> <time>1992-08-08</time> <subject>Hey Tom, catch me if you can!</subject>
</email> </emails>

As we can see, the emails.xml file is well-formed. However, it’s not easy to read due to the messy format.


Our goal is to create a method to convert this ugly, raw XML string to a pretty-formatted string.


Further, we’ll discuss customizing two common output properties: indent-size (integer) and suppressing XML declaration (boolean).


The indent-size property is pretty straightforward: It’s the number of spaces to indent (per level). On the other hand, the suppressing XML declaration option decides if we want to have the XML declaration tag in the generated XML. A typical XML declaration looks like:


<?xml version="1.0" encoding="UTF-8"?>

In this tutorial, we’ll address a solution with the standard Java API and another approach using an external library.


Next, let’s see them in action.


3. Pretty-Printing XML With the Transformer Class


The Java API provides the Transformer class to do XML transformations.


3.1. Using the Default Transformer


First, let’s see the pretty-print solution using the Transformer class:


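import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public static String prettyPrintByTransformer(String xmlString, int indent, boolean ignoreDeclaration) {
    try {
        // Parse the raw XML string into a DOM Document
        InputSource src = new InputSource(new StringReader(xmlString));
        Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(src);

        // "indent-number" is an attribute of the JDK's built-in TransformerFactory implementation
        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        transformerFactory.setAttribute("indent-number", indent);
        Transformer transformer = transformerFactory.newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, ignoreDeclaration ? "yes" : "no");
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");

        // Serialize the Document into a String
        Writer out = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(out));
        return out.toString();
    } catch (Exception e) {
        throw new RuntimeException("Error occurs when pretty-printing xml:\n" + xmlString, e);
    }
}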

Now, let’s walk through the method quickly and figure out how it works:


  • First, we parse the raw XML string and get a Document object.
  • Next, we obtain a TransformerFactory instance and set the required indent-size attribute.
  • Then, we can get a default transformer instance from the configured transformerFactory object.
  • The transformer object supports various output properties. To decide if we want to skip the declaration, we set the OutputKeys.OMIT_XML_DECLARATION property.
  • Since we would like to have a pretty-formatted String object, finally, we transform() the parsed XML Document to a StringWriter and return the transformed String.

We’ve set the indent size on the TransformerFactory object in the method above. Alternatively, we can set the indent-amount output property on the transformer instance:


transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", String.valueOf(indent));

Next, let’s test if the method works as expected.


3.2. Testing the Method


Our Java project is a Maven project, and we’ve put the emails.xml file under src/main/resources/xml/emails.xml. We’ve created the readFromInputStream method to read the input file as a String, but we won’t go into its details since it has little to do with our topic here. Let’s say we want an indent size of 2 and want to skip the XML declaration in the result:


public static void main(String[] args) throws IOException {
    // Load the raw XML from the classpath and pretty-print it without the declaration
    InputStream inputStream = XmlPrettyPrinter.class.getResourceAsStream("/xml/emails.xml");
    String xmlString = readFromInputStream(inputStream);
    System.out.println("Pretty printing by Transformer");
    System.out.println(prettyPrintByTransformer(xmlString, 2, true));
}

As the main method shows, we read the input file as a String and then call our prettyPrintByTransformer method to get a pretty-printed XML String.
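
The readFromInputStream helper is not shown in this tutorial. A minimal sketch of what such a helper could look like follows. It assumes the file is UTF-8 encoded and that the stream is not null; the class name ResourceReader is a placeholder chosen for this example only:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

public class ResourceReader {

    // Hypothetical helper: reads the whole stream as UTF-8 text and joins the lines with '\n'.
    // Assumes inputStream is not null (getResourceAsStream returns null for a missing resource).
    public static String readFromInputStream(InputStream inputStream) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(inputStream, StandardCharsets.UTF_8))) {
            return reader.lines().collect(Collectors.joining("\n"));
        }
    }
}
```

Line breaks in the input are kept as single '\n' characters, so the messy layout of the raw file reaches the pretty-printer unchanged.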


Next, let’s run the main method with Java 8:


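Pretty printing by Transformer
<emails>
  <email>
    <from>Kai</from>
    <to>Amanda</to>
    <time>2018-03-05</time>
    <subject>I am flying to you</subject>
  </email>
  <email>
    <from>Jerry</from>
    <to>Tom</to>
    <time>1992-08-08</time>
    <subject>Hey Tom, catch me if you can!</subject>
  </email>
</emails>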

As the output above shows, our method works as expected.


However, if we test it once again with Java 9 or a later version, we may see different output.


Next, let’s see what it produces if we run it with Java 9:


Pretty printing by Transformer
    <subject>I am flying to you</subject>
    <subject>Hey Tom, catch me if you can!</subject>


As we can see in the output above, there are unexpected empty lines in the output.


This is because our raw input contains whitespace between elements, for example:


<emails> <email> <from>Kai</from> ...

As of Java 9, the Transformer class’s pretty-print feature doesn’t define the actual format. Therefore, whitespace-only text nodes are written to the output as well. This has been discussed in a JDK bug ticket, and the Java 9 release notes explain it in the xml/jaxp section.
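
To make these whitespace-only nodes visible, the following sketch parses a shortened version of our input and counts the text nodes that contain nothing but whitespace. The class and method names are placeholders for this illustration and are not part of the tutorial's code:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class WhitespaceNodeCounter {

    // Recursively counts the text nodes below 'node' that contain only whitespace.
    static int countWhitespaceOnlyTextNodes(Node node) {
        int count = 0;
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE && child.getNodeValue().trim().isEmpty()) {
                count++;
            } else {
                count += countWhitespaceOnlyTextNodes(child);
            }
        }
        return count;
    }

    // Parses the XML string with the default DOM parser and counts whitespace-only text nodes.
    public static int count(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        return countWhitespaceOnlyTextNodes(doc);
    }

    public static void main(String[] args) throws Exception {
        // Two spaces inside <emails> and two inside <email>: prints 4
        System.out.println(count("<emails> <email> <from>Kai</from> </email> </emails>"));
    }
}
```

Each of these nodes is a regular part of the DOM tree, which is why a serializer that doesn't drop them emits extra blank lines.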


If we want our pretty-print method to always generate the same format under various Java versions, we need to provide a stylesheet file.


Next, let’s create a simple xsl file to achieve that.


3.3. Providing an XSLT File


First, let’s create the prettyprint.xsl file to define the output format:


<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:strip-space elements="*"/>
    <xsl:output method="xml" encoding="UTF-8"/>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>


As we can see, in the prettyprint.xsl file, we’ve used the <xsl:strip-space/> element to remove whitespace-only nodes so that they do not appear in the output.


Next, we still need to make a small change to our method. We won’t use the default transformer anymore. Instead, we’ll create a Transformer object with our XSLT document:


Transformer transformer = transformerFactory.newTransformer(new StreamSource(new StringReader(readPrettyPrintXslt())));

Here, the readPrettyPrintXslt() method reads prettyprint.xsl content.
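
Putting the pieces together, a self-contained sketch of the XSLT-based variant might look like the following. Note the assumptions: the stylesheet is inlined as a string constant rather than read from prettyprint.xsl, the indent is set through the indent-amount output property, and the class name XsltPrettyPrinter is a placeholder for this example:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XsltPrettyPrinter {

    // Assumption: the stylesheet is inlined here instead of being read from prettyprint.xsl.
    private static final String PRETTY_PRINT_XSLT =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:strip-space elements=\"*\"/>"
        + "<xsl:output method=\"xml\" encoding=\"UTF-8\"/>"
        + "<xsl:template match=\"@*|node()\">"
        + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    public static String prettyPrint(String xmlString, int indent, boolean ignoreDeclaration) {
        try {
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xmlString)));

            // Build the transformer from the stylesheet so whitespace-only nodes are stripped
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(PRETTY_PRINT_XSLT)));
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, ignoreDeclaration ? "yes" : "no");
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", String.valueOf(indent));

            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(document), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException("Error occurs when pretty-printing xml:\n" + xmlString, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(prettyPrint("<emails> <email> <from>Kai</from> </email> </emails>", 2, true));
    }
}
```

Inlining the stylesheet keeps the sketch runnable without extra files; in a real project, loading it from the classpath as the tutorial does keeps the formatting rules out of the Java source.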


Now, if we test the method in Java 8 and Java 9, both produce the same output:


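Pretty printing by Transformer
<emails>
  <email>
    <from>Kai</from>
    <to>Amanda</to>
    <time>2018-03-05</time>
    <subject>I am flying to you</subject>
  </email>
  <email>
    <from>Jerry</from>
    <to>Tom</to>
    <time>1992-08-08</time>
    <subject>Hey Tom, catch me if you can!</subject>
  </email>
</emails>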

We’ve solved the problem with the standard Java API. Next, let’s pretty-print the emails.xml file using an external library.


4. Pretty-Printing XML With the Dom4j Library


Dom4j is a popular XML library. It allows us to easily pretty-print XML documents.


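First, let’s add the Dom4j dependency to our pom.xml:

<dependency>
    <groupId>org.dom4j</groupId>
    <artifactId>dom4j</artifactId>
    <version>2.1.3</version>
</dependency>
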
We’ve used the 2.1.3 version as an example. We can find the latest version in the Maven Central repository.


Next, let’s see how to pretty-print XML using the Dom4j library:


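import java.io.StringWriter;
import org.dom4j.DocumentHelper;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

public static String prettyPrintByDom4j(String xmlString, int indent, boolean skipDeclaration) {
    try {
        OutputFormat format = OutputFormat.createPrettyPrint();
        format.setIndentSize(indent);                    // spaces per indent level
        format.setSuppressDeclaration(skipDeclaration);  // true omits the XML declaration
        format.setEncoding("UTF-8");
        org.dom4j.Document document = DocumentHelper.parseText(xmlString);
        StringWriter sw = new StringWriter();
        XMLWriter writer = new XMLWriter(sw, format);
        writer.write(document);                          // serialize using the configured format
        return sw.toString();
    } catch (Exception e) {
        throw new RuntimeException("Error occurs when pretty-printing xml:\n" + xmlString, e);
    }
}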

Dom4j’s OutputFormat class provides a createPrettyPrint method to create a pre-defined pretty-print OutputFormat object. As the method above shows, we can customize the default pretty-print format. In this case, we set the indent size and decide if we would like to include the declaration in the result.


Next, we parse the raw XML string and create an XMLWriter object with the prepared OutputFormat instance.


Finally, the XMLWriter object will write the parsed XML document in the required format.


Next, let’s test if it can pretty-print the emails.xml file. This time, let’s say we would like to include the declaration and have an indent size of 8 in the result:


System.out.println("Pretty printing by Dom4j");
System.out.println(prettyPrintByDom4j(xmlString, 8, false));

When we run the method, we’ll see the output:


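Pretty printing by Dom4j
<?xml version="1.0" encoding="UTF-8"?>

<emails>
        <email>
                <from>Kai</from>
                <to>Amanda</to>
                <time>2018-03-05</time>
                <subject>I am flying to you</subject>
        </email>
        <email>
                <from>Jerry</from>
                <to>Tom</to>
                <time>1992-08-08</time>
                <subject>Hey Tom, catch me if you can!</subject>
        </email>
</emails>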

As the output above shows, the method has solved the problem.


5. Conclusion


In this article, we’ve addressed two approaches to pretty-printing an XML file in Java.


We can pretty-print XML using the standard Java API. However, we need to keep in mind that the Transformer object may produce different results depending on the Java version. The solution is to provide an XSLT file.


Alternatively, the Dom4j library can solve the problem straightforwardly.


