Convert String Containing XML to org.w3c.dom.Document

Last modified: October 31, 2023

1. Introduction

XML (Extensible Markup Language) is one of the most common data formats today and is widely used for structuring and exchanging data between applications.

A common task in Java is converting a piece of XML markup held in a String into an org.w3c.dom.Document object.

In this tutorial, we’ll discuss converting a string with XML-based content into an org.w3c.dom.Document in Java.

2. org.w3c.dom.Document

The org.w3c.dom.Document interface is a core part of the Document Object Model (DOM) XML API in Java. It represents an entire XML document and provides methods for creating, navigating, and modifying its nodes and for retrieving data from them.

To see how an org.w3c.dom.Document object is created, let’s look at the following example:

try {
    // Create a DocumentBuilderFactory
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

    // Create a DocumentBuilder
    DocumentBuilder builder = factory.newDocumentBuilder();

    // Create a new Document
    Document document = builder.newDocument();

    // Create an example XML structure
    Element rootElement = document.createElement("root");
    document.appendChild(rootElement);

    Element element = document.createElement("element");
    element.appendChild(document.createTextNode("XML Document Example"));
    rootElement.appendChild(element);
    
} catch (ParserConfigurationException e) {
    e.printStackTrace();
}

In the code above, we first create a DocumentBuilderFactory and a DocumentBuilder, the standard entry points of the DOM API. We then use the builder to create an empty Document and build a simple XML structure: a root element named “root” containing a child element named “element” whose text is “XML Document Example”. When serialized, the document corresponds to the following XML:

<root>
    <element>XML Document Example</element>
</root>
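
The example above builds the Document in memory but never writes it out. As a minimal sketch of how the XML shown here could be produced, the following class serializes a Document with the standard javax.xml.transform API. The class name DocumentToStringExample and its two helper methods are hypothetical names chosen for illustration and are not part of the original article:

```java
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class for illustration only
class DocumentToStringExample {

    // Builds the same root/element structure as the example above
    static Document buildExampleDocument() throws ParserConfigurationException {
        Document document = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .newDocument();
        Element rootElement = document.createElement("root");
        document.appendChild(rootElement);
        Element element = document.createElement("element");
        element.appendChild(document.createTextNode("XML Document Example"));
        rootElement.appendChild(element);
        return document;
    }

    // Serializes a Document to a String with an identity Transformer
    static String serialize(Document document) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // Leave out the <?xml ...?> declaration so only the elements are written
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(buildExampleDocument()));
    }
}
```

Without further configuration, the markup is written on a single line. Setting OutputKeys.INDENT to "yes" requests pretty-printed output, although the exact whitespace can differ between JDK versions.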

3. Parsing XML from a String

To convert a string containing XML into an org.w3c.dom.Document, we need to parse it. Java ships with several XML parsing APIs, including DOM, SAX, and StAX.

To keep things simple, this article focuses on the DOM parser. Let’s walk through an example that parses an XML string and creates an org.w3c.dom.Document object:

@Test
public void givenValidXMLString_whenParsing_thenDocumentIsCorrect()
  throws ParserConfigurationException {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    String xmlString = "<root><element>XML Parsing Example</element></root>";
    InputSource is = new InputSource(new StringReader(xmlString));
    Document xmlDoc = null;
    try {
        xmlDoc = builder.parse(is);
    } catch (SAXException | IOException e) {
        // Rethrow parsing and I/O failures unchecked so the test fails with the original cause
        throw new RuntimeException(e);
    }

    assertEquals("root", xmlDoc.getDocumentElement().getNodeName());
    assertEquals("element", xmlDoc.getDocumentElement().getElementsByTagName("element").item(0).getNodeName());
    assertEquals("XML Parsing Example",
      xmlDoc.getDocumentElement().getElementsByTagName("element").item(0).getTextContent());
}

In the above code, we create the DocumentBuilderFactory and DocumentBuilder needed for DOM parsing. We then define a sample XML string (xmlString) and wrap it in an InputSource backed by a StringReader, because DocumentBuilder.parse() treats a plain String argument as a URI rather than as XML content. We parse the XML within a try-catch block and rethrow any SAXException or IOException as a RuntimeException.

Finally, we use a series of assertions to verify the parsed document: we check the root element’s name using getDocumentElement().getNodeName(), the child element’s name using getDocumentElement().getElementsByTagName(), and the child element’s text content using getTextContent().
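
Outside of a unit test, the same steps could be wrapped in a small reusable method. The following is only a sketch under that assumption: the class name XmlStringParser and its parse() method are hypothetical and do not appear in the original article. It also shows that the DOM parser rejects malformed input with a SAXException:

```java
import java.io.IOException;
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class for illustration only
class XmlStringParser {

    // Parses XML held in a String and returns the resulting DOM Document
    static Document parse(String xml)
      throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        // Wrap the text in an InputSource so it is read as content, not as a URI
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<root><element>XML Parsing Example</element></root>");
        System.out.println(doc.getDocumentElement().getNodeName());

        try {
            // The <element> tag is never closed, so this input is not well-formed
            parse("<root><element>unclosed</root>");
        } catch (SAXException e) {
            System.out.println("Malformed XML rejected: " + e.getMessage());
        }
    }
}
```

Letting the checked exceptions propagate, rather than wrapping them as the test does, leaves it to the caller to decide how to handle input that is not well-formed.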

4. Conclusion

In conclusion, Java developers who deal with XML-based data, whether in data processing, web services, or configuration tasks, need to know how to work with org.w3c.dom.Document. In this article, we saw how to create a Document programmatically and how to parse one from a String using the DOM parser.

As always, the complete code samples for this article can be found over on GitHub.
