1. Introduction
In this article, we will compare Java XML libraries and APIs.
This is the second article in the series about Java support for XML. If you want to go deeper into XPath support in Java, have a look at the previous article.
2. Overview
Now we’re going to dig deeper into Java’s XML support, and we’ll start by explaining, as simply as possible, all the acronyms related to the subject.
Java XML support includes several API definitions, each with its own pros and cons:
• SAX: An event-based parsing API that provides low-level access. It is memory efficient and faster than DOM because it doesn’t load the whole document tree into memory. However, it doesn’t support navigation like that provided by XPath, and although it is more efficient, it is also harder to use.
• DOM: A model-based parser that loads the document into memory as a tree structure, so the original element order is preserved and we can navigate the document in both directions. It provides an API for reading and writing, supports XML manipulation, and is very easy to use, at the price of a high strain on memory resources.
• StAX: It offers the ease of DOM and the efficiency of SAX, but it lacks some functionality provided by DOM, such as XML manipulation, and it only allows us to navigate the document forward.
• JAXB: It allows us to navigate the document in both directions, it is more efficient than DOM, it converts XML to Java types, and it supports XML manipulation, but it can only parse a valid XML document.
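To make the contrast between the event-based and the model-based styles concrete, here is a minimal sketch that uses only the SAX and DOM parsers shipped with the JDK. The class name, the method names and the inline sample document are assumptions made for illustration; they are not part of the article's sample project.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, written only for this illustration.
class SaxVsDomSketch {

    // Inline sample modelled on the article's structure (the second entry is invented).
    static final String XML =
        "<tutorials>"
      + "<tutorial tutId=\"01\" type=\"java\"><title>Guava</title></tutorial>"
      + "<tutorial tutId=\"02\" type=\"java\"><title>XML</title></tutorial>"
      + "</tutorials>";

    // SAX: the parser pushes events to a handler and keeps no tree in memory.
    static int countTutorialsWithSax(String xml) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                if ("tutorial".equals(qName)) {
                    count[0]++;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return count[0];
    }

    // DOM: the whole document is loaded as a tree, so we can walk down to a
    // <title> element and then back up to its parent <tutorial>.
    static String parentIdOfFirstTitleWithDom(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList titles = doc.getElementsByTagName("title");
        Element firstTitle = (Element) titles.item(0);
        Element parent = (Element) firstTitle.getParentNode();
        return parent.getAttribute("tutId");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countTutorialsWithSax(XML));
        System.out.println(parentIdOfFirstTitleWithDom(XML));
    }
}
```

The SAX version never holds more than the current event, which is where its memory efficiency comes from, while the DOM version can move from a child back to its parent because the whole tree is available.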
You may still find some references to JAXP as a standalone project, but its last release dates from March 2013 and the project is practically dead. The JAXP API itself remains part of the JDK.
3. The XML
In this section, we are going to look at the most popular implementations so that we can test real working samples and check the differences between them.
In the following examples, we will be working with a simple XML file with a structure like this:
<tutorials>
    <tutorial tutId="01" type="java">
        <title>Guava</title>
        <description>Introduction to Guava</description>
        <date>04/04/2016</date>
        <author>GuavaAuthor</author>
    </tutorial>
    ...
</tutorials>
4. DOM4J
We’re going to start by taking a look at what we can do with DOM4J. For this example, we need to add the latest version of this dependency.
This is one of the most popular libraries for working with XML files, since it allows us to perform bi-directional reading, create new documents and update existing ones.
DOM4J can work with DOM, SAX, XPath and XSLT. SAX is supported via JAXP.
For example, let’s take a look at how we can select an element by filtering on a given id:
SAXReader reader = new SAXReader();
Document document = reader.read(file);
List<Node> elements = document.selectNodes("//*[@tutId='" + id + "']");
return elements.get(0);
The SAXReader class is responsible for creating a DOM4J tree from SAX parsing events. Once we have an org.dom4j.Document, we just need to call the necessary method, here selectNodes, and pass the XPath expression to it as a String.
We can load an existing document, make changes to its content and then write the updated content to a file:
for (Node node : nodes) {
    Element element = (Element) node;
    Iterator<Element> iterator = element.elementIterator("title");
    while (iterator.hasNext()) {
        Element title = iterator.next();
        title.setText(title.getText() + " updated");
    }
}
// write the modified tree to a new file
XMLWriter writer = new XMLWriter(
    new FileWriter(new File("src/test/resources/example_updated.xml")));
writer.write(document);
writer.close();
In the example above, we change every title’s content and write the result to a new file.
Notice how simple it is to iterate over every title node by calling elementIterator and passing the name of the node.
Once our content is modified, we use the XMLWriter, which takes a DOM4J tree and writes it to a stream as XML.
Creating a new document from scratch is as simple as we see below:
Document document = DocumentHelper.createDocument();
Element root = document.addElement("XMLTutorials");
Element tutorialElement = root.addElement("tutorial").addAttribute("tutId", "01");
tutorialElement.addAttribute("type", "xml");
tutorialElement.addElement("title").addText("XML with Dom4J");
...
OutputFormat format = OutputFormat.createPrettyPrint();
XMLWriter writer = new XMLWriter(
new FileWriter(new File("src/test/resources/example_new.xml")), format);
writer.write(document);
writer.close();
DocumentHelper gives us a collection of helper methods for working with DOM4J, such as createDocument, which creates an empty document to start working with.
We can create as many attributes or elements as we need with the methods provided by DOM4J, and once our document is complete, we just write it to a file as we did in the update case before.
5. JDOM
In order to work with JDOM, we have to add this dependency to our pom.
JDOM’s working style is pretty similar to DOM4J’s, so we are going to take a look at just a couple of examples:
SAXBuilder builder = new SAXBuilder();
Document doc = builder.build(this.getFile());
Element tutorials = doc.getRootElement();
List<Element> titles = tutorials.getChildren("tutorial");
In the example above, we retrieve all the tutorial child elements of the root element in a very simple way, much as we can with DOM4J. We can also select elements with an XPath expression:
SAXBuilder builder = new SAXBuilder();
Document document = (Document) builder.build(file);
String filter = "//*[@tutId='" + id + "']";
XPathFactory xFactory = XPathFactory.instance();
XPathExpression<Element> expr = xFactory.compile(filter, Filters.element());
List<Element> node = expr.evaluate(document);
Again, in the code above, a SAXBuilder creates a Document instance from a given file. We retrieve an element by its tutId attribute by compiling an XPath expression with the XPathFactory provided by JDOM2 and evaluating it against the document.
6. StAX
Now, we are going to see how we can retrieve all elements from our root element using the StAX API. StAX has been included in the JDK since Java 6, so we don’t need to add any dependencies.
First, we need to create a Tutorial class:
public class Tutorial {
    private String tutId;
    private String type;
    private String title;
    private String description;
    private String date;
    private String author;

    // standard getters and setters
}
Then we are ready to continue with:
List<Tutorial> tutorials = new ArrayList<>();
XMLInputFactory factory = XMLInputFactory.newInstance();
XMLEventReader eventReader = factory.createXMLEventReader(new FileReader(this.getFile()));
Tutorial current;
while (eventReader.hasNext()) {
    XMLEvent event = eventReader.nextEvent();
    switch (event.getEventType()) {
        case XMLStreamConstants.START_ELEMENT:
            StartElement startElement = event.asStartElement();
            String qName = startElement.getName().getLocalPart();
            ...
            break;
        case XMLStreamConstants.CHARACTERS:
            Characters characters = event.asCharacters();
            ...
            break;
        case XMLStreamConstants.END_ELEMENT:
            EndElement endElement = event.asEndElement();
            // check if we found the closing element
            // close resources that need to be explicitly closed
            break;
    }
}
In the example above, in order to help us retrieve the information, we needed to create a class to store the retrieved data in.
To read the document, we pull events from the reader one at a time and switch on the event type, which lets us move through the document in the forward direction only. Remember that StAX, like SAX, doesn’t provide bi-directional navigation. As you can see here, a lot of work needs to be done just to retrieve a simple list of elements.
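The elided branches of the loop above depend on what we want to collect. As one possible way to fill them in, the following self-contained sketch gathers each tutorial's tutId and title as a simple string instead of populating the Tutorial class. The class name, the method name, the sample data and the "id:title" output format are all assumptions made for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper class, written only for this illustration.
class StaxTitlesSketch {

    // Inline sample modelled on the article's structure (the second entry is invented).
    static final String SAMPLE =
        "<tutorials>"
      + "<tutorial tutId=\"01\" type=\"java\"><title>Guava</title></tutorial>"
      + "<tutorial tutId=\"02\" type=\"java\"><title>XML</title></tutorial>"
      + "</tutorials>";

    // Returns one "tutId:title" entry per <title> element, in document order.
    static List<String> readTitles(String xml) throws Exception {
        List<String> result = new ArrayList<>();
        XMLEventReader reader =
            XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        String currentId = null;       // tutId of the <tutorial> we are inside
        boolean inTitle = false;       // true while between <title> and </title>
        StringBuilder text = new StringBuilder();
        try {
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                switch (event.getEventType()) {
                    case XMLStreamConstants.START_ELEMENT:
                        StartElement start = event.asStartElement();
                        String name = start.getName().getLocalPart();
                        if ("tutorial".equals(name)) {
                            Attribute id = start.getAttributeByName(new QName("tutId"));
                            currentId = (id != null) ? id.getValue() : null;
                        } else if ("title".equals(name)) {
                            inTitle = true;
                            text.setLength(0);
                        }
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        // text may arrive in several chunks, so we accumulate it
                        if (inTitle) {
                            text.append(event.asCharacters().getData());
                        }
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        if ("title".equals(event.asEndElement().getName().getLocalPart())) {
                            result.add(currentId + ":" + text);
                            inTitle = false;
                        }
                        break;
                    default:
                        break;
                }
            }
        } finally {
            reader.close();
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readTitles(SAMPLE));
    }
}
```

Because the reader only moves forward, the sketch has to remember the current tutId itself. With a tree model, that state would simply be the parent node.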
7. JAXB
JAXB, like Xerces, is included with the JDK, so we don’t need any extra dependency for this one. Note that this holds for Java 6 through 10; JAXB was removed from the JDK in Java 11 and has to be added as a dependency there.
It’s very simple to load, create and manipulate information from an XML file using JAXB.
We just need to create the correct Java entities to bind the XML to, and that’s it.
JAXBContext jaxbContext = JAXBContext.newInstance(Tutorials.class);
Unmarshaller jaxbUnmarshaller = jaxbContext.createUnmarshaller();
Tutorials tutorials = (Tutorials) jaxbUnmarshaller.unmarshal(this.getFile());
In the example above, we load our XML file into our object, and from there we can handle everything as a normal Java structure.
Creating a new document is as simple as reading one, just done in reverse, as shown in the code below.
First, we are going to modify our Tutorial class to add JAXB annotations to the setters:
public class Tutorial {
    ...
    public String getTutId() {
        return tutId;
    }

    @XmlAttribute
    public void setTutId(String tutId) {
        this.tutId = tutId;
    }
    ...
    @XmlElement
    public void setTitle(String title) {
        this.title = title;
    }
    ...
}

@XmlRootElement
public class Tutorials {
    private List<Tutorial> tutorial;

    // standard getters and setters with @XmlElement annotation
}
With @XmlRootElement we define which object represents the root node of our document, and then we use @XmlAttribute or @XmlElement to define whether a property maps to an attribute of a node or to an element of the document.
Then we can continue with:
Tutorials tutorials = new Tutorials();
tutorials.setTutorial(new ArrayList<>());
Tutorial tut = new Tutorial();
tut.setTutId("01");
...
tutorials.getTutorial().add(tut);
JAXBContext jaxbContext = JAXBContext.newInstance(Tutorials.class);
Marshaller jaxbMarshaller = jaxbContext.createMarshaller();
jaxbMarshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
jaxbMarshaller.marshal(tutorials, file);
As you can see, binding an XML file to Java objects is the easiest way to work with this kind of file.
8. XPath Expression Support
To create complex XPath expressions, we can use Jaxen. This is an open source XPath library adaptable to many different object models, including DOM, XOM, DOM4J, and JDOM.
We can create XPath expressions and compile them against many supported documents.
String expression = "/tutorials/tutorial";
XPath path = new DOMXPath(expression);
List result = path.selectNodes(xmlDocument);
To make it work we’ll need to add this dependency to our project.
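For comparison, the same kind of expression can also be evaluated against a W3C DOM document without any extra dependency, using the javax.xml.xpath package that ships with the JDK. The sketch below is not Jaxen. It is only a rough JDK-only counterpart, and the class name, method name and sample data are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, written only for this illustration.
class JdkXPathSketch {

    // Inline sample modelled on the article's structure (the second entry is invented).
    static final String SAMPLE =
        "<tutorials>"
      + "<tutorial tutId=\"01\" type=\"java\"><title>Guava</title></tutorial>"
      + "<tutorial tutId=\"02\" type=\"java\"><title>XML</title></tutorial>"
      + "</tutorials>";

    // Parses the XML into a DOM tree and counts the nodes matched by the expression.
    static int countMatches(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countMatches(SAMPLE, "/tutorials/tutorial"));
        System.out.println(countMatches(SAMPLE, "//tutorial[@tutId='02']"));
    }
}
```

The JDK API works only with DOM, whereas Jaxen, as noted above, adapts to several object models.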
9. Conclusion
As you can see, there are many options for working with XML. Depending on the requirements of your application, you could work with any of them, or you may have to choose between efficiency and simplicity.
You can find the full working samples for this article in our git repository here.