Get started with Spring and Spring Boot, through the Learn Spring course:

>> CHECK OUT THE COURSE

1. Introduction

One of the most common data formats today is XML (Extensible Markup Language), which is widely used in structuring and exchanging data between applications.

A common task in Java is converting a piece of XML markup held in a String into an org.w3c.dom.Document object.

In this tutorial, we’ll discuss converting a string with XML-based content into an org.w3c.dom.Document in Java.

2. org.w3c.dom.Document

The org.w3c.dom.Document interface is a core part of the Document Object Model (DOM) XML API in Java. It represents an entire XML document and provides methods for navigating, modifying, and retrieving data from it. When working with XML through the DOM API, the Document object is the entry point to the document’s content.
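As a rough illustration of navigating, modifying, and retrieving data, here is a minimal sketch. The class name DocumentNavigationExample, its helper methods, and the library/book element names are hypothetical and chosen only for this illustration. The document is built in memory just to keep the sketch self-contained; creating documents is covered right after.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical example class, not part of the article's code base
public class DocumentNavigationExample {

    // Builds a tiny in-memory document: <library><book>Dune</book></library>
    static Document buildLibrary() throws ParserConfigurationException {
        Document document = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .newDocument();
        Element library = document.createElement("library");
        document.appendChild(library);
        Element book = document.createElement("book");
        book.setTextContent("Dune");
        library.appendChild(book);
        return document;
    }

    // Navigating and retrieving: start at the root element, look up <book> by tag name
    static String firstBookTitle(Document document) {
        NodeList books = document.getDocumentElement().getElementsByTagName("book");
        return books.item(0).getTextContent();
    }

    // Modifying: set an "id" attribute on the first <book>, then read it back
    static String tagFirstBook(Document document, String id) {
        Element book = (Element) document.getElementsByTagName("book").item(0);
        book.setAttribute("id", id);
        return book.getAttribute("id");
    }

    public static void main(String[] args) throws Exception {
        Document document = buildLibrary();
        System.out.println(firstBookTitle(document));
        System.out.println(tagFirstBook(document, "b1"));
    }
}
```

All calls used here (getDocumentElement(), getElementsByTagName(), getTextContent(), setAttribute()) come from the standard org.w3c.dom interfaces.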

To better understand how to create an org.w3c.dom.Document object, let’s look at the following example:

try {
    // Create a DocumentBuilderFactory
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

    // Create a DocumentBuilder
    DocumentBuilder builder = factory.newDocumentBuilder();

    // Create a new Document
    Document document = builder.newDocument();

    // Create an example XML structure
    Element rootElement = document.createElement("root");
    document.appendChild(rootElement);

    Element element = document.createElement("element");
    element.appendChild(document.createTextNode("XML Document Example"));
    rootElement.appendChild(element);
    
} catch (ParserConfigurationException e) {
    e.printStackTrace();
}

In the code above, we first create the objects needed to work with XML, namely a DocumentBuilderFactory and a DocumentBuilder. We then build a simple XML structure: a root element named “root” containing a child element named “element”, whose text is “XML Document Example”. The code itself doesn’t print anything, but the resulting document corresponds to the following XML:

<root>
    <element>XML Document Example</element>
</root>
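To actually see this markup, the Document has to be serialized. One possible way, sketched below, is an identity Transformer from javax.xml.transform. The class name DocumentToStringExample and its two helper methods are hypothetical names used only for this illustration.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical example class, not part of the article's code base
public class DocumentToStringExample {

    // Rebuilds the same root/element tree as the example above
    static Document buildExampleDocument() throws ParserConfigurationException {
        Document document = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .newDocument();
        Element rootElement = document.createElement("root");
        document.appendChild(rootElement);
        Element element = document.createElement("element");
        element.appendChild(document.createTextNode("XML Document Example"));
        rootElement.appendChild(element);
        return document;
    }

    // Serializes a Document to a String using an identity Transformer
    static String toXmlString(Document document) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // Leave out the <?xml ...?> declaration so only the elements are written
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toXmlString(buildExampleDocument()));
    }
}
```

This sketch doesn't request indentation, so the serialized markup isn't pretty-printed like the listing above. Indented output can be requested with OutputKeys.INDENT, though the exact whitespace may vary between transformer implementations.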

3. Parsing XML from a String

To convert a string containing XML into an org.w3c.dom.Document, we need to parse it. Fortunately, Java offers several XML parsing APIs, including DOM, SAX, and StAX.

To keep things simple, this article focuses on the DOM parser. Let’s walk through a step-by-step example of how to parse a string with XML and create an org.w3c.dom.Document object:

@Test
public void givenValidXMLString_whenParsing_thenDocumentIsCorrect()
  throws ParserConfigurationException {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    String xmlString = "<root><element>XML Parsing Example</element></root>";
    InputSource is = new InputSource(new StringReader(xmlString));
    Document xmlDoc;
    try {
        xmlDoc = builder.parse(is);
    } catch (SAXException | IOException e) {
        // Rethrow parse and I/O failures so the test fails with the original cause
        throw new RuntimeException(e);
    }

    assertEquals("root", xmlDoc.getDocumentElement().getNodeName());
    assertEquals("element", xmlDoc.getDocumentElement().getElementsByTagName("element").item(0).getNodeName());
    assertEquals("XML Parsing Example",
      xmlDoc.getDocumentElement().getElementsByTagName("element").item(0).getTextContent());
}

In the code above, we create the DocumentBuilderFactory and DocumentBuilder needed for XML parsing. We then define a sample XML string (xmlString) and wrap it in a StringReader and an InputSource so the builder can read it. The call to parse() sits inside a try-catch block that rethrows any SAXException or IOException as a RuntimeException.

Finally, we use assertions to verify the parsed document: the root element’s name via getDocumentElement().getNodeName(), the child element’s name via getDocumentElement().getElementsByTagName(), and the child element’s text content via getTextContent().
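Outside of a test, the same steps can be wrapped in a small helper. The sketch below is one possible shape, not the article’s own code: the class XmlStringParser and the methods parse() and rootNameOrNull() are hypothetical names. It also shows what happens with markup that isn’t well-formed, which the parser reports as a SAXException.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not a JDK or library API
public class XmlStringParser {

    // Parses an XML string into a Document, following the same steps as the test above
    static Document parse(String xml)
      throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Returns the root element name, or null when the input is not well-formed XML
    static String rootNameOrNull(String xml) {
        try {
            return parse(xml).getDocumentElement().getNodeName();
        } catch (SAXException e) {
            // Malformed markup surfaces as a SAXException from the parser
            return null;
        } catch (ParserConfigurationException | IOException e) {
            // Not expected for an in-memory string with the default parser
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootNameOrNull("<root><element>XML Parsing Example</element></root>"));
        System.out.println(rootNameOrNull("<root><element>unclosed</root>"));
    }
}
```

Returning null for malformed input is just one design choice for this sketch. Depending on the application, letting the SAXException propagate may be more appropriate.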

4. Conclusion

In conclusion, knowing how to work with org.w3c.dom.Document is valuable for any Java developer who deals with XML-based data, whether in data processing, web services, or configuration tasks.

As always, the complete code samples for this article can be found over on GitHub.
