Java Read XML with StAX Parser – Cursor and Iterator APIs

Learn to parse and read XML files using the Java StAX parser. StAX (Streaming API for XML) provides two ways to parse XML:

  • Cursor-based API
  • Iterator-based API

This tutorial will discuss both APIs for parsing an XML file.

1. Introduction to StAX Parser

Like the SAX parser, the StAX API is designed for parsing XML streams. The differences are:

  • StAX is a “pull” API. SAX is a “push” API.
  • StAX can do both XML reading and writing. SAX can only do XML reading.
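
To illustrate the writing side mentioned in the second point, here is a minimal sketch that uses XMLStreamWriter to produce one employee element. The class name StaxWriterSketch and the helper writeEmployee are hypothetical names chosen for this illustration; only the javax.xml.stream calls are part of the standard API.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

class StaxWriterSketch {

  // Hypothetical helper (not from the article): serializes one employee element to a String
  static String writeEmployee(int id, String name, String title) throws XMLStreamException {
    StringWriter out = new StringWriter();
    XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);

    writer.writeStartElement("employee");
    writer.writeAttribute("id", String.valueOf(id));

    writer.writeStartElement("name");
    writer.writeCharacters(name);
    writer.writeEndElement(); // closes <name>

    writer.writeStartElement("title");
    writer.writeCharacters(title);
    writer.writeEndElement(); // closes <title>

    writer.writeEndElement(); // closes <employee>
    writer.flush();
    writer.close();
    return out.toString();
  }

  public static void main(String[] args) throws XMLStreamException {
    System.out.println(writeEmployee(101, "Lokesh Gupta", "Author"));
  }
}
```

The writer is driven the same way as the reader: our code decides when each piece of the document is emitted.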

StAX is a pull-style API. This means that we have to move the StAX parser from item to item in the XML file ourselves, just as we do with a standard Iterator or a JDBC ResultSet. We can then access the XML information via the StAX parser for each such “item” encountered in the XML file.

2. Difference between Cursor and Iterator APIs

While reading the XML document, the iterator reader returns an XML event object from each nextEvent() call. This event provides information about what type of XML item (element, text, comment, etc.) we have encountered. The event received is immutable, so we can safely pass it around the application for processing.

XMLEventReader reader = ...;
 
while(reader.hasNext()){
    XMLEvent event = reader.nextEvent();
 
    if(event.getEventType() == XMLEvent.START_ELEMENT){
        //process data
    } 
    //... more event types handled here...
}

Unlike the iterator, the cursor works like a ResultSet in JDBC. Each call to next() moves the cursor to the next event in the XML document. You can then call methods directly on the cursor to obtain more information about the current event.

XMLStreamReader streamReader = ...;
 
while(streamReader.hasNext()){
    int eventType = streamReader.next();
 
    if(eventType == XMLStreamReader.START_ELEMENT){
        System.out.println(streamReader.getLocalName());
    }
 
    //... more event types handled here...
}
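
The two snippets above leave the reader creation elided. As a runnable side-by-side sketch, the following program collects the element names from a small in-memory document with both APIs. The class name StaxApiComparison, its method names, and the sample XML string are assumptions made for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.events.XMLEvent;

class StaxApiComparison {

  // Sample document used only for this sketch
  static final String XML =
      "<employees><employee id=\"101\"><name>Lokesh Gupta</name></employee></employees>";

  // Iterator API: every call to nextEvent() returns a separate XMLEvent object
  static List<String> namesWithIterator(String xml) throws XMLStreamException {
    List<String> names = new ArrayList<>();
    XMLEventReader reader =
        XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
    while (reader.hasNext()) {
      XMLEvent event = reader.nextEvent();
      if (event.isStartElement()) {
        names.add(event.asStartElement().getName().getLocalPart());
      }
    }
    reader.close();
    return names;
  }

  // Cursor API: next() returns an int event type and the reader itself exposes the data
  static List<String> namesWithCursor(String xml) throws XMLStreamException {
    List<String> names = new ArrayList<>();
    XMLStreamReader reader =
        XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
    while (reader.hasNext()) {
      if (reader.next() == XMLStreamConstants.START_ELEMENT) {
        names.add(reader.getLocalName());
      }
    }
    reader.close();
    return names;
  }

  public static void main(String[] args) throws XMLStreamException {
    System.out.println(namesWithIterator(XML)); // prints [employees, employee, name]
    System.out.println(namesWithCursor(XML));   // prints [employees, employee, name]
  }
}
```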

3. Iterator API Example

The following example demonstrates how to use the StAX iterator-based API to read an XML document into objects.

The XML file is as follows:

<employees>
  <employee id="101">
     <name>Lokesh Gupta</name>
      <title>Author</title>
  </employee>
  <employee id="102">
     <name>Brian Lara</name>
      <title>Cricketer</title>
  </employee>
</employees>

To read the file, the program follows these steps:

  • Create an iterator and start receiving events.
  • As soon as the opening 'employee' tag is encountered, create a new Employee object.
  • Read the id attribute from the employee tag and set it on the current Employee object.
  • Iterate to the next start tag events. These are the XML elements inside the employee tag. Read the data inside these tags and set it on the current Employee object.
  • Continue iterating over the events. When the end element event for the 'employee' tag is found, the data for the current employee has been read, so add the current employee object to the employeeList collection.
  • Finally, verify the read data by printing the employeeList.

import com.howtodoinjava.xml.model.Employee;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.Characters;
import javax.xml.stream.events.EndElement;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

public class ReadXmlWithIterator {

  public static void main(String[] args) throws FileNotFoundException, XMLStreamException {
    File file = new File("employees.xml");

    // Factory used to create StAX readers
    XMLInputFactory factory = XMLInputFactory.newInstance();

    // Create the event reader that iterates over the events in the XML file
    XMLEventReader eventReader = factory.createXMLEventReader(new FileReader(file));

    //All employee objects that are read will be added to this list
    List<Employee> employeeList = new ArrayList<>();

    //Current Employee object. It is populated using setter methods
    //and finally stored in the 'employeeList' above
    Employee employee = null;

    // Loop while more events are available
    while (eventReader.hasNext()) {
      XMLEvent xmlEvent = eventReader.nextEvent();

      if (xmlEvent.isStartElement()) {
        StartElement startElement = xmlEvent.asStartElement();

        //As soon as the employee tag is opened, create a new Employee object
        if ("employee".equalsIgnoreCase(startElement.getName().getLocalPart())) {
          employee = new Employee();
        }

        //Read all attributes when start tag is being read
        @SuppressWarnings("unchecked")
        Iterator<Attribute> iterator = startElement.getAttributes();

        while (iterator.hasNext()) {
          Attribute attribute = iterator.next();
          QName name = attribute.getName();
          if ("id".equalsIgnoreCase(name.getLocalPart())) {
            employee.setId(Integer.valueOf(attribute.getValue()));
          }
        }

        //For content elements, advance to the text event that follows
        //the start tag and read its data
        switch (startElement.getName().getLocalPart()) {
          case "name":
            Characters nameDataEvent = (Characters) eventReader.nextEvent();
            employee.setName(nameDataEvent.getData());
            break;

          case "title":
            Characters titleDataEvent = (Characters) eventReader.nextEvent();
            employee.setTitle(titleDataEvent.getData());
            break;
        }
      }

      if (xmlEvent.isEndElement()) {
        EndElement endElement = xmlEvent.asEndElement();

        //If employee tag is closed then add the employee object to list;
        //and be ready to read next employee data
        if ("employee".equalsIgnoreCase(endElement.getName().getLocalPart())) {
          employeeList.add(employee);
        }
      }
    }
    System.out.println(employeeList); //Verify read data
  }
}

The program output:

[Employee [id=101, name=Lokesh Gupta, title=Author], Employee [id=102, name=Brian Lara, title=Cricketer]]
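
The Employee model class is imported by the program but not listed in this tutorial. A minimal sketch that is consistent with the setters used above and with the printed output could look like the following. The field types and the exact toString() format are assumptions inferred from the example, and the package declaration (com.howtodoinjava.xml.model) is omitted so that the snippet compiles on its own; in the real project the class would also need to be public.

```java
// Assumed shape of the model class; inferred from the setters and the output shown above
class Employee {

  private Integer id;
  private String name;
  private String title;

  public Integer getId() {
    return id;
  }

  public void setId(Integer id) {
    this.id = id;
  }

  public String getName() {
    return name;
  }

  public void setName(String name) {
    this.name = name;
  }

  public String getTitle() {
    return title;
  }

  public void setTitle(String title) {
    this.title = title;
  }

  // Matches the format of the program output in this tutorial
  @Override
  public String toString() {
    return "Employee [id=" + id + ", name=" + name + ", title=" + title + "]";
  }
}
```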

4. Cursor API Example

Let us read the same employees.xml file, this time with the cursor-based API.

import com.howtodoinjava.xml.model.Employee;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class ReadXmlWithCursor {

  public static void main(String[] args) throws FileNotFoundException, XMLStreamException {
    //All employee objects that are read will be added to this list
    List<Employee> employeeList = new ArrayList<>();

    //Current Employee object. It is populated using setter methods
    //and finally stored in the 'employeeList' above
    Employee employee = null;

    File file = new File("employees.xml");
    XMLInputFactory factory = XMLInputFactory.newInstance();
    XMLStreamReader streamReader = factory.createXMLStreamReader(new FileReader(file));

    while (streamReader.hasNext()) {
      //Move to next event
      streamReader.next();

      //Check if it is a 'START_ELEMENT' event
      if (streamReader.getEventType() == XMLStreamReader.START_ELEMENT) {
        //employee tag - opened
        if (streamReader.getLocalName().equalsIgnoreCase("employee")) {

          //Create a new employee object as soon as the tag is opened
          employee = new Employee();

          //Read attributes within employee tag
          if (streamReader.getAttributeCount() > 0) {
            String id = streamReader.getAttributeValue(null, "id");
            employee.setId(Integer.valueOf(id));
          }
        }

        //Read name data (same setter as in the iterator example)
        if (streamReader.getLocalName().equalsIgnoreCase("name")) {
          employee.setName(streamReader.getElementText());
        }

        //Read title data (same setter as in the iterator example)
        if (streamReader.getLocalName().equalsIgnoreCase("title")) {
          employee.setTitle(streamReader.getElementText());
        }
      }

      //If employee tag is closed then add the employee object to list
      if (streamReader.getEventType() == XMLStreamReader.END_ELEMENT) {
        if (streamReader.getLocalName().equalsIgnoreCase("employee")) {
          employeeList.add(employee);
        }
      }
    }
    //Verify read data
    System.out.println(employeeList);
  }
}
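
Both programs above leave the reader and the underlying file open when they finish. One possible way to release them is sketched below, under a few assumptions: the class name CursorWithCleanup and the method countEmployees are hypothetical, and the file is opened as an InputStream (rather than a FileReader) so that the parser can work out the encoding itself. Note that XMLStreamReader.close() frees the parser's resources but does not close the underlying input source, which is why the stream is managed separately with try-with-resources.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class CursorWithCleanup {

  // Hypothetical helper: counts <employee> elements and releases all resources afterwards
  static int countEmployees(Path xmlFile) throws IOException, XMLStreamException {
    int count = 0;

    // try-with-resources closes the file stream even if parsing fails
    try (InputStream in = Files.newInputStream(xmlFile)) {
      XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
      try {
        while (reader.hasNext()) {
          if (reader.next() == XMLStreamConstants.START_ELEMENT
              && "employee".equals(reader.getLocalName())) {
            count++;
          }
        }
      } finally {
        // Frees parser resources; the underlying stream is closed by the outer try block
        reader.close();
      }
    }
    return count;
  }

  public static void main(String[] args) throws IOException, XMLStreamException {
    System.out.println(countEmployees(Path.of("employees.xml")));
  }
}
```

The same pattern applies to XMLEventReader, which also provides a close() method.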

5. Conclusion

In this StAX parser tutorial, we learned the following things:

  1. What the StAX parser is and how it relates to the XML streaming API.
  2. The differences between the StAX and SAX parsers.
  3. How to read XML with the StAX iterator API.
  4. How to read XML with the StAX cursor API.

Both APIs are capable of parsing any kind of XML document, but the cursor API is more memory-efficient than the iterator API because it does not create a separate event object for each item it reads. So, if your application needs better performance, consider using the cursor-based API.

Drop me your questions in the comments section.

Happy Learning !!
