Learn to parse and read XML files using the Java StAX parser. StAX (Streaming API for XML) provides two ways to parse XML:
- Cursor-based API
- Iterator-based API
This tutorial will discuss both APIs for parsing an XML file.
1. Introduction to StAX Parser
Just like the SAX parser, the StAX API is designed for parsing XML streams. The differences are:
- StAX is a “pull” API. SAX is a “push” API.
- StAX can do both XML reading and writing. SAX can only do XML reading.
StAX is a pull-style API. This means that we have to move the StAX parser from item to item in the XML file ourselves, just like we do with a standard Iterator or a JDBC ResultSet. We can then access the XML information via the StAX parser for each such “item” encountered in the XML file.
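The comparison with SAX notes that StAX can also write XML. As a minimal sketch of that side of the API, the following builds a one-employee document in memory with XMLStreamWriter. The class name WriteXmlWithStax and the method writeEmployee are made up for illustration, and the element names simply mirror the employees.xml file used later in this tutorial.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper class, not part of the original tutorial code.
public class WriteXmlWithStax {

  // Writes a single employee wrapped in an <employees> root and returns the XML text.
  static String writeEmployee(int id, String name, String title) throws XMLStreamException {
    StringWriter out = new StringWriter();
    XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);

    writer.writeStartElement("employees");
    writer.writeStartElement("employee");
    writer.writeAttribute("id", String.valueOf(id));

    writer.writeStartElement("name");
    writer.writeCharacters(name);
    writer.writeEndElement(); // closes <name>

    writer.writeStartElement("title");
    writer.writeCharacters(title);
    writer.writeEndElement(); // closes <title>

    writer.writeEndElement(); // closes <employee>
    writer.writeEndElement(); // closes <employees>

    writer.flush();
    writer.close();
    return out.toString();
  }

  public static void main(String[] args) throws XMLStreamException {
    System.out.println(writeEmployee(101, "Lokesh Gupta", "Author"));
  }
}
```

The writer mirrors the pull model on the output side: the application decides when each start tag, attribute, text node, and end tag is emitted.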
2. Difference between Cursor and Iterator APIs
While reading the XML document, the iterator reader returns an XML event object from each nextEvent() call. This event provides information about what type of XML item (element, text, comment, etc.) we have encountered. The event received is immutable, so we can pass it around the application and process it safely.
XMLEventReader reader = ...;
while (reader.hasNext()) {
  XMLEvent event = reader.nextEvent();
  if (event.getEventType() == XMLEvent.START_ELEMENT) {
    //process data
  }
  //... more event types handled here...
}
Unlike the iterator, the cursor works like a ResultSet in JDBC. Each call to next() moves the cursor to the next event in the XML document. You can then call methods directly on the cursor to obtain more information about the current event.
XMLStreamReader streamReader = ...;
while (streamReader.hasNext()) {
  int eventType = streamReader.next();
  if (eventType == XMLStreamReader.START_ELEMENT) {
    System.out.println(streamReader.getLocalName());
  }
  //... more event types handled here...
}
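To make the two snippets above easier to try, here is a small self-contained sketch that runs both APIs over the same in-memory XML string and collects the names of the start tags. The class name StaxApiComparison, the two method names, and the sample string are assumptions chosen for illustration only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.events.XMLEvent;

// Hypothetical demo class, not part of the original tutorial code.
public class StaxApiComparison {

  static final String XML =
      "<employees><employee id=\"101\"><name>Lokesh Gupta</name></employee></employees>";

  // Iterator API: every item arrives as an XMLEvent object.
  static List<String> startTagsWithIterator(String xml) throws XMLStreamException {
    XMLEventReader reader =
        XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
    List<String> names = new ArrayList<>();
    while (reader.hasNext()) {
      XMLEvent event = reader.nextEvent();
      if (event.isStartElement()) {
        names.add(event.asStartElement().getName().getLocalPart());
      }
    }
    reader.close();
    return names;
  }

  // Cursor API: next() returns an int event type and the reader itself exposes the data.
  static List<String> startTagsWithCursor(String xml) throws XMLStreamException {
    XMLStreamReader reader =
        XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
    List<String> names = new ArrayList<>();
    while (reader.hasNext()) {
      if (reader.next() == XMLStreamConstants.START_ELEMENT) {
        names.add(reader.getLocalName());
      }
    }
    reader.close();
    return names;
  }

  public static void main(String[] args) throws XMLStreamException {
    System.out.println(startTagsWithIterator(XML));
    System.out.println(startTagsWithCursor(XML));
  }
}
```

Both methods visit the same start tags in document order. The difference is only in how the information is delivered: as event objects in the first case, and through the reader's own accessor methods in the second.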
3. Iterator API Example
The following example demonstrates how to use the StAX iterator-based API to read an XML document into objects.
The XML file is as follows:
<employees>
  <employee id="101">
    <name>Lokesh Gupta</name>
    <title>Author</title>
  </employee>
  <employee id="102">
    <name>Brian Lara</name>
    <title>Cricketer</title>
  </employee>
</employees>
To read the file, I have written the program in these steps:
- Create an iterator and start receiving events.
- As soon as you get the opening 'employee' tag, create a new Employee object.
- Read the id attribute from the employee tag and set it on the current Employee object.
- Iterate to the next start tag events. These are the XML elements inside the employee tag. Read the data inside these tags and set it on the current Employee object.
- Continue iterating over the events. When you find the end element event for the 'employee' tag, you have read all the data for the current employee, so add the current Employee object to the employeeList collection.
- Finally, verify the read data by printing the employeeList.
import com.howtodoinjava.xml.model.Employee;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.Characters;
import javax.xml.stream.events.EndElement;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

public class ReadXmlWithIterator {

  public static void main(String[] args) throws FileNotFoundException, XMLStreamException {
    File file = new File("employees.xml");
    // Factory that creates the StAX readers
    XMLInputFactory factory = XMLInputFactory.newInstance();
    // Event reader that iterates over the events in the XML file
    XMLEventReader eventReader = factory.createXMLEventReader(new FileReader(file));
    //All employee objects that are read will be added to this list
    List<Employee> employeeList = new ArrayList<>();
    //Employee currently being read. It gets its data through setter methods
    //and is finally stored in the 'employeeList' above
    Employee employee = null;
    // Check whether another event is available
    while (eventReader.hasNext()) {
      XMLEvent xmlEvent = eventReader.nextEvent();
      if (xmlEvent.isStartElement()) {
        StartElement startElement = xmlEvent.asStartElement();
        //As soon as the employee tag is opened, create a new Employee object
        if ("employee".equalsIgnoreCase(startElement.getName().getLocalPart())) {
          employee = new Employee();
          //Read the attributes of the employee start tag.
          //Doing this only here avoids a NullPointerException on other tags.
          @SuppressWarnings("unchecked")
          Iterator<Attribute> iterator = startElement.getAttributes();
          while (iterator.hasNext()) {
            Attribute attribute = iterator.next();
            QName name = attribute.getName();
            if ("id".equalsIgnoreCase(name.getLocalPart())) {
              employee.setId(Integer.valueOf(attribute.getValue()));
            }
          }
        }
        //Every time a content tag is found,
        //move the iterator forward and read its text
        switch (startElement.getName().getLocalPart()) {
          case "name":
            Characters nameDataEvent = (Characters) eventReader.nextEvent();
            employee.setName(nameDataEvent.getData());
            break;
          case "title":
            Characters titleDataEvent = (Characters) eventReader.nextEvent();
            employee.setTitle(titleDataEvent.getData());
            break;
        }
      }
      if (xmlEvent.isEndElement()) {
        EndElement endElement = xmlEvent.asEndElement();
        //If the employee tag is closed, add the employee object to the list
        //and be ready to read the next employee
        if ("employee".equalsIgnoreCase(endElement.getName().getLocalPart())) {
          employeeList.add(employee);
        }
      }
    }
    System.out.println(employeeList); //Verify read data
  }
}
The program output:
[Employee [id=101, name=Lokesh Gupta, title=Author],
Employee [id=102, name=Brian Lara, title=Cricketer]]
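The Employee class imported from com.howtodoinjava.xml.model is not shown in this tutorial. A minimal sketch that would be compatible with the setters used above and with the printed output might look like the following. This is a reconstruction under assumptions: the field types, the getters, and the toString() format are inferred from the example, and the package declaration is omitted to keep the sketch self-contained.

```java
// Hypothetical reconstruction of the model class; the real class is not shown in the tutorial.
public class Employee {

  private Integer id;
  private String name;
  private String title;

  public Integer getId() { return id; }
  public void setId(Integer id) { this.id = id; }

  public String getName() { return name; }
  public void setName(String name) { this.name = name; }

  public String getTitle() { return title; }
  public void setTitle(String title) { this.title = title; }

  // Format chosen to match the program output shown in this tutorial.
  @Override
  public String toString() {
    return "Employee [id=" + id + ", name=" + name + ", title=" + title + "]";
  }

  public static void main(String[] args) {
    Employee employee = new Employee();
    employee.setId(101);
    employee.setName("Lokesh Gupta");
    employee.setTitle("Author");
    System.out.println(employee);
  }
}
```

Running the main method prints a single entry in the same format as the list output above.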
4. Cursor API Example
I will read the same employees.xml file, now with the cursor-based API.
import com.howtodoinjava.xml.model.Employee;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class ReadXmlWithCursor {

  public static void main(String[] args) throws FileNotFoundException, XMLStreamException {
    //All employee objects that are read will be added to this list
    List<Employee> employeeList = new ArrayList<>();
    //Employee currently being read. It gets its data through setter methods
    //and is finally stored in the 'employeeList' above
    Employee employee = null;
    File file = new File("employees.xml");
    XMLInputFactory factory = XMLInputFactory.newInstance();
    XMLStreamReader streamReader = factory.createXMLStreamReader(new FileReader(file));
    while (streamReader.hasNext()) {
      //Move to the next event
      streamReader.next();
      //Check if it is a 'START_ELEMENT'
      if (streamReader.getEventType() == XMLStreamReader.START_ELEMENT) {
        //employee tag - opened
        if (streamReader.getLocalName().equalsIgnoreCase("employee")) {
          //Create a new employee object as soon as the tag is opened
          employee = new Employee();
          //Read attributes within the employee tag
          if (streamReader.getAttributeCount() > 0) {
            String id = streamReader.getAttributeValue(null, "id");
            employee.setId(Integer.valueOf(id));
          }
        }
        //Read name data; getElementText() leaves the cursor on the closing tag
        if (streamReader.getLocalName().equalsIgnoreCase("name")) {
          employee.setName(streamReader.getElementText());
        }
        //Read title data
        if (streamReader.getLocalName().equalsIgnoreCase("title")) {
          employee.setTitle(streamReader.getElementText());
        }
      }
      //If the employee tag is closed, add the employee object to the list
      if (streamReader.getEventType() == XMLStreamReader.END_ELEMENT) {
        if (streamReader.getLocalName().equalsIgnoreCase("employee")) {
          employeeList.add(employee);
        }
      }
    }
    //Verify read data
    System.out.println(employeeList);
  }
}
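Neither example above closes its reader or the underlying file. As a hedged sketch of one way to handle that with the cursor API, the following closes the XMLStreamReader in a finally block and manages the input stream separately with try-with-resources, since XMLStreamReader.close() does not close the underlying input source. The class name CloseStaxReader, the method countEmployees, and the in-memory sample data are assumptions for illustration; with a real file you could pass a FileInputStream instead.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class, not part of the original tutorial code.
public class CloseStaxReader {

  // Counts <employee> start tags and always releases the parser's resources.
  static int countEmployees(InputStream in) throws XMLStreamException {
    XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
    try {
      int count = 0;
      while (reader.hasNext()) {
        if (reader.next() == XMLStreamConstants.START_ELEMENT
            && "employee".equals(reader.getLocalName())) {
          count++;
        }
      }
      return count;
    } finally {
      // Frees parser resources; the InputStream itself stays open.
      reader.close();
    }
  }

  public static void main(String[] args) throws IOException, XMLStreamException {
    byte[] xml = "<employees><employee id=\"101\"/><employee id=\"102\"/></employees>"
        .getBytes(StandardCharsets.UTF_8);
    // try-with-resources closes the stream that the reader leaves open.
    try (InputStream in = new ByteArrayInputStream(xml)) {
      System.out.println(countEmployees(in));
    }
  }
}
```

Passing an InputStream rather than a FileReader also lets the parser detect the document encoding itself instead of relying on the platform default.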
5. Conclusion
In this StAX parser tutorial, we learned the following things:
- What the StAX parser is and how it builds on the XML streaming API.
- The difference between the StAX and SAX parsers.
- How to read XML with the StAX iterator API, with an example.
- How to read XML with the StAX cursor API, with an example.
Both APIs are capable of parsing any kind of XML document, but the cursor API is more memory-efficient than the iterator API because it does not create an event object for every item it reads. So, if your application needs better performance, consider using the cursor-based API.
Drop me your questions in the comments section.
Happy Learning !!