Learn to parse and read an XML file using the Java StAX parser. StAX (Streaming API for XML) provides two ways to parse XML: a cursor-based API and an iterator-based API.
1) StAX Parser
Just like the SAX parser, the StAX API is designed for parsing XML streams. The differences are:
- StAX is a “pull” API. SAX is a “push” API.
- StAX can do both XML reading and writing. SAX can only do XML reading.
StAX is a pull-style API. This means that you have to move the StAX parser from item to item in the XML file yourself, just like you do with a standard Iterator or a JDBC ResultSet. You can then access the XML information via the StAX parser for each such “item” encountered in the XML file.
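The comparison above notes that StAX can also write XML, which the rest of this tutorial does not show. Here is a minimal sketch of the writing side using the standard `XMLStreamWriter`. The class name `StaxWriteSketch` and the method `writeEmployee` are made up for this illustration, and the element names simply mirror the employee data used later.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper class, not part of the original tutorial.
class StaxWriteSketch {

    // Writes a single <employee> element into an in-memory string.
    static String writeEmployee(String id, String name) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        try {
            writer.writeStartElement("employee");
            writer.writeAttribute("id", id);
            writer.writeStartElement("name");
            writer.writeCharacters(name);
            writer.writeEndElement(); // closes <name>
            writer.writeEndElement(); // closes <employee>
            writer.flush();
        } finally {
            writer.close();
        }
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(writeEmployee("101", "Lokesh Gupta"));
    }
}
```

Just as with reading, you drive the writer yourself, one call per start tag, attribute, text node and end tag.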
Cursor vs Iterator
- While reading the XML document, the iterator reader returns an XML event object from its nextEvent() calls. This event tells you what type of XML construct (element, text, comment etc.) you have encountered. The event is immutable, so you can safely pass it around your application for processing.

    XMLEventReader reader = ...;

    while (reader.hasNext()) {
        XMLEvent event = reader.nextEvent();

        if (event.getEventType() == XMLEvent.START_ELEMENT) {
            //process data
        }
        //... more event types handled here...
    }

- Unlike the iterator, the cursor works like a ResultSet in JDBC. It moves the cursor to the next event in the XML document. You can then call methods directly on the cursor to obtain more information about the current event.

    XMLStreamReader streamReader = ...;

    while (streamReader.hasNext()) {
        int eventType = streamReader.next();

        if (eventType == XMLStreamReader.START_ELEMENT) {
            System.out.println(streamReader.getLocalName());
        }
        //... more event types handled here...
    }

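The two snippets above leave out how the readers are created. As a rough, self-contained sketch, the following parses the same small in-memory document with both APIs and collects the element names. The class name `CursorVsIteratorSketch`, its method names and the sample string are assumptions made only for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.events.XMLEvent;

// Hypothetical comparison class, not part of the original tutorial.
class CursorVsIteratorSketch {

    static final String XML =
            "<employees><employee id=\"101\"><name>Lokesh Gupta</name></employee></employees>";

    // Iterator API: every nextEvent() call hands back a separate XMLEvent object.
    static List<String> namesWithIterator(String xml) throws XMLStreamException {
        List<String> names = new ArrayList<>();
        XMLEventReader reader =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
                names.add(event.asStartElement().getName().getLocalPart());
            }
        }
        reader.close();
        return names;
    }

    // Cursor API: next() returns only an int code; details are read from the reader itself.
    static List<String> namesWithCursor(String xml) throws XMLStreamException {
        List<String> names = new ArrayList<>();
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                names.add(reader.getLocalName());
            }
        }
        reader.close();
        return names;
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(namesWithIterator(XML)); // [employees, employee, name]
        System.out.println(namesWithCursor(XML));   // [employees, employee, name]
    }
}
```

Both methods see the same sequence of start elements. The difference is only in whether the information arrives as an event object or is read from the reader's current position.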
2) StAX Iterator API Example
The example below demonstrates how to use the StAX iterator-based API to read an XML document into objects.
XML file
    <employees>
        <employee id="101">
            <name>Lokesh Gupta</name>
            <title>Author</title>
        </employee>
        <employee id="102">
            <name>Brian Lara</name>
            <title>Cricketer</title>
        </employee>
    </employees>
Read XML with StAX Iterator
To read the file, I have written the program in these steps:
- Create the iterator and start receiving events.
- As soon as you get the opening 'employee' tag, create a new Employee object.
- Read the id attribute from the employee tag and set it on the current Employee object.
- Iterate to the next start tag events. These are the XML elements inside the employee tag. Read the data inside these tags and set it on the current Employee object.
- Continue iterating over the events. When you find the end element event for the 'employee' tag, you have read all the data for the current employee, so add the current Employee object to the employeeList collection.
- At last, verify the read data by printing the employeeList.
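The steps above populate an Employee object, but the Employee class itself is not listed in this tutorial. The following is a minimal sketch of what it could look like. The field types and the toString() format are assumptions inferred from the setter calls and the printed output in the examples, not the author's actual class. `EmployeeDemo` is a made-up name used only to show usage.

```java
// Assumed shape of the Employee class used by the examples in this tutorial.
class Employee {

    private Integer id;
    private String name;
    private String title;

    public Integer getId() { return id; }
    public void setId(Integer id) { this.id = id; }

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public String getTitle() { return title; }
    public void setTitle(String title) { this.title = title; }

    // Format chosen to match the output printed by the examples.
    @Override
    public String toString() {
        return "Employee [id=" + id + ", name=" + name + ", title=" + title + "]";
    }
}

// Hypothetical usage demo.
class EmployeeDemo {
    public static void main(String[] args) {
        Employee employee = new Employee();
        employee.setId(101);
        employee.setName("Lokesh Gupta");
        employee.setTitle("Author");
        System.out.println(employee); // Employee [id=101, name=Lokesh Gupta, title=Author]
    }
}
```

To use it with the programs in this tutorial, the class would need to live in the same package as the reader class, or be made public and imported.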
    package com.howtodoinjava.demo.stax;

    import java.io.File;
    import java.io.FileNotFoundException;
    import java.io.FileReader;
    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.List;

    import javax.xml.namespace.QName;
    import javax.xml.stream.XMLEventReader;
    import javax.xml.stream.XMLInputFactory;
    import javax.xml.stream.XMLStreamException;
    import javax.xml.stream.events.Attribute;
    import javax.xml.stream.events.Characters;
    import javax.xml.stream.events.EndElement;
    import javax.xml.stream.events.StartElement;
    import javax.xml.stream.events.XMLEvent;

    public class ReadXMLExample {

        public static void main(String[] args) throws FileNotFoundException, XMLStreamException {

            File file = new File("employees.xml");

            // Factory that creates the StAX readers
            XMLInputFactory factory = XMLInputFactory.newInstance();

            // Event reader (iterator API) over the XML file
            XMLEventReader eventReader = factory.createXMLEventReader(new FileReader(file));

            // All employees read from the file are added to this list
            List<Employee> employeeList = new ArrayList<>();

            // The employee currently being read. It gets its data through setter methods
            // and is finally stored in 'employeeList' above.
            Employee employee = null;

            // Keep pulling events while more are available
            while (eventReader.hasNext()) {
                XMLEvent xmlEvent = eventReader.nextEvent();

                if (xmlEvent.isStartElement()) {
                    StartElement startElement = xmlEvent.asStartElement();

                    // As soon as the employee tag is opened, create a new Employee object
                    if ("employee".equalsIgnoreCase(startElement.getName().getLocalPart())) {
                        employee = new Employee();
                    }

                    // Read all attributes when a start tag is being read
                    @SuppressWarnings("unchecked")
                    Iterator<Attribute> iterator = startElement.getAttributes();

                    while (iterator.hasNext()) {
                        Attribute attribute = iterator.next();
                        QName name = attribute.getName();
                        // The null check avoids a NullPointerException if an 'id' attribute
                        // appears before any employee tag has been opened
                        if (employee != null && "id".equalsIgnoreCase(name.getLocalPart())) {
                            employee.setId(Integer.valueOf(attribute.getValue()));
                        }
                    }

                    // Every time a content tag is found, move the iterator and read its data.
                    // The casts assume the element contains non-empty text.
                    switch (startElement.getName().getLocalPart()) {
                        case "name":
                            Characters nameDataEvent = (Characters) eventReader.nextEvent();
                            employee.setName(nameDataEvent.getData());
                            break;

                        case "title":
                            Characters titleDataEvent = (Characters) eventReader.nextEvent();
                            employee.setTitle(titleDataEvent.getData());
                            break;
                    }
                }

                if (xmlEvent.isEndElement()) {
                    EndElement endElement = xmlEvent.asEndElement();

                    // If the employee tag is closed, add the employee object to the list
                    // and be ready to read the next employee's data
                    if ("employee".equalsIgnoreCase(endElement.getName().getLocalPart())) {
                        employeeList.add(employee);
                    }
                }
            }

            System.out.println(employeeList); // Verify read data
        }
    }

    //Output: [Employee [id=101, name=Lokesh Gupta, title=Author], Employee [id=102, name=Brian Lara, title=Cricketer]]
3) StAX Cursor API Example
I will read the same employees.xml file, now with the cursor-based API.
    package com.howtodoinjava.demo.stax;

    import java.io.File;
    import java.io.FileNotFoundException;
    import java.io.FileReader;
    import java.util.ArrayList;
    import java.util.List;

    import javax.xml.stream.XMLInputFactory;
    import javax.xml.stream.XMLStreamException;
    import javax.xml.stream.XMLStreamReader;

    public class ReadXMLExample {

        public static void main(String[] args) throws FileNotFoundException, XMLStreamException {

            // All employees read from the file are added to this list
            List<Employee> employeeList = new ArrayList<>();

            // The employee currently being read. It gets its data through setter methods
            // and is finally stored in 'employeeList' above.
            Employee employee = null;

            File file = new File("employees.xml");
            XMLInputFactory factory = XMLInputFactory.newInstance();
            XMLStreamReader streamReader = factory.createXMLStreamReader(new FileReader(file));

            while (streamReader.hasNext()) {
                // Move to the next event
                streamReader.next();

                // Check if it is a 'START_ELEMENT'
                if (streamReader.getEventType() == XMLStreamReader.START_ELEMENT) {
                    // Capture the name first: getElementText() below moves the cursor
                    // forward to the matching END_ELEMENT
                    String elementName = streamReader.getLocalName();

                    if (elementName.equalsIgnoreCase("employee")) {
                        // Create a new employee object as soon as the tag is opened
                        employee = new Employee();

                        // Read the id attribute of the employee tag, if present
                        String id = streamReader.getAttributeValue(null, "id");
                        if (id != null) {
                            employee.setId(Integer.valueOf(id));
                        }
                    } else if (elementName.equalsIgnoreCase("name")) {
                        // Read name data
                        employee.setName(streamReader.getElementText());
                    } else if (elementName.equalsIgnoreCase("title")) {
                        // Read title data
                        employee.setTitle(streamReader.getElementText());
                    }
                }

                // If the employee tag is closed, add the employee object to the list
                if (streamReader.getEventType() == XMLStreamReader.END_ELEMENT) {
                    if (streamReader.getLocalName().equalsIgnoreCase("employee")) {
                        employeeList.add(employee);
                    }
                }
            }

            // Verify read data
            System.out.println(employeeList);
        }
    }

    //Output: [Employee [id=101, name=Lokesh Gupta, title=Author], Employee [id=102, name=Brian Lara, title=Cricketer]]
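The programs above never close the reader or the underlying file. As a hedged sketch of one way to handle that with the cursor API: XMLStreamReader is not AutoCloseable, and its close() method does not close the underlying input source, so the stream is managed separately here with try-with-resources. The class name `CursorCloseSketch` and the method `readNames` are made up for this illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, not part of the original tutorial.
class CursorCloseSketch {

    // Collects the text of every <name> element in the given file.
    static List<String> readNames(Path xmlFile) throws IOException, XMLStreamException {
        List<String> names = new ArrayList<>();
        XMLInputFactory factory = XMLInputFactory.newInstance();

        // try-with-resources closes the file stream even if parsing fails
        try (InputStream in = Files.newInputStream(xmlFile)) {
            XMLStreamReader reader = factory.createXMLStreamReader(in);
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && "name".equals(reader.getLocalName())) {
                        names.add(reader.getElementText());
                    }
                }
            } finally {
                // Frees the parser's resources; the stream is closed by the outer try
                reader.close();
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException, XMLStreamException {
        System.out.println(readNames(Paths.get("employees.xml")));
    }
}
```

Passing an InputStream instead of a FileReader also lets the parser work out the character encoding from the XML document itself, rather than relying on the platform default.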
4) Summary
So in this StAX parser tutorial, we learned the following things:
- What the StAX parser is, based on the XML streaming API.
- The difference between the StAX and SAX parsers.
- How to read XML with the StAX iterator API, with an example.
- How to read XML with the StAX cursor API, with an example.
Both APIs are capable of parsing any kind of XML document, but the cursor API is more memory-efficient than the iterator API because it does not create an event object for every item it reads. So, if your application needs better performance, consider using the cursor-based API.
Drop me your questions in the comments section.
Happy Learning !!