Java SAX Parser – Read XML Example

SAX, the Simple API for XML, has been around for many years. Its development was originally led by David Megginson before the turn of the millennium. In those days, you had to download the Java version of SAX from David’s personal web site. This developed into the SAX Project before finally being added to Java Standard Edition 1.4.

SAX is a streaming interface for XML. An application using SAX receives event notifications about the document being processed, one element (with its attributes) at a time, in sequential order, starting at the top of the document and ending with the closing of the root element. Because the parser never builds the whole document in memory, it processes XML in linear time and places few demands on system memory.
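To make the event order concrete, here is a minimal sketch that records each callback as it arrives. The class name SaxEventTrace and its trace method are made up for this illustration and are not part of the tutorial code. Checked exceptions are wrapped only to keep the sketch short.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records SAX events in the order the parser reports them.
class SaxEventTrace
{
    static List<String> trace(String xml)
    {
        final List<String> events = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler()
        {
            @Override
            public void startDocument() { events.add("startDocument"); }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes)
            {
                events.add("start:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length)
            {
                // Skip whitespace-only chunks so the trace stays readable
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty())
                {
                    events.add("text:" + text);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName)
            {
                events.add("end:" + qName);
            }

            @Override
            public void endDocument() { events.add("endDocument"); }
        };

        try
        {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        }
        catch (Exception e)
        {
            // Wrapped only to keep the sketch short
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args)
    {
        // Prints one event per line, in document order
        trace("<users><user id=\"100\"><firstName>Tom</firstName></user></users>")
            .forEach(System.out::println);
    }
}
```

Every start event is eventually matched by an end event for the same element, and nested elements open and close inside their parent. This is why the handler later in this post can track its position with a stack.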

Let’s create a demo program that reads an XML file with the SAX parser to see how this works in practice.

1. Prepare XML file to be parsed

This XML file contains attributes as well as elements.

<users>
	<user id="100">
		<firstName>Tom</firstName>
		<lastName>Hanks</lastName>
	</user>
	<user id="101">
		<firstName>Lokesh</firstName>
		<lastName>Gupta</lastName>
	</user>
	<user id="102">
		<firstName>HowToDo</firstName>
		<lastName>InJava</lastName>
	</user>
</users>

2. Create model class

package com.howtodoinjava.xml.sax;

/**
 * Model class. Its instances will be populated using SAX parser.
 * */
public class User
{
	//XML attribute id
	private int id;
	//XML element firstName
	private String firstName;
	//XML element lastName
	private String lastName;

	public int getId() {
		return id;
	}
	public void setId(int id) {
		this.id = id;
	}
	public String getFirstName() {
		return firstName;
	}
	public void setFirstName(String firstName) {
		this.firstName = firstName;
	}
	public String getLastName() {
		return lastName;
	}
	public void setLastName(String lastName) {
		this.lastName = lastName;
	}

	@Override
	public String toString() {
		return this.id + ":" + this.firstName +  ":" +this.lastName ;
	}
}

3. Build the handler by extending DefaultHandler

Below is the code for the parse handler. I have put additional information in the code comments. Still, if you have any query, drop me a comment.

package com.howtodoinjava.xml.sax;

import java.util.ArrayList;
import java.util.Stack;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class UserParserHandler extends DefaultHandler
{
	//This is the list which shall be populated while parsing the XML.
    private ArrayList<User> userList = new ArrayList<User>();

    //As we read any XML element we will push its name onto this stack
    private Stack<String> elementStack = new Stack<String>();

    //While a user block is being read, its User instance lives on this stack; when the block ends it is moved to userList
    private Stack<User> objectStack = new Stack<User>();

    public void startDocument() throws SAXException
    {
        //System.out.println("start of the document   : ");
    }

    public void endDocument() throws SAXException
    {
        //System.out.println("end of the document     : ");
    }

    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException
    {
    	//Push it in element stack
        this.elementStack.push(qName);

        //If this is start of 'user' element then prepare a new User instance and push it in object stack
        if ("user".equals(qName))
        {
            //New User instance
        	User user = new User();

            //Set all required attributes of the XML element here
            if(attributes != null && attributes.getLength() == 1)
            {
            	user.setId(Integer.parseInt(attributes.getValue(0)));
            }
            this.objectStack.push(user);
        }
    }

    public void endElement(String uri, String localName, String qName) throws SAXException
    {
    	//Remove the last added element
        this.elementStack.pop();

        //User instance has been constructed so pop it from object stack and push in userList
        if ("user".equals(qName))
        {
            User object = this.objectStack.pop();
            this.userList.add(object);
        }
    }

    /**
     * This will be called every time the parser encounters character data (a text node).
     * Note that the parser may deliver one text node in several calls.
     * */
    public void characters(char[] ch, int start, int length) throws SAXException
    {
        String value = new String(ch, start, length).trim();

        if (value.length() == 0)
        {
            return; // ignore white space
        }

        //handle the value based on to which element it belongs
        if ("firstName".equals(currentElement()))
        {
            User user = (User) this.objectStack.peek();
            user.setFirstName(value);
        }
        else if ("lastName".equals(currentElement()))
        {
            User user = (User) this.objectStack.peek();
            user.setLastName(value);
        }
    }

    /**
     * Utility method for getting the current element in processing
     * */
    private String currentElement()
    {
        return this.elementStack.peek();
    }

    //Accessor for userList object
    public ArrayList<User> getUsers()
    {
    	return userList;
    }
}
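One caveat about characters() in the handler above: it trims and stores each chunk as it arrives, but a SAX parser is allowed to split one text node across several characters() calls. This typically happens around entity references such as &amp;, so only the last chunk may survive. A common remedy is to append every chunk to a buffer and read the buffer in endElement(). The sketch below illustrates that idea with a made-up class, LeafTextCollector, which records the text of leaf elements as name=text pairs. It is an illustration under those assumptions, not a drop-in replacement for UserParserHandler.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: buffers characters() chunks until the element closes.
class LeafTextCollector extends DefaultHandler
{
    private final StringBuilder buffer = new StringBuilder();
    private final List<String> values = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes)
    {
        // A new element starts, so discard any text collected before it
        buffer.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length)
    {
        // May be called several times for one text node, so append instead of overwrite
        buffer.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName)
    {
        String text = buffer.toString().trim();
        if (!text.isEmpty())
        {
            values.add(qName + "=" + text);
        }
        buffer.setLength(0);
    }

    static List<String> parse(String xml)
    {
        try
        {
            LeafTextCollector handler = new LeafTextCollector();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.values;
        }
        catch (Exception e)
        {
            // Wrapped only to keep the sketch short
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args)
    {
        // Prints [description=Test & return values, isbn=0062380001]
        System.out.println(parse("<work><description>Test &amp; return values</description>"
            + "<isbn><![CDATA[ 0062380001 ]]></isbn></work>"));
    }
}
```

The same buffering approach also covers CDATA sections, because their content is reported through characters() as well.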

4. SAX parser to read XML file

package com.howtodoinjava.xml.sax;

import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

public class UsersXmlParser
{
	public ArrayList<User> parseXml(InputStream in)
	{
		//Create an empty list of users initially
		ArrayList<User> users = new ArrayList<User>();
		try
		{
			//Create default handler instance
			UserParserHandler handler = new UserParserHandler();

			//Create parser from factory
			XMLReader parser = XMLReaderFactory.createXMLReader();

			//Register handler with parser
			parser.setContentHandler(handler);

			//Create an input source from the XML input stream
			InputSource source = new InputSource(in);

			//parse the document
			parser.parse(source);

			//Replace the empty list created above with the parsed users; you can also return from here.
			users = handler.getUsers();

		} catch (SAXException e) {
			//The XML is not well-formed, or the handler failed
			e.printStackTrace();
		} catch (IOException e) {
			//The input stream could not be read
			e.printStackTrace();
		}
		return users;
	}
}
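Note that XMLReaderFactory has been deprecated since Java 9. On newer JDKs you may prefer to obtain the parser through javax.xml.parsers.SAXParserFactory. Here is a minimal sketch of that setup. The class JaxpUserIdReader and its readUserIds method are made up for this illustration. It also reads the id attribute by name with attributes.getValue("id") instead of by index.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class: shows the JAXP parser setup with a tiny handler
// that collects the id attribute of every <user> element.
class JaxpUserIdReader
{
    static List<Integer> readUserIds(InputStream in)
    {
        final List<Integer> ids = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler()
        {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes)
            {
                // Look the attribute up by name; getValue returns null when it is absent
                String id = attributes.getValue("id");
                if ("user".equals(qName) && id != null)
                {
                    ids.add(Integer.parseInt(id));
                }
            }
        };

        try
        {
            //Create the parser from the JAXP factory instead of XMLReaderFactory
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(in, handler);
        }
        catch (Exception e)
        {
            // Wrapped only to keep the sketch short
            throw new IllegalStateException(e);
        }
        return ids;
    }

    public static void main(String[] args)
    {
        String xml = "<users><user id=\"100\"/><user id=\"101\"/></users>";
        // Prints [100, 101]
        System.out.println(readUserIds(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))));
    }
}
```

SAXParser.parse accepts a DefaultHandler directly, so the handler from step 3 could be passed in the same way.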

5. Test SAX parser

Let’s write some code to test whether our handler actually works.

package com.howtodoinjava.xml.sax;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.util.ArrayList;

public class TestSaxParser
{
	public static void main(String[] args) throws FileNotFoundException
	{
		//Locate the file
		File xmlFile = new File("D:/temp/sample.xml");

		//Create the parser instance
		UsersXmlParser parser = new UsersXmlParser();

		//Parse the file
		ArrayList<User> users = parser.parseXml(new FileInputStream(xmlFile));

		//Verify the result
		System.out.println(users);
	}
}

Output:
[100:Tom:Hanks, 101:Lokesh:Gupta, 102:HowToDo:InJava]

Happy Learning !!


29 thoughts on “Java SAX Parser – Read XML Example”

  1. Need an help ….have used SAX-reader but for the description field for eg Test & return values
    the reader doesn’t read the complete value but reads value after &.

    i.e. it displays “return values” only

  2. <work>
                <ratings_sum type="integer">26798</ratings_sum>
                <ratings_count type="integer">7324</ratings_count>
            </work>
    

    using your method, how would you suggest i parse the info for ratings sum

  3. <isbn><![CDATA[ 0062380001 ]]></isbn>
    

    i’m trying to parse some data and i’m trying to get the isbn (above). i am aware that you deal with cdata within the part of the characters function where the value.length == 0, so do you have any idea i can access and obtain the isbn?

  4. Hi

    I want a generic solution for this . Means let say User has changed the xml , no need to those attributes in the Pojo class and i will run and will get correct output. Also anways suppose i will eliminate that Pojo also its very good . Bcoz i dont know what will be the xml format . So that it will work for the all xml input file . I want to go for only SAX parser.

    Thanks
    Tony

  5. Sax will support to read XML file with multiple root elements? XML file has multiple encoding declarions for every root element?

      • If we see below xml which has “multiple encoding declarations” for each “Contact” element..The is my target xml got from client. So please suggest me how to proceed with below formatted file.

        00000009151666829922

        1

        00000009151666829922

        1

        While reading getting error: The processing instruction target matching “[xX][mM][lL]” is not allowed

        Thanks for your quick response.

    • Try this

      
      URL oracle = new URL("http://www.oracle.com/");
      BufferedReader in = new BufferedReader(new InputStreamReader(oracle.openStream()));
      
      String inputLine;
      while ((inputLine = in.readLine()) != null)
             System.out.println(inputLine);
      in.close();
      
      
  6. Its a nice try, but SAX means parse on fly. but you holds the data into some Stacks and Arrays.. seems its not a good arch. idea

  7. hello can u plz me out by telling how parse any xml file without knowing the elements i mean the tags of xml.. jst knowing the xml file source… i want to parse webservices available on net jst by giving url n get the content of that xml content which is returned by the webservice. so how to get it done. plz help me

    • interesting question. what you are planning to do with data without knowing what you are getting? programmatically it’s possible, but I can not think of why somebody would like to do this.

      • example

        for example above only tag seperator will be same and inside content can vary, this is wen you are not sure about input
        how can we parse to read only the content between tag_seprator

  8. Hello, Thank you very much for your tutorial. I am learning SAX parsing at the same time that I have a very basic kwoledge of Java. I’ve been fiercely googling around trying to make sense of this error but ArrayList is driving me crazy.
    Can you tell me why your UsersXmlParser.java is throwing this compile error (I copy and paste your source code):

    ( just in case is relevant, I am writing in Linux )

    $ javac model.java UserParserHandler.java UsersXmlParser.java TestSAXParser.java
    UsersXmlParser.java:17: illegal start of type
    ArrayList users = new ArrayList();
    ^
    UsersXmlParser.java:17: ‘(‘ or ‘[‘ expected
    ArrayList users = new ArrayList();
    ^
    UsersXmlParser.java:17: illegal start of expression
    ArrayList users = new ArrayList();
    ^
    UsersXmlParser.java:17: illegal start of expression
    ArrayList users = new ArrayList();
    ^
    4 errors


    thank you again!
    Andres

  9. Hey nice tutorial! However I spot a little bug in the sample code paste on this page: in the xml file, value node “firstname” and “lastname” should be changed to “firstName” and “lastName”, otherwise the firstname and lastname value will never be set to the User object. Anyway great tutorial buddy!

  10. Excellent article. Thanks. I’ve been trying to wrap my head around an XML parsing example I found elsewhere and it wasn’t making sense. Yours is the cleanest, clearest example I’ve found. Thanks a bunch.



HowToDoInJava

A blog about Java and its related technologies, the best practices, algorithms, interview questions, scripting languages, and Python.