Java is one of the most widely used programming languages in the world today. Once you compile a Java program, you can run it on any system that has a Java Virtual Machine, regardless of the architecture of that system. It is object-oriented and class-based, and much of its syntax is derived from C and C++. This makes it both powerful and flexible. Its standard library includes APIs for processing XML, which makes it even more relevant in today’s internet-based world. You will find the language easier to learn if you already know C or C++. Java can also be picked up as a first language; this beginner’s Java course can help you. We teach you everything you need to know about Java programming, from the basics all the way up to the advanced stuff.

SAX Parser

In this tutorial, we’ll give you an idea of how the SAX parser works. The tutorial is designed to be as easy to understand as possible, but you still need to be familiar with the basics of Java and XML. You can take this Java course to help you ramp up fast.

First, do you know what a parser is? A parser is a computer program that breaks its input down into parts according to a set of rules. These parts are then passed on to other programs, if necessary. A parser may be part of a compiler. The input it receives may include interactive commands, source instructions or markup tags.

There are two main types of parsers commonly used for XML: DOM and SAX. The Document Object Model (DOM) parser loads an entire XML document and operates on it as a whole, while the Simple API for XML (SAX) parser treats an XML document as a sequence of parts and operates on one part at a time. In other words, SAX is a sequential access API. Learn more about XML with this course.
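For contrast with the SAX code later in this tutorial, here is a minimal sketch of the DOM approach. It is only an illustration: the class name DomContrastSketch and the method countElements are made up for this example, and the XML is passed in as a string to keep the sketch self-contained.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical class name, used only for this illustration.
class DomContrastSketch {

    // DOM reads the whole document into an in-memory tree first;
    // only then can we query the tree for the elements we want.
    static int countElements(String xml, String tagName) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList nodes = doc.getElementsByTagName(tagName);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<organization><employee/><employee/></organization>";
        System.out.println(countElements(xml, "employee"));
    }
}
```

Because the full tree is built before any query runs, memory use grows with the size of the document. A SAX parser avoids this by reporting each part as it is read.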

SAX parsers are often preferred over DOM parsers for large documents, for several reasons. Because a SAX parser handles one piece of the document at a time, it requires less memory than a DOM parser. This often makes it faster as well, especially for large XML documents. DOM parsers create objects and store them in a tree structure, which requires quite a bit of memory. SAX parsers, on the other hand, rely on callback methods to operate. There are 3 callback methods a SAX parser needs:

startElement(): called when the parser reaches an opening tag.
endElement(): called when the parser reaches a closing tag.
characters(): called when the parser reads the text between tags.

These methods let the SAX parser report what it finds in the XML document back to your code. You supply them by creating a class that extends the base SAX handler class, org.xml.sax.helpers.DefaultHandler, and overriding the methods you need; the parser then calls them as it reads the document. To learn more about how HTML and XML are parsed, you can take this course on web development.
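As a rough sketch of the callback idea, the handler below extends DefaultHandler and simply records the order in which the parser calls it. The names CallbackTraceSketch, TraceHandler and trace are hypothetical and exist only for this illustration; the XML is read from a string rather than a file to keep the example short.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class names, used only for this illustration.
class CallbackTraceSketch {

    // Overrides the callbacks and appends one entry per call.
    static class TraceHandler extends DefaultHandler {
        final StringBuilder trace = new StringBuilder();

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes attributes) {
            trace.append("start:").append(qName).append(' ');
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            trace.append("end:").append(qName).append(' ');
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Skip chunks that contain only whitespace (indentation, newlines).
            String text = new String(ch, start, length).trim();
            if (!text.isEmpty()) {
                trace.append("text:").append(text).append(' ');
            }
        }
    }

    // Parses the XML string and returns the recorded sequence of callbacks.
    static String trace(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        TraceHandler handler = new TraceHandler();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.trace.toString().trim();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(trace("<employee><firstname>john</firstname></employee>"));
    }
}
```

The recorded sequence mirrors the document order: every opening tag produces a start entry, and the matching closing tag produces an end entry after everything nested inside it.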

Creating an XML File

You need to create an XML file before you can begin using the SAX parser. If you’re unfamiliar with XML, we recommend taking one of our web development courses for an introduction to the language.

Let’s create a simple XML file (organization.xml) that you can use with the SAX parser. We’ll include two employees, each with a first name, last name, nickname and salary:

<?xml version="1.0"?>
<organization>
    <employee>
        <firstname>john</firstname>
        <lastname>smith</lastname>
        <nickname>js</nickname>
        <salary>200000</salary>
    </employee>
    <employee>
        <firstname>Harry</firstname>
        <lastname>Smith</lastname>
        <nickname>hs</nickname>
        <salary>300000</salary>
    </employee>
</organization>

Now that we have an XML file, we can parse it with a SAX parser. The program needs to import several classes from the javax.xml.parsers and org.xml.sax packages, as in the example below. Our goal is to take the elements from the XML document organization.xml one part at a time, as is the goal of a parser. The handler in this example prints each element and its value as the parser reaches it; the same callbacks could instead collect the values into an Organization object that holds the employees in list form. Here is how it’s done:

import java.io.File;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class OrganizationXMLRead {

    public static void main(String[] args) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser saxParser = factory.newSAXParser();

            DefaultHandler handler = new DefaultHandler() {

                // Flags that record which element the parser has just entered
                boolean bfirstname = false;
                boolean blastname = false;
                boolean bnickname = false;
                boolean bthesalary = false;

                // Called for every opening tag
                @Override
                public void startElement(String uri, String localName, String qName,
                        Attributes attributes) throws SAXException {
                    System.out.println("Start Element Method :" + qName);
                    if (qName.equalsIgnoreCase("FIRSTNAME")) {
                        bfirstname = true;
                    }
                    if (qName.equalsIgnoreCase("LASTNAME")) {
                        blastname = true;
                    }
                    if (qName.equalsIgnoreCase("NICKNAME")) {
                        bnickname = true;
                    }
                    if (qName.equalsIgnoreCase("SALARY")) {
                        bthesalary = true;
                    }
                }

                // Called for every closing tag
                @Override
                public void endElement(String uri, String localName,
                        String qName) throws SAXException {
                    System.out.println("End Element Method :" + qName);
                }

                // Called for the text between tags; prints it and clears the flag
                @Override
                public void characters(char[] ch, int start, int length) throws SAXException {
                    if (bfirstname) {
                        System.out.println("Employee first name : " + new String(ch, start, length));
                        bfirstname = false;
                    }
                    if (blastname) {
                        System.out.println("Employee last name : " + new String(ch, start, length));
                        blastname = false;
                    }
                    if (bnickname) {
                        System.out.println("Employee nick name : " + new String(ch, start, length));
                        bnickname = false;
                    }
                    if (bthesalary) {
                        System.out.println("Employee salary : " + new String(ch, start, length));
                        bthesalary = false;
                    }
                }
            };

            // Pass a File rather than a String, which parse() would interpret as a URI
            saxParser.parse(new File("c:\\organization.xml"), handler);

        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Output: If all goes well, the program prints the contents of the XML document we created. Each element is bracketed by a "Start Element Method" line and an "End Element Method" line (written with System.out.println), which help you understand where the parser invoked the callback methods.
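The printing example can be adapted to collect the values into objects instead. The sketch below is one possible way to do that, not the only one: the names OrganizationListSketch, Employee and EmployeeHandler are hypothetical, the element names are taken from organization.xml above, and the XML is read from a string to keep the example self-contained. It also accumulates text in a StringBuilder, because a SAX parser is allowed to deliver the text of one element across several characters() calls.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class names, used only for this illustration.
class OrganizationListSketch {

    // Simple holder for one <employee> element.
    static class Employee {
        String firstName;
        String lastName;
        String nickname;
        long salary;
    }

    static class EmployeeHandler extends DefaultHandler {
        final List<Employee> employees = new ArrayList<>();
        private final StringBuilder text = new StringBuilder();
        private Employee current;

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes attributes) {
            text.setLength(0); // discard text seen before this opening tag
            if (qName.equals("employee")) {
                current = new Employee();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length); // may be called more than once per element
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (current == null) {
                return; // closing tag outside any <employee>
            }
            String value = text.toString().trim();
            switch (qName) {
                case "firstname": current.firstName = value; break;
                case "lastname": current.lastName = value; break;
                case "nickname": current.nickname = value; break;
                case "salary": current.salary = Long.parseLong(value); break;
                case "employee": employees.add(current); current = null; break;
                default: break;
            }
        }
    }

    // Parses the XML string and returns the employees in document order.
    static List<Employee> parse(String xml) throws Exception {
        EmployeeHandler handler = new EmployeeHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.employees;
    }
}
```

Calling OrganizationListSketch.parse() with the contents of organization.xml would return a list of two Employee objects, which the rest of a program could then use without touching the XML again.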

You will better understand this example if you type it and run it on your own. You will need to understand how parsers work when you develop your websites. This guide on website development can help you learn how the pieces all come together.

Page Last Updated: April 2014
