StAX, the Streaming API for XML, is a new API for pull-parsing of XML, developed under the Java Community Process as JSR 173. This post gives an introduction to this API, which combines the efficiency of SAX with the ease of use of tree-based APIs.

Most XML parsers fall into two broad categories: tree based (e.g., DOM) or event based (e.g., SAX). Although StAX is more closely aligned with the latter, it bridges the gap between the two. In SAX, data is pushed via events to application code handlers. In StAX, the application "pulls" the data from the XML data stream at its convenience. Application code can filter, skip tags, or stop parsing at any time. The application, not the parser, is in control, which enables a more intuitive way to process data.
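To make the "stop parsing at any time" point concrete, here is a minimal sketch that pulls events until it finds the first title element and then returns without reading the rest of the document. It uses XMLStreamReader from the cursor API described later in this post. The class name PullControlSketch, the method firstTitle, and the sample XML are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class PullControlSketch {

    // Hypothetical helper: returns the text of the first <title> element,
    // or null if the document has none.
    public static String firstTitle(String xml) throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                // The application asks for the next event; nothing is pushed to it.
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "title".equals(reader.getLocalName())) {
                    // Return as soon as we have what we need; later events are never requested.
                    return reader.getElementText();
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        // Sample document invented for this sketch.
        String xml = "<books><book><title>StAX Basics</title></book>"
                   + "<book><title>Never Requested</title></book></books>";
        System.out.println(firstTitle(xml));
    }
}
```

With a SAX handler, the usual way to stop early is to throw an exception out of the callback. Here the loop simply returns.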

StAX API

The StAX API exposes methods for iterative, event-based processing of XML documents. XML documents are treated as a filtered series of events, and infoset states can be stored in a procedural fashion. Moreover, unlike SAX, the StAX API is bidirectional, enabling both reading and writing of XML documents.

The StAX API is really two distinct API sets: a cursor API and an iterator API.

Cursor API

As the name implies, the StAX cursor API represents a cursor with which you can walk an XML document from beginning to end. This cursor can point to one thing at a time, and always moves forward, never backward, usually one infoset element at a time.

The two main cursor interfaces are XMLStreamReader and XMLStreamWriter.

XMLStreamReader

An instance of XMLStreamReader is used to read XML content. The StAX API provides XMLInputFactory to create an instance of XMLStreamReader.

Each call to the next() method of XMLStreamReader advances the cursor and returns an int identifying the current event. The event types are defined as the following constants:

  • XMLStreamConstants.START_ELEMENT
  • XMLStreamConstants.END_ELEMENT
  • XMLStreamConstants.PROCESSING_INSTRUCTION
  • XMLStreamConstants.CHARACTERS
  • XMLStreamConstants.COMMENT
  • XMLStreamConstants.SPACE
  • XMLStreamConstants.START_DOCUMENT
  • XMLStreamConstants.END_DOCUMENT
  • XMLStreamConstants.ENTITY_REFERENCE
  • XMLStreamConstants.ATTRIBUTE
  • XMLStreamConstants.DTD
  • XMLStreamConstants.CDATA
  • XMLStreamConstants.NAMESPACE
  • XMLStreamConstants.NOTATION_DECLARATION
  • XMLStreamConstants.ENTITY_DECLARATION

Depending on the event, you can get more information by calling the methods appropriate to that event. For example, if next() returns START_ELEMENT, then calling getLocalName() returns the local name of the element.
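The read loop described above can be sketched as follows. The class name CursorReadSketch, the describe method, and the sample document are assumptions made for this example. The IS_COALESCING property is set so that adjacent character data is reported as a single CHARACTERS event.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class CursorReadSketch {

    // Hypothetical helper: walks the document and records one line per
    // start tag, end tag, and non-whitespace text run.
    public static List<String> describe(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Ask the parser to merge adjacent character data into one event.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));

        List<String> lines = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                int event = reader.next(); // advance the cursor by one event
                switch (event) {
                    case XMLStreamConstants.START_ELEMENT:
                        lines.add("start " + reader.getLocalName());
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        if (!reader.isWhiteSpace()) {
                            lines.add("text " + reader.getText());
                        }
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        lines.add("end " + reader.getLocalName());
                        break;
                    default:
                        // Other event types are ignored in this sketch.
                        break;
                }
            }
        } finally {
            reader.close();
        }
        return lines;
    }

    public static void main(String[] args) throws XMLStreamException {
        for (String line : describe("<greeting>Hello</greeting>")) {
            System.out.println(line);
        }
    }
}
```

Which accessor methods are legal depends on the current event. For instance, getLocalName() is meaningful on START_ELEMENT and END_ELEMENT but not on CHARACTERS, which is why the switch calls it only in those branches.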

XMLStreamWriter

An instance of XMLStreamWriter is used to write XML content to an output. The StAX API provides XMLOutputFactory to create an instance of XMLStreamWriter.

This writer can then be used to write events. For example:

To write a start element: writer.writeStartElement("Name")

To write an end element: writer.writeEndElement()

To write a comment: writer.writeComment("This is a comment")
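Put together, the calls above might look like the following sketch, which writes a small document to a StringWriter. The class name CursorWriteSketch and the element content "StAX" are invented for illustration. The exact form of the XML declaration in the output can vary between StAX implementations, so it is not shown here.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class CursorWriteSketch {

    // Hypothetical helper: builds a one-element document in memory
    // and returns it as a string.
    public static String write() throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);

        writer.writeStartDocument();               // XML declaration
        writer.writeComment("This is a comment");
        writer.writeStartElement("Name");          // <Name>
        writer.writeCharacters("StAX");            // element text, escaped as needed
        writer.writeEndElement();                  // closes the innermost open element
        writer.writeEndDocument();
        writer.close();                            // does not close the underlying StringWriter

        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(write());
    }
}
```

Note that writeEndElement() takes no name: the writer tracks which element is open, so calls must be made in document order.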

Iterator API

The StAX iterator API represents an XML document stream as a set of discrete event objects. These events are pulled by the application and provided by the parser in the order in which they are read in the source XML document.

The base iterator interface is called XMLEvent, and there is a subinterface for each event type. The primary parser interface for reading iterator events is XMLEventReader, and the primary interface for writing iterator events is XMLEventWriter. The XMLEventReader interface declares only a few methods, the most important of which is nextEvent(), which returns the next event in an XML stream. XMLEventReader extends java.util.Iterator, which means that an XMLEventReader can be passed into routines that work with the standard Java Iterator, and the events it returns can be cached.
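As a rough illustration of both points, the sketch below reads element names through nextEvent() and also hands an XMLEventReader to a generic routine that only knows about java.util.Iterator. The class name IteratorReadSketch, the helpers elementNames and count, and the sample XML are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

public class IteratorReadSketch {

    // A generic routine that knows nothing about StAX.
    public static int count(Iterator<?> items) {
        int n = 0;
        while (items.hasNext()) {
            items.next();
            n++;
        }
        return n;
    }

    // Hypothetical helper: collects the local name of every start tag.
    public static List<String> elementNames(String xml) throws XMLStreamException {
        XMLEventReader reader =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        List<String> names = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent(); // each event is a self-contained object
                if (event.isStartElement()) {
                    names.add(event.asStartElement().getName().getLocalPart());
                }
            }
        } finally {
            reader.close();
        }
        return names;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<order><item/><item/></order>";
        System.out.println(elementNames(xml));

        // The reader itself can be passed where a plain Iterator is expected.
        XMLEventReader reader =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        System.out.println(count(reader) + " events");
        reader.close();
    }
}
```

Unlike the cursor API, where the reader's state changes on every call to next(), each XMLEvent here remains valid after the reader moves on, which is what makes caching events in a list practical.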

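Similarly, on the output side of the iterator API, XMLEventWriter provides add(XMLEvent) to write a single event and add(XMLEventReader) to write every event from a reader, along with flush() and close().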
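One way to see the reader and writer halves of the iterator API working together is a small filter that copies a document event by event and drops the comments. This is only a sketch: the class name IteratorWriteSketch, the method copyWithoutComments, and the sample XML are invented for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

public class IteratorWriteSketch {

    // Hypothetical helper: copies the input to a string, skipping comment events.
    public static String copyWithoutComments(String xml) throws XMLStreamException {
        XMLEventReader reader =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        StringWriter out = new StringWriter();
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
        try {
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                // Events read from the parser can be handed straight to the writer.
                if (event.getEventType() != XMLStreamConstants.COMMENT) {
                    writer.add(event);
                }
            }
            writer.flush();
        } finally {
            writer.close();
            reader.close();
        }
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(copyWithoutComments("<a><!-- internal note --><b>hi</b></a>"));
    }
}
```

When no filtering is needed, a single writer.add(reader) call copies all remaining events in one step.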

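Summing up, StAX gives control of XML processing to the client application rather than to the parser. Because it streams the document instead of building an in-memory tree, it is typically faster and more memory-efficient than tree-based APIs such as DOM.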
