SAX XMLReader, XMLFilter, ContentHandler and XMLWriter question

Discussion in 'XML' started by Jeff Calico, Feb 22, 2006.

  1. Jeff Calico

    Jeff Calico Guest

    Hello all. I am implementing a SAX filter to strip a bunch of unneeded
    elements out of a large XML file. I found the book "Java & XML" by Brett
    McLaughlin, and an interesting article by him which addresses my issue:
    http://www-128.ibm.com/developerworks/xml/library/x-tipbigdoc3.html

    However, doing it the way described there does not seem to work at all!
    Doing it very differently *seems* to work, but most likely I am not
    understanding something.

    Here is my code, contrasted with the book's code. The class
    KeepSpecificEltsFilter is at the end:

    --------------------------
    MY TRIAL AND ERROR WAY:
    ---------------------------
    FileReader r = new FileReader( "filename" );
    XMLReader xr = XMLReaderFactory.createXMLReader();
    KeepSpecificEltsFilter filter = new KeepSpecificEltsFilter( xr, "elt" );
    XMLWriter xw = new XMLWriter( filter, new FileWriter( "Out.xml" ) );
    xw.parse( new InputSource(r) );

    --------------------------------------------------------
    THE WAY THE BOOK SAYS TO DO IT (maybe I misunderstand):
    ---------------------------------------------------------
    FileReader r = new FileReader( s );
    XMLReader xr = XMLReaderFactory.createXMLReader();
    XMLWriter xw = new XMLWriter( xr, new FileWriter( "jeffOut.xml" ) );
    KeepSpecificEltsFilter filter = new KeepSpecificEltsFilter( xw, "elt" );

    //DefaultHandler dh = new DefaultHandler();
    JeffContentHandler dh = new JeffContentHandler(xr);

    filter.setContentHandler( dh );
    filter.parse( new InputSource(r) );
    ------------------------------------------------------------
    Note the difference between who does the parsing
    (writer or filter) and the way they are chained together.
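
    The effect of chain order can be checked with a minimal sketch. It assumes XMLWriter behaves like an ordinary XMLFilterImpl stage that acts on whatever events reach it; the Recorder class below is a hypothetical stand-in for it, and KeepFilter is a cut-down stand-in for the filter in this post. A stage placed between the parser and the filter sees every element, while a stage placed after the filter sees only the kept ones.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

class ChainOrderSketch {

    /** Hypothetical stand-in for a writer stage: records each element it is handed, then forwards it. */
    static class Recorder extends XMLFilterImpl {
        final List<String> seen = new ArrayList<>();

        Recorder(XMLReader parent) { super(parent); }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            seen.add(localName);
            super.startElement(uri, localName, qName, atts);
        }
    }

    /** Forwards start/end events only for the element with the given local name. */
    static class KeepFilter extends XMLFilterImpl {
        private final String keep;

        KeepFilter(XMLReader parent, String keep) { super(parent); this.keep = keep; }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            if (keep.equals(localName)) super.startElement(uri, localName, qName, atts);
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            if (keep.equals(localName)) super.endElement(uri, localName, qName);
        }
    }

    static XMLReader newParser() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // so that localName is populated
        return factory.newSAXParser().getXMLReader();
    }

    /** Chain: parser -> recorder -> filter. The recorder sits upstream of the filter. */
    static List<String> recorderUpstream(String xml) {
        try {
            Recorder rec = new Recorder(newParser());
            KeepFilter filter = new KeepFilter(rec, "elt");
            filter.parse(new InputSource(new StringReader(xml)));
            return rec.seen;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Chain: parser -> filter -> recorder. The recorder sits downstream of the filter. */
    static List<String> recorderDownstream(String xml) {
        try {
            KeepFilter filter = new KeepFilter(newParser(), "elt");
            Recorder rec = new Recorder(filter);
            rec.parse(new InputSource(new StringReader(xml)));
            return rec.seen;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><elt>a</elt><other>b</other></root>";
        System.out.println("upstream:   " + recorderUpstream(xml));
        System.out.println("downstream: " + recorderDownstream(xml));
    }
}
```

    In both chains parse() is called on the last stage, which asks its parent to parse, and so on back to the real parser; events then flow forward from the parser through each stage in turn.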

    And last, here is the filter class:
    ------------------------------------------------------------

    import java.util.LinkedList;
    import java.util.List;

    import org.xml.sax.Attributes;
    import org.xml.sax.SAXException;
    import org.xml.sax.XMLReader;
    import org.xml.sax.helpers.XMLFilterImpl;

    public class KeepSpecificEltsFilter extends XMLFilterImpl {

        // Local names of the elements whose events are passed downstream.
        private List<String> elementsToKeep;

        // True between the start tag and end tag of a kept element.
        private boolean inKeptElement = false;

        public KeepSpecificEltsFilter( XMLReader parent, String elementToKeep )
        {
            super( parent );
            elementsToKeep = new LinkedList<String>();
            elementsToKeep.add( elementToKeep );
        }

        //-----------------------------------------------------------------------

        public KeepSpecificEltsFilter( XMLReader parent, List<String> elementsToKeep )
        {
            super( parent );
            this.elementsToKeep = elementsToKeep;
        }

        //-----------------------------------------------------------------------

        @Override
        public void startElement( String uri, String localName, String qName,
                                  Attributes atts )
            throws SAXException
        {
            if( elementsToKeep.contains( localName ) )
            {
                System.out.println( "In kept element = " + localName );
                super.startElement( uri, localName, qName, atts );
                inKeptElement = true;
            }
            // Otherwise do nothing: the start tag is not forwarded.
        }

        //-----------------------------------------------------------------------

        @Override
        public void endElement( String uri, String localName, String qName )
            throws SAXException
        {
            if( elementsToKeep.contains( localName ) )
            {
                super.endElement( uri, localName, qName );
                inKeptElement = false;
            }
            // DON'T DO ANYTHING ELSE... PREVENTS PROCESSING OF ELEMENTS
        }

        //-----------------------------------------------------------------------

        @Override
        public void characters( char ch[], int start, int len )
            throws SAXException
        {
            // Forward text only while inside a kept element.
            if( inKeptElement )
            {
                super.characters( ch, start, len );
            }
        }
    }

    Any insight would be appreciated!

    --Jeff
     
    Jeff Calico, Feb 22, 2006

  2. Jeff Calico

    Jeff Calico Guest

    I forgot to add that I don't understand what to do with the
    ContentHandler class. I tried to use the DefaultHandler, and then I
    tried my own class "JeffContentHandler" with an empty implementation.
    It seems to me that the Filter class is doing this work though, so why
    would I register a ContentHandler?

    --Jeff
     
    Jeff Calico, Feb 22, 2006

  3. Jeff Calico wrote:
    > I forgot to add that I don't understand what to do with the
    > ContentHandler class. I tried to use the DefaultHandler, and then I
    > tried my own class "JeffContentHandler" with an empty implementation.
    > It seems to me that the Filter class is doing this work though, so why
    > would I register a ContentHandler?


    Normally, the filter is a ContentHandler whose only job is to pass
    selected events along to another ContentHandler which actually uses the
    filtered document. You have to register your "real" ContentHandler with
    the filter so it knows what to do with the events after deciding whether
    to keep them or not.
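
    A minimal sketch of that arrangement, using only the JDK's SAX classes. The class names (FilterChainSketch, KeepOneElementFilter, NameCollector) are made up for illustration, and the collector stands in for whatever the "real" handler needs to do with the filtered events.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

class FilterChainSketch {

    /** Filter stage: decides which events to pass along, and does nothing else. */
    static class KeepOneElementFilter extends XMLFilterImpl {
        private final String keep;

        KeepOneElementFilter(XMLReader parent, String keep) {
            super(parent);
            this.keep = keep;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            // super.startElement forwards to the handler registered on this filter.
            if (keep.equals(localName)) super.startElement(uri, localName, qName, atts);
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            if (keep.equals(localName)) super.endElement(uri, localName, qName);
        }
    }

    /** The "real" handler: consumes the filtered events (here it just collects names). */
    static class NameCollector extends DefaultHandler {
        final List<String> seen = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            seen.add(localName);
        }
    }

    static List<String> run(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // so that localName is populated
            XMLReader parser = factory.newSAXParser().getXMLReader();

            KeepOneElementFilter filter = new KeepOneElementFilter(parser, "elt");
            NameCollector collector = new NameCollector();
            filter.setContentHandler(collector); // register the real handler with the filter
            filter.parse(new InputSource(new StringReader(xml)));
            return collector.seen;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run("<root><elt>a</elt><other>b</other><elt>c</elt></root>"));
    }
}
```

    Without the setContentHandler call the filter still runs, but the events it keeps have nowhere to go.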

    Alternatively, of course, you can combine both the filtering and the
    operate-on-the-data stages in a single custom ContentHandler. But in
    that case there's no need for it to claim to be a Filter.
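
    A sketch of the combined approach, again with hypothetical names: a single DefaultHandler both decides what to keep and uses it, here by collecting the text of each kept element. It assumes the text of any child elements inside a kept element should be included too.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SingleHandlerSketch {

    /** One handler that filters and consumes: gathers the text of each kept element. */
    static class KeptTextHandler extends DefaultHandler {
        private final String keep;
        private int depth = 0; // greater than 0 while inside a kept element
        private final StringBuilder buffer = new StringBuilder();
        final List<String> texts = new ArrayList<>();

        KeptTextHandler(String keep) { this.keep = keep; }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if (keep.equals(localName)) {
                if (depth == 0) buffer.setLength(0); // outermost kept element: start fresh
                depth++;
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // The parser may deliver text in several chunks, so append rather than assign.
            if (depth > 0) buffer.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (keep.equals(localName)) {
                depth--;
                if (depth == 0) texts.add(buffer.toString());
            }
        }
    }

    static List<String> run(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // so that localName is populated
            SAXParser parser = factory.newSAXParser();
            KeptTextHandler handler = new KeptTextHandler("elt");
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.texts;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run("<root><elt>a</elt><other>b</other><elt>c</elt></root>"));
    }
}
```

    No XMLFilterImpl is involved here, since nothing downstream needs to receive the events.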


    --
    Joe Kesselman / Beware the fury of a patient man. -- John Dryden
     
    Joseph Kesselman, Feb 22, 2006
