HTML parsing with Xerces

Discussion in 'XML' started by Hans Bijvoet, Jan 28, 2005.

  1. Hans Bijvoet

    Hans Bijvoet Guest

    Hello,
    I'm trying to parse an HTML document with the SAX parser from Xerces.
    The parser throws a fatal error when attribute values in the document are
    not surrounded by quotes.
    How can I prevent this behaviour?
    Greetings,
    Hans
    Hans Bijvoet, Jan 28, 2005
    #1

  2. /Hans Bijvoet/:

    > I'm trying to parse an HTML document with the SAX parser from Xerces.
    > The parser throws a fatal error when attribute values in the document are
    > not surrounded by quotes.
    > How can I prevent this behaviour?


    Which Xerces? As you know, HTML is not XML compatible, so you could
    perhaps use a parser configuration that uses an HTML scanner. For Java
    there's one in Andy Clark's CyberNeko Tools for XNI (I haven't tried it
    myself, though):

    http://www.apache.org/~andyc/neko/doc/html/index.html

    You may also consider posting Xerces-specific questions to the
    Xerces User mailing list:

    http://xml.apache.org/mail.html#xerces-j-user

    --
    Stanimir
    Stanimir Stamenkov, Jan 28, 2005
    #2
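
The behaviour described in the question can be reproduced with a minimal sketch like the one below. It assumes the JAXP SAX parser bundled with the JDK (which is derived from Xerces) rather than a standalone Xerces download, and the class name `UnquotedAttributeDemo` and the helper `parses` are hypothetical names chosen for illustration. It shows only that an XML parser treats an unquoted attribute value as a fatal well-formedness error; it does not make the parser accept HTML.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: not part of Xerces or of the original thread.
public class UnquotedAttributeDemo {

    /**
     * Returns true if the markup parses as well-formed XML,
     * false if the SAX parser reports a parse error.
     */
    static boolean parses(String markup) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        try {
            // DefaultHandler rethrows fatal errors as SAXParseException.
            factory.newSAXParser().parse(
                    new InputSource(new StringReader(markup)),
                    new DefaultHandler());
            return true;
        } catch (SAXParseException e) {
            System.out.println("parse error at line " + e.getLineNumber()
                    + ": " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Quoted attribute value: well-formed XML.
        System.out.println(parses("<a href=\"index.html\">home</a>"));
        // Unquoted attribute value: legal in HTML, a fatal error in XML.
        System.out.println(parses("<a href=index.html>home</a>"));
    }
}
```

Because this is a well-formedness error, the XML scanner cannot simply be told to ignore it, which is why the reply suggests swapping in an HTML scanner. If NekoHTML is on the classpath, its `org.cyberneko.html.parsers.SAXParser` class is, as far as I know, intended to be used in place of the Xerces SAX parser for this purpose; check the NekoHTML documentation for the exact setup.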