How to specify DTD to DTD.getDTD for DocumentParser?

Discussion in 'Java' started by Ronald Fischer, Mar 8, 2005.

  1. I would like to check an HTML file for errors such as mismatched start/end
    tags or typos in the tags. Hence I need a validating HTML parser, which
    gives me error messages when it encounters an error. I thought that
    javax.swing.text.html.parser.DocumentParser could be a good choice.

    My problem now is that I don't know how to specify the DTD:

    dp=new DocumentParser(DTD.getDTD(???WHAT SHOULD I WRITE HERE???));

    I tried to call getDTD with the following arguments:

    "http://www.w3.org/TR/html4/strict.dtd"
    "-//W3C//DTD HTML 4.0 Transitional//EN"

    In neither case was the parser able to recognize any tag. It called
    the handleError callback on every tag it encountered.

    Unfortunately, the documentation for the class DTD is next to
    non-existent. Could someone please help me with this?

    Ronald
     
    Ronald Fischer, Mar 8, 2005
    #1
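A minimal sketch of one way to get a populated DTD for the call in post #1, under stated assumptions. DTD.getDTD only looks a DTD up by name and, per its Javadoc, creates an empty one if that name is unknown; it does not fetch a URL or resolve a public identifier, which would explain why every tag was reported. The sketch relies on the side effect that constructing a ParserDelegator registers Swing's bundled DTD under the name "html32". That DTD describes HTML 3.2, not HTML 4 strict or transitional. The class name SwingHtmlCheckSketch and the helper names loadDefaultDtd and collectErrors are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.DTD;
import javax.swing.text.html.parser.DocumentParser;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper class, not part of the JDK.
class SwingHtmlCheckSketch {

    // Returns Swing's bundled DTD (assumption: it is registered as "html32").
    static DTD loadDefaultDtd() {
        // Side effect relied on here: the constructor loads the bundled
        // DTD resource and registers it in DTD's name table.
        new ParserDelegator();
        try {
            return DTD.getDTD("html32");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Parses the given HTML and returns every message passed to handleError.
    static List<String> collectErrors(String html) {
        List<String> errors = new ArrayList<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleError(String errorMsg, int pos) {
                errors.add(pos + ": " + errorMsg);
            }
        };
        try {
            DocumentParser parser = new DocumentParser(loadDefaultDtd());
            parser.parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        // <foo> is not an HTML 3.2 element, so it should be reported.
        String html = "<html><body><foo>text</foo></body></html>";
        for (String error : collectErrors(html)) {
            System.out.println(error);
        }
    }
}
```

Because the Swing parser is written to recover from bad markup for display purposes, the messages it reports are terse and it may accept input that a strict validator would reject.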

  2. Ronald Fischer wrote:
    > I would like to check an HTML file for errors such as mismatched start/end
    > tags or typos in the tags. Hence I need a validating HTML parser, which
    > gives me error messages when it encounters an error. I thought that
    > javax.swing.text.html.parser.DocumentParser could be a good choice.


    It is maybe the worst choice (see Q6.3.2 of the comp.lang.java.gui FAQ).
    Instead consider jtidy or the original tidy.

    /Thomas


    --
    The comp.lang.java.gui FAQ:
    ftp://ftp.cs.uu.nl/pub/NEWS.ANSWERS/computer-lang/java/gui/faq
     
    Thomas Weidenfeller, Mar 8, 2005
    #2

  3. Thomas Weidenfeller wrote in message news:<d0kcmr$6fi$>...
    > Ronald Fischer wrote:
    > > I would like to check an HTML file for errors such as mismatched start/end
    > > tags or typos in the tags. Hence I need a validating HTML parser, which
    > > gives me error messages when it encounters an error. I thought that
    > > javax.swing.text.html.parser.DocumentParser could be a good choice.

    >
    > It is maybe the worst choice (see Q6.3.2 of the comp.lang.java.gui FAQ).
    > Instead consider jtidy or the original tidy.


    I tried jtidy before, but found it completely useless for my purpose
    (it complained about EVERY correct tag I tried). I then learned that
    jtidy was implemented for XHTML, not HTML, and for instance requires
    every tag to be in lower case. Also, I don't see how to make jtidy
    distinguish between strict HTML and transitional HTML.

    Can it really be that there is no free HTML syntax checker available?

    Ronald
     
    Ronald Fischer, Mar 10, 2005
    #3
  4. Ronald Fischer wrote:
    > I then learned that
    > jtidy was implemented towards XHTML, not HTML, and for instance requires
    > that every tag must be in lower case etc.


    Tidy.setXHTML(false);

    > Also, I don't see how to have
    > jtidy distinguish between strict HTML and transitional HTML.


    Tidy.setDocType("strict");

    > Can't it be that there is no free HTML syntax checker available?????


    Well ...

    /Thomas

    --
    The comp.lang.java.gui FAQ:
    ftp://ftp.cs.uu.nl/pub/NEWS.ANSWERS/computer-lang/java/gui/faq
     
    Thomas Weidenfeller, Mar 10, 2005
    #4
  5. Thomas Weidenfeller wrote in message news:<d0p4n5$meb$>...
    > Ronald Fischer wrote:
    > > I then learned that
    > > jtidy was implemented towards XHTML, not HTML, and for instance requires
    > > that every tag must be in lower case etc.

    >
    > Tidy.setXHTML(false);
    >
    > > Also, I don't see how to have
    > > jtidy distinguish between strict HTML and transitional HTML.

    >
    > Tidy.setDocType("strict");


    Thank you, this indeed works fine. Do you know whether there is any
    documentation available that explains how to use JTidy? Sure, the
    class list comes with the package, but for example the HTML docs
    for the class Tidy don't explain what the function setXHTML does, and
    they give only a coarse description of what the argument to setDocType
    must look like, without explaining how to use this function in
    practice.

    What I'm missing is kind of a "user manual" for this class....

    Ronald
     
    Ronald Fischer, Mar 17, 2005
    #5
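As a rough illustration of how the two settings from post #4 might fit into a complete check, here is a minimal sketch. It assumes the JTidy jar (package org.w3c.tidy) is on the classpath and that setXHTML and setDocType are instance methods called on a Tidy object. The class name JTidyCheckSketch and the helper name report are hypothetical, and the extra calls (setQuiet, setShowWarnings, setErrout, parse, getParseErrors, getParseWarnings) are used here as they appear in the JTidy Tidy class, so check them against the version in use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import org.w3c.tidy.Tidy;

// Hypothetical helper class; requires the JTidy jar on the classpath.
class JTidyCheckSketch {

    // Runs JTidy over the given HTML and returns its counts and messages.
    static String report(String html) {
        Tidy tidy = new Tidy();
        tidy.setXHTML(false);        // treat input as HTML, as suggested in post #4
        tidy.setDocType("strict");   // check against the strict doctype, as in post #4
        tidy.setQuiet(true);         // suppress the summary banner
        tidy.setShowWarnings(true);  // include warnings in the messages

        // Capture diagnostics in a string instead of printing them to stderr.
        StringWriter messages = new StringWriter();
        tidy.setErrout(new PrintWriter(messages, true));

        // The cleaned-up markup goes to a throwaway buffer; only the
        // diagnostics matter for a syntax check.
        ByteArrayInputStream in =
                new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
        tidy.parse(in, new ByteArrayOutputStream());

        return "errors=" + tidy.getParseErrors()
                + ", warnings=" + tidy.getParseWarnings()
                + "\n" + messages;
    }

    public static void main(String[] args) {
        String html = "<html><head><title>t</title></head>"
                + "<body><foo>x</foo></body></html>";
        System.out.println(report(html));
    }
}
```

Routing the messages through setErrout keeps the check usable from other code, since the caller can inspect the counts and the text rather than scraping console output.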