How to specify DTD to DTD.getDTD for DocumentParser?

Ronald Fischer

I would like to check an HTML file for errors such as mismatched start/end
tags or typos in the tags. Hence I need a validating HTML parser, which
gives me error messages when it encounters an error. I thought that
javax.swing.text.html.parser.DocumentParser could be a good choice.

My problem now is that I don't know how to specify the DTD:

dp=new DocumentParser(DTD.getDTD(???WHAT SHOULD I WRITE HERE???));

I tried to call getDTD with the following arguments:

"http://www.w3.org/TR/html4/strict.dtd"
"-//W3C//DTD HTML 4.0 Transitional//EN"

In neither case was the parser able to recognize any tag. It called
the handleError callback for every tag it encountered.

Unfortunately, the documentation for the class DTD is next to
non-existent. Could someone please help me with this?

Ronald
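
A possible explanation for the behaviour described above: according to its Javadoc, DTD.getDTD(name) creates and returns a new DTD when no DTD with that name has been registered, so a public identifier or URL passed as the name presumably yields a DTD that was never populated from any DTD file. One way to avoid handling the DTD yourself is to go through javax.swing.text.html.parser.ParserDelegator, which sets up its own default DTD (HTML 3.2, as far as I know) and starts a DocumentParser for each parse call. The following is a minimal sketch under those assumptions; the class name SwingHtmlCheck and the helper Recorder are made up for illustration, and the exact error messages reported by the parser are not specified here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

public class SwingHtmlCheck {

    /** Hypothetical helper: records start tags and error reports from the parser. */
    public static class Recorder extends HTMLEditorKit.ParserCallback {
        public final List<String> tags = new ArrayList<>();
        public final List<String> errors = new ArrayList<>();

        @Override
        public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
            tags.add(t.toString());
        }

        @Override
        public void handleError(String errorMsg, int pos) {
            errors.add(pos + ": " + errorMsg);
        }
    }

    /** Parses the given HTML text with the delegator's default DTD. */
    public static Recorder check(String html) throws IOException {
        Recorder recorder = new Recorder();
        // ParserDelegator loads its default DTD itself; no DTD.getDTD call is needed.
        // The third argument tells the parser to ignore charset declarations.
        new ParserDelegator().parse(new StringReader(html), recorder, true);
        return recorder;
    }

    public static void main(String[] args) throws IOException {
        Recorder r = check("<html><body><p>text</p><foo>x</foo></body></html>");
        System.out.println("start tags: " + r.tags);
        System.out.println("errors:     " + r.errors);
    }
}
```

Note that this parser is written to be lenient for rendering, so the errors it reports are not a substitute for validation against the HTML 4 strict or transitional DTD.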
 
Thomas Weidenfeller

Ronald said:
I would like to check an HTML file for errors such as mismatched start/end
tags or typos in the tags. Hence I need a validating HTML parser, which
gives me error messages when it encounters an error. I thought that
javax.swing.text.html.parser.DocumentParser could be a good choice.

It is maybe the worst choice (see Q6.3.2 of the comp.lang.java.gui FAQ).
Instead consider jtidy or the original tidy.

/Thomas
 
Ronald Fischer

Thomas Weidenfeller said:
It is maybe the worst choice (see Q6.3.2 of the comp.lang.java.gui FAQ).
Instead consider jtidy or the original tidy.

I tried out jtidy before, but found it completely useless for my purpose
(it complained about EVERY correct tag I tried). I then learned that
jtidy was designed for XHTML, not HTML, and for instance requires
that every tag be in lower case. Also, I don't see how to have
jtidy distinguish between strict HTML and transitional HTML.

Can it really be that there is no free HTML syntax checker available?

Ronald
 
Thomas Weidenfeller

Ronald said:
I then learned that
jtidy was implemented towards XHTML, not HTML, and for instance requires
that every tag must be in lower case etc.

Tidy.setXHTML(false);

Ronald said:
Also, I don't see how to have
jtidy distinguish between strict HTML and transitional HTML.

Tidy.setDocType("strict");

Ronald said:
Can't it be that there is no free HTML syntax checker available?????

Well ...

/Thomas
 
Ronald Fischer

Thomas Weidenfeller said:
Tidy.setDocType("strict");

Thank you, this indeed works fine. Do you know whether there is any
documentation available which explains how to use JTidy? Sure, the
class list comes with the package, but for example the HTML docs
for the class Tidy don't explain what the function setHTML does, and
they give only a coarse description of what the argument to setDocType
must look like, without explaining how to use this function in
practice.

What I'm missing is a kind of "user manual" for this class.

Ronald
 
