Problem w/ DocumentBuilder parse method

John L.

I'm pre-processing a file in an attempt to use the subject method, and receive the following error:

[Fatal Error] EXTRACT.TMP:51:23: The entity "nbsp" was referenced, but not declared.
Exception in thread "main" org.xml.sax.SAXParseException: The entity "nbsp" was referenced, but not declared.
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at javax.xml.parsers.DocumentBuilder.parse(Unknown Source)
at Extract.CmdLine(Extract.java:144)
at Extract.main(Extract.java:79)

The pertinent portion of the file being parsed follows:

[45]<div>
[46]<input type="hidden" name="cx" value="partner-pub-5436175752152469:m8vqbgi2n21" />
[47]<input type="hidden" name="cof" value="FORID:10" />
[48]<input type="hidden" name="ie" value="ISO-8859-1" />
[49]<input type="text" name="q" size="55" />
[50]<input type="submit" name="sa" value="PCM Search" />
[51] &nbsp; &nbsp; &nbsp; &nbsp; </div >

What is the required declaration syntax for &nbsp; to allow the file to be parsed?

Thanks in advance for your time and consideration.
 
Arne Vajhøj

What is the required declaration syntax for &nbsp; to allow the file to be parsed?

Entities should be defined in the DTD.

The above looks like XHTML, so maybe it will work if you add a proper
DOCTYPE at the top (I think the XHTML DTD defines nbsp).
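
With an XHTML DOCTYPE in place, the parser looks the DTD up by its system identifier, which by default means fetching it from w3.org. Below is a minimal sketch of answering that lookup locally through an EntityResolver. The class name LocalDtdDemo, the helper textOf, and the one-line stand-in DTD are assumptions made for illustration; a real setup would return a local copy of xhtml1-strict.dtd together with the entity files it includes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

public class LocalDtdDemo {
    static final String XHTML_DTD =
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd";

    // Assumption: a stand-in for the real DTD that declares only nbsp.
    static final String STAND_IN_DTD = "<!ENTITY nbsp \"&#160;\">";

    // Parses the XML text and returns the text content of the root element.
    static String textOf(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Serve the XHTML DTD from memory; returning null for any other
            // system identifier leaves the default resolution in place.
            builder.setEntityResolver((publicId, systemId) ->
                XHTML_DTD.equals(systemId)
                    ? new InputSource(new StringReader(STAND_IN_DTD))
                    : null);
            return builder.parse(new InputSource(new StringReader(xml)))
                          .getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        String xml =
            "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" \""
            + XHTML_DTD + "\">"
            + "<html><body><div>a&nbsp;b</div></body></html>";
        // The reference is expanded to U+00A0 (NO-BREAK SPACE).
        System.out.println(textOf(xml).equals("a\u00A0b"));
    }
}
```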

Arne
 
Stanimir Stamenkov

Sun, 30 Dec 2012 11:30:24 -0800 (PST), /John L./:
I'm pre-processing a file in an attempt to use the subject method, and receive the following error:

[Fatal Error] EXTRACT.TMP:51:23: The entity "nbsp" was referenced, but not declared.
[...]
What is the required declaration syntax for &nbsp; to allow the file to be parsed?

As Arne Vajhøj points out in another reply, there should be an XHTML
DOCTYPE declaration at the beginning of the document. Browsers
usually don't have a problem processing XHTML containing entity
references from the XHTML DTD, even without a DOCTYPE declaration,
because either:

1. The document is served as text/html, in which case it is not
processed as XML at all; or

2. Browsers keep a local copy of the XHTML DTD and associate it
automatically based on the content type application/xhtml+xml, or on
xmlns="http://www.w3.org/1999/xhtml" on the root html element.

If the document you're trying to parse is under your control, you could:

1. Add the XHTML DOCTYPE declaration manually:

<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

or even:

<!DOCTYPE html
SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

You may still want to supply an EntityResolver [1] to serve this
DTD from a local resource;

2. Add a DOCTYPE with a local subset containing just the necessary
entity declarations, like:

<!DOCTYPE html [
<!ENTITY nbsp "&#160;">
]>
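
As a rough illustration of the second option, the sketch below parses a small document whose DOCTYPE carries an internal subset declaring nbsp. The class name NbspInternalSubset, the helper textOf, and the sample markup are made up for the example; only the DOCTYPE itself comes from the suggestion above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

public class NbspInternalSubset {
    // Parses the XML text and returns the text content of the root element.
    static String textOf(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)))
                          .getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        String xml =
            "<!DOCTYPE html [\n"
            + "<!ENTITY nbsp \"&#160;\">\n"
            + "]>\n"
            + "<html><body><div>a&nbsp;b</div></body></html>";
        // prints true: &nbsp; is expanded to U+00A0 (NO-BREAK SPACE)
        System.out.println(textOf(xml).equals("a\u00A0b"));
    }
}
```

Since the internal subset travels with the document, no network access or resolver is involved; the cost is that every entity the file uses has to be declared by hand.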

If you're parsing documents which don't have a DOCTYPE declaration and
are not under your control, you may supply an EntityResolver2
implementation, which defines an additional method for just that purpose:

http://docs.oracle.com/javase/6/doc...nalSubset(java.lang.String, java.lang.String)

[1]
http://docs.oracle.com/javase/6/doc...setEntityResolver(org.xml.sax.EntityResolver)
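
A minimal sketch of the EntityResolver2 approach, assuming the JDK's bundled parser (which, as far as I know, calls getExternalSubset on a resolver installed through DocumentBuilder.setEntityResolver when the document has no DOCTYPE). The names NbspExternalSubset, NbspResolver and textOf are hypothetical, and the one-line subset declares only nbsp.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.ext.EntityResolver2;

public class NbspExternalSubset {
    // Hypothetical resolver: offers an external subset for documents
    // that carry no DOCTYPE declaration of their own.
    static class NbspResolver implements EntityResolver2 {
        @Override
        public InputSource getExternalSubset(String name, String baseURI) {
            return new InputSource(
                new StringReader("<!ENTITY nbsp \"&#160;\">"));
        }

        @Override
        public InputSource resolveEntity(String name, String publicId,
                                         String baseURI, String systemId) {
            return null; // fall back to the parser's default resolution
        }

        @Override
        public InputSource resolveEntity(String publicId, String systemId) {
            return null; // fall back to the parser's default resolution
        }
    }

    // Parses the XML text and returns the text content of the root element.
    static String textOf(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setEntityResolver(new NbspResolver());
            return builder.parse(new InputSource(new StringReader(xml)))
                          .getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        // No DOCTYPE in the input; the resolver supplies the declaration.
        System.out.println(textOf("<div>a&nbsp;b</div>").equals("a\u00A0b"));
    }
}
```

The input file is left untouched here, which is the point of this route when the documents come from somewhere else.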
 
