UTF-8 BOM w/ ISO-8859-1 encoding pseudo attribute

Erik Wahlstrom

I have an XML document that includes a UTF-8 BOM (0xEF 0xBB 0xBF).
The document is properly encoded as UTF-8. However, the XMLDecl
encoding pseudo-attribute indicates 'ISO-8859-1'. So how SHOULD the
XML processor handle this? Is it a fatal error? Clearly it cannot be
processed as ISO-8859-1, because the content comes out scrambled. It
appears that Xerces is parsing it as ISO-8859-1 even though the BOM
is there. Hmm. Any suggestions? How should one handle this case?
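
To illustrate the scrambling described above, here is a minimal Python sketch. The sample document is made up for illustration (it is not the actual file from this post), and it only shows what happens when UTF-8 bytes, BOM included, are decoded as ISO-8859-1.

```python
# Hypothetical sample: a UTF-8 BOM, then a declaration that claims ISO-8859-1,
# then one non-ASCII character encoded as UTF-8.
doc = b"\xef\xbb\xbf" + '<?xml version="1.0" encoding="ISO-8859-1"?><p>é</p>'.encode("utf-8")

# Trusting the declaration: every byte becomes one Latin-1 character,
# so the BOM and the two-byte "é" both turn into visible junk.
as_latin1 = doc.decode("iso-8859-1")

# Trusting the BOM: "utf-8-sig" strips the BOM and decodes the rest as UTF-8.
as_utf8 = doc.decode("utf-8-sig")

print(repr(as_latin1))
print(repr(as_utf8))
```

Decoded as ISO-8859-1, the three BOM bytes show up as the characters U+00EF, U+00BB, U+00BF in front of the declaration, which is the kind of garbage seen when the declaration wins over the BOM.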


Erik
 
Richard Tobin

I have an XML document that includes a UTF-8 BOM (0xEF 0xBB 0xBF).
The document is properly encoded as UTF-8. However, the XMLDecl
encoding pseudo-attribute indicates 'ISO-8859-1'. So how SHOULD the
XML processor handle this?

XML (3rd edition) 4.3.3 says:

In the absence of information provided by an external transport
protocol (e.g. HTTP or MIME), it is a fatal error for an entity
including an encoding declaration to be presented to the XML
processor in an encoding other than that named in the declaration
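
As a rough sketch of how a check along the lines of the rule quoted above might look, the following Python compares a leading UTF-8 BOM against the declared encoding. The function name `check_bom_against_declaration` and the regular expression are my own illustration, not how Xerces or any other parser does it; it assumes there is no external transport information (HTTP, MIME) and only looks at the UTF-8 BOM, not UTF-16 or other encodings.

```python
import codecs
import re

UTF8_BOM = codecs.BOM_UTF8  # the bytes 0xEF 0xBB 0xBF

# Picks the encoding pseudo-attribute out of an XML declaration.
# Only works when the declaration itself is in an ASCII-compatible encoding.
_DECL_RE = re.compile(
    rb'^<\?xml[^>]*?\sencoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def _canonical(name: str) -> str:
    """Normalise an encoding label (e.g. 'UTF8' and 'utf-8' compare equal)."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        return name.lower()


def check_bom_against_declaration(data: bytes) -> str:
    """Return the encoding to decode with, or raise ValueError on a conflict.

    Hypothetical helper: treats a UTF-8 BOM plus a non-UTF-8 encoding
    declaration as a fatal error.
    """
    has_bom = data.startswith(UTF8_BOM)
    body = data[len(UTF8_BOM):] if has_bom else data

    match = _DECL_RE.match(body)
    declared = match.group(1).decode("ascii") if match else None

    if has_bom:
        if declared is not None and _canonical(declared) != "utf-8":
            raise ValueError(
                f"UTF-8 BOM present but declaration says {declared!r}"
            )
        return "utf-8"

    # No BOM: go with the declaration, or the XML default of UTF-8.
    return declared or "utf-8"
```

Called on a document like the one in the question (UTF-8 BOM followed by `encoding="ISO-8859-1"`), this raises `ValueError` instead of silently picking one of the two encodings.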

-- Richard
 
