Peter Hansen said:
Hmm... so it's your opinion that *all* XML parsers must handle *all*
aspects of XML?
XML is clear on what a Parser *must* support. The full character
production is one of those things. From XML 1.0, section 2.2:
Character Range
[2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] |
[#x10000-#x10FFFF]
There is no "option" to not support characters greater than #xFF. XML
parsers *can* leave off handling some aspects of XML (external DTD
subsets, for example), but you cannot be as fundamentally
non-conformant as PyRXP and still call yourself an XML parser.
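
To make the production concrete, here is a minimal Python sketch of a
check against the Char ranges quoted above. The names XML_CHAR_RANGES
and is_xml_char are made up for illustration; they are not part of
PyRXP or any other parser's API.

```python
# Inclusive code point ranges copied from production [2] (Char) of
# XML 1.0, section 2.2. The names here are hypothetical, for illustration.
XML_CHAR_RANGES = (
    (0x9, 0x9),
    (0xA, 0xA),
    (0xD, 0xD),
    (0x20, 0xD7FF),
    (0xE000, 0xFFFD),
    (0x10000, 0x10FFFF),
)


def is_xml_char(ch):
    """Return True if the single character ch matches the Char production."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in XML_CHAR_RANGES)


# U+20AC (euro sign) lies well above U+FF and is a legal XML character.
print(is_xml_char("\u20ac"))
# U+0000 is outside every range in the production.
print(is_xml_char("\x00"))
```

A parser that stops at #xFF accepts only a small fraction of the
ranges listed in that table.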
This is not just an academic matter. There are a *vast* number of
useful and heavily-used characters with code points higher than U+FF,
and if parsers decided on a whim to pick and choose what to support,
the result would be complete and utter chaos.
If not, I think you should back off on the criticism
of PyRXP as being "not an XML parser" and simply point out that it
doesn't handle all aspects of XML because it is intended to provide
a very fast/heavily optimized approach to parsing only certain kinds
of XML. It's a valid choice to do so, though of course if PyRXP is
promoted as a "full" XML solution that might be inaccurate.
PyRXP is not an XML parser. It's that simple. I stand by that very
strong statement, and I'd be surprised if any XML expert refused to
corroborate it.
I do want to point out that PyRXPU does seem to be a proper XML
parser, and is what people should use instead if they like the
ReportLab products.
Of course, if you don't really need an XML parser, feel free to use
PyRXP. Just don't call it what it isn't.
--Uche
http://uche.ogbuji.net