Wellformedness in Xerces

indo3

Hello

Is it right that an XML document is only well-formed if all
entity references in the document can be resolved? This would
mean that Xerces needs to retrieve external DTD files
to get the entity declarations. I always thought
that no external files would be needed to check
well-formedness. And if the document is well-formed and you
retrieve a DOM tree, then Xerces has already replaced all
entity references with their "replacement text" (if that is the right term).
Is Xerces required by the XML 1.0 specification to do this during
well-formedness checking, or can this
feature be turned off so that the entity references are somehow
encoded in the text nodes of the DOM tree? And if this is
possible, are there any specs that define such an encoding?
I have the expression "unparsed entity" in mind, which would
need to be represented in a DOM tree somehow (maybe I am confusing
some terms here).

Thanks
 
Kenneth Stephen

indo3 said:
Is it right that an XML document is only well-formed if all
entity references can be resolved in the document?

Hi,

Yes. Entity references have to be resolved for an XML document to be
well-formed. It is possible to declare the entities within the
document itself, so that there is no need to fetch an external DTD
to verify well-formedness.

Regards,
Kenneth
 
Richard Tobin

indo3 said:
Is it right that an XML document is only well-formed if all
entity references can be resolved in the document?

Not exactly. In some circumstances an undeclared entity is only a
validity error. The idea of this is that a minimal parser doesn't
have to read anything but the main document (the document entity), so
if there is an external DTD it won't know whether it contains
declarations for entities.

If you want to check that all entities are defined, you need to use
a validating parser (or a non-validating parser that happens to read
the external subset).

If the document doesn't have an external DTD, or is declared to be
standalone, then all undeclared entities are a well-formedness error.
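
The cases above can be tried out with a small script. This is only a sketch: it uses Python's bundled expat parser rather than Xerces, on the assumption that expat in its default configuration is a non-validating parser that does not read the external subset. The helper name check and the file name r.dtd are made up for illustration; r.dtd does not need to exist because it is never fetched.

```python
from xml.parsers import expat


def check(doc):
    """Parse doc with expat and report (verdict, skipped entity names)."""
    skipped = []
    parser = expat.ParserCreate()
    # Called for a reference the parser cannot resolve but is allowed to skip.
    parser.SkippedEntityHandler = lambda name, is_param: skipped.append(name)
    try:
        parser.Parse(doc, True)
    except expat.ExpatError:
        return "not well-formed", skipped
    return "well-formed", skipped


# No DTD at all: the undeclared entity is a well-formedness error.
NO_DTD = '<r>&foo;</r>'
# Entity declared in the internal subset: it can be resolved.
INTERNAL = '<!DOCTYPE r [<!ENTITY foo "bar">]><r>&foo;</r>'
# External DTD that is not read: the parser cannot know whether foo is declared.
EXTERNAL = '<!DOCTYPE r SYSTEM "r.dtd"><r>&foo;</r>'
# Same document, but declared standalone: the external DTD no longer excuses it.
STANDALONE = ('<?xml version="1.0" standalone="yes"?>'
              '<!DOCTYPE r SYSTEM "r.dtd"><r>&foo;</r>')

for label, doc in [("no DTD", NO_DTD), ("internal", INTERNAL),
                   ("external", EXTERNAL), ("standalone", STANDALONE)]:
    print(label, check(doc))
```

With expat, the "external" case should be accepted with foo reported as a skipped entity, while the "no DTD" and "standalone" cases should be rejected. Other non-validating parsers may choose to read the external subset, in which case the outcome for the "external" case depends on what the DTD declares.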

indo3 said:
I have the expression "unparsed entity" in mind

This is something quite different. Its main use is to refer to things
which aren't XML, such as JPEG images. You can only refer to unparsed
entities by name, in the values of attributes declared with type ENTITY
or ENTITIES, and a parser will not attempt to read them.
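
To make the unparsed entity case concrete, here is a minimal sketch, again using Python's expat bindings rather than Xerces. The document, the entity name cover, the file cover.jpg and the helper parse_album are all invented for the example. The parser only reports the declaration; the image file is never opened, so it does not have to exist.

```python
from xml.parsers import expat

DOC = """<!DOCTYPE album [
  <!NOTATION jpeg SYSTEM "image/jpeg">
  <!ENTITY cover SYSTEM "cover.jpg" NDATA jpeg>
  <!ELEMENT album EMPTY>
  <!ATTLIST album picture ENTITY #REQUIRED>
]>
<album picture="cover"/>"""


def parse_album(doc):
    """Return (unparsed entity declarations, attributes seen on elements)."""
    declared = {}
    attributes = {}

    def on_unparsed_entity(name, base, system_id, public_id, notation):
        # The parser hands over the declaration; it does not fetch system_id.
        declared[name] = (system_id, notation)

    def on_start(tag, attrs):
        # The attribute value is just the entity's name, not its content.
        attributes.update(attrs)

    parser = expat.ParserCreate()
    parser.UnparsedEntityDeclHandler = on_unparsed_entity
    parser.StartElementHandler = on_start
    parser.Parse(doc, True)
    return declared, attributes


declared, attributes = parse_album(DOC)
print(declared)
print(attributes)
```

The application is the one that has to connect the attribute value to the declaration and decide what to do with the system identifier and notation.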

-- Richard
 
Kenneth Stephen

Richard said:
Not exactly. In some circumstances an undeclared entity is only a
validity error. The idea of this is that a minimal parser doesn't
have to read anything but the main document (the document entity), so
if there is an external DTD it won't know whether it contains
declarations for entities.

Richard,

I see that there is more to this than I had previously thought. Thanks
for the enlightening answer.

Kenneth
 
