Can someone confirm my DTD & namespace solution?

Discussion in 'XML' started by D McGilvray, Jun 27, 2007.

  1. D McGilvray

    D McGilvray Guest

    Hi, I've been researching for a while to understand namespaces to find a
    solution to my problem. I now have a solution which works in my test
    examples, but before I roll it out through my software, I was hoping
    someone might tell me if I've fallen into any traps.

    Particularly, I would like to know what type of parser I am restricted
    to using - will it work for any fully conforming validating parser? Or
    are there certain levels of conformity to the specification?


    I want to include another XML file within my own. The other DTD has no
    support for namespaces. I want to avoid naming conflicts, so I need
    namespaces. However, I also require validation because I'll be using
    ID/IDREF references extensively.

    Here is an example XML file. inc represents the included structure. It
    and its children are not bound to any namespace. The rest of the
    elements are part of my own structure, and belong to the namespace
    bound to the prefix ns.


    <ns:doc xmlns:ns="http://dougie/test/">
    <ns:a/>
    <inc/>
    </ns:doc>

    This is the rather complex DTD for the included structure contained in
    'inc.dtd':

    <!ELEMENT inc ANY >


    Below is the DTD for my document, which contains inc. The other DTD is
    included, which defines its structure. The parameter entity nsp defines
    the prefix for the namespace so that it can be overridden. However, when
    this entity is expanded, it is surrounded by white space unless it is
    expanded within another parameter entity. So any time I want to use the
    prefix with a tagname or attribute I have to combine the text within
    another param entity. Hence, I have to create the doc entity and
    substitute that for the element tag name (rather than writing %nsp;:doc
    straight into the ELEMENT definition). And the same for 'a' and the
    namespace declaration within the doc element.
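
    The whitespace behaviour described above matches the XML 1.0 rules for
    parameter-entity references: a reference inside an entity value is
    replaced in place, while a reference elsewhere in the DTD has its
    replacement text padded with one space on each side. The following is
    only a toy model of those two rules, not a DTD parser; the function
    names and the simplified name pattern are made up for illustration.

```python
import re

# Simplified pattern for a parameter-entity reference such as %nsp;
PE_REF = re.compile(r"%([A-Za-z_][\w.-]*);")

def expand_in_literal(value, entities):
    """Inside an entity value, the replacement text is inserted as-is."""
    return PE_REF.sub(lambda m: entities[m.group(1)], value)

def expand_in_declaration(text, entities):
    """Elsewhere in the DTD, the replacement text gets a space on each side."""
    return PE_REF.sub(lambda m: " " + entities[m.group(1)] + " ", text)

entities = {"nsp": "ns"}
# <!ENTITY % doc '%nsp;:doc'> : the prefix is glued to the name, no padding
entities["doc"] = expand_in_literal("%nsp;:doc", entities)

# Writing %nsp;:doc straight into the declaration splits the name
direct = expand_in_declaration("<!ELEMENT %nsp;:doc ANY>", entities)
# Going through the intermediate entity keeps the name intact
via_entity = expand_in_declaration("<!ELEMENT %doc; ANY>", entities)

print(repr(direct))
print(repr(via_entity))
```

    In this model the direct form ends up containing "ns :doc", which is
    not a single name, whereas the indirect form contains "ns:doc" with
    harmless padding around it.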


    <!-- Include other DTD -->
    <!ENTITY % incdtd SYSTEM 'inc.dtd' > %incdtd;

    <!-- This defines the prefix for the namespace -->
    <!ENTITY % nsp 'ns' >

    <!-- This combines prefix with namespace attribute name -->
    <!ENTITY % nspdec 'xmlns:%nsp;' >

    <!-- These combine the prefix with the tag names. Both are declared
         before use so that the content model below follows the prefix
         too, instead of hard-coding ns:a -->
    <!ENTITY % doc '%nsp;:doc'>
    <!ENTITY % a '%nsp;:a'>

    <!ELEMENT %doc; ( %a; , inc ) >
    <!ATTLIST %doc;
    %nspdec; CDATA #REQUIRED >

    <!ELEMENT %a; ANY >


    Specifying my namespace prefix in the entity nsp means that the
    namespace prefix for my document can be overridden in another DTD like so:

    <!ENTITY % nsp 'ns' >

    <!ENTITY % otherdtd SYSTEM 'test.dtd' > %otherdtd;
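
    The override relies on the XML 1.0 rule that when the same entity is
    declared more than once, the first declaration is binding, so the
    wrapper must declare nsp before it pulls in test.dtd. A minimal sketch
    of that rule, using a plain dictionary as a stand-in for the parser's
    entity table (the declare helper and the prefix 'my' are hypothetical):

```python
def declare(entities, name, value):
    """Record a parameter entity; a later redeclaration is ignored."""
    entities.setdefault(name, value)

entities = {}
declare(entities, "nsp", "my")  # wrapper DTD, read first (hypothetical prefix)
declare(entities, "nsp", "ns")  # default inside test.dtd, read later, ignored

doc_name = entities["nsp"] + ":doc"
print(doc_name)
```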


    Does this all look kosher? It works in my test example, but I don't have
    enough experience to say that this complexity is worthwhile.

    Many thanks for looking,
    Dougie
    D McGilvray, Jun 27, 2007
    #1

  2. DTDs predate namespaces, and are unaware of them... so if you're trying
    to use this combination, you need to write your instance documents to
    use EXACTLY the prefixes (or lack thereof) called for by the DTD, and to
    declare namespace bindings only where the DTD says they can be declared.

    > I want to include another XML file within my own. The other DTD has no
    > support for namespaces. I want to avoid naming conflicts, so I need
    > namespaces.


    The DTDs can and will conflict if they attempt to declare the same
    element or attribute names. The only way to avoid that is to have at
    least one of them use a prefix, which requires designing the DTD to
    expect that prefix.

    The kluge of using parameter entities to control which prefix a DTD is
    using is ugly, but does work as long as you are careful to make sure the
    instance document uses that prefix and only that prefix when referring
    to that namespace. And, yes, you need to do a bit of ugly magic to keep
    whitespace from being introduced next to the parameter entity, as you've
    discovered.

    > Does this all look kosher? It works in my test example, but I don't have
    > enough experience to say that this complexity is worthwhile.


    Yes, it works. But I would suggest that forcing DTDs to deal with
    namespaces is not particularly worthwhile these days. XML Schemas are
    fully namespace-aware, and will handle all of this without requiring so
    much magic or being so fragile.

    --
    Joe Kesselman / Beware the fury of a patient man. -- John Dryden
    Joseph Kesselman, Jun 27, 2007
    #2

  3. D McGilvray

    D McGilvray Guest

    Joseph Kesselman wrote:
    > DTDs predate namespaces, and are unaware of them... so if you're trying
    > to use this combination, you need to write your instance documents to
    > use EXACTLY the prefixes (or lack thereof) called for by the DTD, and to
    > declare namespace bindings only where the DTD says they can be declared.


    Ahhhh, a penny just dropped - because parsers are unaware of namespaces,
    they won't recognise default namespaces? So the namespace will have to
    be explicitly specified for every element which requires it? I actually
    hadn't thought of that, but I can live with it.

    >> Does this all look kosher? It works in my test example, but I don't
    >> have enough experience to say that this complexity is worthwhile.

    >
    > Yes, it works. But I would suggest that forcing DTDs to deal with
    > namespaces is not particularly worthwhile these days. XML Schemas are
    > fully namespace-aware, and will handle all of this without requiring so
    > much magic or being so fragile.


    Unfortunately, I have no control over the documents which will be
    included and, from the sounds of it, they have no intention of switching
    to Schemas any time soon. It is quite important to remain faithful to
    the included document's DTD, so that it stays valid both as an
    independent document and as an external entity included within my
    document. Perhaps later I'll look to see if there's a way to combine
    DTDs and Schemas, but I'm happy to use this for now.

    Thanks for your help,
    Dougie
    D McGilvray, Jun 27, 2007
    #3
  4. D McGilvray wrote:
    > Ahhhh, a penny just dropped - because parsers are unaware of namespaces,
    > they won't recognise default namespaces?


    That's not what I said.

    Modern parsers are aware of namespaces (or can be told to be so), and
    will do the right things with them as far as reading and processing the
    document's contents, including inheriting namespace bindings and
    applying the default namespace (when one has been asserted). However,
    "the right things" doesn't suffice for DTD validation.

    DTDs are *NOT* aware of namespaces. As far as the DTD is concerned, a
    namespace declaration is just an attribute, and a prefix is just part of
    the name of the element or attribute. DTDs will not correctly handle the
    case where the name isn't exactly what the DTD says it has to be, so
    they won't handle cases where a different prefix was used (or, as is not
    uncommon, several prefixes were used with the intent that they refer to
    the same namespace).
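
    To make the distinction concrete, here is a small sketch using Python's
    standard xml.dom.minidom, which is namespace-aware but does not
    validate. The second document is a hypothetical variant of the example
    from the first post that binds the same URI to the prefix x instead of
    ns. A namespace-aware consumer sees the same expanded name in both; a
    DTD compares only the literal tag name, which differs.

```python
from xml.dom import minidom

DOC_NS = '<ns:doc xmlns:ns="http://dougie/test/"><ns:a/><inc/></ns:doc>'
# Hypothetical variant: same namespace URI, different prefix
DOC_X = '<x:doc xmlns:x="http://dougie/test/"><x:a/><inc/></x:doc>'

def names(xml_text):
    """Return the literal tag name and the (URI, local name) pair of the root."""
    root = minidom.parseString(xml_text).documentElement
    return root.tagName, (root.namespaceURI, root.localName)

qname_ns, expanded_ns = names(DOC_NS)
qname_x, expanded_x = names(DOC_X)

print(qname_ns, qname_x)          # what a DTD would compare
print(expanded_ns == expanded_x)  # what namespace-aware code compares
```

    A DTD that declares ns:doc would reject the second document even though
    both roots are the same element as far as namespaces are concerned.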

    > Unfortunately, I have no control over the documents which will be
    > included and, from the sounds of it they have no intention of switching
    > to Schemas any time soon.


    If you really insist on doing DTD validation and namespaces at the same
    time, using parameter entities will at least let you explicitly avoid
    the cases where they decide to use the same prefix you wanted to use,
    and still get correct behavior from both the DTD and the document.

    But that is going to force you to deal with that explicit avoidance,
    which strikes me as excessively fragile. So I'd still be inclined to ask
    them to reconsider this, or to let you process their data as
    well-formed rather than trying to do combined DTD validation on the
    composite document.

    --
    Joe Kesselman / Beware the fury of a patient man. -- John Dryden
    Joseph Kesselman, Jun 27, 2007
    #4
  5. D McGilvray

    D McGilvray Guest

    Joseph Kesselman wrote:
    > D McGilvray wrote:
    >> Ahhhh, a penny just dropped - because parsers are unaware of
    >> namespaces, they won't recognise default namespaces?

    >
    > That's not what I said.
    >
    > Modern parsers are aware of namespaces (or can be told to be so), and
    > will do the right things with them as far as reading and processing the
    > document's contents, including inheriting namespace bindings and
    > applying the default namespace (when one has been asserted). However,
    > "the right things" doesn't suffice for DTD validation.


    Sorry, I explained myself terribly there. I meant to refer to
    validation, not parsing. Declaring a default namespace in my document
    (say in a sibling of the included document) doesn't allow me to leave
    out the prefix for children of that node.


    >
    > DTDs are *NOT* aware of namespaces. As far as the DTD is concerned, a
    > namespace declaration is just an attribute, and a prefix is just part of
    > the name of the element or attribute. DTDs will not correctly handle the
    > case where the name isn't exactly what the DTD says it has to be, so
    > they won't handle cases where a different prefix was used (or, as is not
    > uncommon, several prefixes were used with the intent that they refer to
    > the same namespace).
    >
    >> Unfortunately, I have no control over the documents which will be
    >> included and, from the sounds of it they have no intention of
    >> switching to Schemas any time soon.

    >
    > If you really insist on doing DTD validation and namespaces at the same
    > time, using parameter entities will at least let you explicitly avoid
    > the cases where they decide to use the same prefix you wanted to use,
    > and still get correct behavior from both the DTD and the document.
    >
    > But that is going to force you to deal with that explicit avoidance,
    > which strikes me as excessively fragile. So I'd still be inclined to ask
    > them to reconsider this, or to let you process their data as
    > well-formed rather than trying to do combined DTD validation on the
    > composite document.
    >

    People bring up the question of moving to Schema on the representation's
    mailing list fairly frequently. Each discussion is entertained less than
    the previous one.
    However, I am the only one forcing combined validation. I have multiple
    hierarchies referencing the same data using ID/IDREFS so I really need
    these validated because inconsistencies could mean hours investigating
    the XML by hand :O.
    I appreciate everything you say, and I agree it is fragile. Perhaps in the long-term
    I could write an application which checks validity rather than relying
    on a DTD and a validator. My priority now though, is to get a first
    draft finalised with minimal further development of tools.
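
    For what it's worth, such a checker need not be large. Below is a rough
    sketch in Python using only the standard library. It assumes the ID
    attribute is called id and the IDREF/IDREFS attribute is called ref;
    both names are placeholders, since in reality the DTD's ATTLIST
    declarations decide which attributes have those types. The sample
    document is made up for the demonstration and does not follow the
    content model in test.dtd.

```python
import xml.etree.ElementTree as ET

ID_ATTR = "id"      # assumed name of the ID-typed attribute
IDREF_ATTR = "ref"  # assumed name of the IDREF/IDREFS-typed attribute

def check_references(xml_text):
    """Return (duplicate IDs, references that match no ID) for a document."""
    root = ET.fromstring(xml_text)
    seen, duplicates, refs = set(), [], []
    for elem in root.iter():
        value = elem.get(ID_ATTR)
        if value is not None:
            if value in seen:
                duplicates.append(value)
            seen.add(value)
        ref = elem.get(IDREF_ATTR)
        if ref is not None:
            # IDREFS is a whitespace-separated list; a single IDREF also works
            refs.extend(ref.split())
    dangling = [r for r in refs if r not in seen]
    return duplicates, dangling

SAMPLE = """<ns:doc xmlns:ns="http://dougie/test/">
  <ns:a id="n1"/>
  <ns:a id="n1"/>
  <inc ref="n1"/>
  <inc ref="n1 n2"/>
</ns:doc>"""

duplicates, dangling = check_references(SAMPLE)
print("duplicate IDs:", duplicates)
print("dangling references:", dangling)
```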

    Thanks very much for the discussion; it has proven very helpful.

    Cheers,
    Dougie
    D McGilvray, Jun 28, 2007
    #5
  6. D McGilvray wrote:
    > Sorry, I explained myself terribly there. I meant to refer to
    > validation, not parsing. Declaring a default namespace in my document
    > (say in a sibling of the included document) doesn't allow me to leave
    > out the prefix for children of that node.


    The DTD will have been written to either require a specific prefix, or
    to require that no prefix be present. Whatever it says, that's how you
    have to write the instance document. The DTD will also constrain where
    namespace declarations can occur, and may or may not enforce one (which
    is the most common kluge for attempting to make DTDs play halfway nicely
    with namespaces).
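
    A related trap with default namespaces can be seen with a
    non-validating, namespace-aware parser such as Python's standard
    xml.dom.minidom. In the sketch below, the second document is a
    hypothetical rewrite of the original example that uses a default
    namespace on the root. Not only do the literal names no longer match a
    DTD that declares ns:doc and ns:a, but the unprefixed inc element
    inherits the default namespace as well, so it is no longer unbound.

```python
from xml.dom import minidom

PREFIXED = '<ns:doc xmlns:ns="http://dougie/test/"><ns:a/><inc/></ns:doc>'
# Hypothetical rewrite using a default namespace instead of a prefix
DEFAULTED = '<doc xmlns="http://dougie/test/"><a/><inc/></doc>'

def describe(xml_text):
    """Map each element's literal tag name to its resolved namespace URI."""
    root = minidom.parseString(xml_text).documentElement
    children = [n for n in root.childNodes if n.nodeType == n.ELEMENT_NODE]
    return {e.tagName: e.namespaceURI for e in [root] + children}

prefixed = describe(PREFIXED)
defaulted = describe(DEFAULTED)

print(prefixed)
print(defaulted)
```

    With the prefixed form, inc resolves to no namespace at all; with the
    default-namespace form it resolves to http://dougie/test/, which is one
    more reason to write the instance exactly as the DTD spells it.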

    --
    Joe Kesselman / Beware the fury of a patient man. -- John Dryden
    Joseph Kesselman, Jun 28, 2007
    #6