including xml entities with their own doctypes

Discussion in 'XML' started by the.computational biologist, Jul 21, 2011.

  1. hi all - this is a pretty newbie question, so sorry if it's easily
    found (though i've searched for a while and can't find a definitive
    answer)...

    i have a DTD (doctype), say A:

    <!DOCTYPE my_container [
    <!ELEMENT my_container (my_parent_item)>
    <!ELEMENT my_parent_item (my_child_item)>
    <!ELEMENT my_child_item EMPTY>
    ]>

    now i'd like to have an example of this container as (something like):

    <!DOCTYPE my_container SYSTEM "my_container.dtd" [
    <!ENTITY include_file SYSTEM "my_parent_item.xml">
    ]>
    <my_container>&include_file;</my_container>

    the problem is, i'd like this "my_parent_item.xml" file to be "stand-
    alone" in the sense that it will have its own DTD, with a DOCTYPE of
    my_parent_item (i.e. i don't expect the my_parent_item to have to know
    that it may be inside a my_container).
    furthermore, the my_parent_item DOCTYPE definition may provide
    additional features about the my_parent_item object that my_container
    didn't know about (e.g. maybe a "name" attribute).

    when a validator processes the &include_file; entity, will i wind up
    with an error due to multiple DOCTYPE declarations (i.e. will most
    validators try to read the DOCTYPE of an *included* XML file)?

    thanks for any insight into this, and even some links in the right
    direction with such an example would be wonderfully appreciated.

    cheers!
     
    the.computational biologist, Jul 21, 2011
    #1

  2. * the.computational biologist wrote in comp.text.xml:
    ><!DOCTYPE my_container SYSTEM "my_container.dtd" [
    > <!ENTITY include_file SYSTEM "my_parent_item.xml">
    >]>
    ><my_container>&include_file;</my_container>
    >
    >the problem is, i'd like this "my_parent_item.xml" file to be "stand-
    >alone" in the sense that it will have its own DTD, with a DOCTYPE of
    >my_parent_item (i.e. i don't expect the my_parent_item to have to know
    >that it may be inside a my_container).


    That is not possible: XML does not permit a document type declaration
    in an entity's replacement text, because entity inclusion is, to some
    extent, a literal text substitution mechanism. Standards like XInclude
    do allow including a complete stand-alone document, but you'd then
    need tools that support XInclude.

    >furthermore, the my_parent_item DOCTYPE definition may provide
    >additional features about the my_parent_item object that my_container
    >didn't know about (e.g. maybe a "name" attribute).


    You can split the document type definition over multiple files, but
    you cannot use the type definitions from a different XML document.
    Technically it might be possible to arrange a RELAX NG schema like
    that, but you'd be dealing with elements instead of special language
    constructs like DTDs.
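
    A minimal sketch of splitting the document type definition over
    multiple files, assuming a hypothetical file my_parent_item.dtd that
    is not part of the original question. The item declarations live in
    their own file, and my_container.dtd pulls them in through a
    parameter entity (the entity name parent_item_decls is likewise made
    up for illustration):

    ```xml
    <!-- my_parent_item.dtd (hypothetical file name) -->
    <!ELEMENT my_parent_item (my_child_item)>
    <!ATTLIST my_parent_item name CDATA #IMPLIED>
    <!ELEMENT my_child_item EMPTY>

    <!-- my_container.dtd: reuses the declarations above instead of
         repeating them (declaring an element type twice is an error) -->
    <!ENTITY % parent_item_decls SYSTEM "my_parent_item.dtd">
    %parent_item_decls;
    <!ELEMENT my_container (my_parent_item)>
    ```

    This shares the declarations between the two document types, but it
    does not lift the restriction on the data file: an entity included
    via &include_file; still must not carry a DOCTYPE of its own.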

    >thanks for any insight into this, and even some links in the right
    >direction with such an example would be wonderfully appreciated.


    http://www.w3.org/TR/xml/#NT-extParsedEnt has the requirements for
    external parsed entities: apart from an optional text declaration,
    they need to match the `content` production, which allows for
    character data and elements (and does not require a single root
    element), but no document type declaration.
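
    As an illustration of those requirements, here is a sketch of what
    my_parent_item.xml could look like if it is to be usable as the
    &include_file; entity from the original question. The encoding shown
    is an assumption; the text declaration as a whole is optional.

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <!-- text declaration above, then content only: no DOCTYPE allowed here -->
    <my_parent_item>
      <my_child_item/>
    </my_parent_item>
    ```

    Any declarations for my_parent_item have to come from the DTD of the
    document that references the entity, not from this file.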
    --
    Björn Höhrmann · http://bjoern.hoehrmann.de
    Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
    25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
     
    Bjoern Hoehrmann, Jul 21, 2011
    #2

  3. thanks for the info!

    it's a bit disappointing to find out, though.
    with my particular application, i'd like to re-use some pieces of data
    between XML documents, but have those data stand on their own, too.
    specifically i'm working on annotated matrices, where the matrix
    itself is made up of row information, col information, and the value-
    containing cells.
    since i'll be concatenating matrices often, sharing the row and column
    keys between matrices is useful.
    but furthermore, there is often semi-structured annotation attached to
    each row and column that i'd like to be able to retrieve separately
    from the value-holding matrices (i.e. displaying info about the
    columns and rows doesn't require knowing what the cell values are for
    any particular matrix).

    i had envisioned separate column annotation and row annotation
    documents (i.e. with their own DOCTYPE declaration), and then
    embedding those column and row files into the individual matrices
    (with *their* own DOCTYPE declaration).

    ah well, maybe this is why proprietary (and thus less-exchangeable)
    file formats will never completely die off :-/

    cheers, and thanks again for the info and link!

    -m

     
    the.computational biologist, Jul 22, 2011
    #3
  4. "the.computational biologist" writes:

    > it's a bit disappointing to find out, though.
    > with my particular application, i'd like to re-use some pieces of data
    > between XML documents, but have those data stand on their own, too.


    Define external entities (i.e., without doctype) that contain document
    data, and then a separate document (i.e., with doctype) for each
    possible assembly of entities you're interested in. Can be cumbersome.

    Or use something like XInclude, provided the processors you use support
    it. Actually, depending on how you are going to process the data, there
    may be easier solutions. For instance, XSLT processors are able to
    include external documents; typically, an XML document may refer to
    other documents (not entities) in attribute values, which are then
    included when necessary.
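
    A minimal sketch of the XInclude variant, reusing the file and
    element names from the original question. With XInclude the included
    file is parsed as a complete document, so my_parent_item.xml can keep
    its own DOCTYPE; that declaration is not carried into the merged
    result.

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <my_container xmlns:xi="http://www.w3.org/2001/XInclude">
      <!-- replaced by the root element of my_parent_item.xml
           once an XInclude-aware processor has run -->
      <xi:include href="my_parent_item.xml"/>
    </my_container>
    ```

    Whether DTD validation happens before or after inclusion depends on
    the processor, so a DTD for my_container may need to account for the
    xi:include element or for attributes added during inclusion.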

    -- Alain.
     
    Alain Ketterlin, Jul 22, 2011
    #4
  5. On 22/07/11 19:53, the.computational biologist wrote:
    > thanks for the info!
    >
    > it's a bit disappointing to find out, though.


    I'm afraid it's part of the standard: this is one of those restrictions
    inherited from SGML which has been a PITA for a couple of decades.

    One way round it is to maintain the document fragments with their own
    Document Type Declaration (as the first line) so that you can edit them
    stand-alone, but export them without the top line to a related filename
    after each edit, e.g.:

    $ tail -n +2 matrixfoo.xml >MatrixFoo.xml

    If you use a programmable XML editor, you may be able to program it to
    do this automatically each time you save-and-close.

    > ah well, maybe this is why proprietary (and thus less-exchangeable)
    > file formats will never completely die off :-/


    There is no reason why this kind of file management shouldn't be built
    into existing editors, but I don't know any which do it off-the-shelf.
    I suspect it would be relatively trivial to do this for Emacs and for
    the Arbortext Editor. *Lots* of people who still work with DTDs would be
    very happy to see that.

    ///Peter
     
    Peter Flynn, Jul 23, 2011
    #5
  6. XML Schemas are more flexible in this respect than DTDs (as they are
    in other respects, which is why folks are being encouraged to move in
    that direction). Of course, the declaration for the element into which
    you insert the "foreign" element must say that this is permitted.

    However, Schemas don't handle defining parsed entities. As others have
    said, you'd need to use XInclude or some similar mechanism to achieve
    that combination.
     
    Joe Kesselman, Jul 23, 2011
    #6
