In a DTD, how do I specify that an element contains arbitrary other markup?

Discussion in 'XML' started by Simon Brooke, Jul 16, 2010.

  1. Simon Brooke

    Simon Brooke Guest

    I maintain a DTD which is used to specify XML documents which are mostly
    marked up in the dialect specified by the DTD ('ADL'), but in which there
    are three elements whose contents are intended to be arbitrary XHTML 1.1.

    Currently I've declared these as #PCDATA and simply ignore the parse
    errors this causes, but this isn't very good and I'm not proud of it.

    Is there a syntax for saying 'this element contains markup from this
    other namespace' (I assume not, since SGML doesn't know about XML
    namespaces?)? Otherwise, is there a syntax for saying 'this element
    contains arbitrary markup'?

    I know that I could include the XHTML DTD into my DTD but I'd prefer not
    to do this as I'd prefer to keep the namespaces separate - to be able to
    do:

    <adl:topmatter>
    <xhtml:div class="top">
    <xhtml:p>This appears at the top of every page</xhtml:p>
    </xhtml:div>
    </adl:topmatter>

    I know also that I really ought to move to an XSD schema, but I find them
    just too prolix and awkward to work with!

    --

    ;; Semper in faecibus sumus, sole profundam variat
     
    Simon Brooke, Jul 16, 2010
    #1

  2. Re: In a DTD, how do I specify that an element contains arbitrary other markup?

    Simon Brooke wrote:

    > Is there a syntax for saying 'this element contains markup from this
    > other namespace' (I assume not, since SGML doesn't know about XML
    > namespaces?)? Otherwise, is there a syntax for saying 'this element
    > contains arbitrary markup'?


    You can say
    <!ELEMENT foo ANY>
    to allow any content for 'foo' elements but nevertheless any elements
    then put inside of 'foo' elements are supposed to be declared in the DTD.
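
    As a minimal sketch of what that means in practice (the element names
    'bar' and 'baz' are made up for illustration):

    ```xml
    <?xml version="1.0"?>
    <!DOCTYPE foo [
      <!-- 'foo' may hold any mix of text and declared elements -->
      <!ELEMENT foo ANY>
      <!-- anything used inside 'foo' still needs its own declaration -->
      <!ELEMENT bar (#PCDATA)>
      <!ELEMENT baz EMPTY>
    ]>
    <foo>Some text, <bar>a declared element</bar> and <baz/>.</foo>
    ```

    An undeclared element inside 'foo' would leave the document well-formed
    but should be reported as invalid by a validating parser.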

    --

    Martin Honnen
    http://msmvps.com/blogs/martin_honnen/
     
    Martin Honnen, Jul 16, 2010
    #2

  3. Simon Brooke

    Simon Brooke Guest

    Re: In a DTD, how do I specify that an element contains arbitrary other markup?

    On Fri, 16 Jul 2010 12:34:22 +0200, Martin Honnen wrote:

    > Simon Brooke wrote:
    >
    >> Is there a syntax for saying 'this element contains markup from this
    >> other namespace' (I assume not, since SGML doesn't know about XML
    >> namespaces?)? Otherwise, is there a syntax for saying 'this element
    >> contains arbitrary markup'?

    >
    > You can say
    > <!ELEMENT foo ANY>
    > to allow any content for 'foo' elements but nevertheless any elements
    > then put inside of 'foo' elements are supposed to be declared in the
    > DTD.


    H'mmmm....

    So, is there any mechanism for doing what I'm trying to do with a DTD, or
    am I in fact forced to change to a schema?

    (thanks for the answer, by the way)

    --

    ;; Semper in faecibus sumus, sole profundam variat
     
    Simon Brooke, Jul 16, 2010
    #3
  4. Re: In a DTD, how do I specify that an element contains arbitrary other markup?

    Simon Brooke wrote:

    > So, is there any mechanism for doing what I'm trying to do with a DTD, or
    > am I in fact forced to change to a schema?


    I am not sure, for instance there is modularized XHTML
    http://www.w3.org/TR/xhtml-modularization/ which has DTD based modules
    and talks about using such modules to create a new DTD but I have not
    really mastered that stuff, frankly when it comes to composing elements
    from different namespaces I prefer using the XML syntax of XML schemas
    rather than all that parameterized entity stuff those DTD based examples
    use.


    --

    Martin Honnen
    http://msmvps.com/blogs/martin_honnen/
     
    Martin Honnen, Jul 16, 2010
    #4
  5. Simon Brooke

    Simon Brooke Guest

    Re: In a DTD, how do I specify that an element contains arbitrary other markup?

    On Fri, 16 Jul 2010 14:29:48 +0200, Martin Honnen wrote:

    > Simon Brooke wrote:
    >
    >> So, is there any mechanism for doing what I'm trying to do with a DTD,
    >> or am I in fact forced to change to a schema?

    >
    > I am not sure, for instance there is modularized XHTML
    > http://www.w3.org/TR/xhtml-modularization/ which has DTD based modules
    > and talks about using such modules to create a new DTD but I have not
    > really mastered that stuff, frankly when it comes to composing elements
    > from different namespaces I prefer using the XML syntax of XML schemas
    > rather than all that parameterized entity stuff those DTD based examples
    > use.


    OK, thanks!

    --

    ;; Semper in faecibus sumus, sole profundam variat
     
    Simon Brooke, Jul 16, 2010
    #5
  6. Re: In a DTD, how do I specify that an element contains arbitrary other markup?

    On 7/16/2010 8:08 AM, Simon Brooke wrote:
    > So, is there any mechanism for doing what I'm trying to do with a DTD, or
    > am I in fact forced to change to a schema?


    Well, you *could* give up DTD validation and move all the structural
    checks into your application code... or run without them, if you trust
    the folks generating your documents...

    --
    Joe Kesselman,
    http://www.love-song-productions.com/people/keshlam/index.html

    {} ASCII Ribbon Campaign | "may'ron DaroQbe'chugh vaj bIrIQbej" --
    /\ Stamp out HTML mail! | "Put down the squeezebox & nobody gets hurt."
     
    Joe Kesselman, Jul 16, 2010
    #6
  7. Simon Brooke

    Simon Brooke Guest

    Re: In a DTD, how do I specify that an element contains arbitrary other markup?

    On Fri, 16 Jul 2010 12:39:28 -0400, Joe Kesselman wrote:

    > On 7/16/2010 8:08 AM, Simon Brooke wrote:
    >> So, is there any mechanism for doing what I'm trying to do with a DTD,
    >> or am I in fact forced to change to a schema?

    >
    > Well, you *could* give up DTD validation and move all the structural
    > checks into your application code... or run without them, if you trust
    > the folks generating your documents...


    The benefit of a DTD (from my point of view) is largely that it allows
    well written XML editors to prompt the user as to what elements/
    attributes are legitimate at what point in the document. The document is
    'interpreted' by a set of XSL transforms which generate SQL, Hibernate,
    Velocity, C# and Java code. So in as much as the XSL will only transform
    valid markup it can be said to do the structural checks, although it
    generates errors only for a small number of unexpected constructs.

    --

    ;; Semper in faecibus sumus, sole profundam variat
     
    Simon Brooke, Jul 16, 2010
    #7
  8. Peter Flynn

    Peter Flynn Guest

    Re: In a DTD, how do I specify that an element contains arbitrary other markup?

    On 16/07/10 10:23, Simon Brooke wrote:
    > I maintain a DTD which is used to specify XML documents which are mostly
    > marked up in the dialect specified by the DTD ('ADL'), but in which there
    > are three elements whose contents are intended to be arbitrary XHTML 1.1.
    >
    > Currently I've declared these as #PCDATA and simply ignore the parse
    > errors this causes, but this isn't very good and I'm not proud of it.


    Indeed :)

    > Is there a syntax for saying 'this element contains markup from this
    > other namespace' (I assume not, since SGML doesn't know about XML
    > namespaces?)? Otherwise, is there a syntax for saying 'this element
    > contains arbitrary markup'?


    Not in DTDs, as such. When the concept of namespaces was first mooted,
    before they were even officially called namespaces, there was an
    assumption that they might let you construct a DTD from modules "called"
    from existing other DTDs, in the sense of saying "At this point, I'll
    have lists done the way DocBook does them, tables done the way HTML does
    them, sections done the way TEI does them, etc etc", but the lack of any
    coherent way of modularising the world's DTDs put a stop to that.

    You can of course declare an element type name "xhtml:h1" in a DTD, and
    it is perfectly validatable with any modern XML validator. So one
    approach is to copy and edit an ad-hoc compact version of as much of the
    XHTML body content model as you wish to allow, and make it the content
    model of your three element types, editing it to be as loose or tight as
    you need.

    Or, as Martin suggests, use ANY as the content model, having declared
    all the necessary element types that will occur.

    > I know that I could include the XHTML DTD into my DTD but I'd prefer not
    > to do this as I'd prefer to keep the namespaces separate


    But you can:

    <!DOCTYPE adl:topmatter [
    <!ELEMENT adl:topmatter (xhtml:div)>
    <!ELEMENT xhtml:div (xhtml:p)+>
    <!ATTLIST xhtml:div class (top|middle|bottom) #IMPLIED>
    <!ELEMENT xhtml:p (#PCDATA)>
    ]>
    <adl:topmatter>
    <xhtml:div class="top">
    <xhtml:p>This appears at the top of every page</xhtml:p>
    </xhtml:div>
    </adl:topmatter>

    $ onsgmls -wxml -e -g -s -u /usr/share/xml/declaration/xml.dcl test.xml
    onsgmls:/usr/share/xml/declaration/xml.dcl:1:W: SGML declaration was not
    implied
    Compilation finished at Sun Jul 18 23:10:40
    $ rxp test.xml >/dev/null
    $

    Just be aware that although the colon is accepted as a valid name
    character in DTDs, it does not get interpreted as a namespace delimiter.
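
    One common way to bridge that gap, sketched here rather than tested
    against any particular toolchain, is to have the DTD supply the xmlns
    declarations as #FIXED attribute defaults, so that a namespace-aware
    parser reading the DTD sees the prefixes bound. The ADL URI below is the
    one that appears elsewhere in this thread; the XHTML URI is the usual
    XHTML 1.x namespace.

    ```xml
    <?xml version="1.0"?>
    <!DOCTYPE adl:topmatter [
      <!ELEMENT adl:topmatter (xhtml:div)>
      <!-- bind both prefixes on the root via fixed attribute defaults -->
      <!ATTLIST adl:topmatter
          xmlns:adl   CDATA #FIXED "http://bowyer.journeyman.cc/adl/unstable/adl/"
          xmlns:xhtml CDATA #FIXED "http://www.w3.org/1999/xhtml">
      <!ELEMENT xhtml:div (xhtml:p)+>
      <!ATTLIST xhtml:div class (top|middle|bottom) #IMPLIED>
      <!ELEMENT xhtml:p (#PCDATA)>
    ]>
    <adl:topmatter>
      <xhtml:div class="top">
        <xhtml:p>This appears at the top of every page</xhtml:p>
      </xhtml:div>
    </adl:topmatter>
    ```

    The catch is that the prefixes become hard-wired: the DTD matches the
    literal names 'adl:topmatter' and 'xhtml:div', so a document using the
    same namespaces under different prefixes would not validate.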

    > I know also that I really ought to move to an XSD schema, but I find them
    > just too prolix and awkward to work with!


    Many people would agree with you. It is a fallacy that they are a
    requirement to use XML. But you should definitely consider expressing
    your grammar in RelaxNG: that way you can generate a DTD or a W3C
    Schema as and when needed, but use a more human-friendly language for
    defining it.
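
    To give a flavour of it: RelaxNG can express the original requirement
    (an element containing arbitrary markup from another namespace) directly,
    using a name class. This is a minimal sketch in the XML syntax, assuming
    the ADL namespace URI that appears elsewhere in this thread and the usual
    XHTML 1.x namespace; the pattern name 'any-xhtml' is made up.

    ```xml
    <grammar xmlns="http://relaxng.org/ns/structure/1.0"
             ns="http://bowyer.journeyman.cc/adl/unstable/adl/">
      <start>
        <element name="topmatter">
          <zeroOrMore>
            <ref name="any-xhtml"/>
          </zeroOrMore>
        </element>
      </start>

      <!-- any element in the XHTML namespace, with any attributes,
           holding text and further XHTML elements -->
      <define name="any-xhtml">
        <element>
          <nsName ns="http://www.w3.org/1999/xhtml"/>
          <zeroOrMore>
            <choice>
              <attribute><anyName/></attribute>
              <text/>
              <ref name="any-xhtml"/>
            </choice>
          </zeroOrMore>
        </element>
      </define>
    </grammar>
    ```

    A wildcard like this has no DTD equivalent, so I would not expect it to
    survive a conversion back to DTD form.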

    ///Peter
    --
    XML FAQ: http://xml.silmaril.ie/
     
    Peter Flynn, Jul 18, 2010
    #8
  9. Simon Brooke

    Simon Brooke Guest

    Re: In a DTD, how do I specify that an element contains arbitrary other markup?

    On Sun, 18 Jul 2010 23:18:20 +0100, Peter Flynn wrote:

    >> I know also that I really ought to move to an XSD schema, but I find
    >> them just too prolix and awkward to work with!

    >
    > Many people would agree with you. It is a fallacy that they are a
    > requirement to use XML. But you should definitely consider expressing
    > your grammar in RelaxNG: that way you can generate a DTD or a W3C
    > Schema as and when needed, but use a more human-friendly language for
    > defining it.


    Thank you very much indeed. I had not considered the possibility of using
    RelaxNG - I'd heard of it, but didn't have a clear idea of what it was or
    what its benefits were. I'll have a look.

    --

    ;; Semper in faecibus sumus, sole profundam variat
     
    Simon Brooke, Jul 19, 2010
    #9
  10. Simon Brooke

    Simon Brooke Guest

    Re: In a DTD, how do I specify that an element contains arbitrary other markup?

    On Mon, 19 Jul 2010 11:40:23 +0000, Simon Brooke wrote:

    > On Sun, 18 Jul 2010 23:18:20 +0100, Peter Flynn wrote:
    >
    >>> I know also that I really ought to move to an XSD schema, but I find
    >>> them just too prolix and awkward to work with!

    >>
    >> Many people would agree with you. It is a fallacy that they are a
    >> requirement to use XML. But you should definitely consider expressing
    >> your grammar in RelaxNG: that way you can generate a DTD or a W3C
    >> Schema as and when needed, but use a more human-friendly language for
    >> defining it.

    >
    > Thank you very much indeed. I had not considered the possibility of
    > using RelaxNG - I'd heard of it, but didn't have a clear idea of what it
    > was or what its benefits were. I'll have a look.


    And to follow up to myself (poor form, I know), I've just been playing
    with Trang[1], and I'm /very/ impressed. It has converted my DTD into
    RelaxNG, preserving the comments (important!), and the RelaxNG syntax is
    indeed very readable (I mildly prefer the RNG XML syntax to the RNC
    'compact' syntax). This looks very promising.

    [1] http://code.google.com/p/jing-trang/

    --

    ;; Semper in faecibus sumus, sole profundam variat
     
    Simon Brooke, Jul 19, 2010
    #10
  11. Simon Brooke

    Simon Brooke Guest

    Once more with RelaxNG: How do I specify markup from a different dialect of XML?

    On Fri, 16 Jul 2010 09:23:08 +0000, Simon Brooke wrote:

    > I maintain a DTD which is used to specify XML documents which are mostly
    > marked up in the dialect specified by the DTD ('ADL'), but in which
    > there are three elements whose contents are intended to be arbitrary
    > XHTML 1.1.
    >


    Peter Flynn helpfully pointed me to RelaxNG, which does indeed provide a
    very nice syntax for specifying a grammar (I'm using the XML syntax, which
    I find easier than the 'compact' syntax, but as they're interchangeable
    that's just preference). I see that RelaxNG has a mechanism for
    referencing external documents:

    http://www.relaxng.org/tutorial-20011203.html#IDA04YR

    I also found on W3C's website a specification - possibly out of date - of
    XHTML 2.0 as a series of RelaxNG modules:

    http://www.w3.org/TR/2003/WD-xhtml2-20030506/relax_module_defs.html

    (I couldn't find anywhere these were downloadable as a zip or similar,
    but I have copied and pasted into a set of working files to experiment
    with).

    However, I haven't worked out how these are supposed to work together
    since they clearly depend on one another but make no use either of the
    'externalRef' mechanism or of the 'include' mechanism. I do note that
    they make heavy use of the 'combine' mechanism.

    The RelaxNG tool I'm currently using, trang, allows one input grammar
    file only - it doesn't permit several input grammar files to be
    specified. So I haven't yet worked out how to use multiple XHTML2 modules
    together. Also, when I try trang on .rng files which contain
    externalRefs, I get:

    simon@gododdin:~/workspace/adl/schemas$ java -jar /home/simon/Downloads/useful/trang-20091111/trang.jar adl-1.4.rng test.xsd
    /home/simon/workspace/adl/schemas/adl-1.4.rng:1011:52: error: sorry, externalRef is not yet supported

    so I don't know whether I'm doing what I'm doing right. But what I'm
    trying to do is specify that (for example) an ADL headmatter element may
    contain xhtml script, link, meta and style elements, so in adl.rng
    I have:

    <grammar xmlns="http://relaxng.org/ns/structure/1.0"
    datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
    ns="http://bowyer.journeyman.cc/adl/unstable/adl/">
    ....
    <define name="headmatter">
    <element name="headmatter">
    <ref name="attlist.headmatter"/>
    <externalRef href="permitted-html-head.rng"/>
    </element>
    </define>
    <define name="attlist.headmatter" combine="interleave">
    <empty/>
    </define>

    and in a separate file 'permitted-html-head.rng' I have

    <?xml version="1.0" encoding="UTF-8"?>
    <grammar xmlns="http://relaxng.org/ns/structure/1.0"
    datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
    ns="http://www.w3.org/2002/06/xhtml2/">

    <start>
    <ref name="permitted-xhtml-head" />
    </start>

    <define name="permitted-xhtml-head">
    <zeroOrMore>
    <choice>
    <element name="content">
    <externalRef href="xhtml-2/xhtml-scripting.rng" />
    <externalRef href="xhtml-2/xhtml-link.rng" />
    <externalRef href="xhtml-2/xhtml-meta.rng" />
    <externalRef href="xhtml-2/xhtml-style.rng" />
    </element>
    </choice>
    </zeroOrMore>
    </define>
    </grammar>

    What I hope this is specifying is, e.g.:

    <adl:headmatter>
    <adl:content>
    <xhtml:link rel="stylesheet" type="text/css" href="styles.css" />
    <xhtml:meta name="generator"
    content="Application description language framework" />
    </adl:content>
    </adl:headmatter>

    I'd much rather not have the <adl:content> tag in there but I haven't yet
    worked out a way of getting rid of it. I do specifically want to keep the
    namespaces 'adl:' and 'xhtml:' distinct.

    So, the questions:

    Given that trang does not (yet) handle externalRefs, is there a tool I
    can use which will translate a RelaxNG grammar using externalRefs into an
    XSD schema?

    And, generally, am I on the right lines?
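
    Two details of the grammar above are worth noting. The 'content' element
    is declared in a file whose 'ns' attribute is the XHTML 2 namespace, so
    it lands in that namespace rather than in 'adl:'. And several patterns
    placed one after another inside an element form an ordered sequence, not
    a set of alternatives. One possible direction, as a sketch only, is to
    drop the wrapper and list the permitted element names in a name class.
    The pattern names here are made up, and the content of each XHTML element
    is left unconstrained rather than taken from the XHTML 2 modules.

    ```xml
    <grammar xmlns="http://relaxng.org/ns/structure/1.0"
             ns="http://bowyer.journeyman.cc/adl/unstable/adl/">
      <start>
        <ref name="headmatter"/>
      </start>

      <define name="headmatter">
        <element name="headmatter">
          <zeroOrMore>
            <ref name="permitted-xhtml-head"/>
          </zeroOrMore>
        </element>
      </define>

      <!-- one of four named elements; 'ns' is inherited by the names -->
      <define name="permitted-xhtml-head">
        <element ns="http://www.w3.org/2002/06/xhtml2/">
          <choice>
            <name>script</name>
            <name>link</name>
            <name>meta</name>
            <name>style</name>
          </choice>
          <ref name="anything"/>
        </element>
      </define>

      <!-- deliberately loose: any attributes, text and child elements -->
      <define name="anything">
        <zeroOrMore>
          <choice>
            <attribute><anyName/></attribute>
            <text/>
            <element><anyName/><ref name="anything"/></element>
          </choice>
        </zeroOrMore>
      </define>
    </grammar>
    ```

    This would accept xhtml:link and xhtml:meta directly inside
    adl:headmatter, with the 'xhtml:' prefix bound to the XHTML 2 namespace
    URI, and it avoids externalRef altogether. Whether trang turns it into a
    usable XSD is something I have not checked.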

    --

    ;; Semper in faecibus sumus, sole profundam variat
     
    Simon Brooke, Jul 23, 2010
    #11
