XSD vs RelaxNG

Discussion in 'XML' started by Frank Greco, Jan 31, 2011.

  1. Frank Greco

    Frank Greco Guest

    I'm sure more people use XSD than RelaxNG. But I'm curious if it's
    worth investigating RelaxNG as an alternative.
    I see there is an O'Reilly book which indicates a decent audience (in
    theory I guess). But how many are actually using RelaxNG?

    The bottom line is I'm hitting constraints on XSD sequencing. It's
    either a forced order of elements with each element potentially
    occurring 'n' times, or no ordering with each element occurring only 0
    or 1 times. I need no ordering with each element potentially occurring
    n times, which is not allowed for some reason.

    Thanks

    F
    Frank Greco, Jan 31, 2011
    #1

  2. On 1/31/2011 1:57 PM, Frank Greco wrote:
    > need no-ordering with each element potentially occurring n times, which
    > is not allowed for some reason.


    Historically, it was very difficult to generate a finite state machine
    which would efficiently validate that sort of constraint in the general
    case, so parsers of all kinds (not just XML validation) tended to be
    written to forbid it. There are newer techniques which will handle it,
    but since in practice there is almost never a NEED for that kind of
    constraint, the folks defining languages don't like writing such a
    requirement into their specifications. Also, standards groups worry
    about how well the data will interchange with other systems (e.g.
    databases) which don't support that kind of constraint.

    Remember, schema or DTD is only the first level of validation --
    higher-order syntax checking, if you will -- and you will often
    want/need to impose additional checking at the application level. If you
    need this constraint, consider leaving the count unconstrained at the
    schema level and imposing limits in your application code.
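
    A minimal sketch of that approach in XSD 1.0, assuming the element
    names from the book example given later in this thread: a repeating
    choice accepts the children in any order and in any number, and the
    application is left to enforce the counts. The xs:decimal type for
    price is also an assumption.

    ```xml
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:element name="book">
        <xs:complexType>
          <!-- Any mix of these children, in any order, any number of times. -->
          <!-- The schema no longer enforces "exactly one title" and so on;
               the application has to check those counts itself. -->
          <xs:choice minOccurs="0" maxOccurs="unbounded">
            <xs:element name="references" type="xs:string"/>
            <xs:element name="title" type="xs:string"/>
            <xs:element name="author" type="xs:string"/>
            <xs:element name="price" type="xs:decimal"/>
          </xs:choice>
        </xs:complexType>
      </xs:element>
    </xs:schema>
    ```

    The trade-off is that a book with two titles or no price would pass
    schema validation, so the second-level check is not optional.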

    There's always the Horribly Ugly solution of having the schema
    explicitly spell out all the possible combinations/orderings of the
    children.

    But the best answer is to reconsider your document design -- ask whether
    there's another way to represent your data which doesn't require this
    combination of unordered and frequency-limited. In 99.5% of the cases
    I've seen, that requirement evaporates when you stop to think about how
    the documents will actually be produced and used and whether some of the
    entries should be grouped together under parent elements rather than all
    being siblings.
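
    As one illustration of such a redesign, the repeating element can be
    wrapped in a parent so that every direct child of the record occurs at
    most once, which XSD 1.0 all-groups do allow. This is only a sketch:
    the wrapper name referenceList is hypothetical, and the other names
    are taken from the book example given later in this thread.

    ```xml
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:element name="book">
        <xs:complexType>
          <!-- xs:all: children in any order, each at most once (XSD 1.0). -->
          <xs:all>
            <xs:element name="title" type="xs:string"/>
            <xs:element name="author" type="xs:string"/>
            <xs:element name="price" type="xs:decimal"/>
            <!-- Hypothetical wrapper that holds the repeating entries. -->
            <xs:element name="referenceList" minOccurs="0">
              <xs:complexType>
                <xs:sequence>
                  <xs:element name="references" type="xs:string"
                              maxOccurs="3"/>
                </xs:sequence>
              </xs:complexType>
            </xs:element>
          </xs:all>
        </xs:complexType>
      </xs:element>
    </xs:schema>
    ```

    The wrapper can still appear anywhere among its siblings, but the
    references themselves must sit together inside it.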


    As far as RelaxNG goes: What I heard about it early on was that it was a
    lot more straightforward for many of the common cases than XML
    Schemas... but I don't know what its status is or how heavily it has
    been stress-tested since then. I do know that the single biggest issue
    remains portability -- RelaxNG may be a fine choice if you have complete
    control over both the document's generator and consumer, but it may not
    help you much when working with other folks who don't have a
    RelaxNG-enabled system. It's reportedly possible to generate an XML
    Schema from a RelaxNG document spec... but that involves giving up any
    constraints that Schema doesn't easily support.


    --
    Joe Kesselman,
    http://www.love-song-productions.com/people/keshlam/index.html

    {} ASCII Ribbon Campaign | "may'ron DaroQbe'chugh vaj bIrIQbej" --
    /\ Stamp out HTML mail! | "Put down the squeezebox & nobody gets hurt."
    Joe Kesselman, Jan 31, 2011
    #2

  3. Frank Greco wrote:
    > need no-ordering with each element potentially occurring n times, which
    > is not allowed for some reason.


    Note that work is under way to specify and implement (in Apache Xerces
    and in Saxon so far, I think) version 1.1 of the W3C XML Schema language.
    As far as I understand http://www.w3.org/TR/xmlschema11-1/#ch_models:
    --- quote ---------------------------
    Several of the constraints imposed by version 1.0 of this specification
    on all-groups have been relaxed:

    ....

    2.
    The value of maxOccurs may now be greater than 1 on particles in
    an all group. The elements which match a particular particle need not be
    adjacent in the input.
    --- quote ---------------------------

    you might be able to achieve what you want with version 1.1. But I don't
    have time to test that now.


    --

    Martin Honnen
    http://msmvps.com/blogs/martin_honnen/
    Martin Honnen, Jan 31, 2011
    #3
  4. Frank Greco

    Frank Greco Guest

    Thanks for the reply Joe.

    Essentially I need an unordered list of complex types, with a
    potential for N entries of one of the types. For example, I need
    something like this:

    <book>
      <references>This is ref #1</references>
      <references>This is ref #2</references>
      <references>This is ref #3</references>

      <title>Book Title One</title>
      <author>Joe Blog</author>
      <price>10.50</price>
    </book>

    I want the user to be allowed to enter any of the complex types without
    any ordering. That is, the following should be legal:

    <book>
      <title>Book Title One</title>
      <author>Joe Blog</author>

      <references>This is ref #1</references>
      <references>This is ref #2</references>
      <references>This is ref #3</references>

      <price>10.50</price>
    </book>

    ... along with other combinations.

    XSD 1.0 does not seem to allow me to have this.

    Any suggestions are greatly appreciated.

    F





    Frank Greco, Jan 31, 2011
    #4
  5. On 31/01/2011 23:41, Frank Greco wrote:
    > XSD 1.0 does not seem to allow me to have this.
    >
    > Any suggestions are greatly appreciated.


    Schematron?
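
    A rough sketch of what that could look like, assuming the element
    names from the example above and a limit of three references (the
    limit is an assumption, not something stated in the question). The
    grammar-based schema would stay permissive about order and count, and
    these rules would add the counting:

    ```xml
    <schema xmlns="http://purl.oclc.org/dsdl/schematron">
      <pattern>
        <!-- Each assertion is checked for every book element,
             regardless of the order of its children. -->
        <rule context="book">
          <assert test="count(title) = 1">
            A book must have exactly one title.
          </assert>
          <assert test="count(author) = 1">
            A book must have exactly one author.
          </assert>
          <assert test="count(price) = 1">
            A book must have exactly one price.
          </assert>
          <assert test="count(references) &lt;= 3">
            A book may have at most three references.
          </assert>
        </rule>
      </pattern>
    </schema>
    ```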

    --
    Manuel Collado - http://lml.ls.fi.upm.es/~mcollado
    Manuel Collado, Jan 31, 2011
    #5
  6. Frank Greco

    Frank Greco Guest

    Hi Martin,

    Thanks for the reply. That's very useful info. I'll need to test if
    the Xerces2 beta has this feature. Thanks!

    F


    Frank Greco, Jan 31, 2011
    #6
  7. Frank Greco wrote:

    > XSD 1.0 does not seem to allow me to have this.


    I tried the Xerces Java 2.11 beta with -xsd11 schema support and it runs
    fine with a schema like

    <xs:schema
      xmlns:xs="http://www.w3.org/2001/XMLSchema">

      <xs:element name="books">
        <xs:complexType>
          <xs:sequence maxOccurs="unbounded">
            <xs:element name="book">
              <xs:complexType>
                <xs:all>
                  <xs:element name="references" type="xs:string" maxOccurs="3"/>
                  <xs:element name="title" type="xs:string"/>
                  <xs:element name="author" type="xs:string"/>
                  <xs:element name="price" type="xs:double"/>
                </xs:all>
              </xs:complexType>
            </xs:element>
          </xs:sequence>
        </xs:complexType>
      </xs:element>

    </xs:schema>

    and it validates the first two book elements in the following sample fine:

    <books>
      <book>
        <references>This is ref #1</references>
        <references>This is ref #2</references>
        <references>This is ref #3</references>

        <title>Book Title One</title>
        <author>Joe Blog</author>
        <price>10.50</price>
      </book>
      <book>
        <title>Book Title One</title>
        <author>Joe Blog</author>

        <references>This is ref #1</references>
        <references>This is ref #2</references>
        <references>This is ref #3</references>

        <price>10.50</price>
      </book>
      <book>
        <title>Book Title One</title>
        <author>Joe Blog</author>

        <references>This is ref #1</references>
        <references>This is ref #2</references>
        <references>This is ref #3</references>
        <references>This is ref #4</references>
        <price>10.50</price>
      </book>
    </books>

    For the last one it outputs an error

    [Error] test2011020103.xml:28:16: cvc-complex-type.2.4.a: Invalid
    content was found starting with element 'references'. One of '{price}'
    is expected.


    --

    Martin Honnen
    http://msmvps.com/blogs/martin_honnen/
    Martin Honnen, Feb 1, 2011
    #7
  8. Peter Flynn

    Peter Flynn Guest

    On 31/01/11 22:41, Frank Greco wrote:
    > Thanks for the reply Joe.
    >
    > Essentially I need to have an unordered list of complex types with a
    > potential for N entries of one of the types, for example I need
    > something like this:
    >
    > <book>
    > <references>This is ref #1</references>
    > <references>This is ref #2</references>
    > <references>This is ref #3</references>
    >
    > <title>Book Title One</title>
    > <author>Joe Blog</author>
    > <price>10.50</price>
    > </book>


    I assume that only references may reoccur, and that title, author, and
    price are constrained to one occurrence each.

    Consider SGML instead, which permits it:

    <!doctype book [
    <!element book - - (references*&title&author&price)>
    <!element references - - (#pcdata)>
    <!element title - - (#pcdata)>
    <!element author - - (#pcdata)>
    <!element price - - (#pcdata)>
    ]>
    <book>
      <references>This is ref #1</references>
      <references>This is ref #2</references>
      <references>This is ref #3</references>
      <title>Book Title One</title>
      <author>Joe Blog</author>
      <price>10.50</price>
    </book>

    But as Joe suggests, designing it differently may be more appropriate.
    If your references are to ID-like tokens, and your price only occurs
    once, then perhaps:

    <book price="10.50" references="abc123 foobar splat">
    <title>Book Title One</title>
    <author>Joe Blog</author>
    </book>

    Without knowing your business processes, it's hard to judge, but in
    general, a good rule is to keep PCDATA for text, and use attributes for
    numeric or categorical data. This also avoids bloating the document with
    unnecessary (generatable) text.
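
    With that attribute-based design, a plain XSD 1.0 all-group would
    probably be enough, because each remaining child occurs exactly once.
    A sketch, assuming the references are simple name tokens
    (xs:NMTOKENS) and the price is a decimal; xs:IDREFS would be the
    alternative if the tokens must point at IDs in the same document:

    ```xml
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:element name="book">
        <xs:complexType>
          <!-- title and author in either order, exactly once each. -->
          <xs:all>
            <xs:element name="title" type="xs:string"/>
            <xs:element name="author" type="xs:string"/>
          </xs:all>
          <!-- Attributes are unordered by definition. -->
          <xs:attribute name="price" type="xs:decimal" use="required"/>
          <!-- Space-separated list of tokens, e.g. "abc123 foobar splat". -->
          <xs:attribute name="references" type="xs:NMTOKENS"/>
        </xs:complexType>
      </xs:element>
    </xs:schema>
    ```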

    BTW most people I know using RNG use it to maintain a master schema,
    from which they can then generate W3C Schemas, DTDs, etc when needed.
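
    For the original element-based layout, a RelaxNG master schema could
    look roughly like the sketch below. It uses the interleave pattern,
    which allows the children in any order, with references repeating.
    The decimal datatype for price is an assumption.

    ```xml
    <element name="book"
             xmlns="http://relaxng.org/ns/structure/1.0"
             datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
      <!-- interleave: the patterns below may be mixed in any order. -->
      <interleave>
        <!-- references may repeat and need not be adjacent. -->
        <zeroOrMore>
          <element name="references"><text/></element>
        </zeroOrMore>
        <element name="title"><text/></element>
        <element name="author"><text/></element>
        <element name="price"><data type="decimal"/></element>
      </interleave>
    </element>
    ```

    RelaxNG has no numeric occurrence bounds, so a cap such as "at most
    three references" would need three optional elements in place of
    zeroOrMore, or a check elsewhere.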

    ///Peter
    --
    XML FAQ: http://xml.silmaril.ie/
    Peter Flynn, Feb 3, 2011
    #8