XSD vs RelaxNG

Frank Greco

I'm sure more people use XSD than RelaxNG, but I'm curious whether it's
worth investigating RelaxNG as an alternative.
I see there is an O'Reilly book, which suggests a decent audience (in
theory, I guess). But how many people are actually using RelaxNG?

The bottom line is that I'm hitting constraints on XSD sequencing. It's
either a forced order of elements with each element potentially
occurring 'n' times, or no ordering with each element occurring only 0
or 1 times. I need no ordering with each element potentially occurring
n times, which is not allowed for some reason.

Thanks

F
 
Joe Kesselman

Frank said:
I need no-ordering with each element potentially occurring n times, which
is not allowed for some reason.

Historically, it was very difficult to generate a finite state machine
which would efficiently validate that sort of constraint in the general
case, so parsers of all kinds (not just XML validation) tended to be
written to forbid it. There are newer techniques which will handle it,
but since in practice there is almost never a NEED for that kind of
constraint the folks defining languages don't like writing such a
requirement into their specifications. Also, standards groups worry
about how well the data will interchange with other systems (e.g.,
databases) which don't support that kind of constraint.

Remember, schema or DTD is only the first level of validation --
higher-order syntax checking, if you will -- and you will often
want/need to impose additional checking at the application level. If you
need this constraint, consider leaving the count unconstrained at the
schema level and imposing limits in your application code.
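
A minimal sketch of that approach in XSD 1.0, assuming the book element
names used elsewhere in this thread. A repeating xs:choice accepts the
children in any order and in any number, so the "exactly one title,
author, and price" checks would have to live in application code:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="book">
    <xs:complexType>
      <!-- Any of the four children, in any order, any number of times.
           The schema no longer limits how often each one occurs. -->
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element name="references" type="xs:string"/>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="author" type="xs:string"/>
        <xs:element name="price" type="xs:double"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

The trade-off is that a book with two titles, or with no price at all,
also passes schema validation.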

There's always the Horribly Ugly solution of having the schema
explicitly spell out all the possible combinations/orderings of the
children.
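
To give a feel for how that grows, here is a sketch in XSD 1.0 that is
deliberately cut down to two single-occurrence children (title and
author, names borrowed from the book example in this thread) plus a
repeatable references element. The leading references particle is
factored out of the choice so that the content model stays
deterministic:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="references" type="xs:string"
                    minOccurs="0" maxOccurs="unbounded"/>
        <xs:choice>
          <!-- Ordering 1: title before author -->
          <xs:sequence>
            <xs:element name="title" type="xs:string"/>
            <xs:element name="references" type="xs:string"
                        minOccurs="0" maxOccurs="unbounded"/>
            <xs:element name="author" type="xs:string"/>
          </xs:sequence>
          <!-- Ordering 2: author before title -->
          <xs:sequence>
            <xs:element name="author" type="xs:string"/>
            <xs:element name="references" type="xs:string"
                        minOccurs="0" maxOccurs="unbounded"/>
            <xs:element name="title" type="xs:string"/>
          </xs:sequence>
        </xs:choice>
        <xs:element name="references" type="xs:string"
                    minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Adding a third single-occurrence child such as price takes this from 2
orderings to 6, and a fourth takes it to 24, which is why it earns the
name.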

But the best answer is to reconsider your document design -- ask whether
there's another way to represent your data which doesn't require this
combination of unordered and frequency-limited. In 99.5% of the cases
I've seen, that requirement evaporates when you stop to think about how
the documents will actually be produced and used and whether some of the
entries should be grouped together under parent elements rather than all
being siblings.
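
As one possible illustration of that kind of regrouping, assuming the
book element names used elsewhere in this thread: if the repeatable
entries are wrapped in a parent element (here called referenceList, a
hypothetical name), every child of book occurs at most once and a plain
XSD 1.0 xs:all is enough:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="book">
    <xs:complexType>
      <!-- Each child occurs at most once, in any order -->
      <xs:all>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="author" type="xs:string"/>
        <xs:element name="price" type="xs:double"/>
        <!-- Hypothetical wrapper that holds the repeatable entries -->
        <xs:element name="referenceList" minOccurs="0">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="references" type="xs:string"
                          maxOccurs="unbounded"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:all>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

The cost is that the references can no longer be scattered between the
other children; they all sit together inside the wrapper.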


As far as RelaxNG goes: What I heard about it early on was that it was a
lot more straightforward for many of the common cases than XML
Schemas... but I don't know what its status is or how heavily it has
been stress-tested since then. I do know that the single biggest issue
remains portability -- RelaxNG may be a fine choice if you have complete
control over both the document's generator and consumer, but it may not
help you much when working with other folks who don't have a
RelaxNG-enabled system. It's reportedly possible to generate an XML
Schema from a RelaxNG document spec... but that involves giving up any
constraints that Schema doesn't easily support.
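
For comparison, a sketch of how an unordered, repeatable content model
can be written in RELAX NG's XML syntax using its interleave pattern.
The element names are taken from the book example in this thread, and
typing price as double is an assumption made to mirror the XSD examples:

```xml
<element name="book"
         xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <!-- interleave lets the children appear in any order, and the
       zeroOrMore pattern lets references repeat anywhere among them -->
  <interleave>
    <zeroOrMore>
      <element name="references"><text/></element>
    </zeroOrMore>
    <element name="title"><text/></element>
    <element name="author"><text/></element>
    <element name="price"><data type="double"/></element>
  </interleave>
</element>
```

This is exactly the sort of constraint that may be lost or loosened when
converting to an XML Schema 1.0 document.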


--
Joe Kesselman,
http://www.love-song-productions.com/people/keshlam/index.html

{} ASCII Ribbon Campaign | "may'ron DaroQbe'chugh vaj bIrIQbej" --
/\ Stamp out HTML mail! | "Put down the squeezebox & nobody gets hurt."
 
Martin Honnen

Frank said:
I'm sure more people use XSD than RelaxNG. But I'm curious if it's worth
investigating RelaxNG as an alternative.
I see there is an O'Reilly book which indicates a decent audience (in
theory I guess). But how many are actually using RelaxNG?

The bottom line is I'm hitting constraints on XSD sequencing. It's either
a forced order of elements with each element potentially occurring 'n'
times, or no-ordering with each element only occurring 0 or 1 times. I
need no-ordering with each element potentially occurring n times, which
is not allowed for some reason.

Note that work is under way to specify and implement (in Apache Xerces
and in Saxon so far, I think) version 1.1 of the W3C XML Schema language.
As far as I understand http://www.w3.org/TR/xmlschema11-1/#ch_models:
--- quote ---------------------------
Several of the constraints imposed by version 1.0 of this specification
on all-groups have been relaxed:

....

2.
The value of maxOccurs may now be greater than 1 on particles in
an all group. The elements which match a particular particle need not be
adjacent in the input.
--- quote ---------------------------

you might be able to achieve what you want with version 1.1. But I don't
have time to test that now.
 
Frank Greco

Thanks for the reply Joe.

Essentially I need to have an unordered list of complex types with a
potential for N entries of one of the types, for example I need
something like this:

<book>
<references>This if ref #1</references>
<references>This if ref #2</references>
<references>This if ref #3</references>

<title>Book Title One</title>
<author>Joe Blog</author>
<price>10.50</price>
</book>

I want the user to be allowed to enter any of the complex types without
any ordering. That is, the following should be legal:

<book>
<title>Book Title One</title>
<author>Joe Blog</author>

<references>This if ref #1</references>
<references>This if ref #2</references>
<references>This if ref #3</references>

<price>10.50</price>
</book>

.... along with other combinations.

XSD 1.0 does not seem to allow me to have this.

Any suggestions are greatly appreciated.

F
 
Manuel Collado

On 31/01/2011 23:41, Frank Greco wrote:
Thanks for the reply Joe.

Essentially I need to have an unordered list of complex types with a
potential for N entries of one of the types, for example I need
something like this:

<book>
<references>This if ref #1</references>
<references>This if ref #2</references>
<references>This if ref #3</references>

<title>Book Title One</title>
<author>Joe Blog</author>
<price>10.50</price>
</book>

I want the user to be allowed to enter any of the complex types without
any ordering. That is, the following should be legal:

<book>
<title>Book Title One</title>
<author>Joe Blog</author>

<references>This if ref #1</references>
<references>This if ref #2</references>
<references>This if ref #3</references>

<price>10.50</price>
</book>

... along with other combinations.

XSD 1.0 does not seem to allow me to have this.

Any suggestions are greatly appreciated.

Schematron?
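
A rough sketch of what such rules might look like in ISO Schematron,
assuming the book element names from the example above. The upper limit
of 3 on references is only an illustrative value, not something stated
in the question. Rules like these check counts and say nothing about
order, so they are normally paired with a permissive structural schema:

```xml
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <pattern>
    <rule context="book">
      <!-- Single-occurrence children, wherever they appear -->
      <assert test="count(title) = 1">A book must have exactly one title.</assert>
      <assert test="count(author) = 1">A book must have exactly one author.</assert>
      <assert test="count(price) = 1">A book must have exactly one price.</assert>
      <!-- Illustrative upper bound on the repeatable child -->
      <assert test="count(references) &lt;= 3">A book may have at most 3 references.</assert>
    </rule>
  </pattern>
</schema>
```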
 
Frank Greco

Hi Martin,

Thanks for the reply. That's very useful info. I'll need to test if
the Xerces2 beta has this feature. Thanks!

F
 
Martin Honnen

Frank said:
Essentially I need to have an unordered list of complex types with a
potential for N entries of one of the types, for example I need
something like this:

<book>
<references>This if ref #1</references>
<references>This if ref #2</references>
<references>This if ref #3</references>

<title>Book Title One</title>
<author>Joe Blog</author>
<price>10.50</price>
</book>

I want the user to be allowed to enter any of the complex types without
any ordering. That is, the following should be legal:

<book>
<title>Book Title One</title>
<author>Joe Blog</author>

<references>This if ref #1</references>
<references>This if ref #2</references>
<references>This if ref #3</references>

<price>10.50</price>
</book>

... along with other combinations.

XSD 1.0 does not seem to allow me to have this.

I tried the Xerces Java 2.11 beta with -xsd11 schema support and it runs
fine with a schema like

<xs:schema
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="books">
    <xs:complexType>
      <xs:sequence maxOccurs="unbounded">
        <xs:element name="book">
          <xs:complexType>
            <xs:all>
              <xs:element name="references" type="xs:string" maxOccurs="3"/>
              <xs:element name="title" type="xs:string"/>
              <xs:element name="author" type="xs:string"/>
              <xs:element name="price" type="xs:double"/>
            </xs:all>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>

and it validates the first two book elements in the following sample fine:

<books>
<book>
<references>This if ref #1</references>
<references>This if ref #2</references>
<references>This if ref #3</references>

<title>Book Title One</title>
<author>Joe Blog</author>
<price>10.50</price>
</book>
<book>
<title>Book Title One</title>
<author>Joe Blog</author>

<references>This if ref #1</references>
<references>This if ref #2</references>
<references>This if ref #3</references>

<price>10.50</price>
</book>
<book>
<title>Book Title One</title>
<author>Joe Blog</author>

<references>This if ref #1</references>
<references>This if ref #2</references>
<references>This if ref #3</references>
<references>This if ref #4</references>
<price>10.50</price>
</book>
</books>

For the last one it outputs an error

[Error] test2011020103.xml:28:16: cvc-complex-type.2.4.a: Invalid
content was found starting with element 'references'. One of '{price}'
is expected.
 
Peter Flynn

Frank said:
Thanks for the reply Joe.

Essentially I need to have an unordered list of complex types with a
potential for N entries of one of the types, for example I need
something like this:

<book>
<references>This if ref #1</references>
<references>This if ref #2</references>
<references>This if ref #3</references>

<title>Book Title One</title>
<author>Joe Blog</author>
<price>10.50</price>
</book>

I assume that only references may reoccur, and that title, author, and
price are constrained to one occurrence each.

Consider SGML instead, which permits it:

<!doctype book [
<!element book - - (references*&title&author&price)>
<!element references - - (#pcdata)>
<!element title - - (#pcdata)>
<!element author - - (#pcdata)>
<!element price - - (#pcdata)>
]>
<book>
<references>This if ref #1</references>
<references>This if ref #2</references>
<references>This if ref #3</references>
<title>Book Title One</title>
<author>Joe Blog</author>
<price>10.50</price>
</book>

But as Joe suggests, designing it differently may be more appropriate.
If your references are to ID-like tokens, and your price only occurs
once, then perhaps:

<book price="10.50" references="abc123 foobar splat">
<title>Book Title One</title>
<author>Joe Blog</author>
</book>

Without knowing your business processes, it's hard to judge, but in
general, a good rule is to keep PCDATA for text, and use attributes for
numeric or categorical data. This also avoids bloating the document with
unnecessary (generatable) text.
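
A sketch of a schema for that attribute-based layout, which fits within
XSD 1.0 because title and author each occur exactly once. Typing
references as xs:NMTOKENS and price as xs:decimal are assumptions; if
the tokens must match IDs declared elsewhere in the document,
xs:IDREFS would be the closer fit:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="book">
    <xs:complexType>
      <!-- title and author once each, in either order -->
      <xs:all>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="author" type="xs:string"/>
      </xs:all>
      <xs:attribute name="price" type="xs:decimal" use="required"/>
      <!-- Space-separated list of tokens, e.g. "abc123 foobar splat" -->
      <xs:attribute name="references" type="xs:NMTOKENS"/>
    </xs:complexType>
  </xs:element>

</xs:schema>
```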

BTW most people I know using RNG use it to maintain a master schema,
from which they can then generate W3C Schemas, DTDs, etc when needed.

///Peter
 
