XML order does not always match DTD

compaqr4000

Hi
I've just started to maintain someone else's project which uses SAX to
parse XML documents. According to the DTD all of the XML elements must
be in a specific section. However, 'sometimes' they are not ordered
properly and the XML document does not get parsed. The java SAX parser
spits out this message:
Element "COMPANY" does not allow "CompanyEvent" here.

I need to change the parser, or the DTD in order to be able to allow
parsing.

Here's the part of the DTD showing Company elements:
<!ELEMENT COMPANY (Name, Region?, Employees?, State?, City?, Misc*,
CompanyEvent?, Ranking?)>

I'm fairly new to XML, so any thoughts, suggestions are welcome.

Thanks!
Ry
 
Martin Honnen

compaqr4000 said:
I've just started to maintain someone else's project which uses SAX to
parse XML documents. According to the DTD all of the XML elements must
be in a specific section. However, 'sometimes' they are not ordered
properly and the XML document does not get parsed. The java SAX parser
spits out this message:
Element "COMPANY" does not allow "CompanyEvent" here.

I need to change the parser, or the DTD in order to be able to allow
parsing.

Here's the part of the DTD showing Company elements:
<!ELEMENT COMPANY (Name, Region?, Employees?, State?, City?, Misc*,
CompanyEvent?, Ranking?)>

How does the XML look that currently gives an error and for which you
want to adapt the DTD?
 
Joseph Kesselman

If you're willing to move the only-once check into your application
code, this is easy to solve (just allow an unconstrained mix of the
children). If you aren't, this gets ugly and tedious; the DTD will not
be able to enforce only-once unless you spell out the acceptable orderings.
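
A minimal sketch of what that application-side check might look like in a SAX handler, assuming the DTD has been relaxed so that it no longer limits the counts itself. The class name OnceOnlyCheck and the method check are made up for illustration, and the sketch assumes COMPANY elements do not nest inside one another.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: reports a COMPANY that has more than one
// CompanyEvent or more than one Ranking as a direct child.
class OnceOnlyCheck extends DefaultHandler {
    private final List<String> problems = new ArrayList<>();
    private Map<String, Integer> counts = new HashMap<>();
    private boolean inCompany = false;
    private int depth = 0; // nesting depth below the current COMPANY

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        if (!inCompany) {
            if (qName.equals("COMPANY")) {
                inCompany = true;
                depth = 0;
                counts = new HashMap<>(); // fresh counts for each COMPANY
            }
            return;
        }
        depth++;
        // Only direct children of COMPANY are counted.
        if (depth == 1 && (qName.equals("CompanyEvent") || qName.equals("Ranking"))) {
            int n = counts.merge(qName, 1, Integer::sum);
            if (n == 2) {
                problems.add("COMPANY has more than one " + qName);
            }
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if (!inCompany) {
            return;
        }
        if (depth == 0) {
            inCompany = false; // this is the closing COMPANY tag
        } else {
            depth--;
        }
    }

    // Parses the given XML text and returns the problems found (empty if none).
    static List<String> check(String xml) throws Exception {
        OnceOnlyCheck handler = new OnceOnlyCheck();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.problems;
    }
}
```

In an existing project the counting logic would more likely be folded into the handler that is already there, rather than run as a separate pass.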

The other solution is to hit your users over the head with a printout of
the DTD and say "The order is specified. Stop being lazy; build the
document in this order."

Or you could run the document through a preprocessing stage (a
stylesheet, for example) to standardize its order before validating.
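
For the preprocessing route, here is a rough Java sketch that uses DOM rather than a stylesheet: it parses without validation, sorts the element children of each COMPANY into the order the DTD lists them, and serializes the result. CompanyReorder, normalize and ORDER are illustrative names only. The sketch assumes the input can be parsed without fetching a DTD, and it does not carry a DOCTYPE over to the output.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class CompanyReorder {
    // Child order taken from the COMPANY content model quoted in the thread.
    static final List<String> ORDER = Arrays.asList(
            "Name", "Region", "Employees", "State", "City",
            "Misc", "CompanyEvent", "Ranking");

    static String normalize(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList companies = doc.getElementsByTagName("COMPANY");
        for (int i = 0; i < companies.getLength(); i++) {
            reorder((Element) companies.item(i));
        }
        // An identity transform serializes the DOM back to text.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    static void reorder(Element company) {
        List<Element> children = new ArrayList<>();
        for (Node n = company.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                children.add((Element) n);
            }
        }
        // List.sort is stable, so repeated Misc elements keep their relative order.
        children.sort(Comparator.comparingInt(e -> rank(e.getTagName())));
        for (Element e : children) {
            company.appendChild(e); // appendChild moves an existing child to the end
        }
    }

    // Unknown element names sort after all known ones.
    static int rank(String name) {
        int i = ORDER.indexOf(name);
        return i < 0 ? ORDER.size() : i;
    }
}
```

The reordered text could then be handed to the validating parser in place of the original document.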
 
compaqr4000

Martin Honnen said:
How does the XML look that currently gives an error and for which you
want to adapt the DTD?

The initial ordering stays the same, however sometimes 'Ranking' data
comes before 'CompanyEvent'.


Ry
 
compaqr4000

Joseph Kesselman said:
If you're willing to move the only-once check into your application
code, this is easy to solve (just allow an unconstrained mix of the
children).

At this point, I want to do this in order to re-process my data. I
think that a lot of data is missing because of this. How would I
change my DTD to accomplish this?

Joseph Kesselman said:
If you aren't, this gets ugly and tedious; the DTD will not
be able to enforce only-once unless you spell out the acceptable orderings.

The other solution is to hit your users over the head with a printout of
the DTD and say "The order is specified. Stop being lazy; build the
document in this order."

Now, that would be fun, but unfortunately isn't an option for me in
this case.
 
compaqr4000

Okay

I tried to make the ordering of the last two elements unconstrained:
<!ELEMENT COMPANY (Name, Region?, Employees?, State?, City?, Misc*,
(CompanyEvent? | Ranking?))>

but I still get the same error.

Any ideas?


Ry
 
Joseph Kesselman

compaqr4000 said:
Okay

I tried to make the ordering of the last two elements unconstrained:
<!ELEMENT COMPANY (Name, Region?, Employees?, State?, City?, Misc*,
(CompanyEvent? | Ranking?))>

That doesn't make the order unconstrained; it permits only one or the other.

(CompanyEvent|Ranking)* should give you unconstrained order, but doesn't
let you constrain the number of instances of either.
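
To see the difference between the content models, a small Java sketch along these lines can validate the same document against each one. It uses an internal DTD subset trimmed down to Name plus the two trailing elements (the other COMPANY children are left out for brevity), and the names ContentModelDemo and validate are made up for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class ContentModelDemo {
    // Validates <COMPANY><Name>Acme</Name>BODY</COMPANY> against a DTD whose
    // COMPANY model is (Name, TAILMODEL). Returns the validity errors reported.
    static List<String> validate(String tailModel, String body) throws Exception {
        String xml =
                "<!DOCTYPE COMPANY [\n"
                + "<!ELEMENT COMPANY (Name, " + tailModel + ")>\n"
                + "<!ELEMENT Name (#PCDATA)>\n"
                + "<!ELEMENT CompanyEvent (#PCDATA)>\n"
                + "<!ELEMENT Ranking (#PCDATA)>\n"
                + "]>\n"
                + "<COMPANY><Name>Acme</Name>" + body + "</COMPANY>";

        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true); // turn on DTD validation

        List<String> errors = new ArrayList<>();
        factory.newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    // Validity errors arrive here; collect them instead of ignoring them.
                    @Override
                    public void error(SAXParseException e) {
                        errors.add(e.getMessage());
                    }
                });
        return errors;
    }

    public static void main(String[] args) throws Exception {
        String swapped = "<Ranking>1</Ranking><CompanyEvent>x</CompanyEvent>";
        System.out.println(validate("CompanyEvent?, Ranking?", swapped));
        System.out.println(validate("(CompanyEvent? | Ranking?)", swapped));
        System.out.println(validate("(CompanyEvent | Ranking)*", swapped));
    }
}
```

With Ranking ahead of CompanyEvent, the first two models should each report a validity error and the starred choice should report none. The starred choice also accepts two Ranking elements in a row, which is the trade-off described above.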
 
compaqr4000

Joseph Kesselman said:
(CompanyEvent|Ranking)* should give you unconstrained order, but doesn't
let you constrain the number of instances of either.

That's the syntax I was looking for. It works!

Thanks.

Ry
 
A. Bolmarcich

compaqr4000 said:
Okay

I tried to make the ordering of the last two elements unconstrained:
<!ELEMENT COMPANY (Name, Region?, Employees?, State?, City?, Misc*,
(CompanyEvent? | Ranking?))>

but I still get the same error.

Any ideas?

You can use

((CompanyEvent, Ranking?) | (Ranking, CompanyEvent?))?

Element content models for more than two elements in any order are
unwieldy.
 
Joseph Kesselman

A. Bolmarcich said:
((CompanyEvent, Ranking?) | (Ranking, CompanyEvent?))?

Yep, that gets the only-one-in-either-order result. "Unwieldy" is an
understatement for more complicated cases. I *have* seen someone write a
program to try to automatically generate a DTD with all the permutations
for an arbitrary list... but frankly, that strikes me as better solved
in some other level of semantic testing.
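
For the curious, a generator of that kind can be sketched in a few lines of Java. This is only an illustration of the idea, not the program mentioned above: AnyOrderModel and optionalAnyOrder are invented names, and the input names are assumed to be distinct. Each branch starts with a different element, which keeps the generated model deterministic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AnyOrderModel {
    // Builds a DTD content model that allows each of the given element names
    // at most once, in any order.
    static String optionalAnyOrder(List<String> names) {
        if (names.isEmpty()) {
            return "";
        }
        if (names.size() == 1) {
            return names.get(0) + "?";
        }
        List<String> branches = new ArrayList<>();
        for (String first : names) {
            // Pick one name to come first, then recurse on the remaining names.
            List<String> rest = new ArrayList<>(names);
            rest.remove(first);
            branches.add("(" + first + ", " + optionalAnyOrder(rest) + ")");
        }
        return "(" + String.join(" | ", branches) + ")?";
    }

    public static void main(String[] args) {
        System.out.println(optionalAnyOrder(Arrays.asList("CompanyEvent", "Ranking")));
        System.out.println(optionalAnyOrder(Arrays.asList("State", "City", "Ranking")));
    }
}
```

For two names this reproduces the model quoted above. The output grows factorially with the number of names, which is why the check is usually better done elsewhere.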
 
Peter Flynn

compaqr4000 said:
Hi
I've just started to maintain someone else's project which uses SAX to
parse XML documents. According to the DTD all of the XML elements must
be in a specific section. However, 'sometimes' they are not ordered
properly and the XML document does not get parsed. The java SAX parser
spits out this message:
Element "COMPANY" does not allow "CompanyEvent" here.

I need to change the parser, or the DTD in order to be able to allow
parsing.

More likely you need to edit the documents so that they conform to the
DTD. Or get whoever creates them to do it right.

Only change the DTD if it's actually specifying a faulty restriction.

Certainly don't modify the parser, or it may end up not being an XML parser.

compaqr4000 said:
Here's the part of the DTD showing Company elements:
<!ELEMENT COMPANY (Name, Region?, Employees?, State?, City?, Misc*,
CompanyEvent?, Ranking?)>

You'd need to show us an example document as well...

///Peter
 
