[XML Schema] Identically named elements of different type

Stanimir Stamenkov

Using Xerces2 Java, I'm trying to validate CSV data, following an
XNI example
<http://xml.apache.org/xerces2-j/xni-config.html#examples>. The CSV
scanner generates XML tree events like:

<csv>
  <row>
    <col>Andy Clark</col>
    <col>16 Jan 1973</col>
    <col>Cincinnati</col>
  </row>
</csv>

In the "XML Schema Part 1: Structures" specification I find the
following paragraph:

  When two or more particles contained directly or indirectly in the
  {particles} of a model group have identically named element
  declarations as their {term}, the type definitions of those
  declarations must be the same. ...

In order to validate the XML structure generated from the CSV data I
would need an XML Schema like:

<!-- top-level declaration -->
<xs:element name="csv">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="row" maxOccurs="unbounded">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="col" type="type1" />
            <xs:element name="col" type="type2" />
            <xs:element name="col" type="type3" />
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:element>

where 'type1', 'type2' and 'type3' are arbitrary, different simple
types. According to the spec the above is not permissible, but the
Xerces2 implementation doesn't seem to mind it. I think it should
be permissible at least in such straightforward cases, where the
position of the element in the tree is enough to resolve its
type definition.

Is anything being developed on this issue for the next version of the
XML Schema spec, and is it reasonable for the Xerces2 implementation
to deviate from the current specification in this way?
 
Stan Kitsis [MSFT]

Hi Stanimir,

The spec is right - elements with the same name and in the same scope must
have the same type. As such, you cannot have three <col> elements of
different types within the same scope. However, if all three <col> elements
contain strings, the following schema will validate it:

<xs:element name="csv">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="row">
        <xs:complexType>
          <xs:sequence>
            <xs:element maxOccurs="unbounded" name="col" type="xs:string" />
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:element>

--
Stan Kitsis
Program Manager, XML Technologies
Microsoft Corporation

This posting is provided "AS IS" with no warranties, and confers no rights.
Use of included script samples are subject to the terms specified at
http://www.microsoft.com/info/cpyright.htm
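
To check the all-strings schema above against the sample from the first post, a minimal sketch using the JDK's built-in javax.xml.validation API could look like the following. The class name CsvStringSchemaCheck and the helper isValid are made up for illustration, and the xs:schema wrapper element is an assumption added here, since the post shows only the top-level element declaration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class CsvStringSchemaCheck {
    // The schema from the reply, wrapped in an xs:schema element (assumed wrapper).
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='csv'><xs:complexType><xs:sequence>"
        + "<xs:element name='row'><xs:complexType><xs:sequence>"
        + "<xs:element maxOccurs='unbounded' name='col' type='xs:string'/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    // The sample document from the first post.
    static final String XML =
        "<csv><row><col>Andy Clark</col><col>16 Jan 1973</col>"
        + "<col>Cincinnati</col></row></csv>";

    // Returns true if the schema compiles and the document validates against it.
    static boolean isValid(String xsd, String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            // With no ErrorHandler set, validation errors surface as SAXException.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(XSD, XML));
    }
}
```

Because "row" carries no maxOccurs in this schema, a document with two rows is reported as invalid; adding maxOccurs="unbounded" to "row", as in the original question, would lift that limit.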
 
Stanimir Stamenkov

Hi, Stan. Thank you for your response.

/Stan Kitsis [MSFT]/:
The spec is right - elements with the same name and in the same scope must
have the same type. As such, you cannot have three <col> elements of
different types within the same scope. However, if all three <col> elements
contain strings, the following schema will validate it:

I haven't questioned the correctness of the current specification as
much as I've wanted to get opinions on a possible future spec
development regarding that concrete issue and if it has already been
considered (for version 1.1, for example).

So far I haven't found any show-stopping obstacles to having
elements of different types with the same name within the same scope.
There would probably be cases where, depending on how the content
model is composed, it isn't possible to determine which declaration
an element should be validated against, but I can't think of any
at the moment.

So my question could be stated as: What's the rationale behind this
current limitation and do you think it would be feasible to change
it to permit content type resolution according to the position of
the element in the content model, too?
 
Stanimir Stamenkov

/Stanimir Stamenkov/:
So my question could be stated as: What's the rationale behind this
current limitation and do you think it would be feasible to change it to
permit content type resolution according to the position of the element
in the content model, too?

Probably "according to the position of the element in the content
model" is not the best way to put it, given the following example:

<xs:choice>
  <xs:element name="data" type="xs:token" />
  <xs:element name="data" minOccurs="0" type="xs:date" />
  <xs:element name="data" maxOccurs="unbounded" type="xs:integer" />
</xs:choice>

The above would mean the content model could contain a single "data"
element of type 'token', could be empty, could contain a single
"data" element of type 'date', or could contain one or more "data"
elements of type 'integer'.

Maybe the above could be represented with some kind of union, though
at first glance the different occurrence constraints on each element
would make that impossible.
 
Henry S. Thompson

Stanimir Stamenkov said:
I haven't questioned the correctness of the current specification as
much as I've wanted to get opinions on a possible future spec
development regarding that concrete issue and if it has already been
considered (for version 1.1, for example).

The basic argument is that you should be able to tell the type of an
element if you know its ancestry. Or, put another way, the type of
any element addressable with a simple XPath of the form /x/y/z should
be unique.

This is admittedly a somewhat arbitrary choice, but feels to be pretty
close to the 80/20 point to me.

ht
--
Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
Half-time member of W3C Team
2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
Fax: (44) 131 650-4587, e-mail: (e-mail address removed)
URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]
 
Stanimir Stamenkov

/Henry S. Thompson/:
The basic argument is that you should be able to tell the type of an
element if you know its ancestry. Or, put another way, the type of
any element addressable with a simple XPath of the form /x/y/z should
be unique.

But then doesn't "/x/y/z" select a node-set? While "/x/y/z[1]" would
select a single node.
 
Henry S. Thompson

Stanimir Stamenkov said:
/Henry S. Thompson/:
The basic argument is that you should be able to tell the type of an
element if you know its ancestry. Or, put another way, the type of
any element addressable with a simple XPath of the form /x/y/z
should be unique.

But then doesn't "/x/y/z" select a node-set? While "/x/y/z[1]" would
select a single node.

Indeed, perhaps a better way to say it would have been:

The type of every element in a node-set addressable with a simple
XPath of the form /x/y/z should be the same.

ht
--
Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
Half-time member of W3C Team
2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
Fax: (44) 131 650-4587, e-mail: (e-mail address removed)
URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]
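
One way to picture this rule is to compute the ancestry path of every element in the sample document from the first post. The sketch below is only an illustration (the class AncestryPaths and its methods are hypothetical names, and it uses nothing beyond the JDK's DOM parser). It shows that all three "col" elements end up with the same path, /csv/row/col, so a lookup keyed on ancestry alone has a single type to offer them.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class AncestryPaths {
    // Returns the simple /x/y/z style path of every element, in document order.
    static List<String> paths(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            List<String> out = new ArrayList<>();
            collect(doc.getDocumentElement(), "", out);
            return out;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    // Depth-first walk that appends each element name to its parent's path.
    private static void collect(Element element, String parentPath, List<String> out) {
        String path = parentPath + "/" + element.getNodeName();
        out.add(path);
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child instanceof Element) {
                collect((Element) child, path, out);
            }
        }
    }

    public static void main(String[] args) {
        String xml = "<csv><row><col>Andy Clark</col><col>16 Jan 1973</col>"
            + "<col>Cincinnati</col></row></csv>";
        for (String path : paths(xml)) {
            System.out.println(path);
        }
    }
}
```

The three columns differ only in position among their siblings, which is exactly the information a path of the form /x/y/z leaves out.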
 
Stanimir Stamenkov

/Henry S. Thompson/:
The basic argument is that you should be able to tell the type of an
element if you know its ancestry. Or, put another way, the type of
any element addressable with a simple XPath of the form /x/y/z should
be unique.

So I guess it has something to do with the "Deterministic Content
Models" <http://www.w3.org/TR/REC-xml/#determinism> described in the
XML specification? I haven't found such restrictions in the XML
Schema documentation and I wonder if it's really that impossible. :)
 
Nadia Cobos

Good morning,

I know that elements with the same name in the same scope must have
the same type. I'm trying to write an XML Schema for a config file. We
want to validate specific attributes, assigning each one its own
restriction.

For example, the section called "Common" has the following elements and
attributes:
<Common>
  <add key="CountInstances" value="1.2"> Dec </add>
  <add key="LogLevel" value="2"> Int</add>
  <add key="OperationsPerFileList" value="1"> Int </add>
  <add key="frecuenciacentinela" value="2.0"> Dec </add>
</Common>

We want to specify, for example, that the value for the key
"CountInstances" is a decimal from 1 to 3, the value for "LogLevel" is
an integer from 0 to 3, the value for "OperationsPerFileList" is an
integer from 0 to 1, and the value for "frecuenciacentinela" is a
decimal from 0 to 2.

We tried using a fixed value instead:
<xs:complexType name="CommonType">
  <xs:sequence>
    <xs:element name="add" type="addType" maxOccurs="unbounded" />
  </xs:sequence>
  <xs:attributeGroup ref="myAttributeGroup" />
</xs:complexType>

<xs:attributeGroup name="myAttributeGroupCountInstances">
  <xs:attribute name="key" type="xs:string" fixed="CountInstances" />
  <xs:attribute name="value" type="numbyno" />
</xs:attributeGroup>

<xs:simpleType name="numbyno">
  <xs:restriction base="xs:decimal">
    <xs:totalDigits value="3" />
    <xs:fractionDigits value="1" />
    <xs:minInclusive value="0" />
    <xs:maxInclusive value="5" />
    <!--<xs:pattern value="[0-5](.2)?" />-->
  </xs:restriction>
</xs:simpleType>

...but we couldn't add another attributeGroup, because we can't declare
an attribute with the same name and a different type.

So we thought of using enumeration values, but we don't know whether a
specific key can be tied to a specific restriction. We used a union
type, but we don't know how to assign a specific restriction to a
specific key, for example restricting "CountInstances" to decimal
values from 1 to 3. What we have applies to all values in general (any
int, and so on). We hoped that declaring the member types in the same
order as the keys would work, but it didn't: when we changed the order
it still validated, just matching different member types.

<xsd:complexType name="CommonType">
  <xsd:sequence maxOccurs="unbounded">
    <xsd:element name="add" type="addType" />
  </xsd:sequence>
</xsd:complexType>

<xsd:complexType name="addType" mixed="true">
  <xsd:attributeGroup ref="CommonAttribute" />
</xsd:complexType>

<xsd:attributeGroup name="CommonAttribute">
  <xsd:attribute name="key" use="required">
    <xsd:simpleType>
      <xsd:restriction base="xsd:string">
        <xsd:enumeration value="CountInstances" />
        <xsd:enumeration value="LogLevel" />
        <xsd:enumeration value="frecuenciacentinela" />
        ....
      </xsd:restriction>
    </xsd:simpleType>
  </xsd:attribute>
  <xsd:attribute name="value" use="required">
    <xsd:simpleType>
      <xsd:union>
        <xsd:simpleType>
          <xsd:restriction base="xsd:int">
            <xsd:minInclusive value="1" />
            <xsd:maxInclusive value="3" />
          </xsd:restriction>
        </xsd:simpleType>
        <xsd:simpleType>
          <xsd:restriction base="xsd:decimal">
            <xsd:totalDigits value="3" />
            <xsd:fractionDigits value="1" />
            <xsd:minInclusive value="0" />
            <xsd:maxInclusive value="5" />
          </xsd:restriction>
        </xsd:simpleType>
        ....
      </xsd:union>
    </xsd:simpleType>
  </xsd:attribute>
</xsd:attributeGroup>

We hope you can help us or recommend something.

Thanks..
 
David Carlisle

It's not possible in XSD (or a DTD) to make the type of one
attribute depend on another attribute's value. You can specify this in
other schema languages (RELAX NG or Schematron, for example), or you can
use a less restrictive schema that doesn't distinguish based on
the attribute value and then check in the application layer, after
validation, that the two attributes are compatible.

David
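
The application-layer check suggested above might look something like the following sketch. The rule table is an assumption built from the ranges listed in the question; the class CommonSettingsCheck, the Rule holder and isAllowed are hypothetical names, and the two lexical patterns are a simplification of xs:integer and xs:decimal.

```java
import java.math.BigDecimal;
import java.util.Map;

class CommonSettingsCheck {
    // Hypothetical holder for one key's constraint: number kind plus inclusive range.
    static final class Rule {
        final boolean integerOnly;
        final BigDecimal min;
        final BigDecimal max;

        Rule(boolean integerOnly, String min, String max) {
            this.integerOnly = integerOnly;
            this.min = new BigDecimal(min);
            this.max = new BigDecimal(max);
        }
    }

    // Ranges taken from the question; adjust to the real requirements.
    static final Map<String, Rule> RULES = Map.of(
        "CountInstances", new Rule(false, "1", "3"),
        "LogLevel", new Rule(true, "0", "3"),
        "OperationsPerFileList", new Rule(true, "0", "1"),
        "frecuenciacentinela", new Rule(false, "0", "2"));

    // Checks that the "value" attribute is compatible with the "key" attribute.
    static boolean isAllowed(String key, String value) {
        if (key == null || value == null) {
            return false;
        }
        Rule rule = RULES.get(key);
        if (rule == null) {
            return false; // unknown keys are rejected in this sketch
        }
        String text = value.trim();
        // Simplified lexical forms: plain digits, optionally with a fraction.
        String pattern = rule.integerOnly ? "[+-]?\\d+" : "[+-]?\\d+(\\.\\d+)?";
        if (!text.matches(pattern)) {
            return false;
        }
        BigDecimal number = new BigDecimal(text);
        return number.compareTo(rule.min) >= 0 && number.compareTo(rule.max) <= 0;
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("CountInstances", "1.2"));
        System.out.println(isAllowed("LogLevel", "4"));
    }
}
```

The schema would then only need to require that "key" is one of the known names and that "value" is some decimal, with this check run on each "add" element after validation succeeds.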
 
