Two different element types with the same name ?

Ludovic Kuty

Dear group,

In the article entitled "Namespace Myths Exploded" ( http://www.xml.com/lpt/a/395 ), there is something (in fact a few things but let's focus on this one thing) that bothers me. In Myth 2 called "Universal names uniquely identify element types and attributes" we have a code sample :

<?xml version="1.0" ?>
<A xmlns="http://www.foo.org/">
<A>abcd</A>
</A>

And the associated question in the text : "Do these share a single element type or do they have two different element types?". I don't understand. I thought that the DTD will constrain the element A by telling us what is acceptable as attributes and what is acceptable as content. Once it is done, how could we get two element types ? I mean A should/could be defined as :

<!ELEMENT A (#PCDATA | A)>
<!ATTLIST A xmlns CDATA #IMPLIED>

For me, it is just one type, not two. How could we get two different element types with the same name ? I understand that it "could" be possible to distinguish them based on the context but I don't think DTDs or W3C XML Schemas allow this. Could someone enlighten me on this matter ?

Also, I would be interested in any resource (book, article, Web page, ...) that talks about "element types" and not just "elements" and that is quite formal about XML. You may point me to the right location in the XML recommendation.

TIA,

Ludovic Kuty
 
Martin Honnen

Ludovic said:
In the article entitled "Namespace Myths Exploded" ( http://www.xml.com/lpt/a/395 ), there is something (in fact a few things but let's focus on this one thing) that bothers me. In Myth 2 called "Universal names uniquely identify element types and attributes" we have a code sample :

<?xml version="1.0" ?>
<A xmlns="http://www.foo.org/">
<A>abcd</A>
</A>

And the associated question in the text : "Do these share a single element type or do they have two different element types?". I don't understand. I thought that the DTD will constrain the element A by telling us what is acceptable as attributes and what is acceptable as content. Once it is done, how could we get two element types ? I mean A should/could be defined as :

<!ELEMENT A (#PCDATA | A)>
<!ATTLIST A xmlns CDATA #IMPLIED>

For me, it is just one type, not two. How could we get two different element types with the same name ? I understand that it "could" be possible to distinguish them based on the context but I don't think DTDs or W3C XML Schemas allow this. Could someone enlighten me on this matter ?

At least when defining a schema you would not define the "xmlns"
attribute, rather you would set up the targetNamespace for the schema to
be http://www.foo.org/. See http://www.w3.org/TR/REC-xml-names/#ns-decl,
it says "The prefix xmlns is used only to declare namespace bindings and
is by definition bound to the namespace name
http://www.w3.org/2000/xmlns/. It MUST NOT be declared."

I realize that does not answer your question, but I think it is worth
mentioning given the XML you have and the question of how to set up a schema.
 
Peter Flynn

Ludovic said:
Dear group,

In the article entitled "Namespace Myths Exploded"
(http://www.xml.com/lpt/a/395), there is something (in fact a few
things but let's focus on this one thing) that bothers me. In Myth 2
called "Universal names uniquely identify element types and
attributes" we have a code sample

<?xml version="1.0" ?>
<A xmlns="http://www.foo.org/">
<A>abcd</A>
</A>

And the associated question in the text : "Do these share a single
element type or do they have two different element types?". I don't
understand.

The simple answer, strictly by the book (well, the XML Specification) is
that this document has a single element type called A, with two
instances of it. The fact that the first instance has an attribute
called xmlns is misleading and irrelevant.

But as Martin has pointed out, in Schema-land, a namespace binding must
be specified with a prefix, eg

<?xml version="1.0" ?>
<A xmlns:foo="http://www.foo.org/">
<A>abcd</A>
</A>

This binds the first A element *and its content* to the namespace
specified, using the prefix "foo". So the second instance of A also
acquires that namespace, and therefore again, there is a single element
type in use here.

Ludovic said:
I thought that the DTD will constrain the element A by
telling us what is acceptable as attributes and what is acceptable as
content.

That is correct, but has nothing to do with the question. What you are
describing here is a content model: the specification of what an element
type is permitted to contain.

Ludovic said:
Once it is done, how could we get two element types ? I mean A
should/could be defined as :

<!ELEMENT A (#PCDATA | A)>
<!ATTLIST A xmlns CDATA #IMPLIED>

Not quite: it should be <!ELEMENT A (#PCDATA | A)*> so that the inner A
is optional, otherwise it would be compulsory and lead to infinite
recursion.

Ludovic said:
For me, it is just one type, not two.

Correct.

Ludovic said:
How could we get two different element types with the same name?

By using namespaces to distinguish them:

<?xml version="1.0" ?>
<a xmlns:html="http://www.w3.org/1999/xhtml">
<a xmlns:foo="http://www.foo.org/">abcd</a>
</a>

This is an a element type from XHTML containing a completely different a
element type taken from some schema defined by Foo, Inc. Two separate
types of element.

In XML processing (eg XPath), the name() function will reference the
prefixed name (eg html:a or foo:a) and the local-name() function will
reference the unprefixed name (a in both cases).

Ludovic said:
I understand that it "could" be possible to distinguish them based on
the context but I don't think DTDs or W3C XML Schemas allow this.

You can infer a namespace by the context in the sense that all the
content of an element in one namespace is held to be in the same
namespace unless specified otherwise as above. But no, in the general
sense, unmarked, you cannot infer a *distinction* between elements based
on namespace, where the namespace is not given.

Ludovic said:
Also, I would be interested in any resource (book, article, Web
page, ...) that talks about "element types" and not just "elements"
and that is quite formal about XML. You may point me to the right
location in the XML recommendation.

The formal distinction comes from SGML, so the canonical location is ISO
8879:1988. You can still buy Charles Goldfarb's _SGML Handbook_, where
you will find it explained in production 117 at p.406 (the term actually
goes back much further into markup history). You will find the terms
used in this way in any book or web page written for the formal
discussion of markup.

What you declare in a DTD or Schema are element types. An element of
type A in [old] HTML is for marking a hypertext Anchor. There can only
ever be one of each (modulo the use of namespaces to distinguish
conflicting names), because you can only declare an element type once.

What you use in documents are element instances: in effect, this term
describes occurrences of elements of a particular type, and is usually
what is meant by the word "element".

The looser use of the word "element" to cover both meanings is
widespread in informal discussion.

For more discussion, see http://www.flightlab.com/~joe/sgml/faq-not.txt
especially Part 5 :)

///Peter
 
Joe Kesselman

Ludovic said:
<?xml version="1.0" ?>
<A xmlns="http://www.foo.org/">
<A>abcd</A>
</A>

And the associated question in the text : "Do these share a single
element type or do they have two different element types?"

Single type, but for two different reasons. The answer depends on
whether you're talking about types defined by a DTD, or types defined by
a schema.

DTDs are not namespace-aware. So if they define a type for the element
<A/>, there is only one such definition; A is A is A, and the answer is
"single".


Schemas *are* aware of XML Namespaces. (If you haven't read an XML
tutorial which explains namespaces, DO SO AS SOON AS POSSIBLE because
they've become an essential part of XML. You should also find a good
tutorial introduction to XML Schemas, which are now preferred over DTDs
for many reasons.)

The outer element
<A xmlns="http://www.foo.org/">
defines a default namespace binding using the xmlns= attribute. That
default is immediately applied to this element. Thus the element name
is, effectively, something like {http://www.foo.org/}:A, and XML Schema
validation will specifically look for a schema for the
http://www.foo.org/ namespace which defines the element A in that namespace.

For the inner element, you have to be aware that the default namespace
binding is inherited down the tree until/unless it is explicitly rebound
or unbound. This means the inner <A/> element is considered to have the
same namespace as the outer one, and therefore the same effective name
and the same type. Again, the answer is "single".
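
The inheritance of the default namespace described above can be checked with any namespace-aware parser. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` module (an illustration only, not something used in this thread). It prints the expanded name of each element in the article's sample, in ElementTree's `{namespace-uri}local-name` notation.

```python
import xml.etree.ElementTree as ET

# The sample from the article, without the XML declaration.
DOC = '<A xmlns="http://www.foo.org/"><A>abcd</A></A>'

outer = ET.fromstring(DOC)
inner = outer[0]  # the nested A element

# ElementTree reports element names in "{namespace-uri}local-name" form.
print(outer.tag)
print(inner.tag)

# The inner element declares nothing itself; it inherits the default binding.
print(outer.tag == inner.tag)
```

Both elements report the same expanded name, which is the sense in which the namespace-aware answer is "single".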


Since it looks like you need to find some better instructional material,
let me include my standard recommendation for the XML section of
DeveloperWorks (http://www.ibm.com/developerworks/xml/). This site has a
LOT of good resources, from basic tutorials to expert techniques to
arguments for or against competing standards. (Not all of which agree
with IBM's recommendations, by the way. DeveloperWorks operates as a
semi-autonomous web 'zine.)



--
Joe Kesselman,
http://www.love-song-productions.com/people/keshlam/index.html

{} ASCII Ribbon Campaign | "may'ron DaroQbe'chugh vaj bIrIQbej" --
/\ Stamp out HTML mail! | "Put down the squeezebox & nobody gets hurt."
 
Alain Ketterlin

[...]
<?xml version="1.0" ?>
<A xmlns:foo="http://www.foo.org/">
<A>abcd</A>
</A>

This binds the first A element *and its content* to the namespace
specified, using the prefix "foo". So the second instance of A also
acquires that namespace, and therefore again, there is a single element
type in use here.

I'm not sure this is correct. In this example, both A elements have a
"no-value" namespace (see "Namespaces in XML", section 6.2, par. 3: "If
there is no default namespace declaration in scope, the namespace name
has no value"). Both elements are of the same type, but they do not
belong to the namespace declared. They would if they were named foo:A.

But maybe you're just talking about binding prefixes to namespaces...

By the way, I'm not even sure this example is allowed by the Namespaces
recomm.

Peter said:
By using namespaces to distinguish them:

<?xml version="1.0" ?>
<a xmlns:html="http://www.w3.org/1999/xhtml">
<a xmlns:foo="http://www.foo.org/">abcd</a>
</a>

Again, these unprefixed names have no associated namespace. Having:

<?xml version="1.0" ?>
<html:a xmlns:html="http://www.w3.org/1999/xhtml">
<foo:a xmlns:foo="http://www.foo.org/">abcd</foo:a>
</html:a>

shows two distinct elements. Another way to write this would be:

<?xml version="1.0" ?>
<a xmlns="http://www.w3.org/1999/xhtml">
<a xmlns="http://www.foo.org/">abcd</a>
</a>

(i.e., all prefixes removed).

-- Alain.
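
The three variants discussed in this post can be compared side by side. This is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` module; the helper name `names` is hypothetical and exists only for this illustration. It returns the expanded names of the outer element and its child for each document.

```python
import xml.etree.ElementTree as ET

def names(doc):
    """Return the expanded names of the root element and its first child."""
    outer = ET.fromstring(doc)
    return outer.tag, outer[0].tag

# Prefixes are declared but never used: both elements stay in no namespace.
unused = names('<a xmlns:html="http://www.w3.org/1999/xhtml">'
               '<a xmlns:foo="http://www.foo.org/">abcd</a></a>')

# Prefixes are declared and used on the element names.
prefixed = names('<html:a xmlns:html="http://www.w3.org/1999/xhtml">'
                 '<foo:a xmlns:foo="http://www.foo.org/">abcd</foo:a></html:a>')

# No prefixes; the default namespace is rebound on the inner element.
defaulted = names('<a xmlns="http://www.w3.org/1999/xhtml">'
                  '<a xmlns="http://www.foo.org/">abcd</a></a>')

print(unused)
print(prefixed)
print(defaulted)
```

Only the last two documents contain two differently named elements, and a namespace-aware processor sees those two documents as having the same element names.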
 
Joe Kesselman

Alain's note is entirely correct. Namespaces get bound to elements
either when there is a default namespace declaration in scope, or when
the element uses the prefix which has been bound to that namespace URI.
Namespace declarations are inherited, but only the default namespace is
automatically applied.

Note that attributes *ONLY* get bound if there's an explicit prefix;
they are not affected by the default element namespace.
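
The attribute rule can be seen in a small experiment. This is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` module; the attribute names `id` and `foo:id` are made up for the illustration and do not come from the article.

```python
import xml.etree.ElementTree as ET

# The same URI is bound both as the default namespace and to the prefix "foo".
DOC = ('<A xmlns="http://www.foo.org/" xmlns:foo="http://www.foo.org/" '
       'id="1" foo:id="2"/>')

elem = ET.fromstring(DOC)

# The element name picks up the default namespace.
print(elem.tag)

# The unprefixed attribute does not; only the prefixed one is in a namespace.
for name, value in sorted(elem.attrib.items()):
    print(name, value)
```

The unprefixed `id` keeps a plain name while `foo:id` gets an expanded one, so the two are distinct attributes even though they sit on the same element.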

--
Joe Kesselman,
http://www.love-song-productions.com/people/keshlam/index.html

{} ASCII Ribbon Campaign | "may'ron DaroQbe'chugh vaj bIrIQbej" --
/\ Stamp out HTML mail! | "Put down the squeezebox & nobody gets hurt."
 
Ludovic Kuty

I understand namespaces (more or less) and my question was not focused towards namespaces but towards the fact that we have a default namespace declaration and both A belong to the same namespace named http://www.foo.org/ but they just look different. That factors out the namespace stuff out of the equation. I mean I could have written <A><A>abcd</A></A>, no NS at all. But I wanted to copy/paste the stuff in the article.

So they have to have the same type and I was really puzzled by the fact that the question has the words "two different element types" in it.

But I also overlooked the fact that we could have had :

<a xmlns="http://ns1.foo.org/">
<a xmlns="http://ns2.foo.org/">abcd</a>
</a>

Two different element types which happen to have the same name. Thanks to Alain Ketterlin who has reminded me of it.

I thought that my DTD was OK because we have a choice of #PCDATA or A, and thus a base case to stop the recursion. But my Apache Xerces complains "[Fatal Error] test.dtd:1:25: The mixed content model "A" must end with ")*" when the types of child elements are constrained.". Well, that's curious, because then you can't define an element A whose content is either character data or A, but not a mix of the two (I leave the WS due to indentation alone).

I appreciate the answers and the pointers to SGML.

Thanks
 
Martin Honnen

Ludovic said:
I understand namespaces (more or less) and my question was not
focused towards namespaces but towards the fact that we have a
default namespace declaration and both A belong to the same namespace
named http://www.foo.org/ but they just look different. That factors
out the namespace stuff out of the equation. I mean I could have
written <A><A>abcd</A></A>, no NS at all. But I wanted to copy/paste
the stuff in the article.

So they have to have the same type and I was really puzzled by the
fact that the question has the words "two different element types" in
it.

Well as far as the schema language is concerned you can define a schema

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="A">
<xs:complexType>
<xs:sequence>
<xs:element name="A" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>

where the root element is named 'A' and has a complex type with a
sequence of a child element also named 'A' but of a different type (e.g.
a simple type like xs:string).

And then a sample like

<A>
<A>foo</A>
</A>

is a valid instance of that schema.

But of course you can also use recursion and define a schema like

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="A">
<xs:complexType mixed="true">
<xs:sequence>
<xs:element ref="A" minOccurs="0"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>

where the root element is named 'A' and then as its type has mixed
contents with itself as a possible content. In that case the sample

<A>
<A>foo</A>
</A>

is also a valid instance of the second schema but now you can write
other samples like

<A>foo<A>bar<A>baz</A>
</A>
</A>

which is also a valid instance of the second schema but of course not of
the first one.

So in terms of the schema language it is possible to define elements of
the same name with different types (although not in the same scope, e.g.
you can't do
<xs:sequence>
<xs:element name="A" type="xs:string"/>
<xs:element name="A" type="xs:integer"/>
</xs:sequence>
).
 
Peter Flynn

[...]
<?xml version="1.0" ?>
<A xmlns:foo="http://www.foo.org/">
<A>abcd</A>
</A>

This binds the first A element *and its content* to the namespace
specified, using the prefix "foo". So the second instance of A
also acquires that namespace, and therefore again, there is a
single element type in use here.

I'm not sure this is correct. In this example, both A elements have
a "no-value" namespace (see "Namespaces in XML", section 6.2, par. 3:
"If there is no default namespace declaration in scope, the namespace
name has no value"). Both elements are of the same type, but they do
not belong to the namespace declared. They would if they were named
foo:A.

You're quite right. I must have been on drugs.

Alain said:
But maybe you're just talking about binding prefixes to
namespaces...

By the way, I'm not even sure this example is allowed by the
Namespaces recomm.

Martin pointed out that it's not.

I should do more work on namespaces. Or then maybe not...

///Peter
 
Ludovic Kuty

Thanks. That clarifies things. Actually, I am currently reading
"Definitive XML Schema" (2nd edition) by Priscilla Walmsley, so it makes sense.
 
mike myers

If it helps, there's also a pretty good tutorial here: http://www.liquid-technologies.com/Tutorials/XmlSchemas/XsdTutorial_01.aspx
 
