syntax of 'nested' tags

Ike

I am hoping someone can help me with the proper syntax for this. I have an
attribute, called, say "name," such that:

<set name="something">thename</set>

However, the value for name, is something that is unknown, something within
tags itself. So, for example, the tag <star index="1"/> might be
"something."

How then can I express, or more precisely, what would be the proper syntax
for <set name="<star index="1"/>">thename</set><category> given that one
cannot nest tags?

tia, Ike
 
Stefan Ram

Ike said:
How then can I express, or more precisely, what would be the proper syntax
for <set name="<star index="1"/>">thename</set><category> given that one
cannot nest tags?

This is one of the major flaws of XML.

I am using my own notation "Unotal" that allows for such
structured attributes. But let's get back to XML:

The usual solution is to use child elements:

<set>
<name>
<star index="1"/>
</name>
thename
</set>
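
To illustrate how an application would read this child-element form, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The variable names (doc, star_index, value) are mine, chosen only for illustration; the element names come from the example above.

```python
import xml.etree.ElementTree as ET

# The child-element form from the example above.
doc = """<set>
<name>
<star index="1"/>
</name>
thename
</set>"""

root = ET.fromstring(doc)

# The structured "name" is now an ordinary subtree that can be navigated.
star = root.find("name/star")
star_index = star.get("index")

# The text "thename" follows the closing </name> tag, so ElementTree
# exposes it as the tail of the <name> element.
value = root.find("name").tail.strip()

print(star_index, value)
```

The point of the sketch is only that the structure inside <name> stays available to the parser, which is not the case for a flat attribute string.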

I have a pre-fabricated rant, starting at the often-asked
question "When to use attributes and when to use child
elements?":

The criterion that makes sense regarding the meaning can not
be used in XML due to syntactic restrictions.

An element is describing something. A description is an
assertion. An assertion might contain unary predicates or
binary relations.

Comparing this structure of assertions with the structure
of XML, it seems to be natural to represent unary predicates
with types and binary relations with attributes.

Say, "x" is a rose and belongs to Jack. The assertion is:

rose( x ) ^ owner( x, "Jack" )

This is written in XML as:

<rose owner="Jack" />

Thus, my answer would be: use element types for unary
predicates and attributes for binary relations.

Unfortunately, in XML, this is not always possible, because in
XML:

- there might be at most one type per element,

- there might be at most one attribute value per attribute
name, and

- attribute values are not allowed to be structured in
XML.

Therefore, the designers of XML document types are forced to
abuse element /types/, to describe the /relation/ of an
element to its parent element.

This /is/ an abuse, because the designation "element type"
obviously is supposed to give the /type of an element/,
i.e., a property which is intrinsic to the element alone
and has nothing to do with its relation to other elements.

The document type designers, however, are being forced to
commit this abuse, to reinvent poorly the missing structured
attribute values using the means of XML. If a rose has two
owners, it needs to be written:

<rose>
<owner>Jack</owner>
<owner>Jill</owner>
</rose>

Here the notion "element type" suggests that it is marked that
Jack is "an owner", in the sense that "owner" is supposed to
be the type (the kind) of Jack. The intention of the author,
however, is that "owner" is supposed to give the /relation/ to
the containing element "rose". This is the natural field of
application for attributes, as the meaning of the word
"attribute" outside of XML makes clear, but it is not possible
to use them for this purpose in XML.

An alternative solution might be the following notation.

<rose owner="Alexander Marie" />

Here a /new/ mini language (not XML anymore) is used within an
attribute value, which, of course, can no longer be checked
by XML validators. This is actually done in practice, for
example in XHTML, where classes are written this way.

So in its main language XHTML, the W3C has to abandon XML
even to write class attributes. This is not such a good
accomplishment given that the W3C was able to use the
experience made with SGML and HTML when designing XML and that
XHTML is one of the most prominent XML applications.

The needless restrictions of XML inhibit the meaningful use of
syntax. They leave many document type designers wondering when
to use attributes and when to use elements, which is itself
evidence of a weakness in the design of XML, a language that
has few notations besides attributes and elements. And the W3C
failed to give even these two notations a clear and meaningful
purpose!

Without the restrictions described, XML alone would have
nearly the expressive power of RDF/XML, which has to repair
painfully some of the errors made in the XML-design.

Now, some recommend /always/ using subelements, because one
can never know whether an attribute value that seems to be
unstructured today might need to become structured tomorrow.
(Or they recommend using attributes only when one is quite
confident that they will never need to be structured.) This
recommendation does not even try to make sense of attributes;
it just explains how to circumvent the obstacles the W3C has
built into XML.

Others use an XML editor that happens to make the input of
attributes more comfortable than the input of elements and
seriously suggest, therefore, to use as many attributes as
possible.

Still others have studied how to use CSS to format XML
documents and are using this to give recommendations about
when to use attributes and when to use subelements.

Of course: Mixing all these criteria (structured vs.
unstructured, data vs. "metadata", by CSS, by the ease of
editing, ...) often will give conflicting recommendations.

Notations other than XML have solved the problem by either
omitting attributes altogether or by allowing structured
attributes. I believe that notations with structured
attributes, which also allow multiple element types and
multiple attribute values for the same attribute name,
are helpful.
 
Malcolm Dew-Jones

Ike wrote:

: I am hoping someone can help me with the proper syntax for this. I have an
: attribute, called, say "name," such that:

: <set name="something">thename</set>

: However, the value for name, is something that is unknown, something within
: tags itself. So, for example, the tag <star index="1"/> might be
: "something."

: How then can I express, or more precisely, what would be the proper syntax
: for <set name="<star index="1"/>">thename</set><category> given that one
: cannot nest tags?

<set name="&lt;star index=&quot;1&quot;/&gt;">

When an application reads the XML (using SAX or a similar API), the value of
the attribute returned by the XML parser to the application will contain
the originally desired value (the string <star index="1"/>).

When you create the tags you must encode the value before writing out the
text of the XML. If you use an XML creation program, it will either
perform whatever encoding is required or at least provide functions you can
use in the appropriate places.
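
As a rough sketch of that round trip, the following Python uses only the standard library: xml.sax.saxutils.quoteattr to encode the value when writing, and xml.etree.ElementTree to read it back. The variable names are illustrative, and this is just one way to do the encoding, not the only one.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

# The raw string we want to carry inside the attribute.
raw = '<star index="1"/>'

# quoteattr escapes markup characters and adds the surrounding quotes,
# so the result can be placed directly after "name=".
doc = f"<set name={quoteattr(raw)}>thename</set>"

# On reading, the parser undoes the escaping and hands back the raw string.
parsed = ET.fromstring(doc)
recovered = parsed.get("name")

# A serializer does the same encoding automatically when writing.
element = ET.Element("set", {"name": raw})
element.text = "thename"
reparsed = ET.fromstring(ET.tostring(element, encoding="unicode"))

print(recovered == raw, reparsed.get("name") == raw)
```

Note that the application receives a flat string either way; the parser does not treat the attribute value as a nested element.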
 
Richard Tobin

Malcolm Dew-Jones said:
<set name="&lt;star index=&quot;1&quot;/&gt;">

If you want the name to represent nested structure, this is a really
bad idea, since you have turned it into a flat string.

Use child elements to enclose structure, not attributes.

-- Richard
 
Peter Flynn

Ike said:
I am hoping someone can help me with the proper syntax for this. I have an
attribute, called, say "name," such that:

<set name="something">thename</set>

However, the value for name, is something that is unknown, something
within
tags itself. So, for example, the tag <star index="1"/> might be
"something."

I'm not sure I understand your phrase
How then can I express, or more precisely, what would be the proper syntax
for <set name="<star index="1"/>">thename</set><category> given that one
cannot nest tags?

One self-validating way is to use ID/IDREF.

The index values have to begin with a letter, e.g. <star index="S1"/>,
so external data may require preprocessing.

Declare the index attribute as type ID, and name attribute as type IDREF,
then you can say <set name="S1">thename</set>, and the element carrying
the ID "S1" can be any element in your document for which you declare the
attribute to be an ID.
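
A sketch of what this could look like follows. The DTD in the internal subset (a wrapper element "category" containing star and set elements) is a hypothetical content model made up for illustration, not something from the original question. Python's xml.etree.ElementTree does not validate against a DTD, so the sketch checks by hand that every IDREF points at an existing ID; a validating parser would perform that check itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the DTD declares star/@index as ID
# and set/@name as IDREF, as suggested above.
doc = """<!DOCTYPE category [
  <!ELEMENT category (star*, set*)>
  <!ELEMENT star EMPTY>
  <!ATTLIST star index ID #REQUIRED>
  <!ELEMENT set (#PCDATA)>
  <!ATTLIST set name IDREF #REQUIRED>
]>
<category>
  <star index="S1"/>
  <set name="S1">thename</set>
</category>"""

root = ET.fromstring(doc)

# Collect every declared ID, then look for references that match none.
ids = {el.get("index"): el for el in root.iter("star")}
dangling = [s.get("name") for s in root.iter("set")
            if s.get("name") not in ids]

print("dangling references:", dangling)
```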

///Peter
 
Ken Starks

Peter said:
I'm not sure I understand your phrase

I'm not sure I understand your question either, Ike.
But it occurs to me you might be trying to define a template
for xsl.

In this case the thing you need is <xsl:attribute>

For your example you might put:

<set>
<xsl:attribute name="name">something</xsl:attribute>
the name
</set>


Instead of 'something', you can put a <xsl:value-of> element, or
whatever.
 
C. M. Sperberg-McQueen

This is one of the major flaws of XML.

You are certainly right that XML has flaws, just like
everything else beneath the moon. I'm not sure I agree
with your diagnosis in detail, though.

An element is describing something. A description is an
assertion. An assertion might contain unary predicates or
binary relations.

No assertions of arity greater than 2? No assertions
involving entities other than the entity assumed to be
represented by each element instance?

I think your account of a natural XML semantics is too simple.

Comparing this structure of assertions with the structure
of XML, it seems to be natural to represent unary
predicates with types and binary relations with
attributes.
Say, "x" is a rose and belongs to Jack. The assertion is:
rose( x ) ^ owner( x, "Jack" )
This is written in XML as:
<rose owner="Jack" />

Correction: What you show is *one* way to write it in XML.

Thus, my answer would be: use element types for unary
predicates and attributes for binary relations.
Unfortunately, in XML, this is not always possible,
because in XML:
- there might be at most one type per element,
- there might be at most one attribute value per
attribute name, and
- attribute values are not allowed to be structured in
XML.

This doesn't seem plausible to me. Apart from the fact that
(as various contributions to the thread, not to mention the
'match' attribute of XSLT, amply illustrate) the value of an
attribute may be written in any notation of one's choice,
there is the fact that NMTOKENS, IDREFS, ENTITIES, are native
attribute types in XML 1.0.

Therefore, the designers of XML document types are forced
to abuse element /types/, to describe the /relation/ of an
element to its parent element.

I don't think you have made even a prima facie case that this
constitutes abuse of any kind.

This /is/ an abuse, because the designation "element type"
obviously is supposed to give the /type of an element/,
i.e., a property which is intrinsic to the element alone
and has nothing to do with its relation to other elements.

? You seem to be taking as a premise that types and relations
have nothing to do with each other. Why would we assume that?
One useful way to distinguish types of things, in any
modeling, is to observe the relations they can legitimately
enter into. A registered student in good standing at a
university can be enrolled in a particular class; one way to
represent this is with a relation holding between the student
and the class. A human being who is not registered cannot be
enrolled in the class. We might infer from this that we wish
to define two distinct types: one for human beings in general,
and one for registered students.

The document type designers, however, are being forced to
commit this abuse, to reinvent poorly the missing
structured attribute values using the means of XML. If a
rose has two owners, it needs to be written:
<rose>
<owner>Jack</owner>
<owner>Jill</owner>
</rose>

? Not necessarily. As you point out below,

<rose owners="Jack Jill"/>

is a perfectly legitimate representation of the information.

Here the notion "element type" suggests that it is marked
that Jack is "an owner", in the sense that "owner" is
supposed to be the type (the kind) of Jack. The intention
of the author, however, is that "owner" is supposed to
give the /relation/ to the containing element "rose".
This is the natural field of application for attributes,
as the meaning of the word "attribute" outside of XML
makes clear, but it is not possible to use them for this
purpose in XML.

It seems to me your objection applies with greater force to
the relational model of databases, since in that model the
ownership attribute of the rose really must be separated from
other attributes whose values are guaranteed single and
atomic.

An alternative solution might be the following notation.
<rose owner="Alexander Marie" />
Here a /new/ mini language (not XML anymore) is used
within an attribute value, which, of course, can not be
checked anymore by XML validators.

What validation are you interested in? A DTD-based XML
validator can check to ensure that 'Alexander' and 'Marie' are
both NMTOKENs, or to ensure that they are both ID values on
some elements in the document. A schema-based validator can
do those or other things.

Even if I were to accept your premise that "Alexander Marie"
is "not XML", I would find it unsurprising that XML allows the
use of non-XML notations for information. Any human-readable
document is likely to have a great deal of information
expressed only in natural language; from the very beginning,
therefore, SGML and XML have been made compatible with the
view that there may be information in the document which is
not exhibited directly by the XML markup. I have occasionally
taken the view that structured information of any kind is
almost always best represented in XML, not in specialized
notations (so I favored an instance-based notation for
document grammars even in 1996, and have mocked ISO 8879
mercilessly for providing a distinct metalinguistic notation).
I still think that's a good rough rule. But as time has
passed I have noticed more and more cases where the position
taken by the designers of SGML seems to be the right one:
allow for the existence of non-SGML notations, and do not
insist on being the sole notation in which to write
information.

So in its main language XHTML, the W3C has to abandon XML
even to write class attributes. This is not such a good
accomplishment given that the W3C was able to use the
experience made with SGML and HTML when designing XML and
that XHTML is one of the most prominent XML applications.

Hmm. Never occurred to me to think that the definition of the
'class' attribute was a problem that needed fixing.
Space-delimited tokens are really not hard to handle in most
languages I've worked with. YMMV, of course.

The needless restrictions of XML inhibit the meaningful
use of syntax. This makes many document type designers
wondering, when attributes and when elements are supposed
to be used, which actually is an evidence of incapacity

Asking when a vocabulary designer is "supposed" to use
elements and when attributes feels to me a lot like
asking when a sketch artist is supposed to use straight
lines, and when they are supposed to use curved lines.
Of course, I don't believe that XML has a single way to
represent any particular unary or binary or n-ary
predicate -- nor do I believe that a particular set of
predicates or relations is ever likely to be the only
plausible set with which to represent a particular
body of information. Will we always write

rose( x ) ^ owner( x, "Jack" )

and never any of

member(x, roses) & owns("Jack", x)
rose(x) & person(jack) & relation(jack,x,owns)
rose(x) & person(jack) & relation(x,jack,chattel)
ownership-relation(y) & instance_of(z,y) &
arg1(z,jack) & arg2(z,rose)
jack(owner_of(r)) and r(rose) and jack(human)
time(t) & human(j) & flower(r) & variety(r,rose)
& relationship(o) & true(o,j,r,t)

or any of the infinity of other ways to formalize the
proposition that Jack owns a rose?

It's quite true that XML does not prescribe a particular usage
for elements and attributes. This follows from the fact that
XML does not prescribe any particular method of using XML to
encode information or assigning semantics to tags. Some
people paraphrase this point by saying that XML "has no
semantics" or "is just syntax". But any application of XML
does have semantics. It's just that the specification of
semantics is under the control of the vocabulary designer and
not under the control of the XML Working Group or the XML
spec. There is no set of semantic primitives to which all XML
vocabularies are automatically reducible (the way the
semantics of all TeX macros are ultimately reducible to ink on
paper), there is no pattern or structure to which the
semantics need conform (the way systems of first order logic
tend to need to talk about individuals and predicates taking
individuals as arguments). The semantics of an XML
application are limited only by human ingenuity.

Personally, I think that's one of the main reasons XML has
such wide applicability: the semantics of the markup can be
anything the designer can make them be. If the price of that
freedom is that the designer gets no binding rule about when
to use attributes and when to use elements, -- well, speaking
for myself I think that's a low price to pay.

-C. M. Sperberg-McQueen
World Wide Web Consortium
 
Stefan Ram

No assertions of arity greater than 2?

Saying "an assertion might contain unary predicates or binary
relations" does not imply that it might not contain relations
with an arity greater than 2.

No assertions involving entities other than the entity
assumed to be represented by each element instance?

An assertion might contain such entities.

I think your account of a natural XML semantics is too simple.

I hope the above clarifications have resolved that point.

This doesn't seem plausible to me. Apart from the fact that
(as various contributions to the thread, not to mention the
'match' attribute of XSLT, amply illustrate) the value of an
attribute may be written in any notation of one's choice,
there is the fact that NMTOKENS, IDREFS, ENTITIES, are native
attribute types in XML 1.0.

NMTOKENS, IDREFS and ENTITIES often will not suffice
to describe a notation used in attribute values.

One is free to use any notation within attribute values, but
such a notation then will not have much to do with XML.

? You seem to be taking as a premise that types and relations
have nothing to do with each other. Why would we assume that?

Because a "type" T of an object usually is a property marking
the object as belonging to a certain set T. The word "type"
might be used to mean

"a member of an indicated class" -- Merriam-Webster Online

When the type of an object is "car", it is a member of the set
S of all cars. So the type gives /one/ specification regarding
the object (this specification is the set S).

An "attribute" might be - according to Merriam-Webster Online
- "an object closely associated with or belonging to a
specific person, thing, or office".

So, while a type T of an object e is a /single/ specification
"T", an attribute "Ro" is its relation R to another object "o".
An attribute consists of /two/ specifications: the other
object "o" of the association and the kind "R" of the association.

By choosing these English words "type" and "attribute" and by
implementing them with the structure "<T/>" with /one/
specification "T" and <... R=o ...> with /two/ specifications
"R" and "o", just as types and attributes usually have, the
XML specification suggests that the words "type" and
"attribute" in the XML specification are intended to mean
something like "type" and "attribute" in the English language.
This is some kind of suggested semantics.

What validation are you interested in?

It might make sense to be able to verify that the attribute
value contains only names from the set { Alexander, Marie,
Jack, Jill } separated by white space.
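
For comparison, here is a small sketch of performing that check in application code rather than in the validator, written in Python with the standard library. The function name invalid_owners and the constant ALLOWED are hypothetical names introduced only for this illustration.

```python
import xml.etree.ElementTree as ET

# The permitted names from the example above.
ALLOWED = {"Alexander", "Marie", "Jack", "Jill"}

def invalid_owners(xml_text, attr="owner"):
    """Return the whitespace-separated tokens in the attribute that are
    not in ALLOWED (an empty list means the value passes the check)."""
    root = ET.fromstring(xml_text)
    tokens = root.get(attr, "").split()
    return [token for token in tokens if token not in ALLOWED]

print(invalid_owners('<rose owner="Alexander Marie" />'))
print(invalid_owners('<rose owner="Jack Bob" />'))
```

This works, but the rule lives in the program and not in the document type, which is exactly the limitation being discussed.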

If an attribute value could be a structure, i.e., an element,
then the whole apparatus of a schema language could be used to
specify restrictions for this element.

Will we always write
rose( x ) ^ owner( x, "Jack" )
and never any of
member(x, roses) & owns("Jack", x) (...)
or any of the infinity of other ways to formalize the
proposition that Jack owns a rose?

All these notations, of course, are perfectly legitimate for
certain purposes.

Some people paraphrase this point by saying that XML "has no
semantics" or "is just syntax".

Even though certain names in XML (such as "type" or
"attribute") might suggest some meaning, I can agree with
that.

The problem is that the syntax is too restrictive to fit some
natural way to use it when it forbids structured attributes.
(If this is too much freedom for a certain application, it
might still forbid this via a schema.)

A comparison with mathematical notation would be: Imagine
that one might write the equation:

x=2

but would be forbidden to write the equation

x=2+3

because the right-hand side of "=" would not be allowed to be
structured. Then, whenever one wants to write "x=2+3", one
would be forced to write, for example:

x/(2+3)

only because the right-hand side of the division-operator "/"
would be allowed to be structured. One would have to explain
somewhere, then, when "/" is used to mean "division" and when
it is (ab)used to write an "equation".

The semantics of an XML application are limited only by human
ingenuity.

... and by the syntax of XML.

It might make no sense to specify a vocabulary for attribute
names, when the intended attributes are not possible, because
they need to be structured. So the XML-application designer is
forced to invent some workaround, such as using child element
types instead of attribute names, thereby contradicting the
meaning of the words "type" and "attribute" in the English
language, (ab)using a type name as the name of a relation.
 
C. M. Sperberg-McQueen

(e-mail address removed) (C. M. Sperberg-McQueen) writes:

Because a "type" T of an object usually is a property marking
the object as belonging to a certain set T. ...

I think everything you say here is true. But none of it
explains why you think the use of something called a "type"
to express something called a "relation" is an abuse of language
or of anything else.

It might make sense to be able to verify that the attribute
value contains only names from the set { Alexander, Marie,
Jack, Jill } separated by white space.

If an attribute value could be a structure, i.e., an element,
then the whole apparatus of a schema language could be used to
specify restrictions for this element.

XML Schema can indeed perform the validation you describe.

If we wish to treat lists of atomic values as structures
(and why not?), then you are quite right: if this attribute
value can be a structure, it can be validated in the way
you describe. It can in fact be treated as a structure, both
by XML Schema and by DTDs -- and so it can be validated
(by XSD in the way you describe, by DTDs only in a weaker
way).

best regards,

-CMSMcQ
 
