Basic XML question

mlt

It seems that nodes can have different definitions:

<box>
<spec>12</spec>
<spec>13</spec>
</box>
<box>
<spec id="BA" value="short"/>
<spec id="BB" value="thin"/>
</box>

1) The first box node contains two nodes without attributes that are
"closed": <spec>...</spec>
- these kinds of nodes only seem to have a "textContent"

2) The second box node contains two nodes with two attributes that are "open":
<spec ..... />
- these nodes do not have a "textContent"

What are these different ways of defining nodes called? And when is it good
to use the one instead of the other?
 
Joe Kesselman

What are these different ways of defining nodes called? And when is it
good to use the one instead of the other?

An element which does not have content (whether or not it has
attributes) may be expressed either with an open and close tag pair:
<foo></foo>
or with an "empty element" tag:
<foo/>

These two forms are completely identical semantically; the difference is
almost purely one of convenience and personal preference.

Why "almost"? Well, it is officially preferred (i.e., there is a SHOULD
keyword in the XML Recommendation) that the empty-element tag be used
only for elements whose DTD or Schema specification says they can never
contain content (i.e., that in a Valid document they will always be
empty). But I don't believe anyone has ever actually followed that
preference, since the empty-element tag is a trifle more efficient to
type by hand or process by machine. So that proposed convention seems to
have died due to lack of support from the community, and in practice the
two forms really are completely equivalent.

I'm presuming that's what you meant by "open" versus "closed".
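
To see the equivalence concretely, here is a minimal sketch using
Python's standard xml.etree.ElementTree module (my choice for
illustration; any conforming parser should behave the same way). The
describe() helper is a made-up name, not part of any library. It parses
both spellings and compares what the parser reports.

```python
import xml.etree.ElementTree as ET

def describe(elem):
    """Hypothetical helper: summarize what the parser saw for one element."""
    return (elem.tag, dict(elem.attrib), elem.text, len(elem))

# The same empty element, written with a tag pair and with an empty-element tag.
pair = ET.fromstring("<foo></foo>")
empty = ET.fromstring("<foo/>")

# The same comparison with attributes, as in the second <box> of the question.
spec_pair = ET.fromstring('<spec id="BA" value="short"></spec>')
spec_empty = ET.fromstring('<spec id="BA" value="short"/>')

print(describe(pair) == describe(empty))            # both forms parse alike
print(describe(spec_pair) == describe(spec_empty))  # attributes do not change that
print(ET.tostring(pair) == ET.tostring(empty))      # and they serialize alike
```

Once parsed, nothing records which spelling the document used, so code
that consumes the tree cannot tell the two forms apart.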

Note that this is completely orthogonal to the decision of whether the
element should carry data as attributes, as content, or both. That
decision is made when the XML-based language is designed, based on a
combination of stylistic preferences and practical concerns. Attributes
have their values normalized to some degree, have no meaningful order,
and are limited to name-value pairs. Contained nodes, by contrast, may
include any desired mix of text/elements/comments/processing
instructions, do not normalize the text, and do reliably report their
order as a possibly significant piece of information. Both approaches
have advantages and disadvantages.

Generally, by convention, attributes are used for information that
modifies how the element they're attached to should be interpreted,
while the information that is logically contained by that element is
placed in its content ("child nodes"). There are exceptions, though,
which is where stylistic judgment comes in. Do what makes sense given
what the document is intended to mean, but if there's any doubt you
should probably lean toward child nodes, since they give you more
options for future enhancement. If you need more guidance on this,
websearching for "XML children versus attributes" or similar phrases
will find many other discussions of this trade-off.
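
A few of those differences can be checked directly. The following is
only a sketch, again assuming Python's standard xml.etree.ElementTree;
the "thin\nshort" value is made up for illustration and all variable
names are mine. It shows that child elements keep their document order
and may repeat, that a literal line break inside an attribute value is
normalized to a space while the same text in element content is left
alone, and that an attribute name cannot be repeated on one element.

```python
import xml.etree.ElementTree as ET

# Child elements may repeat, and their order is reported reliably.
box = ET.fromstring("<box><spec>12</spec><spec>13</spec></box>")
child_values = [spec.text for spec in box]
print(child_values)

# A literal newline is normalized in an attribute value but kept in content.
as_attr = ET.fromstring('<spec value="thin\nshort"/>')
as_text = ET.fromstring("<spec>thin\nshort</spec>")
attr_value = as_attr.get("value")
text_value = as_text.text
print(repr(attr_value), repr(text_value))

# Attributes are name-value pairs: the same name twice is not well-formed.
try:
    ET.fromstring('<box spec="12" spec="13"/>')
    duplicate_rejected = False
except ET.ParseError:
    duplicate_rejected = True
print(duplicate_rejected)
```

So a list of values, or anything whose whitespace or order matters, is
usually more comfortable as child content than squeezed into attributes.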

Hope that helps.
 
