Tags vs. Attributes

Discussion in 'XML' started by gaijinco, Jul 8, 2008.

  1. gaijinco

    gaijinco Guest

    I'm fairly new to using XML and I tend to be quite verbose when
    writing files.

    Is there any disadvantage to writing:

    <person>
    <name first="Carlos" second="" />
    <lastname first="Obregon" second="Jimenez" />
    <id type="CC" number="79879389" />
    <birthday day="17" month="02" year="1979" />
    <member day="01" month="06" year="2007" />
    <adress id="Evegreen Cr. 1234" />
    <telephone number="555-123456" />
    <email user="me" domain="home.com" />
    </person>

    Instead of:

    <person>
    <name>
    <first>Carlos</first>
    <second></second>
    </name>
    <lastname>
    <first>Obregon</first>
    <second>Jimenez</second>
    </lastname>
    <id>
    <type>CC</type>
    <number>79879389</number>
    </id>
    <birthday>
    <day>17</day>
    <month>02</month>
    <year>1979</year>
    </birthday>
    <member>
    <day>01</day>
    <month>06</month>
    <year>2008</year>
    </member>
    <adress>Evegreen Cr. 1234</adress>
    <telephone>555-123456</telephone>
    <email>
    <user>me</user>
    <domain>home.com</domain>
    </email>
    </person>
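
    Both layouts can be read with an ordinary parser; what changes is
    how each value is reached. A minimal sketch, assuming Python's
    standard xml.etree.ElementTree module, with the two listings
    abridged and the helper names invented here for illustration:

```python
import xml.etree.ElementTree as ET

# Attribute-heavy form (abridged from the first listing).
ATTR_FORM = """
<person>
  <name first="Carlos" second="" />
  <birthday day="17" month="02" year="1979" />
</person>
"""

# Element-heavy form (abridged from the second listing).
ELEM_FORM = """
<person>
  <name>
    <first>Carlos</first>
    <second></second>
  </name>
  <birthday>
    <day>17</day>
    <month>02</month>
    <year>1979</year>
  </birthday>
</person>
"""


def first_name_from_attributes(xml_text):
    """Read the first name when it is stored as an attribute."""
    person = ET.fromstring(xml_text)
    # Attribute values are plain strings looked up by name.
    return person.find("name").get("first")


def first_name_from_elements(xml_text):
    """Read the first name when it is stored as a child element."""
    person = ET.fromstring(xml_text)
    # Child content is reached by path; findtext returns the text.
    return person.findtext("name/first")


attr_first = first_name_from_attributes(ATTR_FORM)
elem_first = first_name_from_elements(ELEM_FORM)
```

    Either way the application ends up with the same string, so the
    choice is more about structure and validation than about access.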

    Thanks.
    gaijinco, Jul 8, 2008
    #1

  2. gaijinco

    Mukul Gandhi Guest

    Mukul Gandhi, Jul 9, 2008
    #2

  3. gaijinco

    Peter Flynn Guest

    Peter Flynn, Jul 13, 2008
    #3
  4. gaijinco

    Stefan Ram Guest

    gaijinco writes:
    ><name first="Carlos" second="" />
    ><name>
    > <first>Carlos</first>
    > <second></second>
    ></name>


    When a new document type is to be defined, when should one
    choose child elements and when attributes?

    The criterion that makes sense regarding the meaning can not
    be used in XML due to syntactic restrictions.

    An element is describing something. A description is an
    assertion. An assertion might contain unary predicates or
    binary relations.

    Comparing this structure of assertions with the structure
    of XML, it seems to be natural to represent unary predicates
    with types and binary relations with attributes.

    Say, "x" is a rose and belongs to Jack. This assertion can
    be written in a more formal way to show the relations used:

    rose( x ) ^ owner( x, Jack )

    This is written in XML as:

    <rose owner="Jack" />

    Thus, my answer would be: use element types for unary
    predicates and attributes for binary relations.

    Unfortunately, in XML, this is not always possible, because
    in XML:

    - there might be at most one type per element,

    - there might be at most one attribute value per attribute
    name, and

    - attribute values are not allowed to be structured in
    XML.

    Therefore, the designers of XML document types are forced to
    abuse element /types/ in order to describe the /relation/
    of an element to its parent element.

    This /is/ an abuse, because the designation "element type"
    obviously is supposed to give the /type of an element/,
    i.e., a property which is intrinsic to the element alone
    and has nothing to do with its relation to other elements.

    The document type designers, however, are being forced to
    commit this abuse, to reinvent poorly the missing structured
    attribute values using the means of XML. If a rose has two
    owners, the following element is not allowed in XML:

    <rose owner="Jack" owner="Jill" />

    One is made to use representations such as the following:

    <rose>
    <owner>Jack</owner>
    <owner>Jill</owner></rose>

    Here the notion "element type" suggests that it is marked
    that Jack is "an owner", in the sense that "owner" is
    supposed to be the type (the kind) of Jack. Not an
    "owner of ..." (which would make sense), but just "an owner".

    The intention of the author, however, is that "owner" is
    supposed to give the /relation/ to the containing element
    "rose". This is the natural field of application for
    attributes, as the meaning of the word "attribute" outside
    of XML clearly indicates, but it is not possible to
    always use attributes for this purpose in XML.
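
    How a parser treats the two rose notations above can be checked
    directly. A minimal sketch, assuming Python's standard
    xml.etree.ElementTree module; the helper name `parses` is made up
    for this example:

```python
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Return True if the text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Two attributes with the same name on one element are not well-formed.
duplicate_attribute_ok = parses('<rose owner="Jack" owner="Jill" />')

# Repeated child elements are accepted and keep their document order.
rose = ET.fromstring("<rose><owner>Jack</owner><owner>Jill</owner></rose>")
owners = [owner.text for owner in rose.findall("owner")]
```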

    An alternative solution might be the following notation.

    <rose owner="Alexander Marie" />

    Here a /new/ mini language (not XML anymore) is used within
    an attribute value, which, of course, cannot be checked
    by XML validators anymore. This is actually done in practice,
    for example in XHTML, where class names are written this way.
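
    In practice this means the parser hands over one opaque string and
    the application has to split it itself. A minimal sketch, assuming
    Python's standard xml.etree.ElementTree module:

```python
import xml.etree.ElementTree as ET

rose = ET.fromstring('<rose owner="Alexander Marie" />')

# The parser reports a single string; it knows nothing about a list inside.
raw_owner = rose.get("owner")

# Splitting on whitespace is left to the application, as with class names.
owners = raw_owner.split()
```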

    So in its most prominent XML application XHTML, the W3C
    has to abandon XML even to write class attributes. This
    is not such a good accomplishment given that the W3C
    was able to use the experience made with SGML and HTML
    when designing XML.

    The needless restrictions of XML inhibit the meaningful
    use of syntax. They leave many document type designers
    wondering when to use attributes and when to use elements,
    which is really evidence of a weakness in the design of
    XML: XML has hardly any notations beyond these two,
    attributes and elements, and yet the W3C failed to give
    even these two a clear and meaningful purpose!

    Without the restrictions described, XML alone would have
    nearly the expressive power of RDF/XML, which has to repair
    painfully some of the errors made in the XML-design.

    Now, some "experts" recommend /always/ using subelements,
    because one can never know whether an attribute value
    that seems to be unstructured today might need to become
    structured tomorrow. Other "experts" recommend using
    attributes only when one is quite confident that they
    will never need to be structured. This recommendation
    does not even try to make sense of attributes,
    but just explains how to circumvent the obstacles
    the W3C has built into XML.

    Others recommend using attributes for something they
    call "metadata". They ignore that this limits "metadata"
    to unstructured values.

    Others use an XML editor that happens to make the input of
    attributes more comfortable than the input of elements and
    seriously suggest, therefore, to use as many attributes as
    possible.

    Still others have studied how to use CSS to format XML
    documents and are using this to give recommendations about
    when to use attributes and when to use subelements. (So
    that the resulting document can be formatted most easily
    with CSS.)

    Of course: Mixing all these criteria (structured vs.
    unstructured, data vs. "metadata", by CSS, by the ease of
    editing, ...) often will give conflicting recommendations.

    Certain notations other than XML have solved the problem
    by either omitting attributes altogether or by allowing
    structured attributes.
    Stefan Ram, Jul 13, 2008
    #4
  5. gaijinco

    Peter Flynn Guest

    Stefan Ram wrote:
    > gaijinco writes:
    >> <name first="Carlos" second="" />
    >> <name>
    >> <first>Carlos</first>
    >> <second></second>
    >> </name>

    >
    > When a new document type is to be defined, when should one
    > choose child elements and when attributes?
    >
    > The criterion that makes sense regarding the meaning can not
    > be used in XML due to syntactic restrictions.


    That is too broad. Often it can.

    > An element is describing something. A description is an
    > assertion. An assertion might contain unary predicates or
    > binary relations.
    >
    > Comparing this structure of assertions with the structure
    > of XML, it seems to be natural to represent unary predicates
    > with types and binary relations with attributes.
    >
    > Say, "x" is a rose and belongs to Jack. This assertion can
    > be written in a more formal way to show the relations used:
    >
    > rose( x ) ^ owner( x, Jack )
    >
    > This is written in XML as:
    >
    > <rose owner="Jack" />


    This is not true. It demonstrates very well a misunderstanding of text
    markup that is unfortunately far too prevalent. Naming element types
    after concrete objects is rare and almost always wrong. Possibly a DTD
    for a horticulturalist might do this, but in normal text applications
    you would write something like

    <plant type="rose" owner="Jack">x</plant>

    That is, "x" is an instance of a type of plant called a rose and this
    one belongs to Jack.

    > Thus, my answer would be: use element types for unary
    > predicates and attributes for binary relations.
    >
    > Unfortunately, in XML, this is not always possible, because
    > in XML:
    >
    > - there might be at most one type per element,
    >
    > - there might be at most one attribute value per attribute
    > name, and
    >
    > - attribute values are not allowed to be structured in
    > XML.
    >
    > Therefore, the designers of XML document types are forced to
    > abuse element /types/ in order to describe the /relation/
    > of an element to its parent element.
    >
    > This /is/ an abuse, because the designation "element type"
    > obviously is supposed to give the /type of an element/,
    > i.e., a property which is intrinsic to the element alone
    > and has nothing to do with its relation to other elements.


    Nearly. But you are trying to force XML into a very narrow,
    computer-science style mould of logic, which it was never intended for.

    > The document type designers, however, are being forced to
    > commit this abuse, to reinvent poorly the missing structured
    > attribute values using the means of XML. If a rose has two
    > owners, the following element is not allowed in XML:
    >
    > <rose owner="Jack" owner="Jill" />


    Again, not true. <rose owner="Jack Jill Stefan"/> is the normal solution
    to multiple parallel values, where owner is declared as IDREFS or ENTITIES.

    > One is made to use representations such as the following:
    >
    > <rose>
    > <owner>Jack</owner>
    > <owner>Jill</owner></rose>


    This would be suboptimal for this case, where the owners are presumed to
    be uniquely occurring individuals. But it would be possible.

    > Here the notion "element type" suggests that it is marked
    > that Jack is "an owner", in the sense that "owner" is
    > supposed to be the type (the kind) of Jack. Not an
    > "owner of ..." (which would make sense), but just "an owner".


    The normal solution would be something like

    ....
    <owners>
    <owner id="Jack">Jack the Lad</owner>
    <owner id="Jill">Jill the Lass</owner>
    ...
    </owners>
    ....
    <plant type="rose" owners="Jack Jill">x</plant>

    (with id as ID and owners as IDREFS). Certainly you could choose to
    expand the declaration of <owner> to allow subelements to provide finer
    detail (see many of the TEI declarations for examples).
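
    To show how an application might follow such references, here is a
    minimal sketch assuming Python's standard xml.etree.ElementTree
    module. The <garden> wrapper element is hypothetical, added only to
    give the fragments a single root, and ElementTree does not validate
    against a DTD, so the ID/IDREFS declarations are taken on trust:

```python
import xml.etree.ElementTree as ET

# <garden> is a hypothetical root element wrapping the two fragments.
DOC = """
<garden>
  <owners>
    <owner id="Jack">Jack the Lad</owner>
    <owner id="Jill">Jill the Lass</owner>
  </owners>
  <plant type="rose" owners="Jack Jill">x</plant>
</garden>
"""

root = ET.fromstring(DOC)

# Index every <owner> by its id attribute (an ID in the DTD).
owners_by_id = {owner.get("id"): owner for owner in root.iter("owner")}

plant = root.find("plant")

# An IDREFS value is a whitespace-separated list of IDs; resolve each one.
owner_names = [owners_by_id[ref].text for ref in plant.get("owners").split()]
```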

    > The intention of the author, however, is that "owner" is
    > supposed to give the /relation/ to the containing element
    > "rose". This is the natural field of application for
    > attributes, as the meaning of the word "attribute" outside
    > of XML clearly indicates, but it is not possible to
    > always use attributes for this purpose in XML.
    >
    > An alternative solution might be the following notation.
    >
    > <rose owner="Alexander Marie" />
    >
    > Here a /new/ mini language (not XML anymore) is used within
    > an attribute value, which, of course, can not be checked
    > anymore by XML validators. This is really done so, for
    > example, in XHTML, where classes are written this way.


    I suggest you re-read the XML Spec for IDREFS and ENTITIES.

    > So in its most prominent XML application XHTML, the W3C
    > has to abandon XML even to write class attributes. This
    > is not such a good accomplishment given that the W3C
    > was able to use the experience made with SGML and HTML
    > when designing XML.


    That was done for exogenous political reasons, as I understand it, not
    for technical ones.

    > The needless restrictions of XML inhibit the meaningful
    > use of syntax. This makes many document type designers
    > wonder, when attributes and when elements
    > should be used, which actually is an evidence of
    > incapacity for the design of XML: XML does not have many
    > more notations than these two: attributes and elements.
    > And now the W3C failed to give even these two
    > notations a clear and meaningful dedication!


    No-one is pretending that XML is perfect, but you must understand that
    it was designed for text documents, not for database engineering.

    > Without the restrictions described, XML alone would have
    > nearly the expressive power of RDF/XML, which has to repair
    > painfully some of the errors made in the XML-design.
    >
    > Now, some "experts" recommend to /always/ use subelements,
    > because one can never know, whether an attribute value
    > that seems to be unstructured today might need to become
    > structured tomorrow. Other "experts" recommend to use
    > attributes only when one is quite confident that they
    > never will need to be structured. This recommendation
    > does not even try to make a sense out of attributes,
    > but just explains how to circumvent the obstacles
    > the W3C has built into XML.


    Please re-read the FAQ warning on this subject.

    [snip]

    ///Peter
    Peter Flynn, Jul 13, 2008
    #5
  6. gaijinco

    Stefan Ram Guest

    Peter Flynn writes:
    >Again, not true. <rose owner="Jack Jill Stefan"/> is the normal solution
    >to multiple parallel values, where owner is declared as IDREFS or ENTITIES.


    Thank you, I was not aware of IDREFS or ENTITIES yet.
    So, there is limited support for parallel values in XML.

    One might say for »parallel /references/« (to ids or entities).
    It seems as if it can not be used when the values are literals
    (not references) such as numerals (numbers), for example.
    Stefan Ram, Jul 14, 2008
    #6
  7. gaijinco

    Peter Flynn Guest

    Stefan Ram wrote:
    > Peter Flynn writes:
    >> Again, not true. <rose owner="Jack Jill Stefan"/> is the normal solution
    >> to multiple parallel values, where owner is declared as IDREFS or ENTITIES.

    >
    > Thank you, I was not aware of IDREFS or ENTITIES yet.
    > So, there is limited support for parallel values in XML.
    >
    > One might say for »parallel /references/« (to ids or entities).
    > It seems as if it can not be used when the values are literals
    > (not references) such as numerals (numbers), for example.


    That's correct. XML is based on SGML DTDs, and was aimed at the document
    publishing field. There are many things that users of rectangular data
    would like to see allowed, but for that you need another syntax.

    ///Peter
    Peter Flynn, Jul 26, 2008
    #7
  8. Stefan Ram wrote:
    > Peter Flynn writes:
    >> Again, not true. <rose owner="Jack Jill Stefan"/> is the normal solution
    >> to multiple parallel values, where owner is declared as IDREFS or ENTITIES.

    >
    > Thank you, I was not aware of IDREFS or ENTITIES yet.
    > So, there is limited support for parallel values in XML.
    >
    > One might say for »parallel /references/« (to ids or entities).
    > It seems as if it can not be used when the values are literals
    > (not references) such as numerals (numbers), for example.
    >


    That, though, is a restriction of DTD rather than of XML itself. Other XML
    validation languages, such as XSD, RELAX NG, or Schematron, can all
    easily be used to constrain an attribute to be (for example) a
    whitespace-separated list of integer values.
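
    The schema languages themselves are not shown here. As a rough
    stand-in for the kind of constraint they would express, the same
    rule can be sketched as an application-level check in Python. The
    <row> element, the `sizes` attribute, and the names `INT_LIST` and
    `int_list_attribute` are all hypothetical:

```python
import re
import xml.etree.ElementTree as ET

# One or more optionally signed integers separated by whitespace.
INT_LIST = re.compile(r"[+-]?\d+(\s+[+-]?\d+)*")


def int_list_attribute(element, name):
    """Return the attribute as a list of ints, or raise ValueError."""
    value = element.get(name, "").strip()
    if not INT_LIST.fullmatch(value):
        raise ValueError(f"{name!r} is not a list of integers: {value!r}")
    return [int(token) for token in value.split()]


good = ET.fromstring('<row sizes="1 20 -3" />')
bad = ET.fromstring('<row sizes="1 two 3" />')

sizes = int_list_attribute(good, "sizes")

try:
    int_list_attribute(bad, "sizes")
    bad_rejected = False
except ValueError:
    bad_rejected = True
```

    A schema moves this check out of application code and into the
    validator, but the constraint being expressed is the same.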

    David

    --
    http://dpcarlisle.blogspot.com
    David Carlisle, Jul 26, 2008
    #8
  9. gaijinco

    Stefan Ram Guest

    David Carlisle writes:
    >That, though, is a restriction of DTD rather than of XML itself. Other XML
    >validation languages, such as XSD, RELAX NG, or Schematron, can all
    >easily be used to constrain an attribute to be (for example) a
    >whitespace-separated list of integer values.


    You can also create a validation language that can be used to
    constrain an attribute to be a valid Java program.

    However, the structure of such an attribute is not being
    described by the XML language anymore. The XML TR does not
    describe the Java syntax. So it is not provided by the XML TR.

    The XML TR describes a document made of elements and possibly
    attributes. It provides rules and names for these structural parts.

    XML provides rules and names for a list of IDREFs within an
    attribute, so this still »is« XML.

    But the XML TR does not provide rules and syntactical names
    (nonterminal symbols) for a list of integer numerals (integer
    literals) within an attribute.

    This is another language. It might be called »Relax-XML« or so.

    Such a valid Relax-XML document can also be a valid XML document.
    To that extent it »is« XML. But XML does not describe a special
    syntax for integer numerals within an attribute value. To XML,
    this is just an opaque attribute value. Interpreting this as a
    list of integers is not backed by the XML TR anymore; this needs
    the additional Relax specification.
    Stefan Ram, Jul 27, 2008
    #9
  10. Stefan Ram wrote:
    > David Carlisle writes:
    >> That, though, is a restriction of DTD rather than of XML itself. Other XML
    >> validation languages, such as XSD, RELAX NG, or Schematron, can all
    >> easily be used to constrain an attribute to be (for example) a
    >> whitespace-separated list of integer values.

    >
    > You can also create a validation language that can be used to
    > constrain an attribute to be a valid Java program.
    >
    > However, the structure of such an attribute is not being
    > described by the XML language anymore. The XML TR does not
    > describe the Java syntax. So it is not provided by the XML TR.
    >
    > The XML TR describes a document made of elements and possibly
    > attributes. It provides rules and names for these structural parts.
    >
    > XML provides rules and names for a list of IDREFs within an
    > attribute, so this still »is« XML.
    >
    > But the XML TR does not provide rules and syntactical names
    > (nonterminal symbols) for a list of integer numerals (integer
    > literals) within an attribute.
    >
    > This is another language. It might be called »Relax-XML« or so.
    >
    > Such a valid Relax-XML document can also be a valid XML document.
    > To that extent it »is« XML. But XML does not describe a special
    > syntax for integer numerals within an attribute value. To XML,
    > this is just an opaque attribute value. Interpreting this as a
    > list of integers is not backed by the XML TR anymore; this needs
    > the additional Relax specification.
    >


    There are lots of things the XML spec doesn't specify, but to say that XML
    encoding lists of numbers (an SVG path attribute, for example) isn't XML
    is a rather strange conclusion to draw. For a start, a lot (quite
    possibly a majority) of XML is "just" well-formed and not validated at
    all, so the relative expressive strengths of validation languages are
    irrelevant. For documents that are to be validated, the internal
    organisation and timing of the various working groups, which mean that
    XML is split across a range of specifications (XML itself, XML Names,
    SAX, DOM, XSD, etc.), are essentially irrelevant to the end user. If you
    just go by what's in the XML Rec without relying on anything else, you
    cannot even use any standard parsing model, so use of XML would be
    rather hard; or you may decide it's legal to use names like <a:b:c/> but
    then find that the vast majority of current XML tools follow the
    additional constraints in the Namespaces spec and would reject such an
    element.
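
    The <a:b:c/> case can be tried against a parser that offers both
    modes. A minimal sketch, assuming Python's standard
    xml.parsers.expat module, which turns on namespace processing when
    a namespace separator is passed to ParserCreate; the helper name
    `well_formed` is invented for this example:

```python
import xml.parsers.expat


def well_formed(xml_text, namespace_aware):
    """Parse with expat, optionally with namespace processing enabled."""
    if namespace_aware:
        # Passing a separator switches expat into namespace-aware mode.
        parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
    else:
        parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(xml_text, True)
        return True
    except xml.parsers.expat.ExpatError:
        return False


# Going by the XML Rec alone, a colon is just another name character.
plain_result = well_formed("<a:b:c/>", namespace_aware=False)

# With namespace processing, "a" is read as a prefix that was never declared.
namespace_result = well_formed("<a:b:c/>", namespace_aware=True)
```

    The plain parse is expected to succeed and the namespace-aware one
    to fail, which is the gap between the two specifications described
    above.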

    Actually, even by your definition XML can do more than you imply: IDREFS
    isn't the only list type. NMTOKENS, for example, would allow you to
    specify that an attribute is a whitespace-separated list of something,
    even if you can't, in a DTD, restrict the tokens further to be just
    digits. But to say that XML with lists of integers in an attribute isn't
    really XML because a DTD can't validate it is just like saying that an
    XML document containing English text isn't really XML because a DTD
    can't enforce spell checking. By that definition, what can XML be used
    for?

    David

    --
    http://dpcarlisle.blogspot.com
    David Carlisle, Jul 30, 2008
    #10