Re: Misuse of XML namespaces; call for help in marshalling arguments

Discussion in 'XML' started by Peter Flynn, Aug 6, 2004.

  1. Peter Flynn

    Peter Flynn Guest

    Simon North wrote:

    > I feel a bit like a traitor for posting an XML question here, but no-one
    > in comp.text.xml seems to be bothered by my concerns, and this group is
    > a more likely place for markup purists ... so, with excuses, the following


    I haven't seen your post in c.t.x. But c.t.s is probably inhabited by no
    greater percentage of purists than c.t.x -- just that they are probably
    more knowledgeable and experienced by virtue of having done this stuff for
    longer.

    WARNING: crossposted to c.t.x. Followups set to c.t.x

    > I am documenting C++ classes. We have created an authoring environment and
    > the developers write the text themselves.
    >
    > I edit and output the XML instances. I've created my own code (DTD,
    > schema, XSLT stylesheet and CSS stylesheet) and then I pump it into
    > RoboHelp. Works pretty well. So far, so good.
    >
    > However, IMNSHO, the XML instances are awful. The developer responsible
    > has used namespace prefixes as if they were a cute part of the element
    > name syntax. An abbreviation of the parent element name becomes the
    > namespace prefix for its children. The instances therefore look something
    > like this:
    >
    > <IT:IT xmlns:IAE="IAE" xmlns:IAEA="IAEA" xmlns:IAER="IAER" xmlns:IME="IME"
    > xmlns:IMEPV="IMEPV" xmlns:IT="IT">
    > <IT:N> ... </IT:N>
    > ...
    > <IT:IME>
    > <IME:IME>
    > <IME:Name>...</IME:Name>
    > ...
    > </IME:IME>
    > ...
    > </IT:IME>
    > </IT:IT>
    >
    > This is just a fraction; the nesting goes pretty deep.
    >
    > Apart from finding this inherently ugly, my gut feeling tells me that this
    > is an example of something you should NOT do with namespaces. Before I go
    > back and complain, I'm appealing for help. Am I just being pedantic? ...
    > but if I am right, what are the convincing arguments why this practice is
    > wrong?


    No, you are perfectly correct, and this is a fine example of a ludicrous
    abuse of namespaces. It's unfortunately rather common among people who have
    come to XML by way of e-commerce and schemas, rather than by way of SGML and
    documents. Many e-commerce (data) users have good reasons for doing this
    kind of thing where a document is composed of discrete data structures each
    with their own provenance. In the case of text documentation this is almost
    certainly not the case.

    I prefix these comments with an acknowledgment that I obviously don't
    know the precise circumstances of the genesis of this markup, so what
    I say may be entirely wrong for a given value of "markup", and may
    need to be taken with the proverbial grain of salt. IANAL.

    1. XML systems already provide for inheritance by way of descent, so
    imposing it via namespaces is singularly pointless.

    2. Namespaces provide a convenient way of attributing ownership of a
    data structure: in this case they appear to do nothing of the sort
    (although those acronyms may perhaps have a hidden meaning).

    3. The value of a namespace attribute must be a URI. An acronym may well
    parse as a relative URI reference, but relative references are
    deprecated in namespace declarations, and a bare acronym is not
    sufficiently meaningful for practical use.

    4. Short element type names were fine in the days of SGML when files were
    punched onto cards and keystroke minimisation was important. The XML
    Spec, however, makes it clear that terseness is of minimal importance.
    I read this to mean that descriptive names have a semantic value which
    outstrips their length.

    5. If you are documenting computer procedures, then use DocBook. It was
    written specifically to do exactly that. It's not perfect, but it's
    very good, reliable, stable, well-supported, and easy to use. IMHO
    anyone who uses anything else is either smarter than Norm Walsh et al
    or seriously needs their head examining.
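
A minimal sketch of why the prefix is not "part of the element name", related to point 3 above. It uses Python's standard `xml.etree.ElementTree`; the two tiny instances are made up for illustration and only borrow the `IT` namespace name from the quoted sample. A namespace-aware parser reports each element as namespace name plus local name, so two documents that differ only in prefix come out identical.

```python
import xml.etree.ElementTree as ET

# Two hypothetical instances: same namespace name ("IT"), different prefixes.
doc_a = '<IT:IT xmlns:IT="IT"><IT:N>Calendar</IT:N></IT:IT>'
doc_b = '<x:IT xmlns:x="IT"><x:N>Calendar</x:N></x:IT>'

root_a = ET.fromstring(doc_a)
root_b = ET.fromstring(doc_b)

# ElementTree reports expanded names of the form {namespace-name}local-name,
# so the prefix chosen by the author does not appear at all.
tags_a = [el.tag for el in root_a.iter()]
tags_b = [el.tag for el in root_b.iter()]

print(tags_a)            # ['{IT}IT', '{IT}N']
print(tags_a == tags_b)  # True
```

Any meaning the developer attached to the prefix itself is therefore invisible to downstream tools; only the namespace name survives parsing.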

    Go tell your colleague to learn something about XML for text documents
    first, and then pick a suitable DTD or Schema afterwards. Reinventing
    the wheel -- and badly -- is not a generally recommended approach for the
    beginner.

    Convert your existing instances into DocBook and retrain your authors
    to use it -- there are copious amounts of software, stylesheets and
    support available. Then if your colleague insists on retaining the
    above format for some exogenous reason, write an XSLT transformation
    to output it.

    WARNING: there are many people out there who will disagree violently
    with what I have said, and may be able to adduce good arguments for
    retaining the namespaces (such as the need to segment the processing).
    YMMV. FAQ: http://www.ucc.ie/xml/

    ///Peter Flynn
    --
    "The cat in the box is both a wave and a particle"
    -- Terry Pratchett, introducing quantum physics in _The Authentic Cat_
    Peter Flynn, Aug 6, 2004
    #1

  2. Peter Flynn

    Simon North Guest

    Re: Misuse of XML namespaces; call for help in marshalling arguments

    >>> Peter Flynn 8/6/2004 9:03:24 PM >>>

    > I haven't seen your post in c.t.x.


    I had posted, but no-one had responded, and over the past couple of years
    I've seen fewer and fewer discussions of less directly code-oriented
    problems.

    Thank you for the points. To clarify a little, the company has its own
    proprietary application for documenting the classes. This tool behaves
    like a combination database, configuration management system and editing
    environment. The XML instances have no other function than documentation,
    so I think all your comments are valid and were much as I expected; it's
    just nice to have them confirmed by someone more knowledgeable than I.

    > 1. XML systems already provide for inheritance by way of descent, so
    > imposing it via namespaces is singularly pointless.


    I wondered about this one. Just for discussion, consider this:

    <elementA>
    <name></name>
    </elementA>

    or

    <elementA>
    <A:name></A:name>
    </elementA>

    Almost every elementN has a name element, but the namespace usage does
    make it easier to address the element uniquely ... but on the other hand,
    I can't write a generic name rule and have to create a separate rule for
    each N:name element. Seems a bit like swings and roundabouts to me.

    > 2. Namespaces provide a convenient way of attributing ownership of a
    > data structure: in this case they appear to do nothing of the sort
    > (although those acronyms may perhaps have a hidden meaning).


    The acronyms are simply an abbreviation of the parent element name.

    > 4. Short element type names were fine in the days of SGML when files were
    > punched onto cards and keystroke minimisation was important. The XML
    > Spec, however, makes it clear that terseness is of minimal importance.
    > I read this to mean that descriptive names have a semantic value which
    > outstrips their length.


    I abbreviated some of the element names myself to make the example a bit
    more compact. This is a more complete fragment:

    <ET:ExportType
    xmlns:EAE="EAE"
    xmlns:EAEE="EAEE"
    xmlns:EC="EC"
    xmlns:EF="EF"
    xmlns:EME="EME"
    xmlns:EMEE="EMEE"
    xmlns:EMEPV="EMEPV"
    xmlns:EP="EP"
    xmlns:EPA="EPA"
    xmlns:EPR="EPR"
    xmlns:ET="ET"
    xmlns:ETE="ETE"
    xmlns:ExportActionElementArgument="ExportActionElementArgument"
    Key="954_0_168798">
    <ET:Name>Calendar</ET:Name>
    <ET:Description></ET:Description>
    <ET:Hyperlink></ET:Hyperlink>
    <ET:Kind>Logic</ET:Kind>
    <ET:parentName>Object</ET:parentName>
    <ET:LastChangedVersion>3.2</ET:LastChangedVersion>
    <ET:LastChanged>0001-01-01T00:00:00.000</ET:LastChanged>
    <ET:LastDelivered>0001-01-01T00:00:00.000</ET:LastDelivered>
    <ET:LastDocumented>0001-01-01T00:00:00.000</ET:LastDocumented>
    <ET:LastImported>2004-05-05T18:45:23.000</ET:LastImported>
    <ET:ExportMoment>2004-05-12T17:24:37.000</ET:ExportMoment>
    <ET:ExportComponent>
    <EC:ExportComponent Key="954_0_168799">
    <EC:Hyperlink></EC:Hyperlink>
    <EC:Name>Calendar</EC:Name>
    </EC:ExportComponent>
    </ET:ExportComponent>
    <ET:ExportFunctionality>
    </ET:ExportFunctionality>
    <ET:ExportTypeExample>
    </ET:ExportTypeExample>
    <ET:ExportModelElement>
    <EME:ExportModelElement Key="954_0_168800">
    <EME:Kind>Attribute</EME:Kind>
    <EME:Description></EME:Description>
    <EME:IsReadOnly>true</EME:IsReadOnly>
    <EME:Name>CalendarDirty</EME:Name>
    <EME:TypeName>Boolean</EME:TypeName>
    <EME:ExportModelElementPossibleValue>
    </EME:ExportModelElementPossibleValue>
    <EME:ExportModelElementExample>
    </EME:ExportModelElementExample>
    </EME:ExportModelElement>
    <EME:ExportModelElement Key="954_0_168806">
    <EME:Kind>Attribute</EME:Kind>
    <EME:Description></EME:Description>
    <EME:IsReadOnly>false</EME:IsReadOnly>
    <EME:Name>UpdateInterval</EME:Name>
    <EME:TypeName>Duration</EME:TypeName>
    <EME:ExportModelElementPossibleValue>
    </EME:ExportModelElementPossibleValue>
    <EME:ExportModelElementExample>
    </EME:ExportModelElementExample>
    </EME:ExportModelElement>
    </ET:ExportModelElement>

    etc.

    > 5. If you are documenting computer procedures, then use DocBook. It was
    > written specifically to do exactly that. It's not perfect, but it's
    > very good, reliable, stable, well-supported, and easy to use.


    True, with reservations. In this instance I am only providing a quick way
    to provide navigable C++ code (the guys rejected the conventional
    'doxygen' embedded comment approach).

    > Convert your existing instances into DocBook and retrain your authors
    > to use it -- there are copious amounts of software, stylesheets and
    > support available. Then if your colleague insists on retaining the
    > above format for some exogenous reason, write an XSLT transformation
    > to output it.


    I'm not really interested in text documents. With all due credit to Norm,
    DocBook would be complete overkill for this application. The effort of
    tailoring DocBook to give me what I want and then writing an XSLT
    transform would be orders of magnitude greater than the 2 days it's taken
    me so far. I've tool-generated a schema that gives me enough validation
    to confirm that the input code is consistent and that's all I need.

    There are no static instances as such. As the classes change (often) I
    can export a new set of XML data, pump them through my XSLT transform and
    then use the HTML code in RoboHelp. The authors (me and the developers)
    simply see ASCII text in fields in a GUI. No human is ever going to write
    or edit the code. I'm the only one who really gets to see the XML code,
    and I'm the only one who really does anything with it. I'm the sole
    writer here too.
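
A rough sketch of the export-then-transform step described above, with Python standing in for the XSLT stylesheet (which the post does not show). The input is a trimmed copy of the fragment quoted earlier; it assumes both `EME:ExportModelElement` entries sit inside a single `ET:ExportModelElement` wrapper. The HTML layout and the `to_html` helper are invented for illustration and are not the actual RoboHelp input.

```python
import html
import xml.etree.ElementTree as ET

# Prefix -> namespace name, mirroring the declarations in the export.
NS = {"ET": "ET", "EME": "EME"}

# Trimmed, partly assumed version of the exported fragment.
export = """
<ET:ExportType xmlns:ET="ET" xmlns:EME="EME" Key="954_0_168798">
  <ET:Name>Calendar</ET:Name>
  <ET:ExportModelElement>
    <EME:ExportModelElement Key="954_0_168800">
      <EME:Name>CalendarDirty</EME:Name>
      <EME:TypeName>Boolean</EME:TypeName>
    </EME:ExportModelElement>
    <EME:ExportModelElement Key="954_0_168806">
      <EME:Name>UpdateInterval</EME:Name>
      <EME:TypeName>Duration</EME:TypeName>
    </EME:ExportModelElement>
  </ET:ExportModelElement>
</ET:ExportType>
"""


def to_html(xml_text):
    """Render one exported type as a heading plus a list of its members."""
    root = ET.fromstring(xml_text)
    title = html.escape(root.findtext("ET:Name", default="", namespaces=NS))
    items = []
    # Each member lives one wrapper level below the root.
    for member in root.findall("ET:ExportModelElement/EME:ExportModelElement", NS):
        name = html.escape(member.findtext("EME:Name", default="", namespaces=NS))
        type_name = html.escape(member.findtext("EME:TypeName", default="", namespaces=NS))
        key = html.escape(member.get("Key", ""))
        items.append(f'<li id="{key}">{name}: {type_name}</li>')
    return f"<h1>{title}</h1>\n<ul>\n" + "\n".join(items) + "\n</ul>"


page = to_html(export)
print(page)
```

Note that every path step has to carry the right prefix for its level, which is the bookkeeping cost of the per-parent namespaces under discussion.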

    Thanks for the fuel,

    Simon North



    Quintiq Application Software BV
    's Hertogenbosch, The Netherlands
    Simon North, Aug 9, 2004
    #2

  3. Peter Flynn

    Peter Flynn Guest

    Simon North wrote:

    >>>> Peter Flynn 8/6/2004 9:03:24 PM >>>


    > Thank you for the points. To clarify a little, the company has its own
    > proprietary application for documenting the classes. This tool behaves
    > like a combination database, configuration management system and editing
    > environment. The XML instances have no other function than documentation,


    That's probably as much as can be expected.

    >> 1. XML systems already provide for inheritance by way of descent, so
    >> imposing it via namespaces is singularly pointless.

    >
    > I wondered about this one. Just for discussion, consider this:
    >
    > <elementA>
    > <name></name>
    > </elementA>
    >
    > or
    >
    > <elementA>
    > <A:name></A:name>
    > </elementA>
    >
    > almost every elementN has a name element, but the namespace usage does
    > make it easier to address the element uniquely ...


    I'm not clear what you mean by this. As I see it, it makes it harder, eg
    in XSLT:

    <xsl:template match="elementA/name">

    as opposed to

    <xsl:template match="elementA/A:name">

    or Omnimark:

    element name when parent is elementA
    element A:name when parent is elementA

    I don't see the advantage of the two extra characters "A:" unless you
    expect to have B:name or C:name etc *also* permitted inside elementA,
    in which case I'd query the data model (an attribute might be more
    effective here).
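
To illustrate addressing by descent, here is a small sketch using Python's `xml.etree.ElementTree` rather than XSLT or Omnimark. The document and the `elementB` sibling are hypothetical; the point is only that the parent step already tells the two `name` elements apart without any prefix.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance: both parents contain an unprefixed <name> child.
doc = """
<root>
  <elementA><name>alpha</name></elementA>
  <elementB><name>beta</name></elementB>
</root>
"""
root = ET.fromstring(doc)

# The parent step alone disambiguates the two <name> elements.
names_in_a = [el.text for el in root.findall("./elementA/name")]
names_in_b = [el.text for el in root.findall("./elementB/name")]

# A generic rule is still possible: every <name>, whatever its parent.
all_names = [el.text for el in root.iter("name")]

print(names_in_a, names_in_b, all_names)
# ['alpha'] ['beta'] ['alpha', 'beta']
```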

    > but on the other hand, I can't write a generic name rule and have to
    > create a separate rule for each N:name element. Seems a bit like swings
    > and roundabouts to me.


    Most processing languages are awkward here. XPath does offer local-name()
    and namespace-uri(), but a template has to test them in a predicate, for
    example *[local-name()='name'], rather than with a plain name test. I
    suspect the designers may have forgotten the original reason why
    namespaces were attractive in the first place (faulty though it may have
    been).
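
As a sketch of treating the namespace and the local name separately, the following uses Python's `xml.etree.ElementTree`, whose path syntax accepts a `{*}` namespace wildcard from Python 3.8 onwards. The reduced document and the `split_tag` helper are assumptions made for illustration; only the `ET`, `EC` and `EME` names come from the quoted export.

```python
import xml.etree.ElementTree as ET

# Reduced, hypothetical version of the export: one Name per namespace.
doc = """
<ET:ExportType xmlns:ET="ET" xmlns:EC="EC" xmlns:EME="EME">
  <ET:Name>Calendar</ET:Name>
  <EC:Name>Calendar</EC:Name>
  <EME:Name>CalendarDirty</EME:Name>
</ET:ExportType>
"""
root = ET.fromstring(doc)


def split_tag(tag):
    """Split '{ns}local' into (ns, local); ns is '' when there is none."""
    if tag.startswith("{"):
        ns, _, local = tag[1:].partition("}")
        return ns, local
    return "", tag


# One generic rule: every element whose local name is Name, in any namespace.
generic = [(split_tag(el.tag), el.text) for el in root.findall(".//{*}Name")]

# One specific rule: only Name in the EME namespace.
specific = [el.text for el in root.findall(".//{EME}Name")]

print(generic)
print(specific)  # ['CalendarDirty']
```

So a single "name" rule remains possible even with per-parent namespaces, at the cost of wildcard or local-name matching in every rule that wants it.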

    >> 2. Namespaces provide a convenient way of attributing ownership of a
    >> data structure: in this case they appear to do nothing of the sort
    >> (although those acronyms may perhaps have a hidden meaning).

    >
    > The acronyms are simply an abbreviation of the parent element name.


    It keeps the file size small, but makes it harder to read.

    >> 4. Short element type names were fine in the days of SGML when files were
    >> punched onto cards and keystroke minimisation was important. The XML
    >> Spec, however, makes it clear that terseness is of minimal importance.
    >> I read this to mean that descriptive names have a semantic value which
    >> outstrips their length.

    >
    > I abbreviated some of the element names myself to make the example a bit
    > more compact. This is a more complete fragment:
    >

    [snip]

    That explains it :)

    >> 5. If you are documenting computer procedures, then use DocBook. It was
    >> written specifically to do exactly that. It's not perfect, but it's
    >> very good, reliable, stable, well-supported, and easy to use.

    >
    > True, with reservations. In this instance I am only providing a quick way
    > to provide navigable C++ code


    Ah. That's a completely different thing from wanting to write documentation.

    > (the guys rejected the conventional 'doxygen' embedded comment approach).


    Proper order too.

    > I'm not really interested in text documents.


    That changes a lot :)

    > With all due credit to Norm, DocBook would be complete overkill for this
    > application. The effort of tailoring DocBook to give me what I want


    I suspect little or no tailoring is needed.

    > and then writing an XSLT transform would be orders of magnitude greater
    > than the 2 days it's taken me so far.


    As I have no idea what you want to do, I can't comment on that.

    > I've tool-generated a schema that gives me enough validation to confirm
    > that the input code is consistent and that's all I need.


    In which case I'm unclear what your original post was asking.

    > There are no static instances as such. As the classes change (often) I
    > can export a new set of XML data, pump them through my XSLT transform
    > and then use the HTML code in RoboHelp. The authors (me and the
    > developers) simply see ASCII text in fields in a GUI. No human is ever
    > going to write or edit the code. I'm the only one who really gets to see
    > the XML code, and I'm the only one who really does anything with it.
    > I'm the sole writer here too.


    In which case I'm unclear what your original post was asking. I do the same
    in other circumstances for different systems.

    ///Peter
    --
    "The cat in the box is both a wave and a particle"
    -- Terry Pratchett, introducing quantum physics in _The Authentic Cat_
    Peter Flynn, Aug 9, 2004
    #3