New file format design

Discussion in 'XML' started by mathieu, Jun 15, 2006.

  1. mathieu

    mathieu Guest

    Hello there,

    I am looking for suggestions for designing a simple file format
    based on XML. It will only contain text information (no binary data).
    1. If I have a choice: element or attribute?
    2. Do I need to define my own file version (maybe as the first XML
    element)?
    3. Do I need to provide a DTD or XML schema?

    Thanks for any input,
    Mathieu
     
    mathieu, Jun 15, 2006
    #1

  2. mathieu wrote:
    > 1. If I have a choice: Element or Attribute ?


    This is a FAQ. What's the intent of the datum (modifier or content), and
    will it ever need internal structure in the future (in which case it has
    to be an element)?
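
    A minimal sketch of that rule of thumb, assuming Python's standard
    xml.etree.ElementTree module. The <measurement> record and all of its
    names are hypothetical, invented only for illustration: "unit" modifies
    the value, so it is an attribute, while "author" may need internal
    structure, so it is an element.

```python
import xml.etree.ElementTree as ET

# Hypothetical record (names are assumptions, not from the thread):
# 'unit' is a modifier of the content, so it is stored as an attribute;
# 'author' has internal structure, so it has to be an element.
doc = ET.fromstring(
    '<measurement unit="mm">'
    '<value>12.5</value>'
    '<author><first>Ada</first><last>Lovelace</last></author>'
    '</measurement>'
)

unit = doc.get("unit")                    # attribute lookup
value = float(doc.findtext("value"))      # simple element content
last_name = doc.findtext("author/last")   # structured element content
print(unit, value, last_name)
```

    Had "author" started out as an attribute, splitting it into first and
    last names later would have meant changing the format.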

    > 2. Do I need to define my own file version (maybe as the first XML
    > element) ?


    Up to you. Will you ever need to distinguish versions?


    > 3. Do I need to provide a DTD or XML schema ?


    Up to you. Do you want the parser to help confirm the data is reasonably
    structured and contains plausible values? Do you need to mark some data
    as having particular kinds of meanings (ID is the obvious one that has
    to be defined at this level)? Do you want to define named entities
    (supported only in DTDs, and *probably* best avoided these days although
    folks still debate that)?


    --
    () ASCII Ribbon Campaign | Joe Kesselman
    /\ Stamp out HTML e-mail! | System architexture and kinetic poetry
     
    Joe Kesselman, Jun 15, 2006
    #2

  3. Andy Dingley

    Andy Dingley Guest

    On Thu, 15 Jun 2006 16:25:23 -0400, Joe Kesselman wrote:

    >mathieu wrote:
    >> 1. If I have a choice: Element or Attribute ?

    >
    >This is a FAQ.


    Isn't this the only Q that's more FA'ed than
    "Why does SAX cut off my text"? :cool:
     
    Andy Dingley, Jun 15, 2006
    #3
  4. Andy Dingley wrote:
    >>> 1. If I have a choice: Element or Attribute ?

    > Isn't this the only Q that's more FA'ed than,
    > "Why does SAX cut off my text" ? :cool:


    I wouldn't like to try to guess which one wins. :p

     
    Joe Kesselman, Jun 15, 2006
    #4
  5. mathieu

    mathieu Guest

    Joe Kesselman wrote:
    > This is a FAQ. What's the intent of the datum (modifier or content), and
    > will it ever in the future want to be structured (in which case it has
    > to be an element).


    Thanks for the ref; I am sorry I did not search for it first.
    http://xml.silmaril.ie/developers/attributes/


    > > 2. Do I need to define my own file version (maybe as the first XML
    > > element) ?

    >
    > Up to you. Will you ever need to distinguish versions?


    Well, I disagree, simply because I don't know. I was under the impression
    that XML was designed exactly for this 'I don't know': adding attributes
    or elements later still leaves the file syntactically correct (by
    design). What I am unsure about is whether this mechanism is enough.

    > > 3. Do I need to provide a DTD or XML schema ?

    >
    > Up to you. Do you want the parser to help confirm the data is reasonably
    > structured and contains plausible values? Do you need to mark some data
    > as having particular kinds of meanings (ID is the obvious one that has
    > to be defined at this level)? Do you want to define named entities
    > (supported only in DTDs, and *probably* best avoided these days although
    > folks still debate that)?


    Not really, I know what I am reading. My understanding was that a DTD or
    XML schema would be much more explicit for a third party than a file
    specification written out in prose.

    Thanks !
    M
     
    mathieu, Jun 16, 2006
    #5
  6. mathieu wrote:
    > Joe Kesselman wrote:
    >>> 2. Do I need to define my own file version (maybe as the first XML
    >>> element) ?

    >> Up to you. Will you ever need to distinguish versions?

    > Well I disagree simply because I don't know.


    If you don't know, you can either treat the absence of the version mark
    as indicating version 0.0, or you can go ahead and design it in now.
    Either solution is defensible.

    In general: If in doubt, it's wise to design for a version mark, even if
    you make it optional.
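
    A minimal sketch of an optional version mark, assuming Python's standard
    xml.etree.ElementTree module. The root element name "myformat", the
    attribute name "version", and the helper file_version are hypothetical;
    a missing mark is read as version 0.0, as described above.

```python
import xml.etree.ElementTree as ET

DEFAULT_VERSION = "0.0"  # assumed meaning of a missing version mark


def file_version(xml_text: str) -> str:
    """Return the root element's 'version' attribute, or the default."""
    root = ET.fromstring(xml_text)
    return root.get("version", DEFAULT_VERSION)


# Hypothetical documents: one written before the mark existed, one after.
old_file = "<myformat><item>a</item></myformat>"
new_file = '<myformat version="1.1"><item>a</item></myformat>'

print(file_version(old_file))
print(file_version(new_file))
```

    Files written before the mark was introduced keep working, and a reader
    can branch on the returned string once a second version exists.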

    > My understanding was that DTD or
    > XML schema was much more explicit for a third party than if I were to
    > write down the file specification.


    Not entirely. The DTD/Schema may be useful for driving some tools. It
    may provide some specific kinds of information that aren't expressed
    directly in the instance document -- if your parser doesn't support
    xml:id, and you don't have a DTD or schema, tools may not be able to
    take advantage of some optimization potential. In fact, IBM has
    demonstrated that a schema-aware parser can actually be made faster than
    a non-validating parser, if you know which schema to expect and you do
    some compilation ahead of time. (I think a paper on that topic appears
    in the current issue of the IBM Systems Journal; I know the authors have
    presented papers on this at conferences.)

    If those issues don't concern you, you don't have to create a DTD or
    schema immediately -- but the longer you wait, the more likely folks
    will do things in their instance documents that you didn't expect. And
    formalizing your document design is a good exercise even if you don't
    enforce it.
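
    As a rough illustration of what formalizing the design buys you, here
    is a hand-written structural check in Python. It is not DTD or schema
    validation (the standard library has no validating parser), only a
    stand-in under stated assumptions: the content model, the element names,
    and the helper unexpected_elements are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model: which child elements each element may contain.
ALLOWED_CHILDREN = {
    "myformat": {"item"},
    "item": set(),
}


def unexpected_elements(xml_text):
    """Return 'parent/child' paths that the content model does not allow."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag not in ALLOWED_CHILDREN:
        problems.append(root.tag)
    # Walk every element and compare its children against the model.
    for parent in root.iter():
        allowed = ALLOWED_CHILDREN.get(parent.tag, set())
        for child in parent:
            if child.tag not in allowed:
                problems.append(f"{parent.tag}/{child.tag}")
    return problems


expected = "<myformat><item>a</item></myformat>"
surprising = "<myformat><item>a</item><note>hi</note></myformat>"
print(unexpected_elements(expected))
print(unexpected_elements(surprising))
```

    A real DTD or schema states the same constraints declaratively, so
    third-party tools can apply them without reading your code.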

     
    Joe Kesselman, Jun 16, 2006
    #6
  7. Andy Dingley

    Andy Dingley Guest

    Joe Kesselman wrote:

    > > mathieu wrote:


    > >>> 2. Do I need to define my own file version (maybe as the first XML
    > >>> element) ?


    > If you don't know, you can either treat the absence of the version mark
    > as indicating version 0.0, or you can go ahead and design it in now.



    King numbering.
    (Coinage is labelled 'George II' and 'George IV', but simply 'George'
    for the first one)
     
    Andy Dingley, Jun 16, 2006
    #7
  8. Andy Dingley wrote:
    > King numbering.
    > (Coinage is labelled 'George II' and 'George IV', but simply 'George'
    > for the first one)


    I like the term; thanks!


     
    Joe Kesselman, Jun 16, 2006
    #8