Relational data to XML - Are there any standards?

Discussion in 'XML' started by Pradeep, Sep 7, 2006.

  1. Pradeep

    Pradeep Guest

    Hello,

    I need to take a set of input tables and create an XML output file. The
    format of the XML output must be user-definable and must be intuitive
    enough for non-techies to use.

    input table(s) + SomeSchemaDefinition ==> XML file

    I have seen examples of XML file generation with fixed scope. For
    example, if input table (called customer) is as follows:

    Id Name
    101 Clark Kent
    102 Peter Parker
    103 Bruce Banner

    The output XML generated is:
    <root>
    <customer>
    <Id>101</Id>
    <Name>Clark Kent</Name>
    </customer>
    <customer>
    ....

    However, the user never had a chance to control the output format. In
    my case, the user must have the ability to define which columns are
    attributes and which columns are child elements.

    I am wondering if there is a standard that is already in place that I
    must look at. Any other pointer is appreciated as well. Especially, how
    are multiple level nestings handled?

    Thank you in advance for your help.

    Pradeep
     
    Pradeep, Sep 7, 2006
    #1

  2. hi,

    Pradeep wrote:
    > Hello,
    >
    > I need to take a set of input tables and create an XML output file. The
    > format of the XML output must be user-definable and must be intuitive
    > enough for non-techies to use.
    >
    > input table(s) + SomeSchemaDefinition ==> XML file
    >
    > I have seen examples of XML file generation with fixed scope. For
    > example, if input table (called customer) is as follows:
    >
    > Id Name
    > 101 Clark Kent
    > 102 Peter Parker
    > 103 Bruce Banner
    >
    > The output XML generated is:
    > <root>
    > <customer>
    > <Id>101</Id>
    > <Name>Clark Kent</Name>
    > </customer>
    > <customer>
    > ....
    >
    > However, the user never had a chance to control the output format. In
    > my case, the user must have the ability to define which columns are
    > attributes and which columns are child elements.
    >
    > I am wondering if there is a standard that is already in place that I
    > must look at.


    SQL-to-XML mapping is not standardized. Many RDBMS vendors provide
    proprietary mechanisms to map SQL to XML, which can be enough for simple
    XML targets but cannot deal with more complex XML structures.

    > Any other pointer is appreciated as well. Especially, how
    > are multiple level nestings handled?


    have a look at RefleX

    an example here :
    http://reflex.gforge.inria.fr/tutorial.html#N801062

    This tool can map any SQL select statement to the XML structure you
    would expect. You can:
    - choose which items become elements, attributes, or text,
    - rename attributes and elements if the column names don't suit,
    - insert container elements here and there,
    - etc.
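
    To make the idea of a user-defined mapping concrete, here is a minimal
    Python sketch. It is not RefleX syntax or its API: the MAPPING layout and
    the rows_to_xml helper are hypothetical names invented for illustration,
    and the rows are the customer table from the original question.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping spec (an assumption, not any tool's real format).
MAPPING = {
    "root": "customers",           # container element around all rows
    "row": "customer",             # one element per input row
    "attributes": {"Id": "id"},    # column name -> attribute name
    "elements": {"Name": "name"},  # column name -> child element name
}

def rows_to_xml(rows, mapping):
    """Build an XML tree from dict rows according to the mapping."""
    root = ET.Element(mapping["root"])
    for row in rows:
        elem = ET.SubElement(root, mapping["row"])
        for column, attr_name in mapping["attributes"].items():
            elem.set(attr_name, str(row[column]))
        for column, child_name in mapping["elements"].items():
            ET.SubElement(elem, child_name).text = str(row[column])
    return root

rows = [
    {"Id": 101, "Name": "Clark Kent"},
    {"Id": 102, "Name": "Peter Parker"},
    {"Id": 103, "Name": "Bruce Banner"},
]
xml_text = ET.tostring(rows_to_xml(rows, MAPPING), encoding="unicode")
print(xml_text)
```

    Moving a column between "attributes" and "elements" changes the output
    shape without touching the code, which is the kind of control a
    non-technical user would need.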

    it can output DOM, SAX or write to a file

    you can launch it either from the command line or within a web
    application, or embed it in your application

    enjoy !

    >
    > Thank you in advance for your help.
    >
    > Pradeep
    >


    --
    Cordialement,

    ///
    (. .)
    --------ooO--(_)--Ooo--------
    | Philippe Poulard |
    -----------------------------
    http://reflex.gforge.inria.fr/
    Have the RefleX !
     
    Philippe Poulard, Sep 7, 2006
    #2

  3. The most standards-oriented solution I can think of offhand would be to
    export/extract the relational data to XML in a fairly straightforward
    manner, and then run that through XSLT to get your user-defined
    formatting layer.
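
    The first half of that pipeline, the straightforward export, might look
    like the Python sketch below. It assumes an in-memory SQLite database
    holding the customer table from the question; export_table and the
    generic <row> element name are hypothetical choices. The XSLT step is
    not shown, since it would run in whatever XSLT processor is available.

```python
import sqlite3
import xml.etree.ElementTree as ET

def export_table(conn, table):
    """Dump one table as <table><row><col>value</col>...</row></table>."""
    # The table name is interpolated directly: acceptable only for a
    # sketch where the name is trusted.
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [desc[0] for desc in cursor.description]
    root = ET.Element(table)
    for record in cursor:
        row = ET.SubElement(root, "row")
        for column, value in zip(columns, record):
            ET.SubElement(row, column).text = "" if value is None else str(value)
    return root

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (Id INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(101, "Clark Kent"), (102, "Peter Parker"), (103, "Bruce Banner")],
)
export_xml = ET.tostring(export_table(conn, "customer"), encoding="unicode")
print(export_xml)
```

    Because the export is deliberately dumb, all user-specific formatting
    decisions stay in the stylesheet rather than in the export code.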

    --
    Joe Kesselman / Beware the fury of a patient man. -- John Dryden
     
    Joseph Kesselman, Sep 7, 2006
    #3
  4. Stefan Ram

    Stefan Ram Guest

    Joseph Kesselman <> writes:
    >The most standards-oriented solution I can think of offhand would be to
    >export/extract the relational data to XML in a fairly straightforward
    >manner, and then run that through XSLT to get your user-defined
    >formatting layer.


    I would consider using a document per table, an ID-attribute
    for the primary key and IDREF-attributes for foreign keys.

    I believe the common opinion that XML favors hierarchical data
    to be wrong. It might also be used for relational data quite
    easily, as shown above, and since relations can model any
    other kind of structure so can XML.
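
    A rough Python sketch of that layout follows. The orders table, the
    attribute names and the "c"/"o" prefixes are assumptions made for the
    example; the prefixes are there because an XML ID value must be a name
    and so cannot start with a digit. ElementTree does not process DTD
    ID/IDREF types, so the references are resolved by hand here.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: (id, name) and (id, customer id, item).
customers = [(101, "Clark Kent"), (102, "Peter Parker")]
orders = [(1, 101, "cape"), (2, 102, "web fluid"), (3, 101, "glasses")]

def customer_doc(rows):
    """One document for the customer table; primary key becomes 'id'."""
    root = ET.Element("customers")
    for cid, name in rows:
        ET.SubElement(root, "customer", {"id": f"c{cid}", "name": name})
    return root

def order_doc(rows):
    """One document for the order table; foreign key becomes 'customer'."""
    root = ET.Element("orders")
    for oid, cid, item in rows:
        ET.SubElement(
            root, "order", {"id": f"o{oid}", "customer": f"c{cid}", "item": item}
        )
    return root

cust_xml = customer_doc(customers)
ord_xml = order_doc(orders)

# Manual lookup table standing in for ID/IDREF resolution.
by_id = {c.get("id"): c for c in cust_xml}
resolved = [(o.get("item"), by_id[o.get("customer")].get("name")) for o in ord_xml]
print(resolved)
```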
     
    Stefan Ram, Sep 7, 2006
    #4
  5. Andy Dingley

    Andy Dingley Guest

    Stefan Ram wrote:

    > I would consider using a document per table, an ID-attribute
    > for the primary key and IDREF-attributes for foreign keys.


    Why would you want to model the tables so exactly in the XML document ?
    Aspects like keys are quite specific to a specific implementation
    through an RDBMS and it's not necessarily important to preserve them.

    Assuming that we're talking about some "application" purpose here,
    rather than simply replicating an entire database, then the likelihood
    is that we care less about "tables" and more about an
    application-centred denormalised view across multiple tables. This view
    doesn't require foreign keys (it's now a single view) and XML's
    hierarchical nature can represent it easily.


    > I believe the common opinion that XML favors hierarchical data
    > to be wrong.


    ID & IDREF suck. Therefore XML _favours_ hierarchical data. It's not
    hierarchical to the exclusion of all else, but it's a strong
    favouritism.


    > It might also be used for relational data quite
    > easily, as shown above, and since relations can model any
    > other kind of structure so can XML.


    Tapeworms are Turing complete, therefore I can compute anything with
    them.
    But it's hardly practical, is it ?
     
    Andy Dingley, Sep 7, 2006
    #5
  6. Stefan Ram

    Stefan Ram Guest

    "Andy Dingley" <> writes:
    >Why would you want to model the tables so exactly in the XML document ?
    > Aspects like keys are quite specific to a specific implementation
    >through an RDBMS and it's not necessarily important to preserve them.


    Keys are inherent to the relational model of the /data/ -
    they are not an implementation detail of a specific RDBMS.

    >Assuming that we're talking about some "application" purpose
    >here, rather than simply replicating an entire database, then
    >the likelihood is that we care less about "tables" and more
    >about an application-centred denormalised view across multiple
    >tables. This view doesn't require foreign keys (it's now a
    >single view) and XML's hierarchical nature can represent it
    >easily.


    In specific applications, there might well be reasons to
    diverge from my suggestion. I am just not aware of them right now.

    >ID & IDREF suck. Therefore XML _favours_ hierarchical data.
    >It's not hierarchical to the exclusion of all else, but it's a
    >strong favouritism.


    I only know about them from the XML-application "XHTML 1.1",
    where the id-attribute has ID-type and "for" and "usemap" have
    IDREF-type, and was not aware of problems with this approach.

    But then, I really have no other experience with ID and IDREF,
    so you might know more about this than I do.

    >But it's hardly practical, is it ?


    It's not so hard storing relational data in XML with one
    element per set and one element per tuple, in fact, this
    seems quite natural to me.
     
    Stefan Ram, Sep 7, 2006
    #6
  7. Andy Dingley wrote:
    > ID & IDREF suck. Therefore XML _favours_ hierarchical data.


    There's a lot more than ID/IDREF available once you move to schemas. Not
    to mention the option of simply using XPaths in your own document syntax.

    XML's native syntax is certainly tree-structured. But what you read that
    into for manipulation is up to the application. DOM and SAX and such are
    conveniences/tools, *NOT* universal solutions for all tasks.

    (Of course IBM's now added XML support to DB2, recognizing that
    sometimes a dataset is best manipulated hierarchically as an XML
    infoset. Tools for tasks.)

    --
    Joe Kesselman / Beware the fury of a patient man. -- John Dryden
     
    Joseph Kesselman, Sep 7, 2006
    #7
  8. Peter Flynn

    Peter Flynn Guest

    Joseph Kesselman wrote:
    > Andy Dingley wrote:
    >> ID & IDREF suck. Therefore XML _favours_ hierarchical data.


    Don't forget that XML was designed for normal text documents,
    where ID/IDREF is a useful and robust mechanism. It is not
    related in any way to whether or not XML favours hierarchical
    data.

    If you use XML for rectangular data, you have to understand that
    you are pushing the limits of what XML was designed for.

    > There's a lot more than ID/IDREF available once you move to schemas.


    Possibly. But that's a penalty you have to pay.

    ///Peter
     
    Peter Flynn, Sep 7, 2006
    #8
  9. Andy Dingley

    Andy Dingley Guest

    Stefan Ram wrote:

    > "Andy Dingley" <> writes:


    > > Aspects like keys are quite specific to a specific implementation
    > >through an RDBMS and it's not necessarily important to preserve them.

    >
    > Keys are inherent to the relational model of the /data/ -
    > they are not an implementation detail of a specific RDBMS.


    The concept of "keys" is relevant to any current "relational"
    implementation of the data.

    However the _specific_ use of keys is specific to the implementation.
    There's a question of how far you normalise your data when designing a
    relational model for it. You don't have to normalise to the same form
    each time, and you don't have to use identical key structures.

    If we see this "XML output" of the database model as being application
    centric, then we don't care about such design choices. No matter how
    normalised the data was when it was stored internally, we want the same
    denormalised view for the output. As different implementations may have
    used a different data model (an Access implementation was probably
    de-normalised compared to a SQL Server implementation), this difference
    is now irrelevant, inappropriate and possibly misleading. Our XML
    representation shouldn't preserve these keys.

    > I only know about them from the XML-application "XHTML 1.1",
    > where the id-attribute has ID-type and "for" and "usemap" have
    > IDREF-type, and was not aware of problems with this approach.


    Read the RDF documentation. Much of RDF's work was in overcoming the
    shortcomings of XML, in providing a usable data model for ID &
    IDREF-like concepts.

    XML has two major shortcomings here:

    * To use IDREF, you must first have an ID. What happens if you want to
    refer to a node that's identifiable, but not explicitly labelled ? It's
    a valid requirement.

    * ID & IDREF only work within a single document. To make small
    applications that can inter-work in a large universe, we need tools
    that can refer outside their immediate frame of reference. XPointer is
    an attempt here, but there's still a lot lacking with XML in this
    context.


    > >But it's hardly practical, is it ?

    >
    > It's not so hard storing relational data in XML with one
    > element per set and one element per tuple, in fact, this
    > seems quite natural to me.


    What's a "tuple" here ? A tuple as held in a table, or a tuple as a
    row in a relational view ? I have no real interest in
    tuples-from-tables, they're too low level and only really useful for
    "database replication between databases with identical table structures
    and data models".

    If we look at the more interesting case of tuples from a view, then
    these will be de-normalised (i.e. they have structure that would have
    been normalised into multiple tables). An appropriate XML
    representation of these is also normalised. Now we can still say "one
    element per tuple" simply, but it has to become "one parent element for
    one or more tuples" and "potentially more than one level of element
    hierarchy within a tuple"

    I strongly recommend studying MS SQL 2000 and the splendid hack with
    which they implemented the "FOR XML" select query, without changing
    anything in the database itself. If you search the MSDN SDK for
    "Universal table" then there's a good explanation of it. Basically any
    "FOR XML" query produces a huge denormalised scratch table called the
    "Universal table", then a trivial row scanner runs through this and
    generates new element hierarchies when column values change. Quite
    useful, and a splendid low-effort hack.
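
    The row-scanner idea can be sketched in a few lines of Python. This is
    not how SQL Server implements it internally; the sample rows, the
    element names and the scan function are all assumptions, and the input
    must already be sorted on the parent columns for the grouping to work.

```python
import xml.etree.ElementTree as ET
from itertools import groupby
from operator import itemgetter

# Denormalised, sorted rows as a join might return them (hypothetical data):
# (customer id, customer name, order id, item).
rows = [
    (101, "Clark Kent", 1, "cape"),
    (101, "Clark Kent", 3, "glasses"),
    (102, "Peter Parker", 2, "web fluid"),
]

def scan(rows):
    """Open a new <customer> element whenever the customer columns change."""
    root = ET.Element("customers")
    for (cid, name), group in groupby(rows, key=itemgetter(0, 1)):
        customer = ET.SubElement(root, "customer", {"id": str(cid), "name": name})
        for _, _, oid, item in group:
            ET.SubElement(customer, "order", {"id": str(oid)}).text = item
    return root

nested = scan(rows)
print(ET.tostring(nested, encoding="unicode"))
```

    Deeper nesting follows the same pattern: each extra level is another
    grouping on the next set of columns inside the current group.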
     
    Andy Dingley, Sep 8, 2006
    #9
  10. Andy Dingley wrote:
    > * To use IDREF, you must first have an ID. What happens if you want to
    > refer to a node that's identifiable, but not explicitly labelled ? It's
    > a valid requirement.


    Usual solution is XPath -- structural/content-based crossreferencing
    rather than pre-tagged -- and solutions derived from it such as
    XPointer, or schema's keys, or the similar capabilities in XSLT and XQuery.
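
    As a small illustration of content-based referencing, the Python sketch
    below uses the limited XPath subset in the standard library's
    ElementTree. The document reuses the customer example from the original
    question; no element carries an ID attribute.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<root>"
    "<customer><Id>101</Id><Name>Clark Kent</Name></customer>"
    "<customer><Id>102</Id><Name>Peter Parker</Name></customer>"
    "</root>"
)

# Select by content rather than by a pre-assigned ID attribute.
match = doc.find("./customer[Name='Peter Parker']")

# Select by position in the document structure.
second = doc.find("./customer[2]")

print(match.findtext("Id"), second is match)
```

    Both expressions locate the same node, one by what it contains and one
    by where it sits.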

    --
    () ASCII Ribbon Campaign | Joe Kesselman
    /\ Stamp out HTML e-mail! | System architexture and kinetic poetry
     
    Joe Kesselman, Sep 8, 2006
    #10