XML and Ontologies

Discussion in 'XML' started by Alex Fawcett, Jul 10, 2003.

  1. Alex Fawcett

    Alex Fawcett Guest

    I am interested in XML mediation and the use of ontologies to link
    similar but different element names in XML schemas. Am I correct in my
    understanding that an ontology is a language or set of commands that
    is agreed upon, thus making mediation between XML element names
    unnecessary? Also, is this the best method of mediation between XML
    files?
    thanks for any help
    Alex
    Alex Fawcett, Jul 10, 2003

  2. Andy Dingley

    Andy Dingley Guest

    On 10 Jul 2003 05:35:27 -0700, Alex Fawcett wrote:

    >I am interested in XML mediation and the use of ontologies to link
    >similar but different element names in XML schema.


    XML is a bit of an unhappy fit with ontologies - you start to
    appreciate the differences between RDF and XML.

    I suggest giving Protégé a whirl
    http://protege.stanford.edu

    It's an environment for editing both ontologies and instance data, in
    a very approachable style. Certainly worth a look.

    I spent much of last week here:
    http://protege.stanford.edu/workshop_vi/schedule.html
    and blogged a brief trip report here:
    http://www.livejournal.com/users/quercus/20830.html


    It's a frames-based approach, rather than a description logics
    approach. This makes big differences, but you need to get a little
    hands-on with both (and frames is perhaps simpler to start with). We
    don't know where we'll end up finally, and we might need to combine
    both approaches.

    Take a look at the W3C's OWL (Web Ontology Language) and the older
    work (SHOE, OIL, DAML+OIL) too. These are generally DL-based (take a
    look at Manchester's OilEd, if you want a contrast to Protégé)


    >Am I correct in my
    >understanding that an ontology is a language or set of commands that
    >is agreed upon thus making mediation between XML element names
    >unnecessary.


    What's an ontology ? I've written the "30 second elevator pitch" on
    this about a dozen times over the last few years. It's very hard to
    give one simple definition that meets all needs. Everyone who comes to
    ontologies (and it's almost a stampede now) approaches from a
    different angle.

    Natasha Noy's classic paper "Ontology 101" is a good place to start
    http://protege.stanford.edu/publications/ontology_development/ontology101-noy-mcguinness.html

    Broadly, I'd say that it was one definition of a set of entities and
    their related properties, expressed in a style that was understood by
    other systems.

    It may also describe their metaphysical "meanings", which is the
    difference between an ontology and a schema (or between DAML and
    DAML+OIL)

    An ontology does not describe mappings or mediation between two XML
    schemas. Depending on your meaning of "mediation" this might be easy
    (if you know they're ontologically identical, but you just need to
    match up the names), but mapping is generally speaking a fiendishly
    difficult problem.

    You can approach it with ontologies. You use two ontologies,
    describing both the source and target. Then you apply some form of
    complex reasoning to identify commonality and as much "mapping" as is
    possible. From this you then generate (or auto-generate) code to do
    the mapping. Easy.

    The problem is that any ontology beyond the trivial has no simple
    mapping between entities. Does an employee have a "works-for"
    relationship with their boss, or a "works-in" with their department
    and a "manages" relationship between boss and department ? This stuff
    just doesn't overlay cleanly, so an improved description technique
    alone isn't going to fix things.
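
    To make that mismatch concrete, here is a minimal Python sketch. The
    triples, the names ("alice", "bob", "sales") and the derive_works_for
    helper are all invented for illustration; the only point is that
    getting from one model to the other takes a join rule, not a rename.

```python
# Hypothetical data: two ways of modelling the same organisation.
# Model A stores a direct employee -> boss relationship.
model_a = {("alice", "works-for", "bob")}

# Model B stores employee -> department and boss -> department instead.
model_b = {
    ("alice", "works-in", "sales"),
    ("bob", "manages", "sales"),
}


def derive_works_for(triples):
    """Derive works-for facts by joining works-in with manages."""
    # Index the managers of each department.
    managers = {}
    for subj, pred, obj in triples:
        if pred == "manages":
            managers.setdefault(obj, set()).add(subj)

    # Every employee in a department works for that department's managers.
    derived = set()
    for subj, pred, obj in triples:
        if pred == "works-in":
            for boss in managers.get(obj, ()):
                if boss != subj:
                    derived.add((subj, "works-for", boss))
    return derived


derived = derive_works_for(model_b)
```

    Going from model B to model A works with this one rule. Going the
    other way does not, because model A never recorded the department.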

    >Also is this the best method of mediation between XML
    >files.


    Depends on the scale of your problem. What's an "XML file" ? Are
    these the same two schemas you see every day, or is it a dynamic
    problem with every new message ? How different are the two models ?

    Incidentally, the same problem between one XML document and an RDBMS
    is also common.

    There's a lot of very rudimentary work being passed off around this
    problem (Oracle 9i being a case in point) where people in suits with a
    product to sell are pushing very simple (often XSLT-based) solutions
    as a panacea. Those who are seriously in the field know it's not so
    easy.


    There's also the problem of meta-languages. Many people are already
    encountering this with database output, and it has a huge effect on
    the use of XSLT.

    Consider an RDBMS with a generic XML export filter. What should the
    output look like ?

    <order>
      <order-item>
        <a>1</a><b>2</b>
      </order-item>
      <order-item>
        <b>3</b><c>4</c>
      </order-item>
    </order>


    <query name="order">
      <row name="order-item">
        <column name="a">1</column><column name="b">2</column>
      </row>
      <row name="order-item">
        <column name="b">3</column><column name="c">4</column>
      </row>
    </query>


    The first of these maps column names onto element names. It generates
    compact XML that's probably how most XML coders would do it manually.
    The trouble is that it's a new DTD for every query.

    The second is a meta-format. The DTD is the same for every query
    output and only the name="" metadata changes. It's verbose (but we
    don't care, because our computers deal with that for us)

    Ontologically these are _identical_ (they ought to be, or our export
    filter is broken). In terms of ease of use though, they're quite
    different. The first is unstable and somewhat unpredictable
    (although you can easily auto-export a DTD or even ontology at the
    same time), the second is hard to process (with XSLT).
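
    One way to check the "identical" claim is to read both outputs back
    into the same in-memory shape. The following is only a sketch using
    Python's standard xml.etree.ElementTree; the function names and the
    (name, rows) tuple layout are assumptions made for this example, not
    part of any export filter.

```python
import xml.etree.ElementTree as ET

TYPE1 = """<order>
  <order-item><a>1</a><b>2</b></order-item>
  <order-item><b>3</b><c>4</c></order-item>
</order>"""

TYPE2 = """<query name="order">
  <row name="order-item">
    <column name="a">1</column><column name="b">2</column>
  </row>
  <row name="order-item">
    <column name="b">3</column><column name="c">4</column>
  </row>
</query>"""


def rows_from_type1(xml_text):
    """Names live in the element names themselves."""
    root = ET.fromstring(xml_text)
    rows = [(item.tag, {col.tag: col.text for col in item}) for item in root]
    return root.tag, rows


def rows_from_type2(xml_text):
    """Names live in name="" attributes; element names are fixed."""
    root = ET.fromstring(xml_text)
    rows = [
        (row.get("name"), {col.get("name"): col.text for col in row})
        for row in root
    ]
    return root.get("name"), rows


same_data = rows_from_type1(TYPE1) == rows_from_type2(TYPE2)
```

    Both readers produce the same rows, so nothing is lost or gained
    between the two formats. What differs is where a reader has to look
    for the names.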

    XSLT is a language for transformations of XML data at the structural
    level. This works fine for our "type 1" data above, or for much XML,
    because XML's data model is inferred from the structure (go read
    XML-Infoset). A structural transformation _is_ a transformation at the
    level of the data-model.

    The second one becomes much harder. We've now separated the structural
    level (and the data model of our consistent "generic export format")
    from the data model of our "real" data. An XSLT transform still
    operates at the structural level (it has to - that's what XSLT does)
    and so it's now divorced from the level the interesting data is
    residing at. Using XSLT to make real "data-level" transformations
    like this becomes a real PITA. In some formats it's straightforward
    but long-winded; in others (like RDF) it becomes well-nigh impossible.
    Schematron can sometimes help.
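
    A small illustration of the extra indirection, written in Python
    with ElementTree's limited XPath subset rather than in XSLT itself
    (a stand-in chosen for this sketch). The task, picked arbitrarily,
    is "fetch every value of column b".

```python
import xml.etree.ElementTree as ET

type1 = ET.fromstring(
    "<order>"
    "<order-item><a>1</a><b>2</b></order-item>"
    "<order-item><b>3</b><c>4</c></order-item>"
    "</order>"
)

type2 = ET.fromstring(
    '<query name="order">'
    '<row name="order-item">'
    '<column name="a">1</column><column name="b">2</column>'
    "</row>"
    '<row name="order-item">'
    '<column name="b">3</column><column name="c">4</column>'
    "</row>"
    "</query>"
)

# Type 1: the path is the data model, so the query reads naturally.
b_from_type1 = [e.text for e in type1.findall("order-item/b")]

# Type 2: every step needs a predicate on name="" to reach the same data.
b_from_type2 = [
    e.text
    for e in type2.findall("row[@name='order-item']/column[@name='b']")
]
```

    The second path is still manageable here, but every template or
    match pattern written against the meta-format has to carry those
    predicates, and that is where the long-windedness comes from.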

    RDF is a bit like "type 2" data, with a "generic export format" that's
    already defined by the RDF/XML standards. You can't work with
    non-trivial RDF in XSLT, because of just this problem. That's why RDF
    is manipulated by tools such as Jena, that work at the data model
    level.
    Andy Dingley, Jul 12, 2003

  3. Richard Light

    Andy Dingley writes:
    >On 10 Jul 2003 05:35:27 -0700, Alex Fawcett wrote:
    >
    >>I am interested in XML mediation and the use of ontologies to link
    >>similar but different element names in XML schema.
    >
    >>Am I correct in my
    >>understanding that an ontology is a language or set of commands that
    >>is agreed upon thus making mediation between XML element names
    >>unnecessary.


    Further to Andy's excellent thoughts on this issue, I would add the
    suggestion that you could look into using Topic Maps
    (http://www.topicmaps.org/) to represent equivalences between concepts
    in schemas. As it happens, I was doing exactly this only last week, as
    preparation for a data mapping exercise.

    I took the two schemas I wanted to compare, and used XSLT to convert
    them to Topic Maps. I then wrote a "links" document containing
    relationships between individual concepts. As it happens, I wrote this
    in the sort of "compact" style Andy described, e.g.:

    <link type="exact">
      <member schema="nt" id="condition-check"/>
      <member schema="spectrum" id="check"/>
    </link>

    but I could easily use XSLT to convert this to a proper Topic Map
    (containing nothing but Associations).

    What I actually did was to convert this "links" document into an HTML
    table of links between equivalent concepts in the two schemas. This was
    sufficient for the task at hand.
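
    For readers who want to try the same idea without XSLT, here is a
    rough Python equivalent of that last step. It is a sketch only: the
    <links> wrapper element, the table layout and the links_to_html name
    are assumptions made for this example, not the stylesheet described
    above.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical "links" document; the <links> root element is assumed.
LINKS = """<links>
  <link type="exact">
    <member schema="nt" id="condition-check"/>
    <member schema="spectrum" id="check"/>
  </link>
</links>"""


def links_to_html(xml_text):
    """Render each <link> as one table row: link type, then its members."""
    root = ET.fromstring(xml_text)
    lines = [
        "<table>",
        "<tr><th>type</th><th>concept</th><th>concept</th></tr>",
    ]
    for link in root.iter("link"):
        cells = [link.get("type", "")]
        # Show each member as schema:id so the two sides stay distinguishable.
        cells += [f'{m.get("schema")}:{m.get("id")}' for m in link.findall("member")]
        row = "".join(f"<td>{escape(c)}</td>" for c in cells)
        lines.append(f"<tr>{row}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


html_table = links_to_html(LINKS)
```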

    In principle I could instead have made my "links" document into a TM in
    its own right, and then used it to merge the two schemas into a single
    TM with all the correspondences expressed as TM Associations. This sort
    of approach lets you work at a higher level of abstraction than the raw
    XML (i.e. at a "Topic Map concepts" level). Conversely, TM XML is
    pretty simple (if verbose) in its structure, so you may get more mileage
    using XSLT than Andy suggests you would with RDF.

    Richard Light
    --
    Richard Light
    SGML/XML and Museum Information Consultancy
    Richard Light, Jul 14, 2003
