What kinds of things can be verified in XML files?

Discussion in 'XML' started by Cambridge Ray, Aug 10, 2011.

  1. The question is so abstract, I guess I have to illustrate. One of my
    XML files contains a set of rectangular coordinates:

    <reference>
        <line x1="416" y1="6436" x2="416" y2="3924" />
        <line x1="420" y1="6436" x2="420" y2="3924" />
        <line x1="1500" y1="5388" x2="1500" y2="4452" />
        <line x1="1504" y1="4436" x2="1504" y2="3924" />
        <line x1="2884" y1="5388" x2="2884" y2="4456" />
        <line x1="412" y1="4436" x2="412" y2="3932" />
    </reference>

    I would like to make sure that every X2 is greater than or equal to
    its X1 companion. Same for Y2 and Y1. Is this something that can be
    easily checked at the XML level, or should I perform such a check after
    the XML file is read and parsed?

    I use Xerces-C++.

    TIA,

    -Ramon
    Cambridge Ray, Aug 10, 2011
    #1

  2. Here's another example. What I would like to check is that
    successive coordinates are in ascending order, and that the "skip"
    element contains only 0 and 1 values. Can this be verified (relatively)
    easily at the XML level, or should I do it after the XML file is read
    and parsed?

    TIA,

    -Ramon

    -----------

    <rows>
        <coord>3449</coord>
        <coord>3600</coord>
        <coord>3893</coord>
        <coord>4196</coord>
        <coord>4340</coord>
        <coord>4644</coord>
        <coord>4941</coord>
        <coord>5242</coord>
        <coord>5541</coord>
    </rows>

    <columns>
        <coord>278</coord>
        <coord>876</coord>
        <coord>1174</coord>
        <coord>1783</coord>
        <coord>2555</coord>
        <coord>3154</coord>
        <coord>4068</coord>
        <coord>4825</coord>
    </columns>

    <skip>
        <coord>0</coord>
        <coord>1</coord>
        <coord>1</coord>
        <coord>0</coord>
        <coord>1</coord>
    </skip>
    Cambridge Ray, Aug 10, 2011
    #2

  3. >> <line x1="412" y1="4436" x2="412" y2="3932" />
    >> I would like to make sure that every X2 is greater than or equal to
    >> its X1 companion.


    The standard XML DTD and Schema languages can't express that kind of
    interaction; you'd need to implement it at a higher level of your
    application. Basically, if something is application semantics, the
    application has to deal with it; if it's closer to syntax (type and
    range limits, and many but not all kinds of document structure
    constraints), a schema can check it.

    There have been alternatives to the W3C's XML Schema language which can
    implement more complicated constraints. The problem is that they aren't
    as well standardized or as widely supported, so you really can't count
    on anyone else using them. They may still be useful within some
    controlled contexts, as an alternative to hand-coding.
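
    As a rough sketch of the application-level check described above,
    the comparison can be done after parsing. This uses Python's standard
    xml.etree.ElementTree purely for illustration; it is not the Xerces-C++
    API the original poster uses, and the function name
    find_unordered_lines is hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the first post in this thread.
SAMPLE = """
<reference>
    <line x1="416" y1="6436" x2="416" y2="3924" />
    <line x1="420" y1="6436" x2="420" y2="3924" />
    <line x1="1500" y1="5388" x2="1500" y2="4452" />
    <line x1="1504" y1="4436" x2="1504" y2="3924" />
    <line x1="2884" y1="5388" x2="2884" y2="4456" />
    <line x1="412" y1="4436" x2="412" y2="3932" />
</reference>
"""


def find_unordered_lines(xml_text):
    """Return (line_index, axis) pairs where the '2' value is below the '1' value.

    Hypothetical helper: assumes every <line> carries integer x1/y1/x2/y2
    attributes, as in the sample above.
    """
    root = ET.fromstring(xml_text)
    problems = []
    for index, line in enumerate(root.iter("line")):
        for axis in ("x", "y"):
            first = int(line.get(axis + "1"))
            second = int(line.get(axis + "2"))
            if second < first:
                problems.append((index, axis))
    return problems


if __name__ == "__main__":
    for index, axis in find_unordered_lines(SAMPLE):
        print(f"line {index}: {axis}2 is smaller than {axis}1")
```

    Run against the sample from the first post, every x pair passes, but
    every y pair is flagged, because each y2 there is smaller than its y1.
    So the stated rule and the sample data disagree on the y axis, which is
    exactly the kind of thing such a check is meant to surface.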

    --
    Joe Kesselman,
    http://www.love-song-productions.com/people/keshlam/index.html

    {} ASCII Ribbon Campaign | "may'ron DaroQbe'chugh vaj bIrIQbej" --
    /\ Stamp out HTML mail! | "Put down the squeezebox & nobody gets hurt."
    Joe Kesselman, Aug 11, 2011
    #3
  4. On 8/10/2011 7:01 PM, Joe Kesselman wrote:
    >>> <line x1="412" y1="4436" x2="412" y2="3932" />
    >>> I would like to make sure that every X2 is greater than or equal to
    >>> its X1 companion.

    >
    > The standard XML DTD and Schema languages can't express that kind of
    > interaction; you'd need to implement it at a higher level of your
    > application. Basically, if something is application semantics the
    > application has to deal with it; if it's closer to syntax (type and
    > range limits, and many but not all kinds of document structure
    > constraint) schema can check it.
    >
    > There have been alternatives to the W3C's XML Schema language which can
    > implement more complicated constraints. The problem is that they aren't
    > as well standardized or as widely supported, so you really can't count
    > on anyone else using them. They may still be useful within some
    > controlled contexts, as an alternative to hand-coding.
    >


    Hi Joe,
    Just curious whether Schematron, with its 'let' and 'value-of' abilities,
    could be of help for the OP?
    thanks,
    --Tim
    Tim Arnold, Aug 11, 2011
    #4
  5. On 8/11/2011 12:21 PM, Tim Arnold wrote:
    > Just curious if schematron with its 'let' and 'value-of' abilities could
    > be of help for the OP?


    I believe Schematron can express this kind of constraint... if you are
    in an environment where you can guarantee that Schematron will be
    available on the machine in question. In other words, it might be
    reasonable to apply this on the server end where you own all the code,
    but unless you can also guarantee that nobody but you will be writing
    clients you may not be able to do much with it on that end -- and if you
    ARE writing all the clients, you can usually ensure the data is correct
    in the first place rather than spending cycles checking it.



    Joe Kesselman, Aug 12, 2011
    #5
  6. Joe Kesselman wrote:
    >>> <line x1="412" y1="4436" x2="412" y2="3932" />
    >>> I would like to make sure that every X2 is greater than or equal to
    >>> its X1 companion.

    >
    > The standard XML DTD and Schema languages can't express that kind of
    > interaction;


    It might be worth noting that version 1.1 of the schema language is
    at the "Candidate Recommendation" stage, and with it you can define
    assertions (http://www.w3.org/TR/xmlschema11-1/#cAssertions), e.g.
    <xs:assert test="@x2 ge @x1"/>
    I think there is a version of Xerces Java that already implements
    this, and Saxon's commercial schema processor supports it as well.


    --

    Martin Honnen --- MVP Data Platform Development
    http://msmvps.com/blogs/martin_honnen/
    Martin Honnen, Aug 12, 2011
    #6
  7. > It might be worth noting that the version 1.1 of the schema language is
    > in the state "Candidate Recommendation" and with that you are able to
    > define assertions http://www.w3.org/TR/xmlschema11-1/#cAssertions e.g.
    > <xs:assert test="@x2 ge @x1"/>


    Good point. I'd hesitate to _rely_ on Schema 1.1 until it graduates to
    Recommendation -- and even then, not all parsers will support it
    promptly -- but it's certainly reasonable to start prototyping against
    it if you have it available.

    Joe Kesselman, Aug 12, 2011
    #7
  8. On 11/08/11 00:01, Joe Kesselman wrote:
    >>> <line x1="412" y1="4436" x2="412" y2="3932" />
    >>> I would like to make sure that every X2 is greater than or equal to
    >>> its X1 companion.

    >
    > The standard XML DTD and Schema languages can't express that kind of
    > interaction; you'd need to implement it at a higher level of your
    > application. Basically, if something is application semantics the
    > application has to deal with it; if it's closer to syntax (type and
    > range limits, and many but not all kinds of document structure
    > constraint) schema can check it.
    >
    > There have been alternatives to the W3C's XML Schema language which can
    > implement more complicated constraints. The problem is that they aren't
    > as well standardized or as widely supported, so you really can't count
    > on anyone else using them. They may still be useful within some
    > controlled contexts, as an alternative to hand-coding.


    I think it's also important to establish what the objective is. The
    typical sequence of events when an XML instance is processed can be
    expressed as

    1. syntactic verification (is the document well-formed?)
    2. formal validation (well-formed document tested against schema/DTD)
    3. processing with whatever language/engine is specified, which may
       involve further error-reporting, but at this stage the document
       itself is presumed valid to its schema/DTD

    The expectation is that if step 1 or 2 fails, no further action takes
    place, although a processor can report an error and even try to fix it,
    which may involve digging further into the document to see what is going
    on; but it cannot continue as if nothing had happened.
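
    The three stages above can be sketched roughly as follows. This is an
    illustration in Python using only the standard library, which has no
    schema or DTD validator, so stage 2 here is a hand-written stand-in for
    real validation. The function name run_pipeline, the REQUIRED tuple,
    and the assumption that attributes are non-negative integers are all
    hypothetical.

```python
import xml.etree.ElementTree as ET

# Attributes every <line> is assumed to need (hypothetical rule set).
REQUIRED = ("x1", "y1", "x2", "y2")


def run_pipeline(xml_text):
    """Return (stage_reached, message); processing stops at the first failing stage."""
    # Stage 1: well-formedness. The parser rejects malformed input outright.
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return 1, f"not well-formed: {exc}"

    # Stage 2: stand-in for schema/DTD validation (structure and simple types).
    if root.tag != "reference":
        return 2, f"unexpected root element <{root.tag}>"
    for child in root:
        if child.tag != "line":
            return 2, f"unexpected child element <{child.tag}>"
        bad = [name for name in REQUIRED if not (child.get(name) or "").isdigit()]
        if bad:
            return 2, f"missing or non-numeric attributes: {', '.join(bad)}"

    # Stage 3: application processing; the document is presumed valid here.
    return 3, f"processed {len(root)} line elements"
```

    A caller would typically refuse to go further when the first element
    of the result is 1 or 2, and only trust the data once stage 3 has been
    reached.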

    If you specify a constraint at the level of the Schema or DTD then
    presumably you do so because you want to prevent the instance being
    processed if it fails a well-formedness or validation test.

    In effect, an assertion such as Martin mentions (that one attribute has
    to be bigger than another) becomes a breaking-point. So we need to
    consider how big a deal this is. The document is well-formed, because
    validation will only take place if the document has passed (1) above. Is
    the fact that <foo bar="42" blort="43"/> going to kill someone, or cause
    the stock market to crash, or create a batch of dud chips, or just order
    43 paperclips instead of 42? This level of analysis should indicate
    whether such a test should cause the entire factory to come to a stop
    and evacuate, or simply email a warning to the appropriate person.

    I think what I am saying is, the fact that you *can* specify ever
    tighter constraints doesn't necessarily mean that it is the right
    business decision to do so, because the effects of premature validation
    failure can be just as serious as those of an error remaining
    undetected until later.
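
    One way to keep that decision adjustable is to separate detecting a
    problem from deciding how severe it is. The sketch below (Python,
    standard library only) applies this to the rows/columns/skip example
    from earlier in the thread. It assumes the three fragments are wrapped
    in a single hypothetical <grid> root element, treats "ascending" as
    strictly ascending, and uses made-up rule names; which rules are fatal
    is a policy choice passed in by the caller, not something the thread
    prescribes.

```python
import xml.etree.ElementTree as ET


def check_grid(xml_text, fatal_rules=("skip-values",)):
    """Return (errors, warnings) as two lists of messages.

    Hypothetical helper: expects <rows>, <columns> and <skip> under one
    root, each holding <coord> children with integer text.
    """
    root = ET.fromstring(xml_text)
    findings = []  # (rule_name, message) pairs

    # Rule "ascending": each coordinate must exceed the one before it.
    for group in ("rows", "columns"):
        values = [int(coord.text) for coord in root.findall(f"{group}/coord")]
        for previous, current in zip(values, values[1:]):
            if current <= previous:
                findings.append(("ascending", f"{group}: {current} follows {previous}"))

    # Rule "skip-values": only 0 and 1 are allowed inside <skip>.
    for coord in root.findall("skip/coord"):
        text = (coord.text or "").strip()
        if text not in ("0", "1"):
            findings.append(("skip-values", f"skip: unexpected value {text!r}"))

    # Severity is decided here, separately from detection.
    errors = [message for rule, message in findings if rule in fatal_rules]
    warnings = [message for rule, message in findings if rule not in fatal_rules]
    return errors, warnings
```

    With the default policy, an out-of-order coordinate only produces a
    warning, while a bad skip value is an error; passing
    fatal_rules=("ascending", "skip-values") makes both stop processing.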

    ///Peter
    --
    XML FAQ: http://xml.silmaril.ie/
    (and apologies to those trying to access it in the last few days: the
    server suffered a CPU flood from a rogue process; and No, before you
    ask, it wasn't unvalidated XML data :)
    Peter Flynn, Aug 14, 2011
    #8