Here's the XML validation tool the world is waiting for...

Discussion in 'XML' started by Ramon F Herrera, Nov 10, 2012.

  1. I have tried some automated file-validation tools, but concluded
    that their approach is fundamentally flawed: you cannot possibly
    generate a schema based on only one XML sample. Not unlike electoral
    polls (or any kind of sampling), the more samples you have, the more
    accurate the result. Yes, I know that at some point you reach
    diminishing returns.

    But I digress...

    Are you folks aware of any such tool? One that takes a whole bunch of
    XML files (I have an unlimited supply and can make a widely varied
    set) and creates their best-fit schema?

    -Ramon

    ps: There's a business opportunity...
     
    Ramon F Herrera, Nov 10, 2012
    #1

  2. On Nov 10, 10:50 am, Ramon F Herrera wrote:
    > I have tried some automated file-validation tools, but concluded
    > that their approach is fundamentally flawed: you cannot possibly
    > generate a schema based on only one XML sample. Not unlike electoral
    > polls (or any kind of sampling), the more samples you have, the more
    > accurate the result. Yes, I know that at some point you reach
    > diminishing returns.
    >
    > But I digress...
    >
    > Are you folks aware of any such tool? One that takes a whole bunch of
    > XML files (I have an unlimited supply and can make a widely varied
    > set) and creates their best-fit schema?
    >
    > -Ramon
    >
    > ps: There's a business opportunity...



    Never mind, I found something that looks great: Liquid XML Studio.

    Any competing options?

    -Ramon
     
    Ramon F Herrera, Nov 10, 2012
    #2

  3. Ramon F Herrera writes:

    > I have tried some automated file-validation tools, but concluded
    > that their approach is fundamentally flawed: you cannot possibly
    > generate a schema based on only one XML sample. Not unlike electoral
    > polls (or any kind of sampling), the more samples you have, the more
    > accurate the result. Yes, I know that at some point you reach
    > diminishing returns.
    >
    > But I digress...
    >
    > Are you folks aware of any such tool? One that takes a whole bunch of
    > XML files (I have an unlimited supply and can make a widely varied
    > set) and creates their best-fit schema?


    This is called grammatical inference (or sometimes grammar induction).
    (Warning: understatement ahead) It is difficult in the general case. All
    you can hope for is a "good enough" solution (it looks like you found
    one).

    > ps: There's a business opportunity...


    There are plenty of scientific opportunities...

    -- Alain.
     
    Alain Ketterlin, Nov 11, 2012
    #3
  4. On 10/11/12 16:50, Ramon F Herrera wrote:
    > I have tried some automated file-validation tools, but concluded
    > that their approach is fundamentally flawed: you cannot possibly
    > generate a schema based on only one XML sample.


    No, you can generate a schema that describes that single document. This
    has been known for a long time, at least since the days of OCLC's Fred
    (using SGML DTDs). If the document is sufficiently representative of its
    type, it is a good starting-point for manual refinement, and saves a lot
    of time on those (rare) occasions when it is necessary.
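
    As a rough illustration of describing a single document, here is a
    minimal Python sketch. It is not how Fred or any other tool mentioned
    in this thread works; the function name describe_document and the
    sample order document are made up for the example. It only records,
    for each element name, which child elements and attributes were seen
    and whether the element carried text, which is the raw material one
    would then refine by hand into a DTD or schema.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def describe_document(xml_text):
    """Summarise one XML document as {element name: observed structure}.

    Hypothetical helper for illustration only. Text that follows a child
    element (ElementTree's "tail") is ignored to keep the sketch short.
    """
    root = ET.fromstring(xml_text)
    model = defaultdict(
        lambda: {"children": set(), "attributes": set(), "has_text": False}
    )
    for elem in root.iter():
        entry = model[elem.tag]
        # Record every child element name seen directly under this element.
        entry["children"].update(child.tag for child in elem)
        # Record every attribute name seen on this element.
        entry["attributes"].update(elem.attrib)
        # Note whether the element has non-whitespace leading text.
        if elem.text and elem.text.strip():
            entry["has_text"] = True
    return dict(model)


# Made-up sample document.
SAMPLE = (
    '<order id="1">'
    '<item sku="a">Widget</item>'
    '<item sku="b">Gadget</item>'
    '<note>rush</note>'
    '</order>'
)

model = describe_document(SAMPLE)
for name, info in sorted(model.items()):
    print(name, sorted(info["children"]), sorted(info["attributes"]), info["has_text"])
```

    A summary like this says nothing about ordering, cardinality or
    datatypes, so it is only a starting point for the manual refinement
    described above.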

    > Not unlike electoral polls (or any kind of sampling), the more
    > samples you have, the more accurate the result. Yes, I know that at
    > some point you reach diminishing returns.


    For it to be useful, the samples must describe the same type of
    document. Creating the union of TEI and DocBook is probably not useful :)

    > Are you folks aware of any such tool? One that takes a whole bunch
    > of XML files (I have an unlimited supply and can make a widely
    > varied set) and creates their best-fit schema?


    If you can infer a sample fragmentary grammar and express it in a
    generalised machine-readable syntax, then you can probably deduce the
    union of multiple instances of other fragments of the same grammar,
    provided they possess sufficient commonality.
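
    One way to picture taking the union over several instances is the
    Python sketch below. It is an assumption-laden toy, not a real
    inference engine: the function name infer_occurrences and the two
    sample documents are invented, and it only tracks how many times
    each child appears under each parent, ignoring order, attributes and
    datatypes. The (min, max) pairs it produces are loosely comparable to
    minOccurs/maxOccurs in an XSD.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict


def infer_occurrences(xml_samples):
    """Return {(parent, child): (min_count, max_count)} over all samples.

    Hypothetical helper for illustration only. Counts are taken per
    parent instance, so a child missing from any one instance gets a
    minimum of 0 (that is, it looks optional).
    """
    seen = defaultdict(list)  # parent tag -> one Counter per instance
    for text in xml_samples:
        for elem in ET.fromstring(text).iter():
            seen[elem.tag].append(Counter(child.tag for child in elem))

    bounds = {}
    for parent, instances in seen.items():
        # Union of every child name observed under this parent.
        child_names = set().union(*instances)
        for child in child_names:
            counts = [inst[child] for inst in instances]  # 0 when absent
            bounds[(parent, child)] = (min(counts), max(counts))
    return bounds


# Two made-up instances of the same document type.
SAMPLES = [
    "<order><item/><item/><note/></order>",
    "<order><item/></order>",
]

bounds = infer_occurrences(SAMPLES)
for (parent, child), (low, high) in sorted(bounds.items()):
    print(f"{parent}/{child}: min={low} max={high}")
```

    With only two samples the bounds are obviously tentative: a third
    document with three items would raise the maximum, which is the
    "more samples, more accuracy" effect raised in the original post.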

    > ps: There's a business opportunity...


    Limited, I would say, but certainly there.

    ///Peter
     
    Peter Flynn, Nov 13, 2012
    #4
