XML Parsing

Discussion in 'Perl Misc' started by xhoster@gmail.com, Sep 29, 2005.

  1. Guest

    I'm currently using XML::Parser to process some XML. I am wondering what
    else is out there. A search of CPAN on 'XML' reveals that there is way
    more out there than I can possibly evaluate, but that most of it is not
    general XML processing modules, but rather for very specific tasks which
    just happen to involve XML.

    I'd like to be familiar with the major alternatives to XML::Parser, without
    having to read 3689 different perldoc pages to find those major
    alternatives. So, what are your favorite Perl modules for general-purpose
    XML processing?

    Thanks,

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    Please forgive me for not posting code.
    , Sep 29, 2005
    #1

  2. Henry Law Guest

    wrote:

    > I'd like to be familiar with the major alternatives to XML::Parser, without
    > having to read 3689 different perldoc pages to find those major
    > alternatives. So, what are your favorite Perl modules for general-purpose
    > XML processing?


    And I thought it was just me, being easily confused! For my part I use
    XML::Simple occasionally (for config files and straightforward things
    like that), but more often XML::Twig.

    --

    Henry Law <>< Manchester, England
    Henry Law, Sep 29, 2005
    #2

  3. Guest

    wrote:
    > I'm currently using XML::Parser to process some XML. I am wondering what
    > else is out there. A search of CPAN on 'XML' reveals that there is way
    > more out there than I can possibly evaluate, but that most of it is not
    > general XML processing modules, but rather for very specific tasks which
    > just happen to involve XML.
    >
    > I'd like to be familiar with the major alternatives to XML::Parser, without
    > having to read 3689 different perldoc pages to find those major
    > alternatives. So, what are your favorite Perl modules for general-purpose
    > XML processing?
    >


    XML::LibXML is an alternative (I seem to recall reading somewhere that
    XML::LibXML had some advantages over XML::Parser, but the details
    escape me).

    --
    Charles DeRykus
    , Sep 30, 2005
    #3
  4. wrote:
    > I'm currently using XML::Parser to process some XML. I am wondering what
    > else is out there. A search of CPAN on 'XML' reveals that there is way
    > more out there than I can possibly evaluate, but that most of it is not
    > general XML processing modules, but rather for very specific tasks which
    > just happen to involve XML.
    >
    > I'd like to be familiar with the major alternatives to XML::Parser, without
    > having to read 3689 different perldoc pages to find those major
    > alternatives. So, what are your favorite Perl modules for general-purpose
    > XML processing?


    It depends what you are looking for:

    XML::Simple is often enough to load XML data into a Perl data structure,
    XML::LibXML gives you the speed and power of libxml2
    XML::Twig is perlish and convenient (and written by me ;--)
    XML::SAX::Machine gives you a framework for stream processing

    I guess those are the ones I would recommend these days.

    The Perl XML FAQ ( http://perl-xml.sourceforge.net/faq/ ) has some more
    information. I also have a few articles on my site that show examples of
    using the various modules: http://www.xmltwig.com/article/index_wtr.html
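
    As a rough illustration of how three of the modules listed above differ
    in style, here is a minimal sketch that reads the same small document
    with each of them. The sample XML and the variable names are invented
    for the illustration; only the module calls themselves are real.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);
use XML::LibXML;
use XML::Twig;

# Hypothetical sample document, made up for this sketch.
my $xml = '<config><host>localhost</host><port>8080</port></config>';

# XML::Simple: slurp the whole document into a plain hash reference.
my $data = XMLin($xml);
print "XML::Simple: $data->{host}:$data->{port}\n";

# XML::LibXML: build a DOM tree and query it with XPath.
my $doc = XML::LibXML->load_xml(string => $xml);
print "XML::LibXML: ", $doc->findvalue('/config/port'), "\n";

# XML::Twig: register a handler that fires as each <port> element is parsed.
my @ports;
my $twig = XML::Twig->new(
    twig_handlers => {
        port => sub {
            my ($t, $elt) = @_;
            push @ports, $elt->text;
        },
    },
);
$twig->parse($xml);
print "XML::Twig: @ports\n";
```

    Which style fits best depends on the job: a hash is handy for config
    files, XPath for picking values out of a larger tree, and handlers for
    documents too big to hold in memory at once.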
    --
    Michel Rodriguez
    Perl & XML
    xmltwig.com
    Michel Rodriguez, Oct 3, 2005
    #4
  5. Guest

    What I'm using:

    XML::Xerces;
    XML::Parser::Expat;
    XML::Simple;
    , Oct 4, 2005
    #5
  6. Guest

    On 29 Sep 2005 17:51:53 GMT, wrote:

    >I'm currently using XML::Parser to process some XML. I am wondering what
    >else is out there. A search of CPAN on 'XML' reveals that there is way
    >more out there than I can possibly evaluate, but that most of it is not
    >general XML processing modules, but rather for very specific tasks which
    >just happen to involve XML.
    >
    >I'd like to be familiar with the major alternatives to XML::Parser, without
    >having to read 3689 different perldoc pages to find those major
    >alternatives. So, what are your favorite Perl modules for general-purpose
    >XML processing?
    >
    >Thanks,
    >
    >Xho

    I guess I'll try to revisit your post, even though it's 2 weeks later.
    It isn't possible to know from your question what would be good for
    you. I use multiple methods to manipulate XML. In Perl, that's just the
    way it is right now. There's simple XML and there is complex "nested
    entities" XML. Knowing what it is you're trying to do helps narrow
    things down. In reality, there is no "one source" XML solution.
    Some of the basic concerns are whether you're reading or writing,
    whether you need validation, and whether you want DOM and/or SAX. The
    current trend for reading is SAX, a "roll your own" event-driven
    solution, as opposed to the "node" approach of a DOM.
    With SAX-style parsing (XML::Parser; I use Expat, which is the
    lower-level layer it is built on, I think) you can set handlers for
    start tags (with attributes) and end tags, as well as content handlers
    and handlers for every W3C entity you wish. Be careful: this is only
    easy to process with simple closures when the entities are non-nested.
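
    To make the handler idea concrete, here is a minimal sketch using
    XML::Parser's own event interface (strictly speaking an Expat callback
    API rather than SAX proper). The sample document and the @events array
    are made up for the illustration.

```perl
use strict;
use warnings;
use XML::Parser;

my @events;    # hypothetical log of what the handlers saw

my $parser = XML::Parser->new(
    Handlers => {
        # Called for each opening tag, with its attributes as name/value pairs.
        Start => sub {
            my ($expat, $tag, %attr) = @_;
            push @events,
                "start $tag" . join('', map { " $_=$attr{$_}" } sort keys %attr);
        },
        # Called for each closing tag.
        End => sub {
            my ($expat, $tag) = @_;
            push @events, "end $tag";
        },
        # Called for character data; whitespace-only chunks are skipped here.
        Char => sub {
            my ($expat, $text) = @_;
            push @events, "text $text" if $text =~ /\S/;
        },
    },
);

$parser->parse('<item id="7"><name>widget</name></item>');
print "$_\n" for @events;
```

    The handlers are ordinary closures, so any state they need (here just
    @events) has to live in variables they close over.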
    For example, let's say a known, compound (nested) structure is
    coming down the pipe. After you start filling the container, you know
    to expect a certain inner container sequence such as
    <BigTag><tag1>content</tag1><tag2>content</tag2>
    <tag3>content</tag3><tag4>content</tag4></BigTag>.
    As soon as the "BigTag" start handler is triggered, you start
    accumulating everything between <BigTag> and </BigTag>. When you have
    the string, you pass it to XML::Simple to create a hash that gets
    embedded into your container structure, the "tag#" being the key and
    the content the value.
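
    The accumulate-then-convert approach just described could be sketched
    roughly as follows. This is one possible reading, not the poster's
    actual code: the <feed> wrapper, the sample data and the variable names
    are assumptions, and for simplicity the chunks are handed to XML::Simple
    after the outer parse finishes rather than inside the End handler.

```perl
use strict;
use warnings;
use XML::Parser;
use XML::Simple qw(XMLin);

# Hypothetical input: two flat <BigTag> records inside a made-up <feed> wrapper.
my $xml = '<feed>'
        . '<BigTag><tag1>a</tag1><tag2>b</tag2></BigTag>'
        . '<BigTag><tag1>c</tag1><tag2>d</tag2></BigTag>'
        . '</feed>';

my @chunks;    # raw XML text of each completed <BigTag>
my $buffer;    # text of the <BigTag> being collected; undef when outside one

# Caveat: a self-closing tag such as <tag1/> fires both Start and End with
# the same original_string, so this sketch would append it twice.
my $parser = XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($expat, $tag) = @_;
            $buffer = '' if $tag eq 'BigTag';    # begin accumulating
            $buffer .= $expat->original_string if defined $buffer;
        },
        Char => sub {
            my ($expat) = @_;
            $buffer .= $expat->original_string if defined $buffer;
        },
        End => sub {
            my ($expat, $tag) = @_;
            return unless defined $buffer;
            $buffer .= $expat->original_string;
            if ($tag eq 'BigTag') {              # record is complete
                push @chunks, $buffer;
                undef $buffer;
            }
        },
    },
);
$parser->parse($xml);

# Each chunk becomes a hash: "tag#" is the key, the content is the value.
my @records = map { XMLin($_) } @chunks;
print join(', ', map { "$_=$records[0]{$_}" } sort keys %{ $records[0] }), "\n";
```

    This only works when <BigTag> elements do not nest inside one another,
    which matches the "known structure" assumption above.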
    So, you have to break it up this way and use what is known. XML::Simple
    will work on simple XML (non-nested entities) to get what you're
    interested in. XML::Simple has to be tweaked too.
    While all this is going on, parser calls have to be wrapped in evals
    and the result checked, to trap errors.
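
    XML::Parser dies on malformed input, so the eval wrapping mentioned
    above might look something like this minimal sketch. The parse_ok
    helper is a made-up name for the illustration.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical helper: returns (1, '') if $xml is well-formed,
# or (0, $error_message) if the parser died.
sub parse_ok {
    my ($xml) = @_;
    my $parser = XML::Parser->new;
    # The trailing 1 makes the eval return true only if parse() did not die.
    my $ok = eval { $parser->parse($xml); 1 };
    return (1, '') if $ok;
    return (0, $@ || 'unknown parse error');
}

for my $xml ('<a><b/></a>', '<a><b></a>') {
    my ($ok, $err) = parse_ok($xml);
    print $ok ? "well-formed: $xml\n" : "rejected: $xml\n  $err\n";
}
```

    Checking the eval's return value rather than $@ alone avoids the case
    where $@ gets cleared before it is inspected.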
    For casual reading of "unknown" XML for display purposes, use
    a Microsoft browser (it doesn't use Perl).
    I use ActiveState's Perl with Expat, XML::Simple and Xerces (Apache).
    I also use them with Perl2Exe (-tiny) in a commercial-level app.
    So I guess you have to ask yourself what it is exactly you want
    to do. Perl and XML don't fit hand in glove; if you're looking for
    something "quick & dirty" you will not find the solution in Perl.
    Hope this helps a little....
    , Oct 22, 2005
    #6
  7. Guest

    Followup: you might find the "quick & dirty" in Perl, but it
    won't be the complete answer. I have found the complete "read"
    answer in Perl, including validation (using Xerces). I would
    only use Xerces in the "write" answer, and not XML::Simple or
    any other solution. To use Xerces, you have to scour the
    C++ source code, and it's hit and miss, mostly miss. As
    of this date, schema checking works excellently using the Xerces
    .pm interface. It took me so much time to test the interface
    (but only as it pertains to schema) that I may revisit the C protos
    and publish an actual implementation (where the author just used a
    bot creator).
    , Oct 22, 2005
    #7
