XML-type parser in the standard Perl installation?

Abhinav · May 27, 2004

Hi

I have a script where some chunks of text are marked between XML-type
tags.

I say 'xml-type' instead of xml as the tags are preceded with a comment
character "# " so that the script does not fail.

I need to be able to extract the data between tags (which can be
nested), and store it in a hash with each key being the tag itself and
the value, the data in between (it is multiline).

The problem is that I initially tried using Text::Balanced, but gave up
since it was too demanding for this kind of work (the text spans
multiple lines).

I am thinking of stripping the # from all tagged lines so that it
becomes an XML file, adding a root element (which was not present
before), and then using an XML parser.

My questions:
1. Is the approach feasible, or is there some other, simpler way to do it
(after all, TIMTOWTDI)?
2. If the above is the optimal solution, is there any parser/module
shipped with the standard Perl (5.8) distribution?

Many thanks ..
Abhinav

John Bokma · May 28, 2004

Abhinav said:
Hi

I have a script where some chunks of text are marked between XML-type
tags.

I say 'xml-type' instead of xml as the tags are preceded with a comment
character "# " so that the script does not fail.

Why not put the XML at the end, after __END__ and read it using <DATA>?

Abhinav said:
I need to be able to extract the data between tags (which can be
nested), and store it in a hash with each key being the tag itself and
the value, the data in between (it is multiline).

Or open your script as a file, and read the #'s and throw away real
comments (you can use ## for real ones for example), and parse the
result. But I recommend __END__
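
A minimal sketch of the "read the #'s" idea, using only core Perl. It assumes tag lines look like "# <tag>", real comments start with "##", and every other line is data; the function name script_to_xml, the default root name, and the sample script lines are hypothetical. Note that the data is not escaped, so a literal < or & in the script would still trip a real XML parser.

```perl
use strict;
use warnings;

# Turn a script with "# <tag>" marker lines into an XML string.
# Lines starting with "##" are treated as real comments and dropped;
# all other lines are kept unchanged as data between the tags.
sub script_to_xml {
    my ($text, $root) = @_;
    $root = 'root' unless defined $root;    # assumed root element name
    my @out;
    for my $line (split /\n/, $text) {
        next if $line =~ /^\s*##/;          # real comment: skip it
        $line =~ s/^\s*#\s*(?=<)//;         # strip "#" only in front of a tag
        push @out, $line;
    }
    return "<$root>\n" . join("\n", @out) . "\n</$root>\n";
}

# Made-up sample input, for illustration only.
my $script = join "\n",
    '## a real comment',
    '# <setup>',
    'click("OK");',
    '# </setup>';
print script_to_xml($script);
```

The resulting string has a single root element, so it can be handed to whichever XML parser is installed.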

The problem is that I initially tried using Text::Balanced, but gave up
since it was too demanding for this kind of work (the text spans
multiple lines).

I am thinking of stripping the # from all tagged lines so that it
becomes an XML file, adding a root element (which was not present
before), and then using an XML parser.

Yup, good idea :-D.

My questions:
1. Is the approach feasible, or is there some other, simpler way to do it
(after all, TIMTOWTDI)?

use __END__

2. If the above is the optimal solution, is there any parser/module
shipped with the standard Perl (5.8) distribution?

Yes, but I like XML::Twig a lot ;-) Have a look at it.

http://xmltwig.com/xmltwig/

Other pointers:

http://www.xml.com/pub/a/2000/04/05/feature/index.html
http://perl-xml.sourceforge.net/faq/

chanio · May 28, 2004

John Bokma (comp.lang.perl.misc) said:

Or open your script as a file, and read the #'s and throw away real
comments (you can use ## for real ones for example), and parse the
result. But I recommend __END__

Yup, good idea :-D.

use __END__

Yes, but I like XML::Twig a lot ;-) Have a look at it.

http://xmltwig.com/xmltwig/

Other pointers:

http://www.xml.com/pub/a/2000/04/05/feature/index.html
http://perl-xml.sourceforge.net/faq/

Read each line without the leading # and load it all into a scalar. Then
pass it to XMLin, which returns the parsed XML, and use XMLout if you need
to write XML back out. (XMLin and XMLout belong to XML::Simple; XML::Twig
offers a comparable simplify method.) Read the documentation carefully,
since there are many options that control how the XML is interpreted; it
takes some trial and error to get the best result. You then have
everything inside a hash reference (with array references inside).
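
If installing an XML module is not an option, the same kind of hash can be sketched with core Perl alone. This is only an illustration under stated assumptions: each tag sits on its own line as "# <tag>" or "# </tag>", tags carry no attributes, and extract_tagged is a hypothetical name. Text inside a nested tag is credited to every enclosing tag as well.

```perl
use strict;
use warnings;

# Collect the text between "# <tag>" and "# </tag>" lines into a hash,
# keyed by tag name. A stack of open tags handles nesting.
sub extract_tagged {
    my ($text) = @_;
    my (%data, @open);
    for my $line (split /\n/, $text) {
        if ($line =~ m{^\s*#\s*<(/?)([\w:.-]+)>\s*$}) {
            my ($closing, $tag) = ($1, $2);
            if (!$closing) {
                push @open, $tag;                       # entering a tag
                $data{$tag} = '' unless exists $data{$tag};
            }
            else {
                my $top = pop @open;                    # leaving a tag
                die "Mismatched </$tag>\n"
                    unless defined $top && $top eq $tag;
            }
            next;
        }
        # Ordinary line: append it to every tag that is currently open.
        $data{$_} .= "$line\n" for @open;
    }
    die "Unclosed <$open[-1]>\n" if @open;
    return \%data;
}

# Made-up sample input, for illustration only.
my $h = extract_tagged(join "\n",
    '# <test>', 'line 1', '# <step>', 'line 2', '# </step>', '# </test>');
print "$_ => $h->{$_}" for sort keys %$h;
```

Here $h->{step} holds "line 2\n" and $h->{test} holds "line 1\nline 2\n". A tag that appears more than once has its chunks concatenated under the same key, which may or may not be what you want.');
die "step wrong" unless $r->{step} eq "line 2\n";
die "test wrong" unless $r->{test} eq "line 1\nline 2\n";
die "key count wrong" unless scalar(keys %$r) == 2;
my $ok = eval { extract_tagged("# <a>\nx\n# </b>"); 1 };
die "mismatch not detected" if $ok;
</test>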

--
.------------------. -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.
| ___ _ _ _ _ | ALBERTO ADRIAN SCHIANO - ARGENTINA - 2004
|/ __/ | \ || | | | <[email protected]> # 34-34S 058-25W(z-3)
|||_< \| || ' | | +------------+------------------------------
|`____/|_\_|`___' | LINUX COUNTER: 240 133 ~ machine : 119 401
| _ _ _ __ _ | +------------+----------+-------------------
|| | | \ |\ \/ | AMD Athlon 6 |RAM 512Mb.|krnl.: 2.6.3-10mdk
|| |_ | | \ \ | i586-mandrake-linux-gnu |MDK 9.2 - KDE 3.13
||___||_\_|_/\_\ | +-----------------------+-------------------
| __ __ ___ _ _ | Maxtor #4D040H2 32Gb. |DISPLAY_VGA SiS 630
|| \ \| . \| / | ------------------------+--+----------------
|| || | || \ | PCI Audio snd-trident 7018 | ViewSonic E771
||_|_|_||___/|_\_ | ---------------------------+----------------
| | http://perlmonks.org/index.pl?node_id=245320
'------------------' -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.

Abhinav · May 28, 2004

John said:
Why not put the XML at the end, after __END__ and read it using <DATA>?

Hi John,
Thanks! I was not clear when I said "I have a script". I actually
meant that I have a WinRunner script, not a Perl script, in which I wanted
to put these tags (so as to extract info from the WinRunner script
using a Perl script).

Or open your script as a file, and read the #'s and throw away real
comments (you can use ## for real ones for example), and parse the
result. But I recommend __END__

Yup, good idea :-D.

use __END__

Yes, but I like XML::Twig a lot ;-) Have a look at it.

http://xmltwig.com/xmltwig/

Other pointers:

http://www.xml.com/pub/a/2000/04/05/feature/index.html
http://perl-xml.sourceforge.net/faq/

Thanks, that gives me enough to do for now.

Anyway, good to know
that the approach I want to use finds acceptance.

Regards
AB
