Pattern matching

Discussion in 'Perl Misc' started by Deepan Perl XML Parser, Mar 25, 2008.

  1. Hi all,
    I have a file like the one below:

    <?xml version="1.0" encoding="UTF-8"?>
    <log xmlns="http://www.httpwatch.com/xml/log/5.1">
    <entry method="GET" URL="http://www.google.com/sa/frame_main.cgi">
    ..
    ..
    ..
    ..
    -------some text-------------
    ..
    ..
    ..
    </entry>
    <entry method="GET" URL="http://www.toogle.com/framer/main.cgi">
    ..
    ..
    ..
    ..
    -------some text-------------
    ..
    ..
    ..
    </entry>
    <entry method="GET" URL="http://www.google.com/sa/frame_main.html">
    ..
    ..
    ..
    ..
    -------some text-------------
    ..
    ..
    ..
    </entry>
    <page id="page_0" title="Sustaining Portal" dynamic="true"
    unknown="false">
    <started>00:00:00.000</started>
    <startedDateTime>2008-03-25T09:52:12.791</startedDateTime>
    </page>
    <page id="page_1" title="Sustaining Portal" dynamic="true"
    unknown="false">
    <started>00:00:08.455</started>
    <startedDateTime>2008-03-25T09:52:21.246</startedDateTime>
    </page>
    <page id="page_2" title="Sustaining Portal" dynamic="true"
    unknown="false">
    <started>00:00:20.296</started>
    <startedDateTime>2008-03-25T09:52:33.087</startedDateTime>
    </page>
    <page id="page_3" title="Sustaining Portal" dynamic="true"
    unknown="false">
    <started>00:00:29.848</started>
    <startedDateTime>2008-03-25T09:52:42.639</startedDateTime>
    </page>
    </log>

    ----------------------------------------------------------------------------------

    Now, how do I get all of those <entry ...> start tags into an array?
    I mean getting

    <entry method="GET" URL="http://www.google.com/sa/frame_main.cgi">
    <entry method="GET" URL="http://www.toogle.com/framer/main.cgi">
    <entry method="GET" URL="http://www.google.com/sa/frame_main.html">

    into some array.

    Thanks,
    Deepan
     
    Deepan Perl XML Parser, Mar 25, 2008
    #1

  2. On Mar 25, 9:50 am, Deepan Perl XML Parser <> wrote:
    > Hi all,
    > I am having a file like below:
    > [...]
    > Now how to get all those <entry ....> tags into an array? I mean
    > getting
    >
    > <entry method="GET" URL="http://www.google.com/sa/
    > frame_main.cgi">
    > <entry method="GET" URL="http://www.toogle.com/framer/
    > main.cgi">
    > <entry method="GET" URL="http://www.google.com/sa/
    > frame_main.html">
    >
    > into some array.
    >
    > Thanks,
    > Deepan


    while($string =~ m#<entry method=.* URL="http://(.*)">#g)
    {
    ..................
    ..................
    }

    I am able to do this using the expression above. Is there a better
    way of doing it?
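
One possible variation, as a minimal sketch rather than a definitive answer: in list context, a m//g match returns every capture at once, so the start tags can be assigned straight to an array. The sample data and the name @entries are assumptions made for illustration. The pattern also assumes that no attribute value contains a literal ">".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input modelled on the log file in the first post.
my $string = <<'XML';
<log xmlns="http://www.httpwatch.com/xml/log/5.1">
<entry method="GET" URL="http://www.google.com/sa/frame_main.cgi">
-------some text-------------
</entry>
<entry method="GET" URL="http://www.toogle.com/framer/main.cgi">
-------some text-------------
</entry>
</log>
XML

# In list context m//g returns all captured substrings.
# [^>]* stops at the end of the start tag, where a greedy .* could run on.
my @entries = $string =~ m{(<entry\b[^>]*>)}g;

print "$_\n" for @entries;
```

This only collects the raw tag text. It does not validate the document or decode entities, which is where a real parser helps.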
     
    Deepan Perl XML Parser, Mar 25, 2008
    #2

  3. Deepan Perl XML Parser <> wrote:
    > I am having a file like below:
    ><?xml version="1.0" encoding="UTF-8"?>

    [...]
    >Now how to get all those <entry ....> tags into an array? I mean


    You would use an XML parser to parse XML.
    Has nothing to do with pattern matching at all.
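
As a rough sketch of what that could look like, assuming XML::LibXML is installed: the snippet below collects the URL attribute of every entry element. The sample document is a shortened, hypothetical version of the posted file, and the prefix "hw" and the variable names are arbitrary choices for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Shortened, hypothetical version of the file from the first post.
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<log xmlns="http://www.httpwatch.com/xml/log/5.1">
<entry method="GET" URL="http://www.google.com/sa/frame_main.cgi">some text</entry>
<entry method="GET" URL="http://www.toogle.com/framer/main.cgi">some text</entry>
</log>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# The elements are in a default namespace, so XPath needs a prefix bound to it.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(hw => 'http://www.httpwatch.com/xml/log/5.1');

# One array element per <entry>, holding its URL attribute.
my @urls = map { $_->getAttribute('URL') } $xpc->findnodes('//hw:entry');

print "$_\n" for @urls;
```

The namespace registration is the step that is easy to miss: without it, an XPath such as //entry finds nothing in this document.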

    jue
     
    Jürgen Exner, Mar 25, 2008
    #3
  4. On Mar 25, 10:40 am, Jürgen Exner <> wrote:
    > Deepan Perl XML Parser <> wrote:
    >
    > > I am having a file like below:
    > ><?xml version="1.0" encoding="UTF-8"?>

    > [...]
    > >Now how to get all those <entry ....> tags into an array? I mean

    >
    > You would use an XML parser to parse XML.
    > Has nothing to do with pattern matching at all.
    >
    > jue


    No, I am writing my own XML parser.
     
    Deepan Perl XML Parser, Mar 25, 2008
    #4
  5. On Mar 25, 7:13 pm, Lawrence Statton <> wrote:
    > Deepan Perl XML Parser <> writes:
    >
    >
    >
    > > No, I am writing my own XML parser.

    >
    > Don't. There are many good XML parsers out there, the world doesn't
    > need another one.
    >
    > --
    > Lawrence Statton - s/aba/c/g
    > Computer software consists of only two components: ones and
    > zeros, in roughly equal proportions. All that is required is to
    > place them into the correct order.


    Okay, then can you name any parsers that would get the CDATA section?
     
    Deepan Perl XML Parser, Mar 26, 2008
    #5
  6. On 2008-03-26 04:00, Deepan Perl XML Parser <> wrote:
    > On Mar 25, 7:13 pm, Lawrence Statton <> wrote:
    >> Deepan Perl XML Parser <> writes:
    >>
    >> > No, I am writing my own XML parser.

    >>
    >> Don't. There are many good XML parsers out there, the world doesn't
    >> need another one.

    >
    > Okay, then can you name any parsers that would get the CDATA section?


    Which one doesn't?

    LibXML certainly does (I just tested it). I think expat does, too.
    I have my doubts about the pure perl XML parser, but that has a lot of
    other problems too and shouldn't be used.
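
A minimal sketch of such a test, assuming XML::LibXML is installed (the one-line sample document is made up for illustration): textContent returns the character data of an element, including whatever sits inside a CDATA section.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical one-line document with a CDATA section.
my $xml = '<script><![CDATA[if (a < b && a < 0) { return 1; }]]></script>';

my $doc = XML::LibXML->load_xml(string => $xml);

# textContent concatenates all text below the element, CDATA included.
my $code = $doc->documentElement->textContent;

print "$code\n";
```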

    hp
     
    Peter J. Holzer, Mar 26, 2008
    #6
  7. On Mar 26, 5:31 pm, "Peter J. Holzer" <> wrote:
    > On 2008-03-26 04:00, Deepan Perl XML Parser <> wrote:
    >
    > > On Mar 25, 7:13 pm, Lawrence Statton <> wrote:
    > >> Deepan Perl XML Parser <> writes:

    >
    > >> > No, I am writing my own XML parser.

    >
    > >> Don't. There are many good XML parsers out there, the world doesn't
    > >> need another one.

    >
    > > Okay, then can you name any parsers that would get the CDATA section?

    >
    > Which one doesn't?
    >
    > LibXML certainly does (I just tested it). I think expat does, too.
    > I have my doubts about the pure perl XML parser, but that has a lot of
    > other problems too and shouldn't be used.
    >
    > hp


    This one, XML::Parser, doesn't. It just signals where a CDATA section
    starts and ends. It is not possible to get the data inside it using
    this module.
     
    Deepan Perl XML Parser, Mar 27, 2008
    #7
  8. On 2008-03-27 04:32, Deepan Perl XML Parser <> wrote:
    > On Mar 26, 5:31 pm, "Peter J. Holzer" <> wrote:
    >> On 2008-03-26 04:00, Deepan Perl XML Parser <> wrote:
    >> > On Mar 25, 7:13 pm, Lawrence Statton <> wrote:
    >> >> Deepan Perl XML Parser <> writes:

    >>
    >> >> > No, I am writing my own XML parser.

    >>
    >> >> Don't. There are many good XML parsers out there, the world doesn't
    >> >> need another one.

    >>
    >> > Okay, then can you name any parsers that would get the CDATA section?

    >>
    >> Which one doesn't?
    >>
    >> LibXML certainly does (I just tested it). I think expat does, too.
    >> I have my doubts about the pure perl XML parser, but that has a lot of
    >> other problems too and shouldn't be used.

    >
    > This one, XML::Parser, doesn't. It just signals where a CDATA section
    > starts and ends. It is not possible to get the data inside it using
    > this module.


    Works for me:

    chronos:/wsrdb/users/hjp/tmp 20:47 112% cat foo.xml
    <script>
    <![CDATA[
    function matchwo(a,b)
    {
    if (a < b && a < 0) then
    {
    return 1;
    }
    else
    {
    return 0;
    }
    }
    ]]>
    </script>
    chronos:/wsrdb/users/hjp/tmp 20:47 113% cat foo
    #!/usr/bin/perl
    use XML::Simple;
    use Data::Dumper;

    $x = XMLin($ARGV[0]);
    print Dumper $x;
    chronos:/wsrdb/users/hjp/tmp 20:47 114% export XML_SIMPLE_PREFERRED_PARSER=XML::Parser
    chronos:/wsrdb/users/hjp/tmp 20:47 115% ./foo foo.xml
    $VAR1 = '

    function matchwo(a,b)
    {
    if (a < b && a < 0) then
    {
    return 1;
    }
    else
    {
    return 0;
    }
    }

    ';
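
The same text should also be reachable through XML::Parser's own handlers, without XML::Simple in between. What follows is a minimal sketch, not a definitive implementation: it assumes XML::Parser is installed, and the sample document and the names $in_cdata and $cdata are made up for illustration. The CdataStart and CdataEnd handlers only mark the boundaries, while the text itself arrives through the Char handler.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical document: ordinary text around a CDATA section.
my $xml = '<script>before<![CDATA[if (a < b && a < 0) { return 1; }]]>after</script>';

my $in_cdata = 0;    # true while the parser is inside a CDATA section
my $cdata    = '';   # accumulated CDATA text

my $parser = XML::Parser->new(
    Handlers => {
        CdataStart => sub { $in_cdata = 1 },
        CdataEnd   => sub { $in_cdata = 0 },
        # Char may be called several times for one run of text, so append.
        Char => sub {
            my ($expat, $text) = @_;
            $cdata .= $text if $in_cdata;
        },
    },
);

$parser->parse($xml);

print "$cdata\n";
```

Tracking a flag between the two boundary handlers is what separates the CDATA text from the ordinary character data around it.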

    hp
     
    Peter J. Holzer, Mar 27, 2008
    #8