RE Perl Pattern matching

Discussion in 'Perl Misc' started by Deepan Perl XML Parser, Apr 2, 2008.

  1. Hi,
    I have a string, $str, whose value is shown
    below:

    <responseStatus>HTTP/1.1 200 OK</responseStatus>

    <cookies>

    <cookie name="ASPSESSIONIDSQDCBDBA" path="/" domain="www-int.juniper.net">DOCFGJEAKNOMBLHCGEMOIMBA</cookie>

    </cookies>

    <headers>

    <header name="Cache-control">private</header>

    <header name="Content-Encoding">deflate</header>

    <header name="Content-Type">text/html</header>

    <header name="Date">Wed, 26 Mar 2008 04:48:16 GMT</header>

    <header name="Server">Concealed by Juniper Networks Redline EX</header>

    <header name="Set-Cookie">ASPSESSIONIDSQDCBDBA=DOCFGJEAKNOMBLHCGEMOIMBA; path=/</header>

    <header name="Transfer-Encoding">chunked</header>

    <header name="Vary">Accept-Encoding, User-Agent</header>

    <header name="Via">1.1 sac-p-green-dx2 (Juniper Networks
    Application Acceleration Platform - DX 5.1.8 0)</header>

    <header name="Warning">214 www-int.juniper.net &quot;Juniper
    Networks DX Active&quot;</header>

    <header name="X-Powered-By">ASP.NET</header>

    </headers>

    <content>

    <contentLength>27887</contentLength>

    <compression>71.3</compression>

    <encodingScheme>deflate</encodingScheme>

    <text><![CDATA[
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"..."http://
    www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">..<html>..<head>....<title>
    Intranet Home Page</title>..<script language="JavaScript" type="text/
    javascript">..function clicker()..{..document.seek2.qt.value =
    document.seek1.qt.value;..return true;..}</form>.. <!-- close Main2 --
    >..</div><!-- close Main1 -->....</body>..</html>..

    ]]></text>

    <mimeType>text/html</mimeType>

    </content>

    ----------------

    Now I want to get everything between "<text><![CDATA[" and
    "]]></text>" (i.e. I need to capture the CDATA section), and I am
    using the code below:

    if( $str =~ m#<text><!\[CDATA\[(.*)\]\]></text># )
    {
        print $1;
    }


    But I am not getting anything. Can anyone find the fault in it?
    Deepan Perl XML Parser, Apr 2, 2008
    #1

  2. Deepan Perl XML Parser

    Ben Bullock Guest

    On Apr 2, 2:23 pm, Deepan Perl XML Parser <> wrote:

    > if( $str =~ m#<text><!\[CDATA\[(.*)\]\]></text># )
    > {
    > print $1;
    >
    > }
    >
    > But not getting anything. Can anyone find out the fault in it?


    You need an "s" modifier at the end, so that "." also matches newline characters:

    if( $str =~ m#<text><!\[CDATA\[(.*)\]\]></text>#s )

    See http://perldoc.perl.org/perlre.html#Modifiers
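
As a minimal sketch of what the /s modifier changes (the short $str below is a made-up stand-in for the string in the question, not the original data, and the variable names are only illustrative):

```perl
use strict;
use warnings;

# Hypothetical, shortened stand-in for the $str in the question.
my $str = "<text><![CDATA[\nline one\nline two\n]]></text>";

# Without /s, "." does not match "\n", so (.*) cannot cross line breaks.
my $without_s = $str =~ m#<text><!\[CDATA\[(.*)\]\]></text>#  ? 1 : 0;

# With /s, "." matches any character, including "\n".
my $with_s    = $str =~ m#<text><!\[CDATA\[(.*)\]\]></text>#s ? 1 : 0;
my $captured  = $with_s ? $1 : undef;

print "without /s: $without_s\n";
print "with /s:    $with_s\n";
```

The first match fails because the CDATA body starts with a line break; the second succeeds and leaves the body, line breaks included, in $captured.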
    Ben Bullock, Apr 2, 2008
    #2

  3. On Apr 2, 10:30 am, Ben Bullock <> wrote:
    > On Apr 2, 2:23 pm, Deepan Perl XML Parser <> wrote:
    >
    > > if( $str =~ m#<text><!\[CDATA\[(.*)\]\]></text># )
    > > {
    > > print $1;

    >
    > > }

    >
    > > But not getting anything. Can anyone find out the fault in it?

    >
    > You need an "s" at the end:
    >
    > if( $str =~ m#<text><!\[CDATA\[(.*)\]\]></text>#s )
    >
    > See http://perldoc.perl.org/perlre.html#Modifiers


    Thank You Ben!
    Deepan Perl XML Parser, Apr 2, 2008
    #3
  4. Deepan Perl XML Parser

    Mirco Wahab Guest

    Re: Perl Pattern matching

    Deepan Perl XML Parser wrote:
    > Now i want to get everything between "<text><![CDATA[" and "]]></
    > text>" [ie i need to capture the CDATA section]and i am using the
    > below code
    >
    > if( $str =~ m#<text><!\[CDATA\[(.*)\]\]></text># )
    > {
    > print $1;
    > }


    Your expression is perfectly valid (apart from the missing /s
    modifier), but I'd like to add a remark. You could also strip
    the surrounding newline characters (if any) and extract more
    than one CDATA section, something like:

    my $reg = qr{
        <text>                 # find a <text> section
        <!\[CDATA\[ [\r\n]?    # that opens a CDATA section; skip one leading line break
        (.+?)                  # capture the CDATA contents (non-greedy)
        [\r\n]?\]\]>           # skip one trailing line break, then the CDATA terminator
        </text>                # and make sure the <text> is closed properly
    }sx;

    print $1 while $str =~ /$reg/g; # extract each CDATA section
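
A small sketch of that pattern run against input with two sections, collecting the captures in an array instead of printing them. The two-section $str and the @sections array are assumptions made up for illustration, not data from the thread:

```perl
use strict;
use warnings;

# Hypothetical input with two <text> CDATA sections (not from the thread).
my $str = "<text><![CDATA[\nfirst\n]]></text>\n"
        . "<text><![CDATA[\nsecond\nblock\n]]></text>\n";

my $reg = qr{
    <text>
    <!\[CDATA\[ [\r\n]?   # opening marker, optional leading line break
    (.+?)                 # non-greedy, so it stops at the first terminator
    [\r\n]?\]\]>          # optional trailing line break, then terminator
    </text>
}sx;

# With /g in scalar context, each iteration resumes after the previous match.
my @sections;
push @sections, $1 while $str =~ /$reg/g;

print scalar(@sections), " sections\n";
print "[$_]\n" for @sections;
```

The non-greedy (.+?) is what keeps the two sections apart; a greedy (.+) with /s would run from the first opening marker to the last terminator and return a single capture. Note that [\r\n]? removes only one character, so with CRLF line endings a stray "\r" or "\n" would remain in the capture.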

    Regards

    M.
    Mirco Wahab, Apr 2, 2008
    #4
  5. Deepan Perl XML Parser

    Ben Bullock Guest

    On Wed, 02 Apr 2008 10:53:34 -0500, Chris Mattern wrote:


    > You're trying to parse XML with regular expressions. Don't do that.
    > Perl has a large selection of excellent modules for processing XML. Use
    > them.


    Chris, do you talk like that to people in real life, or is it just the
    internet?
    Ben Bullock, Apr 2, 2008
    #5
  6. >>>>> "BB" == Ben Bullock <> writes:

    BB> On Wed, 02 Apr 2008 10:53:34 -0500, Chris Mattern wrote:
    >> You're trying to parse XML with regular expressions. Don't do
    >> that. Perl has a large selection of excellent modules for
    >> processing XML. Use them.


    BB> Chris, do you talk like that to people in real life, or is it
    BB> just the internet?

    When you've said the same thing over and over to people who aren't
    getting it, there is a clear temptation to speak slowly, with short
    sentences and short words.

    Charlton


    --
    Charlton Wilbur
    Charlton Wilbur, Apr 3, 2008
    #6
  7. On Wed, 02 Apr 2008 22:16:09 +0000, Ben Bullock wrote:

    > On Wed, 02 Apr 2008 10:53:34 -0500, Chris Mattern wrote:
    >
    >
    >> You're trying to parse XML with regular expressions. Don't do that.
    >> Perl has a large selection of excellent modules for processing XML. Use
    >> them.

    >
    > Chris, do you talk like that to people in real life, or is it just the
    > internet?


    I do. Even (especially?) if someone is new around here and is making a
    mistake thousands have made before.

    M4
    Martijn Lievaart, Apr 9, 2008
    #7
