regex problem

Discussion in 'Perl Misc' started by maheshpop1@gmail.com, May 4, 2006.

  1. Guest

    Hi folks,

    I have a small regex problem and need a little help with the solution.
    I tried, and I might have missed something here.

    STRING:
    O 2005-06-14 14:43 +0000 pop commain/com/testdata =All/com/testdata=
    /home/user/temp/All/testdata/

    Here is the regex
    if ($line =~
    /(^[O])[\s]*([^+]+)[+][0]+[\s]*([\w]+)[\s]*([^=]+)[\s]*=([^=]+)[\s]*=[\s]*([^
    ]+)/ ))

    I am trying to capture the data in the parentheses, but it doesn't
    seem to work.
    Could someone take a look at this? Perhaps a fresh pair of eyes will
    catch something that I missed.

    cheers
    POP.
    , May 4, 2006
    #1

  2. wrote:
    > STRING:
    > O 2005-06-14 14:43 +0000 pop commain/com/testdata =All/com/testdata=
    > /home/user/temp/All/testdata/

    [...]
    > I am trying to capture the data in the parentheses.


    I don't see any parentheses in your sample data.

    jue
    Jürgen Exner, May 4, 2006
    #2

  3. Guest

    Oh sorry,

    Here is the sample data with the parentheses.

    STRING:

    (O) (2005-06-14) 14:43 +0000 (pop) (commain/com/testdata)
    =(All/com/testdata)=
    (/home/user/temp/All/testdata/)

    Here is the regex
    if ($line =~
    /(^[O])[\s]*([^+]+)[+][0]+[\s]*([\w]+)[\s]*([^=]+)[\s]*=([^=]+)[\s]*=[\s]*([^
    ]+)/ ))

    cheers
    POP
    , May 4, 2006
    #3
  4. Paul Lalli Guest

    wrote:
    > Hi folks,
    >
    > I have a small regex problem and need a little help with the solution.
    > I tried, and I might have missed something here.
    >
    > STRING:
    > O 2005-06-14 14:43 +0000 pop commain/com/testdata =All/com/testdata=
    > /home/user/temp/All/testdata/
    >
    > Here is the regex
    > if ($line =~
    > /(^[O])[\s]*([^+]+)[+][0]+[\s]*([\w]+)[\s]*([^=]+)[\s]*=([^=]+)[\s]*=[\s]*([^
    > ]+)/ ))



    That regular expression is insanely unreadable. WHY on earth is every
    single token inside a character class? I see exactly four of those [ ]
    that are actually needed - the ones that start with a ^. Get rid of
    every other [ ] in that regular expression, and then reformat it using
    the /x modifier and some prudent use of whitespace. If you haven't
    figured out your problem after that, post the modified version.
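
    As a rough sketch of what that advice might produce: the same pattern
    with only the four negated classes kept, laid out with /x. The sample
    line is assumed to be a single physical line, and the comments on each
    piece are guesses at what the fields mean.

```perl
use strict;
use warnings;

# Sample record from the original post, assumed to be one physical line.
my $line = 'O 2005-06-14 14:43 +0000 pop commain/com/testdata '
         . '=All/com/testdata= /home/user/temp/All/testdata/';

my @captured;
if ($line =~ /
        ^(O)      \s*   # record type
        ([^+]+)         # date and time, up to the plus sign
        \+ 0+     \s*   # literal plus, then the zeros
        (\w+)     \s*   # user name
        ([^=]+)   \s*   # everything up to the first equals sign
        = ([^=]+) \s*   # text between the two equals signs
        = \s* ([^ ]+)   # final path
    /x) {
    @captured = ($1, $2, $3, $4, $5, $6);
    print "[$_]\n" for @captured;
}
```

    Note that with this layout the second and fourth captures keep a
    trailing space, because [^+]+ and [^=]+ happily match whitespace.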

    Paul Lalli
    Paul Lalli, May 4, 2006
    #4
  5. Guest

    wrote:


    > Here is the regex
    > if ($line =~
    > /(^[O])[\s]*([^+]+)[+][0]+[\s]*([\w]+)[\s]*([^=]+)[\s]*=([^=]+)[\s]*=[\s]*([^
    > ]+)/ ))
    >


    I don't understand why you have all the open / close square brackets.

    To match any number of spaces you just need \s* , not [\s]*

    It's not clear exactly what you want since you have only posted 1
    possible line of data. If you just want to extract everything between
    parentheses and you don't know how many sets of parentheses there are,
    the following should work (provided there are no nested parentheses)

    #--------------------------------------------------------------------------------
    use strict;
    use warnings;
    use Data::Dumper;


    my $teststr = '(O) (2005-06-14) 14:43 +0000 (pop) (commain/com/testdata)';
    $teststr .= ' =(All/com/testdata)= ';
    $teststr .= ' (/home/user/temp/All/testdata/)';

    my @results = ();
    while ($teststr =~ /\((.*?)\)/g)
    {
        push @results, $1;
    }
    print Dumper @results;

    #-----------------------------------------------

    C:\develop\NiallPerlScripts>clpm16.pl
    $VAR1 = 'O';
    $VAR2 = '2005-06-14';
    $VAR3 = 'pop';
    $VAR4 = 'commain/com/testdata';
    $VAR5 = 'All/com/testdata';
    $VAR6 = '/home/user/temp/All/testdata/';
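
    The same extraction can probably be written without the loop, since a
    /g match in list context returns all the captures at once. This is only
    a sketch; it swaps .*? for [^()]* so that a capture can never run across
    another parenthesis.

```perl
use strict;
use warnings;

my $teststr = '(O) (2005-06-14) 14:43 +0000 (pop) (commain/com/testdata)'
            . ' =(All/com/testdata)= (/home/user/temp/All/testdata/)';

# In list context, a /g match returns every captured group in order.
my @results = $teststr =~ /\(([^()]*)\)/g;

print "$_\n" for @results;
```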


    If you want a regex that matches though you will have to define more
    specifically what the data looks like

    Hope this helps
    , May 4, 2006
    #5
  6. Guest

    Hi

    It's actually multiple lines of data like the ones below, and I need to
    extract the fields that I manually marked with parentheses in the
    first string as an example.

    STRING ex:
    (O) (2005-06-14) 14:43 +0000 (pop) (commain/com/testdata)
    =(All/com/testdata)= (/home/user/temp/All/testdata/)
    M 2005-06-14 14:43 +0000 pop commain/com/testdata =All/com/testdata=
    /home/user/temp/All/testdata/
    E 2005-06-14 14:43 +0000 pop commain/com/testdata =All/com/testdata=
    /home/user/temp/All/testdata/

    Thanks, Niall, for the tip. I reformatted the regex:

    if ($line =~ /^(O|E|M) \s* ([^+]+) \+0+ \s* (\w+) \s* ([^=]+) \s*= ([^=]+)
    \s*=\s* ([^ ]+)/x)

    Could someone kindly point me to a good regex tutorial with samples?
    The script actually reads a text file full of lines like the above and
    extracts the relevant data (the parts in parentheses in the first
    example line).
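
    For what it's worth, here is a minimal sketch of how such a loop might
    look. It assumes each record sits on one physical line, that the fields
    are record type, date, time, offset, user and three paths, and that the
    time and offset are not wanted. The field names in the comments are
    guesses, and the array stands in for lines read from the file.

```perl
use strict;
use warnings;

# Stand-ins for the lines of the input file (one record per line).
my $tail  = '2005-06-14 14:43 +0000 pop commain/com/testdata '
          . '=All/com/testdata= /home/user/temp/All/testdata/';
my @lines = map { $_ . ' ' . $tail } qw(O M E);

my @records;
for my $line (@lines) {
    next unless $line =~ /
        ^ ([OME])            \s+   # record type
        (\d{4}-\d\d-\d\d)    \s+   # date
        \d\d:\d\d            \s+   # time, not captured
        \+0+                 \s+   # offset such as +0000
        (\w+)                \s+   # user
        ([^=\s]+)            \s*   # first path
        = ([^=]+) =          \s*   # path between the equals signs
        (\S+)                      # final path
    /x;
    push @records, [ $1, $2, $3, $4, $5, $6 ];
}

print join(' | ', @$_), "\n" for @records;
```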

    Thanks for the assistance, guys.
    pop
    , May 4, 2006
    #6
  7. Csaba Guest

    wrote:

    > Oh sorry,
    >
    > Here is the sample data with the parentheses.
    >
    > STRING:
    >
    > (O) (2005-06-14) 14:43 +0000 (pop) (commain/com/testdata)
    > =(All/com/testdata)=
    > (/home/user/temp/All/testdata/)
    >
    > Here is the regex
    > if ($line =~
    > /(^[O])[\s]*([^+]+)[+][0]+[\s]*([\w]+)[\s]*([^=]+)[\s]*=([^=]+)[\s]*=[\
    > s]*([^
    >]+)/ ))
    >
    > cheers
    > POP
    >


    Maybe you should try Text::Balanced, especially extract_bracketed

    http://search.cpan.org/~dconway/Text-Balanced-1.97/lib/Text/Balanced.pm
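
    A rough sketch of how extract_bracketed might be applied, assuming the
    data really does contain the parentheses. The third argument is the
    prefix pattern; here it is set to skip everything up to the next
    opening parenthesis.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

my $rest = '(O) (2005-06-14) 14:43 +0000 (pop) (commain/com/testdata)'
         . ' =(All/com/testdata)= (/home/user/temp/All/testdata/)';

my @fields;
while (1) {
    # Prefix pattern: skip anything that is not an opening parenthesis.
    my ($found, $remainder) = extract_bracketed($rest, '()', '[^(]*');
    last unless defined $found && length $found;
    push @fields, substr($found, 1, -1);   # drop the surrounding parentheses
    $rest = $remainder;
}

print "$_\n" for @fields;
```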

    --
    Life is complex, with real and imaginary parts.
    Csaba, May 6, 2006
    #7
  8. David Combs Guest

    In article <>,
    >
    >Could someone kindly point me to a good regex tutorial with samples?



    Buy the book "regular expressions" by Friedl, published by O'Reilly;
    it can be had much cheaper from www.bookpool.com.

    It is "the" reference!

    David
    David Combs, May 29, 2006
    #8
  9. Xicheng Jia Guest

    David Combs wrote:
    > In article <>,
    > >
    > >Could someone kindly point me to a good regex tutorial with samples?

    >
    >
    > Buy the book "regular expressions", by Friedl, pub by O'Reilly,
    > gotten much cheaper from www.bookpool.com.
    > It is "the" reference!


    It should be "Mastering Regular Expressions", 2nd Edition, by Jeffrey
    E. F. Friedl.
    Publisher : O'Reilly
    Pub Date : July 2002
    ISBN : 0-596-00289-0
    Pages : 484

    This is the "bible" for understanding the engine behind regexes.

    Also, there are several very nice papers about using regex in the
    following book:
    "Computer Science & Perl Programming: Best of TPJ", by Jon Orwant

    Xicheng :)
    Xicheng Jia, May 29, 2006
    #9
  10. On Mon, 28 May 2006, Xicheng Jia wrote:

    [...]
    > Also, there are several very nice papers about using regex in the
    > following book: "Computer Science & Perl Programming: Best of TPJ",
    > by Jon Orwant


    In addition to the book recommendations, I'd recommend getting the
    PCRE (perl-compatible regular expressions) package, including its
    "pcretest" facility, and using it to play around with regexes and
    patterns on-line. http://www.pcre.org/
    Alan J. Flavell, May 29, 2006
    #10
  11. DJ Stunks Guest

    Csaba wrote:
    > wrote in
    > news::
    >
    > > Oh sorry,
    > >
    > > Here is the sample data with the parentheses.
    > >
    > > STRING:
    > >
    > > (O) (2005-06-14) 14:43 +0000 (pop) (commain/com/testdata)
    > > =(All/com/testdata)=
    > > (/home/user/temp/All/testdata/)
    > >
    > > Here is the regex
    > > if ($line =~
    > > /(^[O])[\s]*([^+]+)[+][0]+[\s]*([\w]+)[\s]*([^=]+)[\s]*=([^=]+)[\s]*=[\
    > > s]*([^
    > >]+)/ ))
    > >
    > > cheers
    > > POP
    > >

    >
    > Maybe you should try Text::Balanced, especially extract_bracketed
    >
    > http://search.cpan.org/~dconway/Text-Balanced-1.97/lib/Text/Balanced.pm


    This is really just speculation, but I don't think his real data has
    the parentheses in it; I think he just put those in to show what fields
    he wanted collected...

    -jp
    DJ Stunks, May 29, 2006
    #11