Novice - help with pattern matching needed

Discussion in 'Perl Misc' started by Robert Day, Feb 7, 2004.

  1. Robert Day

    Robert Day Guest

    Hi

    I am using a very basic Perl script to parse a file and extract just
    the elements I need but one aspect is causing me trouble and I am sure
    the answer is probably quite simple. Below are examples of two of the
    lines (watch wrapping) - the value I seek is that between the date on
    the left and the "UV Port" on the right.

    Enter bookmobile session location code (or NONE) : NONE 06 FEB 2004
    March Mobile A UV Port 51
    Circulation
    06 FEB 2004 Papworth Library
    UV Port 50

    The section of code dealing with this is currently

    if (/UV/) {
        $library = $`;
        $library =~ s/^\s+\d{2}\s\w{3}\s\d{4}\s+//;
        $library =~ s/- CAMBOOK//g;
        $library =~ s/(\w+)/\u\L$1/g;
        print "$library\n";
    }

    The 2nd and 3rd pattern matches deal with other lines in the data (not
    shown) in which the value I seek is all CAPS or has "- CAMBOOK"
    appended. This code works fine on line 2 of the sample data given
    above but I don't know how to get rid of "Enter bookmobile session
    location code (or NONE) : NONE" when it appears (as it does on a few
    entries). I have tried various patterns and I am sure the solution is
    simple but it eludes me at present. Can anyone help?

    Robert
    Robert Day, Feb 7, 2004
    #1

  2. gnari

    gnari Guest

    "Robert Day" <> wrote in message
    news:...
    > ... the value I seek is that between the date on
    > the left and the "UV Port" on the right.
    >
    > Enter bookmobile session location code (or NONE) : NONE 06 FEB 2004
    > March Mobile A UV Port 51
    > Circulation
    > 06 FEB 2004 Papworth Library
    > UV Port 50
    >
    > The section of code dealing with this is currently
    >
    > if(/UV/) {
    > $library = $`;
    > $library =~ s/^\s+\d{2}\s\w{3}\s\d{4}\s+//;


    Are you sure about the '^' here?

    > $library =~ s/- CAMBOOK//g;
    > $library =~ s/(\w+)/\u\L$1/g;
    > print "$library\n";
    > }


    I would just do something like:
    if ( ($library) = /\d\d \w\w\w \d{4} (.*?)(- CAMBOOK)? UV/ ) {
        print "$library\n";
    }

    gnari
    gnari, Feb 7, 2004
    #2

  3. Robert Day wrote:
    > I am using a very basic Perl script to parse a file and extract
    > just the elements I need ...


    <snip>

    > I don't know how to get rid of "Enter bookmobile session location
    > code (or NONE) : NONE" when it appears (as it does on a few
    > entries). I have tried various patterns and I am sure the solution
    > is simple but it eludes me at present. Can anyone help?


    As regards the approach I have to ask: If you want to extract
    something, why do you not write code that does just that rather than
    deleting everything that you do not want to keep?

    $library =~ s/^\s+\d{2}\s\w{3}\s\d{4}\s+//;
    --------------^
    What are your considerations behind beginning the pattern with the ^
    metacharacter?

    perldoc perlvar points out that the $` variable "anywhere in a program
    imposes a considerable performance penalty on all regular expression
    matches". There appears not to be any reason to use it here.
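    If the text before the match really is wanted, it can be had without
    $`. A minimal sketch, assuming the line is held in a scalar; the helper
    name prefix_before_uv is made up for illustration and is not from the
    original script:

```perl
use strict;
use warnings;

# Return the text before the first "UV" in a line, without touching $`.
# prefix_before_uv is a hypothetical helper name, not from the thread.
sub prefix_before_uv {
    my ($line) = @_;
    return unless $line =~ /UV/;
    # $-[0] is the offset at which the last successful match started,
    # so this substr yields the same text that $` would hold.
    return substr($line, 0, $-[0]);
}

print prefix_before_uv("06 FEB 2004 Papworth Library UV Port 50"), "\n";
```

    The returned string still carries the date and a trailing space, so it
    only replaces the first line of the original block, not the cleanup
    that follows it.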

    > $library = $`;
    > $library =~ s/^\s+\d{2}\s\w{3}\s\d{4}\s+//;
    > $library =~ s/- CAMBOOK//g;


    You may want to replace those three lines with:

    my ($library) = /\d{2} \w{3} \d{4}\s+(.+?)(?:- CAMBOOK)?\s+UV/;
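    The same pattern can be spelled out with the /x modifier, which may
    make it easier to see what each part does. This is only a sketch: the
    sub name extract_library is invented here, and the two test lines are
    the sample data with the wrapped lines joined back together, which is
    an assumption about the real file layout.

```perl
use strict;
use warnings;

# Same pattern as above, written with /x so each part can be commented.
# Under /x a literal space must be written as [ ] (or escaped).
my $re = qr/
    \d{2} [ ] \w{3} [ ] \d{4}   # the date, e.g. "06 FEB 2004"
    \s+                         # whitespace after the date
    (.+?)                       # capture the library name, as little as possible
    (?: - [ ] CAMBOOK )?        # optional "- CAMBOOK" suffix, not captured
    \s+ UV                      # whitespace, then the "UV" of "UV Port"
/x;

# extract_library is a hypothetical name used only for this example.
sub extract_library {
    my ($line) = @_;
    my ($library) = $line =~ $re;   # list context: returns the captured group
    return $library;                # undef if the line did not match
}

print extract_library("Enter bookmobile session location code (or NONE) : NONE 06 FEB 2004 March Mobile A UV Port 51"), "\n";
print extract_library("06 FEB 2004 Papworth Library UV Port 50"), "\n";
```

    Because the pattern is not anchored with ^, anything before the date,
    such as the "Enter bookmobile session..." prompt, is simply skipped.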

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
    Gunnar Hjalmarsson, Feb 8, 2004
    #3
  4. Robert Day

    Robert Guest

    "Gunnar Hjalmarsson" <> wrote in message
    news:c03u4u$107mv3$-berlin.de...
    >
    > As regards the approach I have to ask: If you want to extract
    > something, why do you not write code that does just that rather than
    > deleting everything that you do not want to keep?


    It seemed simpler because there is consistency in the stuff to remove but
    the value I want to keep could be one of 70 different values, with a variety
    of different formats.

    >
    > $library =~ s/^\s+\d{2}\s\w{3}\s\d{4}\s+//;
    > --------------^
    > What are your considerations behind beginning the pattern with the ^
    > metacharacter?
    >


    This is a leftover from the way the code worked before the introduction of
    entries with the "Enter bookmobile....." line. At that time the dates were
    always the leftmost item so always matched the ^ metacharacter.

    > You may want to replace those three lines with:
    >
    > my ($library) = /\d{2} \w{3} \d{4}\s+(.+?)(?:- CAMBOOK)?\s+UV/;
    >


    Thanks. I'll give it a go (and then try to understand exactly what it is
    doing!)
    Robert
    Robert, Feb 8, 2004
    #4
  5. Robert wrote:
    > "Gunnar Hjalmarsson" <> wrote in message
    > news:c03u4u$107mv3$-berlin.de...
    >> As regards the approach I have to ask: If you want to extract
    >> something, why do you not write code that does just that rather
    >> than deleting everything that you do not want to keep?

    >
    > It seemed simpler because there is consistency in the stuff to
    > remove but the value I want to keep could be one of 70 different
    > values, with a variety of different formats.


    Okay. As you can see from both my and gnari's examples, that should
    not prevent you from capturing rather than removing stuff.

    >> You may want to replace those three lines with:
    >>
    >> my ($library) = /\d{2} \w{3} \d{4}\s+(.+?)(?:- CAMBOOK)?\s+UV/;

    >
    > Thanks. I'll give it a go (and then try to understand exactly what
    > it is doing!)


    It can also be written:

    my $library;
    if ( /\d{2} \w{3} \d{4}\s+(.+?)(?:- CAMBOOK)?\s+UV/ ) {
        $library = $1;
    }

    Please study perldoc perlre about capturing, the meaning of the $1
    variable, etc.
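    Put together with the capitalisation step from the original code, the
    whole thing might look roughly like the sketch below. The sub name
    library_name and the second test line (an all-caps entry with
    "- CAMBOOK") are invented for illustration, since no such line was
    shown in the sample data.

```perl
use strict;
use warnings;

# library_name is a hypothetical helper; it returns the cleaned-up
# library name, or nothing if the line does not match.
sub library_name {
    my ($line) = @_;
    return unless $line =~ /\d{2} \w{3} \d{4}\s+(.+?)(?:- CAMBOOK)?\s+UV/;
    my $library = $1;
    $library =~ s/\s+$//;            # drop the space left before "- CAMBOOK"
    $library =~ s/(\w+)/\u\L$1/g;    # title-case each word, as in the original code
    return $library;
}

for my $line (
    "Enter bookmobile session location code (or NONE) : NONE 06 FEB 2004 March Mobile A UV Port 51",
    "06 FEB 2004 PAPWORTH LIBRARY - CAMBOOK UV Port 50",   # made-up variant
) {
    my $library = library_name($line);
    print "$library\n" if defined $library;
}
```

    The trailing-whitespace trim is there because the non-greedy capture
    stops right before "- CAMBOOK" and so keeps the space in front of it.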

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
    Gunnar Hjalmarsson, Feb 8, 2004
    #5
  6. Robert Day

    R Day Guest

    "Gunnar Hjalmarsson" <> wrote in message
    news:c05n13$13j4u9$-berlin.de...
    > It can also be written:
    >
    > my $library;
    > if ( /\d{2} \w{3} \d{4}\s+(.+?)(?:- CAMBOOK)?\s+UV/ ) {
    > $library = $1;
    > }


    Thanks. This works as required.

    > Please study perldoc perlre about capturing, the meaning of the $1
    > variable, etc.


    I will do.

    Robert
    R Day, Feb 8, 2004
    #6
