regular expression ignore example?

Discussion in 'Perl Misc' started by vorticitywolfe@gmail.com, Jun 6, 2007.

  1. Guest

    Hello,

    I have looked all over trying to find an example of how to grab
    certain parts of a text file while ignoring others in the same line.

    the file:

    *************test.txt*************************
    CREATED AT: Wed-June6-17:50 2007
    NUM OBS: 1440
    Values of two numbers:30.0/45.0 More text
    that can be disregarded.
    ***********************************************

    Here is what I want to grab from it:

    **************output************************
    CREATED AT: Wed-June6-17:50 2007
    VALUE: 30.0/45.0
    **********************************************

    Of course, the file is much larger than this, but I just want the
    general syntax of how to accomplish this.

    I've just started to use regexes, so bear with me. Here is what I
    have thus far:

    while (<file>) {
        if ($_ =~ /(\d+)/) {
            print $_;
        }
    }

    Thanks for your help!
    Jonathan
     
    , Jun 6, 2007
    #1

  2. Paul Lalli Guest

    A regular expression serves two purposes. First, it determines whether
    or not a given string of text "matches" some pattern. Second, it is
    used to "pull" certain parts of that match out to be stored
    elsewhere. You are only using the first purpose: you're determining
    whether or not the line contains one or more digits, and although your
    pattern captures those digits in parentheses, you never use the capture.
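The two uses described above can be seen side by side in a minimal sketch. The sample line is copied from the question; the variable names ($has_digits, $first, $second) are only illustrative and not part of the original reply.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample line copied from the test.txt excerpt in the question.
my $line = "Values of two numbers:30.0/45.0 More text";

# Use 1: test only. In boolean context the match is simply true or false.
my $has_digits = ($line =~ /\d+/) ? 1 : 0;

# Use 2: capture. In list context the match returns the captured groups,
# here the two decimal numbers on either side of the slash.
my ($first, $second) = $line =~ m{(\d+\.\d+)/(\d+\.\d+)};

print "has digits: $has_digits\n";
print "first: $first, second: $second\n";
```

Using m{...} as the delimiter is just a convenience here: it avoids having to escape the slash between the two numbers.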

    What you should instead do is find out if this text matches the actual
    format of the line, and secondarily pull out from that match the data
    you want to keep:

    #!/usr/bin/perl
    use strict;
    use warnings;

    while (my $line = <DATA>) {
        if ($line =~ /^CREATED AT:/) {
            # If the line starts with that text, print the entire line
            print $line;
        }
        if ($line =~ /^Values of.*?(\d+\.\d+\/\d+\.\d+)/) {
            # If the line starts with "Values of", pull out
            # the relevant text that you want to print
            print "VALUE: $1\n";
        }
    }
    __DATA__
    CREATED AT: Wed-June6-17:50 2007
    NUM OBS: 1440
    Values of two numbers:30.0/45.0 More text
    that can be disregarded.


    The above program outputs:
    CREATED AT: Wed-June6-17:50 2007
    VALUE: 30.0/45.0
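The same approach should carry over from the DATA section to a real file. The sketch below is one possible way to arrange it, not the method from the reply: extract_report is a hypothetical helper name, and an in-memory filehandle stands in for test.txt so the example runs on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# extract_report is a hypothetical helper (not from the original reply).
# It reads lines from any filehandle and returns the lines to print.
sub extract_report {
    my ($fh) = @_;
    my @out;
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^CREATED AT:/) {
            # Keep the whole line
            push @out, $line;
        }
        elsif ($line =~ m{^Values of.*?(\d+\.\d+/\d+\.\d+)}) {
            # Keep only the captured "number/number" part
            push @out, "VALUE: $1";
        }
    }
    return @out;
}

# Sample text copied from the question; the in-memory filehandle is an
# assumption made so that no file on disk is needed.
my $text = join "\n",
    'CREATED AT: Wed-June6-17:50 2007',
    'NUM OBS: 1440',
    'Values of two numbers:30.0/45.0 More text',
    'that can be disregarded.',
    '';
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my @report = extract_report($fh);
close $fh;

print "$_\n" for @report;
```

To read the actual file instead, the open line would become something like open my $fh, '<', 'test.txt' or die "Cannot open test.txt: $!"; the rest stays the same.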

    For more information on using metacharacters and quantifiers (e.g.,
    the .*? above) and captured submatches (the $1), please have a
    read of:
    perldoc perlretut
    perldoc perlre
    perldoc perlreref
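To see why the non-greedy .*? matters in the pattern above, here is a small sketch comparing it with the greedy .* on the same sample line. The variable names $lazy and $greedy are only illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample line copied from the question.
my $line = "Values of two numbers:30.0/45.0 More text";

# Non-greedy: .*? consumes as little as possible, so the capture
# starts at the first digit it reaches.
my ($lazy) = $line =~ /^Values of.*?(\d+)/;

# Greedy: .* consumes as much as possible and backs off only far enough
# for \d+ to match, so the capture is the last digit on the line.
my ($greedy) = $line =~ /^Values of.*(\d+)/;

print "non-greedy capture: $lazy\n";    # prints 30
print "greedy capture:     $greedy\n";  # prints 0
```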

    Paul Lalli
     
    Paul Lalli, Jun 6, 2007
    #2

  3. Guest

    Excellent! That is just the information that I need to get me going.
    I thought there might be a way to first pull in all the lines with
    numbers and then clean them up with some code, and that is pretty much
    what yours does! Thanks!

    Jonathan
     
    , Jun 6, 2007
    #3
