regular expression ignore example?

vorticitywolfe · Jun 6, 2007

Hello,

I have looked all over trying to find an example of how to grab
certain parts of a text file while ignoring others in the same line.

the file:

*************test.txt*************************
CREATED AT: Wed-June6-17:50 2007
NUM OBS: 1440
Values of two numbers:30.0/45.0 More text
that can be disregarded.
***********************************************

Here is what I want to grab from it:

**************output************************
CREATED AT: Wed-June6-17:50 2007
VALUE: 30.0/45.0
**********************************************

Of course, the file is much larger than this, but I just want the
general syntax of how to accomplish this.

I've just started to use regexes, so bear with me... here is what I
have thus far:

while (<file>) {
    if ($_ =~ /(\d+)/)
    {
        print $_;
    }
}

Thanks for your help!
Jonathan

Paul Lalli · Jun 6, 2007

A regular expression serves two purposes. First, it determines whether
or not a given string of text "matches" some pattern. Second, it can
"pull" certain parts of that match out to be stored elsewhere. Your
code uses only the first purpose: it tests whether the line contains
one or more digits. The parentheses do capture those digits, but the
capture is never used.
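To make the difference concrete, here is a minimal sketch of both uses on a single line. The variable names ($has_digits, $count) are only for illustration, and the sample string is the NUM OBS line from your file:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = "NUM OBS: 1440\n";

# Purpose 1: a yes/no test. No parentheses, so nothing is saved.
my $has_digits = ($line =~ /\d+/) ? 1 : 0;
print "line contains digits\n" if $has_digits;

# Purpose 2: capture. In list context, a match returns its captured groups.
my ($count) = $line =~ /^NUM OBS:\s*(\d+)/;
print "count = $count\n" if defined $count;
```

The list-context form is one way to keep the captured text in a named variable rather than relying on $1, which is overwritten by the next successful match.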

What you should instead do is find out if this text matches the actual
format of the line, and secondarily pull out from that match the data
you want to keep:

#!/usr/bin/perl
use strict;
use warnings;

while (my $line = <DATA>) {
    if ($line =~ /^CREATED AT:/) {
        # If line starts with that text, print the entire line
        print $line;
    }
    if ($line =~ /^Values of.*?(\d+\.\d+\/\d+\.\d+)/) {
        # If line starts with Values of, pull out
        # the relevant text that you want to print
        print "VALUE: $1\n";
    }
}
__DATA__
CREATED AT: Wed-June6-17:50 2007
NUM OBS: 1440
Values of two numbers:30.0/45.0 More text
that can be disregarded.

The above program outputs:
CREATED AT: Wed-June6-17:50 2007
VALUE: 30.0/45.0

For more information on using metacharacters and quantifiers (i.e.,
the .*? above) and the captured submatches (the $1), please have a
read of:
perldoc perlretut
perldoc perlre
perldoc perlreref
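
As a small illustration of why the quantifier above is .*? rather than .*, the sketch below runs both against the sample line. The names $pair, $lazy, and $greedy are just for this example:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = "Values of two numbers:30.0/45.0 More text\n";
my $pair = qr{\d+\.\d+/\d+\.\d+};   # a pair such as 30.0/45.0

# Non-greedy .*? stops at the first place the capture can match.
my ($lazy) = $line =~ /^Values of.*?($pair)/;

# Greedy .* grabs as much as it can and backs off only as far as needed,
# so it swallows the leading "3" of "30.0".
my ($greedy) = $line =~ /^Values of.*($pair)/;

print "lazy:   $lazy\n";     # lazy:   30.0/45.0
print "greedy: $greedy\n";   # greedy: 0.0/45.0
```

Both patterns match the line, so a plain true/false test would not reveal the difference; only the captured text shows that the greedy version lost a digit.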

Paul Lalli

vorticitywolfe · Jun 6, 2007

Excellent! That is just the information that I need to get me going.
I thought there may be a way to first pull in all the lines with
numbers and then clean them up with some code, and that is pretty much
what yours does! Thanks!

Jonathan
