regular expression ignore example?

vorticitywolfe

Hello,

I have looked all over trying to find an example of how to grab
certain parts of a text file while ignoring others in the same line.

the file:

*************test.txt*************************
CREATED AT: Wed-June6-17:50 2007
NUM OBS: 1440
Values of two numbers:30.0/45.0 More text
that can be disregarded.
***********************************************

Here is what I want to grab from it:

**************output************************
CREATED AT: Wed-June6-17:50 2007
VALUE: 30.0/45.0
**********************************************

Of course, the file is much larger than this, but I just want the
general syntax of how to accomplish this.

I've just started to use regexes, so bear with me... here is what I
have thus far:

while(<file>){
    if ($_=~ /(\d+)/)
    {
        print $_;
    }
}

Thanks for your help!
Jonathan
 
Paul Lalli

A regular expression has two purposes. First, it determines whether
or not a given string of text "matches" some pattern. Second, it is
used to "pull" certain parts of that match out to be stored
elsewhere. You are only using the first purpose - you're determining
whether or not the line contains one or more digits.
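
To make the difference between the two uses concrete, here is a minimal sketch. The sample string and the variable names ($has_digits, $digits) are made up for illustration; in a real script the line would come from the file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative sample line (taken from the test file in the question).
my $line = "NUM OBS: 1440";

# First use: only test whether the line contains one or more digits.
my $has_digits = ($line =~ /\d+/) ? 1 : 0;
print "line contains digits\n" if $has_digits;

# Second use: capture the digits. In list context the match returns
# the text captured by the parentheses (the same text that ends up in $1).
my ($digits) = $line =~ /(\d+)/;
print "captured: $digits\n" if defined $digits;
```

The first test only says yes or no; the second one hands back the matched digits so they can be printed or stored on their own.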

What you should instead do is find out if this text matches the actual
format of the line, and secondarily pull out from that match the data
you want to keep:

#!/usr/bin/perl
use strict;
use warnings;

while (my $line = <DATA>) {
    if ($line =~ /^CREATED AT:/) {
        #If line starts with that text, print the entire line
        print $line;
    }
    if ($line =~ /^Values of.*?(\d+\.\d+\/\d+\.\d+)/) {
        #If line starts with Values of, pull out
        #the relevant text that you want to print
        print "VALUE: $1\n";
    }
}
__DATA__
CREATED AT: Wed-June6-17:50 2007
NUM OBS: 1440
Values of two numbers:30.0/45.0 More text
that can be disregarded.


The above program outputs:
CREATED AT: Wed-June6-17:50 2007
VALUE: 30.0/45.0

For more information on using metacharacters and quantifiers (e.g.,
the .*? above) and the captured submatches (the $1), please have a
read of:
perldoc perlretut
perldoc perlre
perldoc perlreref
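
As a rough illustration of why the non-greedy .*? matters in the pattern above, the sketch below runs the same pattern with .*? and with a plain greedy .* on the sample line. The variable names ($lazy, $greedy) are just for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = "Values of two numbers:30.0/45.0 More text";

# Non-greedy: .*? consumes as little as possible, so the capture
# starts at the first digit of the number pair.
my ($lazy) = $line =~ /^Values of.*?(\d+\.\d+\/\d+\.\d+)/;

# Greedy: .* consumes as much as possible and only gives back what
# the capture group needs, so it swallows the leading "3".
my ($greedy) = $line =~ /^Values of.*(\d+\.\d+\/\d+\.\d+)/;

print "non-greedy: $lazy\n";    # 30.0/45.0
print "greedy:     $greedy\n";  # 0.0/45.0
```

With the greedy version the regex still matches, but the captured value silently loses a digit, which is the kind of bug that is easy to miss on a large file.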

Paul Lalli
 
vorticitywolfe

Excellent! That is just the information that I need to get me going.
I thought there may be a way to first pull in all the lines with
numbers and then clean them up with some code, and that is pretty much
what yours does! Thanks!

Jonathan
 
