Set a variable from a substring of another variable using Regular Expression


Tony

Howdy,

Here's my problem, I am reading in a file which contains lines of
different lengths. I need to set the last word on lines which match a
pattern (contain 'dogs') as the value of another variable.


# My file contains some lines that do not have the word
# dogs in them; and I am only interested here in lines
# that contain the word "dogs"

$START="dogs are:";

# I have opened my file, and now each instance of $_
# represents an entire line in sequence.
# So long as there is a line - repeat this loop.

while( $_=<FILE> ) {
$CURRENT_LINE=$_;
$CURRENT_LINE_DUPE=$CURRENT_LINE;

# Above I have setup two variables to hold the value
# of the current line.
#
# If and only if the current line contains the phrase "dogs Are:"
# do I continue to work on this line.

if ($CURRENT_LINE = /$START/) {

# What I want to do next is set the variable DOGSARE to the
# last word in this line.

$DOGSARE=~ m|(\w$)|;
print "$RDOGSARE";
}
}

In short - this doesn't work. The problem is setting the DOGSARE
variable value; I cannot seem to get it to set.

Any help would be greatly appreciated.

Thanks in advance,

TC
 

Mark Clements

Tony said:

You have a fair few issues here...

First up:

Always run with

use warnings;
use strict;

which will catch a lot of potential errors for you.
Here's my problem, I am reading in a file which contains lines of
different lengths. I need to set the last word on lines which match a
pattern (contain 'dogs') as the value of another variable.


# My file contains some lines that do not have the word
# dogs in them; and I am only interested here in lines
# that contain the word "dogs"

$START="dogs are:";
Traditionally, variables tend not to be all uppercase, but anyway:

you have specified that you are interested in lines that contain the
word dogs, but you are using $START to generate a regex. Why set $START
to "dogs are:"?
# I have opened my file, and now each instance of $_
# represents an entire line in sequence.
# So long as there is a line - repeat this loop.

while( $_=<FILE> ) {
$CURRENT_LINE=$_;
$CURRENT_LINE_DUPE=$CURRENT_LINE;
How about:

while( my $currentLine = <FILE> ){
chomp $currentLine;

I'll assume that you are checking the return value of the call to

open FILE,....

and that FILE is a valid filehandle. Also read

perldoc -f chomp
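
As a rough sketch of both points (checking open and using chomp), something like the following could be used. The in-memory filehandle and the sample text are my own stand-ins, not Tony's file, so that the snippet runs on its own:

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the real file.
my $contents = "first line\nsecond line\n";

# Check the return value of open and die with the reason if it fails.
open my $fh, '<', \$contents or die "Cannot open in-memory file: $!";

my $raw = <$fh>;                 # first line, trailing newline included
my $length_before = length $raw;
chomp $raw;                      # removes the trailing newline, if any
my $length_after = length $raw;
close $fh;

print "length before chomp: $length_before, after: $length_after\n";
```

With a real file you would pass its name as the third argument to open instead of the reference to $contents.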
# Above I have setup two variables to hold the value
# of the current line.
#
# If and only if the current line contains the phrase "dogs Are:"
# do I continue to work on this line.

if ($CURRENT_LINE = /$START/) {
You are *not* applying the match operator to $CURRENT_LINE here: you
are assigning the return value of /$START/ (which matches against $_ by
default) to $CURRENT_LINE, so $CURRENT_LINE will hold a true value if the
match succeeded and a false one otherwise.

Try:

if( $currentLine =~ /$START/o ){

Read man perlre. Make sure you understand what the /o flag does: it
isn't necessary, but it may give you some performance benefit. Note,
however, that if $START changes for any reason then this regex will *not*
be recompiled and will continue to match against the old value. Read on....
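
A small sketch of the /o caveat described above. The variable names and the sample line are made up for illustration; the point is only that the pattern with /o keeps the first value it was compiled with:

```perl
use strict;
use warnings;

my $line = "cats are: aloof";    # hypothetical sample line
my ( @with_o, @without_o );

for my $word (qw(dogs cats)) {
    # With /o the pattern is compiled on the first pass ("dogs") and the
    # later change of $word to "cats" is ignored.
    push @with_o, ( $line =~ /$word/o ) ? 1 : 0;

    # Without /o the current value of $word is used on every pass.
    push @without_o, ( $line =~ /$word/ ) ? 1 : 0;
}

print "with /o:    @with_o\n";
print "without /o: @without_o\n";
```

If the pattern variable never changes, as with $START in the posted code, the two forms match the same lines.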

# What I want to do next is set the variable DOGSARE to the
# last word in this line.

$DOGSARE=~ m|(\w$)|;
print "$RDOGSARE";
}
}

The regex is wrong, and you aren't pulling out the matched field
correctly. You can combine this with the previous test:

if( $currentLine =~ /$START(?:.*)\b(\w+)$/o ){
my $dogsAre = $1;
print "$1\n";
}
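
Put together, a self-contained sketch of the whole loop might look like this. The sample lines, the in-memory filehandle and the @last_words array are my assumptions so that the snippet runs without Tony's file. I have also wrapped the phrase in \Q...\E so it is matched literally, which is my addition and not part of the pattern above:

```perl
use strict;
use warnings;

my $start = "dogs are:";

# Hypothetical input standing in for the real file.
my $sample = "cats are: aloof\n"
           . "my dogs are: loud and friendly\n"
           . "dogs are: hungry\n";

open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";

my @last_words;
while ( my $line = <$fh> ) {
    chomp $line;

    # Match only lines containing the phrase and capture the final word.
    if ( $line =~ /\Q$start\E.*\b(\w+)$/ ) {
        push @last_words, $1;
    }
}
close $fh;

print "$_\n" for @last_words;
```

Note that a line ending in punctuation or trailing whitespace would not match this pattern, since (\w+)$ requires the word to run up to the end of the line.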

You need to read man perlre, and look at some regex examples. You seem
to be missing quite a lot...

regards,

Mark
 

Tad McClellan

Tony said:
Here's my problem, I am reading in a file which contains lines of
different lengths. I need to set the last word


You need to set the last word to _what_?

$START="dogs are:";
while( $_=<FILE> ) {
$CURRENT_LINE=$_;
$CURRENT_LINE_DUPE=$CURRENT_LINE;

# Above I have setup two variables to hold the value
# of the current line.


You have set up *three* variables to hold the value of the current line.

# If and only if the current line contains the phrase "dogs Are:"


That string does NOT match the pattern that you put
into $START; case matters.

(UPPER CASE variable names suck, BTW.)

if ($CURRENT_LINE = /$START/) {


I'm guessing you are missing a tilde character on that line:

if ($CURRENT_LINE =~ /$START/) {
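
To see what the missing tilde costs, here is a minimal sketch. The names $assigned and $bound and the sample line are mine; $_ is set by hand to play the role of the current line:

```perl
use strict;
use warnings;

my $start = "dogs are:";

$_ = "my dogs are: beagles";     # hypothetical current line
my $assigned = $_;
my $bound    = $_;

# Plain assignment: /$start/ is matched against $_, and its result
# (1 on success) overwrites the copy of the line.
$assigned = /$start/;

# Binding operator: the match is applied to $bound, which is left intact.
my $matched = ( $bound =~ /$start/ ) ? 1 : 0;

print "after '=':  $assigned\n";
print "after '=~': $bound (matched: $matched)\n";
```

So with a single = the if test still passes on matching lines, but the line you meant to keep working on has been replaced by the match result.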

# What I want to do next is set the variable DOGSARE to the
# last word in this line.

$DOGSARE=~ m|(\w$)|;


If you think that is how you set the value of $DOGSARE, you
should stop coding and do some more reading before proceeding.

How many characters will your pattern put into $1 ?

$dogs_are = $1 if /(\w+)$/;
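
A short sketch of the difference between the two patterns, using a made-up sample line. The list-context assignment is just another way of collecting the capture, not something from the code above:

```perl
use strict;
use warnings;

my $line = "dogs are: hungry\n";     # hypothetical sample line

# (\w)$ captures exactly one word character before the end of the line.
my ($one_char) = $line =~ /(\w)$/;

# (\w+)$ captures the whole run of word characters at the end.
my ($whole_word) = $line =~ /(\w+)$/;

print "one character: $one_char\n";
print "whole word:    $whole_word\n";
```

In both cases $ is allowed to match just before the trailing newline, so the patterns work even if the line has not been chomped.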

print "$RDOGSARE";


Sheesh!

You have a typo on the variable name.

(Which is precisely the reason why UPPERCASE variable names suck.)

You should always enable "use strict" when developing Perl code.

(Most especially when you are going to ask hundreds of people
around the world to help you with it.)
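
As an illustration of what strict buys you here, the sketch below compiles a snippet containing a misspelled variable name through a string eval, so that the resulting error can be captured and printed instead of killing the script. The snippet and the variable names are made up for the demonstration:

```perl
use strict;
use warnings;

# Code with a typo: $dogs_are is declared, $rdogs_are is printed.
my $code = 'my $dogs_are = "hungry"; print "$rdogs_are\n"; 1;';

# The string eval is compiled under the enclosing "use strict",
# so the undeclared variable is rejected at compile time.
my $ok    = eval $code;
my $error = $@;

print "strict refused to compile it: $error" unless $ok;
```

In a normal script you would not need the eval: perl would simply stop at compile time with the same "Global symbol ... requires explicit package name" message.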

You should not use useless double quotes like that, see:

perldoc -q vars

In short - this doesn't work.


Of course not, it has a whole boatload of problems.

Thanks in advance,


It is demeaning to be asked to do the work of a machine, please
don't do that to us again, enable "strict" (and "warnings" too).
 
