using the result of a variable regular expression

leifwessman · Aug 26, 2004

Hi!

I need to extract a certain value from a text. But the result isn't
always in the variable $1 - it might be in $2, $3, $4 or some other
predefined variable.

Some code to illustrate my problem:

$regexp = "(\d)(\w)(\d)";
$numb = 3; # Means the result I'm looking for is in $3
# I don't know this number, it's submitted by user
# and may differ

if ($data =~ /$regexp/) {

print $numb; # does not work, prints "3"

# alternative solution that works
# but it's UGLY
if ($numb == 1) {
print $1;
} elsif ($numb == 2) {
print $2;
} elsif ($numb == 3) {
print $3;
}

# is there another way?
}

Thanks for any input!

Leif

Brian McCauley · Aug 26, 2004

I need to extract a certain value from a text. But the result isn't
always in the variable $1 - it might be in $2, $3, $4 or some other
predefined variable.

Some code to illustrate my problem:

$regexp = "(\d)(\w)(\d)";
$numb = 3; # Means the result I'm looking for is in $3
# I don't know this number, it's submitted by user
# and may differ

if ($data =~ /$regexp/) {

print $numb; # does not work, prints "3"

What you are trying to do is use something called a symbolic ref:

print $$numb; # works - print value of $3

But you have to be careful using symrefs...

{
# Untaint and check $numb - don't clobber $1 etc
die 'Not a number' unless do { ($numb) = $numb =~ /(^\d+$)/ };
no strict 'refs';
print $$numb;
}

That said, I wouldn't use one myself because I never use $1 etc. (other
than in the RHS of s/// or in while(//g)).

if (my @captures = $data =~ /$regexp/) {
print $captures[$numb-1];
}
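
A minimal sketch of the same list-capture idea wrapped in a sub, with a range check on the user-supplied index. The name nth_capture and the sample data are assumptions made for illustration, not something from the posts above:

```perl
use strict;
use warnings;

# Hypothetical helper: return the $numb-th capture (1-based) of $regexp
# matched against $data, or undef if there is no match or no such group.
sub nth_capture {
    my ( $data, $regexp, $numb ) = @_;

    # Accept only a positive integer as the group number.
    return unless defined $numb && $numb =~ /\A[1-9][0-9]*\z/;

    # In list context the match returns the captured groups.
    my @captures = $data =~ /$regexp/ or return;
    return if $numb > @captures;
    return $captures[ $numb - 1 ];
}

my $found = nth_capture( '8t4', '(\d)(\w)(\d)', 3 );
print defined $found ? "capture: $found\n" : "no such capture\n";
```

Because nothing here reads $1, $2 and so on, no symbolic reference is needed and strict can stay fully enabled.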

Gunnar Hjalmarsson · Aug 26, 2004

I need to extract a certain value from a text. But the result isn't
always in the variable $1 - it might be in $2, $3, $4 or some other
predefined variable.

Some code to illustrate my problem:

Your problem starts before that code: You have not enabled strictures
and warnings!

use strict;
use warnings;

$regexp = "(\d)(\w)(\d)";

There is your second problem: $regexp gets the value '(d)(w)(d)', which
is not what you want.

my $regexp = '(\d)(\w)(\d)';
-------------^------------^

1) Please copy and paste code that you post, do not retype it!

2) Warnings would have told you that something was wrong.
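
As a rough illustration of the quoting point, here are three ways to end up with the intended pattern. The variable names are made up for this sketch; the double-quoted form only works because each backslash is doubled:

```perl
use strict;
use warnings;

# Single quotes keep the backslashes; qr// also compiles the pattern.
my $from_single = '(\d)(\w)(\d)';
my $from_qr     = qr/(\d)(\w)(\d)/;

# In a double-quoted string the backslashes must be doubled to survive.
my $from_double = "(\\d)(\\w)(\\d)";

print "single: $from_single\n";
print "double: $from_double\n";
print "same pattern text\n" if $from_single eq $from_double;
print "qr matches\n"        if '8t4' =~ $from_qr;
```

With a single backslash inside double quotes, Perl drops the backslash and, under warnings, reports an unrecognized escape.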

$numb = 3; # Means the result I'm looking for is in $3
# I don't know this number, it's submitted by user
# and may differ

if ($data =~ /$regexp/) {

print $numb; # does not work, prints "3"

# alternative solution that works
# but it's UGLY
if ($numb == 1) {
print $1;
} elsif ($numb == 2) {
print $2;
} elsif ($numb == 3) {
print $3;
}

# is there another way?
}

You can do:

if ( my @capt = $data =~ /$regexp/ ) {
print $capt[$numb-1];
}

Anno Siegel · Aug 26, 2004

Gunnar Hjalmarsson said:
You can do:

if ( my @capt = $data =~ /$regexp/ ) {
print $capt[$numb-1];
}

Or, without an auxiliary variable:

defined and print for ( $data =~ /$regexp/ )[ $numb - 1 ];

Anno

Tore Aursand · Aug 26, 2004

Some code to illustrate my problem:
[...]

Your code won't run. Please copy-and-paste working code, instead of
retyping it.

You should also add these:

use strict;
use warnings;

$regexp = "(\d)(\w)(\d)";
$numb = 3; # Means the result I'm looking for is in $3
# I don't know this number, it's submitted by user
# and may differ

if ($data =~ /$regexp/) {

print $numb; # does not work, prints "3"

# alternative solution that works
# but it's UGLY
if ($numb == 1) {
print $1;
} elsif ($numb == 2) {
print $2;
} elsif ($numb == 3) {
print $3;
}

# is there another way?
}

You can match into an array:

if ( my @match = $data =~ /$regexp/ ) {
print $match[$numb-1];
}

John W. Krahn · Aug 26, 2004

I need to extract a certain value from a text. But the result isn't
always in the variable $1 - it might be in $2, $3, $4 or some other
predefined variable.

Some code to illustrate my problem:

$regexp = "(\d)(\w)(\d)";
$numb = 3; # Means the result I'm looking for is in $3
# I don't know this number, it's submitted by user
# and may differ

if ($data =~ /$regexp/) {

print $numb; # does not work, prints "3"

# alternative solution that works
# but it's UGLY
if ($numb == 1) {
print $1;
} elsif ($numb == 2) {
print $2;
} elsif ($numb == 3) {
print $3;
}

# is there another way?
}

You are extracting single characters. How about substr()?

print substr( $data, $numb - 1, 1 )

Why not define your regexp based on the submitted value?

my @fields = ( '\d', '\w', '\d' );
$fields[ $numb - 1 ] = '(' . $fields[ $numb - 1 ] . ')';
my $regexp = join '', @fields;
if ( $data =~ /$regexp/ ) {
print $1;
}

Or you could use the @+ and @- arrays:

my $regexp = '(\d)(\w)(\d)';
if ( $data =~ /$regexp/ ) {
print substr( $data, $-[ $numb ], $+[ $numb ] - $-[ $numb ] );
}
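
A sketch of the @- and @+ variant as a sub, under the assumption that an out-of-range or unset group should simply yield undef. The helper name capture_by_offset and the sample string are invented for the example:

```perl
use strict;
use warnings;

# Hypothetical helper: extract group $numb via the offset arrays @- and @+.
sub capture_by_offset {
    my ( $data, $regexp, $numb ) = @_;
    return unless $data =~ /$regexp/;

    # $#+ is the number of groups in the last successful match.
    return if $numb < 1 || $numb > $#+;

    # A group that did not take part in the match has undefined offsets.
    return unless defined $-[$numb];

    # Start offset is $-[n], end offset is $+[n].
    return substr( $data, $-[$numb], $+[$numb] - $-[$numb] );
}

my $text  = 'id: 8t4';
my $third = capture_by_offset( $text, '(\d)(\w)(\d)', 3 );
print "third group: $third\n" if defined $third;
```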

John

Brian McCauley · Aug 27, 2004

Anno said:
Gunnar Hjalmarsson said:

You can do:

if ( my @capt = $data =~ /$regexp/ ) {
print $capt[$numb-1];
}

Or, without an auxiliary variable:

defined and print for ( $data =~ /$regexp/ )[ $numb - 1 ];

In this particular case it is probably safe to assume that we want to
treat the case where the ${numb}th capture didn't capture as
equivalent to the case where /$regexp/ didn't match.

However it is important to be aware that you are making such an assumption.
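
To make the assumption visible, here is a small sketch that keeps the two cases apart. The helper name describe_capture and the pattern with an optional group are assumptions chosen so that a group can be unset on a successful match; $numb is assumed to be validated already:

```perl
use strict;
use warnings;

# Hypothetical helper: report which of the three outcomes occurred.
sub describe_capture {
    my ( $data, $regexp, $numb ) = @_;

    # An empty list here means the pattern did not match at all.
    my @capt = $data =~ /$regexp/
        or return 'no match';

    # The pattern matched, but this group may not have taken part.
    my $value = $capt[ $numb - 1 ];
    return defined $value ? "captured '$value'" : 'matched, group unset';
}

# The second group is optional, so it can be undef on a successful match.
my $regexp = '(\d)(\w)?(\d)';
print describe_capture( '84',  $regexp, 2 ), "\n";
print describe_capture( '8t4', $regexp, 2 ), "\n";
print describe_capture( 'xyz', $regexp, 2 ), "\n";
```

The one-liner with "defined and print" prints nothing in both the first and the third case, which is exactly the equivalence being assumed.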

Anno Siegel · Aug 27, 2004

Brian McCauley said:
In this particular case it is probably safe to assume that we want to
treat the case where the ${numb}th capture didn't capture as
equivalent to the case where /$regexp/ didn't match.

However it is important to be aware that you are making such an assumption.

You are right. I thought about explaining how it is okay (under this
assumption) to use the regex without explicitly checking if it matched,
but decided to let it slip. Thanks for pointing it out.

Anno

Sara · Aug 27, 2004

Hi there Leif:

Interesting question. As pointed out, $$numb will work nicely for you.
The odd thing being that this LOOKS like a scalar dereference, which
it really isn't since 2 isn't the memory location of the value. Seems
like there is an ambiguity in there somewhere but I can't pinpoint it.

Thanks for posting.

G

Brian McCauley · Aug 27, 2004

Sara said:
Interesting question. As pointed out, $$numb will work nicely for you.

For certain values of "nice".

The odd thing being that this LOOKS like a scalar dereference, which
it really isn't

Yes it is. It's a scalar dereference of a _symbolic_ reference.

since 2 isn't the memory location of the value.

If it were a _hard_ scalar reference then its numeric value would be the
address in memory.

Seems like there is an ambiguity in there somewhere but I can't pinpoint it.

No ambiguity. Perl's scalar values can contain either ordinary
strings/numbers or they can contain hard references. It is possible to
convert a hard reference[1] into an address in memory simply by using it
in a numeric context. It is not possible to go the other way[4]. If
you use a non-reference in a reference context then it will never be
treated as a memory address - it will be converted into a string and
looked up in the symbol table - i.e. it will be a symbolic reference.
Of course most of the time one has "strict qw(refs)" in effect which
causes symbolic references to be disallowed except in a few special
cases[2].

[1] (other than one to an overloaded type object)
[2] To do with symbolic CODErefs[3].
[3] And due to a bug any symrefs resolved at compile-time.
[4] In Perl - you can of course do anything you want by dropping down
into C.
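
A short sketch of the distinction, using a package variable named $answer that is invented for the example. The address printed will differ from run to run:

```perl
use strict;
use warnings;

our $answer = 42;    # a package variable, so it lives in the symbol table

# Hard reference: it points at the variable itself, and in numeric
# context it yields the address shown in its string form.
my $hard = \$answer;
printf "hard ref %s is 0x%x as a number\n", $hard, $hard;
print "via hard ref: $$hard\n";

# Symbolic reference: a plain string that is looked up by name.
# strict refs forbids this, so it is relaxed inside a small block.
my $name     = 'answer';
my $via_name = do {
    no strict 'refs';
    ${$name};
};
print "via symbolic ref: $via_name\n";
```

The string 'answer' is never treated as an address; it is only used as a key into the symbol table, which is why a lexical (my) variable could not be reached this way.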

Brian McCauley · Aug 28, 2004

bowsayge said:
Brian McCauley said to us:

For certain values of "nice".

[...]

Bowsayge hopes that this is a better value of "nice":

'8 t 4' =~ /(\d) (\w) (\d)/;
my $numb = 3;
print "matched: ", eval("\$$numb"), "\n";

There is no need to enable sym-refs.

All the reasons to avoid symrefs also apply to eval(STRING), only
more so.

Anno Siegel · Aug 28, 2004

bowsayge said:
Brian McCauley said to us:

For certain values of "nice".

[...]

Bowsayge hopes that this is a better value of "nice":

Not really.

'8 t 4' =~ /(\d) (\w) (\d)/;
my $numb = 3;
print "matched: ", eval("\$$numb"), "\n";

There is no need to enable sym-refs.

Sure. You can rewrite any symref using eval like that, so string
eval is the more general mechanism. It also allows Perl to break its own
rules in more ways than mere symrefs do, so it's higher in the hierarchy
of nastiness, not lower.

It is also ugly because it's disproportionate, in the way it would be
ugly to start a sawmill to make a toothpick from a twig. You are running
another Perl interpreter to interpret a program that reads "$1" or "$5" or
something.

That said, your solution is, of course, perfectly valid. The symref
solution needs to unexpectedly talk about "strict", and may need a
bare block to limit the effect. So "eval" is shorter and more to the
point, and it's arguably as readable. Since the string you eval is
entirely defined in the program text (as opposed to an external source),
there is no additional risk in "eval".

But "nicer", no.

Anno

Ben Morrow · Aug 31, 2004

Quoth Brian McCauley said:
It is possible to
convert a hard reference[1] into an address in memory simply by using it
in a numeric context. It is not possible to go the other way[4]. If

[1] NMF

[4] In Perl - you can of course do anything you want by dropping down
into C.

You don't need C: unpack 'P' will work nicely...

Ben
