using the result of a variable regular expression

Discussion in 'Perl Misc' started by leifwessman@hotmail.com, Aug 26, 2004.

  1. Guest

    Hi!

    I need to extract a certain value from a text. But the result isn't
    always in the variable $1 - it might be in $2, $3, $4 or some other
    predefined variable.

    Some code to illustrate my problem:

    $regexp = "(\d)(\w)(\d)";
    $numb = 3; # Means the result I'm looking for is in $3
    # I don't know this number, it's submitted by user
    # and may differ

    if ($data =~ /$regexp/) {

    print $numb; # does not work, prints "3"

    # alternative solution that works
    # but it's UGLY
    if ($numb == 1) {
    print $1;
    } elsif ($numb == 2) {
    print $2;
    } elsif ($numb == 3) {
    print $3;
    }

    # is there another way?
    }

    Thanks for any input!

    Leif
     
    , Aug 26, 2004
    #1

  2. wrote:

    > I need to extract a certain value from a text. But the result isn't
    > always in the variable $1 - it might be in $2, $3, $4 or some other
    > predefined variable.
    >
    > Some code to illustrate my problem:
    >
    > $regexp = "(\d)(\w)(\d)";
    > $numb = 3; # Means the result I'm looking for is in $3
    > # I don't know this number, it's submitted by user
    > # and may differ
    >
    > if ($data =~ /$regexp/) {
    >
    > print $numb; # does not work, prints "3"


    What you are trying to do is use something called a symbolic ref:

    print $$numb; # works - print value of $3

    But you have to be careful using symrefs...

    {
    # Untaint and check $numb - don't clobber $1 etc
    die 'Not a number' unless do { ($numb) = $numb =~ /(^\d+$)/ };
    no strict 'refs';
    print $$numb;
    }

    That said, I wouldn't use one myself, because I never use $1 etc. (other
    than in the RHS of s/// or in while(//g)).

    if (my @captures = $data =~ /$regexp/) {
    print $captures[$numb-1];
    }
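
A minimal sketch of the list-capture approach with the group number checked first, since $numb is submitted by the user. The sub name capture_at and the sample data are hypothetical, chosen only for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: return the $numb-th capture of $regexp applied to
# $data, or undef if $numb is not a valid group number, the match fails,
# or that group did not take part in the match.
sub capture_at {
    my ($data, $regexp, $numb) = @_;

    # Accept only a plain positive integer as the group number.
    return unless defined $numb && $numb =~ /\A[1-9][0-9]*\z/;

    # In list context a successful match returns ($1, $2, ...).
    my @captures = $data =~ /$regexp/
        or return;

    # An out-of-range index simply yields undef.
    return $captures[ $numb - 1 ];
}

my $found = capture_at('8t4', '(\d)(\w)(\d)', 3);
print defined $found ? "$found\n" : "no such capture\n";
```

Because the captures live in a lexical array, nothing here depends on $1, $2, ... still holding the values from this particular match.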
     
    Brian McCauley, Aug 26, 2004
    #2

  3. wrote:
    > I need to extract a certain value from a text. But the result isn't
    > always in the variable $1 - it might be in $2, $3, $4 or some other
    > predefined variable.
    >
    > Some code to illustrate my problem:


    Your problem starts before that code: You have not enabled strictures
    and warnings!

    use strict;
    use warnings;

    > $regexp = "(\d)(\w)(\d)";


    There is your second problem: $regexp gets the value '(d)(w)(d)', which
    is not what you want.

    my $regexp = '(\d)(\w)(\d)';
    -------------^------------^

    1) Please copy and paste code that you post, do not retype it!

    2) Warnings would have told you that something was wrong.
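
To see the quoting difference concretely, here is a small sketch. The variable names are illustrative only, and warnings are switched off around the double-quoted string so that the example runs quietly; with warnings on, that line is what would be reported:

```perl
use strict;
use warnings;

my $double = do {
    no warnings;                 # silence the "Unrecognized escape" warnings
    "(\d)(\w)(\d)";              # backslashes are dropped: (d)(w)(d)
};
my $single   = '(\d)(\w)(\d)';   # backslashes survive
my $compiled = qr/(\d)(\w)(\d)/; # precompiled pattern, no string quoting

print "double-quoted: $double\n";
print "single-quoted: $single\n";
print "matches 8t4:   ", ( '8t4' =~ $compiled ? "yes" : "no" ), "\n";
```

Using qr// is one way to sidestep the string-quoting question entirely, assuming the pattern is written in the program rather than read from elsewhere.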

    > $numb = 3; # Means the result I'm looking for is in $3
    > # I don't know this number, it's submitted by user
    > # and may differ
    >
    > if ($data =~ /$regexp/) {
    >
    > print $numb; # does not work, prints "3"
    >
    > # alternative solution that works
    > # but it's UGLY
    > if ($numb == 1) {
    > print $1;
    > } elsif ($numb == 2) {
    > print $2;
    > } elsif ($numb == 3) {
    > print $3;
    > }
    >
    > # is there another way?
    > }


    You can do:

    if ( my @capt = $data =~ /$regexp/ ) {
    print $capt[$numb-1];
    }

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
     
    Gunnar Hjalmarsson, Aug 26, 2004
    #3
  4. Anno Siegel Guest

    Gunnar Hjalmarsson <> wrote in comp.lang.perl.misc:
    > wrote:
    > > I need to extract a certain value from a text. But the result isn't
    > > always in the variable $1 - it might be in $2, $3, $4 or some other
    > > predefined variable.
    > >
    > > Some code to illustrate my problem:

    >
    > Your problem starts before that code: You have not enabled strictures
    > and warnings!
    >
    > use strict;
    > use warnings;
    >
    > > $regexp = "(\d)(\w)(\d)";

    >
    > There is your second problem: $regexp gets the value '(d)(w)(d)', which
    > is not what you want.
    >
    > my $regexp = '(\d)(\w)(\d)';
    > -------------^------------^
    >
    > 1) Please copy and paste code that you post, do not retype it!
    >
    > 2) Warnings would have told you that something was wrong.
    >
    > > $numb = 3; # Means the result I'm looking for is in $3
    > > # I don't know this number, it's submitted by user
    > > # and may differ
    > >
    > > if ($data =~ /$regexp/) {
    > >
    > > print $numb; # does not work, prints "3"
    > >
    > > # alternative solution that works
    > > # but it's UGLY
    > > if ($numb == 1) {
    > > print $1;
    > > } elsif ($numb == 2) {
    > > print $2;
    > > } elsif ($numb == 3) {
    > > print $3;
    > > }
    > >
    > > # is there another way?
    > > }

    >
    > You can do:
    >
    > if ( my @capt = $data =~ /$regexp/ ) {
    > print $capt[$numb-1];
    > }


    Or, without an auxiliary variable:

    defined and print for ( $data =~ /$regexp/ )[ $numb - 1];

    Anno
     
    Anno Siegel, Aug 26, 2004
    #4
  5. Anno Siegel Guest

    Gunnar Hjalmarsson <> wrote in comp.lang.perl.misc:
    > wrote:


    > > I need to extract a certain value from a text. But the result isn't
    > > always in the variable $1 - it might be in $2, $3, $4 or some other
    > > predefined variable.


    > You can do:
    >
    > if ( my @capt = $data =~ /$regexp/ ) {
    > print $capt[$numb-1];
    > }


    Or, without an auxiliary variable:

    defined and print for ( $data =~ /$regexp/ )[ $numb - 1];

    Anno
     
    Anno Siegel, Aug 26, 2004
    #5
  6. Tore Aursand Guest

    On Thu, 26 Aug 2004 09:19:34 -0700, leifwessman wrote:
    > Some code to illustrate my problem:
    > [...]


    Your code won't run. Please copy-and-paste working code, instead of
    retyping it.

    You should also add these:

    use strict;
    use warnings;

    > $regexp = "(\d)(\w)(\d)";
    > $numb = 3; # Means the result I'm looking for is in $3
    > # I don't know this number, it's submitted by user
    > # and may differ
    >
    > if ($data =~ /$regexp/) {
    >
    > print $numb; # does not work, prints "3"
    >
    > # alternative solution that works
    > # but it's UGLY
    > if ($numb == 1) {
    > print $1;
    > } elsif ($numb == 2) {
    > print $2;
    > } elsif ($numb == 3) {
    > print $3;
    > }
    >
    > # is there another way?
    > }


    You can match into an array:

    if ( my @match = $data =~ /$regexp/ ) {
    print $match[$numb-1];
    }


    --
    Tore Aursand <>
    "Computer science education cannot make anybody an expert programmer
    any more than studying brushes and pigment can make somebody an expert
    painter." (Eric Raymond)
     
    Tore Aursand, Aug 26, 2004
    #6
  7. wrote:
    >
    > I need to extract a certain value from a text. But the result isn't
    > always in the variable $1 - it might be in $2, $3, $4 or some other
    > predefined variable.
    >
    > Some code to illustrate my problem:
    >
    > $regexp = "(\d)(\w)(\d)";
    > $numb = 3; # Means the result I'm looking for is in $3
    > # I don't know this number, it's submitted by user
    > # and may differ
    >
    > if ($data =~ /$regexp/) {
    >
    > print $numb; # does not work, prints "3"
    >
    > # alternative solution that works
    > # but it's UGLY
    > if ($numb == 1) {
    > print $1;
    > } elsif ($numb == 2) {
    > print $2;
    > } elsif ($numb == 3) {
    > print $3;
    > }
    >
    > # is there another way?
    > }


    You are extracting single characters. How about substr()?

    print substr( $data, $numb - 1, 1 )


    Why not define your regexp based on the submitted value?

    my @fields = ( '\d', '\w', '\d' );
    $fields[ $numb - 1 ] = '(' . $fields[ $numb - 1 ] . ')';
    my $regexp = join '', @fields;
    if ( $data =~ /$regexp/ ) {
    print $1;
    }


    Or you could use the @+ and @- arrays:

    my $regexp = '(\d)(\w)(\d)';
    if ( $data =~ /$regexp/ ) {
    print substr( $data, $-[ $numb ], $+[ $numb ] - $-[ $numb ] );
    }
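
As a sketch of how the offsets line up: $-[N] holds the start offset of group N and $+[N] the offset just past its end, both taken from the last successful match. The helper name capture_by_offsets and the sample string are hypothetical, and $numb is assumed to have been checked to be an integer already:

```perl
use strict;
use warnings;

# Hypothetical helper: slice capture group $numb out of $data using the
# offset arrays @- and @+ set by the last successful match.
sub capture_by_offsets {
    my ($data, $regexp, $numb) = @_;

    return unless $data =~ /$regexp/;

    # $#+ is the number of groups in the pattern just matched, so this
    # rejects group numbers that do not exist.
    return if $numb < 1 || $numb > $#+;

    # An undefined start offset means the group took no part in the match.
    return unless defined $-[$numb];

    return substr( $data, $-[$numb], $+[$numb] - $-[$numb] );
}

print capture_by_offsets('id: 8t4', '(\d)(\w)(\d)', 2), "\n";
```

Here the match starts at offset 4, so group 2 spans offsets 5 to 6 and the call prints "t".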



    John
    --
    use Perl;
    program
    fulfillment
     
    John W. Krahn, Aug 26, 2004
    #7
  8. Anno Siegel wrote:

    > Gunnar Hjalmarsson <> wrote in comp.lang.perl.misc:
    >>You can do:
    >>
    >> if ( my @capt = $data =~ /$regexp/ ) {
    >> print $capt[$numb-1];
    >> }

    >
    >
    > Or, without an auxiliary variable:
    >
    > defined and print for ( $data =~ /$regexp/ )[ $numb - 1];


    In this particular case it is probably safe to assume that we want to
    treat the case where the ${numb}th capture didn't capture to be
    equivalent to the case where /$regex/ didn't match.

    However it is important to be aware that you are making such an assumption.
     
    Brian McCauley, Aug 27, 2004
    #8
  9. Anno Siegel Guest

    Brian McCauley <> wrote in comp.lang.perl.misc:
    > Anno Siegel wrote:
    >
    > > Gunnar Hjalmarsson <> wrote in comp.lang.perl.misc:
    > >>You can do:
    > >>
    > >> if ( my @capt = $data =~ /$regexp/ ) {
    > >> print $capt[$numb-1];
    > >> }

    > >
    > >
    > > Or, without an auxiliary variable:
    > >
    > > defined and print for ( $data =~ /$regexp/ )[ $numb - 1];

    >
    > In this particular case it is probably safe to assume that we want to
    > treat the case where the ${numb}th capture didn't capture to be
    > equivalent to the case where /$regex/ didn't match.
    >
    > However it is important to be aware that you are making such an assumption.


    You are right. I thought about explaining how it is okay (under this
    assumption) to use the regex without explicitly checking if it matched,
    but decided to let it slip. Thanks for pointing it out.

    Anno
     
    Anno Siegel, Aug 27, 2004
    #9
  10. Sara Guest

    wrote in message news:<cgl2im$>...
    > Hi!
    >
    > I need to extract a certain value from a text. But the result isn't
    > always in the variable $1 - it might be in $2, $3, $4 or some other
    > predefined variable.
    >
    > Some code to illustrate my problem:
    >
    > $regexp = "(\d)(\w)(\d)";
    > $numb = 3; # Means the result I'm looking for is in $3
    > # I don't know this number, it's submitted by user
    > # and may differ
    >
    > if ($data =~ /$regexp/) {
    >
    > print $numb; # does not work, prints "3"
    >
    > # alternative solution that works
    > # but it's UGLY
    > if ($numb == 1) {
    > print $1;
    > } elsif ($numb == 2) {
    > print $2;
    > } elsif ($numb == 3) {
    > print $3;
    > }
    >
    > # is there another way?
    > }
    >
    > Thanks for any input!
    >
    > Leif


    Hi there Leif:

    Interesting question. As pointed out, $$numb will work nicely for you.
    The odd thing being that this LOOKS like a scalar dereference, which
    it really isn't since 2 isn't the memory location of the value. Seems
    like there is an ambiguity in there somewhere but I can't pinpoint it.

    Thanks for posting.

    G
     
    Sara, Aug 27, 2004
    #10
  11. Following up in the wrong part of this thread Sara wrote:
    > wrote in message news:<cgl2im$>...
    >
    >>if ($numb == 1) {
    >>print $1;
    >>} elsif ($numb == 2) {
    >> print $2;
    >>} elsif ($numb == 3) {
    >> print $3;
    >>}

    >
    > Interesting question. As pointed out, $$numb will work nicely for you.


    For certain values of "nice".

    > The odd thing being that this LOOKS like a scalar dereference, which
    > it really isn't


    Yes it is. It's a scalar dereference of a _symbolic_ reference.

    > since 2 isn't the memory location of the value.


    If it were a _hard_ scalar reference then its numeric value would be the
    address in memory.

    > Seems like there is an ambiguity in there somewhere but I can't pinpoint it.


    No ambiguity. Perl's scalar values can contain either ordinary
    strings/numbers or they can contain hard references. It is possible to
    convert a hard reference[1] into an address in memory simply by using it
    in a numeric context. It is not possible to go the other way[4]. If
    you use a non-reference in a reference context then it will never be
    treated as a memory address - it will be converted into a string and
    looked up in the symbol table - i.e. it will be a symbolic reference.
    Of course most of the time one has "strict qw(refs)" in effect, which
    causes symbolic references to be disallowed except in a few special
    cases[2].

    [1] (other than one to an overloaded type object)
    [2] To do with symbolic CODErefs[3].
    [3] And due to a bug any symrefs resolved at compile-time.
    [4] In Perl - you can of course do anything you want by dropping down
    into C.
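
A small sketch of the distinction, assuming nothing beyond core Perl and the core module Scalar::Util; the variable names are made up for illustration. A hard reference numifies to its address, while a plain string used as a reference is refused under strict refs and looked up by name without it:

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my $value = 'hello';
my $hard  = \$value;

# In numeric context a (non-overloaded) hard reference yields its address.
printf "numified: %d, refaddr: %d\n", $hard + 0, refaddr($hard);

# A plain string in reference context is a symbolic reference, which
# "strict refs" turns into a run-time error.
my $name = '3';
my $got  = eval { my $v = ${$name}; 1 };
print "symbolic deref refused: $@" unless $got;

# With strict refs relaxed in a small block, the string is looked up as a
# variable name, so '3' reaches the capture variable $3.
if ( '8t4' =~ /(\d)(\w)(\d)/ ) {
    no strict 'refs';
    print "via symref: ", ${$name}, "\n";
}
```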
     
    Brian McCauley, Aug 27, 2004
    #11
  12. bowsayge wrote:

    > Brian McCauley said to us:
    >
    >
    >>>Interesting question. As pointed out, $$numb will work nicely for you.

    >>
    >>For certain values of "nice".
    >>

    >
    > [...]
    >
    > Bowsayge hopes that this is a better value of "nice":
    >
    > '8 t 4' =~ /(\d) (\w) (\d)/;
    > my $numb = 3;
    > print "matched: ", eval("\$$numb"), "\n";
    >
    > There is no need to enable sym-refs.


    All the reasons to avoid symrefs also apply to eval(STRING), only
    more so.
     
    Brian McCauley, Aug 28, 2004
    #12
  13. Anno Siegel Guest

    bowsayge <> wrote in comp.lang.perl.misc:
    > Brian McCauley said to us:
    >
    > >> Interesting question. As pointed out, $$numb will work nicely for you.

    > >
    > > For certain values of "nice".
    > >

    > [...]
    >
    > Bowsayge hopes that this is a better value of "nice":


    Not really.

    > '8 t 4' =~ /(\d) (\w) (\d)/;
    > my $numb = 3;
    > print "matched: ", eval("\$$numb"), "\n";
    >
    > There is no need to enable sym-refs.


    Sure. You can re-write any symref using eval like that, so string
    eval is the more general mechanism. It also allows Perl to break its own
    rules in more ways than mere symrefs do, so it's higher in the hierarchy
    of nastiness, not lower.

    It is also ugly because it's disproportionate, in the way it would be
    ugly to start a sawmill to make a toothpick from a twig. You are running
    another Perl interpreter to interpret a program that reads "$1" or "$5" or
    something.

    That said, your solution is, of course, perfectly valid. The symref
    solution needs to unexpectedly talk about "strict", and may need a
    bare block to limit the effect. So "eval" is shorter and more to the
    point, and it's arguably as readable. Since the string you eval is
    entirely defined in the program text (as opposed to an external source),
    there is no additional risk in "eval".

    But "nicer", no.

    Anno
     
    Anno Siegel, Aug 28, 2004
    #13
  14. Ben Morrow Guest

    Quoth Brian McCauley <>:
    > It is possible to
    > convert a hard reference[1] into an address in memory simply by using it
    > in a numeric context. It is not possible to go the other way[4]. If


    [1] NMF

    > [4] In Perl - you can of course do anything you want by dropping down
    > into C.


    You don't need C: unpack 'P' will work nicely... :)

    Ben

    --
    Although few may originate a policy, we are all able to judge it.
    - Pericles of Athens, c.430 B.C.
     
    Ben Morrow, Aug 31, 2004
    #14