noob question: Trying to extract part of a string in a variable to another variable

Discussion in 'Perl Misc' started by cayenne, Apr 25, 2004.

  1. cayenne

    cayenne Guest

    Hello all,
    I'm a perl noob...and just can't quite figure out how to do something
    that should be pretty simple.

    Here's an example.

    I have $mail_address = 'fred jones <>'

    I want to use regular expressions to just parse out the userid here of
    fred_jones

    I'm trying things like this:

    $mail_address =~ /\w+@/;

    But, doesn't seem to work. I'm a little hazy on exactly how the =~
    works...through examples I've successfully used it for substitutions
    like x =~ s/tom/joe/g; but, I'm just wanting to match a regular
    expression and extract it to the variable...or even to another
    variable leaving $mail_address unchanged.

    I've looked in books at the substr() function, but, I don't know how
    to use regular expressions to find the offset point, etc.

    Can someone give me an example...or pointers to a good reference on
    this type of thing?

    Thanks in advance,

    chilecayenne
     
    cayenne, Apr 25, 2004
    #1

  2. cayenne

    gnari Guest

    "cayenne" <> wrote in message
    news:...
    > I'm trying things like this:
    >
    > $mail_address =~ /\w+@/;
    >
    > But, doesn't seem to work.


    'doesn't seem to work' does not tell us anything
    except that you expected it to do something other
    than what it does. many of us have negligible PSI
    powers, so it does not help us much.

    on the other hand, maybe what you want is:

    my ($id)= $mail_address =~ /(\w+)@/;

    >
    > I've looked in books at the substr() function, but, I don't know how
    > to use regular expressions to find the offset point, etc.


    >
    > Can someone give me an example...or pointers to a good reference on
    > this type of thing?



    take a look at the perl documentation:
    perldoc perlop
    perldoc perlre

    gnari
     
    gnari, Apr 25, 2004
    #2

  3. cayenne wrote:
    > Here's an example.
    >
    > I have $mail_address = 'fred jones <>'
    >
    > I want to use regular expressions to just parse out the userid here of
    > fred_jones
    >
    > I'm trying things like this:
    >
    > $mail_address =~ /\w+@/;
    >
    > But, doesn't seem to work.


    Please define "doesn't seem to work". What exactly do you expect that
    statement to do and what do you observe instead? Like, what do you mean by
    "parse out"? Do you want to remove the userid from the string? Or do you
    want to capture the userid in a different variable?

    > I'm a little hazy on exactly how the =~
    > works...


    It is the binding operator. When it is used, the substitution or match is
    applied to the variable on its left side instead of to the default $_.

    > through examples I've successfully used it for substitutions
    > like x =~ s/tom/joe/g; but, I'm just wanting to match a regular
    > expression and extract it to the variable...or even to another
    > variable leaving $mail_address unchanged.


    Well, Perl regular expressions do that automatically. Just use grouping:

    my $mail_address = 'fred jones <>';
    $mail_address =~ /(\w+)@/;
    print $1;

    For further details see "perldoc perlretut", or for the advanced part
    "perldoc perlre".

    However, I hope you are aware that '\w' does not even begin to cover the
    full set of possible email aliases.
    Please see "perldoc -q valid", third paragraph for further information.

    > I've looked in books at the substr() function, but, I don't know how
    > to use regular expressions to find the offset point, etc.


    You don't. You would use index() to find the position of a character or
    string in a text.
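
    For comparison, here is a minimal sketch of the index()/substr() route.
    The address is a made-up placeholder (the one in the original post was
    stripped), and it assumes the userid sits between '<' and '@':

```perl
use strict;
use warnings;

# Hypothetical address, used only for illustration.
my $mail_address = 'Fred Jones <fred_jones@example.com>';

my $start = index( $mail_address, '<' );    # -1 if not found
my $at    = index( $mail_address, '@' );    # -1 if not found

my $userid;
if ( $start >= 0 && $at > $start ) {
    # Take the characters between '<' and '@'.
    $userid = substr( $mail_address, $start + 1, $at - $start - 1 );
}

print defined $userid ? "Userid = [$userid]\n" : "No userid found\n";
```

    A capturing regex is usually shorter, but this shows how the offsets
    from index() feed into substr().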

    jue
     
    Jürgen Exner, Apr 25, 2004
    #3
  4. cayenne

    Bob Walton Guest

    cayenne wrote:

    ....


    > I have $mail_address = 'fred jones <>'
    >
    > I want to use regular expressions to just parse out the userid here of
    > fred_jones

    ....


    > Can someone give me an example...or pointers to a good reference on
    > this type of thing?

    ....
    > chilecayenne
    >


    Try:

    my($userid)=$mail_address=~/(\w+)@/;

    References:

    perldoc perlre
    perldoc perlretut
    perldoc perlop

    The books: "Learning Perl (3rd edition)", "Programming Perl (3rd
    edition)" and "Mastering Regular Expressions (2nd edition)".

    Online: learn.perl.org, www.perl.com, www.perldoc.com

    --
    Bob Walton
    Email: http://bwalton.com/cgi-bin/emailbob.pl
     
    Bob Walton, Apr 25, 2004
    #4
  5. Re: noob question: Trying to extract part of a string in a variable to another variable

    cayenne schrieb:
    > Hello all,
    > I'm a perl noob...and just can't quite figure out how to do something
    > that should be pretty simple.
    >
    > Here's an example.
    >
    > I have $mail_address = 'fred jones <>'
    >
    > I want to use regular expressions to just parse out the userid here of
    > fred_jones
    >
    > I'm trying things like this:
    >
    > $mail_address =~ /\w+@/;
    >
    > But, doesn't seem to work. I'm a little hazy on exactly how the =~
    > works...through examples I've successfully used it for substitutions
    > like x =~ s/tom/joe/g; but, I'm just wanting to match a regular
    > expression and extract it to the variable...or even to another
    > variable leaving $mail_address unchanged.
    >
    > I've looked in books at the substr() function, but, I don't know how
    > to use regular expressions to find the offset point, etc.
    >
    > Can someone give me an example...or pointers to a good reference on
    > this type of thing?
    >
    > Thanks in advance,
    >
    > chilecayenne


    Hi,

    you have to mark the part you want to get.

    $mail_address =~ m/(\w+?)@/;
    $name = $1;

    Use parentheses to mark what you want. You will find the result in $1. If
    you specify more than one group, you will find the second capture in $2.
    The question mark inside the parentheses makes the quantifier non-greedy,
    so that you do not get as much as possible into your result (if there are
    two or more @).
    Another way to get the result is:

    my @result = $mail_address =~ m/(\w+?)@/;

    In $result[0] you will then find the name.
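
    A small sketch comparing the greedy and non-greedy forms on a made-up
    address (hypothetical, since the original one was stripped). Because \w
    can never match '@', both forms should capture the same text here:

```perl
use strict;
use warnings;

# Hypothetical address, used only for illustration.
my $mail_address = 'Fred Jones <fred_jones@example.com>';

my ($greedy) = $mail_address =~ /(\w+)@/;     # greedy quantifier
my ($lazy)   = $mail_address =~ /(\w+?)@/;    # non-greedy quantifier

print "greedy = [$greedy]\n";
print "lazy   = [$lazy]\n";
print "same result\n" if $greedy eq $lazy;
```

    The non-greedy form only changes the outcome when the quantified part
    could also match whatever follows it in the pattern.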

    Milo
     
    Milo Minderbinder, Apr 25, 2004
    #5
  6. cayenne

    Web Surfer Guest

    [This followup was posted to comp.lang.perl.misc]

    In article <>,
    says...
    > Hello all,
    > I'm a perl noob...and just can't quite figure out how to do something
    > that should be pretty simple.
    >
    > Here's an example.
    >
    > I have $mail_address = 'fred jones <>'
    >
    > I want to use regular expressions to just parse out the userid here of
    > fred_jones
    >
    > I'm trying things like this:
    >
    > $mail_address =~ /\w+@/;
    >
    > But, doesn't seem to work. I'm a little hazy on exactly how the =~
    > works...through examples I've successfully used it for substitutions
    > like x =~ s/tom/joe/g; but, I'm just wanting to match a regular
    > expression and extract it to the variable...or even to another
    > variable leaving $mail_address unchanged.
    >
    > I've looked in books at the substr() function, but, I don't know how
    > to use regular expressions to find the offset point, etc.
    >
    > Can someone give me an example...or pointers to a good reference on
    > this type of thing?
    >
    > Thanks in advance,
    >
    > chilecayenne
    >



    #!/usr/bin/perl -w

    use strict;

    my ( $mail_address , $userid );

    $mail_address = 'fred jones <>';
    $mail_address =~ /(\w+)@/;

    $userid = $1;

    print "Userid = [$userid]\n";

    exit 0;
     
    Web Surfer, Apr 25, 2004
    #6
  7. Jürgen Exner <> wrote:

    > Just use grouping:
    >
    > my $mail_address = 'fred jones <>';
    > $mail_address =~ /(\w+)@/;
    > print $1;



    But don't use it like that!

    You should never use the dollar-digit variables without first ensuring
    that the match *succeeded*.

    if ( $mail_address =~ /(\w+)@/ ) {
        print $1;
    }
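
    The same check can be wrapped in a small helper. This is only a sketch;
    extract_userid is a hypothetical name and the addresses are made up:

```perl
use strict;
use warnings;

# Returns the captured userid, or undef when the pattern does not match.
sub extract_userid {
    my ($address) = @_;
    if ( $address =~ /(\w+)@/ ) {
        return $1;    # only read $1 after a successful match
    }
    return undef;
}

# Hypothetical inputs: one that matches and one that does not.
for my $address ( 'Fred Jones <fred_jones@example.com>', 'no at sign here' ) {
    my $userid = extract_userid($address);
    print defined $userid
        ? "Userid = [$userid]\n"
        : "No userid in [$address]\n";
}
```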


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Apr 25, 2004
    #7
  8. Milo Minderbinder <> wrote:

    [ snip full-quote, please don't do that]

    > you have to mark the part you want to get.
    >
    > $mail_address =~ m/(\w+?)@/;
    > $name = $1;
    >
    > Take brackets to mark what you want. You will find the result in $1.

    ^^^^
    ^^^^

    No, you *might* find the result in $1.

    If you've tested that the match *succeeded*,
    _then_ you will find the result in $1.


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Apr 25, 2004
    #8
  9. Web Surfer <> wrote:

    > $mail_address =~ /(\w+)@/;
    > $userid = $1;



    What is with this epidemic of teaching the WRONG way in this thread?


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Apr 25, 2004
    #9
  10. Robin wrote:

    > Regular expressions are not the right way to find the offset unless you
    > want to use $1 and $2 and $3...etc, and then use index, it still isn't an
    > optimal way to find the offset point.


    Darn right it's not. If your pattern has subexpressions, then on a match the
    offset of each subexpression appears in the @- array. That is, the offset
    of $1 is in $-[0], $2 is in $-[1], and so forth.

    Note that offsets, no matter how they're found, are irrelevant to the
    original question anyway. All he wanted was the value of the matched
    substring, not its position. He was thinking he might need to offset to get
    the substring, but he was barking in the wrong forest with that idea.

    So tell me Robin, when are you going to stop posting nonsense answers to
    questions you don't understand?

    sherm--

    --
    Cocoa programming in Perl: http://camelbones.sourceforge.net
    Hire me! My resume: http://www.dot-app.org
     
    Sherm Pendley, Apr 26, 2004
    #10
  11. cayenne

    Robin Guest

    "cayenne" <> wrote in message
    news:...
    > Hello all,
    > I'm a perl noob...and just can't quite figure out how to do something
    > that should be pretty simple.
    >
    > Here's an example.
    >
    > I have $mail_address = 'fred jones <>'
    >
    > I want to use regular expressions to just parse out the userid here of
    > fred_jones
    >
    > I'm trying things like this:
    >
    > $mail_address =~ /\w+@/;
    >
    > But, doesn't seem to work. I'm a little hazy on exactly how the =~
    > works...through examples I've successfully used it for substitutions
    > like x =~ s/tom/joe/g; but, I'm just wanting to match a regular
    > expression and extract it to the variable...or even to another
    > variable leaving $mail_address unchanged.
    >
    > I've looked in books at the substr() function, but, I don't know how
    > to use regular expressions to find the offset point, etc.
    >
    > Can someone give me an example...or pointers to a good reference on
    > this type of thing?
    >
    > Thanks in advance,
    >
    > chilecayenne


    Regular expressions are not the right way to find the offset unless you want
    to use $1 and $2 and $3...etc, and then use index, it still isn't an optimal
    way to find the offset point. Just change your regular expression so it looks
    like the other code, man I'm so tired.
    -Robin
     
    Robin, Apr 26, 2004
    #11
  12. cayenne

    Joe Smith Guest

    Re: noob question: Trying to extract part of a string in a variable to another variable

    Sherm Pendley wrote:

    > If your pattern has subexpressions, then on a match the
    > offset of each subexpression appears in the @- array. That is, the offset
    > of $1 is in $-[0], $2 is in $-[1], and so forth.


    Incorrect. The offset of $& is in $-[0], the offset of $1 is in $-[1], etc.
    -Joe
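
    A short sketch of how @- and @+ line up with substr(), using a made-up
    address (the original one was stripped). $-[1] is the start offset of
    $1 and $+[1] is the offset just past its end:

```perl
use strict;
use warnings;

# Hypothetical address, used only for illustration.
my $mail_address = 'Fred Jones <fred_jones@example.com>';

my ( $start, $end, $userid );
if ( $mail_address =~ /(\w+)@/ ) {
    $start  = $-[1];    # offset where $1 begins
    $end    = $+[1];    # offset just past the end of $1
    $userid = substr( $mail_address, $start, $end - $start );
    print "Capture runs from offset $start up to $end: [$userid]\n";
}
```

    In this pattern the whole match ($&) also starts where $1 starts, but it
    ends one character later because it includes the '@'.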
     
    Joe Smith, Apr 26, 2004
    #12
  13. cayenne

    Anno Siegel Guest

    Jürgen Exner <> wrote in comp.lang.perl.misc:
    > cayenne wrote:


    [...]

    > > I've looked in books at the substr() function, but, I don't know how
    > > to use regular expressions to find the offset point, etc.

    >
    > You don't.


    Ah, but you do, though not in this case. The @- and @+ arrays are
    there to support it.

    Anno
     
    Anno Siegel, Apr 26, 2004
    #13
  14. In article <>,
    (cayenne) wrote:

    > I have $mail_address = 'fred jones <>'
    >
    > I want to use regular expressions to just parse out the userid here of
    > fred_jones
    >
    > I'm trying things like this:
    >
    > $mail_address =~ /\w+@/;


    What you seem to be asking for is this:

    my ($user_id) = ($mail_address =~ m/(\w+)@/);

    However, please note that \w doesn't really have the complete set of
    valid characters to prefix the '@' sign in an email address.

    Just off the top of my head, I know that '.', '-', '?', '=', and more
    are valid. Possibly any unicode character other than whitespace and '@'
    are valid. It might even be valid to have '<' in an email address.

    At the very least, you probably want

    my ($user_id) = ($mail_address =~ m/([\w.-+=]+)@/);

    HTH,
    Ricky

    --
    Pukku
     
    Richard Morse, Apr 26, 2004
    #14
  15. Richard Morse <> wrote:
    > At the very least, you probably want
    >
    > my ($user_id) = ($mail_address =~ m/([\w.-+=]+)@/);



    Be careful where you use '-' inside a range:
    Invalid [] range ".-+" before HERE mark in regex m/([\w.-+ << HERE =]+)@/

    Put the hyphen last: [\w.+=-]
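
    A quick sketch of the placements that compile, tried on a made-up
    address (hypothetical). A hyphen at the end or start of the class, or
    one escaped with a backslash, is taken literally:

```perl
use strict;
use warnings;

# Hypothetical address containing '.', '-' and '+' in the userid.
my $mail_address = 'Fred Jones <fred.jones-smith+perl@example.com>';

my ($hyphen_last)    = $mail_address =~ /([\w.+=-]+)@/;
my ($hyphen_first)   = $mail_address =~ /([-\w.+=]+)@/;
my ($hyphen_escaped) = $mail_address =~ /([\w.\-+=]+)@/;

print "last    = [$hyphen_last]\n";
print "first   = [$hyphen_first]\n";
print "escaped = [$hyphen_escaped]\n";
```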

    --
    Glenn Jackman
    NCF Sysadmin
     
    Glenn Jackman, Apr 26, 2004
    #15
  16. Glenn Jackman <> wrote:

    > Put the hyphen last: [\w.+=-]



    Or first.


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Apr 26, 2004
    #16
  17. cayenne

    cayenne Guest

    Richard Morse <> wrote in message news:<>...
    > In article <>,
    > (cayenne) wrote:
    >
    > > I have $mail_address = 'fred jones <>'
    > >
    > > I want to use regular expressions to just parse out the userid here of
    > > fred_jones
    > >
    > > I'm trying things like this:
    > >
    > > $mail_address =~ /\w+@/;

    >
    > What you seem to be asking for is this:
    >
    > my ($user_id) = ($mail_address =~ m/(\w+)@/);
    >
    > However, please note that \w doesn't really have the complete set of
    > valid characters to prefix the '@' sign in an email address.
    >
    > Just off the top of my head, I know that '.', '-', '?', '=', and more
    > are valid. Possibly any unicode character other than whitespace and '@'
    > are valid. It might even be valid to have '<' in an email address.
    >
    > At the very least, you probably want
    >
    > my ($user_id) = ($mail_address =~ m/([\w.-+=]+)@/);
    >
    > HTH,
    > Ricky


    Just quickly, can you explain the extensive use of parens here? I
    understand the () in the regular expression, to keep those parts the
    match...but, what is the function of the () around $user_id and the
    entire part after the = sign?

    Thanks in advance,

    CC
     
    cayenne, May 19, 2004
    #17
  18. In article <>,
    (cayenne) wrote:

    > Richard Morse <> wrote in message
    > news:<>...
    >
    > > my ($user_id) = ($mail_address =~ m/([\w.-+=]+)@/);

    >
    > Just quickly, can you explain the extensive use of parens here? I
    > understand the () in the regular expression, to keep those parts the
    > match...but, what is the function of the () around $user_id and the
    > entire part after the = sign?


    Parens around $user_id force the match to happen in a list context. A
    match in a scalar context would return the number of matches, while in a
    list context, it returns the various matches.

    my $user_id = ($mail_address =~ m/.../)

    would have $user_id be the value 1 (because there is one match, as it
    isn't a /g match).

    The parens around the match are there because it makes it easier for me
    to read it. I've never not put them there, although a quick test I just
    did seems to indicate that they aren't necessary.
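
    A minimal sketch of the two contexts side by side, on a made-up address
    (hypothetical). As far as perlop describes it, a non-/g match in scalar
    context yields a true or false value rather than a count, which is why
    the result is 1 on success:

```perl
use strict;
use warnings;

# Hypothetical address, used only for illustration.
my $mail_address = 'Fred Jones <fred_jones@example.com>';

# Scalar context: the result says whether the match succeeded.
my $matched = $mail_address =~ /(\w+)@/;

# List context: the result is the list of captured substrings.
my ($user_id) = $mail_address =~ /(\w+)@/;

print "scalar context gives [$matched]\n";
print "list context gives   [$user_id]\n";
```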

    HTH,
    Ricky

    --
    Pukku
     
    Richard Morse, May 19, 2004
    #18
  19. cayenne

    Paul Lalli Guest

    Re: noob question: Trying to extract part of a string in a variable to another variable

    On Wed, 19 May 2004, cayenne wrote:

    > Richard Morse <> wrote in message news:<>...
    > >
    > > my ($user_id) = ($mail_address =~ m/([\w.-+=]+)@/);
    > >

    > Just quickly, can you explain the extensive use of parens here? I
    > understand the () in the regular expression, to keep those parts the
    > match...but, what is the function of the () around $user_id and the
    > entire part after the = sign?
    >


    The parens around $user_id force the binding operation of =~ to be
    evaluated in list context. This is done because a pattern match in list
    context returns a list of all of the captured matches (ie, the things that
    go into $1, $2, etc). This is a shorthand way of writing the two
    statements:

    $mail_address =~ m/([\w.-+=]+)@/;
    my $user_id = $1;

    The parens around the whole pattern match here are actually unnecessary.
    This is because the =~ operator has a higher precedence than the =
    operator. They are likely used here just for clarity, to make sure the
    readers of the code are aware that ($user_id) is being assigned the
    return value of the pattern match, rather than the alternate
    interpretation of $mail_address being assigned to $user_id and the result
    then being matched against the pattern, which would be written like so:

    (my $user_id = $mail_address) =~ m/([\w.-+=]+)@/;

    Please let me know if this is not clear enough.

    Paul Lalli
     
    Paul Lalli, May 19, 2004
    #19
  20. Re: noob question: Trying to extract part of a string in a variable to another variable

    Paul Lalli wrote:
    >
    > On Wed, 19 May 2004, cayenne wrote:
    >
    > > Richard Morse <> wrote in message news:<>...
    > > >
    > > > my ($user_id) = ($mail_address =~ m/([\w.-+=]+)@/);

    > >
    > > Just quickly, can you explain the extensive use of parens here? I
    > > understand the () in the regular expression, to keep those parts the
    > > match...but, what is the function of the () around $user_id and the
    > > entire part after the = sign?

    >
    > The parens around $user_id force the binding operation of =~ to be
    > evaluated in list context. This is done because a pattern match in list
    > context returns a list of all of the captured matches (ie, the things that
    > go into $1, $2, etc). This is a shorthand way of writing the two
    > statements:
    >
    > $mail_address =~ m/([\w.-+=]+)@/;
    > my $user_id = $1;


    They are not the same at all. If the match fails the first will set
    $user_id to undef but your version will set $user_id to the contents of
    a previously successful match's capturing parentheses or ''.
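
    A small sketch of the difference, with made-up strings (hypothetical).
    The first match succeeds and sets $1; the second fails, so $1 still
    holds the old value, while the list assignment leaves its variable
    undefined:

```perl
use strict;
use warnings;

# An earlier, successful match sets $1.
'first@example.com' =~ /(\w+)@/;

my $address = 'no at sign here';

# Two-statement version: the match fails, but $1 is still read.
$address =~ /(\w+)@/;
my $stale = $1;

# List-assignment version: a failed match returns an empty list.
my ($safe) = $address =~ /(\w+)@/;

print "two statements give [$stale]\n";
print "list assignment gives ",
    ( defined $safe ? "[$safe]" : "undef" ), "\n";
```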




    John
    --
    use Perl;
    program
    fulfillment
     
    John W. Krahn, May 20, 2004
    #20