noob question: Trying to extract part of a string in a variable to another variable

cayenne · Apr 25, 2004

Hello all,
I'm a perl noob...and just can't quite figure out how to do something
that should be pretty simple.

Here's an example.

I have $mail_address = 'fred jones <[email protected]>'

I want to use regular expressions to just parse out the userid here of
fred_jones

I'm trying things like this:

$mail_address =~ /\w+@/;

But, doesn't seem to work. I'm a little hazy on exactly how the =~
works...through examples I've successfully used it for substitutions
like x =~ s/tom/joe/g; but, I'm just wanting to match a regular
expression and extract it to the variable...or even to another
variable leaving $mail_address unchanged.

I've looked in books at the substr() function, but, I don't know how
to use regular expressions to find the offset point, etc.

Can someone give me an example...or pointers to a good reference on
this type of thing?

Thanks in advance,

chilecayenne

gnari · Apr 25, 2004

cayenne said:
I'm trying things like this:

$mail_address =~ /\w+@/;

But, doesn't seem to work.

'doesn't seem to work' does not tell us anything
except that you expected it to do something other
than what it does. many of us have negligible PSI
powers, so it does not help us much.

on the other hand, maybe what you want is:

my ($id)= $mail_address =~ /(\w+)@/;

I've looked in books at the substr() function, but, I don't know how
to use regular expressions to find the offset point, etc.

Can someone give me an example...or pointers to a good reference on
this type of thing?

take a look at the perl documentation:
perldoc perlop
perldoc perlre

gnari

Jürgen Exner · Apr 25, 2004

cayenne said:
Here's an example.

I have $mail_address = 'fred jones <[email protected]>'

I want to use regular expressions to just parse out the userid here of
fred_jones

I'm trying things like this:

$mail_address =~ /\w+@/;

But, doesn't seem to work.

Please define "doesn't seem to work". What exactly do you expect that
statement to do and what do you observe instead? Like, what do you mean by
"parse out"? Do you want to remove the userid from the string? Or do you
want to capture the userid in a different variable?

I'm a little hazy on exactly how the =~
works...

It is the binding operator. When it is used, the substitution or match is
applied to the variable on its left side instead of to the default $_.
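A minimal sketch of that difference, assuming a made-up address (the example.com string below is a stand-in, not the poster's data): the first match is bound to $mail_address with =~, the second has no binding and so looks at $_.

```perl
use strict;
use warnings;

# Hypothetical sample data; the address is a stand-in, not from the thread.
my $mail_address = 'fred jones <fred_jones@example.com>';

# With =~ the match is applied to the variable on the left.
my $bound = ($mail_address =~ /\w+@/) ? 'matched' : 'no match';

# Without =~ the match is applied to $_ (set here by the for loop).
my $default;
for ('no at-sign here') {
    $default = /\w+@/ ? 'matched' : 'no match';
}

print "bound: $bound, default: $default\n";
```

Neither match changes $mail_address; only s/// and tr/// modify the bound variable.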

through examples I've successfully used it for substitutions
like x =~ s/tom/joe/g; but, I'm just wanting to match a regular
expression and extract it to the variable...or even to another
variable leaving $mail_address unchanged.

Well, Perl regular expressions do that automatically. Just use grouping:

my $mail_address = 'fred jones <[email protected]>';
$mail_address =~ /(\w+)@/;
print $1;

For further details see "perldoc perlretut", or for the advanced part
"perldoc perlre".

However, I hope you are aware that '\w' does not even begin to cover the
full set of possible email aliases.
Please see "perldoc -q valid", third paragraph for further information.

I've looked in books at the substr() function, but, I don't know how
to use regular expressions to find the offset point, etc.

You don't. You would use index() to find the position of a character or
string in a text.
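For comparison, a rough sketch of the index()/substr() route, assuming the userid always sits between '<' and '@' (that layout and the example.com address are assumptions for illustration):

```perl
use strict;
use warnings;

# Hypothetical sample data; the address is a stand-in, not from the thread.
my $mail_address = 'fred jones <fred_jones@example.com>';

my $start = index($mail_address, '<');   # position of '<', or -1 if absent
my $end   = index($mail_address, '@');   # position of '@', or -1 if absent

my $userid;
if ($start >= 0 && $end > $start) {
    # Take the characters strictly between '<' and '@'.
    $userid = substr($mail_address, $start + 1, $end - $start - 1);
}

my $message = defined $userid ? "userid: $userid" : 'no userid found';
print "$message\n";
```

This works for fixed delimiters, but a capturing regex is usually shorter for this kind of extraction.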

jue

Bob Walton · Apr 25, 2004

cayenne wrote:

....

I have $mail_address = 'fred jones <[email protected]>'

I want to use regular expressions to just parse out the userid here of
fred_jones ....

Can someone give me an example...or pointers to a good reference on
this type of thing? ....
chilecayenne

Try:

my($userid)=$mail_address=~/(\w+)@/;

References:

perldoc perlre
perldoc perlretut
perldoc perlop

The books: "Learning Perl (3rd edition)", "Programming Perl (3rd
edition)" and "Mastering Regular Expressions (2nd edition)".

Online: learn.perl.org, www.perl.com, www.perldoc.com

Milo Minderbinder · Apr 25, 2004

cayenne said:
Hello all,
I'm a perl noob...and just can't quite figure out how to do something
that should be pretty simple.

Here's an example.

I have $mail_address = 'fred jones <[email protected]>'

I want to use regular expressions to just parse out the userid here of
fred_jones

I'm trying things like this:

$mail_address =~ /\w+@/;

But, doesn't seem to work. I'm a little hazy on exactly how the =~
works...through examples I've successfully used it for substitutions
like x =~ s/tom/joe/g; but, I'm just wanting to match a regular
expression and extract it to the variable...or even to another
variable leaving $mail_address unchanged.

I've looked in books at the substr() function, but, I don't know how
to use regular expressions to find the offset point, etc.

Can someone give me an example...or pointers to a good reference on
this type of thing?

Thanks in advance,

chilecayenne

Hi,

you have to mark the part you want to get.

$mail_address =~ m/(\w+?)@/;
$name = $1;

Take brackets to mark what you want. You will find the result in $1. If
you mark more than one part, you will find the second hit in $2. The
question mark within the brackets keeps the match from taking as much as
possible into your result (if there are two or more @).
Another way to get the result is:

my @result = $mail_address =~ m/(\w+?)@/;

In $result[0] you will then find the name.
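A small sketch of when the question mark changes the result, using a made-up string with two @ signs. Because \w can never match '@', the greedy and non-greedy \w forms capture the same text; the difference only shows up with a pattern such as '.' that can match '@' itself.

```perl
use strict;
use warnings;

# Hypothetical test string with two @ signs; not from the thread.
my $text = 'a@b@c';

my ($w_greedy)   = $text =~ /(\w+)@/;    # \w cannot cross an @
my ($w_lazy)     = $text =~ /(\w+?)@/;   # same capture as above
my ($dot_greedy) = $text =~ /(.+)@/;     # . can match @, so it runs to the last @
my ($dot_lazy)   = $text =~ /(.+?)@/;    # stops at the first @

print "w greedy: $w_greedy, w lazy: $w_lazy\n";
print "dot greedy: $dot_greedy, dot lazy: $dot_lazy\n";
```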

Milo

Web Surfer · Apr 25, 2004

[This followup was posted to comp.lang.perl.misc]

Hello all,
I'm a perl noob...and just can't quite figure out how to do something
that should be pretty simple.

Here's an example.

I have $mail_address = 'fred jones <[email protected]>'

I want to use regular expressions to just parse out the userid here of
fred_jones

I'm trying things like this:

$mail_address =~ /\w+@/;

But, doesn't seem to work. I'm a little hazy on exactly how the =~
works...through examples I've successfully used it for substitutions
like x =~ s/tom/joe/g; but, I'm just wanting to match a regular
expression and extract it to the variable...or even to another
variable leaving $mail_address unchanged.

I've looked in books at the substr() function, but, I don't know how
to use regular expressions to find the offset point, etc.

Can someone give me an example...or pointers to a good reference on
this type of thing?

Thanks in advance,

chilecayenne

#!/usr/bin/perl -w

use strict;

my ( $mail_address , $userid );

$mail_address = 'fred jones <[email protected]>';
$mail_address =~ /(\w+)@/;

$userid = $1;

print "Userid = [$userid]\n";

exit 0;

Tad McClellan · Apr 25, 2004

Jürgen Exner said:
Just use grouping:

my $mail_address = 'fred jones <[email protected]>';
$mail_address =~ /(\w+)@/;
print $1;

But don't use it like that!

You should never use the dollar-digit variables without first ensuring
that the match *succeeded*.

if ( $mail_address =~ /(\w+)@/ ) {
    print $1;
}
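A short sketch of the same guard applied to several inputs, one of which has no '@' at all. Both input strings and the '(none)' placeholder are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical inputs; neither string comes from the thread.
my @inputs = ('fred jones <fred_jones@example.com>', 'no address here');

my @found;
for my $text (@inputs) {
    if ($text =~ /(\w+)@/) {
        push @found, $1;          # safe: the match just succeeded
    }
    else {
        push @found, '(none)';    # $1 is not meaningful after a failed match
    }
}

print join(', ', @found), "\n";
```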

Tad McClellan · Apr 25, 2004

[ snip full-quote, please don't do that]

you have to mark the part you want to get.

$mail_address =~ m/(\w+?)@/;
$name = $1;

Take brackets to mark what you want. You will find the result in $1.

^^^^
^^^^

No, you *might* find the result in $1.

If you've tested that the match *succeeded*,
_then_ you will find the result in $1.

Tad McClellan · Apr 25, 2004

Web Surfer said:
$mail_address =~ /(\w+)@/;
$userid = $1;

What is with this epidemic of teaching the WRONG way in this thread?

Sherm Pendley · Apr 26, 2004

Robin said:
Regular expressions are not the right way to find the offset unless you
want to use $1 and $2 and $3...etc, and then use index, it still isn't an
optimal way to find the offset point.

Darn right it's not. If your pattern has subexpressions, then on a match the
offset of each subexpression appears in the @- array. That is, the offset
of $1 is in $-[0], $2 is in $-[1], and so forth.

Note that offsets, no matter how they're found, are irrelevant to the
original question anyway. All he wanted was the value of the matched
substring, not its position. He was thinking he might need to offset to get
the substring, but he was barking in the wrong forest with that idea.

So tell me Robin, when are you going to stop posting nonsense answers to
questions you don't understand?

sherm--

Robin · Apr 26, 2004

cayenne said:
Hello all,
I'm a perl noob...and just can't quite figure out how to do something
that should be pretty simple.

Here's an example.

I have $mail_address = 'fred jones <[email protected]>'

I want to use regular expressions to just parse out the userid here of
fred_jones

I'm trying things like this:

$mail_address =~ /\w+@/;

But, doesn't seem to work. I'm a little hazy on exactly how the =~
works...through examples I've successfully used it for substitutions
like x =~ s/tom/joe/g; but, I'm just wanting to match a regular
expression and extract it to the variable...or even to another
variable leaving $mail_address unchanged.

I've looked in books at the substr() function, but, I don't know how
to use regular expressions to find the offset point, etc.

Can someone give me an example...or pointers to a good reference on
this type of thing?

Thanks in advance,

chilecayenne

Regular expressions are not the right way to find the offset unless you want
to use $1 and $2 and $3...etc, and then use index; even then it isn't an
optimal way to find the offset point. Just change your regular expression to
look like the other code. Man, I'm so tired.
-Robin

Joe Smith · Apr 26, 2004

Sherm said:
If your pattern has subexpressions, then on a match the
offset of each subexpression appears in the @- array. That is, the offset
of $1 is in $-[0], $2 is in $-[1], and so forth.

Incorrect. The offset of $& is in $-[0], the offset of $1 is in $-[1], etc.
-Joe
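A quick sketch that checks this indexing, assuming a made-up address. The pattern includes the literal '<' so that the whole match and the first group start at different offsets.

```perl
use strict;
use warnings;

# Hypothetical sample data; the address is a stand-in, not from the thread.
my $mail_address = 'fred jones <fred_jones@example.com>';

my ($whole_start, $group_start, $group_end, $via_substr);
if ($mail_address =~ /<(\w+)@/) {
    $whole_start = $-[0];   # offset where the whole match ($&) begins
    $group_start = $-[1];   # offset where $1 begins
    $group_end   = $+[1];   # offset just past the end of $1
    # Rebuild $1 from its offsets with substr().
    $via_substr  = substr($mail_address, $group_start, $group_end - $group_start);
    print "whole match at $whole_start, group at $group_start to $group_end\n";
    print "substr gives: $via_substr\n";
}
```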

Anno Siegel · Apr 26, 2004

Jürgen Exner said:
cayenne wrote:
[...]

I've looked in books at the substr() function, but, I don't know how
to use regular expressions to find the offset point, etc.

You don't.

Ah, but you do, though not in this case. The @- and @+ arrays are
there to support it.

Anno

Richard Morse · Apr 26, 2004

I have $mail_address = 'fred jones <[email protected]>'

I want to use regular expressions to just parse out the userid here of
fred_jones

I'm trying things like this:

$mail_address =~ /\w+@/;

What you seem to be asking for is this:

my ($user_id) = ($mail_address =~ m/(\w+)@/);

However, please note that \w doesn't really have the complete set of
valid characters to prefix the '@' sign in an email address.

Just off the top of my head, I know that '.', '-', '?', '=', and more
are valid. Possibly any unicode character other than whitespace and '@'
are valid. It might even be valid to have '<' in an email address.

At the very least, you probably want

my ($user_id) = ($mail_address =~ m/([\w.-+=]+)@/);

HTH,
Ricky

Glenn Jackman · Apr 26, 2004

Richard Morse said:
At the very least, you probably want

my ($user_id) = ($mail_address =~ m/([\w.-+=]+)@/);

Be careful where you put '-' inside a character class:
Invalid [] range ".-+" before HERE mark in regex m/([\w.-+ << HERE =]+)@/

Put the hyphen last: [\w.+=-]

Tad McClellan · Apr 26, 2004

Glenn Jackman said:
Put the hyphen last: [\w.+=-]

Or first.
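A sketch of the placements that keep the hyphen literal. The escaped form is an extra variant not mentioned above, and the address is a made-up one chosen to contain '.', '-' and '+'.

```perl
use strict;
use warnings;

# Hypothetical sample data; the address is a stand-in, not from the thread.
my $mail_address = 'fred jones <fred.jones-smith+lists@example.com>';

# A hyphen placed last, placed first, or escaped is taken literally
# instead of being read as a range operator.
my ($last)    = $mail_address =~ /([\w.+=-]+)@/;
my ($first)   = $mail_address =~ /([-\w.+=]+)@/;
my ($escaped) = $mail_address =~ /([\w.\-+=]+)@/;

print "$last\n$first\n$escaped\n";
```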

cayenne · May 19, 2004

Richard Morse said:
I have $mail_address = 'fred jones <[email protected]>'

I want to use regular expressions to just parse out the userid here of
fred_jones

I'm trying things like this:

$mail_address =~ /\w+@/;

What you seem to be asking for is this:

my ($user_id) = ($mail_address =~ m/(\w+)@/);

However, please note that \w doesn't really have the complete set of
valid characters to prefix the '@' sign in an email address.

Just off the top of my head, I know that '.', '-', '?', '=', and more
are valid. Possibly any unicode character other than whitespace and '@'
are valid. It might even be valid to have '<' in an email address.

At the very least, you probably want

my ($user_id) = ($mail_address =~ m/([\w.-+=]+)@/);

HTH,
Ricky

Just quickly, can you explain the extensive use of parens here? I
understand the () in the regular expression, to keep those parts the
match...but, what is the function of the () around $user_id and the
entire part after the = sign?

Thanks in advance,

CC

Richard Morse · May 19, 2004

my ($user_id) = ($mail_address =~ m/([\w.-+=]+)@/);

Just quickly, can you explain the extensive use of parens here? I
understand the () in the regular expression, to keep those parts the
match...but, what is the function of the () around $user_id and the
entire part after the = sign?

Parens around $user_id force the match to happen in a list context. A
match in a scalar context returns true or false (1 or the empty string),
while in a list context it returns the captured substrings.

my $user_id = ($mail_address =~ m/.../)

would set $user_id to 1 on a successful match, not to the captured text.

The parens around the match are there because it makes it easier for me
to read it. I've never not put them there, although a quick test I just
did seems to indicate that they aren't necessary.
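A small sketch of the context difference, assuming a made-up address: the same match is assigned once with parens on the left (list context) and once without (scalar context).

```perl
use strict;
use warnings;

# Hypothetical sample data; the address is a stand-in, not from the thread.
my $mail_address = 'fred jones <fred_jones@example.com>';
my $no_address   = 'no address here';

my ($in_list)  = $mail_address =~ /(\w+)@/;   # list context: the captured text
my $in_scalar  = $mail_address =~ /(\w+)@/;   # scalar context: true (1)
my $failed     = $no_address   =~ /(\w+)@/;   # scalar context: false ('')

print "list: $in_list, scalar: $in_scalar, failed: '$failed'\n";
```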

HTH,
Ricky

Paul Lalli · May 19, 2004

Richard Morse said:
Richard Morse said:

my ($user_id) = ($mail_address =~ m/([\w.-+=]+)@/);

Just quickly, can you explain the extensive use of parens here? I
understand the () in the regular expression, to keep those parts the
match...but, what is the function of the () around $user_id and the
entire part after the = sign?

The parens around $user_id force the binding operation of =~ to be
evaluated in list context. This is done because a pattern match in list
context returns a list of all of the captured matches (ie, the things that
go into $1, $2, etc). This is a shorthand way of writing the two
statements:

$mail_address =~ m/([\w.-+=]+)@/;
my $user_id = $1;

The parens around the whole pattern match here are actually unnecessary.
This is because the =~ operator has a higher precedence than the =
operator. They are likely used here just for clarity, to make sure the
readers of the code are aware that ($user_id) is being assigned the
return value of the pattern match, rather than the alternate
interpretation in which $mail_address is assigned to $user_id and the
result is then matched against the pattern, which would be written like so:
(my $user_id = $mail_address) =~ m/([\w.-+=]+)@/;
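A sketch of the two groupings side by side, assuming a made-up address. The parenthesized-assignment form is shown with s/// (a substitution chosen here purely for illustration), since that is where copying first and then binding the copy is most useful.

```perl
use strict;
use warnings;

# Hypothetical sample data; the address is a stand-in, not from the thread.
my $mail_address = 'fred jones <fred_jones@example.com>';

# =~ binds tighter than =, so this assigns the captures of the match.
my ($user_id) = $mail_address =~ /(\w+)@/;

# Parenthesizing the assignment copies first, then binds the copy,
# so the substitution edits $name and leaves $mail_address alone.
(my $name = $mail_address) =~ s/\s*<.*>//;

print "user id: $user_id\n";
print "name: $name\n";
print "original: $mail_address\n";
```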

Please let me know if this is not clear enough.

Paul Lalli

John W. Krahn · May 20, 2004

Paul said:
Richard Morse said:

my ($user_id) = ($mail_address =~ m/([\w.-+=]+)@/);

Just quickly, can you explain the extensive use of parens here? I
understand the () in the regular expression, to keep those parts the
match...but, what is the function of the () around $user_id and the
entire part after the = sign?

The parens around $user_id force the binding operation of =~ to be
evaluated in list context. This is done because a pattern match in list
context returns a list of all of the captured matches (ie, the things that
go into $1, $2, etc). This is a shorthand way of writing the two
statements:

$mail_address =~ m/([\w.-+=]+)@/;
my $user_id = $1;

They are not the same at all. If the match fails the first will set
$user_id to undef but your version will set $user_id to the contents of
a previously successful match's capturing parentheses or ''.
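A sketch of that difference with made-up strings: an earlier successful match leaves a value in $1, then a match that fails is read both ways.

```perl
use strict;
use warnings;

# Hypothetical strings; neither comes from the thread.
my $earlier = 'previous value';
my $text    = 'no address here';

# An earlier, unrelated match that succeeds and sets $1 to 'previous'.
$earlier =~ /(\w+)/;

# List assignment: a failed match returns an empty list, so undef.
my ($from_list) = $text =~ /(\w+)@/;

# Two-statement form: the failed match leaves the old $1 in place.
$text =~ /(\w+)@/;
my $from_dollar_one = $1;

my $shown = defined $from_list ? $from_list : 'undef';
print "list assignment: $shown\n";
print "dollar-one: $from_dollar_one\n";
```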

John
