regex question about ?, *, and $1

vorticitywolfe

Alright, this is confusing me:

$string = 2340Z 4SL -ABCD PS T1203045;

if $string =~ m/(-|\+)*(AB|YZ)*(EF|CD)*\s??(PL|PS|PT)*?/g{
print $1,$2,$3,$4;
}

This works like I want: it returns values for $1, $2, $3 and $4.

However, if you change it to this

$string = 2340Z 4SL PS T1203045;

it doesn't return anything for $1, $2, $3 or $4.

Why?

I thought that any group followed by a " * " matches 0 or more times,
so if it's not matched it's undefined but the parentheses still
"hold" its place, i.e. $1 is still associated with the first
parenthesized group and doesn't switch to the second (AB|YZ) in this
case.

So in the end I would like this to work like this:
$string = 2340Z 4SL -ABCD PS T1203045; ### print -ABCD PS
$string = 2340Z 4SL PS T1203045; ### print PS
$string = 2340Z 4SL -CD T1203045; ### print -CD

I have some logic to do all of that, but it's the regex that is
tripping me up, particularly the * vs. ? vs. what is returned.

Thanks for the help in advance!
 
Tad J McClellan

> Alright, this is confusing me:
>
> $string = 2340Z 4SL -ABCD PS T1203045;

perl -e '$string = 2340Z 4SL -ABCD PS T1203045;'
Bareword found where operator expected at -e line 1, near "2340Z"
(Missing operator before Z?)
Number found where operator expected at -e line 1, near "Z 4"
(Do you need to predeclare Z?)
Bareword found where operator expected at -e line 1, near "4SL"
(Missing operator before SL?)
syntax error at -e line 1, near "2340Z "

> if $string =~ m/(-|\+)*(AB|YZ)*(EF|CD)*\s??(PL|PS|PT)*?/g{
> print $1,$2,$3,$4;
> }

perl -e 'if $string =~ m/(-|\+)*(AB|YZ)*(EF|CD)*\s??(PL|PS|PT)*?/g{
print $1,$2,$3,$4;
}'
syntax error at -e line 1, near "if $string "
Execution of -e aborted due to compilation errors.

> This works

That is impossible.
 
vorticitywolfe

> perl -e '$string = 2340Z 4SL -ABCD PS T1203045;'
> Bareword found where operator expected at -e line 1, near "2340Z"
> (Missing operator before Z?)
> Number found where operator expected at -e line 1, near "Z 4"
> (Do you need to predeclare Z?)
> Bareword found where operator expected at -e line 1, near "4SL"
> (Missing operator before SL?)
> syntax error at -e line 1, near "2340Z "
>
> perl -e 'if $string =~ m/(-|\+)*(AB|YZ)*(EF|CD)*\s??(PL|PS|PT)*?/g{
> print $1,$2,$3,$4;
> }'
> syntax error at -e line 1, near "if $string "
> Execution of -e aborted due to compilation errors.
>
> That is impossible.

Correction:
$string = "2340Z 4SL -ABCD PS T1203045";

I forgot the quotes when retyping the code.
 
Michele Dondi


Alright: you did well not to top-post, but there is no need to quote the
entire message you're replying to either. Certainly not the .sig, unless
you want to comment on it.

> correction
> $string = "2340Z 4SL -ABCD PS T1203045";
>
> forgot the quotes in the translation

There's a lesson in this: don't retype. Copy and paste instead. It's
also easier, ain't it?


Michele
 
Paul Lalli

> Alright, this is confusing me:
>
> $string = 2340Z 4SL -ABCD PS T1203045;
>
> if $string =~ m/(-|\+)*(AB|YZ)*(EF|CD)*\s??(PL|PS|PT)*?/g{
> print $1,$2,$3,$4;
> }
>
> This works like I want it returns values for $1,$2,$3 and $4

I completely fail to believe you.

$ perl -le'
if ("2340Z 4SL -ABCD PS T1203045" =~ m/(-|\+)*(AB|YZ)*(EF|CD)*\s??(PL|PS|PT)*?/g) {
print qq{1: "$1", 2: "$2", 3: "$3", 4: "$4"};
}
'
1: "", 2: "", 3: "", 4: ""

This is perl, v5.8.8 built for cygwin-thread-multi-64int

Paul Lalli
 
vorticitywolfe

Ok,

After carefully checking again, it works if the "*?" at the end is
removed. That is why it is confusing; I guess that wasn't clear. I
don't understand why the "*?" stops it from returning the values for
$1, $2, etc. With it in place the match succeeds, but the values
are blank or undefined. So that makes me think it is matching something...
 
Peter Makholm

> After carefully checking again, it works if the "*?" at the end is
> removed. That is why it is confusing. I guess that wasn't clear. I
> don't understand why the "*?" makes it not return the values for
> $1,$2, etc. If it is in there, it returns something, but the values
> are blank or undefined. So that makes me think it is matching it...

Yes, because in your original you try to match zero or more instances
of something, followed by zero or more instances of something else,
followed by zero or more instances of yet another thing, and so on.

So what happens is that perl starts at the beginning of your string and
says: yes, I got zero instances of (-|\+), I got zero instances of
(AB|YZ), I got zero instances of (EF|CD), I got zero spaces and, finally,
I got zero instances of (PL|PS|PT). So I got a match.

Removing the trailing '*?' forces perl to actually find one of PL, PS
or PT, so it can no longer match the empty substring at the beginning
of your string.
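
A quick way to see this is to print where the match starts and ends. The sketch below uses the corrected (quoted) string from earlier in the thread; the helper name match_span is made up for illustration, and the offsets come from Perl's built-in @- and @+ arrays.

```perl
use strict;
use warnings;

my $string = "2340Z 4SL -ABCD PS T1203045";

# Hypothetical helper: returns the start and end offsets of the whole
# match, or an empty list if the pattern does not match at all.
sub match_span {
    my ($str, $re) = @_;
    return unless $str =~ $re;
    return ($-[0], $+[0]);
}

# Original pattern: every piece is optional, so the empty string matches.
my @with    = match_span($string, qr/(-|\+)*(AB|YZ)*(EF|CD)*\s??(PL|PS|PT)*?/);

# Same pattern with the trailing *? removed: PL, PS or PT is now required.
my @without = match_span($string, qr/(-|\+)*(AB|YZ)*(EF|CD)*\s??(PL|PS|PT)/);

print "with *?:    offsets $with[0]..$with[1]\n";        # 0..0, an empty match
print "without *?: offsets $without[0]..$without[1]\n";  # 10..18, i.e. "-ABCD PS"
```

Note that the second pattern needs one of PL, PS or PT to be present, so it finds nothing in a string like "2340Z 4SL -CD T1203045".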

//Makholm
 
Dr.Ruud

(e-mail address removed) wrote:

> if $string =~ m/(-|\+)*(AB|YZ)*(EF|CD)*\s??(PL|PS|PT)*?/g{
> print $1,$2,$3,$4;
> }
>
> $string = 2340Z 4SL PS T1203045;
>
> It doesn't return anything for $1,$2,$3 or $4
> Why?

I would change things like "(AB|YZ)*" to something like "((?:AB|YZ)?)".
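
As a rough sketch of how that could look for the three example strings: the pattern below wraps each optional piece as ((?:...)?), so every capture is defined (possibly empty), and adds a lookahead so the match cannot be empty. The lookahead, the whitespace boundaries and the helper name extract_codes are assumptions made for this illustration, not something taken from the original code.

```perl
use strict;
use warnings;

my $re = qr/
    (?<!\S)                # start at the beginning of a whitespace-separated token
    (?= [-+] | P[LST]\b )  # assumption: require a sign or a PL|PS|PT token here
    ([-+]?)                # $1: optional sign
    ((?:AB|YZ)?)           # $2: optional AB or YZ
    ((?:EF|CD)?)           # $3: optional EF or CD
    \s?
    ((?:PL|PS|PT)?)        # $4: optional PL, PS or PT
    (?!\S)                 # end at the end of a token
/x;

# Hypothetical helper: returns the extracted text, or nothing if no match.
sub extract_codes {
    my ($string) = @_;
    return unless $string =~ $re;
    my ($sign, $first, $second, $suffix) = ($1, $2, $3, $4);
    # Drop empty parts so "PS" alone does not get a leading space.
    return join ' ', grep { length } "$sign$first$second", $suffix;
}

print extract_codes($_), "\n" for (
    "2340Z 4SL -ABCD PS T1203045",   # prints -ABCD PS
    "2340Z 4SL PS T1203045",         # prints PS
    "2340Z 4SL -CD T1203045",        # prints -CD
);
```

Because each group is optional on the inside rather than repeated with *, $1 through $4 keep their positions and are empty strings instead of undef when a piece is absent.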
 
