Q: // and "magic"

J Krugman · Apr 5, 2005

In perlre I found these puzzling lines:

@chars = split //, $string; # // is not magic in split
($whitewashed = $string) =~ s/()/ /g; # parens avoid magic s// /

I don't understand the comments. What's all the "magic" about?

In an attempt to understand the first comment, I consulted perldoc
-f split, which made matters worse. I found no mention at all of
"magic", but I came across this:

Using the empty pattern "//" specifically matches
the null string, and is not to be confused with the
use of "//" to mean "the last successful pattern
match".

Now I'm hopelessly confused. I understand that "//" matches the
null string, but I have no idea what the last sentence above (about
the "other" use of "//") is talking about. Any help sorting this
out would be greatly appreciated.

TIA!

jill

A. Sinan Unur · Apr 5, 2005

J Krugman said:
In perlre I found these puzzling lines:

@chars = split //, $string; # // is not magic in split
($whitewashed = $string) =~ s/()/ /g; # parens avoid magic s// /

I don't understand the comments. What's all the "magic" about?

In an attempt to understand the first comment, I consulted perldoc
-f split, which made matters worse. I found no mention at all of
"magic", but I came across this:

Using the empty pattern "//" specifically matches
the null string, and is not to be confused with the
use of "//" to mean "the last successful pattern
match".

Now I'm hopelessly confused. I understand that "//" matches the
null string, but I have no idea what the last sentence above (about
the "other" use of "//") is talking about.

In the context of the split function, // matches the empty string.

Elsewhere, // means the last successful pattern match.

IMHO, the passage above is very clear, but here is the relevant section
from perldoc perlop (where m// is being discussed):

If the PATTERN evaluates to the empty string, the last
*successfully* matched regular expression is used instead. In
this case, only the "g" and "c" flags on the empty pattern are
honoured - the other flags are taken from the original pattern.
If no match has previously succeeded, this will (silently) act
instead as a genuine empty pattern (which will always match).
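
A minimal sketch of the difference, assuming a fresh script in which the only earlier successful match is the one shown. The variable names ($reused_miss, $reused_hit) are illustrative and not taken from the documentation:

```perl
use strict;
use warnings;

my $string = 'abc';

# Establish a "last successful pattern".
'hello' =~ /l+/ or die "expected a match";

# Outside split, the empty pattern now behaves like /l+/.
my $reused_miss = ('xyz'  =~ //) ? 1 : 0;   # 0: 'xyz' contains no "l"
my $reused_hit  = ('ball' =~ //) ? 1 : 0;   # 1: 'ball' contains "ll"

# In split, // is not magic: it still splits between every character.
my @chars = split //, $string;
print "@chars\n";                           # a b c

# The parens make the pattern non-empty, so s/()/ /g really matches the
# empty string at every position instead of reusing /l+/.
(my $whitewashed = $string) =~ s/()/ /g;
print "[$whitewashed]\n";                   # [ a b c ]
```

This is presumably what the two comments in the original question refer to: "magic" is the reuse of the last successful pattern, which split never does and which s/()/ /g sidesteps.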

Sinan

xhoster · Apr 5, 2005

A. Sinan Unur said:
In the context of the split function, // matches the empty string.

Elsewhere, // means the last successful pattern match.

IMHO, the passage above is very clear, but here is the relevant section
from perldoc perlop (where m// is being discussed):

If the PATTERN evaluates to the empty string, the last
*successfully* matched regular expression is used instead. In
this case, only the "g" and "c" flags on the empty pattern are
honoured - the other flags are taken from the original pattern.
If no match has previously succeeded, this will (silently) act
instead as a genuine empty pattern (which will always match).

So, does anyone find this behavior useful? I've never intentionally used
it, and I can't imagine doing so in the future.

Xho

A. Sinan Unur · Apr 5, 2005

xhoster said:
So, does anyone find this behavior useful? I've never intentionally
used it, and I can't imagine doing so in the future.

At the risk of sounding like an AOLer, I am curious as well. I tried
thinking of a way to use this feature. Couldn't think of anything, but
that is probably a reflection of my limitations.

I have a feeling Abigail might contribute some magic.

Sinan

Chris Mattern · Apr 5, 2005

A. Sinan Unur said:
At the risk of sounding like an AOLer, I am curious as well. I tried
thinking of a way to use this feature. Couldn't think of anything, but
that is probably a reflection of my limitations.

I have a feeling Abigail might contribute some magic.

If you have an "untaint this" regexp, you might wind up using it several
times in a row on several variables. But mostly, I think this was
Larry getting a little overenthusiastic in "save the programmer
keystrokes" mode. And once it had been around for a while, of course it
couldn't be taken out because it would break stuff.

--
Christopher Mattern

"Which one you figure tracked us?"
"The ugly one, sir."
"...Could you be more specific?"

A. Sinan Unur · Apr 5, 2005

xhoster said:
So, does anyone find this behavior useful? I've never intentionally
used it, and I can't imagine doing so in the future.

I can think of one situation where this feature might be useful.

Consider the following:

#! perl

use strict;
use warnings;

my $s = 'one two three onetwo three one two three four';
my %count;

if( $s =~ /\b(one)\b/ or $s =~ /\b(two)\b/ ) {
    # The empty pattern reuses whichever of the two regexes matched above.
    ++$count{$1} while( $s =~ //g );
}
__END__

Here, I am interested in counting the number of times the word 'one'
occurs in the text. If there are no 'one's, then I want to count the
number of times 'two' occurs.

I think this is the most succinct way of expressing the intent above. I
do not know if it would offer any speed advantages over other methods of
doing the same thing.

The construct might allow the programmer to more naturally avoid
alternation in regular expressions in favor of or tests in the
conditional and that might result in a performance benefit as well.

All this is speculation, however.
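
For comparison, here is a sketch of the same intent without the empty pattern, keeping the chosen regex in a variable. The name $chosen and the use of grep to pick the first matching candidate are assumptions made for illustration, not part of the example above:

```perl
use strict;
use warnings;

my $s = 'one two three onetwo three one two three four';
my %count;

# Keep the first candidate pattern that matches the text at all.
my ($chosen) = grep { $s =~ $_ } qr/\b(one)\b/, qr/\b(two)\b/;

if ( defined $chosen ) {
    # Count every occurrence of the chosen word.
    ++$count{$1} while $s =~ /$chosen/g;
}

print "$_: $count{$_}\n" for sort keys %count;   # one: 2
```

It is a little longer, but the regex being counted is named explicitly rather than depending on which match happened to succeed last.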

Sinan

J Krugman · Apr 5, 2005

xhoster said:
So, does anyone find this behavior useful? I've never intentionally used
it, and I can't imagine doing so in the future.

I think this may have something to do with my confusion: I have
never seen // used in any situation in which it wasn't clearly
intended to match the empty string, as in split //, ... . It would
be great to see a meaningful example.

jill

Anno Siegel · Apr 5, 2005

Chris Mattern said:
If you have an "untaint this" regexp, you might wind up using it several
times in a row on several variables. But mostly, I think this was
Larry getting a little overenthusiastic in "save the programmer
keystrokes" mode. And once it had been around for a while, of course it
couldn't be taken out because it would break stuff.

I'm inclined to believe it was at some stage meant to be the last
successfully *compiled* regex that set //. That would make much more
sense as a keystroke-saver, though still somewhat obscure.

As it is, it could be used to choose a regex from a selection by
matching them against a test string (or more), then using //. There
are clearer, not much longer ways to do that, even without qr//.
It's a misfeature and no one uses it.

Anno

Ala Qumsieh · Apr 5, 2005

xhoster said:
So, does anyone find this behavior useful? I've never intentionally used
it, and I can't imagine doing so in the future.

I have used it and have seen it used before, but only in the context of
Perl Golf to save some chars. Of course, in real production code, I
would strongly advise against using it since it can easily lead to
confusion and has no real advantage.

--Ala

Alex Hart · Apr 6, 2005

xhoster said:
So, does anyone find this behavior useful? I've never intentionally
used it, and I can't imagine doing so in the future.

I use this all the time.

This can be used instead of the "o" option. Meaning the regex will not
be recompiled each time perl sees it. If perl sees a string inside a
regex, it will recompile it each time, even if the string hasn't
changed. If you set the "o" option, then it is fixed for the whole
program, once it is compiled. Using // can avoid perl recompiling each
time, but the string can still change later.

Here's an example

sub Search { # search a list of names for a string
    my ($string) = @_;
    $string =~ /$string/i;
    foreach (@list_of_names) {
        if (//) {
            push @found, $_;
        }
    }
}

Now, the regex is only compiled once each time the function is called.
With the "o" flag, running Search() twice would search for the same
string twice.

There are other ways to achieve the same thing, but I like //.

Hope that makes sense.

- Alex Hart

Joe Smith · Apr 6, 2005

Alex said:
This can be used instead of the "o" option. Meaning the regex will not
be recompiled each time perl sees it. If perl sees a string inside a
regex, it will recompile it each time, even if the string hasn't
changed.

Earlier versions of perl operated in that fashion.
-Joe

Anno Siegel · Apr 6, 2005

Alex Hart said:
I use this all the time.

This can be used instead of the "o" option. Meaning the regex will not
be recompiled each time perl sees it. If perl sees a string inside a
regex, it will recompile it each time, even if the string hasn't
changed. If you set the "o" option, then it is fixed for the whole
program, once it is compiled. Using // can avoid perl recompiling each
time, but the string can still change later.

Here's an example

sub Search { # search a list of names for a string
    my ($string) = @_;
    $string =~ /$string/i;

That won't necessarily match. It will match if $string (which would be
better named $pattern) doesn't contain regex meta characters (and sometimes
if it does). It won't match, for instance, for "a[bc]".

That is exactly the problem with the // kludge: Given an arbitrary regex,
there is no way of constructing a string that the regex will match.
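
A small sketch of that failure mode, assuming some unrelated match (here /text/, a made-up stand-in) succeeded earlier in the program:

```perl
use strict;
use warnings;

# An unrelated match that succeeded earlier.
'earlier text' =~ /text/ or die "expected a match";

my $string = 'a[bc]';

# The pattern a[bc] needs "ab" or "ac", so the string does not match itself.
my $matched_itself = ( $string =~ /$string/i ) ? 1 : 0;   # 0

# Since that match failed, // still refers to the earlier /text/.
my @found;
foreach ( 'ab', 'ac', 'context' ) {
    if (//) {
        push @found, $_;
    }
}
print "@found\n";   # context
```

The loop silently searches for the wrong thing: it finds 'context' and misses 'ab' and 'ac'.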

    foreach (@list_of_names) {
        if (//) {
            push @found, $_;
        }
    }
}

Now, the regex is only compiled once each time the function is called.
With the "o" flag, running Search() twice would search for the same
string twice.

There are other ways to achieve the same thing, but I like //.

Why? It's obscure and unsafe. Use qr//.
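
One way the qr// version might look, as a sketch only. The name search_names, passing the list as arguments, and returning the matches are hypothetical changes made to keep the example self-contained. The \Q...\E quoting assumes the caller means a literal string; drop it if $string is meant to be a regex:

```perl
use strict;
use warnings;

# Search a list of names for a string (case-insensitive, literal match).
sub search_names {
    my ( $string, @names ) = @_;

    # Compiled once per call; \Q...\E escapes regex metacharacters.
    my $re = qr/\Q$string\E/i;

    return grep { $_ =~ $re } @names;
}

my @found = search_names( 'a[bc]', 'xa[BC]y', 'ab', 'ac' );
print "@found\n";   # xa[BC]y
```

The pattern can still change from call to call, and nothing depends on whether an earlier match succeeded.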

Anno

Brian McCauley · Apr 6, 2005

xhoster said:
So, does anyone find this behavior useful? I've never intentionally used
it, and I can't imagine doing so in the future.

Xho

I'm with Xho on this.

Joe Smith · Apr 8, 2005

xhoster said:
So, does anyone find this behavior useful? I've never intentionally used
it, and I can't imagine doing so in the future.

That's the way vi works. (And jove but not emacs.)
-Joe

xhoster · Apr 8, 2005

Joe Smith said:
That's the way vi works. (And jove but not emacs.)

My version of vi doesn't work that way. It uses the most recently
specified search, not the most recently successful search. (But I just use
'n' when I want to repeat a search, so I could ask the same question
on this vi feature as I did on the Perl one.)

Xho

Ilya Zakharevich · Apr 9, 2005

[A complimentary Cc of this posting was sent to Chris Mattern]

Chris Mattern said:
If you have an "untaint this" regexp, you might wind up using it several
times in a row on several variables.

IIRC, the original reason for this (extremely counter-productive)
misfeature is a simplification of a something-to-perl translator (sed,
or awk?). It MIGHT have had some usability before REx objects were
implemented; I expect that now it has none.

Hope this helps,
Ilya
