Q: // and "magic"

Discussion in 'Perl Misc' started by J Krugman, Apr 5, 2005.

  1. J Krugman

    J Krugman Guest

    In perlre I found these puzzling lines:

    @chars = split //, $string; # // is not magic in split
    ($whitewashed = $string) =~ s/()/ /g; # parens avoid magic s// /

    I don't understand the comments. What's all the "magic" about?

    In an attempt to understand the first comment, I consulted perldoc
    -f split, which made matters worse. I found no mention at all of
    "magic", but I came across this:

    Using the empty pattern "//" specifically matches
    the null string, and is not to be confused with the
    use of "//" to mean "the last successful pattern
    match".

    Now I'm hopelessly confused. I understand that "//" matches the
    null string, but I have no idea what the last sentence above (about
    the "other" use of "//") is talking about. Any help sorting this
    out would be greatly appreciated.

    TIA!

    jill



    --
    To s&e^n]d me m~a}i]l r%e*m?o\v[e bit from my a|d)d:r{e:s]s.
     
    J Krugman, Apr 5, 2005
    #1

  2. J Krugman <> wrote in news:d2u5j6$a4j$1@reader1.panix.com:

    > In perlre I found these puzzling lines:
    >
    > @chars = split //, $string; # // is not magic in split
    > ($whitewashed = $string) =~ s/()/ /g; # parens avoid magic s// /
    >
    > I don't understand the comments. What's all the "magic" about?
    >
    > In an attempt to understand the first comment, I consulted perldoc
    > -f split, which made matters worse. I found no mention at all of
    > "magic", but I came across this:
    >
    > Using the empty pattern "//" specifically matches
    > the null string, and is not to be confused with the
    > use of "//" to mean "the last successful pattern
    > match".
    >
    > Now I'm hopelessly confused. I understand that "//" matches the
    > null string, but I have no idea what the last sentence above (about
    > the "other" use of "//") is talking about.


    In the context of the split function, // matches the empty string.

    Elsewhere, // means the last successful pattern match.

    IMHO, the passage above is very clear, but here is the relevant section
    from perldoc perlop (where m// is being discussed):

    If the PATTERN evaluates to the empty string, the last
    *successfully* matched regular expression is used instead. In
    this case, only the "g" and "c" flags on the empty pattern is
    honoured - the other flags are taken from the original pattern.
    If no match has previously succeeded, this will (silently) act
    instead as a genuine empty pattern (which will always match).
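
    A minimal sketch of the difference, under the assumption that a
    match against /b/ is the last one to have succeeded (the variable
    names are only illustrative). It also covers the two perlre lines
    from the original question:

```perl
use strict;
use warnings;

my $string = 'abc';

# A successful match: /b/ becomes the "last successfully matched" pattern.
my $ok = 'xbx' =~ /b/;

# In m//, the empty pattern now stands for /b/, so 'xyz' does not match.
my $reused = ('xyz' =~ //) ? 'matched' : 'no match';

# The same holds for s///: here s// /g behaves like s/b/ /g.
(my $magic = $string) =~ s// /g;

# split special-cases //: it is always a genuine empty pattern,
# so the string is broken into single characters.
my @chars = split //, $string;

# s/()/ /g is not an empty pattern (it contains parens), but it still
# matches the empty string at every position.
(my $whitewashed = $string) =~ s/()/ /g;

print "reused:      $reused\n";
print "magic:       '$magic'\n";
print "chars:       @chars\n";
print "whitewashed: '$whitewashed'\n";
```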

    Sinan

    --
    A. Sinan Unur <>
    (reverse each component and remove .invalid for email address)

    comp.lang.perl.misc guidelines on the WWW:
    http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
     
    A. Sinan Unur, Apr 5, 2005
    #2

  3. J Krugman

    Guest

    "A. Sinan Unur" <> wrote:
    >
    > In the context of the split function, // matches the empty string.
    >
    > Elsewhere, // means the last successful pattern match.
    >
    > IMHO, the passage above is very clear, but here is the relevant section
    > from perldoc perlop (where m// is being discussed):
    >
    > If the PATTERN evaluates to the empty string, the last
    > *successfully* matched regular expression is used instead. In
    > this case, only the "g" and "c" flags on the empty pattern is
    > honoured - the other flags are taken from the original pattern.
    > If no match has previously succeeded, this will (silently) act
    > instead as a genuine empty pattern (which will always match).


    So, does anyone find this behavior useful? I've never intentionally used
    it, and I can't imagine doing so in the future.

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    Usenet Newsgroup Service $9.95/Month 30GB
     
    , Apr 5, 2005
    #3
  4. wrote in news:20050405125614.353$:

    > "A. Sinan Unur" <> wrote:
    >>
    >> In the context of the split function, // matches the empty string.
    >>
    >> Elsewhere, // means the last successful pattern match.
    >>
    >> IMHO, the passage above is very clear, but here is the relevant
    >> section from perldoc perlop (where m// is being discussed):
    >>
    >> If the PATTERN evaluates to the empty string, the last
    >> *successfully* matched regular expression is used instead. In
    >> this case, only the "g" and "c" flags on the empty pattern is
    >> honoured - the other flags are taken from the original pattern.
    >> If no match has previously succeeded, this will (silently) act
    >> instead as a genuine empty pattern (which will always match).

    >
    > So, does anyone find this behavior useful? I've never intentionally
    > used it, and I can't imagine doing so in the future.


    At the risk of sounding like an AOLer, I am curious as well. I tried
    thinking of a way to use this feature. Couldn't think of anything, but
    that is probably a reflection of my limitations :)

    I have a feeling Abigail might contribute some magic.

    Sinan

    --
    A. Sinan Unur <>
    (reverse each component and remove .invalid for email address)

     
    A. Sinan Unur, Apr 5, 2005
    #4
  5. A. Sinan Unur wrote:

    > wrote in news:20050405125614.353$:
    >
    >> "A. Sinan Unur" <> wrote:
    >>>
    >>> In the context of the split function, // matches the empty string.
    >>>
    >>> Elsewhere, // means the last successful pattern match.
    >>>
    >>> IMHO, the passage above is very clear, but here is the relevant
    >>> section from perldoc perlop (where m// is being discussed):
    >>>
    >>> If the PATTERN evaluates to the empty string, the last
    >>> *successfully* matched regular expression is used instead. In
    >>> this case, only the "g" and "c" flags on the empty pattern is
    >>> honoured - the other flags are taken from the original pattern.
    >>> If no match has previously succeeded, this will (silently) act
    >>> instead as a genuine empty pattern (which will always match).

    >>
    >> So, does anyone find this behavior useful? I've never intentionally
    >> used it, and I can't imagine doing so in the future.

    >
    > At the risk of sounding like an AOLer, I am curious as well. I tried
    > thinking of a way to use this feature. Couldn't think of anything, but
    > that is probably a reflection of my limitations :)
    >
    > I have a feeling Abigail might contribute some magic.
    >

    If you have an "untaint this" regexp, you might wind up using it several
    times in a row on several variables. But mostly, I think this was
    Larry getting a little overenthusiastic in "save the programmer
    keystrokes" mode. And once it had been around for a while, of course it
    couldn't be taken out because it would break stuff.

    --
    Christopher Mattern

    "Which one you figure tracked us?"
    "The ugly one, sir."
    "...Could you be more specific?"
     
    Chris Mattern, Apr 5, 2005
    #5
  6. wrote in news:20050405125614.353$:

    > "A. Sinan Unur" <> wrote:
    >>
    >> In the context of the split function, // matches the empty string.
    >>
    >> Elsewhere, // means the last successful pattern match.
    >>
    >> IMHO, the passage above is very clear, but here is the relevant
    >> section from perldoc perlop (where m// is being discussed):
    >>
    >> If the PATTERN evaluates to the empty string, the last
    >> *successfully* matched regular expression is used instead. In
    >> this case, only the "g" and "c" flags on the empty pattern is
    >> honoured - the other flags are taken from the original pattern.
    >> If no match has previously succeeded, this will (silently) act
    >> instead as a genuine empty pattern (which will always match).

    >
    > So, does anyone find this behavior useful? I've never intentionally
    > used it, and I can't imagine doing so in the future.


    I can think of one situation where this feature might be useful.

    Consider the following:

    #! perl

    use strict;
    use warnings;

    my $s = 'one two three onetwo three one two three four';
    my %count;

    if( $s =~ /\b(one)\b/ or $s =~ /\b(two)\b/ ) {
        ++$count{$1} while( $s =~ //g );
    }
    __END__

    Here, I am interested in counting the number of times the word 'one'
    occurs in the text. If there are no 'one's, then I want to count the
    number of times 'two' occurs.

    I think this is the most succinct way of expressing the intent above. I
    do not know if it would offer any speed advantages over other methods of
    doing the same thing.

    The construct might allow the programmer to more naturally avoid
    alternation in regular expressions in favor of "or" tests in the
    conditional, and that might result in a performance benefit as well.

    All this is speculation, however.
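
    For comparison, here is a sketch of the same intent written without
    the empty pattern, using qr// objects. This is only one possible
    alternative, not a claim about which version is faster:

```perl
use strict;
use warnings;

my $s = 'one two three onetwo three one two three four';
my %count;

# Try each candidate pattern in order; count with the first one that
# matches at all, then stop.
for my $re (qr/\b(one)\b/, qr/\b(two)\b/) {
    next unless $s =~ $re;
    $count{$1}++ while $s =~ /$re/g;
    last;
}

print "$_: $count{$_}\n" for sort keys %count;
```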

    Sinan

    --
    A. Sinan Unur <>
    (reverse each component and remove .invalid for email address)

     
    A. Sinan Unur, Apr 5, 2005
    #6
  7. J Krugman

    J Krugman Guest

    In <20050405125614.353$> writes:

    >"A. Sinan Unur" <> wrote:
    >>
    >> In the context of the split function, // matches the empty string.
    >>
    >> Elsewhere, // means the last successful pattern match.
    >>
    >> IMHO, the passage above is very clear, but here is the relevant section
    >> from perldoc perlop (where m// is being discussed):
    >>
    >> If the PATTERN evaluates to the empty string, the last
    >> *successfully* matched regular expression is used instead. In
    >> this case, only the "g" and "c" flags on the empty pattern is
    >> honoured - the other flags are taken from the original pattern.
    >> If no match has previously succeeded, this will (silently) act
    >> instead as a genuine empty pattern (which will always match).


    >So, does anyone find this behavior useful? I've never intentionally used
    >it, and I can't imagine doing so in the future.


    I think this may have something to do with my confusion: I have
    never seen // used in any situation in which it wasn't clearly
    intended to match the empty string, as in split //, ... . It would
    be great to see a meaningful example.

    jill
    --
    To s&e^n]d me m~a}i]l r%e*m?o\v[e bit from my a|d)d:r{e:s]s.
     
    J Krugman, Apr 5, 2005
    #7
  8. J Krugman

    Anno Siegel Guest

    Chris Mattern <> wrote in comp.lang.perl.misc:
    > A. Sinan Unur wrote:
    >
    > > wrote in news:20050405125614.353$:
    > >
    > >> "A. Sinan Unur" <> wrote:
    > >>>
    > >>> In the context of the split function, // matches the empty string.
    > >>>
    > >>> Elsewhere, // means the last successful pattern match.
    > >>>
    > >>> IMHO, the passage above is very clear, but here is the relevant
    > >>> section from perldoc perlop (where m// is being discussed):
    > >>>
    > >>> If the PATTERN evaluates to the empty string, the last
    > >>> *successfully* matched regular expression is used instead. In
    > >>> this case, only the "g" and "c" flags on the empty pattern is
    > >>> honoured - the other flags are taken from the original pattern.
    > >>> If no match has previously succeeded, this will (silently) act
    > >>> instead as a genuine empty pattern (which will always match).
    > >>
    > >> So, does anyone find this behavior useful? I've never intentionally
    > >> used it, and I can't imagine doing so in the future.

    > >
    > > At the risk of sounding like an AOLer, I am curious as well. I tried
    > > thinking of a way to use this feature. Couldn't think of anything, but
    > > that is probably a reflection of my limitations :)
    > >
    > > I have a feeling Abigail might contribute some magic.
    > >

    > If you have an "untaint this" regexp, you might wind up using several
    > times in a row on several variables. But mostly, I think this was
    > Larry getting a little overenthusiastic in "save the programmer
    > keystrokes" mode. And once it was around for awhile, of course it
    > couldn't be taken out because it would break stuff.


    I'm inclined to believe it was at some stage meant to be the last
    successfully *compiled* regex that set //. That would make much more
    sense as a keystroke-saver, though still somewhat obscure.

    As it is, it could be used to choose a regex from a selection by
    matching them against a test string (or more), then using //. There
    are clearer, not much longer ways to do that, even without qr//.
    It's a misfeature and no one uses it.

    Anno
     
    Anno Siegel, Apr 5, 2005
    #8
  9. J Krugman

    Ala Qumsieh Guest

    wrote:
    > So, does anyone find this behavior useful? I've never intentionally used
    > it, and I can't imagine doing so in the future.


    I have used it and have seen it used before, but only in the context of
    Perl Golf to save some chars. Of course, in real production code, I
    would strongly advise against using it since it can easily lead to
    confusion and has no real advantage.

    --Ala
     
    Ala Qumsieh, Apr 5, 2005
    #9
  10. J Krugman

    Alex Hart Guest

    > So, does anyone find this behavior useful? I've never intentionally
    > used it, and I can't imagine doing so in the future.


    I use this all the time.

    This can be used instead of the "o" option. Meaning the regex will not
    be recompiled each time perl sees it. If perl sees a string inside a
    regex, it will recompile it each time, even if the string hasn't
    changed. If you set the "o" option, then it is fixed for the whole
    program, once it is compiled. Using // can avoid perl recompiling each
    time, but the string can still change later.


    Here's an example


    sub Search { # search a list of names for a string
        my ($string) = @_;
        $string =~ /$string/i;
        foreach (@list_of_names) {
            if (//) {
                push @found, $_;
            }
        }
    }

    Now, the regex is only compiled once each time the function is called.
    With the "o" flag, running Search() twice would search for the same
    string twice.

    There are other ways to achieve the same thing, but I like //.

    Hope that makes sense.

    - Alex Hart
     
    Alex Hart, Apr 6, 2005
    #10
  11. J Krugman

    Joe Smith Guest

    Alex Hart wrote:

    > This can be used instead of the "o" option. Meaning the regex will not
    > be recompiled each time perl sees it. If perl sees a string inside a
    > regex, it will recompile it each time, even if the string hasn't
    > changed.


    Earlier versions of perl operated in that fashion.
    -Joe
     
    Joe Smith, Apr 6, 2005
    #11
  12. J Krugman

    Anno Siegel Guest

    Alex Hart <> wrote in comp.lang.perl.misc:
    > > So, does anyone find this behavior useful? I've never intentionally
    > > used it, and I can't imagine doing so in the future.

    >
    > I use this all the time.
    >
    > This can be used instead of the "o" option. Meaning the regex will not
    > be recompiled each time perl sees it. If perl sees a string inside a
    > regex, it will recompile it each time, even if the string hasn't
    > changed. If you set the "o" option, then it is fixed for the whole
    > program, once it is compiled. Using // can avoid perl recompiling each
    > time, but the string can still change later.
    >
    >
    > Here's an example
    >
    >
    > sub Search { # search a list of names for a string
    > my ($string) = @_;
    > $string =~ /$string/i;


    That won't necessarily match. It will match if $string (which would be
    better named $pattern) doesn't contain regex meta characters (and sometimes
    if it does). It won't match, for instance, for "a[bc]".

    That is exactly the problem with the // kludge: Given an arbitrary regex,
    there is no way of constructing a string that the regex will match.
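
    A small sketch of how this goes wrong, assuming some unrelated match
    (here /z/, chosen arbitrarily) succeeded earlier in the program.
    When the priming match fails, // silently keeps meaning the older
    pattern:

```perl
use strict;
use warnings;

my $pattern = 'a[bc]';

# An unrelated match that succeeds earlier in the program.
my $earlier = ('zebra' =~ /z/) ? 1 : 0;

# The "priming" match: the text 'a[bc]' contains neither 'ab' nor 'ac',
# so the regex /a[bc]/i does not match its own source string.
my $primed = ($pattern =~ /$pattern/i) ? 1 : 0;

# The empty pattern therefore still means /z/, not /a[bc]/i.
my @hits = grep { m// } qw(abc zoo acid);

print "primed: $primed\n";
print "hits:   @hits\n";
```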

    > foreach (@list_of_names) {
    > if (//) {
    > push @found, $_;
    > }
    > }
    > }
    >
    > Now, the regex is only compiled once each time the function is called.
    > With the "o" flag, running Search() twice would search for the same
    > string twice.
    >
    > There are other ways to achieve the same thing, but I like //.


    Why? It's obscure and unsafe. Use qr//.
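
    A sketch of what the qr// version might look like. The name
    search_names is hypothetical, and quoting the input with \Q...\E
    assumes the caller wants a literal substring search rather than a
    regex; drop the quoting if $string is meant to be a pattern:

```perl
use strict;
use warnings;

# Compile the pattern once per call with qr//, then reuse the compiled
# object for every name. No priming match is needed.
sub search_names {
    my ($string, @names) = @_;
    my $re = qr/\Q$string\E/i;    # literal, case-insensitive search
    return grep { $_ =~ $re } @names;
}

my @found = search_names('a[bc]', 'xa[bc]y', 'abc', 'A[BC]');
print "$_\n" for @found;
```

    Each call builds its own regex object, so a second call with a
    different string searches for the new string, which is the behavior
    the // trick was meant to provide.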

    Anno
     
    Anno Siegel, Apr 6, 2005
    #12
  13. wrote:

    > "A. Sinan Unur" <> wrote:
    >
    >> If the PATTERN evaluates to the empty string, the last
    >> *successfully* matched regular expression is used instead. In
    >> this case, only the "g" and "c" flags on the empty pattern is
    >> honoured - the other flags are taken from the original pattern.
    >> If no match has previously succeeded, this will (silently) act
    >> instead as a genuine empty pattern (which will always match).

    >
    > So, does anyone find this behavior useful? I've never intentionally used
    > it, and I can't imagine doing so in the future.
    >
    > Xho


    I'm with Xho on this.
     
    Brian McCauley, Apr 6, 2005
    #13
  14. J Krugman

    Joe Smith Guest

    wrote:

    >> If the PATTERN evaluates to the empty string, the last
    >> *successfully* matched regular expression is used instead. In
    >> this case, only the "g" and "c" flags on the empty pattern is
    >> honoured - the other flags are taken from the original pattern.
    >> If no match has previously succeeded, this will (silently) act
    >> instead as a genuine empty pattern (which will always match).

    >
    > So, does anyone find this behavior useful? I've never intentionally used
    > it, and I can't imagine doing so in the future.


    That's the way vi works. (And jove but not emacs.)
    -Joe
     
    Joe Smith, Apr 8, 2005
    #14
  15. J Krugman

    Guest

    Joe Smith <> wrote:
    > wrote:
    >
    > >> If the PATTERN evaluates to the empty string, the last
    > >> *successfully* matched regular expression is used instead. In
    > >> this case, only the "g" and "c" flags on the empty pattern is
    > >> honoured - the other flags are taken from the original pattern.
    > >> If no match has previously succeeded, this will (silently) act
    > >> instead as a genuine empty pattern (which will always match).

    > >
    > > So, does anyone find this behavior useful? I've never intentionally
    > > used it, and I can't imagine doing so in the future.

    >
    > That's the way vi works. (And jove but not emacs.)


    My version of vi doesn't work that way. It uses the most recently
    specified search, not the most recently successful search. (But I just use
    'n' when I want to repeat a search, so I could ask the same question
    on this vi feature as I did on the Perl one.)

    Xho

     
    , Apr 8, 2005
    #15
  16. [A complimentary Cc of this posting was sent to
    Chris Mattern
    <>], who wrote in article <>:
    > >> So, does anyone find this behavior useful? I've never intentionally
    > >> used it, and I can't imagine doing so in the future.


    > > At the risk of sounding like an AOLer, I am curious as well. I tried
    > > thinking of a way to use this feature. Couldn't think of anything, but
    > > that is probably a reflection of my limitations :)


    > If you have an "untaint this" regexp, you might wind up using several
    > times in a row on several variables.


    IIRC, the original reason for this (extremely counter-productive)
    misfeature was to simplify some something-to-perl translator (sed,
    or awk?). It MIGHT have had some use before REx objects were
    implemented; I expect that now it has none.

    Hope this helps,
    Ilya
     
    Ilya Zakharevich, Apr 9, 2005
    #16