substr forces scalar context with array argument

Discussion in 'Perl Misc' started by Andrew, Nov 29, 2005.

  1. Andrew

    Andrew Guest

    In the code below, the array '@tmp' with two elements is treated as its
    scalar (the number 2), when used as argument to substr. When I
    syntactically break up '@tmp' into '$tmp[0]' and '$tmp[1]', substr does
    what i want. Why is this so? Is there a way to force list context
    (besides explicit enumeration)? I can't seem to find any mention of
    this anywhere. Could it be an oversight/bug in substr? (probably not,
    given the thoroughness that has gone into the development of perl
    functions, but this is an unperl-like limitation :)

    my $s = 'abracadabra';
    my @tmp = (3, 4);
    my ($offset, $length) = @tmp;

    # Each string is evaluated below so the call and its result print together.
    my @code = (
        'substr($s, $offset, $length)',
        'substr($s, @tmp)',
        'substr($s, @tmp[0,1])',
        'substr($s, $tmp[0], $tmp[1])'
    );


    foreach my $c (@code) {
        print "\n", $c, ' ---> ', eval($c);
    }


    #-------------- output -------------------

    substr($s, $offset, $length) ---> acad
    substr($s, @tmp) ---> racadabra
    substr($s, @tmp[0,1]) ---> cadabra
    substr($s, $tmp[0], $tmp[1]) ---> acad

    #------------ end output ----------------

    ( Also, FWIW, using '@tmp[0,1]' yields a weird result I can't explain )

    TIA

    andrew
    Andrew, Nov 29, 2005
    #1

  2. Andrew

    Paul Lalli Guest

    Andrew wrote:
    > In the code below, the array '@tmp' with two elements is treated as its
    > scalar (the number 2), when used as argument to substr. When I
    > syntactically break up '@tmp' into '$tmp[0]' and '$tmp[1]', substr does
    > what i want. Why is this so? Is there a way to force list context
    > (besides explicit enumeration)? I can't seem to find any mention of
    > this anywhere.


    $ perldoc -f substr
    substr EXPR,OFFSET,LENGTH,REPLACEMENT
    substr EXPR,OFFSET,LENGTH
    substr EXPR,OFFSET

    substr is expecting between 2 and 4 scalars. So when you pass it an
    array, yes, it converts that array to a scalar. That's what it's
    documented to do.
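
    As a minimal sketch of the point above (reusing the values from the original post), an array evaluated in scalar context yields its element count, so passing @tmp in the OFFSET slot behaves like passing 2:

```perl
use strict;
use warnings;

my $s   = 'abracadabra';
my @tmp = (3, 4);

# An array evaluated in scalar context yields its element count.
my $count = scalar(@tmp);

# substr's OFFSET slot imposes scalar context, so @tmp becomes 2 here.
my $from_array = substr($s, @tmp);
my $from_count = substr($s, $count);

print "$from_array\n";   # racadabra
print "$from_count\n";   # racadabra
```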

    > Could it be an oversight/bug in substr? (probably not,
    > given the thoroughness that has gone into the development of perl
    > functions, but this is unperl-like limitation :)


    I don't see it as a limitation at all. How frequently are you going to
    have such distinct values as an offset and a length contained in a
    single array? Unless you do so very intentionally, it's just not
    likely to happen.

    Paul Lalli
    Paul Lalli, Nov 29, 2005
    #2

  3. Andrew

    Paul Lalli Guest

    Andrew wrote:
    > #-------------- output -------------------
    >
    > substr($s, $offset, $length) ---> acad
    > substr($s, @tmp) ---> racadabra
    > substr($s, @tmp[0,1]) ---> cadabra
    > substr($s, $tmp[0], $tmp[1]) ---> acad
    >
    > #------------ end output ----------------
    >
    > ( Also, FWIW, using '@tmp[0,1]' yields a weird result I can't explain )


    Didn't see this comment originally. The issue here is that an array
    slice returns a list, not an array. A slice in scalar context returns
    the last item of that slice. Therefore,
    substr($s, @tmp[0,1]);
    is equivalent to:
    substr($s, 4);
    since $tmp[1] == 4.
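
    A short sketch of that equivalence, again using the values from the original post. The variable names $last and $from_slice are only illustrative:

```perl
use strict;
use warnings;

my $s   = 'abracadabra';
my @tmp = (3, 4);

# An array slice in scalar context evaluates to its last element.
my $last = scalar(@tmp[0, 1]);

# The OFFSET slot imposes scalar context on the slice, so this is substr($s, 4).
my $from_slice = substr($s, @tmp[0, 1]);

print "$last\n";         # 4
print "$from_slice\n";   # cadabra
```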

    Paul Lalli
    Paul Lalli, Nov 29, 2005
    #3
  4. Andrew

    Andrew Guest

    > > In the code below, the array '@tmp' with two elements is treated as its
    >> scalar (the number 2), when used as argument to substr. When I
    >> syntactically break up '@tmp' into '$tmp[0]' and '$tmp[1]', substr does
    >> what i want. Why is this so? Is there a way to force list context
    >> (besides explicit enumeration)? I can't seem to find any mention of
    >> this anywhere.

    >
    >$ perldoc -f substr
    > substr EXPR,OFFSET,LENGTH,REPLACEMENT
    > substr EXPR,OFFSET,LENGTH
    > substr EXPR,OFFSET
    >
    >substr is expecting between 2 and 4 scalars. So when you pass it an
    >array, yes, it converts that array to a scalar. That's what it's
    >documented to do.


    Yes, I had looked at "perldoc -f substr" before posting. Au contraire,
    the fact that "substr is expecting between 2 and 4 scalars", as you put
    it, is NOT documented. The word 'scalar' is nowhere to be found in the
    doc.

    Furthermore, I suppose we are running into the issue of semantics
    "versus" syntax, and the delineation between the two. What a perl list
    means "semantically" (conceptually) may be something different from
    what it is syntactically.

    The documentation bit "substr EXPR,OFFSET,LENGTH,REPLACEMENT" certainly
    suggests a list. And, in as much as an @array is a (semantic?)
    "representation" of multiple scalars, i may be misled to substitute
    this representation for that which it represents.

    In ordinary subroutines that i build, "($x,$y)" and "@tmp" ( where
    @tmp=($x,$y) ) are certainly interchangeable as arguments to the
    subroutine (grabbed via @_ or $_[0], $_[1], ..., etc., within the sub).
    Using either yields exact same results.

    Thus, it appears to me, that the "substr" function does something out
    of the norm with the arguments, distinguishing between and acting
    differently with the two SYNTACTICAL variations (of that which is
    semantically equivalent)
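
    The difference described here can be reproduced with user-defined subs. The sketch below is only an illustration: plain() and proto() are hypothetical names, and the ($$) prototype is a stand-in for the scalar slots that built-ins such as substr impose, not a claim about how substr is implemented.

```perl
use strict;
use warnings;

# No prototype: arguments are evaluated in list context and flattened into @_.
sub plain { return scalar(@_) }

# ($$) prototype: each argument slot is evaluated in scalar context.
sub proto ($$) { return "$_[0],$_[1]" }

my @tmp = (3, 4);

my $n_plain = plain('abracadabra', @tmp);   # three arguments arrive
my $seen    = proto('abracadabra', @tmp);   # @tmp collapses to its count

print "$n_plain\n";   # 3
print "$seen\n";      # abracadabra,2
```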


    >> Could it be an oversight/bug in substr? (probably not,
    >> given the thoroughness that has gone into the development of perl
    >> functions, but this is unperl-like limitation :)

    >
    >I don't see it as a limitation at all. How frequently are you going to
    >have such distinct values as an offset and a length contained in a
    >single array? Unless you do so very intentionally, it's just not
    >likely to happen.
    >


    Real-life example: I have to extract substrings of different offsets
    and lengths (with many iterations). I define my offsets and lengths in
    a List of Lists:

    @LoL=( [1,4], [5,3], [10,3], [54, 5], [9,2], .... etc. );

    Then (for every iteration) i want to grab my multiple substrings like
    so:

    @substrings=map { substr($string, @{$LoL[$_]}) } (0..$#LoL);

    OK, i realize, i can "fix" this by adding more code:

    @substrings=map { substr($string, $LoL[$_][0], $LoL[$_][1] ) }
    (0..$#LoL);

    But what if my offsets and lengths were in a List of Hashes of Lists
    of Lists? :)
    And, the principle of it... syntactical brevity... which, in my
    experience, has always been a hallmark of Perl...particularly vis a vis
    arrays/hashes/scalars.

    andrew
    Andrew, Nov 29, 2005
    #4
  5. Andrew

    Guest

    "Andrew" <> wrote:
    > In the code below, the array '@tmp' with two elements is treated as its
    > scalar (the number 2), when used as argument to substr. When I
    > syntactically break up '@tmp' into '$tmp[0]' and '$tmp[1]', substr does
    > what i want. Why is this so?


    Because substr has a prototype demanding scalars.

    ~/perl_misc:$ perl -le 'print prototype "CORE::substr"'
    $$;$$


    > Is there a way to force list context
    > (besides explicit enumeration)? I can't seem to find any mention of
    > this anywhere.


    perldoc -f substr

    substr EXPR,OFFSET,LENGTH,REPLACEMENT
    substr EXPR,OFFSET,LENGTH
    substr EXPR,OFFSET

    Nowhere does it say LIST, so I wouldn't expect it to take an array
    interpreted as a list.

    Xho

    , Nov 29, 2005
    #5
  6. Andrew

    Andrew Guest

    >"Andrew" <> wrote:
    >> In the code below, the array '@tmp' with two elements is treated as its
    >> scalar (the number 2), when used as argument to substr. When I
    >> syntactically break up '@tmp' into '$tmp[0]' and '$tmp[1]', substr does
    >> what i want. Why is this so?

    >
    >Because substr has a prototype demanding scalars.
    >
    >~/perl_misc:$ perl -le 'print prototype "CORE::substr"'
    >$$;$$


    Is there a reason the prototype is demanding scalars? (Or is it simply
    "the king's decree"?)

    Again, in my observation and experience, perl particulars are designed
    with careful consideration of all the tradeoffs, and the most optimal
    (and usually intuitive) alternative is chosen (for syntax and
    functionality).

    >
    >> Is there a way to force list context
    >> (besides explicit enumeration)? I can't seem to find any mention of
    >> this anywhere.

    >
    >perldoc -f substr
    >
    > substr EXPR,OFFSET,LENGTH,REPLACEMENT
    > substr EXPR,OFFSET,LENGTH
    > substr EXPR,OFFSET
    >
    >Nowhere does it say LIST, so I wouldn't expect it to take an array
    >interpretted as a list.


    Assuming that with the above you are rebutting my comment that "the
    word 'scalar' is nowhere to be found...": So the documentation does not
    mention either lists or scalars, which may leave the reader guessing.
    So, the purpose of my response to yours is to prevent the quick
    dismissal of the notion that the documentation /may/ need to be a bit
    more explicit. Although, admittedly, I may be wrong about that...
    particularly, if this is covered somewhere else in the docs.

    andrew
    Andrew, Nov 29, 2005
    #6
  7. Andrew

    Guest

    "Andrew" <> wrote:
    > >"Andrew" <> wrote:
    > >> In the code below, the array '@tmp' with two elements is treated as
    > >> its scalar (the number 2), when used as argument to substr. When I
    > >> syntactically break up '@tmp' into '$tmp[0]' and '$tmp[1]', substr
    > >> does what i want. Why is this so?

    > >
    > >Because substr has a prototype demanding scalars.
    > >
    > >~/perl_misc:$ perl -le 'print prototype "CORE::substr"'
    > >$$;$$

    >
    > Is there a reason the prototype is demanding scalars? (Or is it simply
    > "the king's decree"?)


    In the 2nd and 3rd slots, it makes sense to demand scalars, because it may
    be natural to use (say) the scalar value (i.e. count) of a grep, map, or
    array to specify the offset or length. It wouldn't seem particularly
    natural to use those counts as a string (the first argument), but then
    again the alternative behavior isn't natural either. If the 2nd and 3rd
    arguments are interpreted as scalars, the first one is going to be, too.

    Aside from which, I am aware of no built-ins taking an array treated as a
    list in which the list is heterogenous. Why would substr be different?

    > Again, in my observation and experience, perl particulars are designed
    > with careful consideration of all the tradeoffs, and the most optimal
    > (and usually intuitive) alternative is chosen (for syntax and
    > functionality).


    And this case is no exception.

    >
    > >
    > >> Is there a way to force list context
    > >> (besides explicit enumeration)? I can't seem to find any mention of
    > >> this anywhere.

    > >
    > >perldoc -f substr
    > >
    > > substr EXPR,OFFSET,LENGTH,REPLACEMENT
    > > substr EXPR,OFFSET,LENGTH
    > > substr EXPR,OFFSET
    > >
    > >Nowhere does it say LIST, so I wouldn't expect it to take an array
    > >interpretted as a list.

    >
    > Assuming that with the above you are rebutting my comment that "the
    > word 'scalar' is nowhere to be found...": So the documentation does not
    > mention either lists or scalars, which may leave the reader guessing.


    Not a reader who knows about the format used in perldoc. If it doesn't
    say LIST, it is not a LIST. Can you imagine how long the docs would be if
    they had to add a tag [And here we didn't say LIST because we didn't mean
    LIST] after every other word?

    > So, the purpose of my response to yours is to prevent the quick
    > dismissal of the notion that the documentation /may/ need to be a bit
    > more explicit.


    If you proposed actual verbiage to implement that greater explicitness, I'd
    take your notion more seriously.

    > Although, admittedly, I may be wrong about that...
    > particularly, if this is covered somewhere else in the docs.


    I don't know if there is a part of perldoc that describes the format by
    which it describes the format by which it describes the formats it
    describes. If there is, I'm glad I've never needed to consult it.

    Xho

    , Nov 29, 2005
    #7
  8. Andrew

    Paul Lalli Guest

    Andrew wrote:
    > >
    > >Because substr has a prototype demanding scalars.
    > >
    > >~/perl_misc:$ perl -le 'print prototype "CORE::substr"'
    > >$$;$$

    >
    > Is there a reason the prototype is demanding scalars? (Or is it simply
    > "the king's decree"?)


    Because it makes the most sense to use a scalar there.

    > Again, in my observation and experience, perl particulars are designed
    > with careful consideration of all the tradeoffs, and the most optimal
    > (and usually intuitive) alternative is chosen (for syntax and
    > functionality).


    Yes. As it is here. Yours is the first post I can ever remember
    seeing wishing it was the other way around.

    > >perldoc -f substr
    > >
    > > substr EXPR,OFFSET,LENGTH,REPLACEMENT
    > > substr EXPR,OFFSET,LENGTH
    > > substr EXPR,OFFSET
    > >
    > >Nowhere does it say LIST, so I wouldn't expect it to take an array
    > >interpretted as a list.

    >
    > Assuming that with the above you are rebutting my comment that "the
    > word 'scalar' is nowhere to be found...": So the documentation does not
    > mention either lists or scalars, which may leave the reader guessing.
    > So, the purpose of my response to yours is to prevent the quick
    > dismissal of the notion that the documentation /may/ need to be a bit
    > more explicit. Although, admittedly, I may be wrong about that...
    > particularly, if this is covered somewhere else in the docs.


    You should probably read the first two paragraphs of
    perldoc perlfunc
    which is the overall enclosing entity of the `perldoc -f` syntax.

    Paul Lalli
    Paul Lalli, Nov 29, 2005
    #8
  9. Andrew

    Anno Siegel Guest

    <> wrote in comp.lang.perl.misc:
    > "Andrew" <> wrote:
    > > >"Andrew" <> wrote:
    > > >> In the code below, the array '@tmp' with two elements is treated as
    > > >> its scalar (the number 2), when used as argument to substr. When I
    > > >> syntactically break up '@tmp' into '$tmp[0]' and '$tmp[1]', substr
    > > >> does what i want. Why is this so?


    [...]

    > Aside from which, I am aware of no built-ins taking an array treated as a
    > list in which the list is heterogenous. Why would substr be different?


    The kill() function is an exception. Described as "kill SIGNAL, LIST",
    it accepts an array argument whose first element is the signal and
    the others are pids:

    perl -le '@l = ( "HUP", $$); kill @l'

    kills itself.

    I agree that Perl builtins normally don't work like that. The behavior
    of kill() may be unintentional.
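
    A safer way to try this out than sending HUP to yourself is signal 0, which delivers nothing and only checks whether the process can be signalled. This is a sketch assuming a Unix-like system; @args and $signalled are illustrative names.

```perl
use strict;
use warnings;

# Signal 0 sends nothing; it only checks that the process can be signalled.
my @args = (0, $$);

# kill takes a LIST, so the array is flattened:
# the first element is the signal, the rest are pids.
my $signalled = kill @args;

print "$signalled\n";   # number of processes successfully signalled
```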

    Anno
    --
    If you want to post a followup via groups.google.com, don't use
    the broken "Reply" link at the bottom of the article. Click on
    "show options" at the top of the article, then click on the
    "Reply" at the bottom of the article headers.
    Anno Siegel, Nov 30, 2005
    #9
  10. Andrew

    Andrew Guest

    Anno Siegel wrote:
    > <> wrote in comp.lang.perl.misc:
    > > "Andrew" <> wrote:
    > > > >"Andrew" <> wrote:
    > > > >> In the code below, the array '@tmp' with two elements is treated as
    > > > >> its scalar (the number 2), when used as argument to substr. When I
    > > > >> syntactically break up '@tmp' into '$tmp[0]' and '$tmp[1]', substr
    > > > >> does what i want. Why is this so?

    >
    > [...]
    >
    > > Aside from which, I am aware of no built-ins taking an array treated as a
    > > list in which the list is heterogenous. Why would substr be different?

    >
    > The kill() function is an exception. Described as "kill SIGNAL, LIST",
    > it accepts an array argument whose first element is the signal and
    > the others are pids:
    >
    > perl -le '@l = ( "HUP", $$); kill @l'
    >
    > kills itself.
    >
    > I agree that Perl builtins normally don't work like that. The behavior
    > of kill() may be unintentional.


    OK, you've enlightened me on the conventions of perldoc -f, regarding
    arguments, and I now understand the way things are.

    However, it's still not clear _why_. Xho's argument about needing to
    pass the count of grep, map, etc., isn't quite convincing. You can
    always get a count of members of a list, like so:

    scalar( @{[ ...LIST... ]} )

    However, apparently, not the reverse. You can't stipulate that an
    @array be forced in as a list, with some brief syntax (
    force_list(@array) ).

    Thus, i'm not certain that to trade off the convenience of
    substr(@bunch_of_scalar_args) for counts was worth it. Perhaps there
    are other reasons not mentioned in this thread(?)

    andrew
    Andrew, Nov 30, 2005
    #10
  11. Andrew <> wrote:

    > You can't stipulate that an
    > @array be forced in as a list, with some brief syntax (
    > force_list(@array) ).



    map $_, @array;


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
    Tad McClellan, Nov 30, 2005
    #11
  12. Andrew

    Guest

    Tad McClellan <> wrote:
    > Andrew <> wrote:
    >
    > > You can't stipulate that an
    > > @array be forced in as a list, with some brief syntax (
    > > force_list(@array) ).

    >
    > map $_, @array;


    Nah, that won't work for what he wants. Now the map rather than the array
    is being thrust into a scalar context, but the end result will be pretty
    much the same.

    The only way I know of is:
    substr $array[0], $array[1], $array[2];

    Which just isn't all that ugly.
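
    For the List of Lists case raised earlier in the thread, one possible workaround is a small wrapper sub. This is only a sketch: substr_list() is a hypothetical helper, not a core function, and the [offset, length] pairs are made up for illustration. A plain sub has no prototype, so an array passed to it is flattened into @_ before substr ever sees it.

```perl
use strict;
use warnings;

# Hypothetical helper: unpack the flattened arguments, then call substr
# with explicit scalars. Falls back to the two-argument form if no length.
sub substr_list {
    my ($str, $offset, $length) = @_;
    return defined $length
        ? substr($str, $offset, $length)
        : substr($str, $offset);
}

my $string = 'abracadabra';
my @LoL    = ([1, 4], [5, 3], [0, 4]);   # illustrative [offset, length] pairs

my @substrings = map { substr_list($string, @$_) } @LoL;

print join(',', @substrings), "\n";   # brac,ada,abra
```

    The cost is one extra sub call per substring, and the wrapper does not support using substr as an lvalue.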

    Xho

    , Nov 30, 2005
    #12
  13. Anno Siegel wrote:
    > <> wrote in comp.lang.perl.misc:
    >
    >>Aside from which, I am aware of no built-ins taking an array treated as a
    >>list in which the list is heterogenous. Why would substr be different?

    >
    > The kill() function is an exception. Described as "kill SIGNAL, LIST",
    > it accepts an array argument whose first element is the signal and
    > the others are pids:
    >
    > perl -le '@l = ( "HUP", $$); kill @l'
    >
    > kills itself.


    $ perl -le' print prototype "CORE::kill" '
    @

    According to kill()'s prototype it just accepts a list and interprets the
    first argument as a signal. Sort of like the way split() interprets its first
    argument as a regular expression.


    John
    --
    use Perl;
    program
    fulfillment
    John W. Krahn, Dec 1, 2005
    #13
  14. Andrew wrote:
    > Anno Siegel wrote:
    >><> wrote in comp.lang.perl.misc:
    >>
    >>>Aside from which, I am aware of no built-ins taking an array treated as a
    >>>list in which the list is heterogenous. Why would substr be different?

    >>The kill() function is an exception. Described as "kill SIGNAL, LIST",
    >>it accepts an array argument whose first element is the signal and
    >>the others are pids:
    >>
    >> perl -le '@l = ( "HUP", $$); kill @l'
    >>
    >>kills itself.
    >>
    >>I agree that Perl builtins normally don't work like that. The behavior
    >>of kill() may be unintentional.

    >
    > OK, you've enlightened me on the conventions of perldoc -f, regarding
    > arguments, and I now understand the way things are.
    >
    > However, it's still not clear _why_. Xho's argument about needing to
    > pass the count of grep, map, etc., isn't quite convincing. You can
    > always get a count of members of a list, like so:
    >
    > scalar( @{[ ...LIST... ]} )


    Or instead of copying the list to an array and dereferencing it:

    scalar( () = ...LIST... )
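
    Both counting idioms side by side, as a small sketch. letters() is a hypothetical list-returning sub used only to have something to count.

```perl
use strict;
use warnings;

sub letters { return ('a', 'b', 'c') }   # hypothetical list-returning sub

# Copy the list into an anonymous array, dereference it, and count the array.
my $via_array = scalar(@{[ letters() ]});

# Assign the list to an empty list; a list assignment in scalar context
# yields the number of elements on the right-hand side.
my $via_assign = scalar(() = letters());

print "$via_array $via_assign\n";   # 3 3
```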


    John
    --
    use Perl;
    program
    fulfillment
    John W. Krahn, Dec 1, 2005
    #14
  15. Andrew

    Anno Siegel Guest

    John W. Krahn <> wrote in comp.lang.perl.misc:
    > Anno Siegel wrote:
    > > <> wrote in comp.lang.perl.misc:
    > >
    > >>Aside from which, I am aware of no built-ins taking an array treated as a
    > >>list in which the list is heterogenous. Why would substr be different?

    > >
    > > The kill() function is an exception. Described as "kill SIGNAL, LIST",
    > > it accepts an array argument whose first element is the signal and
    > > the others are pids:
    > >
    > > perl -le '@l = ( "HUP", $$); kill @l'
    > >
    > > kills itself.

    >
    > $ perl -le' print prototype "CORE::kill" '
    > @
    >
    > According to kill()'s prototype it just accepts a list and intreprets the
    > first argument as a signal. Sort of like the way split() interprets its first
    > argument as a regular expression.


    Ah, but split() is very special, it doesn't behave like kill() at all.
    Its first argument is passed on *as a regex* and not evaluated like
    any normal function would do. That can't be done with a prototype,
    it must be special-cased by the compiler.

    On top of that, split() behaves as if the first parameter was prototyped
    to a scalar. A list in that position works like the number of its
    elements.

    my $re = qr/ /;
    my $str = 'haha ho2ho hihi';
    my @l = ( $re, $str);

    print "$_\n" for split $re, $str;
    print "\n";

    print "$_\n" for split @l;
    print "\n";

    print "$_\n" for split @l, $str;

    haha
    ho2ho
    hihi

    Use of uninitialized value in split at ./ttt line 12.

    haha ho
    ho hihi


    Anno
    --
    If you want to post a followup via groups.google.com, don't use
    the broken "Reply" link at the bottom of the article. Click on
    "show options" at the top of the article, then click on the
    "Reply" at the bottom of the article headers.
    Anno Siegel, Dec 1, 2005
    #15
  16. Anno Siegel wrote:
    > John W. Krahn <> wrote in comp.lang.perl.misc:
    >>Anno Siegel wrote:
    >>> <> wrote in comp.lang.perl.misc:
    >>>
    >>>>Aside from which, I am aware of no built-ins taking an array treated as a
    >>>>list in which the list is heterogenous. Why would substr be different?
    >>>The kill() function is an exception. Described as "kill SIGNAL, LIST",
    >>>it accepts an array argument whose first element is the signal and
    >>>the others are pids:
    >>>
    >>> perl -le '@l = ( "HUP", $$); kill @l'
    >>>
    >>>kills itself.

    >>$ perl -le' print prototype "CORE::kill" '
    >>@
    >>
    >>According to kill()'s prototype it just accepts a list and intreprets the
    >>first argument as a signal. Sort of like the way split() interprets its first
    >>argument as a regular expression.

    >
    > Ah, but split() is very special, it doesn't behave like kill() at all.
    > Its first argument is passed on *as a regex* and not evaluated like
    > any normal function would do. That can't be done with a prototype,
    > it must be special-cased by the compiler.
    >
    > On top of that, split() behaves as if the first parameter was prototyped
    > to a scalar. A list in that position works like the number of its
    > elements.


    I was just pointing out that the first argument is treated differently than
    the subsequent arguments in the list. :)


    John
    --
    use Perl;
    program
    fulfillment
    John W. Krahn, Dec 1, 2005
    #16
