substr forces scalar context with array argument

Andrew · Nov 29, 2005

In the code below, the array '@tmp' with two elements is treated as its
scalar (the number 2), when used as argument to substr. When I
syntactically break up '@tmp' into '$tmp[0]' and '$tmp[1]', substr does
what I want. Why is this so? Is there a way to force list context
(besides explicit enumeration)? I can't seem to find any mention of
this anywhere. Could it be an oversight/bug in substr? (Probably not,
given the thoroughness that has gone into the development of Perl
functions, but this is an un-Perl-like limitation.)

my $s='abracadabra';
my ($offset,$length)=@tmp=(3,4);

my @code=(
'substr($s, $offset, $length)',
'substr($s, @tmp)',
'substr($s, @tmp[0,1])',
'substr($s, $tmp[0], $tmp[1])'
);

foreach $c (@code) {
print "\n", $c, ' ---> ', eval($c);
}

#-------------- output -------------------

substr($s, $offset, $length) ---> acad
substr($s, @tmp) ---> racadabra
substr($s, @tmp[0,1]) ---> cadabra
substr($s, $tmp[0], $tmp[1]) ---> acad

#------------ end output ----------------

( Also, FWIW, using '@tmp[0,1]' yields a weird result I can't explain )

TIA

andrew

Paul Lalli · Nov 29, 2005

Andrew said:
In the code below, the array '@tmp' with two elements is treated as its
scalar (the number 2), when used as argument to substr. When I
syntactically break up '@tmp' into '$tmp[0]' and '$tmp[1]', substr does
what i want. Why is this so? Is there a way to force list context
(besides explicit enumeration)? I can't seem to find any mention of
this anywhere.

$ perldoc -f substr
substr EXPR,OFFSET,LENGTH,REPLACEMENT
substr EXPR,OFFSET,LENGTH
substr EXPR,OFFSET

substr is expecting between 2 and 4 scalars. So when you pass it an
array, yes, it converts that array to a scalar. That's what it's
documented to do.

Could it be an oversight/bug in substr? (probably not,
given the thoroughness that has gone into the development of perl
functions, but this is unperl-like limitation

I don't see it as a limitation at all. How frequently are you going to
have such distinct values as an offset and a length contained in a
single array? Unless you do so very intentionally, it's just not
likely to happen.

Paul Lalli

Paul Lalli · Nov 29, 2005

Andrew said:
#-------------- output -------------------

substr($s, $offset, $length) ---> acad
substr($s, @tmp) ---> racadabra
substr($s, @tmp[0,1]) ---> cadabra
substr($s, $tmp[0], $tmp[1]) ---> acad

#------------ end output ----------------

( Also, FWIW, using '@tmp[0,1]' yields a weird result I can't explain )

Didn't see this comment originally. The issue here is that an array
slice is not an array. In scalar context an array yields its element
count, but a slice yields its last element. Therefore,
substr($s, @tmp[0,1]);
is equivalent to:
substr($s, 4);
since $tmp[1] == 4.
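
Both context effects can be checked directly. The following is a minimal sketch that reuses $s and @tmp from the original post; the explicit scalar() calls are added only to make the implicit conversion visible and are not part of the original code.

```perl
use strict;
use warnings;

my $s   = 'abracadabra';
my @tmp = (3, 4);

# An array in scalar context yields its element count.
my $count = scalar(@tmp);

# An array slice in scalar context yields its last element.
my $last = scalar(@tmp[0, 1]);

print "count=$count last=$last\n";

# substr imposes scalar context on its offset argument, so these two
# calls behave like substr($s, $count) and substr($s, $last).
my $from_array = substr($s, @tmp);
my $from_slice = substr($s, @tmp[0, 1]);

# Explicit enumeration passes offset and length separately.
my $explicit = substr($s, $tmp[0], $tmp[1]);

print "$from_array\n$from_slice\n$explicit\n";
```

The three printed substrings should match the second, third, and fourth lines of the output quoted above.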

Paul Lalli

Andrew · Nov 29, 2005

In the code below, the array '@tmp' with two elements is treated as its
scalar (the number 2), when used as argument to substr. When I
syntactically break up '@tmp' into '$tmp[0]' and '$tmp[1]', substr does
what i want. Why is this so? Is there a way to force list context
(besides explicit enumeration)? I can't seem to find any mention of
this anywhere.

$ perldoc -f substr
substr EXPR,OFFSET,LENGTH,REPLACEMENT
substr EXPR,OFFSET,LENGTH
substr EXPR,OFFSET

substr is expecting between 2 and 4 scalars. So when you pass it an
array, yes, it converts that array to a scalar. That's what it's
documented to do.

Yes, I had looked at "perldoc -f substr" before posting. Au contraire,
the fact that "substr is expecting between 2 and 4 scalars", as you put
it, is NOT documented. The word 'scalar' is nowhere to be found in the
doc.

Furthermore, I suppose we are running into the issue of semantics
"versus" syntax, and the delineation between the two. What a perl list
means "semantically" (conceptually) may be something different from
what it is syntactically.

The documentation bit "substr EXPR,OFFSET,LENGTH,REPLACEMENT" certainly
suggests a list. And, inasmuch as an @array is a (semantic?)
"representation" of multiple scalars, I may be misled to substitute
this representation for that which it represents.

In ordinary subroutines that I build, "($x,$y)" and "@tmp" ( where
@tmp=($x,$y) ) are certainly interchangeable as arguments to the
subroutine (grabbed via @_ or $_[0], $_[1], ..., etc., within the sub).
Using either yields exact same results.

Thus, it appears to me that the "substr" function does something out
of the norm with its arguments, distinguishing between the two
SYNTACTICAL variations (of that which is semantically equivalent) and
acting differently on each.

I don't see it as a limitation at all. How frequently are you going to
have such distinct values as an offset and a length contained in a
single array? Unless you do so very intentionally, it's just not
likely to happen.

Real-life example: I have to extract substrings of different offsets
and lengths (with many iterations). I define my offsets and lengths in
a List of Lists:

@LoL=( [1,4], [5,3], [10,3], [54, 5], [9,2], .... etc. );

Then (for every iteration) I want to grab my multiple substrings like
so:

@substrings=map { substr($string, @{$LoL[$_]}) } (0..$#LoL);

OK, I realize I can "fix" this by adding more code:

@substrings=map { substr($string, $LoL[$_][0], $LoL[$_][1] ) }
(0..$#LoL);
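
One way to keep this short while still handing substr the scalars it expects is to unpack each pair into named variables inside the map block. This is only a sketch: the sample $string and the contents of @LoL are made up for illustration, and it iterates over the pairs themselves rather than over indices.

```perl
use strict;
use warnings;

# Sample data, invented for this illustration.
my $string = 'abracadabra';
my @LoL    = ([0, 4], [4, 3], [7, 4]);

# Each element of @LoL is an array reference; unpack it into two
# scalars so that substr receives offset and length separately.
my @substrings = map {
    my ($offset, $length) = @$_;
    substr($string, $offset, $length);
} @LoL;

print join(',', @substrings), "\n";
```

The same unpacking step works however deeply the pairs are nested, since only the innermost array reference has to be dereferenced.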

But what if my offsets and lengths were in a List of Hashes of Lists
of Lists?

And, the principle of it... syntactical brevity... which, in my
experience, has always been a hallmark of Perl...particularly vis a vis
arrays/hashes/scalars.

andrew

xhoster · Nov 29, 2005

Andrew said:
In the code below, the array '@tmp' with two elements is treated as its
scalar (the number 2), when used as argument to substr. When I
syntactically break up '@tmp' into '$tmp[0]' and '$tmp[1]', substr does
what i want. Why is this so?

Because substr has a prototype demanding scalars.

~/perl_misc:$ perl -le 'print prototype "CORE::substr"'
$$;$$
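
To see the effect of a prototype in isolation, here is a minimal sketch comparing an ordinary subroutine with a prototyped one. The subs plain and proto are hypothetical names invented for this illustration; only prototype() and CORE::substr come from Perl itself.

```perl
use strict;
use warnings;

# No prototype: arguments are evaluated in list context and flattened
# into @_, which is the behaviour of ordinary user-defined subs.
sub plain { return join ',', @_ }

# Prototype '$;$': each argument slot is evaluated in scalar context,
# so an array passed here arrives as its element count.
sub proto ($;$) { return join ',', @_ }

my @tmp = (3, 4);

my $builtin   = prototype('CORE::substr');
my $flattened = plain(@tmp);
my $counted   = proto(@tmp);

print "$builtin\n$flattened\n$counted\n";
```

The prototyped sub receives a single argument (the count), which mirrors what substr does with '@tmp'.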

Is there a way to force list context
(besides explicit enumeration)? I can't seem to find any mention of
this anywhere.

perldoc -f substr

substr EXPR,OFFSET,LENGTH,REPLACEMENT
substr EXPR,OFFSET,LENGTH
substr EXPR,OFFSET

Nowhere does it say LIST, so I wouldn't expect it to take an array
interpreted as a list.

Xho

Andrew · Nov 29, 2005

Andrew said:
In the code below, the array '@tmp' with two elements is treated as its
scalar (the number 2), when used as argument to substr. When I
syntactically break up '@tmp' into '$tmp[0]' and '$tmp[1]', substr does
what i want. Why is this so?

Because substr has a prototype demanding scalars.

~/perl_misc:$ perl -le 'print prototype "CORE::substr"'
$$;$$

Is there a reason the prototype is demanding scalars? (Or is it simply
"the king's decree"?)

Again, in my observation and experience, perl particulars are designed
with careful consideration of all the tradeoffs, and the most optimal
(and usually intuitive) alternative is chosen (for syntax and
functionality).

perldoc -f substr

substr EXPR,OFFSET,LENGTH,REPLACEMENT
substr EXPR,OFFSET,LENGTH
substr EXPR,OFFSET

Nowhere does it say LIST, so I wouldn't expect it to take an array
interpreted as a list.

Assuming that with the above you are rebutting my comment that "the
word 'scalar' is nowhere to be found...": So the documentation does not
mention either lists or scalars, which may leave the reader guessing.
So, the purpose of my response to yours is to prevent the quick
dismissal of the notion that the documentation /may/ need to be a bit
more explicit. Although, admittedly, I may be wrong about that...
particularly, if this is covered somewhere else in the docs.

andrew

xhoster · Nov 29, 2005

Andrew said:
Andrew said:

In the code below, the array '@tmp' with two elements is treated as
its scalar (the number 2), when used as argument to substr. When I
syntactically break up '@tmp' into '$tmp[0]' and '$tmp[1]', substr
does what i want. Why is this so?

Because substr has a prototype demanding scalars.

~/perl_misc:$ perl -le 'print prototype "CORE::substr"'
$$;$$

Is there a reason the prototype is demanding scalars? (Or is it simply
"the king's decree"?)

In the 2nd and 3rd slots, it makes sense to demand scalars, because it may
be natural to use (say) the scalar value (i.e. count) of a grep, map, or
array to specify the offset or length. It wouldn't seem particularly
natural to use those counts as a string (the first argument), but then
again the alternative behavior isn't natural either. If the 2nd and 3rd
arguments are interpreted as scalars, the first one is going to be, too.
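
A small sketch of the kind of usage described here, where a count is passed as the length. The word list is made-up sample data, and using grep this way is just one possible illustration of the point.

```perl
use strict;
use warnings;

my $s     = 'abracadabra';
my @words = qw(apple bob avocado);   # sample data for illustration

# substr puts its third argument in scalar context, so grep yields
# the number of matching elements rather than the elements themselves.
my $prefix = substr($s, 0, grep { /^a/ } @words);

print "$prefix\n";
```

Two words start with 'a', so the call takes the first two characters of $s.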

Aside from which, I am aware of no built-ins taking an array treated as a
list in which the list is heterogeneous. Why would substr be different?

Again, in my observation and experience, perl particulars are designed
with careful consideration of all the tradeoffs, and the most optimal
(and usually intuitive) alternative is chosen (for syntax and
functionality).

And this case is no exception.

Assuming that with the above you are rebutting my comment that "the
word 'scalar' is nowhere to be found...": So the documentation does not
mention either lists or scalars, which may leave the reader guessing.

Not a reader who knows about the format used in perldoc. If it doesn't
say LIST, it is not a LIST. Can you imagine how long the docs would be if
they had to add a tag [And here we didn't say LIST because we didn't mean
LIST] after every other word?

So, the purpose of my response to yours is to prevent the quick
dismissal of the notion that the documentation /may/ need to be a bit
more explicit.

If you proposed actual verbiage to implement that greater explicitness, I'd
take your notion more seriously.

Although, admittedly, I may be wrong about that...
particularly, if this is covered somewhere else in the docs.

I don't know if there is a part of perldoc that describes the format by
which it describes the format by which it describes the formats it
describes. If there is, I'm glad I've never needed to consult it.

Xho

Paul Lalli · Nov 29, 2005

Andrew said:
Is there a reason the prototype is demanding scalars? (Or is it simply
"the king's decree"?)

Because it makes the most sense to use a scalar there.

Again, in my observation and experience, perl particulars are designed
with careful consideration of all the tradeoffs, and the most optimal
(and usually intuitive) alternative is chosen (for syntax and
functionality).

Yes. As it is here. Yours is the first post I can ever remember
seeing wishing it was the other way around.

Assuming that with the above you are rebutting my comment that "the
word 'scalar' is nowhere to be found...": So the documentation does not
mention either lists or scalars, which may leave the reader guessing.
So, the purpose of my response to yours is to prevent the quick
dismissal of the notion that the documentation /may/ need to be a bit
more explicit. Although, admittedly, I may be wrong about that...
particularly, if this is covered somewhere else in the docs.

You should probably read the first two paragraphs of
perldoc perlfunc
which is the overall enclosing entity of the `perldoc -f` syntax.

Paul Lalli

Anno Siegel · Nov 30, 2005

Andrew said:
Andrew said:

In the code below, the array '@tmp' with two elements is treated as
its scalar (the number 2), when used as argument to substr. When I
syntactically break up '@tmp' into '$tmp[0]' and '$tmp[1]', substr
does what i want. Why is this so?

[...]

Aside from which, I am aware of no built-ins taking an array treated as a
list in which the list is heterogeneous. Why would substr be different?

The kill() function is an exception. Described as "kill SIGNAL, LIST",
it accepts an array argument whose first element is the signal and
the others are pids:

perl -le '@l = ( "HUP", $$); kill @l'

kills itself.

I agree that Perl builtins normally don't work like that. The behavior
of kill() may be unintentional.

Anno

Andrew · Nov 30, 2005

Anno said:
Andrew said:

In the code below, the array '@tmp' with two elements is treated as
its scalar (the number 2), when used as argument to substr. When I
syntactically break up '@tmp' into '$tmp[0]' and '$tmp[1]', substr
does what i want. Why is this so?

[...]

Aside from which, I am aware of no built-ins taking an array treated as a
list in which the list is heterogeneous. Why would substr be different?

The kill() function is an exception. Described as "kill SIGNAL, LIST",
it accepts an array argument whose first element is the signal and
the others are pids:

perl -le '@l = ( "HUP", $$); kill @l'

kills itself.

I agree that Perl builtins normally don't work like that. The behavior
of kill() may be unintentional.

OK, you've enlightened me on the conventions of perldoc -f, regarding
arguments, and I now understand the way things are.

However, it's still not clear _why_. Xho's argument about needing to
pass the count of grep, map, etc., isn't quite convincing. You can
always get a count of members of a list, like so:

scalar( @{[ ...LIST... ]} )

However, apparently, not the reverse. You can't stipulate that an
@array be forced in as a list, with some brief syntax (
force_list(@array) ).

Thus, I'm not certain that trading off the convenience of
substr(@bunch_of_scalar_args) for counts was worth it. Perhaps there
are other reasons not mentioned in this thread(?)

andrew

Tad McClellan · Nov 30, 2005

Andrew said:
You can't stipulate that an
@array be forced in as a list, with some brief syntax (
force_list(@array) ).

map $_, @array;

xhoster · Nov 30, 2005

Tad McClellan said:
map $_, @array;

Nah, that won't work for what he wants. Now the map rather than the array
is being thrust into a scalar context, but the end result will be pretty
much the same.

The only way I know of is:
substr $array[0], $array[1], $array[2];

Which just isn't all that ugly.
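
For anyone who wants to verify this, here is a minimal sketch. It first shows that wrapping the array in map does not help, then shows one possible workaround: a wrapper without a prototype. The name substr_list is hypothetical and invented for this illustration; it is not a Perl built-in, and it only covers the two- and three-argument forms.

```perl
use strict;
use warnings;

my $s   = 'abracadabra';
my @tmp = (3, 4);

# map in scalar context returns how many elements it produced,
# so this call still behaves like substr($s, 2).
my $via_map = substr($s, map { $_ } @tmp);

# Hypothetical wrapper: with no prototype, the caller's list is
# flattened into @_ and can be unpacked into separate scalars.
sub substr_list {
    my ($string, $offset, $length) = @_;
    return defined $length
        ? substr($string, $offset, $length)
        : substr($string, $offset);
}

my $via_wrapper = substr_list($s, @tmp);

print "$via_map\n$via_wrapper\n";
```

The wrapper costs one extra sub call per substring, which may or may not matter for the many-iterations case described earlier in the thread.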

Xho

John W. Krahn · Dec 1, 2005

Anno said:
The kill() function is an exception. Described as "kill SIGNAL, LIST",
it accepts an array argument whose first element is the signal and
the others are pids:

perl -le '@l = ( "HUP", $$); kill @l'

kills itself.

$ perl -le' print prototype "CORE::kill" '
@

According to kill()'s prototype it just accepts a list and interprets the
first argument as a signal. Sort of like the way split() interprets its first
argument as a regular expression.

John

John W. Krahn · Dec 1, 2005

Andrew said:
Anno said:

The kill() function is an exception. Described as "kill SIGNAL, LIST",
it accepts an array argument whose first element is the signal and
the others are pids:

perl -le '@l = ( "HUP", $$); kill @l'

kills itself.

I agree that Perl builtins normally don't work like that. The behavior
of kill() may be unintentional.

OK, you've enlightened me on the conventions of perldoc -f, regarding
arguments, and I now understand the way things are.

However, it's still not clear _why_. Xho's argument about needing to
pass the count of grep, map, etc., isn't quite convincing. You can
always get a count of members of a list, like so:

scalar( @{[ ...LIST... ]} )

Or instead of copying the list to an array and dereferencing it:

scalar( () = ...LIST... )
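
Both counting idioms side by side, as a minimal sketch. The list being counted (global matches of /a/ against a sample string) is an arbitrary choice made for this illustration.

```perl
use strict;
use warnings;

my $s = 'abracadabra';

# Copy the list into an anonymous array, dereference it, and take
# the array's element count.
my $count_a = scalar( @{[ $s =~ /a/g ]} );

# Assign the list to an empty list; in scalar context a list
# assignment yields the number of elements on its right-hand side.
my $count_b = scalar( () = $s =~ /a/g );

print "$count_a $count_b\n";
```

Both forms evaluate the LIST in list context first and only then reduce it to a count, which is why neither is affected by the last-element behaviour of a bare list or slice.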

John

Anno Siegel · Dec 1, 2005

John W. Krahn said:
$ perl -le' print prototype "CORE::kill" '
@

According to kill()'s prototype it just accepts a list and interprets the
first argument as a signal. Sort of like the way split() interprets its first
argument as a regular expression.

Ah, but split() is very special, it doesn't behave like kill() at all.
Its first argument is passed on *as a regex* and not evaluated like
any normal function would do. That can't be done with a prototype,
it must be special-cased by the compiler.

On top of that, split() behaves as if the first parameter was prototyped
to a scalar. A list in that position works like the number of its
elements.

my $re = qr/ /;
my $str = 'haha ho2ho hihi';
my @l = ( $re, $str);

print "$_\n" for split $re, $str;
print "\n";

print "$_\n" for split @l;
print "\n";

print "$_\n" for split @l, $str;

ha
ho2ho
hihi

Use of uninitialized value in split at ./ttt line 12.

haha ho
ho hihi

Anno

John W. Krahn · Dec 1, 2005

Anno said:
Ah, but split() is very special, it doesn't behave like kill() at all.
Its first argument is passed on *as a regex* and not evaluated like
any normal function would do. That can't be done with a prototype,
it must be special-cased by the compiler.

On top of that, split() behaves as if the first parameter was prototyped
to a scalar. A list in that position works like the number of its
elements.

I was just pointing out that the first argument is treated differently than
the subsequent arguments in the list.

John
