substr forces scalar context with array argument

Andrew

In the code below, the array '@tmp' with two elements is treated as its
scalar value (the number 2) when used as an argument to substr. When I
syntactically break up '@tmp' into '$tmp[0]' and '$tmp[1]', substr does
what I want. Why is this so? Is there a way to force list context
(besides explicit enumeration)? I can't seem to find any mention of
this anywhere. Could it be an oversight/bug in substr? (Probably not,
given the thoroughness that has gone into the development of Perl
functions, but this is an un-Perl-like limitation. :)

my $s = 'abracadabra';
my @tmp = (3, 4);
my ($offset, $length) = @tmp;

# Each snippet is kept as a string so it can be printed next to its result.
my @code = (
    'substr($s, $offset, $length)',
    'substr($s, @tmp)',
    'substr($s, @tmp[0,1])',
    'substr($s, $tmp[0], $tmp[1])',
);


foreach my $c (@code) {
    print "\n", $c, ' ---> ', eval($c);
}


#-------------- output -------------------

substr($s, $offset, $length) ---> acad
substr($s, @tmp) ---> racadabra
substr($s, @tmp[0,1]) ---> cadabra
substr($s, $tmp[0], $tmp[1]) ---> acad

#------------ end output ----------------

( Also, FWIW, using '@tmp[0,1]' yields a weird result I can't explain )

TIA

andrew
 
Paul Lalli

Andrew said:
In the code below, the array '@tmp' with two elements is treated as its
scalar (the number 2), when used as argument to substr. When I
syntactically break up '@tmp' into '$tmp[0]' and '$tmp[1]', substr does
what i want. Why is this so? Is there a way to force list context
(besides explicit enumeration)? I can't seem to find any mention of
this anywhere.

$ perldoc -f substr
substr EXPR,OFFSET,LENGTH,REPLACEMENT
substr EXPR,OFFSET,LENGTH
substr EXPR,OFFSET

substr is expecting between 2 and 4 scalars. So when you pass it an
array, yes, it converts that array to a scalar. That's what it's
documented to do.

Could it be an oversight/bug in substr? (probably not,
given the thoroughness that has gone into the development of perl
functions, but this is unperl-like limitation :)

I don't see it as a limitation at all. How frequently are you going to
have such distinct values as an offset and a length contained in a
single array? Unless you do so very intentionally, it's just not
likely to happen.

Paul Lalli
 
Paul Lalli

Andrew said:
#-------------- output -------------------

substr($s, $offset, $length) ---> acad
substr($s, @tmp) ---> racadabra
substr($s, @tmp[0,1]) ---> cadabra
substr($s, $tmp[0], $tmp[1]) ---> acad

#------------ end output ----------------

( Also, FWIW, using '@tmp[0,1]' yields a weird result I can't explain )

Didn't see this comment originally. The issue here is that an array
slice is not an array. In scalar context an array returns its element
count, but a slice returns its last element. Therefore,
substr($s, @tmp[0,1]);
is equivalent to:
substr($s, 4);
since $tmp[1] == 4.
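
A minimal sketch of the two scalar-context behaviors involved, reusing $s and @tmp from the original post ($count and $last are names chosen only for illustration):

```perl
use strict;
use warnings;

my $s   = 'abracadabra';
my @tmp = (3, 4);

# An array in scalar context evaluates to its element count.
my $count = @tmp;                 # 2

# An array slice in scalar context evaluates to its last element.
my $last = scalar(@tmp[0, 1]);    # 4

# substr puts its offset argument in scalar context, so:
print substr($s, @tmp),       "\n";   # same as substr($s, $count)
print substr($s, @tmp[0, 1]), "\n";   # same as substr($s, $last)
```

This reproduces the 'racadabra' and 'cadabra' lines from the output above.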

Paul Lalli
 
Andrew

In the code below, the array '@tmp' with two elements is treated as its
scalar (the number 2), when used as argument to substr. When I
syntactically break up '@tmp' into '$tmp[0]' and '$tmp[1]', substr does
what i want. Why is this so? Is there a way to force list context
(besides explicit enumeration)? I can't seem to find any mention of
this anywhere.

$ perldoc -f substr
substr EXPR,OFFSET,LENGTH,REPLACEMENT
substr EXPR,OFFSET,LENGTH
substr EXPR,OFFSET

substr is expecting between 2 and 4 scalars. So when you pass it an
array, yes, it converts that array to a scalar. That's what it's
documented to do.

Yes, I had looked at "perldoc -f substr" before posting. Au contraire,
the fact that "substr is expecting between 2 and 4 scalars", as you put
it, is NOT documented. The word 'scalar' is nowhere to be found in the
doc.

Furthermore, I suppose we are running into the issue of semantics
"versus" syntax, and the delineation between the two. What a perl list
means "semantically" (conceptually) may be something different from
what it is syntactically.

The documentation bit "substr EXPR,OFFSET,LENGTH,REPLACEMENT" certainly
suggests a list. And, inasmuch as an @array is a (semantic?)
"representation" of multiple scalars, I may be misled to substitute
this representation for that which it represents.

In ordinary subroutines that I build, "($x,$y)" and "@tmp" ( where
@tmp=($x,$y) ) are certainly interchangeable as arguments to the
subroutine (grabbed via @_ or $_[0], $_[1], ..., etc., within the sub).
Using either yields exactly the same results.

Thus, it appears to me, that the "substr" function does something out
of the norm with the arguments, distinguishing between and acting
differently with the two SYNTACTICAL variations (of that which is
semantically equivalent)

I don't see it as a limitation at all. How frequently are you going to
have such distinct values as an offset and a length contained in a
single array? Unless you do so very intentionally, it's just not
likely to happen.

Real-life example: I have to extract substrings of different offsets
and lengths (with many iterations). I define my offsets and lengths in
a List of Lists:

@LoL=( [1,4], [5,3], [10,3], [54, 5], [9,2], .... etc. );

Then (for every iteration) I want to grab my multiple substrings like
so:

@substrings=map { substr($string, @{$LoL[$_]}) } (0..$#LoL);

OK, I realize I can "fix" this by adding more code:

@substrings=map { substr($string, $LoL[$_][0], $LoL[$_][1] ) }
(0..$#LoL);
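
One shorter variant, sketched under the assumption that each inner array holds exactly an offset and a length, is to iterate over the array references themselves rather than over indices. The sample $string and @LoL values below are made up for illustration:

```perl
use strict;
use warnings;

my $string = 'abracadabra';             # sample data, not from the thread
my @LoL    = ([0, 4], [4, 3], [7, 4]);  # [offset, length] pairs

# $_ is each inner array reference in turn.
my @substrings = map { substr($string, $_->[0], $_->[1]) } @LoL;

print join(',', @substrings), "\n";     # abra,cad,abra
```

The offset and length are still spelled out, but the index bookkeeping on @LoL goes away.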

But what if my offsets and lengths were in a List of Hashes of Lists
of Lists? :)
And then there is the principle of it: syntactical brevity, which, in my
experience, has always been a hallmark of Perl, particularly vis-a-vis
arrays/hashes/scalars.

andrew
 
xhoster

Andrew said:
In the code below, the array '@tmp' with two elements is treated as its
scalar (the number 2), when used as argument to substr. When I
syntactically break up '@tmp' into '$tmp[0]' and '$tmp[1]', substr does
what i want. Why is this so?

Because substr has a prototype demanding scalars.

~/perl_misc:$ perl -le 'print prototype "CORE::substr"'
$$;$$
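
The same effect can be reproduced with a user-defined sub, which may make the mechanism easier to see. In this sketch my_substr is a hypothetical name; it is declared with the same $$;$$ prototype, so an array passed in the second slot arrives as its element count:

```perl
use strict;
use warnings;

# Hypothetical sub declared with the same prototype as CORE::substr.
sub my_substr($$;$$) {
    my @args = @_;
    return scalar @args;    # how many arguments actually arrived
}

my $s   = 'abracadabra';
my @tmp = (3, 4);

# The prototype forces @tmp into scalar context: two arguments arrive.
my $n_array = my_substr($s, @tmp);

# Explicit elements each fill their own slot: three arguments arrive.
my $n_explicit = my_substr($s, $tmp[0], $tmp[1]);

print "$n_array $n_explicit\n";    # 2 3
```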

Is there a way to force list context
(besides explicit enumeration)? I can't seem to find any mention of
this anywhere.

perldoc -f substr

substr EXPR,OFFSET,LENGTH,REPLACEMENT
substr EXPR,OFFSET,LENGTH
substr EXPR,OFFSET

Nowhere does it say LIST, so I wouldn't expect it to take an array
interpreted as a list.

Xho
 
Andrew

Andrew said:
In the code below, the array '@tmp' with two elements is treated as its
scalar (the number 2), when used as argument to substr. When I
syntactically break up '@tmp' into '$tmp[0]' and '$tmp[1]', substr does
what i want. Why is this so?

Because substr has a prototype demanding scalars.

~/perl_misc:$ perl -le 'print prototype "CORE::substr"'
$$;$$

Is there a reason the prototype is demanding scalars? (Or is it simply
"the king's decree"?)

Again, in my observation and experience, perl particulars are designed
with careful consideration of all the tradeoffs, and the most optimal
(and usually intuitive) alternative is chosen (for syntax and
functionality).

perldoc -f substr

substr EXPR,OFFSET,LENGTH,REPLACEMENT
substr EXPR,OFFSET,LENGTH
substr EXPR,OFFSET

Nowhere does it say LIST, so I wouldn't expect it to take an array
interpreted as a list.

Assuming that with the above you are rebutting my comment that "the
word 'scalar' is nowhere to be found...": So the documentation does not
mention either lists or scalars, which may leave the reader guessing.
So, the purpose of my response to yours is to prevent the quick
dismissal of the notion that the documentation /may/ need to be a bit
more explicit. Although, admittedly, I may be wrong about that...
particularly, if this is covered somewhere else in the docs.

andrew
 
xhoster

Andrew said:
Andrew said:
In the code below, the array '@tmp' with two elements is treated as
its scalar (the number 2), when used as argument to substr. When I
syntactically break up '@tmp' into '$tmp[0]' and '$tmp[1]', substr
does what i want. Why is this so?

Because substr has a prototype demanding scalars.

~/perl_misc:$ perl -le 'print prototype "CORE::substr"'
$$;$$

Is there a reason the prototype is demanding scalars? (Or is it simply
"the king's decree"?)

In the 2nd and 3rd slots, it makes sense to demand scalars, because it may
be natural to use (say) the scalar value (i.e. count) of a grep, map, or
array to specify the offset or length. It wouldn't seem particularly
natural to use those counts as a string (the first argument), but then
again the alternative behavior isn't natural either. If the 2nd and 3rd
arguments are interpreted as scalars, the first one is going to be, too.

Aside from which, I am aware of no built-ins taking an array treated as a
list in which the list is heterogenous. Why would substr be different?

Again, in my observation and experience, perl particulars are designed
with careful consideration of all the tradeoffs, and the most optimal
(and usually intuitive) alternative is chosen (for syntax and
functionality).

And this case is no exception.

Assuming that with the above you are rebutting my comment that "the
word 'scalar' is nowhere to be found...": So the documentation does not
mention either lists or scalars, which may leave the reader guessing.

Not a reader who knows about the format used in perldoc. If it doesn't
say LIST, it is not a LIST. Can you imagine how long the docs would be if
they had to add a tag [And here we didn't say LIST because we didn't mean
LIST] after every other word?

So, the purpose of my response to yours is to prevent the quick
dismissal of the notion that the documentation /may/ need to be a bit
more explicit.

If you proposed actual verbiage to implement that greater explicitness, I'd
take your notion more seriously.

Although, admittedly, I may be wrong about that...
particularly, if this is covered somewhere else in the docs.

I don't know if there is a part of perldoc that describes the format by
which it describes the format by which it describes the formats it
describes. If there is, I'm glad I've never needed to consult it.

Xho
 
Paul Lalli

Andrew said:
Is there a reason the prototype is demanding scalars? (Or is it simply
"the king's decree"?)

Because it makes the most sense to use a scalar there.

Again, in my observation and experience, perl particulars are designed
with careful consideration of all the tradeoffs, and the most optimal
(and usually intuitive) alternative is chosen (for syntax and
functionality).

Yes. As it is here. Yours is the first post I can ever remember
seeing wishing it was the other way around.

Assuming that with the above you are rebutting my comment that "the
word 'scalar' is nowhere to be found...": So the documentation does not
mention either lists or scalars, which may leave the reader guessing.
So, the purpose of my response to yours is to prevent the quick
dismissal of the notion that the documentation /may/ need to be a bit
more explicit. Although, admittedly, I may be wrong about that...
particularly, if this is covered somewhere else in the docs.

You should probably read the first two paragraphs of
perldoc perlfunc
which is the overall enclosing entity of the `perldoc -f` syntax.

Paul Lalli
 
Anno Siegel

Andrew said:
In the code below, the array '@tmp' with two elements is treated as
its scalar (the number 2), when used as argument to substr. When I
syntactically break up '@tmp' into '$tmp[0]' and '$tmp[1]', substr
does what i want. Why is this so?
[...]

Aside from which, I am aware of no built-ins taking an array treated as a
list in which the list is heterogenous. Why would substr be different?

The kill() function is an exception. Described as "kill SIGNAL, LIST",
it accepts an array argument whose first element is the signal and
the others are pids:

perl -le '@l = ( "HUP", $$); kill @l'

kills itself.

I agree that Perl builtins normally don't work like that. The behavior
of kill() may be unintentional.

Anno
 
Andrew

Anno said:
Andrew said:
In the code below, the array '@tmp' with two elements is treated as
its scalar (the number 2), when used as argument to substr. When I
syntactically break up '@tmp' into '$tmp[0]' and '$tmp[1]', substr
does what i want. Why is this so?
[...]

Aside from which, I am aware of no built-ins taking an array treated as a
list in which the list is heterogenous. Why would substr be different?

The kill() function is an exception. Described as "kill SIGNAL, LIST",
it accepts an array argument whose first element is the signal and
the others are pids:

perl -le '@l = ( "HUP", $$); kill @l'

kills itself.

I agree that Perl builtins normally don't work like that. The behavior
of kill() may be unintentional.

OK, you've enlightened me on the conventions of perldoc -f, regarding
arguments, and I now understand the way things are.

However, it's still not clear _why_. Xho's argument about needing to
pass the count of grep, map, etc., isn't quite convincing. You can
always get a count of members of a list, like so:

scalar( @{[ ...LIST... ]} )

However, apparently, not the reverse. You can't stipulate that an
@array be forced in as a list, with some brief syntax (
force_list(@array) ).
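
One workaround, offered only as a sketch: user-defined subs without a prototype receive their arguments in list context, so a thin wrapper can accept the array directly. substr_list is a hypothetical helper name, not a built-in:

```perl
use strict;
use warnings;

# Hypothetical wrapper: no prototype, so @_ receives the flattened list.
sub substr_list {
    my ($str, $offset, $length) = @_;
    return defined $length
        ? substr($str, $offset, $length)
        : substr($str, $offset);
}

my $s   = 'abracadabra';
my @tmp = (3, 4);

print substr_list($s, @tmp), "\n";    # acad
```

The trade-off is an extra sub call per substring, and this sketch does not cover the four-argument (replacement) form.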

Thus, I'm not certain that trading off the convenience of
substr(@bunch_of_scalar_args) for counts was worth it. Perhaps there
are other reasons not mentioned in this thread(?)

andrew
 
xhoster

Tad McClellan said:
map $_, @array;

Nah, that won't work for what he wants. Now the map rather than the array
is being thrust into a scalar context, but the end result will be pretty
much the same.

The only way I know of is:
substr $array[0], $array[1], $array[2];

Which just isn't all that ugly.

Xho
 
John W. Krahn

Anno said:
The kill() function is an exception. Described as "kill SIGNAL, LIST",
it accepts an array argument whose first element is the signal and
the others are pids:

perl -le '@l = ( "HUP", $$); kill @l'

kills itself.

$ perl -le' print prototype "CORE::kill" '
@

According to kill()'s prototype it just accepts a list and interprets the
first argument as a signal. Sort of like the way split() interprets its first
argument as a regular expression.
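
A harmless way to try this, assuming a Unix-like system, is to use signal 0, which delivers nothing and only checks whether the process can be signalled. The variable $signalled is a name chosen for illustration:

```perl
use strict;
use warnings;

print prototype('CORE::kill'), "\n";    # the '@' prototype shown above

# The whole array is taken as a list; its first element is the signal.
my @l = (0, $$);
my $signalled = kill @l;

# kill returns the number of processes successfully signalled.
print "$signalled\n";
```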


John
 
John W. Krahn

Andrew said:
Anno said:
The kill() function is an exception. Described as "kill SIGNAL, LIST",
it accepts an array argument whose first element is the signal and
the others are pids:

perl -le '@l = ( "HUP", $$); kill @l'

kills itself.

I agree that Perl builtins normally don't work like that. The behavior
of kill() may be unintentional.

OK, you've enlightened me on the conventions of perldoc -f, regarding
arguments, and I now understand the way things are.

However, it's still not clear _why_. Xho's argument about needing to
pass the count of grep, map, etc., isn't quite convincing. You can
always get a count of members of a list, like so:

scalar( @{[ ...LIST... ]} )

Or instead of copying the list to an array and dereferencing it:

scalar( () = ...LIST... )
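
Both idioms side by side, as a small sketch that uses a global regex match as the LIST (the sample string and the names $n1 and $n2 are assumptions for illustration):

```perl
use strict;
use warnings;

my $str = 'abracadabra';

# Copy the list into an anonymous array, then count its elements.
my $n1 = scalar( @{[ $str =~ /a/g ]} );

# Assign the list to an empty list; a list assignment in scalar
# context returns the number of elements on its right-hand side.
my $n2 = scalar( () = $str =~ /a/g );

print "$n1 $n2\n";    # 5 5
```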


John
 
Anno Siegel

John W. Krahn said:
$ perl -le' print prototype "CORE::kill" '
@

According to kill()'s prototype it just accepts a list and interprets the
first argument as a signal. Sort of like the way split() interprets its first
argument as a regular expression.

Ah, but split() is very special, it doesn't behave like kill() at all.
Its first argument is passed on *as a regex* and not evaluated like
any normal function would do. That can't be done with a prototype,
it must be special-cased by the compiler.

On top of that, split() behaves as if the first parameter was prototyped
to a scalar. A list in that position works like the number of its
elements.

my $re = qr/ /;
my $str = 'haha ho2ho hihi';
my @l = ( $re, $str);

print "$_\n" for split $re, $str;
print "\n";

print "$_\n" for split @l;
print "\n";

print "$_\n" for split @l, $str;

ha
ho2ho
hihi

Use of uninitialized value in split at ./ttt line 12.

haha ho
ho hihi


Anno
 
John W. Krahn

Anno said:
Ah, but split() is very special, it doesn't behave like kill() at all.
Its first argument is passed on *as a regex* and not evaluated like
any normal function would do. That can't be done with a prototype,
it must be special-cased by the compiler.

On top of that, split() behaves as if the first parameter was prototyped
to a scalar. A list in that position works like the number of its
elements.

I was just pointing out that the first argument is treated differently than
the subsequent arguments in the list. :)


John
 
