split() and @_: Perl changed between 5.8 and 5.14

Kenny McCormack

I have an old program that I wrote back in 2001, which has worked fine ever
since - right up until today, when I ran it for the first time in quite a
while. The script depends on the fact that (when it was written) when you
do split(), it puts the data into @_.

From what I can tell, the following are all true. Please confirm or deny:

1) In 5.8, this worked.
2) Somewhere along the way, this usage became "deprecated". I found a web
site that explicitly said that, while the usage is deprecated, it still
works, since if it was removed, old code (heh heh - such as mine) would
get broken.
3) In 5.14, it doesn't work. No error or warning message is generated, but
@_ is left unchanged.

P.S. I changed the program line from something like:

$x = @_[split(...)-1];

to:

@tmp = split(...);
$x = @tmp[@tmp-1];

And everything seems to be working fine now.

--
One of the best lines I've heard lately:

Obama could cure cancer tomorrow, and the Republicans would be
complaining that he had ruined the pharmaceutical business.

(Heard on Stephanie Miller - but the sad thing is that there is an awful lot
of direct truth in it. We've constructed an economy in which eliminating
cancer would be a horrible disaster. There are many other such examples.)
 
Rainer Weikusat

(e-mail address removed) (Kenny McCormack) writes:

[...]
P.S. I changed the program line from something like:

$x = @_[split(...)-1];

to:

@tmp = split(...);
$x = @tmp[@tmp-1];

And everything seems to be working fine now.

Assuming that all you're interested in is the last element of the
list created by split, that your 'split regex' was / /, and that the
unsplit value is contained in a variable named $v, you could achieve
the same with either

($x) = $v =~ /(?:.* )?(.*)/

or

$v =~ /(?:.* )?(.*)/ and $x = $1

This should also work when using any other 'split regex' instead of
the single space used in this example.
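
A minimal sketch of the regex approach above, wrapped in a sub so it can be tried out. The name last_by_regex is hypothetical and the separator is assumed to be a single space, as in the example:

```perl
use strict;
use warnings;

# last_by_regex() is a hypothetical helper name, not part of the original script.
sub last_by_regex {
    my ($v) = @_;
    # Optionally consume everything up to and including the last space,
    # then capture whatever follows it.
    my ($x) = $v =~ /(?:.* )?(.*)/;
    return $x;
}

print last_by_regex('alpha beta gamma'), "\n";   # prints "gamma"
print last_by_regex('single'), "\n";             # prints "single"
```

Because the leading group is optional, a string without any separator is returned whole.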
 
Kenny McCormack

Quoth (e-mail address removed) (Kenny McCormack):

Yes, with a warning.

With the caveat that I probably don't have the warning level set high enough
(that's comp.lang.c-speak for it. In the perl world, I probably need some
extra library or something linked in), the fact is that I never got any
warnings on this.

One of the systems that runs this script is running Perl 5.8.6; I just
tested the script and it ran correctly, with no warnings generated.

This usage has been deprecated, and printing a warning, since 5.6. What
version of perl were you using in 2001?

As mentioned above, it runs fine with no warnings under 5.8 (today).
What version of Perl was current on Linux (x86), c. 2001?

P.S. I changed the program line from something like:

$x = @_[split(...)-1];

Oh, *yuck*! You do realise this is implicitly populating @_ at the same
time as you're calculating its subscript?

Yup. That's why they call Perl a "write-only" language.

Anyway, thanks much for your response. It has answered my questions and
clarified my knowledge of the situation.
 
Rainer Weikusat

[...]
The general policy now is that you get one major version's worth of
deprecation warnings, and then deprecated features will be removed.

I'm presently using perl 5.10.0 and 5.10.1 because these happen to be
the packaged perl versions available on the Debian (Lenny and Squeeze)
and Ubuntu (10.04) I need to support. Judging from the current state
of the Debian unstable repository, the next perl version I will likely
need to support will be 5.14, which means that anything which got
deprecated in perl 5.12 will have been removed by then, and insofar as
the existing code I need to use happens to rely on such a deprecated
feature, it will just break silently. Presently, this is more than
30,700 LOC and I inherited some parts of it. I'm a single person and I
would be glad if I could get new code written as fast as my boss
would like to have it (who is already not overly happy with me using
Perl at all because it can't be compiled into some 'obscure' binary
format). All of this together is a very good argument for not using
Perl at all: At best, it is now one of many languages which mutate
quickly in unpredictable ways at the momentary whims of some
fluctuating set of people and anybody who isn't prepared to maintain a
private perl5 fork with the required set of features should rather
avoid it.
 
Rainer Weikusat

(e-mail address removed) (Kenny McCormack) writes:

[...]
$x = @_[split(...)-1];

Oh, *yuck*! You do realise this is implicitly populating @_ at the same
time as you're calculating its subscript?

Yup. That's why they call Perl a "write-only" language.

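An old joke-example of a German sentence goes like this: Derjenige, der
den Taeter, der den Pfahl, der an der Bruecke, die auf dem Weg nach
Worms liegt, steht, umgeworfen hat, anzeigt, erhaelt eine Belohnung.
(Roughly: whoever reports the culprit who knocked over the post that
stands at the bridge that lies on the road to Worms will receive a
reward.)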

The language doesn't use itself and if some text should be regarded as
'write-only' or not is entirely a matter of the style used to write
it: You may be 'a write-only coder' but Perl (minus you) is not 'a
write-only language'.
 
Peter J. Holzer

Quoth (e-mail address removed) (Kenny McCormack):
I have an old program that I wrote back in 2001, which has worked fine ever
since - right up until today, when I ran it for the first time in quite a
while. The script depends on the fact that (when it was written) when you
do split(), it puts the data into @_. [...]
2) Somewhere along the way, this usage became "deprecated". I found a web
site that explicitly said that, while the usage is deprecated, it still
works, since if it was removed, old code (heh heh - such as mine) would
get broken.

This usage has been deprecated, and printing a warning, since 5.6.

Even before that:

% /usr/bin/perl foo
Use of implicit split to @_ is deprecated at foo line 5.
....
% /usr/bin/perl -v

This is perl, version 5.005_03 built for i386-linux

Copyright 1987-1999, Larry Wall
....

I wouldn't be surprised if it was deprecated since 5.0.

hp
 
Rainer Weikusat

Ben Morrow said:
So, some time before that happens, build a 5.12 (or a 5.14) and test the
code on that. I'm sure that's not beyond you.

,----
| Presently, this is more than 30,700 LOC and I inherited some parts of
| it. I'm a single person and I would be glad if I could get new code
| written as fast as my boss would like to have it (who is already not
| overly happy with me using Perl at all because it can't be compiled
| into some 'obscure' binary format).
`----

It is certainly not 'beyond me' in a technical sense to spend 1 - 3
weeks with nothing but retesting working code and maybe another with
changing it such that it is again 'politically correct' for the
current definition of that. I could perhaps do this while on holiday,
but - unfortunately - I didn't have a single holiday since 2008, so
that's not really an option. It is certainly beyond the patience of my
employer to do this at any other time, though ...

For as long as nothing gets deprecated which I really do need (leaving
the issue with the inherited code, ie, code written by 'differently
abled' ex-colleagues and some heavily modified CPAN modules, aside),
everything is going to be fine and as soon as this does happen, I will
necessarily stop using newer Perl releases.

NB: This is not so much a statement about me but about the inherent
problems with any policy of this kind. As soon as people use code they
are not familiar with in detail (eg, because they happily download
everything CPAN has to offer :->), they're essentially fucked if
changes to the runtime environment used to run this code render it
incompatible with the version they're using and they will then either
dump this runtime environment altogether or 'opt out' of its future
development.
 
Rainer Weikusat

Ben Morrow said:
You like that phrase rather too much, but it's not relevant here.

There was a time when someone considered split splitting into @_ in
scalar context a good idea. Presumably, at that time, someone whose
opinion differed from that was considered to be wrong. Most people
presumably didn't care: The feature existed. They either used it or
didn't use it. Then, times changed and the predominant opinion
changed, too, first to "this wasn't really a good idea" and then to
"it was such a bad idea that it needs to go away". This kind of
'opinion rotation' (whatever the issue at hand happens to be, there
will always be people whose opinions on it differ and they're always
all convinced they have good reasons) is - completely correctly -
referred to as 'politics'.

[...]
There is a large section of the p5p community which feel much as you do,
for much the same reasons, without necessarily being quite so dogmatic
about it.

I didn't write anything about my 'feelings', I described a technical
problem I could be facing and the options I have for dealing with it.

[...]
Well, on your head be it. There have been one or two security holes
found in perl itself in the past; if you stop upgrading you won't get
fixes if any more are found.

Can you imagine that people exist who couldn't care less about that
and that these people - more often than not - are in a position to
give orders to others? Again, I didn't write anything about my opinion
on this because it is completely irrelevant: In face of the options

a) spend a significant amount of time testing or changing
working production code because someone made an incompatible
change to a new version of some infrastructure code

b) continue using the last known-to-be-working version

the only technically possible choice is b).
 
Rainer Weikusat

[...]
Well, on your head be it. There have been one or two security holes
found in perl itself in the past; if you stop upgrading you won't get
fixes if any more are found.

JFTR: I'm completely fine with fixing any issue in Perl which affects
me (this includes 'security problems') myself if there is no other
option. I'm already doing this with quite a few other 'large gobs of
C' (eg, Linux) where updating isn't really feasible because fixing
the occasional problem can be done quickly while major 'redevelopment
efforts' (aka 'retest and change or rewrite existing code') are usually
not an option.
 
Peter Makholm

Ben Morrow said:
There is a large section of the p5p community which feel much as you do,
for much the same reasons, without necessarily being quite so dogmatic
about it. Jesse Vincent, the current holder of the pumpkin, is one of
them, so there's no need to panic about things disappearing without a
rather good reason.

I believe that Ricardo Signes is the current holder. He took over from
Jesse somewhere in the 5.15 series. This doesn't change the above, as I
understand it Ricardo also supports the policies and goals championed by
Jesse.

//Makholm
 
Rainer Weikusat

[...]
There was a time when someone considered split splitting into @_ in
scalar context a good idea.

As a historical remark targeted at people who are possibly not aware
of that (and in defense of the original choice :):

The behaviour of shift and split regarding @_ is consistent with the
way a Bourne-style UNIX(*) shell treats the so-called 'positional
parameters' which are usually bound to the 'arguments' passed to the
current 'execution context', eg a shell function

show()
{
    echo $1
}

will invoke the echo command with its first argument. In Perl, this
would look like this:

sub show
{
    print $_[0], "\n";
}

Modifying this as follows:

show()
{
    shift
    echo $1
}

or

sub show
{
    shift;
    print $_[0], "\n";
}

would shift the (virtual) argument array one position to the left, ie
throw the first argument away and print the second. The shell
also supports a split operation which splits a string into 'words'
separated by the current value of the IFS (internal field separator)
variable. These words are then assigned to the positional parameters,
eg

show()
{
    IFS_="$IFS"
    IFS=' '
    set -- $1
    IFS="$IFS_"

    echo $2
}

will split the first argument into 'words' using a single space as
separator and print the second of these. In Perl with the original
split, this would look like this:

sub show
{
    +split(/ /, $_[0]);
    print $_[1], "\n";
}

I assume the reason the Bourne-shell behaves in this way is probably an
efficiency hack from the late 1970s and IMHO, imitating this behaviour
in Perl wasn't a good idea: Perl has arrays, the Bourne-shell language
doesn't and having to force split into a scalar context 'somehow'
doesn't really improve this.
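
For comparison, here is a sketch of the same 'print the second word' idea written so that it does not depend on split filling @_. The sub name second_word and the lexical @words are made up for illustration:

```perl
use strict;
use warnings;

# second_word() is a hypothetical stand-in for the last show() example above.
sub second_word {
    my ($arg) = @_;
    # List context: the fields land in @words and @_ is left alone.
    my @words = split / /, $arg;
    return $words[1];
}

print second_word('one two three'), "\n";   # prints "two"
```

Since the result goes into a named array, nothing here relies on what split does in scalar or void context.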
 
C.DeRykus

Kenny McCormack said:
I have an old program that I wrote back in 2001, which has worked fine ever
since - right up until today, when I ran it for the first time in quite a
while. The script depends on the fact that (when it was written) when you
do split(), it puts the data into @_.

From what I can tell, the following are all true. Please confirm or deny:

1) In 5.8, this worked.
2) Somewhere along the way, this usage became "deprecated". I found a web
site that explicitly said that, while the usage is deprecated, it still
works, since if it was removed, old code (heh heh - such as mine) would
get broken.
3) In 5.14, it doesn't work. No error or warning message is generated, but
@_ is left unchanged.

P.S. I changed the program line from something like:

$x = @_[split(...)-1];

to:

@tmp = split(...);
$x = @tmp[@tmp-1];

Shorter:

$x = $tmp[-1];

Or, sidestep warnings, if it's gotta be a one-liner:

$x = @{[split(...)]}[-1];
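
Another one-liner that might be worth considering is a plain list slice, which avoids building the anonymous array. A small sketch, where $line and the single-space separator are assumed sample values:

```perl
use strict;
use warnings;

my $line = 'red green blue';          # sample input, assumed space-separated
my $last = (split / /, $line)[-1];    # list slice: index -1 picks the last field
print "$last\n";                      # prints "blue"
```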
 
Tim McDaniel

P.S. I changed the program line from something like:

$x = @_[split(...)-1];

to:

@tmp = split(...);
$x = @tmp[@tmp-1];

Oddly enough, in perl 5.8.8, there's no warning for that assignment to
$x, but
$x = @tmp[2];
(when @tmp has at least three elements, not undef, &c &c) produces
Scalar value @tmp[2] better written as $tmp[2] at local/test/080.pl line 16.

@tmp[@tmp-1] is an array slice. It returns a list of values, one for
each subscript. Since there is only one value of subscript, it
returns a list of one value.

$x = SOME_LIST assigns to $x the last value of SOME_LIST, just as with
$x = (this, that, the_other, irrelevant, ignored, the_value_assigned);
With a list of one element, it will assign that value to $x.

In sum, that works, but it's generally considered better style to write
$x = $tmp[@tmp-1];
assigning a scalar to a scalar. And, as I indicated, I'm surprised
that there was no warning for the original version.
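
A short sketch of the point above, using a made-up three-element @tmp: a slice evaluated in scalar context hands back its last value, while the scalar forms address one element directly.

```perl
use strict;
use warnings;

my @tmp = (10, 20, 30);                # sample data for illustration

my $from_slice = @tmp[0, 1];           # slice in scalar context: last value of the slice
my $by_count   = $tmp[@tmp - 1];       # @tmp counts as 3 here, so this is $tmp[2]
my $by_neg     = $tmp[-1];             # negative index counts from the end
my $by_last    = $tmp[$#tmp];          # $#tmp is the index of the last element

print "$from_slice $by_count $by_neg $by_last\n";   # prints "20 30 30 30"
```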
 
C.DeRykus

Kenny McCormack said:
P.S. I changed the program line from something like:

$x = @_[split(...)-1];

to:

@tmp = split(...);
$x = @tmp[@tmp-1];



Oddly enough, in perl 5.8.8, there's no warning for that assignment to
$x, but
$x = @tmp[2];
(when @tmp has at least three elements, not undef, &c &c) produces
Scalar value @tmp[2] better written as $tmp[2] at local/test/080.pl line 16.

A bit far-fetched but here's an example of how
things could go wrong due to context:

perl -E '@tmp = 0..3; sub foo{@tmp};
$foo[0]= $tmp[&foo]; say "\@foo=@foo";
@foo[0]= @tmp[&foo]; say "\@foo=@foo"'

$foo[0] = undef   # &foo is called in scalar context and returns 4, so $tmp[4] is undef
@foo[0] = 0       # &foo is called in list context and returns 0..3; only the first slice value is assigned

@tmp[@tmp-1] is an array slice. It returns a list of values, one for
each subscript. Since there is only one value of subscript, it
returns a list of one value.

$x = SOME_LIST assigns to $x the last value of SOME_LIST, just as with
$x = (this, that, the_other, irrelevant, ignored, the_value_assigned);
With a list of one element, it will assign that value to $x.

In sum, that works, but it's generally considered better style to write
$x = $tmp[@tmp-1];
assigning a scalar to a scalar.


True and, since the OP just wanted the last array
member, the clearest idiom is just: $x = $tmp[-1]
 
Rainer Weikusat

[...]
In sum, that works, but it's generally considered better style to write

$x = $tmp[@tmp-1];

assigning a scalar to a scalar.


True and, since the OP just wanted the last array
member, the clearest idiom is just: $x = $tmp[-1]

Well, except that splitting the string into n components in order to
extract the last part, which is separated by 'some regex' from any
preceding parts (if any), isn't a particularly good idea.

---------
use Benchmark;

my $a = 'a 'x16;
$a .= 'b';

# A negative count asks timethese to run each sub for at least that
# many CPU seconds.
timethese(-5,
{
    split => sub {
        my @tmp;

        @tmp = split(/ /, $a);
        return $tmp[-1];
    },

    re => sub {
        $a =~ /^(?:.* )?(.*)/ and return $1;
    }});
 
Tim McDaniel

[...]
In sum, that works, but it's generally considered better style to write

$x = $tmp[@tmp-1];

assigning a scalar to a scalar.


True and, since the OP just wanted the last array
member, the clearest idiom is just: $x = $tmp[-1]

Well, except that splitting the string into n components in order to
extract the last part, which is separated by 'some regex' from any
preceding parts (if any), isn't a particularly good idea.

---------
use Benchmark;

my $a = 'a 'x16;
$a .= 'b';

timethese(-5,
{
    split => sub {
        my @tmp;

        @tmp = split(/ /, $a);
        return $tmp[-1];
    },

    re => sub {
        $a =~ /^(?:.* )?(.*)/ and return $1;
    }});

You presuppose for this that execution time is the main or only
foundation of goodness. I can tell what the split&[-1] above does
instantly; I'd have to look up ?: and think a moment about what the
regex does; they don't do the same things for trailing whitespace.
If it's not on a major execution path, I'd much prefer split&[-1] for
my own readability.
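
To make the trailing-whitespace difference concrete, here is a small sketch with a made-up input that ends in a space. The variable names are only for illustration:

```perl
use strict;
use warnings;

my $v = 'a b ';                             # sample input with a trailing space

# split drops trailing empty fields when no limit is given,
# so the last field it returns is 'b'.
my $via_split = (split / /, $v)[-1];

# The regex consumes up to the last space and captures what follows,
# which here is the empty string.
my ($via_regex) = $v =~ /^(?:.* )?(.*)/;

print "split: '$via_split'  regex: '$via_regex'\n";   # prints "split: 'b'  regex: ''"
```

Which result is the 'right' one depends on what the input is expected to look like.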
 
Rainer Weikusat

(e-mail address removed) (Tim McDaniel) writes:

[...]
splitting the string into n components in order to
extract the last part, which is separated by 'some regex' from any
preceding parts (if any), isn't a particularly good idea.
[...]
re => sub {
$a =~ /^(?:.* )?(.*)/ and return $1;
}});

You presuppose for this that execution time is the main or only
foundation of goodness. I can tell what the split&[-1] above does
instantly; I'd have to look up ?: and think a moment about what the
regex does;

You presuppose that 'ignorance is bliss' is a universal truth
:). Assuming I didn't know about (?:) and I came across it
somewhere in existing code, I would look it up, and insofar as I'd
judge the use made of it to be a 'reasonably straightforward
exploitation of an existing facility' (admittedly, subjective), I'd be
happy that I learnt something new which will likely be of use to me in
future.

they don't do the same things for trailing whitespace.

It also doesn't boil eggs or send greeting cards in time for
relatives' birthdays or do any number of other things, even a lot of
computer-related ones: A formal definition of correctness I remember
was 'assuming the preconditions were true initially and the invariant
conditions stayed true during execution, the postconditions will be
true afterwards'. It follows that any correct piece of code can be
rendered incorrect by modifying any of these three sets in a suitable
way.

If it's not on a major execution path, I'd much prefer split&[-1] for
my own readability.

Oh, it will be on a major execution path eventually, maybe tomorrow or
next year or after some kind of 'junior programmer' copied&pasted the
'known to be working' code into a hard realtime system or just because
your assumption was wrong.

It is generally better to avoid problems which can be easily avoided
instead of waiting until the pile of 1st level support guys who died
of nervous exhaustion while hectically trying to convince customers
that what hit them isn't really a problem becomes so high that getting
through the office door becomes difficult (The obvious alternative is
to always change jobs quickly enough that the obnoxious burden of
'making the code actually work well enough to solve the problem' falls
onto someone else. NB: I've encountered quite a few people who acted
in this way).

NB^2: This text is supposed to be somewhat tongue-in-cheek and not
intended to be offensive or abusive.
 
