undef($foo) versus $foo = undef()?

Tim McDaniel

Is there any reason to prefer one of

$foo = undef;

versus

undef $foo;

? (If you like, include the parens that show that undef is a
function.)

I prefer pure functions to those that alter their arguments,
ceteris paribus (no pun intended), so I think I prefer

$foo = undef;
 
Uri Guttman

TM> Is there any reason to prefer one of
TM> $foo = undef;

TM> versus

TM> undef $foo;

i prefer NEITHER. explicitly undefing a scalar usually signals a poor
design. usually you let variables exit scope and upon reentry they get
cleared by the my declaration. so when i see code like the above i look
around and try to find a way to not need to do that.

TM> I prefer functions that are pure to those that alter its arguments,
TM> ceteris paribus (no pun intended), so I think I prefer

TM> $foo = undef;

if you must choose one, that is better for a subtle reason. undef $foo
leads newbies to also say undef @bar or undef %bar. those seem to work
fine as the aggregates are cleared. but then that leads to this bad code
to see if there are any value or keys in them: defined( @bar ) or
defined( %bar ). that actually doesn't test if there is any actual data
in them but whether any memory has ever been allocated for them. you
normally just need to do a plain boolean test to see if they are empty
or not. the proper way to clear an aggregate (other than leaving scope
and having them clear by their my declaration) is to assign the empty
list () to them.
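
A minimal sketch of the advice above, assuming two throwaway variables (@bar and %baz, named only for illustration): clear the aggregates by assigning the empty list, then check for emptiness with a plain boolean test instead of defined().

```perl
use strict;
use warnings;

my @bar = (1, 2, 3);
my %baz = (a => 1, b => 2);

# Clear both aggregates by assigning the empty list.
@bar = ();
%baz = ();

# A plain boolean test reports whether any elements remain.
my $bar_state = @bar ? 'has elements' : 'empty';
my $baz_state = %baz ? 'has members'  : 'empty';

print "\@bar is $bar_state\n";
print "\%baz is $baz_state\n";
```

Both tests see an empty aggregate here, which is the question code usually wants answered.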

so it is a long way around but that is why i never do undef $foo and
almost never do $foo = undef.

in over 10k lines of stem code, i found 9 uses of undef and 4 of them
were this line in one module that needed to ignore the keys of a hash:
while( my( undef, $event ) = each %{$timer_events} ) {

4 of the uses were assigning undef in cases where you had to and the
scalar wasn't going out of scope (it was to be used in nearby code).

so i practice what i preach and don't use undef unless i have to and
when i do, i assign it vs call it for its side effect.

uri
 
Ilya Zakharevich

Is there any reason to prefer one of

$foo = undef;

versus

undef $foo;

The second one free()s the buffers associated with $foo. The second
one is 2 opcodes, the first one is 3.

Enjoy,
Ilya
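
One way to look at the buffer claim, as a rough sketch and not a definitive test: the core B module can report the allocated length (LEN) of a scalar's string buffer. The variable names are made up for illustration, and the numbers (as well as the opcode counts quoted above) may differ between perl versions.

```perl
use strict;
use warnings;
use B ();

# Build both strings at run time so each scalar owns its own buffer.
my ($assigned, $called);
$assigned .= 'x' for 1 .. 1000;
$called   .= 'x' for 1 .. 1000;

$assigned = undef;    # assign the undefined value
undef $called;        # call undef on the variable

# LEN is the size of the string buffer still attached to the scalar.
my $len_assigned = B::svref_2object(\$assigned)->LEN;
my $len_called   = B::svref_2object(\$called)->LEN;

print "after \$foo = undef : LEN = $len_assigned\n";
print "after undef \$foo   : LEN = $len_called\n";
```

Running the two statements through perl -MO=Concise is another way to compare the ops each one compiles to on a given perl.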
 
Tim McDaniel

TM> Is there any reason to prefer one of
TM> $foo = undef;

TM> versus

TM> undef $foo;

UG> i prefer NEITHER. explicitly undefing a scalar usually signals a poor
UG> design. usually you let variables exit scope and upon reentry they
UG> get cleared by the my declaration.

Yes. I do try to limit the scope of variables by using blocks, often
not connected to if or for, just

{
my %unique_xes;
...
}

or whatever.

But there have been a few cases where variable scopes don't nest
nicely, so I explicitly kill the variable as soon as I can.

UG> undef $foo leads newbies to also say undef @bar or undef %bar. those
UG> seem to work fine as the aggregates are cleared.

Well, they certainly work, unlike
@bar = undef;
%baz = undef;
!
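
For readers wondering why those two assignments do not clear anything, here is a small sketch (variable contents are invented for the example). Assigning undef to an aggregate is a list assignment of a single undefined value, so the array ends up with one element and the hash with one key.

```perl
use strict;
use warnings;

my @bar = (1, 2, 3);
my %baz = (a => 1);

{
    no warnings;     # the hash assignment below warns under "use warnings"
    @bar = undef;    # list assignment of a single undef value
    %baz = undef;    # one key, the empty string, mapped to undef
}

my $bar_count = scalar @bar;
my $baz_count = scalar keys %baz;

print "\@bar now has $bar_count element(s)\n";
print "\%baz now has $baz_count key(s)\n";
```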

UG> but then that leads to this bad code to see if there are any value or
UG> keys in them: defined( @bar ) or defined( %bar ).

"Doctor, doctor, it hurts when I do *this*!"
"Well, don't DO '*this*'!"

UG> the proper way to clear an aggregate (other than leaving scope and
UG> having them clear by their my declaration) is to assign the empty
UG> list () to them.

Is there any functional difference between
undef @bar;
or
@bar = ();
(and similarly for %baz)?

So symmetry would prefer
undef $foo;
undef @bar;
undef %baz;
but value-ness would prefer
$foo = undef;
@bar = ();
%baz = ();

UG> in over 10k lines of stem code, i found 9 uses of undef and 4 of them
UG> were this line in one module that needed to ignore the keys of a
UG> hash:
UG> while( my( undef, $event ) = each %{$timer_events} ) {

Some wording in the man pages makes me wonder whether it would be just
as efficient to do

foreach my $event ( values %{$timer_events} ) {
 
Uri Guttman

TM> Well, they certainly work, unlike
TM> @bar = undef;
TM> %baz = undef;
TM> !

TM> Is there any functional difference between
TM> undef @bar;
TM> or
TM> @bar = ();
TM> (and similarly for %baz)?

yes. do one and the other and then test with defined. one is true and
the other is false. the fact that you can even call defined on an
aggregate is really a bug. p6 is eliminating that iirc.

TM> So symmetry would prefer
TM> undef $foo;
TM> undef @bar;
TM> undef %baz;
TM> but value-ness would prefer
TM> $foo = undef;
TM> @bar = ();
TM> %baz = ();

TM> Some wording in the man pages makes me wonder whether it would be just
TM> as efficient to do

TM> foreach my $event ( values %{$timer_events} ) {

but that creates a list of the values first. my code iterates over the
values with less storage needed and more speed.

uri
 
Tim McDaniel

TM> Is there any functional difference between
TM> undef @bar;
TM> or
TM> @bar = ();
TM> (and similarly for %baz)?

UG> yes. do one and the other and then test with defined. one is true and
UG> the other is false. the fact that you can even call defined on an
UG> aggregate is really a bug. p6 is eliminating that iirc.

"man perlfunc" says

Use of "defined" on aggregates (hashes and arrays) is deprecated.
It used to report whether memory for that aggregate has ever been
allocated. This behavior may disappear in future versions of
Perl. You should instead use a simple test for size:

if (@an_array) { print "has array elements\n" }
if (%a_hash) { print "has hash members\n" }

TM> Some wording in the man pages makes me wonder whether it would
TM> be just as efficient to do

TM> foreach my $event ( values %{$timer_events} ) {

UG> but that creates a list of the values first. my code iterates over
UG> the values with less storage needed and more speed.

What made me wonder was

The values are returned in an apparently random order. The actual
random order is subject to change in future versions of perl, but
it is guaranteed to be the same order as either the "keys" or
"each" function would produce on the same (unmodified) hash. ...

As a side effect, calling values() resets the HASH's internal
iterator, see "each". (In particular, calling values() in void
context resets the iterator with no other overhead.)

Note that the values are not copied, which means modifying them
will modify the contents of the hash:

for (values %hash) { s/foo/bar/g } # modifies %hash values
for (@hash{keys %hash}) { s/foo/bar/g } # same

particularly that last. It could be generating a list for
values(%hash) and doing further magic to point to the real values
(multiplying the storage needed), or maybe there's an optimization in
such a simple case to make values() work as efficiently as each()?
I've seen people post timing test results here, but I've not looked
into the modules that conveniently implement them.
 
Peter J. Holzer

TM> "man perlfunc" says

TM> Use of "defined" on aggregates (hashes and arrays) is deprecated.
TM> It used to report whether memory for that aggregate has ever been
TM> allocated. This behavior may disappear in future versions of
TM> Perl. You should instead use a simple test for size:

TM> if (@an_array) { print "has array elements\n" }
TM> if (%a_hash) { print "has hash members\n" }

Perhaps surprisingly, on at least some versions of perl,

if (keys %a_hash)

was substantially faster than

if (%a_hash)

If your script spends a lot of time checking whether a hash is empty,
you might want to benchmark both versions.
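
A rough sketch of such a benchmark, using the core Benchmark module. The hash contents and the iteration count are arbitrary choices for the example, and the relative speeds depend on the perl version, so no result is claimed here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Arbitrary test data: a hash with 1000 members.
my %a_hash = map { $_ => 1 } 1 .. 1000;

# Time both emptiness tests and print a comparison chart.
my $rows = cmpthese( 1_000_000, {
    boolean => sub { my $n = 0; $n++ if %a_hash;      $n },
    keys    => sub { my $n = 0; $n++ if keys %a_hash; $n },
} );
```

Running it on the perl that will run the real script matters more than any single set of numbers.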

UG> but that creates a list of the values first. my code iterates over
UG> the values with less storage needed and more speed.

TM> What made me wonder was [...]
TM> Note that the values are not copied, which means modifying them
TM> will modify the contents of the hash:

TM> for (values %hash) { s/foo/bar/g } # modifies %hash values
TM> for (@hash{keys %hash}) { s/foo/bar/g } # same

TM> particularly that last. It could be generating a list for
TM> values(%hash) and doing further magic to point to the real values
TM> (multiplying the storage needed),

That's what it does (at least until 5.8.8, maybe it changed in 5.10).

TM> or maybe there's an optimization in
TM> such a simple case to make values() work as efficiently as each()?

Nope. Which makes each a lot more efficient for large hashes.

TM> I've seen people post timing test results here, but I've not looked
TM> into the modules that conveniently implement them.

I don't have any numbers at hand, but I've had a few scripts where
replacing values with each yielded rather impressive speedups (not to
mention that in some cases it was needed to fit into the 3GB limit of
32-bit Linux).

hp
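
To make the two iteration styles discussed in this thread concrete, here is a small sketch. The hash reference $timer_events borrows its name from the line quoted earlier, but its contents are invented for the example. The each loop fetches one pair per call and discards the key with an undef placeholder; the values loop aliases the real hash values, as the perlfunc excerpt describes.

```perl
use strict;
use warnings;

# Hypothetical data standing in for the real timer events.
my $timer_events = { a => 'tick', b => 'tock', c => 'ring' };

# each() returns one key/value pair per call; the undef placeholder
# in the my() list throws the key away.
my @seen;
while ( my ( undef, $event ) = each %{$timer_events} ) {
    push @seen, $event;
}

# values() in a foreach aliases the hash values, so assigning to the
# loop variable changes the hash itself.
for my $event ( values %{$timer_events} ) {
    $event = uc $event;
}

print join( ' ', sort @seen ), "\n";
print join( ' ', map { $timer_events->{$_} } sort keys %{$timer_events} ), "\n";
```

The sketch shows only the behavior, not the memory or speed difference; that still needs measuring on a large hash.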
 
