undef($foo) versus $foo = undef()?

Tim McDaniel · Aug 19, 2009

Is there any reason to prefer one of

$foo = undef;

versus

undef $foo;

? (If you like, include the parens that show that undef is a
function.)

I prefer functions that are pure to those that alter their arguments,
ceteris paribus (no pun intended), so I think I prefer

$foo = undef;

Uri Guttman · Aug 19, 2009

TM> Is there any reason to prefer one of
TM> $foo = undef;

TM> versus

TM> undef $foo;

i prefer NEITHER. explicitly undefing a scalar usually signals a poor
design. usually you let variables exit scope and upon reentry they get
cleared by the my declaration. so when i see code like the above i look
around and try to find a way to not need to do that.

TM> I prefer functions that are pure to those that alter their arguments,
TM> ceteris paribus (no pun intended), so I think I prefer

TM> $foo = undef;

if you must choose one, that is better for a subtle reason. undef $foo
leads newbies to also say undef @bar or undef %bar. those seem to work
fine as the aggregates are cleared. but then that leads to this bad code
to see if there are any values or keys in them: defined( @bar ) or
defined( %bar ). that actually doesn't test if there is any actual data
in them but whether any memory has ever been allocated for them. you
normally just need to do a plain boolean test to see if they are empty
or not. the proper way to clear an aggregate (other than leaving scope
and having them clear by their my declaration) is to assign the empty
list () to them.
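
A minimal sketch of the advice above, with illustrative variable names that are not from the original post: clear aggregates by assigning the empty list and test them in plain boolean context. As far as I know, later perl releases reject defined( @bar ) outright, so it is not shown here.

```perl
use strict;
use warnings;

my @bar = (1, 2, 3);
my %baz = (a => 1, b => 2);

# Clear each aggregate by assigning the empty list.
@bar = ();
%baz = ();

# An array in boolean context is true only if it has elements;
# a hash is true only if it has keys.
my $bar_empty = @bar ? 0 : 1;
my $baz_empty = %baz ? 0 : 1;

print "bar is empty\n" if $bar_empty;
print "baz is empty\n" if $baz_empty;
```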

so it is a long way around but that is why i never do undef $foo and
almost never do $foo = undef.

in over 10k lines of stem code, i found 9 uses of undef and 4 of them
were this line in one module that needed to ignore the keys of a hash:
while( my( undef, $event ) = each %{$timer_events} ) {

4 of the uses were assigning undef in cases where you had to and the
scalar wasn't going out of scope (it was to be used in nearby code).

so i practice what i preach and don't use undef unless i have to and
when i do, i assign it vs call it for its side effect.

uri

Ilya Zakharevich · Aug 19, 2009

TM> Is there any reason to prefer one of

TM> $foo = undef;

TM> versus

TM> undef $foo;

The second one free()s the buffers associated with $foo. The second
one is 2 opcodes, the first one is 3.
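
One hedged way to observe the buffer claim is to read the scalar's allocated string length through the core B module. This is a sketch, not a statement about every perl build: buffer_len is a hypothetical helper, the strings are built with .= so that each scalar owns its buffer, and the exact sizes depend on the perl version.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: allocated string buffer size (SvLEN) of a scalar.
sub buffer_len {
    my ($ref) = @_;
    my $sv = B::svref_2object($ref);
    return $sv->can('LEN') ? $sv->LEN : 0;
}

my $kept = '';
$kept .= 'x' x 10_000;
$kept = undef;                  # value is gone; the buffer is expected to stay
my $kept_len = buffer_len( \$kept );

my $freed = '';
$freed .= 'x' x 10_000;
undef $freed;                   # the buffer is expected to be released
my $freed_len = buffer_len( \$freed );

printf "after \$foo = undef: %d bytes; after undef \$foo: %d bytes\n",
    $kept_len, $freed_len;
```

Both scalars are undefined afterwards; the only difference this sketch looks at is whether the memory stays attached to the variable for reuse.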

Enjoy,
Ilya

Tim McDaniel · Aug 19, 2009

TM> Is there any reason to prefer one of
TM> $foo = undef;

TM> versus

TM> undef $foo;

UG> i prefer NEITHER. explicitly undefing a scalar usually signals a poor
UG> design. usually you let variables exit scope and upon reentry they
UG> get cleared by the my declaration.

Yes. I do try to limit the scope of variables by using blocks, often
not connected to if or for, just

{
    my %unique_xes;
    ...
}

or whatever.

But there have been a few cases where variable scopes don't nest
nicely, so I explicitly kill the variable as soon as I can.

UG> undef $foo leads newbies to also say undef @bar or undef %bar. those
UG> seem to work fine as the aggregates are cleared.

Well, they certainly work, unlike
@bar = undef;
%baz = undef;
!
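
A small sketch of why those two assignments do not clear anything (variable names are illustrative): the right-hand side is a one-element list containing undef, so the array ends up with one element and the hash with one key, the empty string.

```perl
use strict;
use warnings;

my @bar = (1, 2, 3);
@bar = undef;                    # assigns the one-element list (undef)
my $bar_count = scalar @bar;     # 1, not 0

my %baz = (a => 1);
my $baz_count = do {
    no warnings;                 # hide the odd-number and uninitialized warnings
    %baz = undef;                # one key (the empty string) with an undef value
    scalar keys %baz;
};

print "bar has $bar_count element(s), baz has $baz_count key(s)\n";
```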

UG> but then that leads to this bad code to see if there are any values or
UG> keys in them: defined( @bar ) or defined( %bar ).

"Doctor, doctor, it hurts when I do *this*!"
"Well, don't DO '*this*'!"

UG> the proper way to clear an aggregate (other than leaving scope and
UG> having them clear by their my declaration) is to assign the empty
UG> list () to them.

Is there any functional difference between
undef @bar;
or
@bar = ();
(and similarly for %baz)?

So symmetry would prefer
undef $foo;
undef @bar;
undef %baz;
but value-ness would prefer
$foo = undef;
@bar = ();
%baz = ();

UG> in over 10k lines of stem code, i found 9 uses of undef and 4 of them
UG> were this line in one module that needed to ignore the keys of a
UG> hash:
UG> while( my( undef, $event ) = each %{$timer_events} ) {

Some wording in the man pages makes me wonder whether it would be just
as efficient to do

foreach my $event ( values %{$timer_events} ) {

Uri Guttman · Aug 19, 2009

TM> Well, they certainly work, unlike
TM> @bar = undef;
TM> %baz = undef;
TM> !

TM> Is there any functional difference between
TM> undef @bar;
TM> or
TM> @bar = ();
TM> (and similarly for %baz)?

yes. do one and the other and then test with defined. one is true and
the other is false. the fact that you can even call defined on an
aggregate is really a bug. p6 is eliminating that iirc.

TM> So symmetry would prefer
TM> undef $foo;
TM> undef @bar;
TM> undef %baz;
TM> but value-ness would prefer
TM> $foo = undef;
TM> @bar = ();
TM> %baz = ();

TM> Some wording in the man pages makes me wonder whether it would be just
TM> as efficient to do

TM> foreach my $event ( values %{$timer_events} ) {

but that creates a list of the values first. my code iterates over the
values with less storage needed and more speed.
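
For comparison, a sketch of the two loops side by side. The contents of $timer_events here are a made-up stand-in, since the real structure is not shown in the thread. Both loops visit the same values; the difference under discussion is only how much temporary storage each one needs.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the $timer_events hash reference.
my $timer_events = { t1 => 'start', t2 => 'tick', t3 => 'stop' };

# each() hands back one key/value pair per call; the key is discarded.
my @via_each;
while ( my ( undef, $event ) = each %{$timer_events} ) {
    push @via_each, $event;
}

# values() supplies the whole list of values to the loop.
my @via_values;
foreach my $event ( values %{$timer_events} ) {
    push @via_values, $event;
}

# Hash order is unspecified, so sort before comparing.
my $same = join( ',', sort @via_each ) eq join( ',', sort @via_values );
print $same ? "same events\n" : "different events\n";
```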

uri

Tim McDaniel · Aug 19, 2009

TM> Is there any functional difference between
TM> undef @bar;
TM> or
TM> @bar = ();
TM> (and similarly for %baz)?

UG> yes. do one and the other and then test with defined. one is true and
UG> the other is false. the fact that you can even call defined on an
UG> aggregate is really a bug. p6 is eliminating that iirc.

"man perlfunc" says

Use of "defined" on aggregates (hashes and arrays) is deprecated.
It used to report whether memory for that aggregate has ever been
allocated. This behavior may disappear in future versions of
Perl. You should instead use a simple test for size:

if (@an_array) { print "has array elements\n" }
if (%a_hash) { print "has hash members\n" }

TM> Some wording in the man pages makes me wonder whether it would
TM> be just as efficient to do

TM> foreach my $event ( values %{$timer_events} ) {

UG> but that creates a list of the values first. my code iterates over
UG> the values with less storage needed and more speed.

What made me wonder was

The values are returned in an apparently random order. The actual
random order is subject to change in future versions of perl, but
it is guaranteed to be the same order as either the "keys" or
"each" function would produce on the same (unmodified) hash. ...

As a side effect, calling values() resets the HASH's internal
iterator, see "each". (In particular, calling values() in void
context resets the iterator with no other overhead.)

Note that the values are not copied, which means modifying them
will modify the contents of the hash:

for (values %hash) { s/foo/bar/g } # modifies %hash values
for (@hash{keys %hash}) { s/foo/bar/g } # same

particularly that last. It could be generating a list for
values(%hash) and doing further magic to point to the real values
(multiplying the storage needed), or maybe there's an optimization in
such a simple case to make values() work as efficiently as each()?
I've seen people post timing test results here, but I've not looked
into the modules that conveniently implement them.

Peter J. Holzer · Aug 19, 2009

TM> "man perlfunc" says

TM> Use of "defined" on aggregates (hashes and arrays) is deprecated.
TM> It used to report whether memory for that aggregate has ever been
TM> allocated. This behavior may disappear in future versions of
TM> Perl. You should instead use a simple test for size:

TM> if (@an_array) { print "has array elements\n" }
TM> if (%a_hash) { print "has hash members\n" }

Perhaps surprisingly, on at least some versions of perl,

if (keys %a_hash)

was substantially faster than

if (%a_hash)

If your script spends a lot of time checking whether a hash is empty,
you might want to benchmark both versions.
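
A sketch of such a benchmark using the core Benchmark module. The hash size and the $hits counter are arbitrary choices for illustration, and the result depends heavily on the perl version, so no numbers are given here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Arbitrary test data: a hash with 1000 keys.
my %a_hash = map { $_ => 1 } 1 .. 1000;
my $hits   = 0;

# Run each test for at least one CPU second and print a comparison table.
cmpthese( -1, {
    boolean => sub { $hits++ if %a_hash },
    keys    => sub { $hits++ if keys %a_hash },
} );
```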

UG> but that creates a list of the values first. my code iterates over
UG> the values with less storage needed and more speed.

TM> What made me wonder was [...]
TM> Note that the values are not copied, which means modifying them
TM> will modify the contents of the hash:

TM> for (values %hash) { s/foo/bar/g } # modifies %hash values
TM> for (@hash{keys %hash}) { s/foo/bar/g } # same

TM> particularly that last. It could be generating a list for
TM> values(%hash) and doing further magic to point to the real values
TM> (multiplying the storage needed),

That's what it does (at least until 5.8.8, maybe it changed in 5.10).

TM> or maybe there's an optimization in
TM> such a simple case to make values() work as efficiently as each()?

Nope. Which makes each a lot more efficient for large hashes.

TM> I've seen people post timing test results here, but I've not looked
TM> into the modules that conveniently implement them.

I don't have any numbers at hand, but I've had a few scripts where
replacing values with each yielded rather impressive speedups (not to
mention that in some cases it was needed to fit into the 3GB limit of
32-bit Linux).

hp
