References - problem understanding them

Martijn Lievaart · Aug 21, 2006

Sure, this will work, but the entire hash gets copied when the sub
returns. For large data structures this can be very inefficient. Passing
around references is much more memory and time efficient for large and
complex data structures.

Although excellent advice in general, in this case I would expect Perl to
optimize it out. A return of something can be implemented as copy-destroy,
or as a 'move'. The second is much more efficient.

M4

David Squire · Aug 21, 2006

Martijn said:
Although excellent advice in general, in this case I would expect Perl to
optimize it out. A return of something can be implemented as copy-destroy,
or as a 'move'. The second is much more efficient.

It doesn't though:

----

#!/usr/bin/perl

use strict;
use warnings;

my %content2 = test2();
print "\nIn main:\n";
print \%content2, "\n";
print \$content2{'one'}, "\n";
for (keys %content2)
{ print "$_: ",$content2{$_},"\n" }

sub test2
{
    my %content2 = (one => "eins", two => "zwei");
    print "In sub:\n";
    print \%content2, "\n";
    print \$content2{'one'}, "\n";
    return %content2;
}

----

Output:

In sub:
HASH(0x18153c0)
SCALAR(0x1801380)

In main:
HASH(0x180d00c)
SCALAR(0x1801434)
one: eins
two: zwei

----

This is perl, v5.8.6
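
For comparison, here is a minimal sketch of the same test returning a reference instead of the hash itself. The sub name test2_ref is made up for this illustration and is not part of the script above. If references behave as described in this thread, the address seen inside the sub and the one seen in main should be identical, because only the reference is copied on return.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical variant of test2() that returns a reference, not a list.
sub test2_ref
{
    my %content2 = (one => "eins", two => "zwei");
    # Return the reference plus its stringified address as seen in the sub.
    return (\%content2, "" . \%content2);
}

my ($href, $addr_in_sub) = test2_ref();
my $addr_in_main = "$href";    # stringify the reference in the caller

print "In sub:  $addr_in_sub\n";
print "In main: $addr_in_main\n";
print "same hash\n" if $addr_in_sub eq $addr_in_main;
print "$_: $href->{$_}\n" for sort keys %$href;
```

The actual addresses will differ from run to run, so none are shown here.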

DS

Uri Guttman · Aug 21, 2006

ML> Although excellent advice in general, in this case I would expect
ML> Perl to optimize it out. A return of something can be implemented
ML> as copy-destroy, or as a 'move'. The second is much more
ML> efficient.

it is trickier than just that. and perl 6 will be doing lazy copies and
copy on write so that will be optimizable. it would be (almost?)
impossible to make perl5 do this internally as it uses a stack to handle
sub arguments and returns.

uri

xhoster · Aug 21, 2006

Martijn Lievaart said:
Although excellent advice in general, in this case I would expect Perl to
optimize it out.

You can expect, and maybe even reasonably, but it doesn't do so.

A return of something can be implemented as
copy-destroy, or as a 'move'.

What would a "move" do in the case that one of the things moved
has other references hanging around? You would need to be careful or
the optimization would result in behavioral changes.
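
A small sketch of that situation, with made-up names ($kept, make_hash). Something keeps a reference to a value inside the hash before the sub returns. With the current copy-on-return behaviour the kept reference and the caller's hash are independent. Under a naive "move" they would presumably end up sharing the same scalar, and the first print would change.

```perl
use strict;
use warnings;

my $kept;    # a reference that outlives the call (hypothetical name)

sub make_hash
{
    my %h = (one => "eins");
    $kept = \$h{one};    # keep a reference to a value inside the hash
    return %h;           # returned as a flat list of keys and values
}

my %copy = make_hash();
$copy{one} = "uno";      # modify the caller's copy only

print "kept: $$kept\n";        # kept: eins
print "copy: $copy{one}\n";    # copy: uno
```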

Xho

xhoster · Aug 21, 2006

Uri Guttman said:
x> It is excellent style to rely on it. And it isn't like there is a
x> choice. Pure Perl has no user-accessible malloc or free.

x> What alternative would you propose?

there is another style choice. returning a ref to a lexical array/hash
may be confusing but returning an anon array/hash could be less
confusing.

Well, either way it is a ref. So that is an alternative to something other
than what I had in mind.

so instead of

my @array ;

return \@array ;

do

my $aref = [] ;

return $aref ;

they do the exact same thing but it is clearer code IMO to explicitly
create and return an anon array (or hash). not that i haven't done the
former in some cases but i lean to doing the latter.

Yes, this is often a good move, although I don't do it as often as I
should.

the only downside to the $aref version would be how you stuff it. you
have to deref it with ->[] or @{} where in the former case you can do
the slightly simpler direct access to the array.

This downside is also an upside. If you return a reference to the main
code then it probably needs to access the structure via dereferencing. The
benefit of using a reference inside the sub as well is that the structure
inside the sub is accessed the same way as the structure in the main code
is accessed. That makes it a lot easier to make modifications in which you
move code that used to be in the sub out to the main (or another sub), and
vice versa.
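
To make the two styles concrete, here is a minimal sketch. The sub names squares_named and squares_anon are invented for the example. Both return an array reference; the difference is only in how the array is filled inside the sub.

```perl
use strict;
use warnings;

# Style 1: fill a lexical array, then return a reference to it.
sub squares_named
{
    my ($n) = @_;
    my @array;
    push @array, $_ * $_ for 1 .. $n;    # direct access inside the sub
    return \@array;
}

# Style 2: work through an anonymous array reference from the start.
sub squares_anon
{
    my ($n) = @_;
    my $aref = [];
    push @$aref, $_ * $_ for 1 .. $n;    # same dereferencing the caller uses
    return $aref;
}

my $first  = squares_named(3);
my $second = squares_anon(3);

print "named: @$first\n";          # named: 1 4 9
print "anon:  @{$second}\n";       # anon:  1 4 9
print "last:  $second->[-1]\n";    # last:  9
```

In the second style, a line such as the push can be moved between the sub and its caller without changing the syntax, which is the point made above.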

Xho

Martijn Lievaart · Aug 22, 2006

ML> Although excellent advice in general, in this case I would expect
ML> Perl to optimize it out. A return of something can be implemented
ML> as copy-destroy, or as a 'move'. The second is much more
ML> efficient.

it is trickier than just that. and perl 6 will be doing lazy copies and
copy on write so that will be optimizable. it would be (almost?)
impossible to make perl5 do this internally as it uses a stack to handle
sub arguments and returns.

Right. I don't know anything about Perl's internals, so I believe you. It
sounds kind of illogical though, such a stack probably does not contain
the whole hash, probably some small data structure pointing to the guts of
the hash. It seems like a big win to only copy that small data structure.

But as I said, I know nothing of Perl's internals, the (mighty inefficient)
parsers/interpreters I've written myself are my only frame of reference
(no pun intended).

M4

David Squire · Aug 22, 2006

Martijn said:
Right. I don't know anything about Perl's internals, so I believe you. It
sounds kind of illogical though, such a stack probably does not contain
the whole hash, probably some small data structure to the guts of the
hash.

Why would you say that, having just admitted that you don't know
anything about perl internals? The *whole thing* gets put on the stack -
(all the keys and values as a list), as my reply to your original post
illustrates. Neither the address of the hash nor those of its internal
elements are preserved.

It seems like a big win to only copy that small data structure.

It would be (modulo caveats about other references etc.). I've seen C++
compilers do exactly that optimization. Perl doesn't though. As many
have now pointed out.

DS

Uri Guttman · Aug 22, 2006

ML> Right. I don't know anything about Perl's internals, so I believe you. It
ML> sounds kind of illogical though, such a stack probably does not contain
ML> the whole hash, probably some small data structure to the guts of the
ML> hash. It seems like a big win to only copy that small data structure.

and what about deep vs shallow copies? and circular refs? and reference
counting? if you want those you have to be explicit and do it
yourself. there is no easy solution to those problems and perl5 decided
long ago to just provide a simple stack for call args and returns. it
makes it clean and easy to understand and also not difficult for the
coder to create solutions to the above problems. remember one of perl's
(many) mottos is to make easy things easy and hard things possible. a
stack with user code refs fits that very well.
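
a rough sketch of the deep vs shallow point, with invented names (make_config, %first, %second). returning or assigning a hash copies only the top-level values, so a nested array held by reference ends up shared between the copies, while a plain scalar value does not:

```perl
use strict;
use warnings;

sub make_config
{
    my %config = (
        name  => "demo",
        hosts => [ "alpha", "beta" ],    # nested structure held by reference
    );
    return %config;    # flattened to a list; the array ref is copied, not the array
}

my %first  = make_config();
my %second = %first;    # plain hash assignment: a shallow copy

push @{ $second{hosts} }, "gamma";    # changes the array both hashes point to
$second{name} = "changed";            # changes only %second

print "first name:  $first{name}\n";          # first name:  demo
print "first hosts: @{ $first{hosts} }\n";    # first hosts: alpha beta gamma
```

if you want %second to get its own hosts array, you have to copy it explicitly, for example with [ @{ $first{hosts} } ].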

ML> But as I said, I know nothing of Perl's internals, the (mighty
ML> inefficient) parsers/interpreters I've written myself are my only
ML> frame of reference (no pun intended).

so please don't make any comments on what would be easy to do inside
perl. if you understand any of the problems i listed then you will know
why they are not handled directly in perl5.

uri

David Squire · Aug 22, 2006

David said:
Why would you say that, having just admitted that you don't know
anything about perl internals? The *whole thing* gets put on the stack -
(all the keys and values as a list)

....and if you don't believe me, try this:

----

#!/usr/bin/perl

use strict;
use warnings;

sub hashlike_list {
    my @list = qw(one uno two dos);
    return @list;
}

sub listlike_hash {
    my %hash = (
        'one' => 'einz',
        'two' => 'zwei',
    );
    return %hash;
}

my %test_hash = hashlike_list();
print "\$test_hash{one} = $test_hash{one}\n";
print "\$test_hash{two} = $test_hash{two}\n";

my @test_list = listlike_hash();
print map "$_, ", @test_list;

----

Aug 22 - 22:02 % ./test.pl
$test_hash{one} = uno
$test_hash{two} = dos
one, einz, two, zwei,

Aaron Sherman · Sep 5, 2006

Sorry, but I strongly disagree. Pointers in C are memory addresses,

As are SV*s

something that otherwise you find in assembler. You can manipulate them, you
can add or subtract to the address, etc, etc.
A reference in Perl is an _abstract_, well, reference to one concrete
object.

No. That might be how you like to think of it, but that's not what it
is. It's an area of malloced memory that Perl maintains a pointer to.
That memory, in turn, can contain a pointer to another SV, and that
target SV is almost always informed of the pointer relationship in
order to allow for reference count-based garbage collection.

On an implementation level it may be implemented as a C
pointer, but it could just as well be an index into some variable list or
whatever. As a Perl programmer you don't know and you don't care.

But as a C programmer, understanding the implementation helps in the
understanding, and this need to abstract away the guts is what
typically gets people into trouble trying to understand high level
languages, when they are coming from low-level languages.

C forces
the programmer to implement memory management manually while Perl does it
automatically. There is no new(), malloc(), or free() in Perl.

This is also not true. C gives you fewer tools, and provides fewer
implicit behaviors, but just by way of example, the way the C call
stack is managed is most certainly memory management, and if you don't
believe me, try calling alloca(3).

Perl also requires and allows you to do memory management, but it is
all done through SVs, HVs, AVs, GVs, etc. Sometimes this is quite
important. For example, you can control de-allocation of an array in
three ways, each of which has its own behavior: assign an empty list,
"undef" it, take it out of scope. How memory allocation is affected is
not only different in all three cases, but HOW it is different is
documented as part of the language.
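
As a minimal sketch of those three ways, the following uses a made-up Tracker class whose DESTROY method counts how many objects have been released. It only shows that the array's elements are released in each case. It does not measure what happens to the array's own internal buffer, which is where the three are said to differ.

```perl
use strict;
use warnings;

my $destroyed = 0;    # how many Tracker objects have been released so far

package Tracker;      # hypothetical helper class for this illustration
sub new     { return bless {}, shift }
sub DESTROY { $destroyed++ }

package main;

# 1. Assign an empty list.
my @by_assign = (Tracker->new, Tracker->new);
@by_assign = ();
print "after assigning empty list: $destroyed\n";    # 2

# 2. undef the array.
my @by_undef = (Tracker->new, Tracker->new);
undef @by_undef;
print "after undef: $destroyed\n";                   # 4

# 3. Let the array go out of scope.
{
    my @by_scope = (Tracker->new, Tracker->new);
}
print "after leaving scope: $destroyed\n";           # 6
```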
