Regarding pointers and references, I've seen books/websites/etc (I'm
really trying to avoid using the word "references" here) that say that
references and pointers are just different implementations of the same
underlying idea.
I rather disagree with that, but here we're treading on the border of
my knowledge. A pointer is basically an integer memory address. You
can add or subtract values from that value and end up with adjacent (or
nearby) memory location addresses. References, on the other hand, refer
to a very specific memory location. You can't get at any other piece
of memory just by manipulating the reference variable.
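A small sketch of that difference in Perl terms (the variable names are
made up for illustration): doing arithmetic on a reference does not give
you another reference, just a plain number.

```perl
use strict;
use warnings;

my $x   = 42;
my $ref = \$x;

# In numeric context a reference turns into a plain number,
# so the result of the addition is no longer a reference.
my $n = $ref + 1;

print 'ref($ref): ', ref($ref), "\n";   # SCALAR
print 'ref($n): ',   ref($n),   "\n";   # empty string: $n is not a reference
print '$$ref: ',     $$ref,     "\n";   # 42, $x is still all we can reach
```

There is no way to turn $n back into something that reaches a neighboring
variable.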
If I understand your example, $$ref is a scalar variable
.... no. $$ref is not a variable at all. $ref is a scalar variable,
which holds a reference to $x. $$ref is simply the syntactic means by
which we "dereference" this variable, to get at the value contained
in the variable that $ref references.
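A minimal sketch of what that means, assuming $ref was created by taking
a reference to $x with \$x:

```perl
use strict;
use warnings;

my $x   = 42;
my $ref = \$x;      # $ref holds a reference to $x

$x = 35;            # change the original variable
print "Dereferenced ref: $$ref\n";   # looks up $x through $ref: 35

$$ref = 7;          # assigning through the reference changes $x itself
print "x is now: $x\n";              # 7
```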
whose value is
defined as the value of $x for all time after $$ref is defined. That
is, if $x changes, $$ref changes accordingly. So why introduce $ref
when
my $x = 42;
$x = 35;
print "Dereferenced ref: $x\n";
outputs the same thing? I realize the example you gave was just to
illustrate what a reference does. I'm just not seeing how a reference
can be useful.
You're correct - in that simple example, there was no point whatsoever
for $ref. There are two main areas in which references are useful (or
even vital):
1) multi-dimensional data structures.
Perl has no built-in "2d array": an array cannot directly contain another
array. Arrays and hashes can only hold scalar values. Therefore, if you want to
simulate a two-dimensional array, you actually create an array which
holds references to more arrays:
my @two_d;
for my $i (0..5){
    my @inner_array = (0..5);
    push @two_d, \@inner_array;
}
(The above can be written much more succinctly, but I'm trying to
sacrifice brevity for clarity). This creates a single array - @two_d -
and then loops 6 times. In each iteration of the loop, we create a new
array - @inner_array - and then we store a reference to this new array
in the "outer" array. So now, for example: $two_d[0], the first
element of @two_d, is a reference to an array of six integers. I can
therefore access, say, the 3rd element of this inner array using:
$two_d[0][2].
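For what it's worth, here is one way the more succinct version might
look, using anonymous array references ([ ... ]) instead of a named inner
array. It builds the same structure; treat it as a sketch:

```perl
use strict;
use warnings;

# Each pass through map builds a new anonymous array
# and returns a reference to it.
my @two_d = map { [0..5] } 0..5;

print scalar(@two_d), "\n";      # 6 rows
print $two_d[0][2], "\n";        # 3rd element of the first row: 2
print $two_d[0]->[2], "\n";      # same thing, with the arrow written out
print "@{ $two_d[1] }\n";        # whole second row: 0 1 2 3 4 5

$two_d[1][0] = 99;               # each row is its own separate array
print $two_d[0][0], "\n";        # still 0
```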
For more on multi-dimensional structures:
perldoc perllol
perldoc perldsc
2) Passing large amounts of data
Say you have a huge hash:
my %big_hash = map { $_ => $_ * 2 } (1..100_000);
(if you're not familiar with the syntax, this simply creates a hash
whose keys are the integers from 1 to 100,000 and whose values are the
even integers from 2 to 200,000).
Now, you want to write a subroutine which accesses and modifies this
data. An initial approach might be:
sub change_something {
    my %hash = @_;
    $hash{42} = 'forty-two';
    return %hash;
}
%big_hash = change_something(%big_hash);
The problem here is that you've just made three copies of your large
data structure. Three separate instances of all that data had to be
copied, taking up time and space resources in your computer. A better
approach is to store the data once, and have your subroutine just be
given a reference to this data, and modify it:
sub change_something {
    my ($hash_ref) = @_;
    $hash_ref->{42} = 'forty-two';
}
change_something(\%big_hash);
Now we have only ever created one instance of this large data. When we
call change_something, instead of copying all that data, we simply pass
a reference (a single scalar value) to the data. The subroutine gets
that reference, and then directly modifies the data that the reference
references. When the subroutine has completed, the data has been
altered, without making any needless copies.
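If you want to convince yourself of the difference, a sketch along these
lines should do it. The sub names change_copy and change_in_place are
made up here to keep the two versions apart, and I've used a small hash
so the output is easy to check:

```perl
use strict;
use warnings;

my %hash = map { $_ => $_ * 2 } (1..5);

# Copying version: works on its own private copy of the data.
sub change_copy {
    my %copy = @_;
    $copy{4} = 'four';
    return %copy;
}

# Reference version: works directly on the caller's hash.
sub change_in_place {
    my ($hash_ref) = @_;
    $hash_ref->{4} = 'four';
    return;
}

change_copy(%hash);          # result thrown away
print "$hash{4}\n";          # 8: the original was never touched

change_in_place(\%hash);
print "$hash{4}\n";          # four
```

The copying version only affects the caller if the returned list is
assigned back, which is where the extra copying comes from.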
2b) Passing multiple structures
Because all arguments are flattened into the single list @_, you cannot
pass more than one structure (such as arrays or hashes) directly into
your subroutine and still have the subroutine tell them apart:
my @foo = (1..10);
my @bar = (11..20);
sums (@foo, @bar);
sub sums {
    my @stuff = @_;
    # now what? @stuff contains (1..20). No way to know where
    # @foo ended and @bar began
}
Instead of passing the actual arrays, we can pass references to those
arrays, so that the sums() subroutine can access the array locations
individually:
my @foo = (1..10);
my @bar = (11..20);
my $new_array_ref = sums (\@foo, \@bar);
#we can get directly at our new array by dereferencing the above:
my @new_array = @$new_array_ref;
sub sums {
    my ($arr1, $arr2) = @_;
    my @new_array;
    for my $i (0 .. $#{$arr1}){
        push @new_array, ($arr1->[$i] + $arr2->[$i]);
    }
    return \@new_array;
}
Here, rather than lumping all the elements of both arrays together into
the single @_ array, we were able to obtain the two references, and
then used the two references to access the individual data elements.
When we conclude, we return a reference to the new data, just to avoid
copying the new data structure as well (preferring to let the calling
code decide if the data needs to be copied).
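The same trick works when the structures are of different kinds. A
hypothetical example (the sub name total_price and the data are invented
for illustration) that takes an array and a hash, each by reference:

```perl
use strict;
use warnings;

my @wanted = qw(apple cherry);
my %price  = (apple => 3, banana => 1, cherry => 7);

# Add up the prices of the wanted items.
sub total_price {
    my ($names_ref, $price_ref) = @_;
    my $total = 0;
    for my $name (@$names_ref) {
        $total += $price_ref->{$name};
    }
    return $total;
}

my $total = total_price(\@wanted, \%price);
print "$total\n";   # 10
```

Had @wanted and %price been passed directly, the sub would have received
one long flat list with no way to separate the names from the key/value
pairs.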
For more info on subroutine parameters:
perldoc perlsub
I hope this rather lengthy post is helpful to you.
Paul Lalli