: [...]
:
: First in the subroutine sort_by_column
:
: sub sort_by_column {
: my $m = shift;
: my $col = shift;
:
: return unless ref($m) && @$m && $col;
:
: my $colidx = find_column_index $m, $col;
: return unless defined $colidx;
:
: @{$m}[1..$#$m] = sort { $a->[$colidx] <=> $b->[$colidx] }
: @{$m}[1..$#$m];
: }
:
: sort_by_column \@array, 'L3';
:
: I don't understand the shift operator and how it moves \@array (a
: reference to an array) and 'L3' into $m and $col. I know the input to
: a subroutine are the elements of @_ but what does shift mean?
From the perlfunc documentation on the shift operator:
Shifts the first value of the array off and returns it,
shortening the array by 1 and moving everything down. If
there are no elements in the array, returns the undefined
value. If ARRAY is omitted, shifts the "@_" array within
the lexical scope of subroutines . . .
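The two shifts pluck the subroutine's arguments off the front of @_, one
at a time: the first returns \@array (stored in $m) and the second
returns 'L3' (stored in $col). To see shift in action on an ordinary
array, consider the following: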
[16:15] ant% cat try
#! /usr/local/bin/perl
$" = "]["; # separator for interpolating arrays
@a = ('apples', 'oranges', 'bananas');
print "[@a]\n";
$first = shift @a;
print "\$first = [$first], \@a = [@a]\n";
[16:15] ant% ./try
[apples][oranges][bananas]
$first = [apples], @a = [oranges][bananas]
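The same thing happens to @_ inside a subroutine when shift is given no
argument. Here is a minimal sketch; show_args is a made-up name for
illustration only, and the data is invented:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# show_args is a hypothetical sub, not part of the original program.
sub show_args {
    my $m   = shift;    # same as shift(@_): removes and returns $_[0]
    my $col = shift;    # @_ is now one shorter, so this gets the old $_[1]
    return (ref($m), $col, scalar @_);   # scalar @_ = arguments left over
}

my @array = ([ 'L1', 'L2', 'L3' ], [ 3, 2, 1 ]);
my ($type, $col, $left) = show_args(\@array, 'L3');
print "type=$type col=$col left=$left\n";   # type=ARRAY col=L3 left=0
```

After both shifts, @_ is empty because both arguments have been removed.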
: The statement return unless ref($m) && @$m && $col; tests to see that
: the reference $m and value $col exist but what's @$m mean? An array
: whose pointer reference starts at $m?
Yes, but your terminology could stand polishing. (If I seem picky, I'm
only trying to help you learn.) In Perl parlance, we'd say that we're
making sure -- albeit indirectly -- that $m is an array reference, that
$m's thingy (Perl's pedestrian way of saying 'referent', i.e., the array
to which $m refers) has at least one element, and that we have a column
label to look for. See the perlref manpage.
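To make the pieces of that test concrete, here is a small sketch of what
ref and @$m produce on their own. The variables $empty and $plain are
invented for the illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $m     = [ [ 'L1', 'L2' ], [ 4, 2 ] ];   # reference to an array of two rows
my $empty = [];                              # reference to an empty array
my $plain = 'not a reference';

print "ref(\$m)     = '", ref($m), "'\n";          # 'ARRAY'
print "ref(\$plain) = '", ref($plain), "'\n";      # '' (empty string, false)
print "rows in \$m     = ", scalar(@$m), "\n";     # 2
print "rows in \$empty = ", scalar(@$empty), "\n"; # 0 is false: check fails
```

In the && chain, @$m is evaluated in scalar context, so it yields the
number of elements, just as scalar(@$m) does here.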
We might have written the check
return unless ref($m) && @$m && $col;
more verbosely as
unless ($m && ref($m) eq 'ARRAY') {
warn "'$m' is not an array reference";
return;
}
unless (@$m > 0) {
warn "no rows!";
return;
}
if (!defined($col) || $col eq '') {
warn "no column label!";
return;
}
I wrote the check the way I did because sort_by_column operates
in-place, so, at worst, I'd just leave the data alone. One line was
also a little more appealing than twelve.
There are also lots of hairy philosophical arguments surrounding this
issue such as "defensive programming is bad style because it hides
bugs", but let's not get into all that.
: Also I'm not sure what the expression @{$m}[1..$#$m] means.
: obviously a pointer $m to an array but [1..$#$m]? .
Remember that Perl has references, not pointers.
Perl's .. operator can produce ranges, e.g.,
% perl -le 'print 0..9'
0123456789
Recall from the perldata manpage that $#ARRAY gives the index of the
last element of @ARRAY. For example
% perl -le '@a = (1..10); print $#a'
9
(I might be setting a bad example. mjd, rightly IMHO, says using
$#ARRAY is a red flag[*]. The usage is correct in this case, but
do what I say, not what I do.)
[*]
http://groups.google.com/[email protected]
The perlref manpage shows how to dereference arrays, and $#$m yields the
index of the last element in $m's thingy. @{$m}[...] takes a slice of
$m's thingy, i.e., a sublist -- see the perldata manpage.
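A quick sketch of $#$m and an array slice through a reference, using a
flat array of made-up strings rather than rows, just to keep it short:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $m = [ 'header', 'c', 'a', 'b' ];   # hypothetical data for illustration

my $last = $#$m;                  # index of the last element: 3
my @tail = @{$m}[ 1 .. $last ];   # slice: elements 1 through 3
my @pair = @{$m}[ 0, 2 ];         # a slice takes any list of indices

print "last index: $last\n";   # 3
print "tail: @tail\n";         # c a b
print "pair: @pair\n";         # header a
```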
Don't get bogged down in the low-level details. Think about what we're
trying to do: we want to leave the first row alone (the header) and
sort everything else, i.e., all the rows from index 1 up to the last
index in $m's thingy. We're operating in-place, so we put the rows back
where we got them:
@{$m}[1..$#$m] = sort { $a->[$colidx] <=> $b->[$colidx] }
@{$m}[1..$#$m];
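Here is that statement run on a small table. The data is invented, and
$colidx is hardcoded to the position of 'L3' instead of being looked up,
so treat it as a sketch rather than the full program:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up sample data: row 0 is the header, the rest are numeric rows.
my @array = (
    [ 'L1', 'L2', 'L3' ],
    [ 1, 20, 9 ],
    [ 2, 10, 3 ],
    [ 3, 30, 6 ],
);
my $m      = \@array;
my $colidx = 2;    # index of 'L3' in the header row, hardcoded here

# Sort rows 1..last numerically on the chosen column and store them back
# into the same positions; row 0 is never touched.
@{$m}[ 1 .. $#$m ] = sort { $a->[$colidx] <=> $b->[$colidx] }
                     @{$m}[ 1 .. $#$m ];

print join(' ', @$_), "\n" for @$m;
```

The header row stays first, and the remaining rows come out ordered by
their L3 values: 3, 6, 9.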
: Next I don't understand some of the code in the subroutine
: find_column_index:
:
: sub find_column_index {
: my $a = shift;
: my $col = shift;
:
: my $header = $a->[0];
: my $colidx = 0;
: for (@$header) {
: last if $_ eq $col;
: ++$colidx;
: }
:
: $colidx >= @$header ? () : $colidx;
: }
:
: I take it that "my $header = $a->[0];" means store the pointer
: reference of the 0'th element into $header?
Yes, we're storing a copy of a reference to the array of column headers.
I used a separate variable to show the code's intent.
: "for (@$header)" means for
: each element of the input array do the below? I didn't know "last"
: would end the loop after the last statement if the "if" statement was
: true. Neat.
Yes. Perl's last operator is like break in C but cooler.
: I take it that when you say "for(@$header)" each element
: of the array is stored into $_ one by one in the for loop?
Yes. See the perlsyn manpage.
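A small sketch of the loop on its own, with made-up labels, showing that
$_ takes each element in turn and that last leaves the loop at once:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @header = ('L1', 'L2', 'L3', 'L4');   # invented column labels
my @seen;

for (@header) {            # $_ is set to each element in turn
    last if $_ eq 'L3';    # leave the loop immediately, like C's break
    push @seen, $_;        # reached only for elements before the match
}

print "visited before the match: @seen\n";   # L1 L2
```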
: Last what does $colidx >= @$header ? () : $colidx; mean? If the array
: element number of 'L3' is greater than or equal to ...then I get lost.
That's the ternary operator as in C, sometimes called an "inline if".
See the "Conditional Operator" section of the perlop manpage.
That code is checking whether we found a match. If the loop ran off the
end without a match, $colidx was incremented once per header and so
equals the number of elements in @$header. The condition is then true,
and we return (), the empty list, which the caller sees as undef when it
assigns the result to a scalar. Otherwise (what's after the colon), we
send back the desired header's index.
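To see both outcomes, here is a sketch that calls the subroutine with a
label that exists and one that doesn't. The logic follows the
find_column_index quoted above; I've renamed $a to $rows and added an
explicit return, and the header data is made up:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same logic as the quoted find_column_index; $rows replaces $a so the
# name does not shadow sort's special $a.
sub find_column_index {
    my ($rows, $col) = @_;
    my $header = $rows->[0];
    my $colidx = 0;
    for (@$header) {
        last if $_ eq $col;
        ++$colidx;
    }
    return $colidx >= @$header ? () : $colidx;
}

my $m = [ [ 'L1', 'L2', 'L3' ] ];   # invented header row only

my $hit  = find_column_index($m, 'L3');   # scalar context: 2
my $miss = find_column_index($m, 'L9');   # scalar context: () becomes undef
my @list = find_column_index($m, 'L9');   # list context: empty list

print "hit:  $hit\n";
print "miss: ", (defined $miss ? $miss : 'undef'), "\n";
print "list has ", scalar(@list), " elements\n";
```

This is why sort_by_column can follow the call with
"return unless defined $colidx;".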
Hope this helps,
Greg