How do you sort a 2D array with column headers?

Discussion in 'Perl Misc' started by Dennis@NoSpam.com, Jun 28, 2003.

  1. Guest

    I have a numerical array consisting of 5000 rows and 30 columns. The first row
    consists of 30 ascii column labels for example L1,L2.....L30. I would like to
    sort the column with the header L5 in ascending order leaving the header labels
    intact on the first row.

    I'm familiar with code

    @array =sort { $a->[1] <=> $b->[1]} @array;

    and I have read perldoc -f sort.

    But the code above doesn't allow me to sort the array by column labels.

    How would I do that?


    Any help would be appreciated.

    Dennis
     
    , Jun 28, 2003
    #1

  2. Greg Bacon Guest

    In article <>,
    <> wrote:

    : I have a numerical array consisting of 5000 rows and 30 columns. The
    : first row consists of 30 ascii column labels for example
    : L1,L2.....L30. I would like to sort the column with the header L5 in
    : ascending order leaving the header labels intact on the first row.

    Assuming @array is an array of rows, you could use something similar
    to the code below.

    [14:53] ant% cat try
    #! /usr/local/bin/perl

    use warnings;
    use strict;

    sub find_column_index {
        my $a = shift;
        my $col = shift;

        my $header = $a->[0];
        my $colidx = 0;
        for (@$header) {
            last if $_ eq $col;
            ++$colidx;
        }

        $colidx >= @$header ? () : $colidx;
    }

    sub sort_by_column {
        my $m = shift;
        my $col = shift;

        return unless ref($m) && @$m && $col;

        my $colidx = find_column_index $m, $col;
        return unless defined $colidx;

        @{$m}[1..$#$m] = sort { $a->[$colidx] <=> $b->[$colidx] }
                         @{$m}[1..$#$m];
    }

    my @array = (
        [qw/ L1 L2 L3 L4 L5 /],
        [9, 8, 7, 6, 5],
        [1, 2, 3, 4, 5],
        [0, 0, 0, 0, 0],
    );

    sort_by_column \@array, 'L3';

    for (@array) {
        printf join(" ", ("%5s") x @$_) . "\n", @$_;
    }
    [14:53] ant% ./try
       L1    L2    L3    L4    L5
        0     0     0     0     0
        1     2     3     4     5
        9     8     7     6     5

    Hope this helps,
    Greg
    --
    Government, even in its best state, is but a necessary evil; in its worst
    state, an intolerable one. Government, like dress, is the badge of lost
    innocence; the palaces of kings are built upon the ruins of the bowers of
    paradise. -- Thomas Paine, "Common Sense"
     
    Greg Bacon, Jun 28, 2003
    #2

  3. Guest

    (Greg Bacon) wrote:

    >In article <>,
    > <> wrote:
    >
    >: I have a numerical array consisting of 5000 rows and 30 columns. The
    >: first row consists of 30 ascii column labels for example
    >: L1,L2.....L30. I would like to sort the column with the header L5 in
    >: ascending order leaving the header labels intact on the first row.
    >
    >Assuming @array is an array of rows, you could use something similar
    >to the code below.


    <snipped code, since it's shown in the post above>

    >Hope this helps,
    >Greg


    Thank you Greg!

    A lot of neat code. Some of the perl syntax is new to me but I'll get to work
    with my Perl books and learn. Thanks again.

    Dennis
     
    , Jun 28, 2003
    #3
  4. Greg Bacon Guest

    In article <>,
    <> wrote:

    : [...]
    :
    : A lot of neat code. Some of the perl syntax is new to me but I'll get
    : to work with my Perl books and learn. Thanks again.

    Anything in particular that gave you trouble? This is a discussion
    group, after all. :) If you'll permit a guess, reading the perlref,
    perllol, and perldsc manpages will help your understanding.

    Greg
    --
    What has transformed the limited war between royal armies into total war,
    the clash between peoples, is not technicalities of military art, but the
    substitution of the welfare state for the laissez-faire state.
    -- Ludwig von Mises, *Human Action*
     
    Greg Bacon, Jun 29, 2003
    #4
  5. Guest

    (Greg Bacon) wrote:

    >In article <>,
    > <> wrote:
    >
    >: [...]
    >:
    >: A lot of neat code. Some of the perl syntax is new to me but I'll get
    >: to work with my Perl books and learn. Thanks again.
    >
    >Anything in particular that gave you trouble? This is a discussion
    >group, after all. :) If you'll permit a guess, reading the perlref,
    >perllol, and perldsc manpages will help your understanding.


    Greg,

    Well I read your above perl manpages and the subroutine section of "Perl
    Cookbook" by Tom Christiansen & N. Torkington.

    Below is the code I don't understand:

    First in the subroutine sort_by_column

    sub sort_by_column {
        my $m = shift;
        my $col = shift;

        return unless ref($m) && @$m && $col;

        my $colidx = find_column_index $m, $col;
        return unless defined $colidx;

        @{$m}[1..$#$m] = sort { $a->[$colidx] <=> $b->[$colidx] }
                         @{$m}[1..$#$m];
    }

    sort_by_column \@array, 'L3';

    I don't understand the shift operator and how it moves \@array (a reference to
    an array) and 'L3' into $m and $col. I know the input to a subroutine are the
    elements of @_ but what does shift mean?

    The statement return unless ref($m) && @$m && $col; tests to see that the
    reference $m and value $col exist but what's @$m mean? An array whose pointer
    reference starts at $m?

    Also I'm not sure what the expression @{$m}[1..$#$m] means. obviously a
    pointer $m to an array but [1..$#$m]? .

    Next I don't understand some of the code in the subroutine find_column_index:

    sub find_column_index {
        my $a = shift;
        my $col = shift;

        my $header = $a->[0];
        my $colidx = 0;
        for (@$header) {
            last if $_ eq $col;
            ++$colidx;
        }

        $colidx >= @$header ? () : $colidx;
    }

    I take it that "my $header = $a->[0];" means store the pointer reference of the
    0'th element into $header? "for (@$header)" means for each element of the input
    array do the below? I didn't know "last" would end the loop after the last
    statement if the "if" statement was true. Neat. I take it that when you say
    "for(@$header)" each element of the array is stored into $_ one by one in the
    for loop?

    Last, what does $colidx >= @$header ? () : $colidx; mean? If the array element
    number of 'L3' is greater than or equal to ...then I get lost.

    Thanks for your help, I'm learning a lot!

    Dennis
     
    , Jun 29, 2003
    #5
  6. Greg Bacon Guest

    In article <>,
    <> wrote:

    : [...]
    :
    : First in the subroutine sort_by_column
    :
    : sub sort_by_column {
    :     my $m = shift;
    :     my $col = shift;
    :
    :     return unless ref($m) && @$m && $col;
    :
    :     my $colidx = find_column_index $m, $col;
    :     return unless defined $colidx;
    :
    :     @{$m}[1..$#$m] = sort { $a->[$colidx] <=> $b->[$colidx] }
    :                      @{$m}[1..$#$m];
    : }
    :
    : sort_by_column \@array, 'L3';
    :
    : I don't understand the shift operator and how it moves \@array (a
    : reference to an array) and 'L3' into $m and $col. I know the input to
    : a subroutine are the elements of @_ but what does shift mean?

    From the perlfunc documentation on the shift operator:

    Shifts the first value of the array off and returns it,
    shortening the array by 1 and moving everything down. If
    there are no elements in the array, returns the undefined
    value. If ARRAY is omitted, shifts the "@_" array within
    the lexical scope of subroutines . . .

    The shifts are plucking off the subroutine's arguments. To see shift
    in action, consider the following:

    [16:15] ant% cat try
    #! /usr/local/bin/perl

    $" = "]["; # separator for interpolating arrays

    @a = ('apples', 'oranges', 'bananas');
    print "[@a]\n";

    $first = shift @a;
    print "\$first = [$first], \@a = [@a]\n";
    [16:15] ant% ./try
    [apples][oranges][bananas]
    $first = [apples], @a = [oranges][bananas]
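
Inside a subroutine, a bare shift works on @_ in the same way. The sketch below is not from the original post; the describe subroutine and its data are hypothetical, made up only to show the two arguments being plucked off in order.

```perl
use strict;
use warnings;

# Hypothetical helper: with no argument, shift removes and returns
# the first element of @_ when called inside a subroutine.
sub describe {
    my $ref   = shift;    # first argument: an array reference
    my $label = shift;    # second argument: a column label
    return "label=$label, rows=" . scalar(@$ref);
}

my @rows    = ([1, 2], [3, 4]);
my $summary = describe(\@rows, 'L3');
print "$summary\n";
```

Writing my ($ref, $label) = @_; would copy the same two arguments without emptying @_; which form to use is largely a matter of taste.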

    : The statement return unless ref($m) && @$m && $col; tests to see that
    : the reference $m and value $col exist but what's @$m mean? An array
    : whose pointer reference starts at $m?

    Yes, but your terminology could stand polishing. (If I seem picky, I'm
    only trying to help you learn.) In Perl parlance, we'd say that we're
    making sure -- albeit indirectly -- that $m is an array reference, that
    $m's thingy (Perl's pedestrian way of saying 'referent', i.e., the array
    to which $m refers) has at least one element, and that we have a column
    label to look for. See the perlref manpage.

    We might have written the following

    return unless ref($m) && @$m && $col;

    to be more chatty as

    unless ($m && ref($m) eq 'ARRAY') {
        warn "'$m' is not an array reference";
        return;
    }

    unless (@$m > 0) {
        warn "no rows!";
        return;
    }

    if (!defined($col) || $col eq '') {
        warn "no column label!";
        return;
    }

    I wrote the check the way I did because sort_by_column operates
    in-place, so, at worst, I'd just leave the data alone. One line was
    also a little more appealing than twelve. :)

    There are also lots of hairy philosophical arguments surrounding this
    issue such as "defensive programming is bad style because it hides
    bugs", but let's not get into all that.

    : Also I'm not sure what the expression @{$m}[1..$#$m] means.
    : obviously a pointer $m to an array but [1..$#$m]? .

    Remember that Perl doesn't have pointers but references.

    Perl's .. operator can produce ranges, e.g.,

    % perl -le 'print 0..9'
    0123456789

    Recall from the perldata manpage that $#ARRAY gives the index of the
    last element of @ARRAY. For example

    % perl -le '@a = (1..10); print $#a'
    9

    (I might be setting a bad example. mjd, rightly IMHO, says using
    $#ARRAY is a red flag[*]. The usage is correct in this case, but
    do what I say, not what I do. :)

    [*] http://groups.google.com/groups?selm=3bd70c54.b10$

    The perlref manpage shows how to dereference arrays, and $#$m yields the
    index of the last element in $m's thingy. @{$m}[...] takes a slice of
    $m's thingy, i.e., a sublist -- see the perldata manpage.

    Don't get bogged down in the low-level details. Think about what we're
    trying to do: we want to leave the first row alone (the header) and
    sort everything else, i.e., all the rows from index 1 up to the last
    index in $m's thingy. We're operating in-place, so we put the rows back
    where we got them:

    @{$m}[1..$#$m] = sort { $a->[$colidx] <=> $b->[$colidx] }
    @{$m}[1..$#$m];
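
The same slice-and-assign idea can be tried on a flat array first. This is a minimal sketch with made-up data (a label followed by three numbers), not code from the thread; it assumes only core Perl.

```perl
use strict;
use warnings;

# Made-up data: a label in slot 0, numbers after it.
my $m = [ 'header', 30, 10, 20 ];

# $#$m is the last index of the array that $m refers to (3 here).
my $last = $#$m;

# @{$m}[1 .. $#$m] is a slice: every element except the first.
my @body = @{$m}[1 .. $#$m];

# Assigning to the same slice writes the sorted values back in place,
# leaving element 0 untouched.
@{$m}[1 .. $#$m] = sort { $a <=> $b } @body;

print "@$m\n";
```

In the thread's code each element after the header is itself a row (an array reference), so the comparison looks inside the row with $a->[$colidx] instead of comparing $a and $b directly.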

    : Next I don't understand some of the code in the subroutine
    : find_column_index:
    :
    : sub find_column_index {
    :     my $a = shift;
    :     my $col = shift;
    :
    :     my $header = $a->[0];
    :     my $colidx = 0;
    :     for (@$header) {
    :         last if $_ eq $col;
    :         ++$colidx;
    :     }
    :
    :     $colidx >= @$header ? () : $colidx;
    : }
    :
    : I take it that "my $header = $a->[0];" means store the pointer
    : reference of the 0'th element into $header?

    Yes, we're storing a copy of a reference to the array of column headers.
    I used a separate variable to show the code's intent.

    : "for (@$header)" means for
    : each element of the input array do the below? I didn't know "last"
    : would end the loop after the last statement if the "if" statement was
    : true. Neat.

    Yes. Perl's last operator is like break in C but cooler.

    : I take it that when you say "for(@$header)" each element
    : of the array is stored into $_ one by one in the for loop?

    Yes. See the perlsyn manpage.

    : Last, what does $colidx >= @$header ? () : $colidx; mean? If the array
    : element number of 'L3' is greater than or equal to ...then I get lost.

    That's the ternary operator as in C, sometimes called an "inline if".
    See the perlsyn manpage.

    That code is checking whether we found a match. If the condition is
    true (no match), then $colidx will be at least as large as the number of
    elements in @$header, and we return () or nothing. Otherwise (what's
    after the colon), we send back the desired header's index.
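
What "return () or nothing" means depends on how the subroutine is called. The sketch below uses a hypothetical index_of subroutine (same logic as find_column_index, renamed for illustration) to show that an empty list becomes undef in scalar context and zero elements in list context.

```perl
use strict;
use warnings;

# Hypothetical lookup: return the index of $want in @$list,
# or an empty list when it is absent.
sub index_of {
    my ($list, $want) = @_;
    my $i = 0;
    for (@$list) {
        last if $_ eq $want;
        ++$i;
    }
    return $i >= @$list ? () : $i;
}

my @header = qw(L1 L2 L3);

my $found   = index_of(\@header, 'L3');    # scalar context: the index
my $missing = index_of(\@header, 'L9');    # empty list becomes undef
my @none    = index_of(\@header, 'L9');    # list context: no elements

print defined($missing) ? "found\n" : "not found\n";
```

This is why sort_by_column can test the result with defined: index 0 is a legitimate answer, so a plain truth test would wrongly reject the first column.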

    Hope this helps,
    Greg
    --
    When I was a boy of fourteen, my father was so ignorant that I could hardly
    stand to have the old man around. But when I got to be twenty-one, I was
    astonished at how much he'd learned in seven years.
    -- Mark Twain
     
    Greg Bacon, Jun 29, 2003
    #6
  7. In article <>,
    Greg Bacon <> wrote:
    >(I might be setting a bad example. mjd, rightly IMHO, says using
    >$#ARRAY is a red flag[*]. The usage is correct in this case, but
    >do what I say, not what I do. :)
    >[*] http://groups.google.com/groups?selm=3bd70c54.b10$


    In that article, I said I thought I was going to add $#array as a red
    flag. I did add it to the class, but I did not accord it 'red flag'
    status. A 'red flag' is something that is almost always wrong. After
    doing a study, I concluded that although $#array is often wrong, it is
    not 'almost always wrong'.

    The details of the study are at
    http://perl.plover.com/yak/flags/dollar-pound/. Here is the short
    version. $#array is commonly used for five things:

    1. Generating a list of indices for an array. (Your example above is
    one of these; it is @{$m}[1..$#$m].)

    2. The upper bound of a C-style 'for' loop, as

    for ($i=0; $i <= $#array; $i++) {
        do something with $array[$i];
    }

    3. As a boundary check to see if a value is in the proper index range
    for an array. (2) could be considered a special case of this.
    Here's an example:

    if ($last_item >= $#list) {
        $Init_Disp_Limits->();
    }

    4. To pre-extend an array, as with

    $#array = $EXPECTED_NUMBER_OF_ITEMS;

    5. To access the last element of an array, as with $last = $array[$#array].

    In my judgement, all of the class (2) and (5) uses, and many of the
    class (3) uses, would have been better written some other way. For
    example, I think the example in (5) is obviously better as $last = $array[-1].

    Overall, about 20% of the uses of $#array would have been better off
    some other way. Class (1) did not seem to be in this 20%. I don't
    know any better way to write

    %hash = map { $array[$_] => $_ } 0 .. $#array;

    without the $#array, for example.
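
Two of the cases above, (5) and (1), are easy to try side by side. This is a small sketch with made-up labels, not part of the study; it assumes the array values are unique, since duplicate values would overwrite each other as hash keys.

```perl
use strict;
use warnings;

my @array = qw(L1 L2 L3 L4 L5);

# Class (5): negative indices count from the end, so these two agree.
my $last_a = $array[$#array];
my $last_b = $array[-1];

# Class (1): 0 .. $#array supplies the indices, and map pairs each
# value with its position.
my %index_of = map { $array[$_] => $_ } 0 .. $#array;

print "L5 is at index $index_of{L5}\n";
```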
     
    Mark Jason Dominus, Jun 30, 2003
    #7
  8. Guest

    (Greg Bacon) wrote:

    >In article <>,
    > <> wrote:
    >

    <snip really great code and explanations for all my beginner questions>
    >
    >Hope this helps,
    >Greg


    Thanks Greg for the really great Perl code and explanations on how it works. I
    really appreciate the time and effort you put in to teach me and all the others
    who are reading these posts. I have really learned a lot...much more than the
    Perl books I've been reading.

    Thanks again.

    Dennis
     
    , Jun 30, 2003
    #8
  9. Greg Bacon Guest

    In article <20030630132321.238$>,
    <> wrote:

    : (Mark Jason Dominus) wrote:
    :
    : > The details of the study are at
    : > http://perl.plover.com/yak/flags/dollar-pound/. Here is the short
    : > version. $#array is commonly used for five things:
    : >
    : > 1. Generating a list of indices for an array. (Your example above is
    : > one of these; it is @{$m}[1..$#$m].)
    :
    : I wish the ".." operator, when occurring in a slice, were sufficiently
    : magical to allow @{$m}[1..-1] to replace the above.

    Amen! Almost anything other than big ugly $#{...} dereferences would
    be nice.

    : > 2. The upper bound of a C-style 'for' loop, as
    : >
    : > for ($i=0; $i <= $#array; $i++) {
    : > do something with $array[$i];
    : > }
    :
    : I use this very frequently when I have parallel arrays. Of course,
    : that might not exactly fit in your criteria for inclusion in this
    : category. I also use this when I want to change the length of @array
    : during the loop.

    When I find myself constructing parallel arrays, I almost always merge
    them into arrays of either hashes or arrays.

    Greg
    --
    Laws that forbid the carrying of arms ... make things worse for the assaulted
    and better for the assailants; they serve rather to encourage than to prevent
    homicides, for an unarmed man may be attacked with greater confidence than an
    armed man. -- Thomas Jefferson
     
    Greg Bacon, Jun 30, 2003
    #9
  10. Guest

    Greg Bacon <> wrote:
    >
    > : > 2. The upper bound of a C-style 'for' loop, as
    > : >
    > : > for ($i=0; $i <= $#array; $i++) {
    > : > do something with $array[$i];
    > : > }
    > :
    > : I use this very frequently when I have parallel arrays. Of course,
    > : that might not exactly fit in your criteria for inclusion in this
    > : category. I also use this when I want to change the length of @array
    > : during the loop.
    >
    > When I find myself constructing parallel arrays, I almost always merge
    > them into arrays of either hashes or arrays.


    I do that sometimes, but I tend to not do it as much as I could for three
    reasons. $age[$id] and $sex[$id] take up much, much less room than
    $person[$id]{age} and $person[$id]{sex} if there are a lot of entries. The
    first way gives me compile time errors if I fat-finger "age" or "sex".
    With the first I can easily pass one compartment to general functions
    without using map: median(\@age) rather than
    median([map $_->{age}, @person]). Of course, the converse is that the
    parallel structure makes it harder to pass the whole structure around, but
    if I have to do much of that I tend to encompass it into a class anyway.
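
The median(\@age) comparison can be made concrete as follows. The median subroutine here is a hypothetical helper written for this illustration (the post does not show one), and the ages are made-up data.

```perl
use strict;
use warnings;

# Hypothetical median helper; takes a reference to an array of numbers.
sub median {
    my $values = shift;
    my @sorted = sort { $a <=> $b } @$values;
    my $n      = @sorted;          # element count
    my $mid    = int($n / 2);
    return $n % 2 ? $sorted[$mid]
                  : ($sorted[$mid - 1] + $sorted[$mid]) / 2;
}

# Parallel arrays: one array per field, indexed by the same id.
my @age = (31, 25, 40);

# Array of hashes: one record per id.
my @person = ({ age => 31 }, { age => 25 }, { age => 40 });

my $from_parallel = median(\@age);                           # pass directly
my $from_records  = median([ map { $_->{age} } @person ]);   # extract first

print "$from_parallel $from_records\n";
```

Under use strict, a misspelled @agee fails at compile time, whereas a misspelled hash key such as $_->{agee} quietly yields undef at run time; that is the "fat-finger" point made above.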

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    Usenet Newsgroup Service New Rate! $9.95/Month 50GB
     
    , Jul 1, 2003
    #10
  11. In article <20030630132321.238$>, <> wrote:
    >I wish the ".." operator, when occurring in a slice, were sufficiently
    >magical to allow @{$m}[1..-1] to replace the above.


    Someone, I think Simon Cozens, submitted a patch to allow

    @a[1..]

    to do this. But it wasn't accepted; I forget why not.

    >> 2. The upper bound of a C-style 'for' loop, as
    >>
    >> for ($i=0; $i <= $#array; $i++) {
    >> do something with $array[$i];
    >> }

    >
    >I use this very frequently when I have parallel arrays. Of course,
    >that might not exactly fit in your criteria for inclusion in this
    >category.


    It does. Ignoring the fact that parallel arrays are usually a sign of
    misdesign in the program, you can write the code above more simply
    and efficiently as:


    for $i (0 .. $#array) {
        do something with $array[$i];
    }

    >I also use this when I want to change the length of @array
    >during the loop.


    In such cases it makes perfect sense.
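
The difference between the two loop styles shows up once the array grows inside the loop. A minimal sketch with made-up data follows; the @queue countdown is only an illustration, not code from the thread.

```perl
use strict;
use warnings;

my @array = (10, 20, 30);

# foreach over an index range: the range is computed once, up front.
my @seen;
for my $i (0 .. $#array) {
    push @seen, "$i=$array[$i]";
}

# C-style loop: the bound $#queue is re-evaluated on every pass,
# so elements appended inside the loop are visited too.
my @queue = (3);
for (my $i = 0; $i <= $#queue; $i++) {
    push @queue, $queue[$i] - 1 if $queue[$i] > 0;
}

print "@seen\n";
print "@queue\n";
```

Had the second loop been written over 0 .. $#queue, it would have run exactly once, because the range would have been fixed before the first push.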
     
    Mark Jason Dominus, Jul 1, 2003
    #11