Numerically sort a file on a given column where column is a $var

joemacbusiness

Hi All,

I want a subroutine that will sort a file on any given column.
So I have a file like this:

300 400 500 600 700
33 2337483 482 78374 4567
10 20 30 40 50
1000 1001 1002 1003 1004
9 8 7 6 5

And I run my numericSortCol() routine on it sorting on
column 1 and should get this:

9 8 7 6 5
10 20 30 40 50
300 400 500 600 700
1000 1001 1002 1003 1004
33 2337483 482 78374 4567

The problem is that I cannot get bynum() to accept 2 args.
The code will work if I hard-code the $col in bynum, and
tweak the "problem line" a bit, but that defeats the purpose of the
"sort on column" feature.

Does anyone have a solution for this?

Thanks, --Joe M.

######################### Here's the code I have so far:
[joe@localhost work]$ cat test18.pl
#!/usr/bin/perl

my $infile = "test18.in";
open(F,"$infile") || print "cannot open $infile $!";
my @array = <F>;
close(F);

numericSortCol(1,\@array);

sub numericSortCol {
my $col = shift;
my $aref = shift;
foreach my $item (sort bynum($col,@{ $aref })){ # <<<< problem line??
print "item: $item\n";
}
}

sub bynum {
my $col = shift;
@a = split(/\s+/,$a);
@b = split(/\s+/,$b);
$a[$col] <=> $b[$col];
# $a[1] <=> $b[1];
}
[joe@localhost work]$
[joe@localhost work]$
######################### Here's the input file
[joe@localhost work]$ cat test18.in
300 400 500 600 700
33 2337483 482 78374 4567
10 20 30 40 50
1000 1001 1002 1003 1004
9 8 7 6 5
[joe@localhost work]$
[joe@localhost work]$
######################### Here's the runtime:
[joe@localhost work]$ perl test18.pl
item: 1
item: 9 8 7 6 5

item: 10 20 30 40 50

item: 33 2337483 482 78374 4567

item: 300 400 500 600 700

item: 1000 1001 1002 1003 1004

[joe@localhost work]$
 
Uri Guttman

j> The problem is that I cannot get the bynum() to accept 2 args.
j> The code will work if I hard-code the $col in bynum, and
j> tweak the "problem line" a bit but that defeats the purpose of the
j> "sort on column" feature.

j> open(F,"$infile") || print "cannot open $infile $!";

useless use of quotes on a scalar. they are not needed and can be a bug
(e.g. quoting a reference turns it into a plain string).
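
A minimal sketch of the same open without the extra quotes, assuming the goal is simply to read every line of the file. It uses the three-argument form of open, a lexical filehandle, and die rather than print, so the script stops instead of reading from an unopened handle. The name read_lines is hypothetical and not from the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# read_lines() is a hypothetical helper name used only for illustration.
sub read_lines {
    my ($path) = @_;

    # Three-argument open with a lexical filehandle; no quotes around $path.
    open( my $fh, '<', $path ) or die "cannot open $path: $!";
    my @lines = <$fh>;    # list context: one element per line, newlines kept
    close($fh);

    return @lines;
}
```

It could then be called as my @array = read_lines($infile); in place of the open/read/close lines in the original script.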

j> my @array = <F>;
j> close(F);

use File::Slurp ;
my @data = read_file( $infile ) ;

j> numericSortCol(1,\@array);

j> sub numericSortCol {
j> my $col = shift;
j> my $aref = shift;
j> foreach my $item (sort bynum($col,@{ $aref })){ # <<<< problem line??

well, that isn't how perl's sort works. it can take a sub name (you have
a sub call) or a code block.


and you should look at Sort::Maker which can do this for you
easily. just create a simple expression to get the column you want based
on $_. that could be something like:

(split( ' ', $_))[$col]

the rest i leave as an exercise.

uri
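
One possible way to use that key expression without Sort::Maker is a plain map/sort/map pipeline (often called a Schwartzian transform), which splits each line only once. This is a sketch under assumptions, not the Sort::Maker approach: the name sort_by_col is hypothetical, and $col is taken to be a zero-based column index as in the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# sort_by_col() is a hypothetical name; $col is a zero-based column index.
sub sort_by_col {
    my ( $col, @lines ) = @_;

    return
        map  { $_->[1] }                 # 3. unwrap the original line
        sort { $a->[0] <=> $b->[0] }     # 2. compare the cached numeric keys
        map  { [ ( split ' ', $_ )[$col], $_ ] }    # 1. pair each line with its key
        @lines;
}

my @data = (
    "300 400 500 600 700\n",
    "33 2337483 482 78374 4567\n",
    "10 20 30 40 50\n",
    "1000 1001 1002 1003 1004\n",
    "9 8 7 6 5\n",
);

print sort_by_col( 1, @data );
```

With the sample data and column 1 this prints the lines in the order the original post asks for (9 8 ..., 10 20 ..., 300 400 ..., 1000 1001 ..., 33 2337483 ...).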
 
Jens Thoms Toerring

j> I want a subroutine that will sort a file on any given column.
j> So I have a file like this:

j> 300 400 500 600 700
j> 33 2337483 482 78374 4567
j> 10 20 30 40 50
j> 1000 1001 1002 1003 1004
j> 9 8 7 6 5

j> And I run my numericSortCol() routine on it sorting on
j> column 1 and should get this:

j> 9 8 7 6 5
j> 10 20 30 40 50
j> 300 400 500 600 700
j> 1000 1001 1002 1003 1004
j> 33 2337483 482 78374 4567

j> The problem is that I cannot get the bynum() to accept 2 args.
j> The code will work if I hard-code the $col in bynum, and
j> tweak the "problem line" a bit but that defeats the purpose of the
j> "sort on column" feature.

j> ######################### Here's the code I have so far:
j> [joe@localhost work]$ cat test18.pl
j> #!/usr/bin/perl
j> my $infile = "test18.in";
j> open(F,"$infile") || print "cannot open $infile $!";
j> my @array = <F>;
j> close(F);

j> sub numericSortCol {
j> my $col = shift;
j> my $aref = shift;
j> foreach my $item (sort bynum($col,@{ $aref })){ # <<<< problem line??

Look again at the documentation for the sort function. It takes
either a block or a function (that itself expects two arguments)
and, as the second argument, a list to be sorted. Your use of
sort doesn't seem to fit that very well and I guess if you had
used 'use warnings;' you would have been told so...

j> print "item: $item\n";
j> }
j> }
j> sub bynum {
j> my $col = shift;
j> @a = split(/\s+/,$a);
j> @b = split(/\s+/,$b);
j> $a[$col] <=> $b[$col];
j> }

The simplest solution is probably not to use a function name when
calling sort but instead a block like this:

#!/usr/bin/perl

use strict;
use warnings;

my @array = <DATA>;
numericSortCol( 1, \@array );

sub numericSortCol {
    my ( $col, $aref ) = @_;

    print "item: $_"
        for sort { ( split /\s+/, $a )[ $col ] <=>
                   ( split /\s+/, $b )[ $col ] } @$aref;
}

__DATA__
300 400 500 600 700
33 2337483 482 78374 4567
10 20 30 40 50
1000 1001 1002 1003 1004
9 8 7 6 5
Regards, Jens
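
If a named comparison routine is still wanted, one option is to have a sub build the comparator as a closure over $col and hand the resulting code ref to sort, which accepts a scalar variable in the SUBNAME position. This is a sketch only: make_bynum is a hypothetical name, and it assumes the comparator is created and used in the same package so that both see the same $a and $b.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# make_bynum() is a hypothetical name. It returns a comparator that
# remembers $col, so sort can still call it with no arguments.
sub make_bynum {
    my ($col) = @_;
    return sub {
        ( split ' ', $a )[$col] <=> ( split ' ', $b )[$col];
    };
}

sub numericSortCol {
    my ( $col, $aref ) = @_;
    my $bynum = make_bynum($col);
    return sort $bynum @$aref;    # sort accepts a scalar holding a code ref
}

my @array = (
    "300 400 500 600 700\n",
    "33 2337483 482 78374 4567\n",
    "10 20 30 40 50\n",
    "1000 1001 1002 1003 1004\n",
    "9 8 7 6 5\n",
);

print "item: $_" for numericSortCol( 1, \@array );
```

Unlike the block version, this keeps the column choice out of the sort call itself, at the cost of one extra sub.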
 
Ted Zlatanov

On Thu, 17 Jul 2008 14:29:52 -0700 (PDT) (e-mail address removed) wrote:

j> I want a subroutine that will sort a file on any given column.
j> So I have a file like this:

j> 300 400 500 600 700
j> 33 2337483 482 78374 4567
j> 10 20 30 40 50
j> 1000 1001 1002 1003 1004
j> 9 8 7 6 5

j> And I run my numericSortCol() routine on it sorting on
j> column 1 and should get this:

j> 9 8 7 6 5
j> 10 20 30 40 50
j> 300 400 500 600 700
j> 1000 1001 1002 1003 1004
j> 33 2337483 482 78374 4567

Use the standard Unix utility `sort':

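sort -n +1 FILE
(the +1 form is obsolete and counts fields from zero; with a POSIX or
GNU sort use "-k 2" instead, since -k counts fields from one)

e.g. sort /etc/passwd by user ID:

sort -t: -n +2 /etc/passwd
(or "sort -t: -n -k 3 /etc/passwd" with the -k form)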

Read the docs for `sort' for further information.

You can use Perl for this (as others have shown), but if all you want is
to sort a file by a simple key, `sort' will do just fine. It's also
likely to be much faster in most circumstances.

Ted
 
