multi-field array sort using Sort::Fields method

Domenico Discepola

Hello all. My goal is to be able to perform a "multi-field sort on a
multidimensional array". Having read many posts in the newsgroups, I was
unable to find a "straight" answer to this problem. I therefore came up
with this method. My question is, is there a more efficient solution to
this problem or is my method acceptable? Is there another CPAN module that
can be used? I welcome all opinions.

#!perl
use strict;
use warnings;
use Sort::Fields;

my ( @arr01, @arr02, $r, $c, $string, $aref, $delim, @arr_temp,
     @arr_final );

@arr01 = ( [1, 'a', 'dom'],
           [5, 'g', 'brad'],
           [3, 'd', 'jamie'],
           [7, 'd', 'abigail'] );

$delim = "\t";
$string = "";

#Step 1 - combine fields into 1 string so that Sort::Fields will work
for $r ( 0 .. $#arr01 ) {
    $aref = $arr01[$r];
    for $c ( 0 .. $#{$aref} ) {
        if ( $c == $#$aref ) {
            $string = $string . $arr01[$r][$c];
        } else {
            $string = $string . $arr01[$r][$c] . $delim;
        }
    }
    push @arr02, $string;
    $string = "";
}

#Step 2 - sort by field 2, then 3
my @sorted = fieldsort '\t', [2,3], @arr02;

#Step 3 - split sorted strings into multidim array
foreach my $el ( @sorted ) {
    @arr_temp = split $delim, $el;
    push @arr_final, [@arr_temp];
}

#Step 4 - final output
print join('|', @$_), "\n" for @arr_final;
 
Paul Lalli

You're going backwards. Sort::Fields is what you use when you don't have
multi-dimensional arrays, but rather arrays of delimited strings. If you
have multi-dimensional arrays, you just use 'sort':


#!perl.exe
use strict;
use warnings;

my @arr01 = (
    [1, 'a', 'dom'],
    [5, 'g', 'brad'],
    [3, 'd', 'jamie'],
    [7, 'd', 'abigail']
);

my @sorted = sort {
    $a->[1] cmp $b->[1]    #sort ASCIIbetically by 2nd field
        or
    $a->[2] cmp $b->[2]    #if same, sort ASCIIbetically by 3rd field
} @arr01;

foreach (@sorted) {
    print "@$_\n";
}

__END__


For more information, please see
perldoc -f sort
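
One thing the example above does not show: the first field holds numbers, and those want <=> rather than cmp. Here is a minimal sketch of the difference. The rows are made up for illustration (they are not the data from this thread), chosen so the two operators disagree.

```perl
use strict;
use warnings;

# Hypothetical rows, chosen so that string and numeric order differ.
my @rows = (
    [10,  'a', 'dom'],
    [9,   'g', 'brad'],
    [100, 'd', 'jamie'],
);

# cmp compares character by character, so "10" and "100" come before "9".
my @as_strings = sort { $a->[0] cmp $b->[0] } @rows;

# <=> compares by numeric value.
my @as_numbers = sort { $a->[0] <=> $b->[0] } @rows;

print "cmp: ", join(' ', map { $_->[0] } @as_strings), "\n";
print "<=>: ", join(' ', map { $_->[0] } @as_numbers), "\n";
```

The two operators can be mixed freely inside one sort block, one per field, joined with "or".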

Hope this helps,
Paul Lalli
 
Domenico Discepola

Thanks for your concrete example - it helps while trying to understand the
existing documentation. Although the example you provided works, I'm
wondering if there is a CPAN module which provides a more "elegant"
interface to a multi-field, multidimensional array sort. What I have in
mind is an interface similar to Sort::Fields:

@arr_order = ( [2, 'a'], [-1,'n'] );

@sorted = sort_module( $arr_input, \@arr_order );

This would translate as: sort what's in $arr_input alphabetically by
position 2, then numerically reversed by position 1.

Input parameter 1 is the array to be sorted, parameter 2 is an array
(@arr_order) containing the index position of the array columns we want to
sort on, along with an alphabetic (a) or numeric (n) sort type. A negative
sign indicates a reverse sort. The benefit of this is to be able to call 1
function that can be passed parameters dynamically (as opposed to
dynamically modifying code with $a and $b, cmp, <=>, etc.).
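
For reference, here is what that ordering looks like written by hand as a plain sort block. This is only a sketch: it assumes the positions in the spec are 1-based (as in Sort::Fields) and reuses the sample rows from earlier in the thread.

```perl
use strict;
use warnings;

my @arr_input = (
    [1, 'a', 'dom'],
    [5, 'g', 'brad'],
    [3, 'd', 'jamie'],
    [7, 'd', 'abigail'],
);

# ( [2, 'a'], [-1, 'n'] ) spelled out:
my @sorted = sort {
    $a->[1] cmp $b->[1]    # position 2, alphabetic, ascending
        or
    $b->[0] <=> $a->[0]    # position 1, numeric, reversed ($a and $b swapped)
} @arr_input;

print "@$_\n" for @sorted;
```

The two 'd' rows tie on position 2, so the reversed numeric comparison puts 7 ahead of 3.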
 
Paul Lalli

I'm not especially sure I agree that your method would be more 'elegant'.
Really what you've suggested is to write a wrapper around sort() that
would restrict its functionality. I'm not convinced that's a good idea.
(I'm also not sure I understand what you mean by your last sentence - $a
and $b are two predefined variables common to every single sort
subroutine, and cmp vs <=> simply means asciibettical vs numerical)

Disregarding all that, however, it probably wouldn't be too hard to write
such a wrapper:

use strict;
use warnings;

my ($array, $config); #'global', because will be used in two functions

sub sort_module(\@\@){
    ($array, $config) = @_;
    #error checking here to make sure arrays are what you want
    sort with_elegance (@$array);
}

sub with_elegance{
    my $return;
    foreach my $dimension (@$config){
        my $pos = (abs $$dimension[0]) - 1;
        my $compare;
        if ($$dimension[1] eq 'a'){
            $compare = 'cmp';
        } elsif($$dimension[1] eq 'n') {
            $compare = '<=>';
        } else {
            die "Invalid comparison marker: $$dimension[1] ".
                "(only 'a' and 'n' allowed)\n";
        }
        if ($$dimension[0] >= 0){
            eval '$return = $$a[$pos] '.$compare.' $$b[$pos]';
        } else {
            eval '$return = $$b[$pos] '.$compare.' $$a[$pos]';
        }
        last unless $return == 0;
    }
    return $return;
}

my @arr_input = ( [1, 'a', 'dom'],
                  [5, 'g', 'brad'],
                  [3, 'd', 'jamie'],
                  [7, 'd', 'abigail']
);

my @arr_order = ( [2, 'a'], [-1,'n'] );
my @sorted = sort_module( @arr_input, @arr_order );

foreach (@sorted) {
    print "@$_\n";
}

__END__


Give that a shot and see if it does what you want. Note that I whipped
this up in just a few moments, and it shows. Among the things that should
probably be fixed are the use of global variables, die() should become
carp or croak if this were put into an actual module, and the use of the
eval function. Not to mention there's probably more than a few ways to
optimize it....
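
One possible cleanup along those lines, as a rough sketch rather than a finished module: build the comparison once as a closure, so there are no globals and no string eval, and report bad input with croak. The helper name make_comparator is made up for this example, and unlike the version above this one takes explicit array references instead of using a prototype.

```perl
use strict;
use warnings;
use Carp qw(croak);

# make_comparator is a hypothetical helper name, not part of any CPAN module.
# It turns ( [2, 'a'], [-1, 'n'] ) into a code ref that compares two rows.
sub make_comparator {
    my @keys;
    for my $dimension (@_) {
        my ($field, $type) = @$dimension;
        croak "Field position must be a non-zero integer: $field"
            unless $field =~ /^-?[1-9]\d*$/;
        croak "Invalid comparison marker: $type (only 'a' and 'n' allowed)"
            unless $type eq 'a' or $type eq 'n';
        my $cmp = $type eq 'a'
            ? sub { $_[0] cmp $_[1] }
            : sub { $_[0] <=> $_[1] };
        # Store the 0-based index, the comparison, and the reverse flag.
        push @keys, [ abs($field) - 1, $cmp, $field < 0 ];
    }
    return sub {
        my ($x, $y) = @_;
        for my $key (@keys) {
            my ($pos, $cmp, $reverse) = @$key;
            my $result = $cmp->( $x->[$pos], $y->[$pos] );
            $result = -$result if $reverse;
            return $result if $result;    # first non-tie decides
        }
        return 0;
    };
}

sub sort_module {
    my ($array, $config) = @_;
    my $compare = make_comparator(@$config);
    return sort { $compare->($a, $b) } @$array;
}

my @arr_input = (
    [1, 'a', 'dom'],
    [5, 'g', 'brad'],
    [3, 'd', 'jamie'],
    [7, 'd', 'abigail'],
);
my @arr_order = ( [2, 'a'], [-1, 'n'] );
my @sorted = sort_module( \@arr_input, \@arr_order );

print "@$_\n" for @sorted;
```

The spec is validated once, before sorting starts, instead of on every comparison.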

Paul Lalli
 
Paul Lalli

And of course, I did things bass-ackwards and coded before checking CPAN.
I wonder if this would suffice for what you're looking for:
http://search.cpan.org/~evo/Data-Sorting-0.9/Sorting.pm

It doesn't have quite the same interface you wanted, but it's similar.

Paul Lalli
 
Anno Siegel

Domenico Discepola said:
DD> Hello all. My goal is to be able to perform a "multi-field sort on a
DD> multidimensional array". Having read many posts in the newsgroups, I was
DD> unable to find a "straight" answer to this problem. I therefore came up
DD> with this method. My question is, is there a more efficient solution to
DD> this problem or is my method acceptable? Is there another CPAN module that
DD> can be used? I welcome all opinions.

Okay, the code is a bit pointless, because sorting an array of arrays via
Sort::Fields is backwards. Still, a few remarks:

DD> #!perl
DD> use strict;
DD> use warnings;

Good.

DD> use Sort::Fields;
DD>
DD> my ( @arr01, @arr02, $r, $c, $string, $aref, $delim, @arr_temp,
DD> @arr_final );

It is usually preferred to declare variables where they first appear.
Some languages don't support this and force you to lump all declarations
together. That doesn't mean it's good style.

DD> @arr01 = ( [1, 'a', 'dom'],
DD> [5, 'g', 'brad'],
DD> [3, 'd', 'jamie'],
DD> [7, 'd', 'abigail']);
DD>
DD> $delim = "\t";
DD> $string = "";
DD>
DD> #Step 1 - combine fields into 1 string so that Sort::Fields will work
DD> for $r ( 0 .. $#arr01 ) {
DD> $aref = $arr01[$r];

You don't need the index ($r) in the loop, so you could have iterated
over @arr01 directly:

    for my $aref ( @arr01 ) {

DD> for $c ( 0 .. $#{$aref} ) {
DD> if ( $c == $#$aref ) {
DD> $string = $string . $arr01[$r][$c];
DD> } else {
DD> $string = $string . $arr01[$r][$c]. $delim ;
DD> }
DD> }
DD> push @arr02, $string;

Uh, oh! Perl has a function for what that loop does, it's called join:

    push @arr02, join $delim, @$aref;

...does exactly the same thing.

DD> $string = "";

If you had declared $string inside the loop body you wouldn't have to
worry about clearing it. "my" does that at run time.

Your outer loop does nothing but collect the results of a calculation
into a list. Again, Perl has a function for that, called "map":

    my @arr02 = map join( $delim, @$_) => @arr01;

is a more idiomatic replacement for your code to this point.

DD> }
DD>
DD> #Step 2 - sort by field 2, then 3
DD> my @sorted = fieldsort '\t', [2,3], @arr02;
DD>
DD> #Step 3 - split sorted strings into multidim array
DD> foreach my $el ( @sorted ) {

Ah, good. An index-free loop where no index is needed.

DD> @arr_temp = split $delim, $el;
DD> push @arr_final, [@arr_temp];
DD> }

Why the intermediate @arr_temp? "[ split $delim, $el]" works as well.
The loop could again be replaced by map:

    @arr_final = map [ split $delim, $_], @sorted;

DD> #Step 4 - final output
DD> print join('|', @$_), "\n" for @arr_final;

Hey, you know about "join". Why did you re-invent it up there? In
general, the second half of your code looks smoother than the part
before "fieldsort ...".

Having reduced all of the loops to map, the whole thing can be written
as one statement (untested, as is the above):

print join( '|', @$_), "\n" for
map [ split $delim, $_] =>
fieldsort '\t', [2,3] =>
map join( $delim, @$_) => @arr01;

This is not everyone's idea of good style, but I think in this case it's
quite readable.
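
If the plain sort approach suggested elsewhere in this thread is used instead of fieldsort, the join and split stages drop out of that chain altogether. A small sketch of what is left, using only core Perl (this swaps Sort::Fields for the built-in sort, so it is not the statement above):

```perl
use strict;
use warnings;

my @arr01 = (
    [1, 'a', 'dom'],
    [5, 'g', 'brad'],
    [3, 'd', 'jamie'],
    [7, 'd', 'abigail'],
);

# Read right to left: sort the rows by field 2 then field 3,
# then turn each sorted row into a '|'-separated line.
my @lines =
    map  { join '|', @$_ }
    sort { $a->[1] cmp $b->[1] or $a->[2] cmp $b->[2] }
    @arr01;

print "$_\n" for @lines;
```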

Anno
 
Uri Guttman

PL> I'm not especially sure I agree that your method would be more 'elegant'.
PL> Really what you've suggested is to write a wrapper around sort() that
PL> would restrict its functionality. I'm not convinced that's a good idea.
PL> (I'm also not sure I understand what you mean by your last sentence - $a
PL> and $b are two predefined variables common to every single sort
PL> subroutine, and cmp vs <=> simply means asciibettical vs numerical)

PL> Disregarding all that, however, it probably wouldn't be too hard to write
PL> such a wrapper:


PL> sub with_elegance{
PL> my $return;
PL> foreach my $dimension (@$config){
PL> my $pos = (abs $$dimension[0]) - 1;
PL> my $compare;
PL> if ($$dimension[1] eq 'a'){
PL> $compare = 'cmp';
PL> } elsif($$dimension[1] eq 'n') {
PL> $compare = '<=>';
PL> } else {
PL> die "Invalid comparison marker: $$dimension[1] ".
PL> "(only 'a' and 'n' allowed\n";
PL> }
PL> if ($$dimension[0] >= 0){
PL> eval '$return = $$a[$pos] '.$compare.' $$b[$pos]';
PL> } else {
PL> eval '$return = $$b[$pos] '.$compare.' $$a[$pos]';
PL> }
PL> last unless $return == 0;
PL> }
PL> return $return;
PL> }

PL> Give that a shot and see if it does what you want. Note that I whipped
PL> this up in just a few moments, and it shows. Among the things that should
PL> probably be fixed are the use of global variables, die() should become
PL> carp or croak if this were put into an actual module, and the use of the
PL> eval function. Not to mention there's probably more than a few ways to
PL> optimize it....

wait for Sort::Maker which is mostly developed and will do all that and
more and much faster and with a better api. i expect to cpan the .01
version in mid june before yapc. i could let people have beta copies
earlier than that if anyone wants it.

more on this soon as i do some more pod editing. i will post that soon
as it is presentable

uri
 
