comparing a 2D array

Discussion in 'Perl Misc' started by Rose, Mar 13, 2008.

  1. Rose

    Rose Guest

    For the following 2D array comparison codes modified from perllol google
    search, I wonder why the output generated is not what I expected. Could
    anybody tell me what i should modify in order to have an exact match of all
    the attributes of a row from an individual file? Thanks a lot~



    #!/usr/bin/perl

    use warnings;

    $usage='prog t1 t2 cmpname';
    die "Usage: $usage\n" if $#ARGV < 1;

    $outname = $ARGV[2];
    $file1 = $ARGV[0];
    $file2 = $ARGV[1];
    $cmpcout = $outname . ".cmpofcmp.xls";

    open(FP1, $file1);
    open(FP2, $file2);
    open(CMP, ">$cmpcout");

    $line = <FP1>; #get header
    while ($line ne "") {
    $line = <FP1>;
    @attr1 = split(/[\t ]+/, $line);
    push @row1, [ @attr1 ];
    }

    $line = <FP2>; #get header
    while ($line ne "") {
    $line = <FP2>;
    @attr2 = split(/[\t ]+/, $line);
    push @row2, [ @attr2 ];
    }

    for $aref1 (@row1) {
    for $aref2 (@row2) {
    if (@$aref1 == @$aref2) {
    print "@$aref1\n";
    }
    }
    }

    ========================
    File t1:
    a1 a2 a3
    1 AAAA 99
    0 TT 88
    99 what -888
    File t2:
    a1 a2 a3
    1 AAAAA 99
    0 TCCT 88
    -10 TT 88
    99 what 888

    ==========================
    Output:
    1 AAAA 99

    1 AAAA 99

    1 AAAA 99

    1 AAAA 99

    0 TT 88

    0 TT 88

    0 TT 88

    0 TT 88

    99 what -888

    99 what -888

    99 what -888

    99 what -888
     
    Rose, Mar 13, 2008
    #1
    1. Advertising

  2. Rose

    J. Gleixner Guest

    Rose wrote:
    > For the following 2D array comparison codes modified from perllol google
    > search, I wonder why the output generated is not what I expected. Could
    > anybody tell me what i should modify in order to have an exact match of all
    > the attributes of a row from an individual file? Thanks a lot~
    >
    >
    >
    > #!/usr/bin/perl
    >
    > use warnings;


    Forgot:

    use strict;

    >
    > $usage='prog t1 t2 cmpname';
    > die "Usage: $usage\n" if $#ARGV < 1;
    >
    > $outname = $ARGV[2];
    > $file1 = $ARGV[0];
    > $file2 = $ARGV[1];
    > $cmpcout = $outname . ".cmpofcmp.xls";
    >
    > open(FP1, $file1);
    > open(FP2, $file2);
    > open(CMP, ">$cmpcout");


    Check that they were successful.

    >
    > $line = <FP1>; #get header
    > while ($line ne "") {
    > $line = <FP1>;
    > @attr1 = split(/[\t ]+/, $line);

    No need for @attr1, just put the split in the [ ], below.
    > push @row1, [ @attr1 ];
    > }
    >
    > $line = <FP2>; #get header
    > while ($line ne "") {
    > $line = <FP2>;
    > @attr2 = split(/[\t ]+/, $line);
    > push @row2, [ @attr2 ];
    > }
    >
    > for $aref1 (@row1) {
    > for $aref2 (@row2) {
    > if (@$aref1 == @$aref2) {


    perldoc -q "How do I test whether two arrays or hashes are equal"

    > print "@$aref1\n";
    > }
    > }
    > }


    [...]
     
    J. Gleixner, Mar 13, 2008
    #2
    1. Advertising

  3. Rose

    Rose Guest

    "J. Gleixner" <> wrote in message
    news:47d988ba$0$89867$...

    >> $line = <FP1>; #get header
    >> while ($line ne "") {
    >> $line = <FP1>;
    >> @attr1 = split(/[\t ]+/, $line);

    > No need for @attr1, just put the split in the [ ], below.
    >> push @row1, [ @attr1 ];
    >> }


    put the split in the []? I don't quite get what you mean....

    After modifying the codes in perlfaq as:

    for ($i = 0; $i < @$aref1; $i++) {
    if ($aref1->[$i] ne $aref2->[$i]){
    $match = 0;
    last;
    }
    }

    if ($match == 1) {
    print "@$aref1\n";
    }
    }

    this time it successfully finds the matches, but it prints out extra new
    lines, why does that happen?
     
    Rose, Mar 13, 2008
    #3
  4. Rose

    Rose Guest

    > Untested code:
    >
    > LOOP:
    > my $equal = 1;
    > foreach my $row1 (@row1) {
    > foreach my $row2 (@row2) {
    > unless (@$row1 == @$row2) {
    > $equal = 0;
    > last LOOP;
    > }
    > for (my $i = 0; $i < @$row1; $i ++) {
    > my $elem1 = $$row1 [$i];
    > my $elem2 = $$row2 [$i];
    > unless ($elem1 eq $elem2) {
    > $equal = 0;
    > last LOOP;
    > }
    > }
    > }
    > }
    >
    > say "Arrays are equal" if $equal;
    >
    >
    >
    > Abigail
    > --
    > A perl rose: perl -e '@}>-`-,-`-%-'



    Thanks, Abigail. But it seems that your untested code is testing the whole
    array of array, but not individual rows, maybe the last code should enter
    after the last but second curly bracket.
     
    Rose, Mar 13, 2008
    #4
  5. Rose wrote:
    > "J. Gleixner" <> wrote in message
    > news:47d988ba$0$89867$...
    >
    >>> $line = <FP1>; #get header
    >>> while ($line ne "") {
    >>> $line = <FP1>;
    >>> @attr1 = split(/[\t ]+/, $line);

    >> No need for @attr1, just put the split in the [ ], below.
    >>> push @row1, [ @attr1 ];
    >>> }

    >
    > put the split in the []? I don't quite get what you mean....


    push @row1, [ split /[\t ]+/, $line ];


    John
    --
    Perl isn't a toolbox, but a small machine shop where you
    can special-order certain sorts of tools at low cost and
    in short order. -- Larry Wall
     
    John W. Krahn, Mar 13, 2008
    #5
  6. Rose

    J. Gleixner Guest

    Rose wrote:
    > "J. Gleixner" <> wrote in message
    > news:47d988ba$0$89867$...
    >
    >>> $line = <FP1>; #get header
    >>> while ($line ne "") {
    >>> $line = <FP1>;

    [...]
    > this time it successfully finds the matches, but it prints out extra new
    > lines, why does that happen?


    Because you're not removing it from your input.

    perldoc -f chomp

    Eithe chomp() or remove the "\n" from your print.
     
    J. Gleixner, Mar 13, 2008
    #6
  7. Rose

    Rose Guest

    After taking your advice, the codes to compare 2 files have been modified
    (the followings), the problem is that I am unable to attach a tag to a
    particular row. I once try using

    for $aref2 (@row2) {
    $f1m[$aref2] = 0;
    }

    instead of $f1m[$i++] = 0; but this does not help because I find have to
    label rows from both files in order to print out two lists: one for the same
    and the other for the difference. Is there any good way to achieve that?


    #!/usr/bin/perl

    #use warnings;

    $usage='prog outname1.xls outname2.xls';
    die "Usage: $usage\n" if $#ARGV < 1;

    $outname = $ARGV[2];
    $file1 = $ARGV[0];
    $file2 = $ARGV[1];
    $cmpsout = $outname . ".cmpofcmpsame.xls";
    $cmpdout = $outname . ".cmpofcmpdiff.xls";

    open(FP1, $file1);
    open(FP2, $file2);
    open(CMPS, ">$cmpsout");
    open(CMPD, ">$cmpdout");

    $line = <FP1>; #get header
    while ($line ne "") {
    $line = <FP1>;
    push @row1, [ split(/[\t ]+/, $line) ];
    }

    $line = <FP2>; #get header
    while ($line ne "") {
    $line = <FP2>;
    push @row2, [ split(/[\t ]+/, $line) ];
    }

    $i=0;
    for $aref2 (@row2) {
    $f1m[$i++] = 0;
    }

    print CMPD "$file1 statistics\n";
    for $aref1 (@row1) {
    for $aref2 (@row2) {

    $match = 1;

    for ($i = 0; $i < @$aref1; $i++) {
    if ($aref1->[$i] ne $aref2->[$i]) {
    $match = 0;
    last;
    }
    }

    if ($match == 1) {
    print CMPS "@$aref1";
    $flm[$aref2]=1;
    } else {
    print CMPD "@$aref1";
    }
    }
    }

    print CMPD "\n$file2 statistics\n";
    $i=0;
    for $aref2 (@row2) {
    if ($f1m[$i] == 0) {
    print CMPD "@$aref2";
    }
    $i++;
    }

    #########
    File 1:

    a1 a2 a3
    1 AAAAA 99
    0 TCCT 88
    99 what -888

    File 2:
    a1 a2 a3
    1 AAAAA 99
    0 TCCT 88
    -10 TT 88
    99 what 888
     
    Rose, Mar 14, 2008
    #7
  8. Rose <> wrote:
    > After taking your advice, the codes to compare 2 files have been modified
    > (the followings), the problem is that I am unable to attach a tag to a
    > particular row. I once try using
    >
    > for $aref2 (@row2) {
    > $f1m[$aref2] = 0;
    > }
    >
    > instead of $f1m[$i++] = 0; but this does not help because I find have to
    > label rows from both files in order to print out two lists: one for the same
    > and the other for the difference. Is there any good way to achieve that?
    >
    >
    > #!/usr/bin/perl
    >
    > #use warnings;



    You lose the benfits of warnings when you comment out the
    use warnings pragma.

    You should also be using the

    use strict;

    pragma as well.


    > $usage='prog outname1.xls outname2.xls';
    > die "Usage: $usage\n" if $#ARGV < 1;



    If you want to require exactly two command line arguments, then

    die "Usage: $usage\n" unless @ARGV == 2;

    says it much more clearly.


    > $outname = $ARGV[2];
    > $file1 = $ARGV[0];
    > $file2 = $ARGV[1];



    Err, your Usage message says that it should be called with 2 arguments,
    so why are you expecting 3 arguments?

    If your usage message is wrong, and you really do want those 3
    variables loaded from the command line arguments, then

    my($file1, $file2, $outname) = @ARGV;

    will do that.


    > $cmpsout = $outname . ".cmpofcmpsame.xls";
    > $cmpdout = $outname . ".cmpofcmpdiff.xls";



    Using variable names that differ by a single interior character is
    a very bad idea.

    Interpolation is nearly always easier to read and understand
    than explicit concatenations.

    my $same_out = "$outname.cmpofcmpsame.xls";
    my $diff_out = "$outname.cmpofcmpdiff.xls";


    > open(FP1, $file1);



    You should always, yes *always*, check the return value from open():

    open(FP1, $file1) or die "could not open '$file1' $!";

    Even better, use a lexical filehandle and the 3-arg form of open():

    open( my $FP1, '<', $file1) or die "could not open '$file1' $!";


    > open(CMPS, ">$cmpsout");


    open my $SAME, '>', $same_out or die "could not open '$same_out' $!";


    > $line = <FP1>; #get header


    If you plan to discard the line anyway, then there is not much
    point in saving it in a variable.

    <FP1>; # discard header


    > while ($line ne "") {
    > $line = <FP1>;



    That is a truly horrid way of reading lines from a file.

    while ( $line = <FP1> ) }

    even better, with a lexical filehandle as above:

    while ( $line = <$FP1> ) }


    > for ($i = 0; $i < @$aref1; $i++) {



    foreach my $i ( 0 .. $#$aref1 ) { # less error-prone, does the same thing


    --
    Tad McClellan
    email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
     
    Tad J McClellan, Mar 14, 2008
    #8
  9. Rose

    Rose Guest

    "Tad J McClellan" <> wrote in message
    news:...
    > Rose <> wrote:
    >> After taking your advice, the codes to compare 2 files have been modified
    >> (the followings), the problem is that I am unable to attach a tag to a
    >> particular row. I once try using
    >>
    >> for $aref2 (@row2) {
    >> $f1m[$aref2] = 0;
    >> }
    >>
    >> instead of $f1m[$i++] = 0; but this does not help because I find have to
    >> label rows from both files in order to print out two lists: one for the
    >> same
    >> and the other for the difference. Is there any good way to achieve that?
    >>
    >>
    >> #!/usr/bin/perl
    >>
    >> #use warnings;

    >
    >
    > You lose the benfits of warnings when you comment out the
    > use warnings pragma.
    >
    > You should also be using the
    >
    > use strict;
    >
    > pragma as well.
    >
    >
    >> $usage='prog outname1.xls outname2.xls';
    >> die "Usage: $usage\n" if $#ARGV < 1;

    >
    >
    > If you want to require exactly two command line arguments, then
    >
    > die "Usage: $usage\n" unless @ARGV == 2;
    >
    > says it much more clearly.
    >
    >
    >> $outname = $ARGV[2];
    >> $file1 = $ARGV[0];
    >> $file2 = $ARGV[1];

    >
    >
    > Err, your Usage message says that it should be called with 2 arguments,
    > so why are you expecting 3 arguments?
    >
    > If your usage message is wrong, and you really do want those 3
    > variables loaded from the command line arguments, then
    >
    > my($file1, $file2, $outname) = @ARGV;
    >
    > will do that.
    >
    >
    >> $cmpsout = $outname . ".cmpofcmpsame.xls";
    >> $cmpdout = $outname . ".cmpofcmpdiff.xls";

    >
    >
    > Using variable names that differ by a single interior character is
    > a very bad idea.
    >
    > Interpolation is nearly always easier to read and understand
    > than explicit concatenations.
    >
    > my $same_out = "$outname.cmpofcmpsame.xls";
    > my $diff_out = "$outname.cmpofcmpdiff.xls";
    >
    >
    >> open(FP1, $file1);

    >
    >
    > You should always, yes *always*, check the return value from open():
    >
    > open(FP1, $file1) or die "could not open '$file1' $!";
    >
    > Even better, use a lexical filehandle and the 3-arg form of open():
    >
    > open( my $FP1, '<', $file1) or die "could not open '$file1' $!";
    >
    >
    >> open(CMPS, ">$cmpsout");

    >
    > open my $SAME, '>', $same_out or die "could not open '$same_out' $!";
    >
    >
    >> $line = <FP1>; #get header

    >
    > If you plan to discard the line anyway, then there is not much
    > point in saving it in a variable.
    >
    > <FP1>; # discard header
    >
    >
    >> while ($line ne "") {
    >> $line = <FP1>;

    >
    >
    > That is a truly horrid way of reading lines from a file.
    >
    > while ( $line = <FP1> ) }
    >
    > even better, with a lexical filehandle as above:
    >
    > while ( $line = <$FP1> ) }
    >
    >
    >> for ($i = 0; $i < @$aref1; $i++) {

    >
    >
    > foreach my $i ( 0 .. $#$aref1 ) { # less error-prone, does the same
    > thing
    >
    >
    > --
    > Tad McClellan
    > email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"


    Dear Mr. Tad McClellan,

    Thanks a lot for pointing out all the problems I have! ;)
     
    Rose, Mar 14, 2008
    #9
  10. Rose

    szr Guest

    Abigail wrote:
    > _
    > Rose () wrote on VCCCVIII September MCMXCIII in
    > <URL:news:frc0n7$ror$>:
    > ?? For the following 2D array comparison codes modified from perllol
    > google ?? search, I wonder why the output generated is not what I
    > expected. Could ?? anybody tell me what i should modify in order to
    > have an exact match of all ?? the attributes of a row from an
    > individual file? Thanks a lot~ ??
    > ??
    > ??
    > ?? #!/usr/bin/perl
    > ??
    > ?? use warnings;
    > ??
    > ?? $usage='prog t1 t2 cmpname';
    > ?? die "Usage: $usage\n" if $#ARGV < 1;
    > ??
    > ?? $outname = $ARGV[2];
    > ?? $file1 = $ARGV[0];
    > ?? $file2 = $ARGV[1];
    > ?? $cmpcout = $outname . ".cmpofcmp.xls";
    > ??
    > ?? open(FP1, $file1);
    > ?? open(FP2, $file2);
    > ?? open(CMP, ">$cmpcout");
    >
    > You are blindly assuming that the open will succeed. What if an input
    > file doesn't exist, or if you don't have permission to open the output
    > file?
    >
    > ??
    > ?? $line = <FP1>; #get header
    > ?? while ($line ne "") {
    >
    > Eh, did you by any chance follow a Perl course organized by HP
    > education?
    > They have the only books I've encountered that don't use the
    > canonical way
    > of iterating over a file:
    >
    > while ($line = <FP1>) {
    > ...
    > }
    >
    > But I wouldn't use bare file handles any more. With any Perl from the
    > current century, you could write:
    >
    > open my $fh, '<', $file or die "open: $!";
    > while (my $line = <$fh>) {
    > # Do something with $line
    > }
    > close $fh or die "close: $!";
    >
    > ?? $line = <FP1>;
    > ?? @attr1 = split(/[\t ]+/, $line);
    > ?? push @row1, [ @attr1 ];
    > ?? }
    >
    >
    > ??
    > ?? $line = <FP2>; #get header
    > ?? while ($line ne "") {
    > ?? $line = <FP2>;
    > ?? @attr2 = split(/[\t ]+/, $line);
    > ?? push @row2, [ @attr2 ];
    > ?? }
    > ??
    > ?? for $aref1 (@row1) {
    > ?? for $aref2 (@row2) {
    > ?? if (@$aref1 == @$aref2) {
    > ?? print "@$aref1\n";
    >
    > This just compares whether $aref1 and $aref2 point to arrays of the
    > same length.
    >
    > You should compare each element individually.
    >
    > ?? }
    > ?? }
    > ?? }
    >
    > Untested code:
    >
    > LOOP:
    > my $equal = 1;
    > foreach my $row1 (@row1) {
    > foreach my $row2 (@row2) {
    > unless (@$row1 == @$row2) {
    > $equal = 0;
    > last LOOP;
    > }
    > for (my $i = 0; $i < @$row1; $i ++) {
    > my $elem1 = $$row1 [$i];
    > my $elem2 = $$row2 [$i];
    > unless ($elem1 eq $elem2) {
    > $equal = 0;
    > last LOOP;
    > }
    > }
    > }
    > }
    >
    > say "Arrays are equal" if $equal;
    >
    >
    >
    > Abigail
    > --
    > A perl rose: perl -e '@}>-`-,-`-%-'



    How about this for comparign a "2D" array?

    use strict;

    my @a1 = ( [qw/A B C/], [qw/D E F/], [qw/G H I/] );
    my @a2 = ( [qw/A B C/], [qw/D E F/], [qw/G H I/] );

    my $matches =
    @a1 == grep { my $i = $_;
    @{$a1[$i]} == grep {
    $a1[$i][$_] eq $a2[$i][$_];
    } (0..$#{$a1[$i]})
    } (0..$#a1);

    print $matches ? 'Yep' : 'Nope';


    $match is true if ALL the elements match, false if not.


    This can easily be expanded to handle more "dimentions" as well:

    use strict;

    my @a1 = ( [ [qw/A B C/], [qw/D E F/] ], [ [qw/G H I/], [qw/J K
    L/] ] );
    my @a2 = ( [ [qw/A B C/], [qw/D E F/] ], [ [qw/G H I/], [qw/J K
    L/] ] );

    my $matches =
    @a1 == grep { my $i = $_;
    @{$a1[$i]} == grep { my $j = $_;
    @{$a1[$i][$j]} == grep {
    $a1[$i][$j][$_] eq $a2[$i][$j][$_];
    } (0..$#{$a1[$i][$j]})
    } (0..$#{$a1[$i]})
    } (0..$#a1);

    print $matches ? 'Yep' : 'Nope';


    And so on, for as many levels as you need. Best of all, it's technically
    a one liner, though it is obviously more readable written as above :)


    --
    szr
     
    szr, Mar 15, 2008
    #10
  11. Rose

    Dr.Ruud Guest

    Rose schreef:

    > @attr1 = split(/[\t ]+/, $line);



    You could write /[\t ]+/ as /[[:blank:]]+/

    but most often this is what you want:

    @attr1 = split " ", $line;

    See perldoc -f split.

    --
    Affijn, Ruud

    "Gewoon is een tijger."
     
    Dr.Ruud, Mar 16, 2008
    #11
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. srinukasam

    comparing the array in parallel

    srinukasam, Jun 28, 2005, in forum: VHDL
    Replies:
    3
    Views:
    487
  2. srinukasam
    Replies:
    3
    Views:
    540
    Andy Peters
    Jun 30, 2005
  3. Raphael Iloh

    Comparing Array Objects

    Raphael Iloh, Feb 22, 2004, in forum: ASP .Net
    Replies:
    1
    Views:
    382
    Alvin Bruney [MVP]
    Feb 22, 2004
  4. Ying Yang
    Replies:
    9
    Views:
    645
    Mike Wahler
    Sep 27, 2003
  5. Varun Sinha

    Comparing elements of an array to a char

    Varun Sinha, Nov 24, 2003, in forum: C Programming
    Replies:
    9
    Views:
    431
    Irrwahn Grausewitz
    Nov 25, 2003
Loading...

Share This Page