question about hash ordering

Discussion in 'Perl Misc' started by John, Oct 2, 2003.

  1. John

    John Guest

    HI,

    I have this.


    @arr = qw(ab a b ab ba ba ab ba);

    and I would like to classify the index with the values:

    @output = ([0, 3, 6],
    [1],
    [2],
    [4, 5, 7]
    );
    or in hash

    %output = (
    0 => [0, 3, 6],
    1 => [1],
    2 => [2],
    4 => [4, 5, 7]
    );


    on this situation I don't know which approach is the best the array
    one or the hash, in order to classify then and also in performance.


    What will be you approach?


    Thanks for your help

    J
     
    John, Oct 2, 2003
    #1
    1. Advertisements

  2. John

    Sam Holden Guest

    It depends on what you want to do with it, and how big the data
    will be, and numerous other factors.

    For example, if you want to know what the nth unique element is
    (where, for example, the 4th unique element in that example is 'ba'),
    then the array will allow you to do so in constant time:

    $arr[$output[$i][0]]

    While the hash will require O(NlogN) time (you'd have to sort the keys).

    Then again, if you know the index of the first occurance of an item you
    wish to find the list of occurances for (or want to know if that index
    is the first occurance of the item), then the hash approach will
    allow you to do so in constant time, while the array approach will
    require O(logN).

    So performance depends on the operations you need (as always).
     
    Sam Holden, Oct 2, 2003
    #2
    1. Advertisements


  3. My approach would depend on how I plan to _use_ the data structures,
    which you have not shared with us...

    If performance really matters (it probably doesn't), then try it
    both ways and benchmark them.


    ------------------------
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Data::Dumper;

    my @arr = qw(ab a b ab ba ba ab ba);

    my @array = by_array(@arr);
    print Dumper \@array;

    my %hash = by_hash(@arr);
    print Dumper \%hash;

    sub by_array {
    my @array;

    my %seen;
    foreach my $i ( 0 .. $#_ ) {
    $seen{$_[$i]} = $i unless exists $seen{$_[$i]};
    push @{ $array[$seen{$_[$i]}] }, $i;
    }

    return @array;
    }

    sub by_hash {
    my %hash;

    my %seen;
    foreach my $i ( 0 .. $#_ ) {
    $seen{$_[$i]} = $i unless exists $seen{$_[$i]};
    push @{ $hash{$seen{$_[$i]}} }, $i;
    }

    return %hash;
    }
     
    Tad McClellan, Oct 2, 2003
    #3
    1. Advertisements

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments (here). After that, you can post your question and our members will help you out.