Maximum length of hash key

Discussion in 'Perl Misc' started by Robert Manea, Dec 8, 2004.

  1. Robert Manea

    Robert Manea Guest

    Hello,

    I'm searching for different code snippets (Golfers welcome) to solve the
    following problem:

    #v+

    %category = {
    'Mozilla' => 20,
    'Forte Agent' => 12,
    'slrn' => 45
    };
    #v-

    Find the longest key (most characters) and return
    '$longest_key_length + 1;'.

    Since the hash can get pretty big, efficiency should be an issue.

    Currently I'm using this:
    #v+

    my @arr; for ( keys %category ) {$arr[length $_] = defined;};
    my $longest_key_length = $#arr+1;

    #v-

    in favor over the more golfish one liner:

    #v+

    my $longest_key_length =
    1 + length +(sort { length $b <=> length $a } keys %category)[0];

    #v-


    Any other suggestions are highly appreciated.


    Thx & Greets, Rob

    --
    The Enterprise meets God, and it's a child, a computer, or a C program.
     
    Robert Manea, Dec 8, 2004
    #1
    1. Advertising

  2. Robert Manea

    Craig Dunn Guest


    > Find the longest key (most characters) and return
    > '$longest_key_length + 1;'.


    [snip]

    >
    > in favor over the more golfish one liner:
    >
    > my $longest_key_length =
    > 1 + length +(sort { length $b <=> length $a } keys %category)[0];
    >




    Or another way:

    my $l = (sort{$b<=>$a} map{length($_)} keys(%category))[0]+1;

    Craig
     
    Craig Dunn, Dec 8, 2004
    #2
    1. Advertising

  3. Robert Manea <> wrote:
    > I'm searching for different code snippets (Golfers welcome) to solve the
    > following problem:


    > %category = {
    > 'Mozilla' => 20,
    > 'Forte Agent' => 12,
    > 'slrn' => 45
    > };


    I don't think that means what you think it means.

    > Find the longest key (most characters) and return
    > '$longest_key_length + 1;'.


    If we're playing golf, I'm going to offer:

    my $l;
    $l|=$_ foreach keys %category; $l=1+length $l;

    --
    PGP key ID E85DC776 - finger for full key
     
    Peter Corlett, Dec 8, 2004
    #3
  4. Robert Manea

    Paul Lalli Guest

    "Robert Manea" <> wrote in message
    news:...
    > Hello,
    >
    > I'm searching for different code snippets (Golfers welcome) to solve

    the
    > following problem:
    >
    > #v+
    >
    > %category = {
    > 'Mozilla' => 20,
    > 'Forte Agent' => 12,
    > 'slrn' => 45
    > };
    > #v-
    >


    Do you know you just created a hash with one key - a hash ref - that
    contains no value?

    > Find the longest key (most characters) and return
    > '$longest_key_length + 1;'.
    >
    > Since the hash can get pretty big, efficiency should be an issue.


    Then sorting is definately not something you want to be doing.

    > Currently I'm using this:
    > my @arr; for ( keys %category ) {$arr[length $_] = defined;};


    Why 'defined'? Why bother checking the return value of defined($_),
    instead of just using 1?

    > my $longest_key_length = $#arr+1;
    >
    >
    > in favor over the more golfish one liner:
    >
    > my $longest_key_length =
    > 1 + length +(sort { length $b <=> length $a } keys %category)[0];


    Again, sorting will be a costly operation if the point is just to find
    the maximal value. An iterative solution would seem to be preferred:

    > Any other suggestions are highly appreciated.


    %hash = ('foo'=>1, 'foobar'=>2, 'a'=>3, 'asdfasd'=>4, 'ab'=>5);
    length($_) > $len and $len = length($_) for keys %hash;
    print "Max Length plus 1 = " . ($len + 1) . "\n";


    Paul Lalli
     
    Paul Lalli, Dec 8, 2004
    #4
  5. Robert Manea

    Uri Guttman Guest

    >>>>> "CD" == Craig Dunn <> writes:

    >> Find the longest key (most characters) and return
    >> '$longest_key_length + 1;'.


    CD> [snip]

    >>
    >> in favor over the more golfish one liner:
    >>
    >> my $longest_key_length =
    >> 1 + length +(sort { length $b <=> length $a } keys %category)[0];
    >>


    CD> Or another way:

    CD> my $l = (sort{$b<=>$a} map{length($_)} keys(%category))[0]+1;

    ewww! O(N log N) for an O(N) problem? use List::Util::max

    uri

    --
    Uri Guttman ------ -------- http://www.stemsystems.com
    --Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
    Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
     
    Uri Guttman, Dec 8, 2004
    #5
  6. Robert Manea

    Craig Dunn Guest

    > "Robert Manea" <> wrote in message
    > news:...
    > Again, sorting will be a costly operation if the point is just to find
    > the maximal value. An iterative solution would seem to be preferred:
    >
    >> Any other suggestions are highly appreciated.

    >
    > %hash = ('foo'=>1, 'foobar'=>2, 'a'=>3, 'asdfasd'=>4, 'ab'=>5);
    > length($_) > $len and $len = length($_) for keys %hash;
    > print "Max Length plus 1 = " . ($len + 1) . "\n";


    Or back to the golf...

    $len = length>$len?length:$len for keys %hash;$len++;
     
    Craig Dunn, Dec 8, 2004
    #6
  7. Robert Manea

    Robert Manea Guest

    Segfault in module "Peter Corlett" - dump details are as follows:

    > Robert Manea <> wrote:
    >> I'm searching for different code snippets (Golfers welcome) to solve the
    >> following problem:


    >> %category = {
    >> 'Mozilla' => 20,
    >> 'Forte Agent' => 12,
    >> 'slrn' => 45
    >> };


    > I don't think that means what you think it means.


    You're pretty much thinking duly. Obviously I meant to mean '(' and ')'.

    >> Find the longest key (most characters) and return
    >> '$longest_key_length + 1;'.


    > If we're playing golf, I'm going to offer:


    > my $l;
    > $l|=$_ foreach keys %category; $l=1+length $l;


    That's a real nice one. Thanks for it.


    Greets, Rob

    --
    The Enterprise meets God, and it's a child, a computer, or a C program.
     
    Robert Manea, Dec 8, 2004
    #7
  8. Robert Manea

    Uri Guttman Guest

    >>>>> "RM" == Robert Manea <> writes:

    RM> Find the longest key (most characters) and return
    RM> '$longest_key_length + 1;'.

    RM> Since the hash can get pretty big, efficiency should be an issue.

    RM> my @arr; for ( keys %category ) {$arr[length $_] = defined;};
    ^^^^^^^

    what is 'defined' in this case? it may actually work since keys are
    always defined and $_ is aliased to each key and defined works on $_ by
    default. but that is a FUGLY way to get a simple true value. just use 1.

    RM> my $longest_key_length = $#arr+1;

    @arr is better.

    use List::Util qw( max ) ;

    my $max_len = max( map length, keys %category ) + 1 ;

    uri

    --
    Uri Guttman ------ -------- http://www.stemsystems.com
    --Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
    Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
     
    Uri Guttman, Dec 8, 2004
    #8
  9. Robert Manea

    Uri Guttman Guest

    >>>>> "RM" == Robert Manea <> writes:

    >> my $l;
    >> $l|=$_ foreach keys %category; $l=1+length $l;


    RM> That's a real nice one. Thanks for it.

    should be very slow as it has to copy and or each of the keys and you
    said this is large hash. the solution i posted aliases to each key (so
    no copying) and length and max are fast. for your education benchmark
    them (on a large hash) and post the results.

    uri

    --
    Uri Guttman ------ -------- http://www.stemsystems.com
    --Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
    Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
     
    Uri Guttman, Dec 8, 2004
    #9
  10. Robert Manea

    Big and Blue Guest

    Craig Dunn wrote:

    > Or back to the golf...
    >
    > $len = length>$len?length:$len for keys %hash;$len++;


    Shouldn't that be ... for $len keys ....

    And there is this:

    $k^=$_ for keys%hash;length($k)+1


    However, this needs to read every character of every key - I would
    expect the original idea (set an array with length(key) and then check the
    length of the array) to be much quicker.


    --
    -*- Just because I've written it here doesn't -*-
    -*- mean that you should, or I do, believe it. -*-
     
    Big and Blue, Dec 8, 2004
    #10
  11. Robert Manea

    Craig Dunn Guest

    >> Or back to the golf...
    >>
    >> $len = length>$len?length:$len for keys %hash;$len++;

    >
    > Shouldn't that be ... for $len keys ....


    Nope. You're setting $len to $_ if the length of $_ is greater than $len.

    > $k^=$_ for keys%hash;length($k)+1


    Nicer :)
     
    Craig Dunn, Dec 8, 2004
    #11
  12. Robert Manea

    Robert Manea Guest

    Segfault in module "Uri Guttman" - dump details are as follows:

    [...]
    > for your education benchmark them (on a large hash) and post the
    > results.


    Well, here it goes...

    The test code:
    #v+

    #!/usr/bin/perl
    #

    use strict;
    use warnings;
    use Benchmark qw:)all);

    our %category;
    for( 1 .. 5000 ) {
    my $rand_key = 'x' x (int(rand(30)) + 1);
    $category{$rand_key} = 1;
    }

    timethese(100000, {
    'Array_Indices' => sub {
    my @arr;
    for ( keys %category ) {$arr[length $_] = 1;}
    my $longest_key_length = $#arr+1;
    },
    'Sorting' => sub {
    my $longest_key_length =
    1 + length +(sort { length $b <=> length $a } keys %category)[0];
    },
    'ORing' => sub {
    my $longest_key_len;
    $longest_key_len |=$_ foreach keys %category;
    $longest_key_len = 1 + length $longest_key_len;
    },
    'XORing' => sub {
    my $longest_key_len;
    $longest_key_len ^= $_ for keys %category;
    $longest_key_len = length($longest_key_len)+1;
    },
    'Length_compare' => sub {
    my $len=0;
    $len = length>$len?length:$len for keys %category;
    $len++;
    }
    });

    #v-

    And the results on my machine:

    Benchmark: timing 100000 iterations of Array_Indices, Length_compare, ORing, Sorting, XORing...
    Array_Indices: 3 wallclock secs ( 4.38 usr + 0.01 sys = 4.39 CPU) @ 22779.04/s (n=100000)
    Length_compare: 4 wallclock secs ( 3.57 usr + 0.02 sys = 3.59 CPU) @ 27855.15/s (n=100000)
    ORing: 4 wallclock secs ( 3.70 usr + 0.01 sys = 3.71 CPU) @ 26954.18/s (n=100000)
    Sorting: 6 wallclock secs ( 7.01 usr + 0.03 sys = 7.04 CPU) @ 14204.55/s (n=100000)
    XORing: 4 wallclock secs ( 3.68 usr + 0.03 sys = 3.71 CPU) @ 26954.18/s (n=100000)


    Same snippets as above and 'cmpthese':

    Rate Sorting Array_Indices ORing XORing Length_compare
    Sorting 14306/s -- -37% -47% -47% -49%
    Array_Indices 22883/s 60% -- -15% -16% -18%
    ORing 26954/s 88% 18% -- -1% -4%
    XORing 27174/s 90% 19% 1% -- -3%
    Length_compare 27933/s 95% 22% 4% 3% --


    Quite suprisingly for me the 'Array_Indeces' method runs slower than the
    'Length_compare' one which obviously the fastets.

    No that surprising is the 'Sorting' case.

    And again OR/XOR quite fast although the need to scan each string char by char.
    (But, isn't this needed to compute the strings' length with 'length()' anyway?)

    Well, all in all not quite the results I expected.


    > uri



    Greets, Rob

    --
    The Enterprise meets God, and it's a child, a computer, or a C program.
     
    Robert Manea, Dec 8, 2004
    #12
  13. Robert Manea

    Robert Manea Guest

    Segfault in module "Robert Manea" - dump details are as follows:

    > Segfault in module "Uri Guttman" - dump details are as follows:


    > [...]
    >> for your education benchmark them (on a large hash) and post the
    >> results.


    > Well, here it goes...


    > The test code:


    > #!/usr/bin/perl
    > #


    > use strict;
    > use warnings;
    > use Benchmark qw:)all);


    > our %category;
    > for( 1 .. 5000 ) {
    > my $rand_key = 'x' x (int(rand(30)) + 1);
    > $category{$rand_key} = 1;
    > }


    General brains overload. Please use this instead of the (sensless) above
    code:

    our %category;
    my $cnt;
    for( 0 .. 1000 ) {
    my $rand_key = $cnt++ . 'x' x (int(rand(30)) + 1);
    $category{$rand_key} = 1;
    }

    [...]

    > And the results on my machine:

    [ Now only 5000 iterations have been made ]

    Benchmark: timing 5000 iterations of Array_Indices, Length_compare, ORing, Sorting, XORing...
    Array_Indices: 7 wallclock secs ( 6.25 usr + 0.05 sys = 6.30 CPU) @ 793.65/s (n=5000)
    Length_compare: 6 wallclock secs ( 5.33 usr + 0.03 sys = 5.36 CPU) @ 932.84/s (n=5000)
    ORing: 7 wallclock secs ( 5.80 usr + 0.04 sys = 5.84 CPU) @ 856.16/s (n=5000)
    Sorting: 24 wallclock secs (22.79 usr + 0.14 sys = 22.93 CPU) @ 218.05/s (n=5000)
    XORing: 6 wallclock secs ( 5.73 usr + 0.03 sys = 5.76 CPU) @ 868.06/s (n=5000)

    > Same snippets as above and 'cmpthese':


    Rate Sorting XORing Array_Indices ORing Length_compare
    Sorting 219/s -- -75% -75% -75% -76%
    XORing 864/s 294% -- -1% -2% -7%
    Array_Indices 873/s 298% 1% -- -1% -6%
    ORing 877/s 301% 2% 1% -- -6%
    Length_compare 931/s 325% 8% 7% 6% --


    > Greets, Rob




    Greets, Rob

    --
    The Enterprise meets God, and it's a child, a computer, or a C program.
     
    Robert Manea, Dec 8, 2004
    #13
  14. Uri Guttman <> wrote:
    [...]
    > should be very slow as it has to copy and or each of the keys and
    > you said this is large hash. the solution i posted aliases to each
    > key (so no copying) and length and max are fast.


    It turns out that copying in Perl isn't always as expensive as you
    might think. What was that saying about premature optimisation? ;)

    > for your education benchmark them (on a large hash) and post the
    > results.


    #!/usr/bin/perl -w
    use strict;

    use Benchmark ':all';

    foreach my $keys (2, 10, 100, 1000, 1e4) {
    srand 1;
    my %hash;
    for (1..$keys) {
    my $key=join '',map { chr(32+rand 95) } (0..rand 40);
    $hash{$key}++;
    }
    print "\nFor $keys keys\n\n";
    cmpthese(1e6/$keys, {
    or_keys => sub {
    my $l;
    $l|=$_ foreach keys %hash; $l=1+length $l;
    },
    or_each => sub {
    my($l, $k);
    $l|=$k while($k=each %hash); $l=1+length $l;
    },
    uri_keys => sub {
    my @arr; for ( keys %hash ) {$arr[length $_] = defined;};
    my $l = $#arr+1;
    }
    });
    }
    __END__

    And its output (on a 400MHz PII):

    For 2 keys

    Rate uri_keys or_keys or_each
    uri_keys 27778/s -- -26% -55%
    or_keys 37707/s 36% -- -39%
    or_each 61728/s 122% 64% --

    For 10 keys

    Rate uri_keys or_keys or_each
    uri_keys 11669/s -- -37% -42%
    or_keys 18450/s 58% -- -8%
    or_each 20000/s 71% 8% --

    For 100 keys

    Rate uri_keys or_each or_keys
    uri_keys 2004/s -- -15% -25%
    or_each 2370/s 18% -- -12%
    or_keys 2681/s 34% 13% --

    For 1000 keys

    Rate or_each uri_keys or_keys
    or_each 241/s -- -7% -12%
    uri_keys 260/s 8% -- -5%
    or_keys 275/s 14% 6% --

    For 10000 keys

    Rate or_keys uri_keys or_each
    or_keys 21.8/s -- -2% -2%
    uri_keys 22.3/s 2% -- -0%
    or_each 22.3/s 2% 0% --


    It appears that my cute golfing exercise/hack is actually faster than
    your array-based approach for small hashes, but you win on larger
    hashes. I'm also somewhat surprised that an each-based approach is
    slower on a relatively-large 1000 key hash.

    --
    PGP key ID E85DC776 - finger for full key
     
    Peter Corlett, Dec 9, 2004
    #14
  15. Peter Corlett <> wrote:

    > If we're playing golf, I'm going to offer:
    >
    > my $l;
    > $l|=$_ foreach keys %category; $l=1+length $l;



    If you're playing golf you never use "foreach",
    it costs 4 strokes for no gain.


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Dec 9, 2004
    #15
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. rp
    Replies:
    1
    Views:
    555
    red floyd
    Nov 10, 2011
  2. Une bévue
    Replies:
    5
    Views:
    155
    Une bévue
    Aug 10, 2006
  3. Timothy Baron

    Key Associated w/ Maximum Value in Hash

    Timothy Baron, Sep 9, 2010, in forum: Ruby
    Replies:
    10
    Views:
    210
    Benoit Daloze
    Sep 12, 2010
  4. Jim Burgess

    Maximum length of array#hash

    Jim Burgess, Oct 7, 2010, in forum: Ruby
    Replies:
    6
    Views:
    143
    Robert Klemme
    Oct 7, 2010
  5. phanhuyich
    Replies:
    4
    Views:
    291
Loading...

Share This Page