Sorting an associative array

Discussion in 'Perl Misc' started by Nathan Olson, Jun 27, 2004.

  1. Nathan Olson

    Nathan Olson Guest

    I've got an associative array whose keys are movie titles. I'd like to sort
    the keys in an order that takes into account titles beginning with "A" or
    "The." In other words, "The Graduate" ought to be sorted with the 'g's, not
    the 't's. Is there any way to do this that doesn't involve sorting manually?


    Thanks in advance,
    Nate Olson
    Nathan Olson, Jun 27, 2004
    #1

  2. Nathan Olson

    BZ Guest

    Nathan Olson wrote in comp.lang.perl.misc:
    > I've got an associative array whose keys are movie titles. I'd like to sort
    > the keys in an order that takes into account titles beginning with "A" or
    > "The." In other words, "The Graduate" ought to be sorted with the 'g's, not
    > the 't's. Is there any way to do this that doesn't involve sorting manually?


    Something like this should work:

    sort {
    $a =~ s/^(a|the)\s+//;
    $b =~ s/^(a|the)\s+//;
    $a <=> $b
    } keys %hash;

    --
    BZ
    BZ, Jun 27, 2004
    #2

  3. Nathan Olson wrote:
    > I've got an associative array whose keys are movie titles. I'd like
    > to sort the keys in an order that takes into account titles
    > beginning with "A" or "The." In other words, "The Graduate" ought
    > to be sorted with the 'g's, not the 't's.


    my @sortedtitles = sort {
    ($a =~ /(?i:the|a)?\s*(.+)/)[0]
    cmp
    ($b =~ /(?i:the|a)?\s*(.+)/)[0]
    } keys %movies;

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
    Gunnar Hjalmarsson, Jun 27, 2004
    #3
  4. Nathan Olson

    John Bokma Guest

    Nathan Olson wrote:

    > I've got an associative array whose keys are movie titles. I'd like to sort
    > the keys in an order that takes into account titles beginning with "A" or
    > "The." In other words, "The Graduate" ought to be sorted with the 'g's, not
    > the 't's. Is there any way to do this that doesn't involve sorting manually?


    Create a lookup table: an array of two-element arrays, where the first
    element is the title and the second is the title with "A ", "The ", etc.
    removed. Sort it on the *second* element.

    Next, use the first element to index your associative array.

    Google for 'Schwartzian Transform' for some nice examples,
    for example: http://www.stonehenge.com/merlyn/UnixReview/col06.html
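
The lookup-table idea described in this post can be sketched roughly as follows. This is a minimal illustration, not code from the thread: the sample data in %movies is hypothetical, and lower-casing the sort key with lc is an added assumption so that "the apartment" and "The Graduate" are ordered without regard to case.

```perl
use strict;
use warnings;

# Hypothetical sample data; the values stand in for whatever the real hash holds.
my %movies = (
    'The Graduate'  => 1967,
    "A Bug's Life"  => 1998,
    'Casablanca'    => 1942,
    'the apartment' => 1960,
);

my @sorted_titles =
    map  { $_->[0] }                 # 3. keep only the original title
    sort { $a->[1] cmp $b->[1] }     # 2. compare on the second element (the sort key)
    map  {                           # 1. pair each title with its stripped key
        (my $key = lc $_) =~ s/^(?:a|the)\s+//;
        [ $_, $key ];
    }
    keys %movies;

# The original titles are intact, so they still index %movies.
print "$_ ($movies{$_})\n" for @sorted_titles;
```

Each title is stripped exactly once, and the list that comes out still contains the exact hash keys.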

    --
    John MexIT: http://johnbokma.com/mexit/
    personal page: http://johnbokma.com/
    Experienced Perl programmer available: http://castleamber.com/
    Happy Customers: http://castleamber.com/testimonials.html
    John Bokma, Jun 27, 2004
    #4
  5. Nathan Olson

    John Bokma Guest

    BZ wrote:

    > Nathan Olson wrote in comp.lang.perl.misc:
    >
    >> I've got an associative array whose keys are movie titles. I'd like to sort
    >> the keys in an order that takes into account titles beginning with "A" or
    >> "The." In other words, "The Graduate" ought to be sorted with the 'g's, not
    >> the 't's. Is there any way to do this that doesn't involve sorting manually?

    >
    > Something like this should work:
    >
    > sort {
    > $a =~ s/^(a|the)\s+//;
    > $b =~ s/^(a|the)\s+//;
    > $a <=> $b
    > } keys %hash;


    Which does O(n log n) replacements. It might be faster to create a
    look-up table, with O(n) replacements, and use that to sort.
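
One way to read the O(n) suggestion is to precompute every sort key into a second hash before sorting. The sketch below assumes that reading; %sort_key and the sample titles are hypothetical names and data, not something posted in the thread.

```perl
use strict;
use warnings;

# Hypothetical sample data.
my %hash = map { $_ => 1 } ('The Graduate', 'A Fish Called Wanda', 'Alien', 'Heat');

# One substitution per title: O(n) regex work, done before sorting.
my %sort_key;
for my $title (keys %hash) {
    (my $key = $title) =~ s/^(?:a|the)\s+//i;
    $sort_key{$title} = $key;
}

# The comparator now only does hash lookups and string comparisons.
my @sorted = sort { $sort_key{$a} cmp $sort_key{$b} } keys %hash;

print "$_\n" for @sorted;
```

The sort still performs O(n log n) comparisons, but each comparison is now cheap.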

    --
    John MexIT: http://johnbokma.com/mexit/
    personal page: http://johnbokma.com/
    Experienced Perl programmer available: http://castleamber.com/
    Happy Customers: http://castleamber.com/testimonials.html
    John Bokma, Jun 27, 2004
    #5
  6. Gunnar Hjalmarsson wrote:
    >
    > my @sortedtitles = sort {
    > ($a =~ /(?i:the|a)?\s*(.+)/)[0]
    > cmp
    > ($b =~ /(?i:the|a)?\s*(.+)/)[0]
    > } keys %movies;


    Correction: Make that

    my @sortedtitles = sort {
    ($a =~ /(?:(?i:the|a)\s+)?(.+)/)[0]
    cmp
    ($b =~ /(?:(?i:the|a)\s+)?(.+)/)[0]
    } keys %movies;

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
    Gunnar Hjalmarsson, Jun 27, 2004
    #6
  7. Nathan Olson

    John Bokma Guest

    Gunnar Hjalmarsson wrote:

    > Nathan Olson wrote:
    >
    >> I've got an associative array whose keys are movie titles. I'd like
    >> to sort the keys in an order that takes into account titles
    >> beginning with "A" or "The." In other words, "The Graduate" ought
    >> to be sorted with the 'g's, not the 't's.

    >
    >
    > my @sortedtitles = sort {
    > ($a =~ /(?i:the|a)?\s*(.+)/)[0]


    This fails (?) with Thesomething and Asomething. Yeah, I know that the
    specs said A and The., but you missed the dot after The too, so that's
    not an excuse :-D

    It also strips spaces in front of titles starting with spaces (ok there
    probably are none like that)

    --
    John MexIT: http://johnbokma.com/mexit/
    personal page: http://johnbokma.com/
    Experienced Perl programmer available: http://castleamber.com/
    Happy Customers: http://castleamber.com/testimonials.html
    John Bokma, Jun 27, 2004
    #7
  8. Nathan Olson

    John Bokma Guest

    Gunnar Hjalmarsson wrote:

    > Gunnar Hjalmarsson wrote:
    >
    >>
    >> my @sortedtitles = sort {
    >> ($a =~ /(?i:the|a)?\s*(.+)/)[0]
    >> cmp
    >> ($b =~ /(?i:the|a)?\s*(.+)/)[0]
    >> } keys %movies;

    >
    > Correction: Make that
    >
    > my @sortedtitles = sort {
    > ($a =~ /(?:(?i:the|a)\s+)?(.+)/)[0]
    > cmp
    > ($b =~ /(?:(?i:the|a)\s+)?(.+)/)[0]
    > } keys %movies;


    "Someone and the thingy"

    (don't you need ^ ?)

    And how to index %movies? I guess that the OP needs to access the info
    in the %movies assoc.

    --
    John MexIT: http://johnbokma.com/mexit/
    personal page: http://johnbokma.com/
    Experienced Perl programmer available: http://castleamber.com/
    Happy Customers: http://castleamber.com/testimonials.html
    John Bokma, Jun 27, 2004
    #8
  9. John Bokma wrote:
    > Gunnar Hjalmarsson wrote:
    >>
    >> my @sortedtitles = sort {
    >> ($a =~ /(?:(?i:the|a)\s+)?(.+)/)[0]
    >> cmp
    >> ($b =~ /(?:(?i:the|a)\s+)?(.+)/)[0]
    >> } keys %movies;

    >
    > "Someone and the thingy"
    >
    > (don't you need ^ ?)


    No. Unless a key starts with 'A ' or 'The ', (.+) captures the whole
    key. (But ^ wouldn't have hurt, for the sake of clarity...)

    > And how to index %movies? I guess that the OP needs to access the
    > info in the %movies assoc.


    Not sure what you mean. @sortedtitles can now be used to access the
    info in %movies in the desired order.
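
To make that last point concrete: because the sort block only matches against the keys and never modifies them, the sorted list can be used directly as hash keys. The sketch below assumes a hypothetical %movies that maps titles to release years, and it adds the optional ^ anchor for clarity.

```perl
use strict;
use warnings;

# Hypothetical data: title => release year.
my %movies = (
    'The Graduate'   => 1967,
    'A Star Is Born' => 1954,
    'Jaws'           => 1975,
    'The Birds'      => 1963,
);

# The match only reads $a and $b, so the keys come back unchanged.
my @sortedtitles = sort {
    ($a =~ /^(?:(?i:the|a)\s+)?(.+)/)[0]
    cmp
    ($b =~ /^(?:(?i:the|a)\s+)?(.+)/)[0]
} keys %movies;

for my $title (@sortedtitles) {
    print "$title ($movies{$title})\n";
}
```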

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
    Gunnar Hjalmarsson, Jun 27, 2004
    #9
  10. BZ wrote:
    > Nathan Olson wrote in comp.lang.perl.misc:
    >> I've got an associative array whose keys are movie titles. I'd
    >> like to sort the keys in an order that takes into account titles
    >> beginning with "A" or "The." In other words, "The Graduate" ought
    >> to be sorted with the 'g's, not the 't's. Is there any way to do
    >> this that doesn't involve sorting manually?

    >
    > Something like this should work:
    >
    > sort {
    > $a =~ s/^(a|the)\s+//;
    > $b =~ s/^(a|the)\s+//;
    > $a <=> $b
    > } keys %hash;


    Did you try it?

    - It does not replace case insensitively.
    - It sorts strings numerically.

    Besides that, since all the elements in the returned list are no
    longer an (exact) key in the hash, how would you use the list?
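
A sketch of a comparator that avoids all three problems might look like this. The helper name sort_key and the sample titles are hypothetical. The helper works on a copy, so the returned list still holds the original strings; it matches case-insensitively; and it compares with cmp rather than <=>, which would treat titles without leading digits as the number 0.

```perl
use strict;
use warnings;

my @titles = ('The Graduate', 'Zulu', 'A Bronx Tale', 'the Apartment');

# Hypothetical helper: returns a stripped copy, leaving its argument alone.
sub sort_key {
    my ($title) = @_;
    $title =~ s/^(?:a|the)\s+//i;    # /i so "the", "The" and "A" all match
    return $title;
}

# cmp compares as strings; the elements of @titles are never modified.
my @sorted = sort { sort_key($a) cmp sort_key($b) } @titles;

print "$_\n" for @sorted;
```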

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
    Gunnar Hjalmarsson, Jun 27, 2004
    #10
  11. Nathan Olson

    John Bokma Guest

    Gunnar Hjalmarsson wrote:
    > John Bokma wrote:
    >
    >> Gunnar Hjalmarsson wrote:
    >>
    >>>
    >>> my @sortedtitles = sort {
    >>> ($a =~ /(?:(?i:the|a)\s+)?(.+)/)[0]
    >>> cmp
    >>> ($b =~ /(?:(?i:the|a)\s+)?(.+)/)[0]
    >>> } keys %movies;

    >>
    >>
    >> "Someone and the thingy"
    >>
    >> (don't you need ^ ?)

    >
    > No. Unless a key starts with 'A ' or 'The ', (.+) captures the whole
    > key. (But ^ wouldn't have hurt, for the sake of clarity...)


    Ah, grmbl, indeed.

    >> And how to index %movies? I guess that the OP needs to access the
    >> info in the %movies assoc.

    >
    > Not sure what you mean. @sortedtitles can now be used to access the info
    > in %movies in the desired order.


    Indeed, my mistake again, it's a match :-(

    --
    John MexIT: http://johnbokma.com/mexit/
    personal page: http://johnbokma.com/
    Experienced Perl programmer available: http://castleamber.com/
    Happy Customers: http://castleamber.com/testimonials.html
    John Bokma, Jun 27, 2004
    #11
  12. Nathan Olson wrote:
    >
    > I've got an associative array whose keys are movie titles. I'd like to sort
    > the keys in an order that takes into account titles beginning with "A" or
    > "The." In other words, "The Graduate" ought to be sorted with the 'g's, not
    > the 't's. Is there any way to do this that doesn't involve sorting manually?


    my @sorted_movie_titles =
    map { s/^[^\0]+\0//; $_ }
    sort
    map { (my $x = $_) =~ s/^(?:a|the)\s*//i; "$x\0$_" }
    keys %hash;



    John
    --
    use Perl;
    program
    fulfillment
    John W. Krahn, Jun 27, 2004
    #12
  13. John Bokma wrote:
    >
    > Ah, grmbl, indeed.


    <snip>

    > Indeed, my mistake again, it's a match :-(


    Time to turn your attention from Purl Gurl to Perl?

    (Sorry, couldn't resist.)

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
    Gunnar Hjalmarsson, Jun 27, 2004
    #13
  14. John W. Krahn wrote:
    >
    > map { (my $x = $_) =~ s/^(?:a|the)\s*//i; "$x\0$_" }

    ------------------------------------------^

    Seems as if you made a mistake similar to the one I made in my first post.

    map { (my $x = $_) =~ s/^(?:a|the)\s+//i; "$x\0$_" }
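
Put together with the \s+ fix, the whole pipeline could look like the sketch below. The sample titles are hypothetical, and the approach assumes that no title contains a NUL byte, since "\0" separates the sort key from the original title. 'Annie Hall' is included because \s* would have turned its key into 'nnie Hall'.

```perl
use strict;
use warnings;

# Hypothetical sample data.
my %hash = map { $_ => 1 }
    ('The Graduate', 'Annie Hall', 'A Clockwork Orange', 'Brazil');

my @sorted_movie_titles =
    map  { s/^[^\0]+\0//; $_ }       # 3. drop the key and separator, keep the title
    sort                             # 2. plain string sort on "key\0title"
    map  { (my $x = $_) =~ s/^(?:a|the)\s+//i; "$x\0$_" }   # 1. prefix the stripped key
    keys %hash;

print "$_\n" for @sorted_movie_titles;
```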


    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
    Gunnar Hjalmarsson, Jun 27, 2004
    #14
  15. Nathan Olson

    John Bokma Guest

    John Bokma, Jun 27, 2004
    #15
  16. Nathan Olson

    John Bokma Guest

    Purl Gurl wrote:

    > Gunnar Hjalmarsson wrote:
    >
    >>John Bokma wrote:

    >
    >>>Indeed, my mistake again

    >
    >>Time to turn your attention from Purl Gurl to Perl?

    >
    > I wish he would. He does need to learn Perl.


    True, every day I learn more and more Perl. Started 10 years ago, and
    wish I had more time to study, now, and the past 10 years. That I stick
    with it for 10 years is because there is so much to learn and the
    language keeps amazing me.

    > He is about the only one left harassing our family
    > on a daily basis. I have noted he has spent around
    > four to six hours today, probing our server trying
    > to find a way in.


    Funny, I have been awake for a few hours, had breakfast, spent time with my
    partner, and read a bit on Usenet.

    > Psychotic obsession is a type of mental disturbance.


    You keep on proving that you know zero about even the basics of
    psychology/psychiatry.

    > Personally, I would much rather discuss Perl


    Then why don't you start learning this language?

    > I am a bit disturbed


    Yes, I think by now the entire Perl community is aware of that fact.

    > Perhaps he will take your advice. Highly doubtful considering
    > the type of psychosis he and others display.


    Most people who suffer from psychosis are able to think correctly and
    validly within their reality, which is not that far removed from "reality".

    --
    John MexIT: http://johnbokma.com/mexit/
    personal page: http://johnbokma.com/
    Experienced Perl programmer available: http://castleamber.com/
    Happy Customers: http://castleamber.com/testimonials.html
    John Bokma, Jun 27, 2004
    #16
  17. Gunnar Hjalmarsson wrote:
    >
    > John W. Krahn wrote:
    > >
    > > map { (my $x = $_) =~ s/^(?:a|the)\s*//i; "$x\0$_" }

    > ------------------------------------------^
    >
    > Seems as if you made a mistake similar to the one I made in my first post.
    >
    > map { (my $x = $_) =~ s/^(?:a|the)\s+//i; "$x\0$_" }


    Ah yes, thanks.


    John
    --
    use Perl;
    program
    fulfillment
    John W. Krahn, Jun 27, 2004
    #17
  18. Purl Gurl wrote:
    > Gunnar Hjalmarsson wrote:
    >>
    >> my @sortedtitles = sort {
    >> ($a =~ /(?:(?i:the|a)\s+)?(.+)/)[0]
    >> cmp
    >> ($b =~ /(?:(?i:the|a)\s+)?(.+)/)[0]
    >> } keys %movies;

    >
    > à bientôt will give you fits, but hardly a point of critique.
    >
    > Both Perl 5.6 and Perl 5.8 choke on accented characters
    > on my system.


    What?? Are you implying that not all movies are made in the US and
    have all American titles? ;-)

    Assuming that you are right about that, what exactly do you mean by
    "choke"? I notice that "use locale;" makes a difference on my box with
    respect to the sort order (I'm Swedish, so it probably turns on a
    Swedish locale), but I don't get any errors or warnings.

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
    Gunnar Hjalmarsson, Jun 27, 2004
    #18
  19. Purl Gurl wrote:
    > I am always right.


    Sorry, forgot that.

    > Returned sort order is wrong for accented characters. This is
    > assuming "a" and "à" should be grouped together, but which should
    > be first, is a "who's on first" question.


    Then it has nothing to do with the sorting code and it has everything
    to do with locales. Choose a suitable locale, and you're done.

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
    Gunnar Hjalmarsson, Jun 28, 2004
    #19
  20. Purl Gurl wrote:
    > Strikes me a and à should be grouped together at the beginning
    > of a sort list. Which should be first, I do not have a clue. I
    > suppose for you, à should be before a in a list. Maybe second
    > because à is a more busy letter?


    Actually, 'à' is not part of the Swedish alphabet (we have å, ä and ö
    besides the ASCII letters), but intuitively I'd say that 'a' should
    come first.

    > Does your locale group those two together?


    Yes.

    Playing with the list you posted in another message:

    $, = ' ';
    print sort qw(a ab b bc à àb);

    does not change the order, which is as expected since Perl ignores all
    locales by default. But

    use locale;
    $, = ' ';
    print sort qw(a ab b bc à àb);

    outputs:
    a à ab àb b bc

    on my box.
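
Since the result of "use locale" depends on which locale is active on a given machine, a locale-independent alternative is the core Unicode::Collate module. The thread does not mention it, so treat the following as a sketch under that assumption. It expects the source file to be saved as UTF-8 (hence "use utf8") and uses the module's default collation settings.

```perl
use strict;
use warnings;
use utf8;                            # the string literals below are UTF-8
use Unicode::Collate;

binmode STDOUT, ':encoding(UTF-8)';

# Default Unicode Collation Algorithm: base letters are compared first,
# accents only break ties between otherwise equal strings.
my $collator = Unicode::Collate->new;
my @sorted   = $collator->sort(qw(a ab b bc à àb));

print "@sorted\n";
```

With the default collation this groups each accented string right after its unaccented counterpart, the same order as the locale-based output shown above.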

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
    Gunnar Hjalmarsson, Jun 28, 2004
    #20
