Re: Ordering python sets

Discussion in 'Python' started by Tim Chase, Oct 22, 2008.

  1. Tim Chase

    Tim Chase Guest

    > I think that using Python sets would be the best choice, but I also
    > need integers to be ordered inside the set and I've just found out
    > that, instead, Python sets are unordered collections.


    Sets (both in Python, and their mathematical definition) are
    unordered. However, some simple testing in my current python
    seems to indicate that both their repr() and their __iter__()
    seem to traverse in sorted order (though that may be a niceness
    that Python performs in 2.3 and above). You can just order them
    when you go to operate on them:

    import random as r

    # to test in <2.4
    from sets import Set as set
    s = set([r.randint(1,1000) for _ in range(1000)])
    # for 2.4+
    s = set(r.randint(1,1000) for _ in range(1000))

    print s
    for i in sorted(list(s)):
    do_something(i)

    Though for each test, in 2.3, 2.4, and 2.5 that I've got
    installed on my local machine, they each printed "s" in-order,
    and the iteration occurred in-order as well, even without the
    added "sorted(list(s))" code.

    -tkc
     
    Tim Chase, Oct 22, 2008
    #1
    1. Advertising

  2. Tim Chase

    Peter Otten Guest

    Tim Chase wrote:

    > Though for each test, in 2.3, 2.4, and 2.5 that I've got
    > installed on my local machine, they each printed "s" in-order,
    > and the iteration occurred in-order as well, even without the
    > added "sorted(list(s))" code.


    You need more tests then ;)

    >>> list(set([1,1000]))

    [1000, 1]

    By the way, sorted(s) is sufficient; sorted() accepts arbitrary iterables.

    Peter
     
    Peter Otten, Oct 22, 2008
    #2
    1. Advertising

  3. Tim Chase

    Mr.SpOOn Guest

    On Wed, Oct 22, 2008 at 3:37 PM, Peter Otten <> wrote:
    > Tim Chase wrote:
    >
    >> Though for each test, in 2.3, 2.4, and 2.5 that I've got
    >> installed on my local machine, they each printed "s" in-order,
    >> and the iteration occurred in-order as well, even without the
    >> added "sorted(list(s))" code.

    >
    > You need more tests then ;)
    >
    >>>> list(set([1,1000]))

    > [1000, 1]


    It seems to me that it orders elements when you add using the add()
    method, but if you create a set starting from a list, it may result
    unordered.

    Anyway, the add() fails to order if you try to add a float in a set of integers.

    > By the way, sorted(s) is sufficient; sorted() accepts arbitrary iterables.


    I'll try with this.
     
    Mr.SpOOn, Oct 22, 2008
    #3
  4. Tim Chase

    Roy Smith Guest

    In article <>,
    Mr.SpOOn <> wrote:

    > It seems to me that it orders elements when you add using the add()
    > method, but if you create a set starting from a list, it may result
    > unordered.


    Arrrggghhh! None of these behaviors are guaranteed. The docs say, "A set
    object is an unordered collection". If you write code which depends on a
    set preserving order, are going to get burned.

    If you want something that preserves order, use a list. If the O(n) lookup
    time of a list is too slow, you can get O(log n) with heapq.

    If you truly need O(1) lookup time AND need to preserve order, you might
    consider writing a class which does something like stores the values in a
    list, but use a dict to map value -> list_index, and maintains both data
    structures in parallel. Or, roll your own tree structure. Or wait for
    true STL-style trees to be added to Python :)
     
    Roy Smith, Oct 22, 2008
    #4
  5. Tim Chase

    Mr.SpOOn Guest

    On Wed, Oct 22, 2008 at 4:30 PM, Roy Smith <> wrote:
    > In article <>,
    > Mr.SpOOn <> wrote:
    >
    >> It seems to me that it orders elements when you add using the add()
    >> method, but if you create a set starting from a list, it may result
    >> unordered.

    >
    > Arrrggghhh! None of these behaviors are guaranteed. The docs say, "A set
    > object is an unordered collection". If you write code which depends on a
    > set preserving order, are going to get burned.


    Yes, of course :D
    I wasn't going to count on that.

    > If you want something that preserves order, use a list. If the O(n) lookup
    > time of a list is too slow, you can get O(log n) with heapq.


    Well, maybe I can just do it using sets and the sorted() method. If
    this doesn't satisfy me, I think I'll just use lists.
     
    Mr.SpOOn, Oct 22, 2008
    #5
  6. On Wed, 22 Oct 2008 15:37:03 +0200, Peter Otten <> wrote:
    > Tim Chase wrote:
    >
    >> Though for each test, in 2.3, 2.4, and 2.5 that I've got
    >> installed on my local machine, they each printed "s" in-order,
    >> and the iteration occurred in-order as well, even without the
    >> added "sorted(list(s))" code.

    >
    > You need more tests then ;)
    >
    >>>> list(set([1,1000]))

    > [1000, 1]


    So one wonders, of course, why Mr. Otten chose 1000; and one explores:

    $ python
    Python 2.4.3 (#2, Jul 31 2008, 21:56:52)
    [snip]
    >>> for x in 2, 100, 199, 200, 300, 400, 500, 600:

    .... list( set( [ 1, x ] ) )
    ....
    [1, 2]
    [1, 100]
    [1, 199]
    [200, 1]
    [1, 300]
    [400, 1]
    [1, 500]
    [600, 1]
    >>>



    --
    To email me, substitute nowhere->spamcop, invalid->net.
     
    Peter Pearson, Oct 22, 2008
    #6
  7. On approximately 10/22/2008 7:48 AM, came the following characters from
    the keyboard of Mr.SpOOn:
    > On Wed, Oct 22, 2008 at 4:30 PM, Roy Smith <> wrote:
    >
    >> In article <>,
    >> Mr.SpOOn <> wrote:
    >>
    >>
    >>> It seems to me that it orders elements when you add using the add()
    >>> method, but if you create a set starting from a list, it may result
    >>> unordered.
    >>>

    >> Arrrggghhh! None of these behaviors are guaranteed. The docs say, "A set
    >> object is an unordered collection". If you write code which depends on a
    >> set preserving order, are going to get burned.
    >>

    >
    > Yes, of course :D
    > I wasn't going to count on that.
    >
    >
    >> If you want something that preserves order, use a list. If the O(n) lookup
    >> time of a list is too slow, you can get O(log n) with heapq.
    >>

    >
    > Well, maybe I can just do it using sets and the sorted() method. If
    > this doesn't satisfy me, I think I'll just use lists.
    >

    Be aware of the bisect module if you decide to go to (sorted) lists.

    --
    Glenn -- http://nevcal.com/
    ===========================
    A protocol is complete when there is nothing left to remove.
    -- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking
     
    Glenn Linderman, Oct 22, 2008
    #7
  8. Tim Chase

    Peter Otten Guest

    Peter Pearson wrote:

    > On Wed, 22 Oct 2008 15:37:03 +0200, Peter Otten <> wrote:
    >> Tim Chase wrote:
    >>
    >>> Though for each test, in 2.3, 2.4, and 2.5 that I've got
    >>> installed on my local machine, they each printed "s" in-order,
    >>> and the iteration occurred in-order as well, even without the
    >>> added "sorted(list(s))" code.

    >>
    >> You need more tests then ;)
    >>
    >>>>> list(set([1,1000]))

    >> [1000, 1]

    >
    > So one wonders, of course, why Mr. Otten chose 1000; and one explores:
    >
    > $ python
    > Python 2.4.3 (#2, Jul 31 2008, 21:56:52)
    > [snip]
    >>>> for x in 2, 100, 199, 200, 300, 400, 500, 600:

    > ... list( set( [ 1, x ] ) )
    > ...
    > [1, 2]
    > [1, 100]
    > [1, 199]
    > [200, 1]
    > [1, 300]
    > [400, 1]
    > [1, 500]
    > [600, 1]
    >>>>


    Here's another one:

    >>> set([1,9])

    set([1, 9])
    >>> set([9,1])

    set([9, 1])

    This time I did indeed search systematically...

    Peter
     
    Peter Otten, Oct 22, 2008
    #8
  9. Tim Chase

    Peter Otten Guest

    Duncan Booth wrote:

    > Peter Otten <> wrote:
    >
    >> Here's another one:
    >>
    >>>>> set([1,9])

    >> set([1, 9])
    >>>>> set([9,1])

    >> set([9, 1])
    >>
    >> This time I did indeed search systematically...
    >>

    > You missed one with smaller values:
    >>>> set([8,0])

    > set([8, 0])
    >>>> set([0,8])

    > set([0, 8])


    I searched the minimal combination with one...

    > You can work some of it out quite easily instead of searching. The hash
    > value for an integer is itself, the initial space allocated for the set is
    > 8 slots so each value is reduced modulo 8. If the values you insert don't
    > clash then the order of insertion doesn't matter. If there are values
    > which coincide on the same slot then the second one inserted will go into
    > a different slot so the order may vary.


    I guess I have to move the goal posts to beat you:

    >>> set([-1,-2]), set([-2,-1])

    (set([-2, -1]), set([-1, -2]))

    For that one the number of slots doesn't matter because

    >>> hash(-1), hash(-2)

    (-2, -2)

    Peter
     
    Peter Otten, Oct 22, 2008
    #9
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Grzegorz Dostatni

    Python sets.

    Grzegorz Dostatni, May 4, 2004, in forum: Python
    Replies:
    4
    Views:
    1,224
    Dave Reed
    May 5, 2004
  2. sapsi

    Sets in Python

    sapsi, Sep 19, 2007, in forum: Python
    Replies:
    33
    Views:
    916
    sapsi
    Sep 22, 2007
  3. Mr.SpOOn

    Ordering python sets

    Mr.SpOOn, Oct 22, 2008, in forum: Python
    Replies:
    31
    Views:
    1,057
    Mr.SpOOn
    Nov 6, 2008
  4. nbigaouette

    Z-Ordering (Morton ordering) question

    nbigaouette, Nov 5, 2009, in forum: C Programming
    Replies:
    2
    Views:
    2,286
  5. Eduardo Schettino
    Replies:
    3
    Views:
    269
    Lie Ryan
    Apr 25, 2010
Loading...

Share This Page