removing duplicates, or, converting Set() to string

Discussion in 'Python' started by maphew@gmail.com, Jul 27, 2006.

  1. Guest

    Hello,

    I have some lists for which I need to remove duplicates. I found the
    sets.Sets() module which does exactly this, but how do I get the set
    back out again?

    # existing input: A,B,B,C,D
    # desired result: A,B,C,D

    import sets
    dupes = ['A','B','B','C','D']
    clean = sets.Set(dupes)

    out = open('clean-list.txt','w')
    out.write(clean)
    out.close()

    ---
    out.write(clean) fails with "TypeError: argument 1 must be string or
    read-only character buffer, not Set" and out.write( str(clean) )
    creates "Set(['A', 'C', 'B', 'D'])" instead of just A,B,C,D.

    thanks in advance for your time,

    -matt
    , Jul 27, 2006
    #1

  2. Guest

    The write() method accepts only strings, so you could do:

    out.write( repr(list(clean)) )

    Notes:
    - If you need the strings in a nice order, you may sort them before
    saving them:
    out.write( repr(sorted(clean)) )
    - If you need them in their original order, you need a stable
    (order-preserving) method; you can extract the relevant code from this
    large blob:
    http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/438599
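    The order-preserving case in the last note can be sketched without the
    recipe. This is a minimal illustration, not the code from the linked
    recipe: the helper name unique_in_order is made up here, it assumes the
    items are hashable, and it is written for a current Python 3
    interpreter, where set is a built-in.

```python
def unique_in_order(items):
    """Drop duplicates, keeping the first occurrence of each item in order.

    Hypothetical helper for illustration; assumes every item is hashable.
    """
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


dupes = ['A', 'B', 'B', 'C', 'D']
clean = unique_in_order(dupes)
print(','.join(clean))  # prints A,B,C,D
```

    Unlike converting to a set, this keeps the items in the order they first
    appeared in the input list.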

    Bye,
    bearophile
    , Jul 27, 2006
    #2

  3. Simon Forman Guest

    wrote:
    > Hello,
    >
    > I have some lists for which I need to remove duplicates. I found the
    > sets.Sets() module which does exactly this, but how do I get the set
    > back out again?
    >
    > # existing input: A,B,B,C,D
    > # desired result: A,B,C,D
    >
    > import sets
    > dupes = ['A','B','B','C','D']
    > clean = sets.Set(dupes)
    >
    > out = open('clean-list.txt','w')
    > out.write(clean)
    > out.close()
    >
    > ---
    > out.write(clean) fails with "TypeError: argument 1 must be string or
    > read-only character buffer, not Set" and out.write( str(clean) )
    > creates "Set(['A', 'C', 'B', 'D'])" instead of just A,B,C,D.
    >
    > thanks in advance for your time,
    >
    > -matt


    Do ','.join(clean) to make a single string with commas between the
    items in the set. (If the items aren't all strings, you'll need to
    convert them to strings first.)
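    A small sketch of the join approach, including the conversion step for
    non-string items. The sample data is invented for illustration, and the
    code assumes a current Python 3 interpreter with the built-in set.

```python
mixed = ['A', 2, 'B', 2, 3]          # sample data: strings and integers
clean = set(mixed)                   # duplicates removed, order not guaranteed

# str.join() only accepts strings, so convert each item first.
# Sorting the converted strings makes the output repeatable.
line = ','.join(sorted(str(item) for item in clean))
print(line)  # prints 2,3,A,B
```

    Calling ','.join(clean) directly on this set would raise a TypeError
    because of the integer items.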

    Peace,
    ~Simon
    Simon Forman, Jul 27, 2006
    #3
  4. John Machin Guest

    wrote:
    > Hello,
    >
    > I have some lists for which I need to remove duplicates. I found the
    > sets.Sets() module which does exactly this


    I think you mean that you found the sets.Set() constructor in the sets
    module.
    If you are using Python 2.4, use the built-in set() function instead.
    If you are using Python 2.3, consider upgrading if you can.

    > but how do I get the set
    > back out again?



    >
    > # existing input: A,B,B,C,D
    > # desired result: A,B,C,D
    >
    > import sets
    > dupes = ['A','B','B','C','D']
    > clean = sets.Set(dupes)
    >
    > out = open('clean-list.txt','w')
    > out.write(clean)
    > out.close()
    >
    > ---
    > out.write(clean) fails with "TypeError: argument 1 must be string or
    > read-only character buffer, not Set"


    as expected

    > and out.write( str(clean) )
    > creates "Set(['A', 'C', 'B', 'D'])" instead of just A,B,C,D.


    again as expected.

    BTW, in practice you'd probably want to append '\n' to the string that
    you're writing.

    You should be able to get a (possibly unsorted) list of the contents of
    *any* container like this (but note that dictionaries divulge only
    their keys):

    >>> dupes = ['A','B','B','C','D']
    >>> clean = set(dupes) # the Python 2.4+ way
    >>> clean
    set(['A', 'C', 'B', 'D'])
    >>> [x for x in clean]
    ['A', 'C', 'B', 'D']
    >>>


    list(clean) would work as well.
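    To illustrate the point about containers in general, here is a short
    sketch with made-up sample data. It assumes a current Python 3
    interpreter, where a dictionary yields its keys in insertion order.

```python
as_set = set(['A', 'B', 'B'])
as_dict = {'x': 1, 'y': 2}
as_tuple = ('p', 'q')

from_set = list(as_set)               # order of a set is not guaranteed
from_dict = list(as_dict)             # a dict divulges only its keys
from_tuple = list(as_tuple)
from_values = list(as_dict.values())  # ask for the values explicitly

print(sorted(from_set), from_dict, from_tuple, from_values)
```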

    If you want the output sorted, then use the list.sort() method. Details
    in the manual.
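    Putting the sorting and newline advice together, a complete version of
    the original script might look like the sketch below. It assumes a
    current Python 3 interpreter; the file name is the one from the
    question, and the with block is a substitution for the explicit
    out.close() call.

```python
dupes = ['A', 'B', 'B', 'C', 'D']

# sorted() returns a new sorted list built from the set
clean = sorted(set(dupes))

# Equivalent in-place form using the list.sort() method
as_list = list(set(dupes))
as_list.sort()

# The with block closes the file even if write() raises
with open('clean-list.txt', 'w') as out:
    out.write(','.join(clean) + '\n')
```

    The file then contains the single line A,B,C,D followed by a newline.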

    HTH,
    John
    John Machin, Jul 27, 2006
    #4
  5. John Machin Guest

    Simon Forman wrote:

    >
    > Do ','.join(clean) to make a single string with commas between the
    > items in the set. (If the items aren't all strings, you'll need to
    > convert them to strings first.)
    >


    And if the items themselves could contain commas, or quote characters,
    you might like to look at the csv module.
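    A minimal sketch of the csv approach, assuming a current Python 3
    interpreter. The sample items are invented, and an in-memory
    io.StringIO buffer stands in for a real output file.

```python
import csv
import io

# Sample data: one item contains a comma, another contains quote characters
items = sorted(set(['plain', 'has,comma', 'has "quote"', 'plain']))

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(items)        # csv.writer quotes and escapes where needed
row_text = buffer.getvalue()

# Reading the row back recovers the original items unchanged
parsed = next(csv.reader(io.StringIO(row_text)))
print(parsed == items)  # prints True
```

    A plain ','.join(items) on the same data would produce a line that
    cannot be split back into the original three items.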
    John Machin, Jul 27, 2006
    #5
  6. maphew Guest

    thank you everybody for your help! That worked perfectly. :) I really
    appreciate the time you spent answering what is probably a pretty basic
    question for you. It's nice not to be ignored.

    be well,

    -matt
    maphew, Jul 27, 2006
    #6