removing duplicates, or, converting Set() to string

maphew

Hello,

I have some lists for which I need to remove duplicates. I found the
sets.Sets() module which does exactly this, but how do I get the set
back out again?

# existing input: A,B,B,C,D
# desired result: A,B,C,D

import sets
dupes = ['A','B','B','C','D']
clean = sets.Set(dupes)

out = open('clean-list.txt','w')
out.write(clean)
out.close()

---
out.write(clean) fails with "TypeError: argument 1 must be string or
read-only character buffer, not Set" and out.write( str(clean) )
creates "Set(['A', 'C', 'B', 'D'])" instead of just A,B,C,D.

thanks in advance for your time,

-matt
 
bearophileHUGS

The write() method accepts only strings, so you could do:

out.write( repr(list(clean)) )

Notes:
- If you need the strings in a nice order, you may sort them before
saving them:
out.write( repr(sorted(clean)) )
- If you need them in their original order, you need a stable
(order-preserving) method; you can extract the relevant code from this
large blob:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/438599
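
For readers who only need the order-preserving part, here is a minimal sketch of the idea. It is not the code from the recipe linked above; the helper name unique_in_order is made up for illustration, and it assumes Python 3 and hashable items.

```python
def unique_in_order(items):
    """Return items with duplicates dropped, keeping first-seen order."""
    seen = set()          # everything encountered so far (items must be hashable)
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


dupes = ['A', 'B', 'B', 'C', 'D']
clean = unique_in_order(dupes)
line = ','.join(clean)    # a plain string, ready for out.write(line)
```

Unlike a bare set, this keeps the items in the order they first appeared, so no sort is needed afterwards.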

Bye,
bearophile
 
Simon Forman

Do ','.join(clean) to make a single string with commas between the
items in the set. (If the items aren't all strings, you'll need to
convert them to strings first.)
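
A short sketch of the join approach, including the conversion step for non-string items. It assumes Python 3 and the built-in set rather than sets.Set; sorted() is used here only to make the output order predictable, since a set has no defined order.

```python
# All-string items: join works directly on the (sorted) set.
clean = set(['A', 'B', 'B', 'C', 'D'])
text = ','.join(sorted(clean))

# Non-string items: convert each one with str() before joining,
# otherwise join raises a TypeError.
numbers = set([3, 1, 2, 2])
number_text = ','.join(str(n) for n in sorted(numbers))
```

Either result is an ordinary string, so it can be passed straight to out.write().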

Peace,
~Simon
 
John Machin

maphew said:
Hello,

I have some lists for which I need to remove duplicates. I found the
sets.Sets() module which does exactly this

I think you mean that you found the sets.Set() constructor in the sets
module.
If you are using Python 2.4, use the built-in set() function instead.
If you are using Python 2.3, consider upgrading if you can.

maphew said:
but how do I get the set
back out again?


# existing input: A,B,B,C,D
# desired result: A,B,C,D

import sets
dupes = ['A','B','B','C','D']
clean = sets.Set(dupes)

out = open('clean-list.txt','w')
out.write(clean)
out.close()

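maphew said:
out.write(clean) fails with "TypeError: argument 1 must be string or
read-only character buffer, not Set"

As expected: write() accepts only strings.

maphew said:
and out.write( str(clean) )
creates "Set(['A', 'C', 'B', 'D'])" instead of just A,B,C,D.

Again as expected: str() of a set gives its repr.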

BTW, in practice you'd probably want to append '\n' to the string that
you're writing.

You should be able to get a (possibly unsorted) list of the contents of
*any* container like this (but note that dictionaries divulge only
their keys):
>>> dupes = ['A','B','B','C','D']
>>> clean = set(dupes) # the Python 2.4+ way
>>> clean
set(['A', 'C', 'B', 'D'])
>>> [x for x in clean]
['A', 'C', 'B', 'D']

list(clean) would work as well.

If you want the output sorted, then use the list.sort() method. Details
in the manual.
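
Putting the pieces from this post together, a hedged sketch of the whole round trip: deduplicate, sort with list.sort(), join, and write with a trailing newline. It assumes Python 3; the function name write_clean_line is hypothetical, and an io.StringIO buffer stands in for the real file so the example has no side effects.

```python
import io


def write_clean_line(items, out):
    """Write the unique items to `out` as one sorted, comma-separated line."""
    unique = list(set(items))
    unique.sort()                        # sorts in place and returns None
    out.write(','.join(unique) + '\n')   # write() needs a string


buffer = io.StringIO()   # stand-in for open('clean-list.txt', 'w')
write_clean_line(['A', 'B', 'B', 'C', 'D'], buffer)
result = buffer.getvalue()
```

With a real file, passing the object returned by open('clean-list.txt', 'w') in place of the buffer should behave the same way; remember to close it afterwards.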

HTH,
John
 
John Machin

Simon said:
Do ','.join(clean) to make a single string with commas between the
items in the set. (If the items aren't all strings, you'll need to
convert them to strings first.)

And if the items themselves could contain commas, or quote characters,
you might like to look at the csv module.
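
To illustrate the point about commas and quotes, a small sketch using csv.writer from the standard library. The sample items are invented for the example, Python 3 is assumed, and an in-memory buffer is used in place of a file.

```python
import csv
import io

# Made-up items, some of which contain a comma or quote characters.
items = sorted(set(['plain', 'has,comma', 'has "quote"', 'plain']))

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator='\n')
writer.writerow(items)       # quotes and escapes fields only where needed
line = buffer.getvalue()

# Reading the line back recovers the original items unchanged.
round_trip = next(csv.reader(io.StringIO(line)))
```

A plain ','.join() would make 'has,comma' look like two separate items when the file is read back; the csv module avoids that by quoting such fields. When writing to a real file in Python 3, the csv documentation recommends opening it with newline=''.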
 
maphew

thank you everybody for your help! That worked perfectly. :) I really
appreciate the time you spent answering what is probably a pretty basic
question for you. It's nice not to be ignored.

be well,

-matt
 
