sets and subsets

Bart Nessux

By using lists, I can create sets of numbers. Suppose I have three lists.
One list is the super-set, one is a set that contains all the numbers
(just like the super-set) and the last is a sub-set of the super-set. For
example:

a = [1,2,3,4,5] # The super-set.
b = [1,2,3,4,5] # Looks just like the super-set, but it's not.
c = [2,4] # A sub-set

I'd like to remove 2 & 4 from set b BECAUSE they are present in set c...
this would make the sets look like this:

a = [1,2,3,4,5]
b = [1,3,5]
c = [2,4]

How do I test set c to find what it contains and then look at set b to
see if it contains any of those same numbers and, if so, remove them?
 
Amy G

I am sure there is a much more elegant way to do this, but here is one
solution.

for item in c:
    if b.count(item) > 0:
        b.remove(item)
 
Elaine Jackson

b=[x for x in b if x not in c]
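
The two list-based answers agree on the example data, but they can differ when b holds repeated values: `list.remove()` deletes only the first match, while the comprehension drops every match and builds a new list. A minimal sketch, assuming a made-up list with a duplicated 2 (the names `b_loop` and `b_comp` are only for illustration):

```python
c = [2, 4]

# Loop with list.remove(): deletes only the first occurrence of each item.
b_loop = [1, 2, 2, 3, 4, 5]
for item in c:
    if item in b_loop:
        b_loop.remove(item)

# List comprehension: drops every occurrence and builds a new list.
b_comp = [x for x in [1, 2, 2, 3, 4, 5] if x not in c]

print(b_loop)
print(b_comp)
```

If the lists never contain duplicates, as in the original question, either form gives the same result.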

 
Lee Harr

>>> from sets import Set
>>> a = Set([1, 2, 3, 4, 5])
>>> b = Set([1, 2, 3, 4, 5])
>>> c = Set([2, 4])
>>> s = b - c
>>> s
Set([1, 3, 5])
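
For readers on a current Python: the `sets` module used above was later superseded by the built-in `set` type (available since Python 2.4, and the only option in Python 3). A minimal sketch of the same difference with the built-in type, assuming the values from the original question:

```python
# Same values as in the original question.
a = {1, 2, 3, 4, 5}   # the super-set
b = set(a)            # an independent copy of the super-set
c = {2, 4}            # the sub-set to exclude

# Set difference: a new set with the members of b that are not in c.
s = b - c

# In-place variant: removes c's members from b itself.
b -= c

print(sorted(s))  # sorted() because sets have no defined order
```

The in-place form is closest to what the question asks for, since it changes b and leaves a untouched.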
 
Bart Nessux

Works great too. Thanks to all for the info.

Amy said:
There you go... list comprehension. That is definitely nicer to look at.


b=[x for x in b if x not in c]

 
Peter Otten

You want set operations, so why would you use lists?
>>> from sets import Set
>>> a = Set([1,2,3,4,5])
>>> c = Set([2,4])
>>> b = a - c
>>> b
Set([1, 3, 5])

Peter
 
Bart Nessux

Peter said:
You want set operations, so why would you use lists?

All my data are in lists:

inputFile = file('ips.txt', 'r') #Super-set
include = inputFile.readlines()
inputFile.close()

# The file below is compiled manually by hand... add IPs to it
# whenever you want to exclude them from IP_protection.
readFile = file('excluded_ips.txt', 'r') #Sub-set to exclude
exclude = readFile.readlines()
readFile.close()

include = [x for x in include if x not in exclude] #Magic of Elaine

outputFile = file('pruned_ips.txt' , 'w')
for i in include:
    print >> outputFile, i,
outputFile.close()
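
One thing to watch with this approach: `readlines()` keeps the trailing newline on each line, so an address on the last line of a file that lacks a final newline will not compare equal to the same address elsewhere. A minimal sketch, assuming one entry per line; the helper name `prune` is made up for illustration, and it uses the built-in `set` (Python 2.4 and later) only to speed up the membership test while keeping the original order:

```python
def prune(include_lines, exclude_lines):
    """Return the stripped lines of include_lines that are not in exclude_lines.

    Hypothetical helper: assumes one entry per line and ignores
    surrounding whitespace and blank lines.
    """
    excluded = set(line.strip() for line in exclude_lines)
    return [line.strip() for line in include_lines
            if line.strip() and line.strip() not in excluded]


# Note the missing newline at the end of the exclude list.
include = ["10.0.0.1\n", "10.0.0.2\n", "10.0.0.3\n"]
exclude = ["10.0.0.2"]
pruned = prune(include, exclude)
print(pruned)
```

Because the result no longer carries newlines, it would need them added back when written to a file.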
 
Aahz

[content at *bottom*]

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet?
 
Peter Otten

Bart said:
All my data are in lists:

All my beer is in sieves.

(untested)
from sets import Set

inputFile = file('ips.txt', 'r') #Super-set
include = Set(inputFile.readlines())
inputFile.close()

readFile = file('excluded_ips.txt', 'r') #Sub-set to exclude
exclude = Set(readFile.readlines())
readFile.close()

# No Magic of Elaine

outputFile = file('pruned_ips.txt' , 'w')
for i in include - exclude:
    print >> outputFile, i,
outputFile.close()

Peter
 
François Pinard

Here is an equivalent, shorter algorithm (tested):

from sets import Set
file('pruned_ips.txt', 'w').writelines(
    Set(file('ips.txt')) - Set(file('excluded_ips.txt')))

This code relies on `writelines' accepting an iterable, sets returning
their members whenever iterated, Set constructors accepting an iterable,
and files returning their lines whenever iterated. And of course, on
`close' rarely being needed in Python! :)

The order of lines in the produced file is kind of random, however.
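
The same idea can be sketched in current Python, where `file()` and the `sets` module no longer exist. This version assumes the built-in `set` and `open()`, uses `with` so the files are closed explicitly, and sorts the result so that the output order is at least reproducible. The function name `write_difference` and the sorting step are additions for illustration, not part of the one-liner above:

```python
def write_difference(include_path, exclude_path, output_path):
    """Write the lines of include_path that do not appear in exclude_path."""
    with open(include_path) as include_file:
        include = set(include_file)      # iterating a file yields its lines
    with open(exclude_path) as exclude_file:
        exclude = set(exclude_file)
    with open(output_path, "w") as output_file:
        # sorted() gives a reproducible (plain string) order;
        # it does not restore the original file order.
        output_file.writelines(sorted(include - exclude))
```

A call such as `write_difference('ips.txt', 'excluded_ips.txt', 'pruned_ips.txt')` would mirror the file names used in this thread.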
 
Bart Nessux

Peter Otten wrote:
outputFile = file('pruned_ips.txt' , 'w')
for i in include - exclude:
    print >> outputFile, i,
outputFile.close()

Wow! That makes a lot more sense than the list comprehension stuff. I think
I'll use it. Thanks!
 
Bart Nessux

Wow! Sets are awesome. I was thinking in terms of lists. A list is like
a set and a set is like a list, but depending on the task at hand, they
have very different applications. Sets work great when one has a
super-set and two sub-sets that need to be compared and modified based
on what they contain and what the super-set contains.

Sets are straightforward and easy to use too... I can always tell when
I'm trying to do something with a tool that wasn't designed to do what
I'm attempting to do (in this case lists). The task becomes complex and
tedious. Forget about trying to read the code a couple of weeks from now.

Thanks to all for the info on sets!
 
Dave K

François Pinard said:
Here is an equivalent, shorter algorithm (tested):

from sets import Set
file('pruned_ips.txt', 'w').writelines(
    Set(file('ips.txt')) - Set(file('excluded_ips.txt')))

The order of lines in the produced file is kind of random, however.

That's very compact and neat, but for completeness I'd like to point
out that it could also be written (more clumsily) in one line with
list comprehensions, retaining the same order of elements as in the
original list:

file('pruned_ips.txt', 'w').writelines([ip for ip in file('ips.txt')
    if ip not in file('excluded_ips.txt')])

Of course, your example using sets is much clearer, so I prefer that.

Dave
 
Tim Roberts

[content at *bottom*]
...
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet?

You know, it may be time for this crusade to end. What you say is entirely
valid if a single Usenet post is read out of context. However, when using
a threaded newsreader, as 90% of us do, top-posting allows me to read the
new content without moving my eyes. I just press N, N, N to move to the
next message and scan the new content at the top of the message. In this
particular thread, the content quickly grew larger than my newsreader's
preview pane, so bottom-posting requires me to move the focus to the
preview pane and scroll down to read.

Besides, the MOST annoying thing on Usenet is HTML posts.
 
François Pinard

from sets import Set
[Dave K]
file('pruned_ips.txt', 'w').writelines([ip for ip in file('ips.txt')
    if ip not in file('excluded_ips.txt')])

The Set solution above swallows both files in memory, but executes
rather quickly. The list comprehension solution uses much less memory,
but as the second file is read in full for each line of the first file,
it may become prohibitively slow when the files are not small. For very
big files, both solutions are wrong anyway: one should likely disk-sort
both files and do a simultaneous read of the sorted results.
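
The simultaneous read of two sorted inputs could look roughly like this. It is only a sketch: it assumes both inputs are already sorted in the same ascending order (for instance by an external sort utility), and the function name `sorted_difference` is hypothetical. Only one item from each input is held at a time:

```python
def sorted_difference(include_lines, exclude_lines):
    """Yield items of include_lines that are absent from exclude_lines.

    Both arguments must be iterables sorted in ascending order.
    """
    exclude_iter = iter(exclude_lines)
    sentinel = object()                      # marks the end of exclude_lines
    current = next(exclude_iter, sentinel)
    for line in include_lines:
        # Advance the exclude side until it catches up with `line`.
        while current is not sentinel and current < line:
            current = next(exclude_iter, sentinel)
        if current is sentinel or current != line:
            yield line


result = list(sorted_difference(["a\n", "b\n", "c\n", "d\n"], ["b\n", "d\n"]))
print(result)
```

Since open file objects are iterables of lines, two sorted files could be passed in directly and the generator's output handed to `writelines`.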
 
François Pinard

[Tim Roberts]
Aahz wrote:
[content at *bottom*]
...
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet?
You know, it may be time for this crusade to end.

I'm glad that Aahz has the courage to crusade, which spares me some of
mine! :) In my case at least, crusades are not fun, though necessary
at times. This particular crusade should only end once people are
educated enough to spontaneously do the proper thing.

In this particular thread, the content quickly grew larger than my
newsreader's preview pane, so bottom-posting requires me to move the
focus to the preview pane and scroll down to read.

People are likely over-quoting, then. Proper quoting is an art, and
when done correctly, quoted material is not an annoyance. On the Python
list, most messages do well in that respect and hopefully, most people
pick up good habits by mere observation and imitation. It is not bad to
remind people, once in a while, for those who are slower to get it.

P.S. - Mail or news readers may have an option to hide quoted material.
I sometimes use it in Mutt for messages which are not crafted correctly.
 
