list of unique non-subset sets

les_ander · Mar 17, 2005

Hi,
I have many set objects some of which can contain same group of object
while others can be subset of the other. Given a list of sets,
I need to get a list of unique sets such that non of the set is an
subset of another or contain exactly the same members.

Tried to do the following:
s1=set(['a','b','c'])
s2=set(['a','c'])
s3=set(['a','d','e','f'])
s4=set(['r','k','l'])
s5=set(['r','k','l'])
L=[s1,s2,s3,s4,s5]
----------------------- > cleaned-up list should contain s1, s3, s5

Lu=[] # unique list of sets
Lu.append( L[0] )
for i in range(1,len(L)):
s=L
# check for equality
if s in L:
continue
# check for subseting
for j in len(Lu):
s1=Lu[j]
if len (s-s1)==0:
continue
elif len(s1-s)==0:
Lu=s1 # replace with the bigger

But this does not work since I get
Lu=[set(['a', 'c', 'b'])]

What am I doing wrong?
thanks

Steven Bethard · Mar 17, 2005

Hi,
I have many set objects some of which can contain same group of object
while others can be subset of the other. Given a list of sets,
I need to get a list of unique sets such that non of the set is an
subset of another or contain exactly the same members.

Tried to do the following:
s1=set(['a','b','c'])
s2=set(['a','c'])
s3=set(['a','d','e','f'])
s4=set(['r','k','l'])
s5=set(['r','k','l'])
L=[s1,s2,s3,s4,s5]
----------------------- > cleaned-up list should contain s1, s3, s5

py> s1 = set(['a','b','c'])
py> s2 = set(['a','c'])
py> s3 = set(['a','d','e','f'])
py> s4 = set(['r','k','l'])
py> s5 = set(['r','k','l'])
py> lst = [s1, s2, s3, s4, s5]
py> uniqlst = []
py> for lstset in lst:
.... for uniqlstset in uniqlst:
.... if lstset.issubset(uniqlstset):
.... break
.... else:
.... uniqlst.append(lstset)
....
py> uniqlst
[set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f']), set(['k', 'r', 'l'])]

Not horribly efficient, mind you. But I believe it's correct.

I basically took your description "Given a list of sets, I need to get a
list of unique sets such that non[e] of the set is an subset of
another" and translated that to code. So I iterate through each element
in lst, and and only add that element to uniqlst if it's not a subset of
any of the items already in uniqlst.

STeVe

Raymond Hettinger · Mar 17, 2005

[[email protected]]

I have many set objects some of which can contain same group of object
while others can be subset of the other. Given a list of sets,
I need to get a list of unique sets such that non of the set is an
subset of another or contain exactly the same members.

Tried to do the following:
s1=set(['a','b','c'])
s2=set(['a','c'])
s3=set(['a','d','e','f'])
s4=set(['r','k','l'])
s5=set(['r','k','l'])
L=[s1,s2,s3,s4,s5]
----------------------- > cleaned-up list should contain s1, s3, s5

This should do the trick:

result = []
for s1 in L:
for s2 in result:
if s1 <= s2:
break
else:
result.append(s1)

print result

Raymond Hettinger

Kent Johnson · Mar 17, 2005

Raymond said:
[[email protected]]

I have many set objects some of which can contain same group of object
while others can be subset of the other. Given a list of sets,
I need to get a list of unique sets such that non of the set is an
subset of another or contain exactly the same members.

Tried to do the following:
s1=set(['a','b','c'])
s2=set(['a','c'])
s3=set(['a','d','e','f'])
s4=set(['r','k','l'])
s5=set(['r','k','l'])
L=[s1,s2,s3,s4,s5]
----------------------- > cleaned-up list should contain s1, s3, s5

Click to expand...

This should do the trick:

result = []
for s1 in L:
for s2 in result:
if s1 <= s2:
break
else:
result.append(s1)

print result

If I understand the original post correctly, you also need to check for an existing set being a
subset of the set you are adding. A better test case is
s1=set(['a','b','c'])
s2=set(['a','c'])
s3=set(['a','d','e','f'])
s4=set(['r','k','l'])
s5=set(['r','k','l'])
s6=set(['g', 'h'])
s7=set(['h', 'i'])
s8=set(['g', 'h', 'i'])

L=[s1,s2,s3,s4,s5,s6,s7,s8]
# ----------------------- > cleaned-up list should contain s1, s3, s5, s8

which both Raymond and STeVe's proposals fail.

Kent

Steven Bethard · Mar 17, 2005

Kent said:
Raymond said:

[[email protected]]

I have many set objects some of which can contain same group of object
while others can be subset of the other. Given a list of sets,
I need to get a list of unique sets such that non of the set is an
subset of another or contain exactly the same members.

Tried to do the following:
s1=set(['a','b','c'])
s2=set(['a','c'])
s3=set(['a','d','e','f'])
s4=set(['r','k','l'])
s5=set(['r','k','l'])
L=[s1,s2,s3,s4,s5]
----------------------- > cleaned-up list should contain s1, s3, s5

Click to expand...

This should do the trick:

result = []
for s1 in L:
for s2 in result:
if s1 <= s2:
break
else:
result.append(s1)

print result

Click to expand...

If I understand the original post correctly, you also need to check for
an existing set being a subset of the set you are adding. A better test
case is
s1=set(['a','b','c'])
s2=set(['a','c'])
s3=set(['a','d','e','f'])
s4=set(['r','k','l'])
s5=set(['r','k','l'])
s6=set(['g', 'h'])
s7=set(['h', 'i'])
s8=set(['g', 'h', 'i'])

L=[s1,s2,s3,s4,s5,s6,s7,s8]
# ----------------------- > cleaned-up list should contain s1, s3, s5, s8

which both Raymond and STeVe's proposals fail.

Can you just do:

py> def uniq(lst):
.... result = []
.... for s1 in sorted(lst, reverse=True):
.... for s2 in result:
.... if s1 <= s2:
.... break
.... else:
.... result.append(s1)
.... return result
....
py> lst = [set(['a','b','c']),
.... set(['a','c']),
.... set(['a','d','e','f']),
.... set(['r','k','l']),
.... set(['r','k','l']),
.... set(['g', 'h']),
.... set(['h', 'i']),
.... set(['g', 'h', 'i'])]
py> uniq(lst)
[set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f']), set(['k', 'r', 'l']),
set(['i', 'h', 'g'])]

Haven't thought about it too hard though...

STeVe

Peter Otten · Mar 17, 2005

Steven said:
Can you just do:

py> def uniq(lst):
...     result = []
...     for s1 in sorted(lst, reverse=True):
...         for s2 in result:
...             if s1 <= s2:
...                 break
...         else:
...             result.append(s1)
...     return result
...
py> lst = [set(['a','b','c']),
...        set(['a','c']),
...        set(['a','d','e','f']),
...        set(['r','k','l']),
...        set(['r','k','l']),
...        set(['g', 'h']),
...        set(['h', 'i']),
...        set(['g', 'h', 'i'])]
py> uniq(lst)
[set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f']), set(['k', 'r', 'l']),
set(['i', 'h', 'g'])]

No, you can't:

"""Since sets only define partial ordering (subset relationships), the
output of the list.sort() method is undefined for lists of sets.
"""

Maybe you could just change the above code to

....
for s1 in sorted(lst, key=len, reverse=True):
....

Haven't thought about it too hard though...

Same here.

Peter

Raymond Hettinger · Mar 17, 2005

Kent Johnson said:
Raymond said:

[[email protected]]

I have many set objects some of which can contain same group of object
while others can be subset of the other. Given a list of sets,
I need to get a list of unique sets such that non of the set is an
subset of another or contain exactly the same members.

Tried to do the following:
s1=set(['a','b','c'])
s2=set(['a','c'])
s3=set(['a','d','e','f'])
s4=set(['r','k','l'])
s5=set(['r','k','l'])
L=[s1,s2,s3,s4,s5]
----------------------- > cleaned-up list should contain s1, s3, s5

Click to expand...

This should do the trick:

result = []
for s1 in L:
for s2 in result:
if s1 <= s2:
break
else:
result.append(s1)

print result

Click to expand...

If I understand the original post correctly, you also need to check for an existing set being a
subset of the set you are adding. A better test case is
s1=set(['a','b','c'])
s2=set(['a','c'])
s3=set(['a','d','e','f'])
s4=set(['r','k','l'])
s5=set(['r','k','l'])
s6=set(['g', 'h'])
s7=set(['h', 'i'])
s8=set(['g', 'h', 'i'])

L=[s1,s2,s3,s4,s5,s6,s7,s8]
# ----------------------- > cleaned-up list should contain s1, s3, s5, s8

Try this:

result = []
seen = set()
for s1 in L:
for s2 in L:
if s1 < s2:
break
else:
if s1 not in seen:
result.append(s1)
seen.add(frozenset(s1))
print result

Raymond Hettinger

bearophileHUGS · Mar 18, 2005

s1 = set(['a','b','c'])
s2 = set(['a','c'])
s3 = set(['a','d','e','f'])
s4 = set(['r','k','l'])
s5 = set(['r','k','l'])
ls = [s1,s2,s3,s4,s5]
result1 = [s1, s3, s5]

A result can contain s4 or s5 at random. This problem looks like the
one of searching the correct hyperedges for a hypergraph. s1-5 are the
hyperedges, and "a", "b",... are the nodes. Probably this problem is
already solved somewhere.

.. # You can start removing the duplicated sets (slow operation):
.. ls2 = list(set(map(frozenset, ls)))
.. # Then I've found a solution similar to the first Raymond Hettinger
one:
.. result = []
.. for si in ls2:
.. for sj in result:
.. if si <= sj:
.. break
.. else:
.. result.append(si)
.. print result

This can be slow, but it probably can be made a bit faster version, but
it's not easy. You can put the nodes in a undirected graph that
represents the hypergraph, with other "extra nodes" that represent the
hyperedges already in result).

.. print list(enumerate(map(list,ls2)))
.. # Output:
[(0,['k','r','l']),(1,['a','c','b']),(2,['a','e','d','f']),(3,['a','c'])]
.. # Probably you can work on ls too, avoiding the slow creation of ls2,
but you have to cheek
.. # more things later.
.. import graph
.. g = graph.Graph()
.. for n1,s in enumerate(ls2):
.. for n2 in s:
.. g.addBiArc(n1, n2) # add edges and nodes as needed
..
.. # then you can find the connected components of the graph
.. print g # Output: 0-k,l,r 1-a,b,c 2-a,d,e,f 3-a,c
.. cc = g.connectedComponents()
.. # now cc = [[0, 'k', 'r', 'l'], [1, 'a', 'c', 'b', 2, 3, 'e', 'd',
'f']]
.. # and you can filter the hyperedges. This can be improved a lot:
.. ccf = [ [n for n in c if type(n)==int] for c in cc ]
.. # now ccf = [[0], [1, 2, 3]]

Eppstein unionFind data structure can also be used for a similar
purpose. 0-th element of ls2 is set(['r','k','l']). It's totally
disjoint from the others. Now you don't have to look every sj in
result, but only the sj of result that are inside its connected ones.
For example, you don't have to test set(['a','c']) against
set(['r','k','l']). I don't know how much faster this can be compared
to the simple O(len(ls)^2) program seen before... But it can be useful
for situations with lots of sets.

Bye,
Bearophile

bearophileHUGS · Mar 18, 2005

Looking at all the hyperedges in the connected component is a big
waste... You can look at just the hyperedges that share one or more
nodes.
(Nodes are the original letters contained in the sets, and they must be
hashable).

If nodes aren't integers in [0, len(l)) then you can use this simpler
code:

.. l = [set(['a','b','c']), set(['a','c']), set(['a','d','e','1']),
.. set(['r','k','l']), set(['b','e','1']), set(['a','c']) ]
.. from graph import Graph
.. # http://www.fantascienza.net/animalia/graph.zip
.. # you can something similar with Boost
..
.. debug = True
.. g = Graph()
.. for n1,s in enumerate(l):
.. g.addNode(n1, nodeData=1)
.. for n2 in s:
.. g.addBiArc(n1, n2)
.. if debug: print g # ****
..
.. result = []
.. for i,s in enumerate(l):
.. ss = set()
.. for n1 in s:
.. ss.update(g.xoutNodes(n1))
.. ss.remove(i)
.. if debug: print i, ss # ****
..
.. for sj in ss:
.. if s <= l[sj]:
.. break
.. else:
.. result.append(s)
.. print result

If nodes are hashable, but they can cointain integers in [0, len(l)),
then you can use this other code with nodeID translations (in a
different graph implementation, like Gato one, such translations can be
automatic):

.. l = [set(['a','b','c']), set(['a','c']), set(['a','d','e','1']),
.. set(['r','k','l']), set(['b','e','1']), set(['a','c']) ]
.. from graph import Graph
..
.. debug = True
.. nodes = set()
.. for s in l:
.. nodes.update(s)
.. lenl = len(l)
.. nodes = dict( (i+lenl, n) for i,n in enumerate(nodes) )
.. if debug: print "nodes:", nodes # ****
.. nodesid = dict( (n,i) for i,n in nodes.iteritems() )
.. l2 = [ set( nodesid[n] for n in s ) for s in l ]
.. if debug: print "l2:", l2 # ****
..
.. g = Graph()
.. for n1,s in enumerate(l2):
.. g.addNode(n1)
.. for n2 in s:
.. g.addBiArc(n1, n2)
.. if debug: print g # ****
..
.. result = []
.. for i,s in enumerate(l2):
.. ss = set()
.. for n1 in s:
.. ss.update(g.xoutNodes(n1))
.. ss.remove(i)
.. if debug: print "i, ss:", i, ss # ****
..
.. for sj in ss:
.. if s <= l2[sj]:
.. break
.. else:
.. result.append(s)
.. if debug: print "result:", result # ****
.. result2 = [ set( nodes[n] for n in s ) for s in result ]
.. print result2

If the hyperedges define a sparse hypergraph, then this code can be
quite faster than the original quadratic one (if len(l) is big enough).

Bye,
Bearophile

Bengt Richter · Mar 18, 2005

Kent Johnson said:
Kent Johnson said:

Raymond said:

[[email protected]]

I have many set objects some of which can contain same group of object
while others can be subset of the other. Given a list of sets,
I need to get a list of unique sets such that non of the set is an
subset of another or contain exactly the same members.

Tried to do the following:
s1=set(['a','b','c'])
s2=set(['a','c'])
s3=set(['a','d','e','f'])
s4=set(['r','k','l'])
s5=set(['r','k','l'])
L=[s1,s2,s3,s4,s5]
----------------------- > cleaned-up list should contain s1, s3, s5

This should do the trick:

result = []
for s1 in L:
for s2 in result:
if s1 <= s2:
break
else:
result.append(s1)

print result

Click to expand...

If I understand the original post correctly, you also need to check for an existing set being a
subset of the set you are adding. A better test case is
s1=set(['a','b','c'])
s2=set(['a','c'])
s3=set(['a','d','e','f'])
s4=set(['r','k','l'])
s5=set(['r','k','l'])
s6=set(['g', 'h'])
s7=set(['h', 'i'])
s8=set(['g', 'h', 'i'])

L=[s1,s2,s3,s4,s5,s6,s7,s8]
# ----------------------- > cleaned-up list should contain s1, s3, s5, s8

Click to expand...

Try this:

result = []
seen = set()
for s1 in L:
for s2 in L:
if s1 < s2:
break
else:
if s1 not in seen:
result.append(s1)
seen.add(frozenset(s1))
print result

Actually, ISTM there are more than one set of sets that can be drawn from the
original set that internally satisfy the criteria of no internal duplicates or subsets. E.g.,
here are some lists of sets selected from L above. I believe (unless I goofed) the requirement
""" """
is satisfied internally within each list below.

[set(['h', 'g']), set(['i', 'h'])]
[set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f'])]
[set(['k', 'r', 'l']), set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f']), set(['h', 'g'])]
[set(['a', 'e', 'd', 'f']), set(['a', 'c']), set(['h', 'g']), set(['i', 'h'])]
[set(['i', 'h'])]
[set(['k', 'r', 'l']), set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f'])]
[]
[set(['a', 'c', 'b']), set(['h', 'g']), set(['i', 'h'])]
[set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f']), set(['i', 'h'])]
[set(['k', 'r', 'l']), set(['a', 'e', 'd', 'f']), set(['a', 'c']), set(['i', 'h'])]
[set(['k', 'r', 'l']), set(['a', 'c', 'b']), set(['h', 'g'])]
[set(['a', 'c', 'b'])]
[set(['k', 'r', 'l']), set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f']), set(['i', 'h', 'g'])]
[set(['a', 'e', 'd', 'f']), set(['a', 'c']), set(['i', 'h', 'g'])]
[set(['k', 'r', 'l']), set(['a', 'e', 'd', 'f']), set(['i', 'h', 'g'])]
[set(['a', 'e', 'd', 'f']), set(['i', 'h'])]
[set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f']), set(['i', 'h', 'g'])]
[set(['k', 'r', 'l']), set(['a', 'e', 'd', 'f']), set(['a', 'c']), set(['i', 'h', 'g'])]
[set(['a', 'c'])]
[set(['k', 'r', 'l']), set(['a', 'c']), set(['h', 'g'])]
[set(['k', 'r', 'l']), set(['h', 'g']), set(['i', 'h'])]
[set(['k', 'r', 'l']), set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f']), set(['i', 'h'])]
[set(['a', 'e', 'd', 'f']), set(['a', 'c']), set(['i', 'h'])]
[set(['a', 'c']), set(['h', 'g']), set(['i', 'h'])]
[set(['k', 'r', 'l']), set(['a', 'e', 'd', 'f']), set(['i', 'h'])]
[set(['k', 'r', 'l']), set(['a', 'e', 'd', 'f']), set(['a', 'c'])]
[set(['a', 'e', 'd', 'f']), set(['i', 'h', 'g'])]
[set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f']), set(['h', 'g']), set(['i', 'h'])]
[set(['k', 'r', 'l']), set(['a', 'e', 'd', 'f']), set(['a', 'c']), set(['h', 'g'])]
[set(['k', 'r', 'l']), set(['a', 'e', 'd', 'f']), set(['h', 'g']), set(['i', 'h'])]
[set(['a', 'e', 'd', 'f']), set(['a', 'c'])]
[set(['k', 'r', 'l']), set(['h', 'g'])]
[set(['k', 'r', 'l']), set(['a', 'e', 'd', 'f']), set(['h', 'g'])]
[set(['k', 'r', 'l']), set(['a', 'c']), set(['h', 'g']), set(['i', 'h'])]
[set(['h', 'g'])]
[set(['k', 'r', 'l']), set(['a', 'c', 'b'])]
[set(['k', 'r', 'l']), set(['a', 'c', 'b']), set(['i', 'h'])]
[set(['a', 'c', 'b']), set(['i', 'h', 'g'])]
[set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f']), set(['h', 'g'])]
[set(['a', 'c', 'b']), set(['h', 'g'])]
[set(['k', 'r', 'l']), set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f']), set(['h', 'g']), set(['i', 'h'])]
[set(['k', 'r', 'l']), set(['i', 'h'])]
[set(['k', 'r', 'l']), set(['a', 'c', 'b']), set(['i', 'h', 'g'])]
[set(['a', 'c', 'b']), set(['i', 'h'])]
[set(['a', 'c']), set(['i', 'h'])]
[set(['k', 'r', 'l']), set(['a', 'c']), set(['i', 'h', 'g'])]
[set(['k', 'r', 'l']), set(['a', 'e', 'd', 'f']), set(['a', 'c']), set(['h', 'g']), set(['i', 'h'])]
[set(['k', 'r', 'l']), set(['a', 'e', 'd', 'f'])]
[set(['a', 'c']), set(['h', 'g'])]
[set(['k', 'r', 'l']), set(['i', 'h', 'g'])]
[set(['a', 'e', 'd', 'f']), set(['a', 'c']), set(['h', 'g'])]
[set(['a', 'e', 'd', 'f']), set(['h', 'g']), set(['i', 'h'])]
[set(['a', 'c']), set(['i', 'h', 'g'])]
[set(['k', 'r', 'l']), set(['a', 'c']), set(['i', 'h'])]
[set(['k', 'r', 'l']), set(['a', 'c'])]
[set(['a', 'e', 'd', 'f']), set(['h', 'g'])]
[set(['k', 'r', 'l']), set(['a', 'c', 'b']), set(['h', 'g']), set(['i', 'h'])]
[set(['a', 'e', 'd', 'f'])]
[set(['i', 'h', 'g'])]

Regards,
Bengt Richter

les_ander · Mar 18, 2005

OK, so I need to be more precise.
Given a list of sets, output the largest list of sets (from this list,
order does not matter) such that:
1) there is no set that is a PROPER subset of another set in this list
2) no two sets have exactly the same members (100% overlap)

Seems like this problem is much harder than I thought...

Bengt Richter · Mar 19, 2005

OK, so I need to be more precise.
Given a list of sets, output the largest list of sets (from this list,
order does not matter) such that:
1) there is no set that is a PROPER subset of another set in this list
2) no two sets have exactly the same members (100% overlap)

Seems like this problem is much harder than I thought...

two from the above come out length 5:

5: [set(['k', 'r', 'l']), set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f']), set(['h', 'g']), set(['i', 'h'])]
5: [set(['k', 'r', 'l']), set(['a', 'e', 'd', 'f']), set(['a', 'c']), set(['h', 'g']), set(['i', 'h'])]

How do you define "largest" ? ;-)
I guess you could sum(map(len, setlist)) as a measure, but what if that makes
a tie between two setlists (as I think it could, in general)?

Regards,
Bengt Richter

Raymond Hettinger · Mar 19, 2005

[[email protected] ]
[Bengt Richter]

two from the above come out length 5:

5: [set(['k', 'r', 'l']), set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f']),

set(['h', 'g']), set(['i', 'h'])]

5: [set(['k', 'r', 'l']), set(['a', 'e', 'd', 'f']), set(['a', 'c']),

set(['h', 'g']), set(['i', 'h'])]

How do you define "largest" ? ;-)
I guess you could sum(map(len, setlist)) as a measure, but what if that makes
a tie between two setlists (as I think it could, in general)?

With multiple outputs satisfying the constraints, I would suspect that there is
something wrong with the original spec (not as a stand-alone problem, but as
component of a real application).

Raymond Hettinger

les_ander · Mar 19, 2005

Once again my specs were incomplete.
By largest I mean exactly what you pointed out as in sum(map(len,
setlist)).

I think this might work--sorting of the initial list should do the
trick.
1) sort the sets by size (in decending order)
2) put the first (largest) into a new list (Lu)
for s in Lnew[1:]:
keep=True
for i in range(len( Lun) ):
if len(s)==len( s & Lun ):
keep=False
break
if keep==True:
Lun.append( s )

----------------- here is the complete code
s1=set(['a','b','c'])
s2=set(['a','c'])
s3=set(['a','d','e','f'])
s4=set(['r','k','l'])
s5=set(['r','k','l'])
s6=set(['g', 'h'])
s7=set(['h', 'i'])
s8=set(['g', 'h', 'i'])

L=[s1,s2,s3,s4,s5,s6,s7,s8]
length=[len(s) for s in L]
L2= sorted(zip(length,range(len(L))))
Lnew=[L[j] for (i,j) in L2]
Lnew.reverse()
Lun=[Lnew[0]] # list with the result
for s in Lnew[1:]:
keep=True
for i in range(len( Lun) ):
if len(s)==len( s & Lun ):
keep=False
break
if keep==True:
Lun.append( s )

#----------------[set(['a', 'e', 'd', 'f']), set(['i', 'h', 'g']), set(['k', 'r', 'l']),
set(['a', 'c', 'b'])]

Seems like I got it.

Bengt Richter · Mar 19, 2005

Once again my specs were incomplete.
By largest I mean exactly what you pointed out as in sum(map(len,
setlist)).

But that will not necessarily yield a single setlist taken from the source set list,
so you still need a selection amongst equals. You can make it explicit, or
you can say you'll take whatever a sort puts at the sorted list extreme, which you
seem to have done ;-)

I think this might work--sorting of the initial list should do the
trick.
1) sort the sets by size (in decending order)
2) put the first (largest) into a new list (Lu)
for s in Lnew[1:]:
keep=True
for i in range(len( Lun) ):
if len(s)==len( s & Lun ):
keep=False
break
if keep==True:
Lun.append( s )

----------------- here is the complete code
s1=set(['a','b','c'])
s2=set(['a','c'])
s3=set(['a','d','e','f'])
s4=set(['r','k','l'])
s5=set(['r','k','l'])
s6=set(['g', 'h'])
s7=set(['h', 'i'])
s8=set(['g', 'h', 'i'])

L=[s1,s2,s3,s4,s5,s6,s7,s8]
length=[len(s) for s in L]
L2= sorted(zip(length,range(len(L))))
Lnew=[L[j] for (i,j) in L2]
Lnew.reverse()
Lun=[Lnew[0]] # list with the result
for s in Lnew[1:]:
keep=True
for i in range(len( Lun) ):
if len(s)==len( s & Lun ):
keep=False
break
if keep==True:
Lun.append( s )

#----------------[set(['a', 'e', 'd', 'f']), set(['i', 'h', 'g']), set(['k', 'r', 'l']),
set(['a', 'c', 'b'])]

Seems like I got it.

Does it work on
L = [set('abc'), set('def'), set('abcdef')]
?
I get:
0: []
6: [set(['a', 'c', 'b']), set(['e', 'd', 'f'])]
6: [set(['a', 'c', 'b', 'e', 'd', 'f'])]

Your code above:

>>> L = [set('abc'), set('def'), set('abcdef')]
>>> length=[len(s) for s in L]
>>> L2= sorted(zip(length,range(len(L))))
>>> Lnew=[L[j] for (i,j) in L2]
>>> Lnew.reverse()
>>> Lun=[Lnew[0]] # list with the result
>>> for s in Lnew[1:]:

Click to expand...

Click to expand...

Click to expand...

... keep=True
... for i in range(len( Lun) ):
... if len(s)==len( s & Lun ):
... keep=False
... break
... if keep==True:
... Lun.append( s )
... [set(['a', 'c', 'b', 'e', 'd', 'f'])]

But your successful set list is not unique in its success value (6) ;-)

Regards,
Bengt Richter

Dirk Thierbach · Mar 19, 2005

Raymond Hettinger said:
[[email protected] ]

[Bengt Richter]

two from the above come out length 5:

Click to expand...

With multiple outputs satisfying the constraints, I would suspect
that there is something wrong with the original spec (not as a
stand-alone problem, but as component of a real application).

I don't know what the application is trying to do, but if I understand
it correctly, he is asking for something called a "maximal anti-chain".

For any partial order (e.g., subset relation), a chain is a set of
elements which are all comparable (and hence there is a linear or
total order on this set). In a similar way, an anti-chain is a set of
elements consisting of elements that are incomparable to each
other. Anti-chains themselves can be ordered by subset inclusion, and
thefore maximal ("largest") anti-chains can be found, but in general
there is no unique "greatest" anti-chain.

So the spec is perfectly ok and makes a lot of sense mathematically,
it's just that there is no unique solution.

Maybe googling for "maximal anti-chain" digs up some more information
(I haven't tried).

- Dirk

How to speed this code	3	Nov 16, 2022
2 JK Circuit in VHDL	0	Mar 29, 2019
Multiple disjoint sample sets?	3	Jan 11, 2013
Range / empty list issues??	1	Dec 11, 2023
Code help please	4	May 19, 2023
dynamic list of structures	2	Feb 13, 2009
how to find the longst element list of lists	16	Jan 7, 2007
pls help me ; vhdl;	0	Jun 21, 2016

list of unique non-subset sets

les_ander

Steven Bethard

Raymond Hettinger

Kent Johnson

Steven Bethard

Peter Otten

Raymond Hettinger

bearophileHUGS

bearophileHUGS

Bengt Richter

les_ander

Bengt Richter

Raymond Hettinger

les_ander

Bengt Richter

Dirk Thierbach

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads