More user feedback on Sets.py

  • Thread starter Raymond Hettinger
  • Start date
R

Raymond Hettinger

For Py2.4, I'm working on a C implementation of Sets.py with the possibility of
having a set() type as a builtin. There is a question as whether the current
design for set of sets has been useful.

To support sets-of-sets, the current design includes an automatic conversion to
an immutable set class. The original use case was for implementing graph
algorithms.

Has anyone used sets-of-sets to solve any non-toy problems?
Has anyone successfully applied sets-of-sets to graph algorithms?
If the need has arisen, is the current design sufficient?


Raymond Hettinger
 
D

David Eppstein

"Raymond Hettinger said:
For Py2.4, I'm working on a C implementation of Sets.py with the possibility
of
having a set() type as a builtin. There is a question as whether the current
design for set of sets has been useful.

To support sets-of-sets, the current design includes an automatic conversion
to
an immutable set class. The original use case was for implementing graph
algorithms.

Has anyone used sets-of-sets to solve any non-toy problems?
Has anyone successfully applied sets-of-sets to graph algorithms?
If the need has arisen, is the current design sufficient?

I've certainly been doing some implementations of graph algorithms in
which the graph vertices represent sets of objects (but I've been
representing them as bitvectors since I am still running 2.2 and don't
have Sets.py yet). It would be awkward if I couldn't have sets of
vertices...but I don't see any reason for the sets in these applications
to ever be mutable, so automatic conversion to immutability doesn't seem
to be critical.

For something concrete (that I thought about doing recently but haven't
actually done): conversion of NFA to DFA and then state minimization of
the resulting DFA. The converted DFA's states are sets of states of the
NFA, and the state minimization involves sets of DFA states.
 
R

Raymond Hettinger

[Raymond Hettinger]
[David Eppstein]
I've certainly been doing some implementations of graph algorithms in
which the graph vertices represent sets of objects (but I've been
representing them as bitvectors since I am still running 2.2 and don't
have Sets.py yet).

I have added some backwards compatability code so that Sets.py will run
Python 2.2. Please give it a test drive:

http://cvs.sourceforge.net/viewcvs..../sets.py?content-type=text/plain&rev=1.44.8.4

It would be awkward if I couldn't have sets of
vertices...but I don't see any reason for the sets in these applications
to ever be mutable, so automatic conversion to immutability doesn't seem
to be critical.

That was useful feedback.

For something concrete (that I thought about doing recently but haven't
actually done): conversion of NFA to DFA and then state minimization of
the resulting DFA. The converted DFA's states are sets of states of the
NFA, and the state minimization involves sets of DFA states.

That would be a perfect way to test drive the sets module.


Raymond
 
D

David Eppstein

"Raymond Hettinger said:
I have added some backwards compatability code so that Sets.py will run
Python 2.2. Please give it a test drive:

http://cvs.sourceforge.net/viewcvs.py/*checkout*/python/python/dist/src/Lib/se
ts.py?content-type=text%2Fplain&rev=1.44.8.4

Thanks! Doesn't quite work out of the box, though...

Python 2.2 (#1, 07/14/02, 23:25:09)
[GCC Apple cpp-precomp 6.14] on darwin
Type "help", "copyright", "credits" or "license" for more information.Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "/usr/lib/python2.2/site-packages/sets.py", line 79, in ?
class BaseSet(object):
File "/usr/lib/python2.2/site-packages/sets.py", line 109, in BaseSet
def _repr(self, sorted=False):
NameError: name 'False' is not defined
Here's a patch:

*** sets.py Sun Nov 9 09:48:47 2003
--- /usr/lib/python2.2/site-packages/sets.py Sun Nov 9 11:09:35 2003
***************
*** 74,79 ****
--- 74,83 ----
if not predicate(x):
yield x

+ if 'True' not in globals():
+ globals()['True'] = not None
+ globals()['False'] = not True
+
__all__ = ['BaseSet', 'Set', 'ImmutableSet']

class BaseSet(object):
 
D

David Eppstein

"Raymond Hettinger said:
For Py2.4, I'm working on a C implementation of Sets.py with the possibility
of
having a set() type as a builtin. There is a question as whether the current
design for set of sets has been useful.

To support sets-of-sets, the current design includes an automatic conversion
to
an immutable set class. The original use case was for implementing graph
algorithms.

Has anyone used sets-of-sets to solve any non-toy problems?
Has anyone successfully applied sets-of-sets to graph algorithms?
If the need has arisen, is the current design sufficient?

Now that I have sets working, here's a response to the original question.

I think, now that we have sets, the natural update to Guido's graph
representation essay <http://www.python.org/doc/essays/graphs.html>
is to use the following representation of a graph:
Represent G as a dictionary, where the keys are vertices and the values
are sets of the (outgoing) neighbors of each vertex.
This representation allows the following operations to be performed
efficiently:

- iterate through all vertices in G:
for v in G: suite

- iterate through all neighbors of a vertex v:
for w in G[v]: suite

- test whether a vertex belongs to G:
if v in G: suite

- test whether an edge belongs to G:
if w in G[v]: suite

If one uses this representation, then any time one has sets as vertices,
one will have sets of sets in the representation. No automatic coercion
from mutable to immutable seems to be necessary here, or in any graph
algorithm using this representation -- if you are going to use sets as
vertices, they have to already be immutable.

Here's a quick example involving sets of sets of sets:

from sets import ImmutableSet

def powerset(U):
"""Generates all subsets of a set or sequence U."""
U = iter(U)
try:
x = ImmutableSet([U.next()])
for S in powerset(U):
yield S
yield S | x
except StopIteration:
yield ImmutableSet()

def cube(n):
"""Graph of n-dimensional hypercube."""
singletons = [ImmutableSet([x]) for x in range(n)]
return dict([(x, ImmutableSet([x^s for s in singletons]))
for x in powerset(range(n))])

def linegraph(G):
"""Graph, the vertices of which are edges of G,
with two vertices being adjacent iff the corresponding
edges share a vertex."""
L = {}
for x in G:
for y in G[x]:
nx = [ImmutableSet([x,z]) for z in G[x] if z != y]
ny = [ImmutableSet([y,z]) for z in G[y] if z != x]
L[ImmutableSet([x,y])] = ImmutableSet(nx+ny)
return L

cuboctahedron = linegraph(cube(3))
 
D

David Eppstein

David Eppstein said:
def powerset(U):
"""Generates all subsets of a set or sequence U."""
U = iter(U)
try:
x = ImmutableSet([U.next()])
for S in powerset(U):
yield S
yield S | x
except StopIteration:
yield ImmutableSet()

By the way, I realized later that a little modification of this nethod
allows powerset to work with infinite sequences:

def powerset(U):
"""Generates all subsets of a set or sequence U.
U may be infinite, in which case the powerset is
also infinite, and the first 2^i items generated
will be the sets of the first i items in U.
"""
U = iter(U)
yield ImmutableSet()
x = ImmutableSet([U.next()])
yield x
it = powerset(U)
it.next() # skip empty set, already generated
for S in it:
yield S
yield S | x

I am a little concerned that "powerset" is not the right name, because
its output is a sequence not a set, but can't think of a better.
 
D

David Eppstein

David Eppstein said:
I am a little concerned that "powerset" is not the right name, because
its output is a sequence not a set, but can't think of a better.

New name: "finitesets" since it is not truly the powerset of an infinite
set -- it does not include the infinite subsets. Here's an example of
sets of sets in code I was using for creating a certain combinatorial
representation of families of weak orders: each dichotomy is a set, and
each state is a set of mutually comparable dichotomies. As before with
the graph examples, sets of sets are useful but the automatic
immutabilization protocol isn't.

One more thing that would be useful in the set package, although it's
easy enough to roll your own: a "singleton" function that converts a
single item to an (immutable) set. I don't find continually having to
write ImmutableSet([expression]) to be very readable; for instance, it
isn't obvious from that syntax that there's only one thing in the
expression.

After:
def singleton(x):
"""One-element set containing x."""
return ImmutableSet([x])

def finitesets(U):
"""Generates all finite subsets of a set or sequence U."""
U = iter(U)
yield ImmutableSet()
x = singleton(U.next())
yield x
it = finitesets(U)
it.next() # skip empty set, already generated
for S in it:
yield S
yield S | x

class weak(medium):
"""weak orders on an n-element set"""

def __init__(self, n):
self.initialState = ImmutableSet()
dichotomies = [S for S in finitesets(range(n)) if 0<len(S)<n]
self.tokens = [(self.add,o) for o in dichotomies] + \
[(self.remove,o) for o in dichotomies]

def transition(self, state, token):
op,dichotomy = token
return op(state,dichotomy)

def add(self,state,dichotomy):
for x in state:
if not x < dichotomy and not x > dichotomy:
return state
return state | singleton(dichotomy)

def remove(self,state,dichotomy):
return state - singleton(dichotomy)

Before:
class weakOrder(tuple):
"""Representation of weak order medium state.
We represent a weak order as a sorted sequence of integers,
representing masks for binary splits.
The nonzero bits of each integer must be a superset of the
bits in the previous integer. The last item in the sequence
is an integer with all bits set.
"""

def split(self,mask):
"""Attempt to add the mask to our sorted sequence."""
for i in range(len(self)):
if self >= mask:
break
if self & mask != self:
return self
if self == mask or self & mask != mask:
return self
return weakOrder(self[:i] + (mask,) + self[i:])

def join(self,mask):
"""Attempt to remove the mask from our sorted sequence."""
if mask not in self:
return self
i = list(self).index(mask)
return weakOrder(self[:i] + self[i+1:])

class weak(medium):
"""weak orders on an n-element set"""

def __init__(self, n):
everything = (1<<n) - 1
self.initialState = weakOrder((everything,))
positive = range(1,everything)
self.tokens = [-x for x in positive] + positive

def transition(self, state, token):
if token > 0:
return state.split(token)
else:
return state.join(-token)
 
F

Francis Avila

David Eppstein said:
I am a little concerned that "powerset" is not the right name, because
its output is a sequence not a set, but can't think of a better.

isubsets()? There seems to be an emerging convention that 'i'-prefix
functions return a generator/iterator. Dictionaries use an 'iter' prefix,
though.
 
W

William Trenker

Raymond said:
For Py2.4, I'm working on a C implementation of Sets.py

I was exploring Aaron Watters' kjbuckets module today (Phillip Eby's PEAK framework uses it).

Now that a C implementation of Sets.py is in the works, I wonder if this is the time to consider merging at least some of the directed-graph functionality of kjbuckets into Sets.c? The support in kjbuckets for graphs includes methods like: neighbors, reachable, and tclosure (transitive closure).

For those who haven't looked at it, the kjbuckets module provides objects for sets, directed-graphs, and algebraically-operable-dicts in an implementation that is "tightly coupled at the level of C, allowing fast and powerful algebraic combinations of container objects." It is documented quite nicely here, http://gadfly.sourceforge.net/kjbuckets.html

Just food for thought,
Bill
 
D

David Eppstein

For something concrete (that I thought about doing recently but haven't
actually done): conversion of NFA to DFA and then state minimization of
the resulting DFA. The converted DFA's states are sets of states of the
NFA, and the state minimization involves sets of DFA states.

That would be a perfect way to test drive the sets module.[/QUOTE]

Here is the first half of this. I didn't do the state minimization
part. (For the example shown here, the output is already minimal.)

"""automata.py
Convert nondeterministic to deterministic finite automata.

D. Eppstein, UC Irvine, November 2003.
"""

from __future__ import generators
from sets import ImmutableSet,Set

if 'True' not in globals():
globals()['True'] = not None
globals()['False'] = not True

class InputError(Exception): pass

class DFA:
def __init__(self,Sigma,delta,S0,F):
"""Create deterministic finite automaton with alphabet Sigma,
transition function delta(state,letter)->state, initial state
S0, and predicate F(state)->boolean. The full sets of states and
final states are determined implicitly from the delta and F
functions.
"""
self.alphabet = ImmutableSet(Sigma)
self.transition = delta
self.initialState = S0
self.isfinal = F

def states(self):
"""Generate all states of the DFA."""
explored = Set()
unexplored = Set([self.initialState])
while unexplored:
s = unexplored.pop()
explored.add(s)
yield s
for c in self.alphabet:
t = self.transition(s,c)
if t not in explored:
unexplored.add(t)

def __call__(self,input):
"""Test whether input is accepted by the DFA."""
state = self.initialState
for c in input:
if c not in self.alphabet:
raise InputError("Symbol " + repr(c) +
" not in input alphabet")
state = self.transition(state,c)
return self.isfinal(state)

class NFA:
def __init__(self,Sigma,delta,S0,F):
"""Create nondeterministic finite automaton (without
epsilon-transitions). Arguments are the same as for a
deterministic finite automaton, except that the initial state
and result of the transition function are both sets.
"""
self.alphabet = ImmutableSet(Sigma)
self.transition = delta
self.initialStates = ImmutableSet(S0)
self.isfinal = F

def setTransition(self,states,c):
"""States reachable from input set by input c."""
result = Set()
for s in states:
result |= self.transition(s,c)
return ImmutableSet(result)

def finalSet(self,states):
"""Test whether any of given set of states is final."""
for s in states:
if self.isfinal(s):
return True
return False

def makeDeterministic(self):
"""Convert NFA to DFA."""
return DFA(self.alphabet,self.setTransition,
self.initialStates,self.finalSet)

def __call__(self,input):
"""Test whether input is accepted by the NFA."""
return self.makeDeterministic()(input)

# Example from Sipser, _Introduction to the Theory of Computation_
# (Preliminary Edition), Example 1.13, pages 47-48
if __name__ == "__main__":

# state:(states for transition on 0, states for transition on 1)
Sipser_1_13 = {
1: ([1], [1,2]),
2: ([3], [3]),
3: ([4], [4]),
4: ([], []),
}
def delta(s,i): return ImmutableSet(Sipser_1_13[int(i)])
def final(s): return s == 4
N2 = NFA("01",delta,[1],final)

def test(input):
print input,(N2(input) and "is" or "is not"),"in L(N2)"
test("000100")
test("0011")

print
print "Conversion to DFA:"
print
D2 = N2.makeDeterministic()
for s in D2.states():
print s
for c in "01":
print " --[" + c + "]-->", D2.transition(s,c)
 
D

David Eppstein

I had a natural example of sets of sets of sets of sets today... I
wanted to explore the trees having n labeled items at their leaves
<http://www.research.att.com/projects/OEIS?Anum=A000311>.

Each edge of such a tree forms a bipartition of the set of leaves, so I
represented the tree as the set of its bipartitions. Each bipartition
is itself a set {S, U-S} where U is the set of all leaves and S is some
subset of U. So, each tree is represented as a set of sets of sets. I
ran some algorithms that involved sets of trees, so these algorithms
constructed sets of sets of sets of sets...

This was all relatively easy using sets.py (23 new lines added to the
code I was already using for exploring other combinatorial objects) and
would have been much more cumbersome if I had to represent sets
explicitly as bit vectors or dictionaries.

So, thanks for sets, and thanks for making it work under python 2.2!
 
R

Raymond Hettinger

[Raymond]
[Dave Eppstein]
Thanks! Doesn't quite work out of the box, though...

Python 2.2 (#1, 07/14/02, 23:25:09)
[GCC Apple cpp-precomp 6.14] on darwin
Type "help", "copyright", "credits" or "license" for more information.Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "/usr/lib/python2.2/site-packages/sets.py", line 79, in ?
class BaseSet(object):
File "/usr/lib/python2.2/site-packages/sets.py", line 109, in BaseSet
def _repr(self, sorted=False):
NameError: name 'False' is not defined

Okay, I've added code to make this run with older versions of Py2.2.

Still, it's worth downloading the bugfix release 2.2.3 which has
True/False in it and a number of important fixups.


Raymond Hettinger
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,774
Messages
2,569,599
Members
45,174
Latest member
BlissKetoACV
Top