clever programming solution wanted

Discussion in 'Python' started by Scott Rifkin, Jul 15, 2004.

1. Scott RifkinGuest

If anyone has a good, quick solution to the following in python2.3
I'd appreciate it:

I have a dictionary with keys K1, K2, ..., K10000
Each entry looks like:

dict[K1]=[T1,T4,T500]

where the T's are indexed from, say, 1 to 20000 and each K has some number
of Ts (not always 3).

I want to create groups of Ks which share Ts such the groups are maximal,
for instance:

dict[K1]=[T1,T4,T500]
dict[K2]=[T1]
dict[K3]=[T500,T7000]

these would be grouped [K1,K2,K3].

At the end, most of the Ks will be by themselves, but some will be grouped
with some number of other Ks.

Thanks for any help,

Scott Rifkin

scott dot rifkin at yale dot edu

Scott Rifkin, Jul 15, 2004

2. Larry BatesGuest

Maybe someone else can decipher this. I just don't
know what "such that the groups are maximal" could
possibly mean.

Larry Bates

"Scott Rifkin" <> wrote in message
news...
> If anyone has a good, quick solution to the following in python2.3
> I'd appreciate it:
>
> I have a dictionary with keys K1, K2, ..., K10000
> Each entry looks like:
>
> dict[K1]=[T1,T4,T500]
>
> where the T's are indexed from, say, 1 to 20000 and each K has some number
> of Ts (not always 3).
>
> I want to create groups of Ks which share Ts such the groups are maximal,
> for instance:
>
> dict[K1]=[T1,T4,T500]
> dict[K2]=[T1]
> dict[K3]=[T500,T7000]
>
> these would be grouped [K1,K2,K3].
>
> At the end, most of the Ks will be by themselves, but some will be grouped
> with some number of other Ks.
>
>
> Thanks for any help,
>
> Scott Rifkin
>
> scott dot rifkin at yale dot edu
>
>

Larry Bates, Jul 15, 2004

3. Jeff EplerGuest

It sounds to me like you have a somewhat odd representation of an
undirected graph and you want to find connected components.

http://www.cs.sunysb.edu/~algorith/files/dfs-bfs.shtml
http://www.csee.usf.edu/~maurer/graphs/sld024.htm
etc

In an edge-list representation, I think the solution looks something
like (untested):
all_components = []
while input:
initial_element = input.pop()
visit = [initial_element]
component = Set([initial_element])
while visit:
v = visit.pop()
vs = neighbors[v] - component:
component |= vs
visit.extend(vs)
all_components.append(component)
input -= component
This is a O(V+E) algorithm unless I messed something up.

Converting from your d[Ki] = [Tj, Tk, ...] to edge-list representation
is probably O(V^2). Something like (untested):
neighbors = dict([(v, Set()) for v in d])
ds = dict([(k, Set(v)) for k, v in d.iteritems()])
for v1 in d:
for v2 in d:
if ds[v1] & ds[v2]:
neighbors[v1].add(v2)
neighbors[v2].add(v1)

Jeff

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)

iD8DBQFA9wsdJd01MZaTXX0RAlLiAJ0W0EAqyLnZMffTsenA0/SDkIpivwCff0FN
TLB2hvR3Xb+bvhUWTMKHrMw=
=WNp+
-----END PGP SIGNATURE-----

Jeff Epler, Jul 15, 2004
4. David EppsteinGuest

In article <>,
"Larry Bates" <> wrote:

> Maybe someone else can decipher this. I just don't
> know what "such that the groups are maximal" could
> possibly mean.
>
> Larry Bates
>
> "Scott Rifkin" <> wrote in message
> news...
> > If anyone has a good, quick solution to the following in python2.3
> > I'd appreciate it:
> >
> > I have a dictionary with keys K1, K2, ..., K10000
> > Each entry looks like:
> >
> > dict[K1]=[T1,T4,T500]
> >
> > where the T's are indexed from, say, 1 to 20000 and each K has some number
> > of Ts (not always 3).
> >
> > I want to create groups of Ks which share Ts such the groups are maximal,
> > for instance:
> >
> > dict[K1]=[T1,T4,T500]
> > dict[K2]=[T1]
> > dict[K3]=[T500,T7000]

The way I interpreted it: draw an undirected bipartite graph from the
K's to the T's, find connected components, and report the K's in each
connected component.

Something like:

from sets import Set

def keygroups(dictKT):
dictTK = {}
for k,tlist in dictKT.iteritems():
for t in tlist:
dictTK.setdefault(t,[]).append(k)
visitedK = Set()
visitedT = Set()
for k in dictKT:
if k not in visitedK:
component = Set([k])
unexplored = [k]
while unexplored:
k = unexplored.pop()
component.add(k)
for t in dictKT[k]:
if t not in visitedT:
visitedT.add(t)
for k in dictTK[t]:
if k not in visitedK:
visitedK.add(k)
unexplored.append(k)
yield list(component)

--
David Eppstein http://www.ics.uci.edu/~eppstein/
Univ. of California, Irvine, School of Information & Computer Science

David Eppstein, Jul 15, 2004

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.