Making a tree out of a 2 column list

S

Sebastian Bassi

I have a two column list like:

2,131
6,335
7,6
8,9
10,131
131,99
5,10

And I want to store it in a tree-like structure.
So if I request 131, it should return all the child of 131, like 2, 10
and 5 (since 5 is child of 10).
If I request 335, it should return: 6 and 7.
If I request 9, it should return 8.
I guess I could use tuples or dictionaries to do it, but I can't figure out how.

Best,
SB.
 
M

mensanator

I have a two column list like:

2,131
6,335
7,6
8,9
10,131
131,99
5,10

And I want to store it in a tree-like structure.
So if I request 131, it should return all the child of 131, like 2, 10
and 5 (since 5 is child of 10).
If I request 335, it should return: 6 and 7.
If I request 9, it should return 8.
I guess I could use tuples or dictionaries to do it, but I can't figure out how.

There are probably better ways.

def tree_path(key,tree,indent):
print '\t'*indent,key
if tree.has_key(key):
for m in tree[key]:
tree_path(m,tree,indent+1)
return

print 'original data'
print

source = [(2,131),(6,335),(7,6),(8,9),(10,131),(131,99),(5,10)]

tree = {}
for s in source:
if tree.has_key(s[1]):
tree[s[1]].append(s[0])
else:
tree[s[1]] = [s[0]]

for t in tree:
print '%3d ' % (t),tree[t]

print
print
tree_path(99,tree,0)

print
print 'extended data'
print

source_extend = [(666,2),(777,2),(888,2)]

for s in source_extend:
if tree.has_key(s[1]):
tree[s[1]].append(s[0])
else:
tree[s[1]] = [s[0]]

for t in tree:
print '%3d ' % (t),tree[t]

print
print
tree_path(99,tree,0)

## original data
##
## 99 [131]
## 6 [7]
## 9 [8]
## 10 [5]
## 335 [6]
## 131 [2, 10]
##
##
## 99
## 131
## 2
## 10
## 5
##
## extended data
##
## 2 [666, 777, 888]
## 99 [131]
## 6 [7]
## 9 [8]
## 10 [5]
## 335 [6]
## 131 [2, 10]
##
##
## 99
## 131
## 2
## 666
## 777
## 888
## 10
## 5
 
M

martinskou

Hope this helps

# list of pairs [child,parent]
list=[[2,131],[6,335],[7,6],[8,9],[10,131],[131,99],[5,10]]

# list with loop
#list=[[2,131],[6,335],[7,6],[8,9],[10,131],[131,99],[5,10],[3,10],
[131,3]]

# put the pairs in a dictionary, for easy retrieval
d={}
for c,p in list:
# must be able to hold multiple values for each key
if not(d.has_key(p)):
d[p]=[c]
else:
d[p].append(c)


# function to retrieve children using recursion. max_depth to ensure
termination
def retrieve_recursive(key,result=[],max_depth=10):
if d.has_key(key) and max_depth>0:
for i in d[key]:
result.append(i)
retrieve_recursive(i,result,max_depth-1)
return result

print retrieve_recursive(131)
 
B

bearophileHUGS

The solution I have found is quite similar to the (e-mail address removed)
one (it uses a defaultdict, and it yields the result lazily), but it
lacks the quite useful max_depth (it can be added):

from collections import defaultdict

def finder(el, stor):
if el in stor:
for v in stor[el]:
yield v
for v2 in finder(v, stor):
yield v2

data = [[131, 2], [335, 6], [6, 7], [9, 8], [131, 10], [99, 131], [10,
5]]

storage = defaultdict(list)
for k, v in data:
storage[k].append(v)

test_data = ([131, [2, 10, 5]],
[335, [6, 7]],
[9, [8]],
)

print storage
for k, vals in test_data:
assert set(finder(k, storage)) == set(vals)

Bye,
bearophile
 
S

Sebastian Bassi

def tree_path(key,tree,indent):
print '\t'*indent,key
if tree.has_key(key):
for m in tree[key]:
tree_path(m,tree,indent+1)
return

Thank you. It worked!.
I changed it a bit to return a list with the results:

def tree_path(key,tree,hijos):
hijos.append(key)
if tree.has_key(key):
for m in tree[key]:
tree_path(m,tree,hijos)
return hijos

Then I call it like this:

MyList=tree_path(9608,tree,[])

Best,
SB.
 
P

Peter Otten

Sebastian said:
def tree_path(key,tree,indent):
     print '\t'*indent,key
     if tree.has_key(key):
         for m in tree[key]:
             tree_path(m,tree,indent+1)
     return

Thank you. It worked!.
I changed it a bit to return a list with the results:

def tree_path(key,tree,hijos):
        hijos.append(key)
        if tree.has_key(key):
                for m in tree[key]:
                        tree_path(m,tree,hijos)
        return hijos

Then I call it like this:

MyList=tree_path(9608,tree,[])

Depending on your input data you may need to add some cycle detection.
For example, try it with

tree_path(1, {1:[2], 2:[1]}, [])

Peter
 
S

Sebastian Bassi

Depending on your input data you may need to add some cycle detection.
For example, try it with
tree_path(1, {1:[2], 2:[1]}, [])

I guess this should make the program enter into a endless loop. But
the data won't have such a redundancy, because it was taken from a
philogenetic tree.

Best,
SB.
 
B

bearophileHUGS

Sebastian Bassi:
I guess this should make the program enter into a endless loop. But
the data won't have such a redundancy, because it was taken from a
philogenetic tree.

But errors and bugs do happen, inside data too; so often it's better
to be on safe side if the safe code is fast enough.

Bye,
bearophile
 
P

Paul Rubin

But errors and bugs do happen, inside data too; so often it's better
to be on safe side if the safe code is fast enough.

If you think the data may be wrong it's probably better to use a
traditional graph algorithm to check it for cycles, than rely on some
arbitrary recursion depth limit.
 
S

Sebastian Bassi

But errors and bugs do happen, inside data too; so often it's better
to be on safe side if the safe code is fast enough.

Yes, I agree since I've seen lot of errors in data. But this data
comes from a taxonomy tree made by the NCBI, that is why I assume the
data is right.
 
J

Jorgen Grahn

I have a two column list like:

2,131
6,335
7,6
8,9
10,131
131,99
5,10

And I want to store it in a tree-like structure.

CS side note: your "columns" describe a graph;
http://en.wikipedia.org/wiki/Graph_(mathematics)
So if I request 131, it should return all the child of 131, like 2, 10
and 5 (since 5 is child of 10).

OK, so it is a directed graph.

You might want to think about whether it is also a DAG. If not, you
could be asked for the children of 1 in
2, 1
1, 2

As I interpret your specification, the correct answer is [1, 2] -- the
set of children for 1.

/Jorgen
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,438
Messages
2,571,699
Members
48,796
Latest member
Greg L.
Top