which data structure to use?

Robert Voigtländer · Jan 21, 2014

Hi,

which would be the best data structure to use for the following case?

I have objects like this:

class Node(object):
def __init__(self, pos, parent, g , h):
self.pos = pos
self.parent = parent
self.g = g
self.h = h
self.f = g+h

I need to build a "list" of these objects. The amount is unknown.
On this list I need to regularly

1. check if a specific item - identified by Node.pos - is in the list.
2. find the object with the lowest Node.f attribute and update or remove it

What would be a good approach. Using a list I always need to traverse the whole list to do one of the above actions.

Thanks
Robert

Chris Angelico · Jan 21, 2014

1. check if a specific item - identified by Node.pos - is in the list.
2. find the object with the lowest Node.f attribute and update or remove it

Are both those values constant once the Node is added? If so, the
easiest way would be to maintain a dictionary mapping pos to the Node
(or to a list of Nodes, if you can have multiple with the same pos),
and probably heapq for the f values. But if they change, you'll have
to update both data structures. If they change often, it's probably
not worth maintaining index structures - just search for what you
want, when you want it. And if you're working with a small list (say,
less than a thousand items), performance isn't going to matter at all,
so just do whatever looks cleanest in your code - probably you have
that already.

ChrisA

Robert Voigtländer · Jan 21, 2014

On Tue, Jan 21, 2014 at 03:17:43AM -0800, Robert Voigtlï¿½nder wrote:

First thanks to all who responded. Although I only partially understand your answers as I am a Python starter.
What's clear to me is the difference between object and instance of an object. Just didn't explain it well.

Maybe I give some more info on what I need / want to do. I will also provide a working code example. Should have done this before.

I would very much appreciate a working example of what you mean. Then I have a chance to understand it.

I would like to implement a class for a A* pathfinding algorithm. (there are ready libraries out there but I would like to learn it myself) This requires to maintain a list of nodes to be processed and nodes already processed.. For new nodes I need to check if they are on one of the lists. I also need to regularly pick the node with the lowest value f from the list.

Here some working code. For one function I sill need a solution. Any bettersolution is welcome.

Thanks
Robert

---------------
class Node:
def __init__(self, pos, parent, g , h):
self.pos = pos
self.parent = parent
self.g = g
self.h = h
self.f = g+h

def isinlist(nodeToSeatch):
for item in openlist:
if item.pos == nodeToSeatch: return True
return False

def lowestF():
lowestF = ''
for item in openlist:
if item.f < lowestF: lowestF = item
return lowestF

def deleteItemWithPos(pos):
## need this function or different approach
pass

openlist=[]
openlist.append(Node((1,1),None,1,5))
openlist.append(Node((1,2),(1,1),4,6))
openlist.append(Node((1,3),(1,2),9,10))

for item in openlist: print item.pos, item.f

print isinlist((1,1))
print isinlist((1,5))

nextNode = lowestF()
print nextNode.pos, nextNode.f

Robert Voigtländer · Jan 21, 2014

Am Dienstag, 21. Januar 2014 14:38:34 UTC+1 schrieb Robert Voigtländer:

First thanks to all who responded. Although I only partially understand your answers as I am a Python starter.

What's clear to me is the difference between object and instance of an object. Just didn't explain it well.

Maybe I give some more info on what I need / want to do. I will also provide a working code example. Should have done this before.

I would very much appreciate a working example of what you mean. Then I have a chance to understand it.

I would like to implement a class for a A* pathfinding algorithm. (there are ready libraries out there but I would like to learn it myself) This requires to maintain a list of nodes to be processed and nodes already processed. For new nodes I need to check if they are on one of the lists. I also need to regularly pick the node with the lowest value f from the list.

Here some working code. For one function I sill need a solution. Any better solution is welcome.

Thanks

Robert

Sorry - found a bug right after my post.
Here the corrected version.

class Node:
def __init__(self, pos, parent, g , h):
self.pos = pos
self.parent = parent
self.g = g
self.h = h
self.f = g+h

def isinlist(nodeToSeatch):
for item in openlist:
if item.pos == nodeToSeatch: return True
return False

def lowestF():
lowestF = ''
for item in openlist:
if item.f < lowestF:
lowestF = item.f
lowestFItem = item
return lowestFItem

def deleteItemWithPos(pos):
## need this function or different approach
pass

openlist=[]
openlist.append(Node((1,1),None,1,5))
openlist.append(Node((1,2),(1,1),4,6))
openlist.append(Node((1,3),(1,2),9,10))

for item in openlist: print item.pos, item.f

print isinlist((1,1))
print isinlist((1,5))

nextNode = lowestF()
print nextNode.pos, nextNode.f

Oscar Benjamin · Jan 21, 2014

First thanks to all who responded. Although I only partially understand your answers as I am a Python starter.
What's clear to me is the difference between object and instance of an object. Just didn't explain it well.

Maybe I give some more info on what I need / want to do. I will also provide a working code example. Should have done this before.

I would very much appreciate a working example of what you mean. Then I have a chance to understand it.

I would like to implement a class for a A* pathfinding algorithm. (there are ready libraries out there but I would like to learn it myself) This requires to maintain a list of nodes to be processed and nodes already processed. For new nodes I need to check if they are on one of the lists. I also need to regularly pick the node with the lowest value f from the list.

Here some working code. For one function I sill need a solution. Any better solution is welcome.

Thanks
Robert

---------------
class Node:
def __init__(self, pos, parent, g , h):
self.pos = pos
self.parent = parent
self.g = g
self.h = h
self.f = g+h

def isinlist(nodeToSeatch):
for item in openlist:
if item.pos == nodeToSeatch: return True
return False

def lowestF():
lowestF = ''
for item in openlist:
if item.f < lowestF: lowestF = item
return lowestF

The function above is incorrect. I think it should be:

def lowestF():
lowestF = ''
for item in openlist:
if item.f < lowestF:
lowestF = item.f # Note the .f here
return lowestF

def deleteItemWithPos(pos):
## need this function or different approach
pass

def deleteItemWithPos(pos):
for n, item in enumerate(openlist):
if item.pos == pos:
del openlist[n]
return
else:
raise RuntimeError('Not in the list!')

openlist=[]
openlist.append(Node((1,1),None,1,5))
openlist.append(Node((1,2),(1,1),4,6))
openlist.append(Node((1,3),(1,2),9,10))

for item in openlist: print item.pos, item.f

print isinlist((1,1))
print isinlist((1,5))

nextNode = lowestF()
print nextNode.pos, nextNode.f

My first suggestion would be to use a dict instead of a list:

class Node:
def __init__(self, pos, parent, g , h):
self.pos = pos
self.parent = parent
self.g = g
self.h = h
self.f = g+h

def isinlist(pos):
return pos in opendict

def lowestF():
return min(opendict.values(), key=lambda x: x.f)

def deleteItemWithPos(pos):
del opendict[pos]

nodes = [
Node((1,1),None,1,5),
Node((1,2),(1,1),4,6),
Node((1,3),(1,2),9,10),
]

opendict = {}
for node in nodes:
opendict[node.pos] = node

for item in opendict.values():
print item.pos, item.f

print isinlist((1,1))
print isinlist((1,5))

nextNode = lowestF()
print nextNode.pos, nextNode.f

The above is more efficient and simpler. It is still O(N) for the lowestF()
function. Changing data structure could make that more efficient.

Oscar

Peter Otten · Jan 21, 2014

Robert said:
First thanks to all who responded. Although I only partially understand
your answers as I am a Python starter. What's clear to me is the
difference between object and instance of an object. Just didn't explain
it well.

Maybe I give some more info on what I need / want to do. I will also
provide a working code example. Should have done this before.

I would very much appreciate a working example of what you mean. Then I
have a chance to understand it.

I would like to implement a class for a A* pathfinding algorithm. (there
are ready libraries out there but I would like to learn it myself) This
requires to maintain a list of nodes to be processed and nodes already
processed. For new nodes I need to check if they are on one of the lists.
I also need to regularly pick the node with the lowest value f from the
list.

Here some working code. For one function I sill need a solution. Any
better solution is welcome.

Thanks
Robert

---------------
class Node:
def __init__(self, pos, parent, g , h):
self.pos = pos
self.parent = parent
self.g = g
self.h = h
self.f = g+h

def isinlist(nodeToSeatch):
for item in openlist:
if item.pos == nodeToSeatch: return True
return False

def lowestF():
lowestF = ''
for item in openlist:
if item.f < lowestF: lowestF = item
return lowestF

def lowestF():
return min(openlist, key=operator.attrgetter("f"))

def deleteItemWithPos(pos):
## need this function or different approach
pass

openlist=[]
openlist.append(Node((1,1),None,1,5))
openlist.append(Node((1,2),(1,1),4,6))
openlist.append(Node((1,3),(1,2),9,10))

for item in openlist: print item.pos, item.f

print isinlist((1,1))
print isinlist((1,5))

nextNode = lowestF()
print nextNode.pos, nextNode.f

Here is an OO implementation of Chris Angelico's suggestion:

import heapq

class Node:
def __init__(self, pos, parent, g , h):
self.pos = pos
self.parent = parent
self.g = g
self.h = h
self.f = g+h
def __str__(self):
return "Node(pos={!r}, f={!r})".format(self.pos, self.f)

class Nodes():
def __init__(self):
self.lookup = {}
self.heap = []
def add(self, node):
self.lookup[node.pos] = node
heapq.heappush(self.heap, (node.f, node))
def __iter__(self):
return iter(self.lookup.values())
def __contains__(self, pos):
return pos in self.lookup
def lowest(self):
return self.heap[0][1]
def pop(self):
f, node = heapq.heappop()
del lookup[node.pos]
return node

nodes = Nodes()
nodes.add(Node((1,1), None, 1, 5))
nodes.add(Node((1,2), (1,1), 4, 6))
nodes.add(Node((1,3), (1,2), 9, 10))

for node in nodes:
print(node)

print((1,1) in nodes)
print((1,5) in nodes)

print(nodes.lowest())

Peter Otten · Jan 21, 2014

Peter said:
def pop(self):
f, node = heapq.heappop()
del lookup[node.pos]
return node

That should be

def pop(self):
f, node = heapq.heappop(self.heap)
del self.lookup[node.pos]
return node

Robert Voigtländer · Jan 21, 2014

Am Dienstag, 21. Januar 2014 15:19:54 UTC+1 schrieb Peter Otten:

Peter Otten wrote:

def pop(self):

Click to expand...

f, node = heapq.heappop()

Click to expand...

del lookup[node.pos]

Click to expand...

return node

Click to expand...

That should be

def pop(self):

f, node = heapq.heappop(self.heap)

del self.lookup[node.pos]

return node

Hi Peter,

this works great. I will try to find some info about the functionality you used and to understnad what you did.
Maybe you can add a function to remove a node?

Thanks for your support and all tthe swift answers.

Robert

Robert Voigtländer · Jan 21, 2014

def pop(self):
f, node = heapq.heappop()
del lookup[node.pos]
return node

Click to expand...

That should be

Click to expand...

def pop(self):

Click to expand...

f, node = heapq.heappop(self.heap)
del self.lookup[node.pos]
return node

Click to expand...

Hi Peter,
this works great. I will try to find some info about the functionality you used and to understnad what you did.
Maybe you can add a function to remove a node?
Thanks for your support and all tthe swift answers.

Robert

Just realized what the pop function is for. I changed it from:
def pop(self):
to
def pop(self,pos):

and it works.

Now I go on with trying to understand it.

Mark Lawrence · Jan 21, 2014

On 21/01/2014 13:43, Robert Voigtländer wrote:

[double spaced google disease snipped]

I'm pleased to see the regular contributors helping out as usual. In
response would you please be kind enough to read and action this
https://wiki.python.org/moin/GoogleGroupsPython to prevent us seeing the
double line spacing that google inserts, thanks.

Peter Otten · Jan 21, 2014

Robert said:
def pop(self):
f, node = heapq.heappop()
del lookup[node.pos]
return node
That should be

Click to expand...

def pop(self):

Click to expand...

f, node = heapq.heappop(self.heap)
del self.lookup[node.pos]
return node

Click to expand...

Hi Peter,
this works great. I will try to find some info about the functionality
you used and to understnad what you did. Maybe you can add a function to
remove a node? Thanks for your support and all tthe swift answers.

Robert

Click to expand...

Just realized what the pop function is for. I changed it from:
def pop(self):
to
def pop(self,pos):

and it works.

Unlikely. Are you sure that .heap and .lookup contents are still in sync
with your modification?

Now I go on with trying to understand it.

The pop() method as posted can only remove the "lowest" node. If contrary to
your initial spec

2. find the object with the lowest Node.f attribute and update or remove

it

you want to remove arbitrary nodes a heap may not be the appropriate data
structure. The modified

import operator

#...

class Nodes():
def __init__(self):
self.lookup = {}
def add(self, node):
self.lookup[node.pos] = node
def __iter__(self):
return iter(self.lookup.itervalues())
def __contains__(self, pos):
return pos in self.lookup
def lowest(self):
return min(self, key=operator.attrgetter("f"))
def __delitem__(self, pos):
del self.lookup[pos]

nodes = Nodes()

nodes.add(Node((1,1), None, 1, 5))
nodes.add(Node((1,2), (1,1), 4, 6))
nodes.add(Node((1,3), (1,2), 9, 10))

del nodes[1, 2]

allows you to delete random nodes, but the lowest() method will slow down as
it has to iterate over all dict values.

One important caveat: both Nodes classes lead to data corruption if you
modify a .pos attribute while the node is in a Nodes instance. The same goes
for the first implementation and the .f attribute.

Robert Voigtländer · Jan 22, 2014

Unlikely. Are you sure that .heap and .lookup contents are still in sync

with your modification?

No it's not. Atfer having read about heapq it's clear why.
Thanks for the hint.

allows you to delete random nodes, but the lowest() method will slow down as
it has to iterate over all dict values.

Thanks a lot for the alternative class. I will go with two lists, each using one of the aproaches. In one list I always need to have only hte object with lowest .f.
With the other list I may have to delete random nodes.

So I can make perfect use of both classes.

One important caveat: both Nodes classes lead to data corruption if you
modify a .pos attribute while the node is in a Nodes instance. The same goes
for the first implementation and the .f attribute.

That's fine. pos is static and identifies the node.

Thanks again to all who responded. I learned a lot.

use class in class	4	Jan 21, 2014
data structure	1	Sep 3, 2006
Collect Excel Data from Website	5	Apr 30, 2022
Reading a file into a data structure....	8	Oct 13, 2011
which data structure to use, and when	4	Jan 9, 2007
Database vs Data Structure?	7	Apr 18, 2008
[WxPython] Advice on Temperature Application Method & Structure	1	May 26, 2010
Data structure recommendation?	8	Apr 7, 2008

which data structure to use?

Robert Voigtländer

Chris Angelico

Robert Voigtländer

Robert Voigtländer

Oscar Benjamin

Peter Otten

Peter Otten

Robert Voigtländer

Robert Voigtländer

Mark Lawrence

Peter Otten

Robert Voigtländer

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads