Not fully understanding the role of Queue.task_done()

Martin DeMello

I'm writing a cluster monitor that collects information from a set of
machines and logs it to a database.

In the interests of not hammering the db unnecessarily, I'm
considering the following:
1. A series of independent "monitor" threads that collect information
over TCP from the cluster of machines and write it to a queue
2. A "logger" thread that empties the queue every second or so and
inserts the collected information into the db via a single insert
statement

Reading up on Python's built-in Queue class, though, it seems oriented
towards "job queues", with a two-step dequeue operation (get() and
task_done()). I'm worried that this would make it too heavyweight for
my application. Is there documentation somewhere on what exactly
task_done() does, and whether I can disable the tracking of a job once
it's removed from the queue? The Python docs for the Queue module were
a bit light.

martin
 
Fredrik Lundh

Martin said:
Reading up on Python's built-in Queue class, though, it seems oriented
towards "job queues", with a two-step dequeue operation (get() and
task_done()). I'm worried that this would make it too heavyweight for
my application. Is there documentation somewhere on what exactly
task_done() does, and whether I can disable the tracking of a job once
it's removed from the queue? The Python docs for the Queue module were
a bit light.

"task_done" just decrements a counter (incremented by "put"). when the
counter reaches zero, the "join" call is unblocked.
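
For illustration, a minimal sketch of that counter behaviour. It assumes Python 3, where the module is spelled "queue" rather than "Queue"; the worker function and the item values are made up for the example. The second half shows that a consumer which never calls join() can skip task_done() entirely.

```python
import queue      # spelled "Queue" in Python 2, as used in this thread
import threading

q = queue.Queue()
for item in ("a", "b", "c"):
    q.put(item)                 # each put() adds one to the unfinished-task count

def worker():
    # Hypothetical consumer: takes items forever and marks each one done.
    while True:
        item = q.get()
        # ... process item here ...
        q.task_done()           # subtracts one from the unfinished-task count

t = threading.Thread(target=worker)
t.daemon = True                 # lets the program exit while worker blocks in get()
t.start()

q.join()                        # returns once every put() has a matching task_done()

# If nothing ever calls join(), task_done() can simply be left out:
q2 = queue.Queue()
q2.put("x")
value = q2.get()                # no task_done() afterwards, and nothing blocks
```

So the "tracking" costs one counter update per item, and it only matters to code that waits in join().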

</F>
 
Martin DeMello

"task_done" just decrements a counter (incremented by "put").  when the
counter reaches zero, the "join" call is unblocked.

Thanks! Is there any standard python idiom to empty a queue into a
list? Or do I just call get() repeatedly and catch the exception when
it's done?

martin
 
castironpi

Martin said:
Thanks! Is there any standard python idiom to empty a queue into a
list? Or do I just call get() repeatedly and catch the exception when
it's done?

Random access isn't supported by the defined interface. You can make
it more convenient, though.

import Queue

class IterQueue( Queue.Queue ):
    def __iter__( self ):
        return self
    def next( self ):
        if self.empty():
            raise StopIteration
        return self.get()

q= IterQueue()
q.put( 'a' )
q.put( 'b' )
q.put( 'c' )

print [ x for x in q ]

/Output:
['a', 'b', 'c']
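
A possible alternative, sketched here under the assumption of Python 3 (module "queue"; in Python 2 it is "Queue"): calling get_nowait() and catching the Empty exception, which is the approach Martin describes. Checking empty() and then calling get() can block if another consumer takes the last item between the two calls, whereas get_nowait() never blocks. The helper name drain is made up for the example.

```python
import queue  # spelled "Queue" in Python 2

def drain(q):
    """Remove and return everything currently in q, without blocking."""
    items = []
    while True:
        try:
            items.append(q.get_nowait())
        except queue.Empty:
            # Nothing left right now; hand back what was collected.
            return items

q = queue.Queue()
for item in ("a", "b", "c"):
    q.put(item)

items = drain(q)
print(items)  # ['a', 'b', 'c']
```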
 
Martin DeMello

castironpi said:
Random access isn't supported by the defined interface.  You can make
it more convenient, though.

Thanks. I wasn't looking for random access, just wondering what the
cleanest way to implement items = Queue.get_all() was. Your code
should work nicely for that.

martin
 
Fredrik Lundh

Martin said:
I'm writing a cluster monitor, that collects information from a set of
machines and logs it to a database

In the interests of not hammering the db unnecessarily, I'm
considering the following
1. A series of independent "monitor" threads that collect information
over TCP from the cluster of machines, and write it to a queue
2. A "logger" thread that empties the queue every second or so and
inserts the collected information to the db via a single insert
statement

why are you using a queue for this case, btw? why not just use a plain list

     L = []
     lock = threading.Lock()

and add stuff using append in the monitor threads

     with lock:
         L.append(item)

and regularly reset the list in the logger thread

     with lock:
         data = L[:]
         L[:] = [] # clear the list
     for item in data:
         ... insert into database ...

(list append and assignments to global variables are atomic in CPython,
so you can eliminate the lock by being a bit clever, but that's probably
better left for a non-premature optimization pass).
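
A self-contained sketch of the list-plus-lock approach, for illustration only. The names pending, monitor and take_batch, the host names and the sample counts are all assumptions made up for the example; a real monitor would read from a TCP socket instead of generating numbers.

```python
import threading

pending = []                 # shared buffer (the list "L" in the post)
lock = threading.Lock()

def monitor(name, count):
    # Stand-in for a monitor thread; real code would read from a TCP socket.
    for i in range(count):
        with lock:
            pending.append((name, i))

def take_batch():
    # Copy and clear the shared list while holding the lock, so the
    # slow database work can happen afterwards without blocking monitors.
    with lock:
        batch = pending[:]
        pending[:] = []
    return batch

threads = [threading.Thread(target=monitor, args=("host%d" % n, 100))
           for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

batch = take_batch()
print(len(batch))            # 400
```

The lock is held only for the copy and the clear, so the monitor threads are never blocked for the duration of the database insert.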

</F>
 
Martin DeMello

Martin said:
I'm writing a cluster monitor, that collects information from a set of
machines and logs it to a database
In the interests of not hammering the db unnecessarily, I'm
considering the following
1. A series of independent "monitor" threads that collect information
over TCP from the cluster of machines, and write it to a queue
2. A "logger" thread that empties the queue every second or so and
inserts the collected information to the db via a single insert
statement

why are you using a queue for this case, btw?  why not just use a plain list

     L = []
     lock = threading.Lock()

Good point - I thought of a queue because it was self-locking, but
you're right, I can just as well use a simple list and lock it myself.

martin
 
Aahz

Martin said:
In the interests of not hammering the db unnecessarily, I'm
considering the following
1. A series of independent "monitor" threads that collect information
over TCP from the cluster of machines, and write it to a queue
2. A "logger" thread that empties the queue every second or so and
inserts the collected information to the db via a single insert
statement

why are you using a queue for this case, btw? why not just use a plain list

     L = []
     lock = threading.Lock()

and add stuff using append in the monitor threads

     with lock:
         L.append(item)

Because using a queue requires less thinking. I certainly would use a
queue in this case instead of rolling my own.
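
For comparison, a rough sketch of what the queue-based version of the monitor/logger design might look like. Everything specific here is an assumption for the example: Python 3 (module "queue"), sqlite3 as a stand-in database, the "readings" table, the host names and the 0.05 second interval. The rows are batched with executemany inside one transaction, which is one way of approximating the "single insert" in the original question. task_done() is never called, since nothing waits in join().

```python
import queue      # spelled "Queue" in Python 2
import sqlite3
import threading

q = queue.Queue()
stop = threading.Event()

def monitor(host, samples):
    # Stand-in for a monitor thread; real code would poll the host over TCP.
    for value in samples:
        q.put((host, value))

def drain():
    # Collect everything currently queued without blocking.
    rows = []
    while True:
        try:
            rows.append(q.get_nowait())
        except queue.Empty:
            return rows

def logger(conn, interval):
    # Wake up every `interval` seconds and write whatever has accumulated.
    while True:
        stopping = stop.wait(interval)    # True once stop has been set
        rows = drain()
        if rows:
            with conn:                    # one transaction per batch
                conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
        if stopping:
            return

# check_same_thread=False because the connection is created here but
# used from the logger thread; the two never touch it at the same time.
conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute("CREATE TABLE readings (host TEXT, value REAL)")

log_thread = threading.Thread(target=logger, args=(conn, 0.05))
log_thread.start()

monitors = [threading.Thread(target=monitor, args=("host%d" % n, [1.0, 2.0, 3.0]))
            for n in range(3)]
for t in monitors:
    t.start()
for t in monitors:
    t.join()

stop.set()            # ask the logger to do a final flush and exit
log_thread.join()

count = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
print(count)          # 9
```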
 
Fredrik Lundh

Aahz said:
why are you using a queue for this case, btw? why not just use a plain list

     L = []
     lock = threading.Lock()

and add stuff using append in the monitor threads

     with lock:
         L.append(item)

Because using a queue requires less thinking.

given that the whole reason for this thread was that the Queue API didn't
fit the OP's problem, that's a rather dubious statement.

(btw, I've always thought that Python was all about making it easy to
express the solution to a given problem in code, not to let you write
programs without using your brain. when did that change?)

</F>
 
alex23

Fredrik Lundh said:
(btw, I've always thought that Python was all about making it easy to
express the solution to a given problem in code, not to let you write
programs without using your brain.  when did that change?)

The day Google App Engine was opened up to developers, I believe.
 
