Is Queue.Queue.queue.clear() thread-safe?

R

Russell Warren

I'm guessing no, since it skips down through any Lock semantics, but
I'm wondering what the best way to clear a Queue is then.

Esentially I want to do a "get all" and ignore what pops out, but I
don't want to loop through a .get until empty because that could
potentially end up racing another thread that is more-or-less blindly
filling it asynchronously.

Worst case I think I can just borrow the locking logic from Queue.get
and clear the deque inside this logic, but would prefer to not have to
write wrapper code that uses mechanisms inside the object that might
change in the future.

Also - I can of course come up with some surrounding architecture to
avoid this concern altogether, but a thread-safe Queue clear would do
the trick and be a nice and short path to solution.

If QueueInstance.queue.clear() isn't thread safe... what would be the
best way to do it? Also, if not, why is the queue deque not called
_queue to warn us away from it?

Any other comments appreciated!

Russ
 
T

Tim Peters

[Russell Warren]
I'm guessing no, since it skips down through any Lock semantics,

Good guess :) It's also unsafe because some internal conditions must
be notified whenever the queue becomes empty (else you risk deadlock).
but I'm wondering what the best way to clear a Queue is then.

Esentially I want to do a "get all" and ignore what pops out, but I
don't want to loop through a .get until empty because that could
potentially end up racing another thread that is more-or-less blindly
filling it asynchronously.

Worst case I think I can just borrow the locking logic from Queue.get
and clear the deque inside this logic, but would prefer to not have to
write wrapper code that uses mechanisms inside the object that might
change in the future.

There's simply no defined way to do this now.
Also - I can of course come up with some surrounding architecture to
avoid this concern altogether, but a thread-safe Queue clear would do
the trick and be a nice and short path to solution.

If QueueInstance.queue.clear() isn't thread safe... what would be the
best way to do it? Also, if not, why is the queue deque not called
_queue to warn us away from it?

"Consenting adults" -- if you want an operation that isn't supplied
out of the box, and are willing to take the risk of breaking into the
internals, Python isn't going to stop you from doing whatever you
like. "mutex" isn't named "_mutex" for the same reason. A

q.mutex.acquire()
try:
q.queue.clear()
q.unfinished_tasks = 0
q.not_full.notify()
q.all_tasks_done.notifyAll()
finally:
q.mutex.release()

sequence probably works (caveat emptor). BTW, it may be _possible_
you could use the newer task_done()/join() Queue gimmicks in some way
to get what you're after (I'm not really clear on that, exactly).

Anyway, yes, there is cross-release risk in breaking into the
internals like that. That's an "adult decision" for you to make. The
only way to get permanent relief is to re-think your approach, or
propose adding a new Queue.clear() method and Queue._clear() default
implementation. I don't know whether that would be accepted -- it
seems simple enough, although since you're the first person to ask for
it 15 years :), you won't get much traction arguing that's a critical
lack, and every new bit of cruft becomes an ongoing burden too.
 
F

Fredrik Lundh

Tim said:
Good guess :) It's also unsafe because some internal conditions must
be notified whenever the queue becomes empty (else you risk deadlock).

"also" ? if it weren't for the other things, the clear() call itself
would have been atomic enough, right ?

</F>
 
T

Tim Peters

[Russell Warren]
[Tim Peters]
[Fredrik Lundh]
"also" ? if it weren't for the other things, the clear() call itself
would have been atomic enough, right ?

Define "other things" :) For example, Queue.get() relies on that
when self._empty() is false, it can do self._get() successfully.
That's 100% reliable now because the self.not_empty condition is in
the acquired state across both operations. If some yahoo ignores the
underlying mutex and clears the queue after self._empty() returns true
but before pop() executes self._get(), the call to the latter will
raise an exception. That kind of failure is what I considered to be
an instance of a problem due to the OP's "skips down through any Lock
semantics"; the failure to notify internal conditions that the queue
is empty is a different _kind_ of problem, hence my "also" above.

If you're just asking whether deque.clear() is "atomic enough" on its
own, define "enough" :) It has to decref each item in the deque, and
that can end up executing arbitrary Python code (due to __del__
methods or weakref callbacks), and that in turn can allow the GIL to
be released allowing all other threads to run, and any of that
Python-level code may even mutate the deque _while_ it's being
cleared.

The C code in deque_clear() looks threadsafe in the sense that it
won't blow up regardless -- and that's about the strongest that can
ever be said for a clear() method. dict.clear() is actually a bit
stronger, acting as if a snapshot were taken of the dict's state at
the instant dict.clear() is called. Any mutations made to the dict
as a result of decref side effects while the original state is getting
cleared survive. In contrast, if a new item is added to a deque d
(via decref side effect) while d.clear() is executing, it won't
survive. That's "atomic enough" for me, but it is kinda fuzzy.
 
R

Russell Warren

Thanks guys. This has helped decipher a bit of the Queue mechanics for
me.

Regarding my initial clear method hopes... to be safe, I've
re-organized some things to make this a little easier for me. I will
still need to clear out junk from the Queue, but I've switched it so
that least I can stop the accumulation of new data in the Queue while
I'm clearing it. ie: I can just loop on .get until it is empty without
fear of a race, rather than needing a single atomic clear.

My next Queue fun is to maybe provide the ability to stuff things back
on the queue that were previously popped, although I'll probably try
and avoid this, too (maybe with a secondary "oops" buffer).

If curious why I want stuff like this, I've got a case where I'm
collecting data that is being asynchronously streamed in from a piece
of hardware. Queue is nice because I can just have a collector thread
running and stuffing the Queue while other processing happens on a
different thread. The incoming data *should* have start and stop
indications within the stream to define segments in the stream, but
stream/timing irregularities can sometimes either cause junk, or cause
you to want to rewind the extraction a bit (eg: in mid stream-assembly
you might realize that a stop condition was missed, but can deduce
where it should have been). Fun.

Russ
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,734
Messages
2,569,441
Members
44,832
Latest member
GlennSmall

Latest Threads

Top