Queue.Queue-like class without the busy-wait


pyguy2

Cool Code!

One possible sticking point is that I think select only works on
network sockets on Windows. This would make the code not cross-platform.

john
 

pyguy2

Thinking about cross-platform issues, I found this, from the venerable
Tim Peters, to be enlightening regarding Python's choice of design:

"It's possible to build a better Queue implementation that runs only on
POSIX systems, or only on Windows systems, or only on one of a dozen
other less-popular target platforms. The current implementation works
fine on all of them, although is suboptimal compared to what could be
done in platform-specific Queue implementations. "

Here is a link:
http://groups-beta.google.com/group...1&q=queue+timeout+python#doc_011f680b2dac320c

The whole thread (oops a pun) is worth a read.

john
 

Antoon Pardon

On 2005-03-31, john said:
Cool Code!

One possible sticking point is that I think select only works on
network sockets on Windows. This would make the code not cross-platform.

As far as I understand, what I did with pipes can be done just as
well with network sockets. If someone wants to rewrite the code
to do so and make it more cross-platform, I'll happily
incorporate it. I'm just not familiar enough with sockets to do
it myself.
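
Not code from this thread, but a rough sketch of what a socket-based
version of that wakeup mechanism might look like. It builds a connected
pair of loopback TCP sockets by hand (so it does not depend on
socketpair(), which is not available everywhere) and blocks in select()
on the reading end, which Windows does accept for sockets. The names
SelectableQueue and _socket_pair are made up for the illustration, and
it only handles a single consumer:

import select
import socket
import threading

def _socket_pair():
    # Connect two loopback sockets through a temporary listener; this
    # works on both Unix and Windows.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(('127.0.0.1', 0))
    listener.listen(1)
    writer = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    writer.connect(listener.getsockname())
    reader, _ = listener.accept()
    listener.close()
    return reader, writer

class SelectableQueue:
    """Minimal single-consumer queue: put() sends one byte over the
    socket pair so a get() blocked in select() wakes up without polling."""

    def __init__(self):
        self._reader, self._writer = _socket_pair()
        self._lock = threading.Lock()
        self._items = []

    def put(self, item):
        with self._lock:
            self._items.append(item)
        self._writer.send(b'x')     # one wakeup byte per queued item

    def get(self, timeout=None):
        # select() works on sockets everywhere, so this wait does not poll.
        ready, _, _ = select.select([self._reader], [], [], timeout)
        if not ready:
            raise RuntimeError('get() timed out')
        self._reader.recv(1)        # consume the wakeup byte
        with self._lock:
            return self._items.pop(0)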
 

David Bolen

Paul L. Du Bois said:
Has anyone written a Queue.Queue replacement that avoids busy-waiting?
It doesn't matter if it uses os-specific APIs (eg
WaitForMultipleObjects). I did some googling around and haven't found
anything so far.

This isn't a Queue.Queue replacement, but it implements a buffer
intended for inter-thread transmission, so it could be adjusted to
mimic Queue semantics fairly easily. In fact, internally it actually
keeps write chunks in a list until read for better performance, so
just removing the coalesce process would be the first step.

It was written specifically to minimize latency (which is a
significant issue with the polling loop in the normal Python Queue
implementation) and CPU usage in support of a higher level
Win32-specific serial I/O class, so it uses Win32 events to handle the
signaling for the key events when waiting.

The fundamental issue with the native Python lock is that, to stay
minimalistic in what it requires from each OS, it doesn't impose a
model of waiting on an event signal. A timed blocking wait on some
signalable construct is the key thing you need to make these
operations efficient, and that's what I use the Win32 Event for.

-- David

- - - - - - - - - - - - - - - - - - - - - - - - -

import thread
import win32event as we

class Buffer:
    """A thread safe unidirectional data buffer used to represent data
    traveling to or from the application and serial port handling threads.

    This class is used as an underlying implementation mechanism by SerialIO.
    Application code should not typically need to access this directly, but
    can handle I/O through SerialIO.

    Note that we use Windows event objects rather than Python's because
    Python's OS-independent versions are not very efficient with timed waits,
    imposing internal latencies and CPU usage due to looping around a basic
    non-blocking construct.  We also use the lower layer thread lock rather
    than threading's to minimize overhead.
    """

    def __init__(self, notify=None):
        self.lock = thread.allocate_lock()
        self.has_data = we.CreateEvent(None, 1, 0, None)
        self.clear()
        self.notify = notify

    def _coalesce(self):
        if self.buflist:
            self.buffer += ''.join(self.buflist)
            self.buflist = []

    def __len__(self):
        self.lock.acquire()
        self._coalesce()
        result = len(self.buffer)
        self.lock.release()
        return result

    def clear(self):
        self.lock.acquire()
        self.buffer = ''
        self.buflist = []
        self.lock.release()

    def get(self, size=0, timeout=None):
        """Retrieve data from the buffer, up to 'size' bytes (unlimited if
        0), but potentially less based on what is available.  If no
        data is currently available, it will wait up to 'timeout' seconds
        (forever if None, no blocking if 0) for some data to arrive"""

        self.lock.acquire()
        self._coalesce()

        if not self.buffer:
            # Nothing buffered, wait until something shows up (timeout
            # rules match that of threading.Event)
            self.lock.release()
            if timeout is None:
                win_timeout = we.INFINITE
            else:
                win_timeout = int(timeout * 1000)
            rc = we.WaitForSingleObject(self.has_data, win_timeout)
            self.lock.acquire()
            self._coalesce()

        if not size:
            size = len(self.buffer)

        result_len = min(size, len(self.buffer))
        result = self.buffer[:result_len]
        self.buffer = self.buffer[result_len:]
        we.ResetEvent(self.has_data)
        self.lock.release()
        return result

    def put_back(self, data):
        self.lock.acquire()
        self.buffer = data + self.buffer
        self.lock.release()
        we.SetEvent(self.has_data)
        if self.notify:
            self.notify()

    def put(self, data):
        self.lock.acquire()
        self.buflist.append(data)
        self.lock.release()
        we.SetEvent(self.has_data)
        if self.notify:
            self.notify()
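
A hypothetical usage sketch (not part of David's post), assuming pywin32
is installed and the Buffer class above is in scope; one thread feeds
data in while the main thread blocks in get() on the Win32 event:

import thread
import time

buf = Buffer()

def producer():
    for chunk in ('hello ', 'world'):
        time.sleep(0.5)
        buf.put(chunk)          # SetEvent() wakes the blocked get()

thread.start_new_thread(producer, ())

received = ''
while len(received) < len('hello world'):
    # Blocks on the event rather than polling; an empty string means
    # the two second timeout expired with nothing buffered.
    received += buf.get(timeout=2.0)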
 

Nick Craig-Wood

Paul Rubin said:
That would be terrible, if semaphores are as heavy as file descriptors.
I'd like to hope the OS's are better designed than that.

I believe futex is the thing you want for a modern linux. Not
very portable though.

From futex(4)

The Linux kernel provides futexes ('Fast Userspace muTexes') as a
building block for fast userspace locking and semaphores. Futexes are
very basic and lend themselves well for building higher level locking
abstractions such as POSIX mutexes.

This page does not set out to document all design decisions but
restricts itself to issues relevant for application and library
development. Most programmers will in fact not be using futexes
directly but instead rely on system libraries built on them, such as
the NPTL pthreads implementation.

A futex is identified by a piece of memory which can be shared between
different processes. In these different processes, it need not have
identical addresses. In its bare form, a futex has semaphore semantics;
it is a counter that can be incremented and decremented atomically;
processes can wait for the value to become positive.

Futex operation is entirely userspace for the non-contended case. The
kernel is only involved to arbitrate the contended case. As any sane
design will strive for non-contension, futexes are also optimised for
this situation.

In its bare form, a futex is an aligned integer which is only touched
by atomic assembler instructions. Processes can share this integer over
mmap, via shared segments or because they share memory space, in which
case the application is commonly called multithreaded.
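
Purely to illustrate those FUTEX_WAIT/FUTEX_WAKE semantics (nobody in
this thread posted this), here is a rough Linux-only sketch that calls
the futex syscall through ctypes. The syscall number 202 is specific to
x86-64, and a real lock would touch the shared word with atomic
instructions rather than plain assignments:

import ctypes
import threading
import time

libc = ctypes.CDLL(None, use_errno=True)

SYS_futex = 202      # x86-64 Linux; other architectures use other numbers
FUTEX_WAIT = 0
FUTEX_WAKE = 1

# The aligned 32-bit word that identifies the futex.
word = ctypes.c_uint32(0)

def futex_wait(expected):
    # Sleep in the kernel only while the word still holds 'expected';
    # if it has already changed, the call returns immediately.
    libc.syscall(SYS_futex, ctypes.byref(word), FUTEX_WAIT,
                 expected, None, None, 0)

def futex_wake(count=1):
    # Wake up to 'count' threads waiting on the word.
    libc.syscall(SYS_futex, ctypes.byref(word), FUTEX_WAKE,
                 count, None, None, 0)

def waker():
    time.sleep(0.5)
    word.value = 1               # a real implementation uses an atomic op
    futex_wake()

t = threading.Thread(target=waker)
t.start()
futex_wait(0)                    # blocks in the kernel until the wake arrives
t.join()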
 

Paul Rubin

Nick Craig-Wood said:
I believe futex is the thing you want for a modern linux. Not
very portable though.

That's really cool, but I don't see how it can be a pure userspace
operation if the futex has a timeout. The kernel must need to keep
track of the timeouts. However, since futexes can be woken by any
thread, the whole thing can be done with just one futex. In fact the
doc mentions something about using a file descriptor to support
asynchronous wakeups, but it's confusing whether that applies here.
 

Nick Craig-Wood

Paul Rubin said:
That's really cool, but I don't see how it can be a pure userspace
operation if the futex has a timeout. The kernel must need to keep
track of the timeouts. However, since futexes can be woken by any
thread, the whole thing can be done with just one futex. In fact the
doc mentions something about using a file descriptor to support
asynchronous wakeups, but it's confusing whether that applies here.

No it isn't pure user space, only for the non-contended case which for
most locks is the most frequent operation.

Futex operation is entirely userspace for the non-contended
case. The kernel is only involved to arbitrate the contended
case. As any sane design will strive for non-contension,
futexes are also optimised for this situation.
 

Nick Craig-Wood

pyguy2 said:
Thinking about cross-platform issues, I found this, from the venerable
Tim Peters, to be enlightening regarding Python's choice of design:

"It's possible to build a better Queue implementation that runs only on
POSIX systems, or only on Windows systems, or only on one of a dozen
other less-popular target platforms. The current implementation works
fine on all of them, although is suboptimal compared to what could be
done in platform-specific Queue implementations. "

Here is a link:
http://groups-beta.google.com/group...1&q=queue+timeout+python#doc_011f680b2dac320c

Interesting thread.

How about leaving the current threading alone, but adding a pthreads
module for those OSes which can use or emulate POSIX threads? That
would cover Windows and most Unixes.
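
For what it's worth, the shape of such a design is just a
condition-variable wait around the queue data. Here is a hedged sketch
using threading.Condition as a stand-in for pthread_cond_timedwait (in
newer Pythons the underlying timed wait really does block in the OS
rather than polling); the class name CondQueue is made up for the
illustration:

import collections
import threading

class CondQueue:
    """Hypothetical queue whose get() blocks on a condition variable,
    the way a pthread_cond_timedwait-based implementation would."""

    def __init__(self):
        self._items = collections.deque()
        self._cond = threading.Condition()

    def put(self, item):
        with self._cond:
            self._items.append(item)
            self._cond.notify()      # wake one waiting consumer

    def get(self, timeout=None):
        with self._cond:
            # wait_for() handles spurious wakeups and the timeout for us.
            if not self._cond.wait_for(lambda: self._items, timeout):
                raise RuntimeError('get() timed out')
            return self._items.popleft()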
 
