A matter of queues, tasks and multiprocessing


E

Emanuele D'Arrigo

Greetings everybody,

I've tried to come up with this message for a couple of weeks now and
it doesn't look like I'm getting any clearer in my thoughts so I
decided that it's probably best to take the plunge and ask you guys to
kindly throw me a rope...

What I'm trying to come up with is some kind of simple, dynamically
scalable, domain-agnostic, multi-processor capable, tasking system.
Maybe I'm already giving myself too many constraints here, i.e. it
might all be possible except... the "simple" part.

Clearly a good (the?) starting point is to think in terms of tasks,
queues and worker threads. Once synchronization issues are properly
attended this methodology feels, at least in theory, neat and
intuitive. The problem is of course to put in practice.

For example in my context I can think of tasks with very heterogeneous
computing requirements, i.e. quick tasks such as UI interactions, slow
tasks such as I/O with large files, and pretty much everything in-
between. In this context, and given that I cannot know a priory a
task's execution time, how can I create a system that is responsive on
things like UI, takes its time on things such as I/O but gives the
application the opportunity to further customize things? I.e. a user
might want to limit the number of processors/core used to a subset of
those available. Alternatively an application might want to create
more threads than the optimal number simply because interleaving the
executions of a few tasks is preferable to simply queuing them. And
all this might have to happen at runtime.

Currently on one end of the spectrum I'm considering a purely
functional approach, with one queue for UI-related tasks, one for I/O-
related tasks and one for everything else. On a single processor/core
machine they'd simply be serviced by one thread each, all sharing one
processor. On a two processors/cores machine UI and I/O queues might
be serviced by two separate threads on the same processor while
everything else is serviced by a worker thread on the other processor.
As the number of cores/processors increases more threads/processor
would be attached to each queue, but how to establish what should get
more resources first?

Alternatively I've read about queues with threads that "steal" from
other queues. But that seems to assumes more or less homogeneous task
execution or applications where responsiveness might drop to zero,
i.e. because slow I/O tasks are being attended by all available
threads/processors. In these cases, should the UI-related thread/queue
never ever steal from the other queues, to maintain responsiveness at
all times, effectively creating an hybrid approach with one purely
functional thread/queue and (potentially) many other self-balancing
threads/queues?

Thoughts, hints, tips, tricks, directions, reading material of any
kind will all be appreciated! =)

Sincerely,

Manu
 
Ad

Advertisements

D

danmcleran

If you are using Python 2.6 or greater, look into the multiprocessing
module. It may contain 90% of what you need.
 
E

Emanuele D'Arrigo

If you are using Python 2.6 or greater, look into the multiprocessing
module. It may contain 90% of what you need.

Thank you Dan, indeed the multi-processing module has been my first
port of call and indeed it has all the basic ingredients. What I'm
looking for is an higher-level architecture built on the multi-
processing and the multi-threading modules.

Thanks again!

Manu
 
J

John Nagle

Greetings everybody,

I've tried to come up with this message for a couple of weeks now and
it doesn't look like I'm getting any clearer in my thoughts so I
decided that it's probably best to take the plunge and ask you guys to
kindly throw me a rope...

What I'm trying to come up with is some kind of simple, dynamically
scalable, domain-agnostic, multi-processor capable, tasking system.
Maybe I'm already giving myself too many constraints here, i.e. it
might all be possible except... the "simple" part.

No, the problem is vagueness. We can't tell from your description
what you want to do. A real-time desktop application? Warehouse-scale
computing? Your own "cloud"?

Incidentally, bear in mind that once you scale beyond tens of
machines, error detection and recovery starts to dominate
the problem.

John Nagle
 
Ad

Advertisements


Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top