A matter of queues, tasks and multiprocessing

Emanuele D'Arrigo · Nov 10, 2010

Greetings everybody,

I've tried to come up with this message for a couple of weeks now and
it doesn't look like I'm getting any clearer in my thoughts so I
decided that it's probably best to take the plunge and ask you guys to
kindly throw me a rope...

What I'm trying to come up with is some kind of simple, dynamically
scalable, domain-agnostic, multi-processor capable, tasking system.
Maybe I'm already giving myself too many constraints here, i.e. it
might all be possible except... the "simple" part.

Clearly a good (the?) starting point is to think in terms of tasks,
queues and worker threads. Once synchronization issues are properly
attended to, this methodology feels, at least in theory, neat and
intuitive. The problem, of course, is putting it into practice.
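
For readers who want something concrete, the tasks/queue/worker-threads idea can be sketched roughly as below. This is only a minimal illustration in modern Python 3 using the standard queue and threading modules; the names worker and run_tasks are made up for the example and are not from the original post.

```python
import queue
import threading

def worker(tasks, results):
    """Pull callables off the queue until a None sentinel arrives."""
    while True:
        task = tasks.get()
        if task is None:              # sentinel: shut this worker down
            return
        try:
            results.put(task())
        except Exception as exc:      # keep the worker alive on task failure
            results.put(exc)

def run_tasks(callables, num_workers=2):
    """Run the callables on num_workers threads and collect their results."""
    tasks, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    for c in callables:
        tasks.put(c)
    for _ in threads:
        tasks.put(None)               # one sentinel per worker
    for t in threads:
        t.join()
    collected = []
    while not results.empty():        # safe: all workers have finished
        collected.append(results.get())
    return collected
```

Results come back in completion order, not submission order, so callers that care about ordering would need to tag each task.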

For example in my context I can think of tasks with very heterogeneous
computing requirements, i.e. quick tasks such as UI interactions, slow
tasks such as I/O with large files, and pretty much everything in
between. In this context, and given that I cannot know a priori a
task's execution time, how can I create a system that is responsive on
things like UI, takes its time on things such as I/O but gives the
application the opportunity to further customize things? I.e. a user
might want to limit the number of processors/cores used to a subset of
those available. Alternatively an application might want to create
more threads than the optimal number simply because interleaving the
executions of a few tasks is preferable to simply queuing them. And
all this might have to happen at runtime.

Currently on one end of the spectrum I'm considering a purely
functional approach, with one queue for UI-related tasks, one for I/O-
related tasks and one for everything else. On a single processor/core
machine they'd simply be serviced by one thread each, all sharing one
processor. On a two processors/cores machine UI and I/O queues might
be serviced by two separate threads on the same processor while
everything else is serviced by a worker thread on the other processor.
As the number of cores/processors increases more threads/processors
would be attached to each queue, but how to establish what should get
more resources first?
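
A rough sketch of this "one queue per kind of task" layout, with a worker count that can be changed at runtime, might look like the following. The class name QueueGroup and its methods are hypothetical, invented for this illustration, and the sketch deliberately leaves open the policy question of which queue should get more threads first.

```python
import queue
import threading

class QueueGroup:
    """One task queue plus a resizable set of worker threads."""

    def __init__(self, name, num_workers=1):
        self.name = name
        self.tasks = queue.Queue()
        self.threads = []
        self.size = 0
        self.resize(num_workers)

    def _run(self):
        while True:
            task = self.tasks.get()
            if task is None:          # sentinel: exactly one worker exits
                return
            try:
                task()
            except Exception:
                pass                  # a real system would log or report this

    def resize(self, num_workers):
        """Grow by starting threads, shrink by queuing sentinels."""
        delta = num_workers - self.size
        for _ in range(delta):        # no-op when delta <= 0
            t = threading.Thread(target=self._run, daemon=True)
            t.start()
            self.threads.append(t)
        for _ in range(-delta):       # no-op when delta >= 0
            self.tasks.put(None)
        self.size = num_workers

    def submit(self, task):
        self.tasks.put(task)

    def shutdown(self):
        """Let queued tasks finish, then stop and join every worker."""
        self.resize(0)
        for t in self.threads:
            t.join()
```

An application would create one group each for "ui", "io" and "other", and call resize() on whichever group it decides deserves more threads. Shrinking only takes effect after the tasks already queued ahead of the sentinels have run.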

Alternatively I've read about queues with threads that "steal" from
other queues. But that seems to assume more or less homogeneous task
execution or applications where responsiveness might drop to zero,
i.e. because slow I/O tasks are being attended by all available
threads/processors. In these cases, should the UI-related thread/queue
never ever steal from the other queues, to maintain responsiveness at
all times, effectively creating a hybrid approach with one purely
functional thread/queue and (potentially) many other self-balancing
threads/queues?
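
One possible shape for such a hybrid is sketched below, purely as an illustration: a dedicated UI thread that only ever serves its own queue, next to a set of workers that steal from each other's deques when idle. All names here (StealingWorker, ui_loop, demo) are hypothetical, and the idle polling with time.sleep is a simplification; a real system would block on a condition variable instead.

```python
import collections
import queue
import threading
import time

class StealingWorker(threading.Thread):
    """Serves its own deque first, then steals from its peers."""

    def __init__(self, peers, stop_event):
        super().__init__(daemon=True)
        self.tasks = collections.deque()
        self.peers = peers            # shared list of all stealing workers
        self.stop_event = stop_event
        self.completed = 0

    def _next_task(self):
        try:
            return self.tasks.pop()               # own work: newest first
        except IndexError:
            pass
        for peer in self.peers:
            if peer is not self:
                try:
                    return peer.tasks.popleft()   # steal the oldest task
                except IndexError:
                    continue
        return None

    def run(self):
        while True:
            task = self._next_task()
            if task is not None:
                task()
                self.completed += 1
            elif self.stop_event.is_set():
                return                # nothing left anywhere and told to stop
            else:
                time.sleep(0.001)     # idle polling, for simplicity only

def ui_loop(ui_tasks):
    """Dedicated UI worker: serves only its own queue, never steals."""
    while True:
        task = ui_tasks.get()
        if task is None:
            return
        task()

def demo(num_stealers=3, num_jobs=20):
    stop_event = threading.Event()
    peers = []
    for _ in range(num_stealers):
        peers.append(StealingWorker(peers, stop_event))
    results, ui_events = [], []
    for i in range(num_jobs):         # deliberately unbalanced: all on worker 0
        peers[0].tasks.append(lambda i=i: results.append(i))
    ui_tasks = queue.Queue()
    ui_tasks.put(lambda: ui_events.append("click"))
    ui_tasks.put(None)
    ui_thread = threading.Thread(target=ui_loop, args=(ui_tasks,))
    for w in peers:
        w.start()
    ui_thread.start()
    stop_event.set()                  # no more submissions after this point
    for w in peers:
        w.join()
    ui_thread.join()
    return sorted(results), ui_events, [w.completed for w in peers]
```

Because the UI queue is not in the peers list, slow background work can never occupy the UI thread, and the UI thread never picks up background work either. The cost is that the UI thread sits idle even when the other workers are overloaded.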

Thoughts, hints, tips, tricks, directions, reading material of any
kind will all be appreciated! =)

Sincerely,

Manu

danmcleran · Nov 10, 2010

If you are using Python 2.6 or greater, look into the multiprocessing
module. It may contain 90% of what you need.
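
As a small illustration of the suggestion above, here is a sketch that caps a multiprocessing.Pool at a user-chosen number of processes. It is written for modern Python 3 rather than 2.6, and the helper names worker_count and square are made up for the example.

```python
import multiprocessing

def worker_count(requested=None, available=None):
    """Clamp a user-requested process count to the cores available."""
    if available is None:
        available = multiprocessing.cpu_count()
    if requested is None:
        return available
    return max(1, min(requested, available))

def square(n):
    """Stand-in for a CPU-bound task; module-level so it can be pickled."""
    return n * n

if __name__ == "__main__":
    # The guard matters: child processes may re-import this module.
    with multiprocessing.Pool(processes=worker_count(requested=2)) as pool:
        print(pool.map(square, range(5)))   # prints [0, 1, 4, 9, 16]
```

Note that a Pool's size is fixed once it is created, so changing the number of processes at runtime would mean creating a new pool.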

Emanuele D'Arrigo · Nov 11, 2010

Thank you Dan, indeed the multiprocessing module has been my first
port of call and indeed it has all the basic ingredients. What I'm
looking for is a higher-level architecture built on the
multiprocessing and threading modules.

Thanks again!

Manu

Tim Golden · Nov 11, 2010

Try Celery

http://ask.github.com/celery/getting-started/introduction.html

TJG

Emanuele D'Arrigo · Nov 12, 2010

Thank you Tim, looks very interesting!

Manu

John Nagle · Nov 12, 2010

> What I'm trying to come up with is some kind of simple, dynamically
> scalable, domain-agnostic, multi-processor capable, tasking system.
> Maybe I'm already giving myself too many constraints here, i.e. it
> might all be possible except... the "simple" part.

No, the problem is vagueness. We can't tell from your description
what you want to do. A real-time desktop application? Warehouse-scale
computing? Your own "cloud"?

Incidentally, bear in mind that once you scale beyond tens of
machines, error detection and recovery start to dominate
the problem.

John Nagle

John Nagle · Nov 12, 2010

Here's how Google does it:

http://www.stanford.edu/class/ee380/Abstracts/101110-slides.pdf

John Nagle
