Inter-Process Messaging

Daniel DeLorme

Francis said:
What no one has yet asked you is: what kind of data do you have to pass
between this fork-parent and child, and by what protocol?

Are they simple commands and responses (as in HTTP)? Is there a
state-machine (as in SMTP)? Or is the data-transfer full-duplex without a
protocol?

Keeping in mind that this project is mainly intended as a learning
experience, my specific idea is an HTTP server architecture with
generalist and specialist worker processes. The dispatcher would
partition requests to generalists (as in HTTP) who in turn might
dispatch to specialists (as in SMTP) for particular sub-tasks.

Simple example: 100 requests for a thumbnail come at the same time, are
split among N generalist workers, each of which asks the thumbnail
specialist process to generate the thumbnail. The specialist catches the
100 simultaneous requests, generates the thumbnail *once* and sends the
result back to the N generalists, who render it.
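
The coalescing behavior in this example can be sketched in a few lines of Ruby. This is only an in-process approximation: threads stand in for the generalist workers, the `Coalescer` class and the `"cat.jpg"` key are hypothetical names, and results are kept forever with no eviction.

```ruby
# Hypothetical sketch: many concurrent requests for one key trigger one computation.
# Threads stand in for the generalist workers; a real specialist would be a
# separate process reached over a pipe or socket.
class Coalescer
  Entry = Struct.new(:mutex, :done, :value)

  def initialize(&work)
    @work = work          # the expensive job, e.g. thumbnail generation
    @lock = Mutex.new     # guards the @entries hash itself
    @entries = {}         # key => Entry (never evicted in this sketch)
  end

  def fetch(key)
    entry = @lock.synchronize { @entries[key] ||= Entry.new(Mutex.new, false, nil) }
    # The first caller computes while holding the per-key mutex; later callers
    # block here until it finishes, then reuse the stored value.
    entry.mutex.synchronize do
      unless entry.done
        entry.value = @work.call(key)
        entry.done = true
      end
    end
    entry.value
  end
end

calls = 0
specialist = Coalescer.new { |name| calls += 1; sleep 0.05; "thumbnail of #{name}" }
results = Array.new(100) { Thread.new { specialist.fetch("cat.jpg") } }.map(&:value)
```

All 100 callers receive the same string while the block runs once. Moving the specialist into its own process would keep the same bookkeeping but replace the direct call with a message exchange.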

Are the data-flows extremely large? Exactly what are your performance
requirements?

Good questions, but ultimately my true purpose is to educate myself
about parallel processing; that's the only requirement I have. So while
I'd say data-flows are unlikely to be large in this case, I'd still like
to know how to handle large data-flows.

Hmmm, any good books to recommend?
 
Daniel DeLorme

Francis said:
Sounds like you've already read all the books you need to read. You have the
standard lingo down pat!

I'm afraid that must be a freak accident ;-)
But here goes: you picked the wrong project to demonstrate parallel
processing. Fast handling of network I/O is best done in an event-driven
way, and not in parallel. The parallelism that this problem exhibits arises
from the inherent nondeterminacy of having many independent clients
operating simultaneously. This pattern does expose capturable intramachine
latencies, but they're due to timing differentials, not to processing
inter-dependencies.

Could you elaborate on what you mean by "timing differentials" and
"processing inter-dependencies"? For regular webapps, time spent
querying the database most certainly exposes capturable intramachine
latencies. Event-driven sounds good, but doesn't that require that
*all* I/O be non-blocking? If you have blocking I/O in, say, a
third-party lib, you're toast.
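
To make the concern concrete, here is a minimal sketch of an event-driven read loop built on `IO.select` and `read_nonblock`. The helper name `drain_all` is hypothetical, and pipes stand in for client sockets.

```ruby
# Hypothetical sketch: one thread services several IO objects, handling
# whichever one is ready instead of blocking on any single one.
def drain_all(ios)
  buffers = Hash.new { |hash, io| hash[io] = +"" }
  pending = ios.dup

  until pending.empty?
    ready, = IO.select(pending)          # sleeps until at least one IO is readable
    ready.each do |io|
      begin
        buffers[io] << io.read_nonblock(4096)
      rescue IO::WaitReadable
        # Nothing to read after all; go back to select.
      rescue EOFError
        pending.delete(io)               # writer closed its end
      end
    end
  end

  buffers
end

r1, w1 = IO.pipe
r2, w2 = IO.pipe
w2.write("second")
w1.write("first")
w1.close
w2.close
received = drain_all([r1, r2])
```

Any call inside the `ready.each` body that blocks, such as a synchronous database query in a third-party library, stalls every other connection until it returns.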

It's intuitively attractive to structure a network server as a set of
parallel processes or threads, but it doesn't add anything in terms of
performance or scalability. As regards multicore architectures, they add
little to a network server because the size of the incoming network pipe
typically dominates processor bandwidth in such applications.

You mean to say it's the network that is usually the bottleneck, not the
CPU? Well, in my experience the database is usually the bottleneck, but
let's not forget that Ruby is particularly demanding on the CPU.

You may rejoin: "but how about an HTTP server that does a massive amount of
local processing to fulfill each request?" Now that's more interesting. Just
get rid of the HTTP part and concentrate on how to parallelize the
processing. That's a huge and well-studied problem in itself, and the net is
full of good resources on it.

While optimizing for CPU speed is fine, I'm also interested in process
isolation. If you have a monster lib that takes 1 minute to initialize
and requires 1 GB of resident memory but is used only occasionally, do
you really want to load it in all of your worker processes?
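
A rough sketch of that isolation, assuming a Unix-like system where `fork` is available: one specialist process would load the heavy library and answer requests over a pair of pipes. The name `spawn_specialist`, the one-line-per-message protocol, and the `upcase` stand-in for real work are all hypothetical.

```ruby
# Hypothetical sketch: one specialist process holds the heavy library, and a
# worker talks to it over pipes using a one-line-per-message protocol.
def spawn_specialist
  req_r, req_w = IO.pipe   # parent writes requests, child reads them
  res_r, res_w = IO.pipe   # child writes responses, parent reads them

  pid = fork do
    req_w.close
    res_r.close
    # The expensive `require` and initialization would happen here, once,
    # in this process only.
    while (line = req_r.gets)
      res_w.puts(yield(line.chomp))
      res_w.flush
    end
    exit!(0)               # skip at_exit handlers inherited from the parent
  end

  req_r.close
  res_w.close
  [pid, req_w, res_r]
end

pid, to_specialist, from_specialist = spawn_specialist { |text| text.upcase }
to_specialist.puts("resize logo.png")
to_specialist.flush
reply = from_specialist.gets.chomp
to_specialist.close        # EOF ends the specialist's read loop
Process.wait(pid)
```

This serves a single client. Sharing one specialist among several workers would need a socket per worker or request identifiers so that replies reach the right caller.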
 
M. Edward (Ed) Borasky

Francis said:
In general, the problem of architecting a high-performance web server that
includes external dependencies like databases, legacy applications, SOAP,
message-queueing systems, etc etc, is a very big problem with no simple
answer. It's also been intensely studied, so there are resources for you all
over the web.

In other words, Robert Heinlein's TANSTAAFL principle holds up for this
domain, like many others: "There Ain't No Such Thing As A Free Lunch!"

Ironically, I was invited to a seminar a couple of weeks ago about
concurrency titled, "The Free Lunch Is Over". What was ironic about it
was that I couldn't attend because I had a prior commitment -- a service
anniversary cruise with my employer at which I received a free lunch. :)

I made that remark in relation to efforts to make network servers run faster
by hosting them on multiprocessor or multicore machines. You'll usually find
that a single computer with one big network pipe attached to it won't be
able to process the I/O fast enough to keep all the cores busy. You might
then be tempted to host the DBMS on the same machine, but that's rarely a
good idea. Simpler is better.

Up to a point, yes, simpler is better. But the goal of system
performance engineering of this type is to have, as much as possible, a
balanced system -- network, disk and processor utilizations
approximately equal and none of them saturated. That's the "sweet spot"
where you get the highest throughput for the lowest cost.
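
One way to put numbers on "balanced" is the utilization law, U = X * D, where X is throughput in requests per second and D is the time each request spends on a resource. The sketch below uses made-up service demands purely for illustration; the helper names are hypothetical.

```ruby
# Hypothetical service demands: seconds of each resource consumed per request.
demands = { network: 0.002, disk: 0.004, cpu: 0.003 }

# Utilization law: U_i = X * D_i for throughput X (requests per second).
def utilizations(demands, throughput)
  demands.transform_values { |d| throughput * d }
end

# The resource with the largest demand saturates first and caps throughput
# at 1 / D_max.
def bottleneck(demands)
  name, demand = demands.max_by { |_, d| d }
  { resource: name, max_throughput: 1.0 / demand }
end

load_at_200 = utilizations(demands, 200)
limit = bottleneck(demands)
```

With these numbers the disk reaches saturation while the network is only half used, so the system is not balanced. Equalizing the demands would let all three resources approach their limits together.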

If your workload is well-behaved, you can sometimes get here "on the
average over a workday". But web server workloads are anything but
well-behaved, even in the absence of deliberate denial of service
attacks. :)
 
M. Edward (Ed) Borasky

Francis said:
At the risk of starting a threadjack, I think Ruby is today not as
well-served as some other development products in the way of native
message-passing systems. There is Assaf Arkin's very nice reliable-msg
library, which defines a good API for messaging, and has some support for
persistence. And of course there are several libraries which support Stomp,
making it easier to work with products like Java's AMQ.

I'd like to see a full-featured, high-performance MQ system for Ruby,
however. By "for Ruby" I don't necessarily mean "in Ruby," but rather
tightly integrated and easy/intuitive to use. Doing such a thing was the
original motivation for creating EventMachine, by the way.

"If you build it, they will come." :)
 
