state of blocking/nonblocking I/O

Joshua Haberman · Oct 2, 2005

Here is my understanding about the current state of I/O in Ruby.
Please correct me where I am mistaken.

- By default, Ruby I/O operations block, but only block the calling
Ruby thread. Ruby does this by scheduling a thread out if the fd is
not read-ready/write-ready. If there is more than one Ruby thread,
Ruby won't do a read(2) or write(2) on an fd unless select() says it
is ready, to prevent blocking the entire process.

- the one flaw with this scheme is that write(2) can block even if an
fd is write-ready, if you try to write too much data. This will
cause such a write to lock the entire process and all Ruby threads
therein ([0] is a simple test program that displays the problem).

- You can try setting O_NONBLOCK on your IO objects with fcntl. That
will help you in the case where you only have one Ruby thread -- now
read and write will raise Errno::EAGAIN if the fd isn't ready. But
in the case where there is more than one Ruby thread, this won't work
because Ruby won't perform the read(2) or write(2) until the fd is
ready. So even though you have O_NONBLOCK set, you block your Ruby
thread. (See [1] for an example.)

Is this right? What is the current state of supporting nonblocking I/O
in Ruby?

One other question: are the buffered fread()/fwrite() functions
guaranteed to work correctly if O_NONBLOCK is set on the underlying
descriptor? I have not been able to find a good answer to this.

Josh

Example [0]:

thread = Thread.new {
  while true
    puts "Background thread running..."
    sleep 1
  end
}

# Give the background thread a few chances to show that it's running
sleep 2

read_pipe, write_pipe = IO.pipe

# this will stall the entire process, including the background thread.
# change the length to 4096 and everything is fine.
write_pipe.write(" " * 4097)

thread.join

Example [1]:

require 'fcntl'

thread = Thread.new {
  while true
    puts "Background thread running..."
    sleep 1
  end
}

read_pipe, write_pipe = IO.pipe
read_pipe.fcntl(Fcntl::F_SETFL,
                read_pipe.fcntl(Fcntl::F_GETFL) | Fcntl::O_NONBLOCK)

# this will block our thread, even though the fd is set to nonblocking.
# however, if you eliminate the background thread, this call will
# give you EAGAIN, which is what you want.
read_pipe.read

# we will never get here
puts "Finished read!"

Tanaka Akira · Oct 3, 2005

Joshua Haberman said:
- By default, Ruby I/O operations block, but only block the calling
Ruby thread. Ruby does this by scheduling a thread out if the fd is
not read-ready/write-ready. If there is more than one Ruby thread,
Ruby won't do a read(2) or write(2) on an fd unless select() says it
is ready, to prevent blocking the entire process.
Right.

- the one flaw with this scheme is that write(2) can block even if an
fd is write-ready, if you try to write too much data. This will
cause such a write to lock the entire process and all Ruby threads
therein ([0] is a simple test program that displays the problem).
Right.

- You can try setting O_NONBLOCK on your IO objects with fcntl. That
will help you in the case where you only have one Ruby thread -- now
read and write will raise Errno::EAGAIN if the fd isn't ready.

No.

IO#write doesn't raise Errno::EAGAIN; it retries until all data is written.

IO#read also retries, since Ruby 1.9.

So IO#write and IO#read may block the calling thread.

But
in the case where there is more than one Ruby thread, this won't work
because Ruby won't perform the read(2) or write(2) until the fd is
ready. So even though you have O_NONBLOCK set, you block your Ruby
thread. (See [1] for an example.)
Right.

One other question: are the buffered fread()/fwrite() functions
guaranteed to work correctly if O_NONBLOCK is set on the underlying
descriptor? I have not been able to find a good answer to this.

fwrite(3) may lose data.

So Ruby 1.8 may lose data.

% ruby-1.8.3 -v
ruby 1.8.3 (2005-09-21) [i686-linux]
% ruby-1.8.3 -rfcntl -e '
w = STDOUT
w.fcntl(Fcntl::F_SETFL, w.fcntl(Fcntl::F_GETFL) | Fcntl::O_NONBLOCK)
w << "a" * 4096
w.flush
w << "b"
w.flush
' | ruby -e 'sleep 1; p STDIN.read.length'
4096

However, no data is lost if IO#sync = true, since Ruby 1.8.2.
That's because stdio is bypassed.

% ruby-1.8.3 -rfcntl -e '
w = STDOUT
w.sync = true
w.fcntl(Fcntl::F_SETFL, w.fcntl(Fcntl::F_GETFL) | Fcntl::O_NONBLOCK)
w << "a" * 4096
w.flush
w << "b"
w.flush
' | ruby -e 'sleep 1; p STDIN.read.length'
4097

Ruby 1.9 doesn't have the problem because it has its own
buffering mechanism.

# this will block our thread, even though the fd is set to nonblocking.
# however, if you eliminate the background thread, this call will
# give you EAGAIN, which is what you want.
read_pipe.read

If you want to test whether some data is available, use IO.select.
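
A minimal sketch of that check, assuming a pipe as a stand-in for any descriptor and an arbitrary 0.1 second timeout (both are illustrative choices, not from the thread). IO.select returns nil on timeout, otherwise arrays of the IO objects that are ready.

```ruby
r, w = IO.pipe

# Nothing has been written yet, so select times out and returns nil.
before = IO.select([r], nil, nil, 0.1)

w.write("hello")

# Now the read end is reported as readable.
after = IO.select([r], nil, nil, 0.1)
readable = after ? after[0] : []

puts "ready before write: #{!before.nil?}"
puts "ready after write:  #{readable.include?(r)}"
```

Note that this only says that some data is available, not how much.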

Joshua Haberman · Oct 3, 2005

Tanaka,

Thanks for your helpful answers!

No.

IO#write doesn't raise Errno::EAGAIN; it retries until all data is
written.

IO#read also retries, since Ruby 1.9.

So IO#write and IO#read may block the calling thread.

Hrm, so I guess that if I want to do real nonblocking I/O in Ruby, I
have to write IO#nonblock_read and IO#nonblock_write, that do not
have this retry behavior?

One other question: are the buffered fread()/fwrite() functions
guaranteed to work correctly if O_NONBLOCK is set on the underlying
descriptor? I have not been able to find a good answer to this.

fwrite(3) may lose data.

So Ruby 1.8 may lose data.

% ruby-1.8.3 -v
ruby 1.8.3 (2005-09-21) [i686-linux]
% ruby-1.8.3 -rfcntl -e '
w = STDOUT
w.fcntl(Fcntl::F_SETFL, w.fcntl(Fcntl::F_GETFL) | Fcntl::O_NONBLOCK)
w << "a" * 4096
w.flush
w << "b"
w.flush
' | ruby -e 'sleep 1; p STDIN.read.length'
4096

Ooh, that's bad. What's the explanation for that?

If you want to test whether some data is available, use IO.select.

Yes, but IO.select can't tell me how *much* data I can read or
write. IO#read and IO#write can still block if I try to read or
write too much data, which is what I want to avoid.

Thanks,
Josh

Tanaka Akira · Oct 3, 2005

Joshua Haberman said:
Hrm, so I guess that if I want to do real nonblocking I/O in Ruby, I
have to write IO#nonblock_read and IO#nonblock_write, that do not
have this retry behavior?

IO#sysread and IO#syswrite are possible candidates.
However, they may block when multithreaded because of select.
Also, they cannot be combined with the buffering methods.

Nonblocking methods such as IO#nonblock_read and
IO#nonblock_write are a good idea. If matz accepts them, I'll
definitely implement them. However, I'm not sure that matz
thinks the method names are good enough.
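
For readers on a later Ruby: methods of this kind exist there under the names IO#read_nonblock and IO#write_nonblock. A minimal sketch of their behavior, assuming a Ruby recent enough to accept the exception: false keyword (the pipe and the buffer sizes are illustrative):

```ruby
r, w = IO.pipe

# With no data in the pipe, read_nonblock returns at once instead of waiting.
empty = r.read_nonblock(16, exception: false)

# write_nonblock writes what fits right now and returns the byte count.
written = w.write_nonblock("ping")

# Data is now available, so the same call returns it.
data = r.read_nonblock(16, exception: false)

puts "empty read: #{empty.inspect}"
puts "written:    #{written}"
puts "data:       #{data.inspect}"
```

Without exception: false, the "would block" case is reported as an exception instead of a symbol.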

Ooh, that's bad. What's the explanation for that?

R. Stevens says

using standard I/O with nonblocking descriptors,
a recipe for disaster

UNIX Network Programming Vol1, p.399

For more information, read the source of fflush in stdio.
Version 7, 4.4BSD and glibc have the problem as far as I
know. I feel it's portable behavior.

Yes, but IO.select can't tell me how *much* data I can read or
write. IO#read and IO#write can still block if I try to read or
write too much data, which is what I want to avoid.

IO#readpartial is available since Ruby 1.8.3.
It doesn't block if some data is available.

For writing, I think IO#syswrite is required.
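
A small sketch of both calls on a pipe. The helper name syswrite_all is hypothetical; it only illustrates that IO#syswrite returns the number of bytes accepted, so a caller has to loop over short writes. On a blocking descriptor this loop can still wait.

```ruby
r, w = IO.pipe
w.syswrite("abc")

# Asks for up to 4096 bytes, but returns as soon as any data is available.
chunk = r.readpartial(4096)

# Hypothetical helper: keep calling syswrite until every byte is accepted.
def syswrite_all(io, data)
  total = 0
  while total < data.bytesize
    total += io.syswrite(data.byteslice(total..-1))
  end
  total
end

sent = syswrite_all(w, "hello world")
rest = r.readpartial(4096)

puts "chunk: #{chunk.inspect}, sent: #{sent}, rest: #{rest.inspect}"
```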

Ara.T.Howard · Oct 3, 2005

IO#sysread and IO#syswrite are possible candidates.
However, they may block when multithreaded because of select.
Also, they cannot be combined with the buffering methods.

Nonblocking methods such as IO#nonblock_read and
IO#nonblock_write are a good idea. If matz accepts them, I'll
definitely implement them. However, I'm not sure that matz
thinks the method names are good enough.

thanks so much for doing this work!

suggestions:

IO#nb_read
IO#nb_write

or objectify:

nbio = NBIO::new an_io

nbio.read 42 #=> will not block
nbio.write 42 #=> will not block

etc.

this would be a great addition - a good name must be found! ;-)

-a
--
===============================================================================
| email :: ara [dot] t [dot] howard [at] noaa [dot] gov
| phone :: 303.497.6469
| Your life dwells amoung the causes of death
| Like a lamp standing in a strong breeze. --Nagarjuna
===============================================================================

Joshua Haberman · Oct 3, 2005

IO#sysread and IO#syswrite are possible candidates.
However, they may block when multithreaded because of select.

It seems that Ruby should keep track of whether a descriptor has
O_NONBLOCK set (like in OpenFile.mode) and not do the select if so.
On the other hand, that will break if O_NONBLOCK is set by a C
extension, or by another process that has the same ofile open. Sigh.
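
One way to sidestep a stale cached flag is to ask the kernel every time, at the cost of one fcntl call per check. A minimal sketch, assuming a POSIX platform; the helper names nonblocking? and set_nonblocking are hypothetical:

```ruby
require 'fcntl'

# Query the descriptor's current status flags instead of trusting a cached copy.
def nonblocking?(io)
  (io.fcntl(Fcntl::F_GETFL) & Fcntl::O_NONBLOCK) != 0
end

# Set or clear O_NONBLOCK while preserving the other status flags.
def set_nonblocking(io, enabled)
  flags = io.fcntl(Fcntl::F_GETFL)
  flags = enabled ? (flags | Fcntl::O_NONBLOCK) : (flags & ~Fcntl::O_NONBLOCK)
  io.fcntl(Fcntl::F_SETFL, flags)
end

r, w = IO.pipe

set_nonblocking(r, true)
on = nonblocking?(r)

set_nonblocking(r, false)
off = nonblocking?(r)

puts "after set: #{on}, after clear: #{off}"
```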

Nonblocking methods such as IO#nonblock_read and
IO#nonblock_write are a good idea. If matz accepts them, I'll
definitely implement them. However, I'm not sure that matz
thinks the method names are good enough.

Well, I don't know if it will help convince matz, but djb advocates that
naming scheme as well, for C:

http://cr.yp.to/unix/nonblock.html

Now that I think of it, implementing IO#nonblock_read and
IO#nonblock_write as extensions isn't feasible for the 1.8 branch,
since it uses standard I/O which is incompatible with O_NONBLOCK. Sigh.

I guess for now I'll have to use sysread/syswrite, along with a home-
rolled buffering layer.
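
A rough sketch of what the read side of such a layer could look like. The class LineBuffer and its methods are hypothetical names for illustration; it assumes the caller only invokes fill after IO.select has reported the descriptor readable.

```ruby
# Hypothetical helper: accumulates raw sysread chunks and hands out whole lines.
class LineBuffer
  def initialize(io)
    @io = io
    @buf = String.new
  end

  # Pull one chunk from the descriptor. Returns false at end of file.
  def fill(max = 4096)
    @buf << @io.sysread(max)
    true
  rescue EOFError
    false
  end

  # Remove and return the next complete line, or nil if none is buffered yet.
  def next_line
    idx = @buf.index("\n")
    idx && @buf.slice!(0..idx)
  end
end

r, w = IO.pipe
lines = LineBuffer.new(r)

w.syswrite("first li")
lines.fill
partial = lines.next_line        # no newline seen yet

w.syswrite("ne\nsecond\n")
lines.fill
first  = lines.next_line
second = lines.next_line

puts [partial, first, second].inspect
```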

R. Stevens says

using standard I/O with nonblocking descriptors,
a recipe for disaster

I guess that says it all.

Josh

Tanaka Akira · Oct 3, 2005

Joshua Haberman said:
It seems that Ruby should keep track of whether a descriptor has
O_NONBLOCK set (like in OpenFile.mode) and not do the select if so.
On the other hand, that will break if O_NONBLOCK is set by a C
extension, or by another process that has the same ofile open. Sigh.

Yes. The shared fd is a hard problem to solve.

Now that I think of it, implementing IO#nonblock_read and
IO#nonblock_write as extensions isn't feasible for the 1.8 branch,
since it uses standard I/O which is incompatible with O_NONBLOCK. Sigh.

They are not a problem if IO#sync = true.

Since streams created by Ruby (IO.pipe, TCPSocket.open, etc.)
have IO#sync = true by default, the problem does not occur in
most cases.

I guess for now I'll have to use sysread/syswrite, along with a home-
rolled buffering layer.

You need your own buffering layer if O_NONBLOCK is
used on Ruby 1.8. However, IO#sync = true is enough if
buffering is not required.

Robert Klemme · Oct 3, 2005

Tanaka Akira said:
Yes. The shared fd is a hard problem to solve.

They are not a problem if IO#sync = true.

Since streams created by Ruby (IO.pipe, TCPSocket.open, etc.)
have IO#sync = true by default, the problem does not occur in
most cases.

You need your own buffering layer if O_NONBLOCK is
used on Ruby 1.8. However, IO#sync = true is enough if
buffering is not required.

I have one question on this matter which I still don't understand (I'm not
so deep into C stdlib IO variants so please bear with me): why would anybody
want to use nonblocking IO (on the Ruby level, e.g. IO#read might not have
read anything on return even if the stream is not closed) in the light of
Ruby threads? I mean, with that one would have to build the multiplexing in
Ruby which is already present in the interpreter with multiple Ruby threads?
Are there situations that I'm not aware of where this is useful / needed?
Thanks!

Kind regards

robert

Tanaka Akira · Oct 4, 2005

Robert Klemme said:
I have one question on this matter which I still don't understand (I'm not
so deep into C stdlib IO variants so please bear with me): why would anybody
want to use nonblocking IO (on the Ruby level, e.g. IO#read might not have
read anything on return even if the stream is not closed) in the light of
Ruby threads? I mean, with that one would have to build the multiplexing in
Ruby which is already present in the interpreter with multiple Ruby threads?
Are there situations that I'm not aware of where this is useful / needed?

It is an interesting question that I also have.

I have asked it several times, so I know some answers.

1. A GUI framework has its own event-driven framework.

If a callback blocks, it blocks the entire GUI. That is not
acceptable.

2. A high-performance network server has its own event-driven
framework.

Some high-performance network servers use an application-level
event-driven framework. If an event handler blocks,
it blocks the entire application. That is not acceptable.

However, I'm not sure that it is appropriate to implement
a high-performance server in Ruby.

If an application-level event-driven framework is used,
application-level nonblocking I/O operations are required.

If there are other uses, I'd like to know.

Joshua Haberman · Oct 4, 2005

It is an interesting question that I also have.

I have asked it several times, so I know some answers.

1. A GUI framework has its own event-driven framework.

If a callback blocks, it blocks the entire GUI. That is not
acceptable.

2. A high-performance network server has its own event-driven
framework.

Some high-performance network servers use an application-level
event-driven framework. If an event handler blocks,
it blocks the entire application. That is not acceptable.

However, I'm not sure that it is appropriate to implement
a high-performance server in Ruby.

If an application-level event-driven framework is used,
application-level nonblocking I/O operations are required.

If there are other uses, I'd like to know.

Nonblocking I/O is useful if you are a server with some kind of
complex, global state, and lots of clients that can act on that
state. A good example would be a gaming server. If you handle every
client in its own thread, you need a big, coarse lock around your
global state. Once you're doing that, what's the point of
multithreading? It just makes things more complicated, and your
program's execution more difficult to understand.

You might have many IO objects open that are interrelated. Say your
program logic is something like:

when there's data available on object A, process it and send the
results to B and C
when there's data available on object B, process it and send the
results to A and C
when there's data available on object C, process it and send the
results to A and B

How should I break this down into threads? Three threads that block-
on-read for A, B, and C? But what if A and B get data at the same
time? They might interleave their writes to C. Do I put a mutex
around C?

For this case, it's a lot easier and more natural to write a main
loop like:

while true
  (read_ready, write_ready, err) = IO.select([A, B, C])
  read_ready.each { |io|
    output = process(io.read)
    [A, B, C].each { |client| client.write(output) unless client == io }
  }
end

Nonblocking I/O gives you more control over the execution of your
program, and frees you from the worries of synchronizing between
threads. And it's simpler than using threads for programs that
follow certain patterns.

Josh

David Gurba · Oct 4, 2005

Joshua said:
Nonblocking I/O is useful if you are a server with some kind of
complex, global state, and lots of clients that can act on that
state. A good example would be a gaming server. If you handle every
client in its own thread, you need a big, coarse lock around your
global state. Once you're doing that, what's the point of
multithreading? It just makes things more complicated, and your
program's execution more difficult to understand.

You might have many IO objects open that are interrelated. Say your
program logic is something like:

when there's data available on object A, process it and send the
results to B and C
when there's data available on object B, process it and send the
results to A and C
when there's data available on object C, process it and send the
results to A and B

How should I break this down into threads? Three threads that block-
on-read for A, B, and C? But what if A and B get data at the same
time? They might interleave their writes to C. Do I put a mutex
around C?

For this case, it's a lot easier and more natural to write a main
loop like:

while true
  (read_ready, write_ready, err) = IO.select([A, B, C])
  read_ready.each { |io|
    output = process(io.read)
    [A, B, C].each { |client| client.write(output) unless client == io }
  }
end

Nonblocking I/O gives you more control over the execution of your
program, and frees you from the worries of synchronizing between
threads. And it's simpler than using threads for programs that
follow certain patterns.

Josh

This sounds really interesting, but I don't fully understand the while
loop. Nonblocking IO sends/receives data when it's ready/requested, i.e.
it doesn't block for the data, right?

I have written some threaded applications. A Java tic-tac-toe game which
had players and observers of a game that all viewed a global 'board'
state. Methods to modify the game state were thread safe with mutexes,
how is what you're saying different...? Any info appreciated...

ooooo my 1st post to the mailing list

Robert Klemme · Oct 4, 2005

Joshua said:
Nonblocking I/O is useful if you are a server with some kind of
complex, global state, and lots of clients that can act on that
state. A good example would be a gaming server. If you handle every
client in its own thread, you need a big, coarse lock around your
global state. Once you're doing that, what's the point of
multithreading? It just makes things more complicated, and your
program's execution more difficult to understand.

You might have many IO objects open that are interrelated. Say your
program logic is something like:

when there's data available on object A, process it and send the
results to B and C
when there's data available on object B, process it and send the
results to A and C
when there's data available on object C, process it and send the
results to A and B

How should I break this down into threads? Three threads that block-
on-read for A, B, and C? But what if A and B get data at the same
time? They might interleave their writes to C. Do I put a mutex
around C?

For this case, it's a lot easier and more natural to write a main
loop like:

while true
  (read_ready, write_ready, err) = IO.select([A, B, C])
  read_ready.each { |io|
    output = process(io.read)
    [A, B, C].each { |client| client.write(output) unless client == io }
  }
end

Nonblocking I/O gives you more control over the execution of your
program, and frees you from the worries of synchronizing between
threads. And it's simpler than using threads for programs that
follow certain patterns.

Thanks for the feedback. Even in this case I'd probably choose a
different architecture. I dunno which of these is easier but here's how
I'd do it:

Have a thread per open client connection that reads requests. Requests
are put into a queue (thread safe!). Then I'd have a number of workers
that fetch from the task queue and do the work. Either each worker sends
results directly to affected clients or puts results into a second queue
from which a number of sender threads fetch their tasks and send
responses. There could also be dedicated sender threads per client.

If there are no dedicated sender threads you would need just a single
point of synchronization (apart from what the queue does internally) for the
sending socket in order to prevent multiple responses from interfering (if
it's possible that a request is received while the answer to another
request is being processed).
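
A compact sketch of the queue-based layout described above, with the network parts replaced by in-memory stand-ins. The request hashes, the :stop marker and the upcase "processing" are illustrative assumptions, not part of the proposal.

```ruby
require 'thread'

requests  = Queue.new
responses = Queue.new

# Workers take requests from the task queue until they see the :stop marker.
workers = 2.times.map do
  Thread.new do
    while (req = requests.pop) != :stop
      responses << [req[:client], req[:body].upcase]  # stand-in for real work
    end
  end
end

# A single sender thread serializes all outgoing traffic, so two responses
# to the same client can never interleave.
sent = []
sender = Thread.new do
  while (res = responses.pop) != :stop
    sent << res
  end
end

# In a real server, one reader thread per connection would push these.
requests << { client: :a, body: "hello" }
requests << { client: :b, body: "world" }

workers.size.times { requests << :stop }
workers.each(&:join)
responses << :stop
sender.join

puts sent.sort.inspect
```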

Kind regards

robert

snacktime · Oct 4, 2005

This sounds really interesting, but I don't fully understand the while
loop. Nonblocking IO sends/receives data when it's ready/requested, i.e.
it doesn't block for the data, right?

I have written some threaded applications. A Java tic-tac-toe game which
had players and observers of a game that all viewed a global 'board'
state. Methods to modify the game state were thread safe with mutexes,
how is what you're saying different...? Any info appreciated...

ooooo my 1st post to the mailing list

In simple terms, with an event framework you have one main event loop that
keeps a state engine of sorts for all the current IO operations going on. In
your code, when you need to do an IO operation, you send it to the event
loop, register a callback, and then when there is something to read the
event loop fires the callback. Event frameworks such as Python's Twisted
provide a lot of the internal nonblocking IO functions for you so you don't
have to implement them yourself. For example writing to a file, waiting on a
socket, etc. You call the higher-level function, register a callback, and
continue on your way.
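
To make the callback idea concrete, here is a very small sketch in Ruby. TinyReactor and its methods are hypothetical names, and it is far simpler than a real framework: it handles readability only and dispatches with IO.select.

```ruby
# Hypothetical minimal event loop: maps IO objects to callbacks.
class TinyReactor
  def initialize
    @readers = {}
  end

  # Register a block to be called when +io+ has data to read.
  def on_readable(io, &callback)
    @readers[io] = callback
  end

  def remove(io)
    @readers.delete(io)
  end

  # Wait up to +timeout+ seconds, then fire the callbacks of readable
  # descriptors. Returns the number of callbacks fired.
  def run_once(timeout = nil)
    return 0 if @readers.empty?
    ready, = IO.select(@readers.keys, nil, nil, timeout)
    return 0 unless ready
    ready.each { |io| @readers[io].call(io) }
    ready.size
  end
end

r, w = IO.pipe
received = []

reactor = TinyReactor.new
reactor.on_readable(r) { |io| received << io.readpartial(4096) }

idle = reactor.run_once(0.05)   # nothing written yet
w.write("event")
fired = reactor.run_once(0.05)  # callback runs now

puts "idle: #{idle}, fired: #{fired}, received: #{received.inspect}"
```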

Chris

snacktime · Oct 4, 2005

I have one question on this matter which I still don't understand (I'm not
so deep into C stdlib IO variants so please bear with me): why would anybody
want to use nonblocking IO (on the Ruby level, e.g. IO#read might not have
read anything on return even if the stream is not closed) in the light of
Ruby threads? I mean, with that one would have to build the multiplexing in
Ruby which is already present in the interpreter with multiple Ruby threads?
Are there situations that I'm not aware of where this is useful / needed?
Thanks!

Kind regards

robert

I don't know exactly why this is, but an event framework using a single
event loop is far more efficient than a bunch of threads each doing their
own IO. Now this is with Python and Perl; Ruby could be different, although
that would surprise me. On a number of applications that I have converted
from threads to an event loop, CPU usage dropped like a rock.

Not to mention that the issue of synchronization pretty much goes away.

Chris

Tanaka Akira · Oct 4, 2005

Joshua Haberman said:
Nonblocking I/O is useful if you are a server with some kind of
complex, global state, and lots of clients that can act on that
state. A good example would be a gaming server. If you handle every
client in its own thread, you need a big, coarse lock around your
global state. Once you're doing that, what's the point of
multithreading? It just makes things more complicated, and your
program's execution more difficult to understand.

I see.

while true
  (read_ready, write_ready, err) = IO.select([A, B, C])
  read_ready.each { |io|
    output = process(io.read)
    [A, B, C].each { |client| client.write(output) unless client == io }
  }
end

It seems too simplified to explain the nonblocking I/O problem.

O_NONBLOCK is required to keep write(2) from blocking the entire
process. But if write(2) doesn't block because of O_NONBLOCK,
some data may not be written. So the result of write(2) should
be checked, and the remaining data should be written later.

It can be implemented in two ways.

1. Use an event-driven framework and register an event
handler for writability of the client. Since the event
handler must not block, it needs a nonblocking write
operation.

2. Use a writing thread dedicated to the client.
Since the thread is dedicated to writing, it can use
a blocking write operation.
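
A sketch of the first approach, written against a later Ruby that provides IO#write_nonblock and IO#read_nonblock with exception: false (an assumption; these did not exist when this was posted). PendingWriter is a hypothetical name. The point is only that the byte count is checked and the unwritten tail is kept for the next writability event.

```ruby
# Hypothetical per-client outgoing buffer for use inside a select loop.
class PendingWriter
  def initialize(io)
    @io = io
    @pending = String.new
  end

  def enqueue(data)
    @pending << data
  end

  def pending?
    !@pending.empty?
  end

  # Write as much as the kernel accepts right now and keep the rest.
  # Returns the number of bytes written in this attempt.
  def flush_some
    return 0 unless pending?
    n = @io.write_nonblock(@pending, exception: false)
    return 0 if n == :wait_writable
    @pending = @pending.byteslice(n..-1)
    n
  end
end

r, w = IO.pipe
out = PendingWriter.new(w)
payload = "x" * 200_000            # larger than a typical pipe buffer
out.enqueue(payload)

received = String.new
while out.pending?
  # Watch for writability only while something is left to send.
  readable, writable = IO.select([r], [w])
  out.flush_some if writable.include?(w)
  received << r.read_nonblock(65_536) if readable.include?(r)
end

# Drain whatever is still sitting in the pipe.
loop do
  chunk = r.read_nonblock(65_536, exception: false)
  break unless chunk.is_a?(String)
  received << chunk
end

puts "sent #{payload.bytesize} bytes, received #{received.bytesize} bytes"
```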

Ara.T.Howard · Oct 4, 2005

It is an interesting question that I also have.

I have asked it several times, so I know some answers.

1. A GUI framework has its own event-driven framework.

If a callback blocks, it blocks the entire GUI. That is not
acceptable.

2. A high-performance network server has its own event-driven
framework.

Some high-performance network servers use an application-level
event-driven framework. If an event handler blocks,
it blocks the entire application. That is not acceptable.

However, I'm not sure that it is appropriate to implement
a high-performance server in Ruby.

If an application-level event-driven framework is used,
application-level nonblocking I/O operations are required.

If there are other uses, I'd like to know.

it's sort of the same thing as 2, but network intense clients might be written
more easily too... i've written code that was managing 100s of ssh connections
for example. i could have used threads but it was easier/more responsive to
just have an array of pipes and non-blocking reads.

the only other thing i can think of is any time you may actually be fine
blocking on a read, but wish to remain aware of time - eg. you don't want to
block too long - and here something like readpartial is great.

-a

Joshua Haberman · Oct 4, 2005

Nonblocking I/O is useful if you are a server with some kind of
complex, global state, and lots of clients that can act on that
state. A good example would be a gaming server. If you handle every
client in its own thread, you need a big, coarse lock around your
global state. Once you're doing that, what's the point of
multithreading? It just makes things more complicated, and your
program's execution more difficult to understand.

I see.

while true
  (read_ready, write_ready, err) = IO.select([A, B, C])
  read_ready.each { |io|
    output = process(io.read)
    [A, B, C].each { |client| client.write(output) unless client == io }
  }
end

It seems too simplified to explain the nonblocking I/O problem.

O_NONBLOCK is required to keep write(2) from blocking the entire
process. But if write(2) doesn't block because of O_NONBLOCK,
some data may not be written. So the result of write(2) should
be checked, and the remaining data should be written later.

Yes, the code I posted made some simplifying assumptions. If some of
the data was not written, you need to account for that somehow. The
options you suggest would work for ensuring that the write happens
eventually. But a nice thing about doing everything in a single
thread is that you can do better if you choose.

Imagine that you start with world W1. You get a message from client
A that updates the state of the world to W2. Call DELTA1 the message
you have to send to B and C to update them to W2. You write
DELTA1 to B, but C is not ready for writing. The next time through your event
loop, you get a message from client B updating the state of the world
to W3. DELTA2 updates W2 to W3, so you send that to A, but C is
still not ready.

Once C becomes ready, you could send C DELTA1 and DELTA2, or you
could be smart and combine those into a single DELTA3 that updates W1
to W3. DELTA3 will likely be smaller than DELTA1 + DELTA2. If you
had initially blocked-on-write to send DELTA1, you would not have
that option.
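
A toy sketch of that coalescing, under a loud assumption: a delta is modeled as a Hash from a changed key to its new value, so combining two deltas is a merge in which later values win. ClientOutbox and the board keys are made up for illustration; real deltas may need a smarter combine step.

```ruby
# Hypothetical per-client outbox that folds deltas together while the
# client is not ready for writing.
class ClientOutbox
  def initialize
    @pending = {}
  end

  # Fold a new delta into whatever is still waiting; later values win.
  def push(delta)
    @pending.merge!(delta)
  end

  # Hand over the combined delta once the client becomes writable.
  def take
    combined = @pending
    @pending = {}
    combined
  end
end

outbox_c = ClientOutbox.new
delta1 = { "a1" => "X", "b2" => "O" }   # W1 -> W2
delta2 = { "b2" => "X", "c3" => "O" }   # W2 -> W3

outbox_c.push(delta1)   # C is not ready, so queue instead of blocking
outbox_c.push(delta2)   # C is still not ready

delta3 = outbox_c.take  # one message that takes C from W1 to W3

puts delta3.inspect
puts "entries: #{delta3.size} instead of #{delta1.size + delta2.size}"
```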

Josh

Mark Cotner · Oct 4, 2005

Here's an example that I tested recently. The application is a network
management framework for polling millions of devices per hour. Threads are
nice, but context switching between enough threads to get the job done (175
per my testing) generates a ton of CPU load (load jumps to ~30 and 0% idle on
a 4-CPU box) when trying to maintain this many threads. It was written in the
producer/consumer pattern so thread startup isn't compounding the issue.
However, the same application, using Perl ('cause Ruby can't do this just yet)
asynchronous SNMP, polls twice as many devices in the same amount of time
with 2-4 processes, and the system load is ~1 and ~70% idle.

Granted, I'm doing some fairly extreme things, but it does help answer the
question of threads vs async IO.

'njoy,
Mark

Joshua Haberman · Oct 5, 2005

David Gurba said:
Joshua said:

while true
  (read_ready, write_ready, err) = IO.select([A, B, C])
  read_ready.each { |io|
    output = process(io.read)
    [A, B, C].each { |client| client.write(output) unless client == io }
  }
end

Nonblocking I/O gives you more control over the execution of your
program, and frees you from the worries of synchronizing between
threads. And it's simpler than using threads for programs that
follow certain patterns.

Josh

This sounds really interesting, but I don't fully understand the
while loop. Nonblocking IO sends/receives data when it's ready/
requested, i.e. it doesn't block for the data, right?

I'm not sure exactly what you're asking. Nonblocking I/O basically
tells the OS: "when I do a read() or write(), only perform as much of
the operation as you can without making me wait." If the OS cannot
perform *any* of the operation (because there is no data waiting to
read, or no buffer space available to write), the call errors with
EAGAIN.

IO.select is what you use to ask the OS what file descriptors are
available for reading or writing. IO.select is what blocks, until
one of your fds is available, or a timeout has elapsed. If you
didn't use select, you'd have to busy-wait by reading from the fd
over and over (getting EAGAIN every time). That would waste the
CPU. Instead, you ask select to block until a file descriptor is
available.
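
A short sketch of those two pieces together. It assumes a later Ruby where IO#read_nonblock exists and reports EAGAIN as an exception that includes IO::WaitReadable; the writer thread and the delays are only there to make the example self-contained.

```ruby
r, w = IO.pipe

# Nothing to read yet: the nonblocking call fails at once with an
# EAGAIN-class error instead of making us wait.
got_eagain = false
begin
  r.read_nonblock(1024)
rescue IO::WaitReadable
  got_eagain = true
end

# Something else delivers data a little later.
writer = Thread.new do
  sleep 0.1
  w.write("late data")
end

# Rather than retrying in a busy loop, block in select until the
# descriptor is ready (or 5 seconds pass).
ready = IO.select([r], nil, nil, 5)
data = ready ? r.read_nonblock(1024) : nil
writer.join

puts "EAGAIN first: #{got_eagain}, then read: #{data.inspect}"
```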

I have written some threaded applications. A Java tic-tac-toe game
which had players and observers of a game that all viewed a global
'board' state. Methods to modify the game state were thread safe
with mutexes, how is what you're saying different...? Any info
appreciated...

If you follow the pattern above, you don't have to make anything
thread-safe. You don't have to use mutexes. You don't have to think
about possibly bad interactions between threads like deadlock.
Everything happens in the same thread.

Josh
