[ANN] EventLoop 0.0.20050825.1600

Daniel Brockman

Hi list,

Due to the somewhat popular demand for an event loop for Ruby,
I've recently been working on packaging the one I've written
for a network application of mine (Refusde, an NMDC client).

With the help of Tilman Sauerbeck, I've now managed to put
together some documentation and a gem/tarball of what I've
decided is going to be the first publicly announced version.

Here's the canonical short package overview:

EventLoop is a simple IO::select-based main event loop
featuring IO event notification and timeout callbacks.
It comes with a signal system inspired by that of GLib.

The code is licensed under the GPL and can be found at
<http://www.brockman.se/software/ruby-event-loop/>.

At this point, some of you will probably want to see an
example of how it works --- a kind of screenshot.

For this purpose, I chose to implement a simple asynchronous
buffered IO reader:

  require "event-loop"

  class BufferedReader
    include SignalEmitter

    define_signals :line, :done

    def initialize(io, eol="\n")
      yield self if block_given?
      io = File.new(io) if io.kind_of? String
      buffer = String.new
      io.on_readable do
        begin
          buffer << io.readpartial(1024)
          while i = buffer.index(eol)
            signal :line, buffer.slice!(0, i)
            buffer.slice!(0, eol.size)
          end
        rescue EOFError
          signal :done, buffer
          io.close
        end
      end
    end
  end

  reader = BufferedReader.new("/etc/passwd") do |r|
    r.on_line { |content| puts "Line: #{content}" }
    r.on_done { |leftover| puts "Done: #{leftover}" }
    r.on_done { EventLoop.quit }
  end

  EventLoop.run

See how easy the event loop is to use, and how nicely it
blends into the rest of Ruby?

For good measure, maybe I should also attach a section of
the manual (i.e., the README file) that describes how event
loops fit into the rest of the world:


The Event Loop
==============

This section explains how IO multiplexing works in general
(albeit briefly and not very in-depth), and specifically the
issues relevant for Ruby applications. You may safely skip
it if you (a) already know this subject, or (b) don't care.

Plain ol' blocking IO works well when you're reading from
just a single file descriptor. But when you're interested
in a whole bunch of FDs, you can't wait for any single one
of them to become readable or writable, because then you'll
inevitably miss that happening to the other ones. Instead,
you need a multiplexer that can wait for them *all at once*.

There are a handful of low-level multiplexing primitives:
‘select’, ‘poll’, ‘epoll’, ‘/dev/poll’, and ‘kqueue’.
In addition, there are portable low-level wrapper libraries
such as libevent, which can use any of those primitives.
The event loop in this package uses the standard ‘select’
wrapper shipped with Ruby, ‘IO::select’. But in the future,
I'd like to use libevent instead, because that'd be cooler.
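
As a rough illustration of waiting on several descriptors at once, here is a minimal sketch that calls Ruby's standard IO.select directly. The helper name first_readable and the two pipes are made up for this example; they are not part of the EventLoop package.

```ruby
# Wait on several IO objects at once with IO.select (core Ruby only).
# first_readable is a hypothetical helper written for this sketch.
def first_readable(ios, timeout = 1.0)
  # IO.select blocks until at least one IO is readable or the timeout
  # expires; it returns nil on timeout.
  ready, = IO.select(ios, nil, nil, timeout)
  ready ? ready.first : nil
end

r1, w1 = IO.pipe
r2, w2 = IO.pipe

w2.write("hello\n")             # only the second pipe has data

io = first_readable([r1, r2])
puts "second pipe is readable" if io.equal?(r2)
puts io.gets                    # prints "hello"
```

A single call watches both pipes, so neither one can become readable unnoticed while the program is stuck waiting on the other.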

Most applications use a higher-level abstraction built on
top of the low-level multiplexer, usually called a ‘main
loop’, an ‘event loop’, or an ‘event source’. There are
also libraries such as liboop, which generalizes the event
source and event sink concepts, so that components (event
sinks) written against liboop become event-source-agnostic.

Actually, the combination of blocking IO and Ruby's green
threads works well in most cases where you would normally
use an event loop. When you call ‘IO#read’ on an empty file
descriptor, for instance, Ruby suspends that thread until
its internal event loop, known as the scheduler (currently
based on ‘select’), determines that the file descriptor has
become readable. In particular, Ruby never calls the
low-level ‘read’ function unless it knows that it will not
block (because ‘select’ said it wouldn't, but see below).

There are several reasons why you would use an event loop
such as the one implemented by this library instead of
not-so-plain ol' blocking IO with Ruby's green threads.

First of all, you may consider the event loop API more
pleasant than Ruby's threads and not-quite-blocking IO.
Otherwise, don't listen to me; go on using the latter. :)

Blocking IO can occasionally cause unexpected problems.
For example, in some cases a blocking read *can* block even
though select said that the file descriptor was readable.
This problem may be rare (it can happen, for instance, when
the checksum of a piece of data fails to match the payload),
but the bottom line is that non-blocking IO is safer.

Perhaps most importantly, while Ruby's threads are green,
they are still effectively preemptively scheduled, with all
the implications thereof --- in a word, synchronization hell.
By contrast, event handlers are executed in a strictly
sequential manner; an event loop will never run two event
handlers simultaneously. (Though, of course, all bets are
off if you run multiple event loops in separate threads.)
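
To make the "strictly sequential" point concrete, here is a toy select-based loop. It is only a sketch: TinyLoop and its methods are hypothetical names invented for this example, not the EventLoop API, and it handles readability events only.

```ruby
# A toy select-based loop. TinyLoop is a hypothetical class written for
# this sketch; it is not the API of the EventLoop package.
class TinyLoop
  def initialize
    @handlers = {}              # IO object => callback
  end

  # Register a block to call whenever +io+ becomes readable.
  def on_readable(io, &handler)
    @handlers[io] = handler
  end

  # Stop watching +io+.
  def ignore(io)
    @handlers.delete(io)
  end

  # Run until nothing is being watched any more.
  def run
    until @handlers.empty?
      ready, = IO.select(@handlers.keys)
      # Handlers run one after another in this thread, never concurrently.
      ready.each do |io|
        handler = @handlers[io]
        handler.call(io) if handler
      end
    end
  end
end

reader, writer = IO.pipe
writer.write("ping")
writer.close

tiny = TinyLoop.new
received = String.new
tiny.on_readable(reader) do |io|
  begin
    received << io.readpartial(1024)
  rescue EOFError
    tiny.ignore(io)             # run returns once nothing is watched
    io.close
  end
end
tiny.run
puts received                   # prints "ping"
```

Because every handler is called from the single loop in run, two handlers can never interleave, which is why no locking appears anywhere in the sketch.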

-- 
Daniel Brockman <[email protected]>
 
Tanaka Akira

Joel VanderWerf said:
I *love* ruby threads. Still, I wish ruby's thread scheduler would
handle more types of blocking than select can handle, such as waiting
for a file lock.

File#flock works well since Ruby 1.8.2. It blocks only the calling
thread. It doesn't block other threads.

% ruby-1.8.2 -ve '
f1 = open("z", "w")
f1.flock(File::LOCK_EX)
t = Thread.new {
  f2 = open("z", "w")
  p :f2_lock_start
  f2.flock(File::LOCK_EX)
  p :f2_lock_end
}
3.times {|i| p i; sleep 1 }
f1.flock(File::LOCK_UN)
t.join
'
ruby 1.8.2 (2004-12-25) [i686-linux]
:f2_lock_start
0
1
2
:f2_lock_end
 
Joel VanderWerf

Daniel said:
Or use a deterministic event loop and avoid the problem of
synchronization altogether.

In a callback-based system, you have to deal with callbacks.
In a preemptively multithreaded system, you have to deal
with synchronization. It's a tradeoff, and largely a matter
of taste, preference and familiarity.

You might also ask yourself, do you really *need* to have
the scheduler arbitrarily switch contexts back and forth?
Do your event handlers really take that much time to run?
If so, fine. Otherwise, why not have determinism instead?

That's a good point.

I do like what using threads does to the architecture of my program.
It's very easy to separate all the functionality out into components,
each of which performs a specific task, has a ThreadGroup to manage its
own threads, and communicates with other components by queues. The
components can be tested independently and even executed in other
processes/hosts, if you replace Queue with something based on Sockets
and Marshal, or DRb.
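
A minimal sketch of that queue-based style, assuming nothing beyond Ruby's standard Queue and Thread. The component (start_upcaser) and the :stop sentinel are invented for illustration.

```ruby
require "thread"  # Queue is built in on current Rubies; the require is harmless

# A "component" that shares nothing with its caller except two queues.
# start_upcaser and the :stop sentinel are hypothetical names.
def start_upcaser(input, output)
  Thread.new do
    # Queue#pop blocks only the calling thread until an item arrives.
    while (msg = input.pop) != :stop
      output << msg.upcase
    end
    output << :stop
  end
end

requests = Queue.new
replies = Queue.new
worker = start_upcaser(requests, replies)

%w[foo bar].each { |word| requests << word }
requests << :stop
worker.join

until (reply = replies.pop) == :stop
  puts reply                    # prints "FOO", then "BAR"
end
```

Since the component only ever touches its two queues, it can be tested by feeding a queue by hand, and the queues are the one place that would need replacing to move it into another process.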

So I guess another consideration in making this tradeoff is the degree
to which the system as a whole can be decoupled.

If, for example, the handlers are making atomic updates to some
monolithic data structure, or to a GUI, then decoupling doesn't make
sense: the overhead to make the updates atomic would be too high.
 
Daniel Berger

Joel said:
I updated FSDB(*) to 0.5 to take advantage of this, in case it's running
on 1.8.2 or better. In code with several processes each with several
threads, I see about a 12%-17% speed boost, because of not having to use
the polling hack.

(*) http://redshift.sourceforge.net/fsdb/

In other news, 1989 called. They want their version numbering system back.

Please give us a sane version number. :)

Dan
 
Joel VanderWerf

Daniel said:
Joel VanderWerf wrote: ...

In other news, 1989 called. They want their version numbering system back.

Please give us a sane version number. :)

Dan

I'm still living in 1989 in many ways....

It's not quite a three-year old project yet, so I don't think it
deserves 1.0 status ;)

Or do you mean more digits? (Internally, it is 0.5.5, but, for a minor
project like this, I only release the last in each 0.x series.)
 
Tanaka Akira

Daniel Brockman said:
In a callback-based system, you have to deal with callbacks.
In a preemptively multithreaded system, you have to deal
with synchronization. It's a tradeoff, and largely a matter
of taste, preference and familiarity.

It seems that a giant lock (like Python's GIL) could be a compromise
between them.

Apart from that, Ruby's IO methods are not so good for an event loop.
You may get frustrated when you find that some methods block even if
O_NONBLOCK is set.

The blocking behavior is good for threaded programs. The context
switch behind the blocking is enough to get work done, because the
work is held by other threads. So the blocking behavior makes
threaded programs happy even if O_NONBLOCK is set. O_NONBLOCK is
still required to keep a write operation from blocking the entire
process.

However, the behavior is bad for event-loop-style programs, because
the work is held by the event loop in the caller's thread.

So I think it is good to have both blocking methods and nonblocking
methods. The nonblocking methods should make event-loop-style
programs happy. However, matz has not accepted this because good names
for nonblocking methods have not been found yet. Recently I proposed
connect_nonblock, nonblock_connect, and nbconnect for nonblocking
connect, but they were rejected.
 
Joel VanderWerf

Tanaka said:
Ruby does the polling. You may call it a hack.

Oh, well as long as ruby does it, it's more efficient than me doing it,
so less of a hack.
 
Tanaka Akira

Bill Kelly said:
Wow - the main thing holding back progress on this front is
method names? I could embrace connect_nonblock or nbconnect,
there.

Do you have a problem with threads?

If you use threads, nonblocking methods are not required in general.

I'd like to know why people don't use threads.

Bill Kelly said:
The blocking I/O issues are the thorniest problem for me
writing applications in ruby. (Of course, it's 1000 times
worse on Windows, ... where nonblocking I/O is apparently not
supported at all yet. That is just a nightmare.)

I heard Windows has nonblocking I/O for sockets.

Bill Kelly said:
But regarding the method names - I'm wondering - are separate
methods really needed? Are there any cases where Ruby can't
just inspect the fcntl() flags of the socket, and if
O_NONBLOCK is set, provide nonblocking behavior? You mentioned
connect(), which is an instance method. Couldn't connect()
just check for O_NONBLOCK? Why would a separate method be
needed? (Sorry if this is a FAQ. :)

1. Threaded programs need blocking methods for an IO object with
O_NONBLOCK. O_NONBLOCK is required to keep write operations from
blocking the entire process. But threaded programs still need
blocking behavior because most threaded programs don't expect
EAGAIN. I think nonblocking methods are better than implementing
an EAGAIN retry loop in every threaded program.

2. There is no F_GETFL on Windows.
Ruby cannot test O_NONBLOCK is set/clear on a fd. So connect
method cannot check O_NONBLOCK.
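
For readers wondering what an "EAGAIN retry loop" looks like in practice, here is a hedged sketch. It uses IO#write_nonblock and IO::WaitWritable, which exist in current Ruby but postdate this thread; the helper name write_fully is made up.

```ruby
# Write all of +data+ to +io+ without ever blocking the whole process.
# write_fully is a hypothetical helper; write_nonblock and
# IO::WaitWritable come from later Ruby versions than the one discussed.
def write_fully(io, data)
  remaining = data.dup
  until remaining.empty?
    begin
      written = io.write_nonblock(remaining)
      remaining = remaining.byteslice(written..-1)
    rescue IO::WaitWritable     # raised for EAGAIN / EWOULDBLOCK
      IO.select(nil, [io])      # wait here until the IO is writable again
      retry
    end
  end
  data.bytesize
end

reader, writer = IO.pipe
payload = "x" * 200_000         # larger than a typical pipe buffer
drain = Thread.new { reader.read }
write_fully(writer, payload)
writer.close
puts drain.value.bytesize       # prints 200000
```

This is the loop every threaded program would have to carry around if the ordinary methods raised EAGAIN, which is the argument for keeping the blocking methods and adding separate nonblocking ones.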
 
Bill Kelly

Hi,
From: "Wilson Bilkovich" <[email protected]>


Windows actually has plenty of support for nonblocking operations on
sockets and files.
Here's an example hit from MSDN:
http://msdn.microsoft.com/library/en-us/ipc/base/named_pipe_type_read_and_wait_modes.asp

Thanks; I should have been more clear... I'd posted earlier
this year, in
http://ruby-talk.org/cgi-bin/scat.rb/ruby/ruby-talk/138533
about a way to put windows sockets into nonblocking mode.

What I meant by "nightmare" is that few (if any) nonblocking
operations in windows are supported in ruby.

One thing I've wondered, is if a win32 socket were put into
nonblocking mode via a C extension (I notice ruby's win32.c
already defines a rb_w32_ioctlsocket() ... but nothing seems
to use it) ... Would ruby's scheduler on win32 work correctly
with windows sockets in nonblocking mode? I haven't tried
this yet.


Regards,

Bill
 
Bill Kelly

Hi,

Tanaka Akira said:
Do you have a problem with threads?

If you use threads, nonblocking methods are not required in general.

I'd like to know why people don't use threads.

I'm about to head to the airport so I'll just say that
I'm still experimenting. I sometimes develop a system
with threads, because I think select() was a pain in the
last system I did. Then I get annoyed with the new
threaded system, and go back to select() on the next one.

I keep encountering trade-offs, and I'm not sure which
way I like best.

Tanaka Akira said:
2. There is no F_GETFL on Windows.
Ruby cannot test O_NONBLOCK is set/clear on a fd. So connect
method cannot check O_NONBLOCK.

:( OK thanks.

Well, what if we were to add a #nonblocking= method to IO,
or at least to Socket?

So the programmer could say: socket.nonblocking = true

And have Ruby perform the appropriate action behind the
scenes?
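
One way such a setter could look on Unix, as a sketch only: NonblockingFlag, nonblocking= and nonblocking? are hypothetical names, and the code depends on F_GETFL, which (as noted above) is not available on Windows, so it does not solve the portability problem.

```ruby
require "fcntl"

# Hypothetical mixin providing io.nonblocking = true/false.
# Unix-only sketch: it depends on fcntl(F_GETFL) and fcntl(F_SETFL).
module NonblockingFlag
  def nonblocking=(enabled)
    flags = fcntl(Fcntl::F_GETFL)
    flags = enabled ? (flags | Fcntl::O_NONBLOCK) : (flags & ~Fcntl::O_NONBLOCK)
    fcntl(Fcntl::F_SETFL, flags)
  end

  def nonblocking?
    (fcntl(Fcntl::F_GETFL) & Fcntl::O_NONBLOCK) != 0
  end
end

reader, writer = IO.pipe
writer.extend(NonblockingFlag)
writer.nonblocking = true
puts writer.nonblocking?        # prints true
```

The setter only flips the descriptor flag. Whether Ruby's higher-level methods such as puts would then behave in a nonblocking way is a separate question, and it is the one under discussion in this thread.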


Regards,

Bill
 
Yohanes Santoso

Daniel Brockman said:
You might also ask yourself, do you really *need* to have
the scheduler arbitrarily switch contexts back and forth?
Do your event handlers really take that much time to run?
If so, fine. Otherwise, why not have determinism instead?

To nitpick, neither pre-emptive threading nor cooperative threading
(of which explicit event handling loop is a form of) has anything to
do with determinism.

It is what is being executed in that thread that determines whether it
is deterministic or not.

YS.
 
Daniel Brockman

Yohanes Santoso said:
To nitpick, neither pre-emptive threading nor cooperative
threading (of which explicit event handling loop is a form
of) has anything to do with determinism.

To nitpick back, I think you overstated that claim a bit.
Cooperatively threaded systems are deterministic by default;
pre-emptively scheduled ones are probabilistic by default.

If you write a multithreaded program without keeping
synchronization in mind, it is likely to still end up
essentially deterministic under cooperative threading.
If you are using pre-emptive threading, however, you are
very likely to introduce race conditions.

So what I'm saying here is that while I agree that the
determinism of a correctly written program does not depend
fundamentally on the kind of threading in use, I must object
to the claim that ``[neither threading model] has anything
to do with determinism.''

In a cooperatively multithreaded program, control progresses
linearly through the source --- every line of code will be
executed immediately after the previous one has finished.
In a pre-emptively scheduled one, on the other hand, control
jumps around probabilistically. Determinism is clearly
relevant here, IMHO.

But I see your point. I did sort of imply that pre-emptive
threading leads to non-determinism, which might not be the
fairest way of putting it. Sorry about that.

Yohanes Santoso said:
It is what is being executed in that thread that
determines whether it is deterministic or not.

I agree. It's just you don't have to put anything fancy in
cooperative threads to make them deterministic, because they
already are by default. Unless you put `rand' everywhere.
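
A small sketch of the synchronization burden discussed above, using only Ruby's core Thread and Mutex. The method name and the counts are arbitrary choices for illustration. Whether the unsynchronized version would actually lose updates depends on the interpreter and its scheduler, so only the locked version is shown.

```ruby
# Shared counter updated from several preemptively scheduled threads.
# count_with_mutex is a hypothetical helper written for this sketch.
def count_with_mutex(thread_count, increments)
  counter = 0
  lock = Mutex.new
  threads = Array.new(thread_count) do
    Thread.new do
      increments.times do
        # Without the lock, this read-modify-write could interleave with
        # another thread's and lose an update.
        lock.synchronize { counter += 1 }
      end
    end
  end
  threads.each(&:join)
  counter
end

puts count_with_mutex(4, 10_000)   # prints 40000
```

In a single-threaded event loop the same increment needs no lock at all, because no other handler can run in the middle of it.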
 
snacktime

Tanaka Akira said:
Do you have a problem with threads?

If you use threads, nonblocking methods are not required in general.

I'd like to know why people don't use threads.

Performance and resource consumption is it for me. I have a
client/server transaction processing application written in twisted
that can do about 300 tps and use less than 5% of the cpu on a 3ghz
pentium. Each transaction does 2-3 database queries, an http post to
a remote server, and some text formatting. It does use a thread
pool for database connections but that's it. Since the http post
takes 2-3 seconds to complete, there are anywhere from 900 to 2000
active client connections at any one time. The server uses a steady
30mb of ram the whole time.


Chris
 
Tanaka Akira

snacktime said:
Performance and resource consumption is it for me. I have a
client/server transaction processing application written in twisted
that can do about 300 tps and use less than 5% of the cpu on a 3ghz
pentium. Each transaction does 2-3 database queries, an http post to
a remote server, and some text formatting. It does use a thread
pool for database connections but that's it. Since the http post
takes 2-3 seconds to complete, there are anywhere from 900 to 2000
active client connections at any one time. The server uses a steady
30mb of ram the whole time.

I see. I never experienced such applications, though.
 
Bill Kelly

Tanaka Akira said:
The trade-off should not be big, apart from programming style, since
Ruby's thread mechanism uses select(). You use select() anyway,
directly or indirectly. The thread mechanism can do what IO.select
can and vice versa, in principle.

Right... For the program I'm writing right now, if I were
using threads, what I'd like to have is multiple
pairs of threads--a read thread doing

sock.gets()

and a write thread doing

sock.puts(line)

... but I'm afraid the #puts will block my whole process,
potentially. . . . So I can break that down into select()
and send() with NONBLOCK ... But then I'm afraid to use
puts() on the same socket, because I fear mixing high-level
gets/puts with low-level send/recv... So I presume I need
to break both threads into select() and send/recv...

And so it turns into a bigger chore than it ought to be
in Ruby... :( And so at that point I think why not just
have one thread with a central select() ...

Well, ... come to think of it - another reason I'd decided
to try a central select() again, was my previous program
that massively used threads in ruby would pause occasionally,
and I could never track down what the heck the process was
doing when it was paused. (This wasn't the UDP checksum
thing - these were pauses from maybe a fraction of a second
to 2 or 3 seconds.) It just happened frequently enough to
be annoying, but infrequently enough that it was difficult
to trace.... And so I remember thinking, if this were
single-threaded, it would be, in theory, easier to determine
where the program was paused in these situations.

However - I deduced later, it may have been doing garbage
collection and causing page swaps. (I had a large in-memory
hash table.) If that were the case, the mystery would have
been the same in a single-threaded ruby app. :)

So - I don't know. One thing I'm sure of is that if Ruby
handled nonblocking better behind the scenes, network
programming in ruby could be as much of a joy as most other
ruby programming is.

Tanaka Akira said:
I heard sock.fcntl(Fcntl::F_SETFL, File::NONBLOCK) makes sock
nonblocking mode on Windows. However sock.fcntl(Fcntl::F_GETFL)
doesn't work.

Is this new? I don't seem to have File::NONBLOCK in my
ruby 1.8.2 (2004-12-25) [i386-mswin32]


Regards,

Bill
 
Tanaka Akira

Bill Kelly said:
Right... For the program I'm writing right now, if I were
using threads, what I'd like to have is multiple
pairs of threads--a read thread doing

sock.gets()

and a write thread doing

sock.puts(line)

... but I'm afraid the #puts will block my whole process,
potentially. . . . So I can break that down into select()
and send() with NONBLOCK ... But then I'm afraid to use
puts() on the same socket, because I fear mixing high-level
gets/puts with low-level send/recv... So I presume I need
to break both threads into select() and send/recv...

I can understand the fear. However I'm not sure how it can be
eliminated. Some documentation might help.

Bill Kelly said:
So - I don't know. One thing I'm sure of is that if Ruby
handled nonblocking better behind the scenes, network
programming in ruby could be as much of a joy as most other
ruby programming is.

I'm trying.

Bill Kelly said:
Is this new? I don't seem to have File::NONBLOCK in my
ruby 1.8.2 (2004-12-25) [i386-mswin32]

I'm not sure. Maybe after that.
 
Bill Kelly

Tanaka Akira said:
I can understand the fear. However I'm not sure how it can be
eliminated. Some documentation might help.

Wow, I guess I did say "afraid" a lot of times there.

As I mentioned earlier, I'm still experimenting with
multi-threaded and single-threaded-select() based
implementations.

I wrote a new single-threaded-select() implementation, but
it *was* a lot more code than a threaded version using Queue
would have been. So this afternoon I put the single-threaded
version aside, and coded up a multi-threaded version, using
gets / puts as I described above.

The multi-threaded version looks nice, but here's what
I'm getting on Windows:

^^^^^^^^ This call never returns, the whole process hangs,
apparently.

This is ruby 1.8.2 (2004-12-25) [i386-mswin32]

I must admit I didn't expect it to block the process in this
situation. Am I doing something stupid?

... Hmmm, I'm getting this same hanging behavior in Linux,
regardless of whether the socket is in O_NONBLOCK mode.

Hrm.. It appears this is hanging due to the TCPServer being
in the same Ruby process as the client.

Or at least it seemed to make a difference on Linux... But

^^^^^^^^ hang


Am I doing something dumb here? It seems this should be
legal.


Thanks,

Regards,

Bill
 
