Rev/actor TCP monkey patching

fedzor

Short and sweet -

How exactly does one monkey patch Rev's TCP over Net::HTTP-s?

Thanks,
-------------------------------------------------------|
~ Ari
seydar: it's like a crazy love triangle of Kernel commands and C code
 
Tony Arcieri

[Note: parts of this message were removed to make it a legal post.]

Short and sweet -

How exactly does one monkey patch Rev's TCP over Net::HTTP-s?

Check out the SVN trunk:

svn checkout http://rev.rubyforge.org/svn/

There's now a Rev::SSL module you can extend any subclass of Rev::TCPSocket
with (this includes Revactor::TCP::Socket).

To use it with Rev::HttpClient, just:

require 'rev/ssl'

and subclass HttpClient, defining the following callbacks:

def on_connect
  super
  extend(Rev::SSL)
  ssl_client_start
end

You'll also need to:

def on_ssl_connect
  request(...) # Make the HTTP request after the SSL handshake completes
end

...to initiate the HTTP request after the SSL handshake is complete.

There's a corresponding #on_ssl_error(ex) method you can define to capture
any errors which occur during the SSL handshake (or any attempts to
negotiate a new session).
You can also:

def ssl_context
  OpenSSL::SSL::SSLContext.new(...)
end

...to create an OpenSSL context to use for the SSL session. By default no
certificate verification takes place, so if you'd like any, add the
appropriate certificates to the SSL context object.
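
For the certificate verification mentioned above, a minimal sketch of a context that verifies the peer might look like the following. It uses only Ruby's standard OpenSSL bindings. The helper name build_verifying_context is made up for illustration, and trusting the system's default CA paths is an assumption about what you want.

```ruby
require 'openssl'

# Hypothetical helper (not part of Rev): builds a context that
# verifies the server certificate against the system CA bundle.
def build_verifying_context
  store = OpenSSL::X509::Store.new
  store.set_default_paths                      # assumption: system CAs are trusted

  ctx = OpenSSL::SSL::SSLContext.new
  ctx.verify_mode = OpenSSL::SSL::VERIFY_PEER  # fail the handshake on an unverifiable peer
  ctx.cert_store  = store
  ctx
end
```

An ssl_context callback could then return build_verifying_context rather than a bare context.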

If you're having any trouble, you might take a look at the Revactor SVN
trunk:

svn checkout http://revactor.rubyforge.org/svn/

Revactor's HttpClient is already configured for SSL support, and wraps
everything up a lot better, e.g.:

response = Revactor::HttpClient.get("https://your.ssl.server.here/")

This also takes care of some of the other annoyances of HTTP, like following
redirects.

If you don't want to use Revactor directly, you can at least use it as a
reference for using Rev::HttpClient in conjunction with SSL. Just have a
look at lib/revactor/http_client.rb

I'd like to wrap up SSL support in Rev's HttpClient a bit more nicely, but
for now my real priority was getting it working in Revactor.
 
Tony Arcieri

[Note: parts of this message were removed to make it a legal post.]

Short and sweet -

How exactly does one monkey patch Rev's TCP over Net::HTTP-s?

I probably should've mentioned this up front, but you can save yourself a
lot of grief by just using Revactor:
=> #<Revactor::HttpResponse:0x5d30f4
@client=#<Revactor::HttpClient:0x5d5020>, @status=200, @reason="OK",
@version="HTTP/1.1", @content_length=nil, @chunked_encoding=true,
@header_fields={"Cache-Control"=>"private",
"Set-Cookie"=>"PREF=ID=b7f27e80b36a740b:TM=1202788671:LM=1202788671:S=CIZI75tKEMuS7q3X;
expires=Thu, 11-Feb-2010 03:57:51 GMT; path=/; domain=.google.com",
"Server"=>"gws", "Transfer-Encoding"=>"chunked", "Date"=>"Tue, 12 Feb 2008
03:57:51 GMT", "Connection"=>"Close"}, @content_type="text/html;
charset=ISO-8859-1">
=> "<html><head><meta http-equiv=\"content-type\" content=\"text/html;
charset=ISO-8859-1\"><title>Google</title> ...

That's it.

Just grab the Rev and Revactor svn repositories, "rake gem" in both, and
install the gems. You get all the benefits of an evented HTTP client
without the inversion-of-control headaches typically associated with evented
programming.
 
fedzor

Thank you so much!

This has really saved me a lot of time and given me some more speed ;-)

On a side note, would you be able to post some more examples using
Actors to achieve concurrency? At first I understood your
echo server, but then I tested out actors myself and everything I
knew fell apart....

Thanks,
Ari
--------------------------------------------|
If you're not living on the edge,
then you're just wasting space.
 
Tony Arcieri

[Note: parts of this message were removed to make it a legal post.]

On a side note, would you be able to post some more examples using
Actors to achieve concurrency? At first I understood your
echo server, but then I tested out actors myself and everything I
knew fell apart....

Here's a simple example. It will create a new Actor for each specified URL,
which probably isn't the behavior you want. You'll instead want something
like a queue. Also, this will gib if you have duplicates in the URL list.
And as another bonus, I haven't tested it, but I think it will work :)

However, it should be enough to get you started:

require 'revactor'

# A list of URLs we want to fetch
url_list = ['http://www.google.com', ...]

# Capture the parent Actor to send messages to
parent = Actor.current

# Spawn a new Actor for each URL we want to fetch
url_list.each do |u|
  Actor.spawn do
    # Request the URL
    response = Revactor::HttpClient.get(u)

    # Consume and store the body if the response status was 200
    response.body if response.status == 200

    # Close the connection
    response.close

    parent << [:http_response, u, response]
  end
end

# Store all the responses in a hash mapping URLs to the responses
responses = {}

# Consume all the responses
while responses.size < url_list.size
  Actor.receive do |filter|
    # Catch messages which start with :http_response and contain two objects
    filter.when(Case[:http_response, Object, Object]) do |_, url, response|
      # Key each response by its URL so the hash grows and the loop can finish
      responses[url] = response
    end
  end
end

# Inspect the responses
p responses
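
The "something like a queue" idea can be sketched with a fixed number of workers pulling URLs from a shared queue. Note that this sketch deliberately swaps in plain threads and Queue from Ruby's standard library rather than Revactor Actors, so it only illustrates the shape of the approach. fake_fetch and fetch_all are hypothetical names, and fake_fetch stands in for a real HTTP call.

```ruby
require 'thread'

# Hypothetical stand-in for an HTTP fetch; swap in a real client call.
def fake_fetch(url)
  "body of #{url}"
end

# Fetch every distinct URL using at most worker_count concurrent workers.
def fetch_all(urls, worker_count = 3)
  jobs = Queue.new
  urls.uniq.each { |u| jobs << u }  # uniq sidesteps the duplicate-URL problem
  results = Queue.new

  workers = Array.new(worker_count) do
    Thread.new do
      loop do
        url = begin
          jobs.pop(true)            # non-blocking pop raises ThreadError when empty
        rescue ThreadError
          break                     # no work left, so this worker exits
        end
        results << [url, fake_fetch(url)]
      end
    end
  end
  workers.each(&:join)

  # Collect the results into a hash mapping URLs to bodies.
  responses = {}
  until results.empty?
    url, body = results.pop
    responses[url] = body
  end
  responses
end

p fetch_all(['http://www.google.com', 'http://www.google.com'])
```

Because the job list is filled before any worker starts, an empty queue reliably means the work is done.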
 
fedzor

# Capture the parent Actor to send messages to
parent = Actor.current

# Spawn a new Actor for each URL we want to fetch
url_list.each do |u|
  Actor.spawn do
    # Request the URL
    response = Revactor::HttpClient.get(u)

    # Consume and store the body if the response status was 200
    response.body if response.status == 200

    # Close the connection
    response.close

    parent << [:http_response, u, response]
  end
end

Here is where I've found a problemo will occur. In my *ahem*
extensive *ahem* research with your fine fine actors ;-), concurrency
only exists when run within an actor - Actor.current has failed me
here! Example:

foods = ['chocolate', 'seltzer', 'awesome sauce']
foods.each do |food|
  Actor.spawn(food) do |f|
    sleep 1
    puts f
  end
end
puts "foods rounded up"

# chocolate
# foods rounded up

Uhoh! Not quite. Lemme try something different -

Well, whatever, I couldn't reproduce that code, because I updated to
your svn code. The one released as a gem required me to embed it all
within an actor before getting any sort of concurrency. But
basically, I sent some snail mail to an actor and it waited on
quitting until all of the internal actors finished working
concurrently. It was really awesome and I understood it! And then I
reinstalled revactor. oof! Better work something out.

Do you think you could help me understand what I would need to do to
get awesome sauce printed?

Thanks,
-------------------------------------------------------|
~ Ari
if god gives you lemons
YOU FIND A NEW GOD
 
fedzor

As a follow up, I realized why it wasn't working: bug in filters. I'm
too scared to patch it up, but I think I'll do that anyways. Example
problem:

myactor = Actor.spawn do
  Actor.receive do |filter|
    filter.when(:dog) { puts "I got a dog!" }
  end
end

myactor << :dog

Run it and see..... NOTHING. it exits.... so..... what up G.


-------------------------------------------------------|
~ Ari
seydar: it's like a crazy love triangle of Kernel commands and C code
 
Tony Arcieri

[Note: parts of this message were removed to make it a legal post.]

Here is where I've found a problemo will occur. In my *ahem*
extensive *ahem* research with your fine fine actors ;-), concurrency
only exists when run within an actor - Actor.current has failed me
here! Example:

foods = ['chocolate', 'seltzer', 'awesome sauce']
foods.each do |food|
  Actor.spawn(food) do |f|
    sleep 1
    puts f
  end
end
puts "foods rounded up"

I should really document this better, but: In Revactor, Actors are
Fiber-based, so anything that blocks for prolonged periods of time will hang
all Actors in the system. It's the same sort of thing you'll encounter with
an evented framework like EventMachine.

Fortunately, there's "Actor-safe" replacements for most of the blocking
tasks you'll perform in a networked application, namely: DNS resolution,
opening connections, SSL handshakes, and reading from and writing to the
network.

In the case of sleep, there's the handy:

Actor.sleep 1

Which is just shorthand for Actor.receive { |filter| filter.after 1 }
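
To make the cooperative behaviour concrete, here is a toy round-robin scheduler over plain Ruby Fibers. It is only an analogy and not Revactor's scheduler: run_cooperatively is a made-up name, and Fiber.yield plays the role that Actor.sleep plays above.

```ruby
# Toy round-robin scheduler over plain Fibers; this is NOT Revactor's scheduler.
def run_cooperatively(names)
  log = []
  tasks = names.map do |name|
    Fiber.new do
      log << "#{name} started"
      Fiber.yield :pending      # hand control back, roughly what Actor.sleep does
      log << "#{name} finished"
      :finished                 # returned by the final resume
    end
  end

  # Keep resuming every task until each one reports :finished.
  tasks = tasks.reject { |t| t.resume == :finished } until tasks.empty?
  log
end

puts run_cooperatively(['chocolate', 'seltzer', 'awesome sauce'])
```

Each task gets to start before any of them finishes because every one gives up control voluntarily. If a task called Kernel#sleep instead of Fiber.yield, all the others would simply wait behind it.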

For anything related to networking, you need to use Actor::TCP or
Actor::HttpClient (which uses the fully asynchronous HTTP client from Rev)

To execute long-running blocks of code which aren't related to networking,
the next release of Revactor will be thread safe. This means you can spin
the long running task off in a thread, and have it send a message when it
completes.
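
The "spin it off in a thread and have it send a message" pattern can be sketched with nothing but the standard library. This is an assumption-laden stand-in: a Queue plays the part of the Actor's mailbox, run_in_background is a hypothetical helper, and the :task_done tag is made up for illustration.

```ruby
require 'thread'

# Hypothetical helper: runs the block in a thread and posts the result
# to the given mailbox (a Queue standing in for an Actor's mailbox).
def run_in_background(mailbox, &work)
  Thread.new do
    mailbox << [:task_done, work.call]
  end
end

mailbox = Queue.new
run_in_background(mailbox) { (1..1_000).reduce(:+) }  # stand-in for slow work

tag, value = mailbox.pop  # blocks until the worker's message arrives
puts "#{tag}: #{value}"
```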
 
Tony Arcieri

[Note: parts of this message were removed to make it a legal post.]

As a follow up, I realized why it wasn't working: bug in filters. I'm
too scared to patch it up, but I think I'll do that anyways. Example
problem:

myactor = Actor.spawn do
  Actor.receive do |filter|
    filter.when(:dog) { puts "I got a dog!" }
  end
end

myactor << :dog

Run it and see..... NOTHING. it exits.... so..... what up G.

The problem here is that you need to do something which yields control back
to the Actor scheduler before it can run. The simplest thing you can do is:

Actor.sleep 0

This will yield control to the scheduler, which will process any outstanding
messages then return control back to you (since you told it you wanted to
sleep for 0 seconds)

It's pretty confusing from irb, I'll admit...

When doing anything with Actors, just remember you're "queuing up"
operations which will run later... later being whenever you call
Actor.receive. Actor.receive is the only way to defer control to other
Actors (keeping in mind Actor.sleep is just shorthand for Actor.receive)

All of the "blocking" operations in Revactor, things like Actor::TCP.connect,
Actor::HttpClient.get, etc. are all calling Actor.receive underneath.
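
The "queuing up" idea can be illustrated with a toy mailbox class. This is not Revactor code: ToyActor and run_pending are hypothetical names, and run_pending stands in for the moment the scheduler gets a turn (for example via Actor.sleep 0).

```ruby
# Toy illustration of deferred delivery; NOT Revactor's Actor class.
class ToyActor
  def initialize(&handler)
    @mailbox = []
    @handler = handler
  end

  # Sending only queues the message; no handler runs yet.
  def <<(message)
    @mailbox << message
    self
  end

  # Stand-in for the scheduler getting control: deliver everything queued.
  def run_pending
    @handler.call(@mailbox.shift) until @mailbox.empty?
  end
end

heard = []
actor = ToyActor.new { |msg| heard << msg }

actor << :dog
p heard            # still empty: the message is only queued
actor.run_pending
p heard            # now contains :dog
```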
 
fedzor

The problem here is that you need to do something which yields control back
to the Actor scheduler before it can run. The simplest thing you can do is:

Actor.sleep 0

This will yield control to the scheduler, which will process any outstanding
messages then return control back to you (since you told it you wanted to
sleep for 0 seconds)

It's pretty confusing from irb, I'll admit...

When doing anything with Actors, just remember you're "queuing up"
operations which will run later... later being whenever you call
Actor.receive. Actor.receive is the only way to defer control to other
Actors (keeping in mind Actor.sleep is just shorthand for Actor.receive)

BTW, do you hang on IRC? #ruby-lang, seydar.

So I need to do the Actor.sleep to dish out control, but how come I
didn't have to do that with the version of revactor actually released?

-------------------------------------------------------|
~ Ari
if god gives you lemons
YOU FIND A NEW GOD
 
fedzor

To execute long-running blocks of code which aren't related to networking,
the next release of Revactor will be thread safe. This means you can spin
the long running task off in a thread, and have it send a message when it
completes.

Next release or next svn update? I'm running the svn version.

~ Ari
English is like a pseudo-random number generator - there are a
bajillion rules to it, but nobody cares.
 
Tony Arcieri

[Note: parts of this message were removed to make it a legal post.]

BTW, do you hang on IRC? #ruby-lang, seydar.

I'm typically on #rubinius and #erlang, tarcieri

So I need to do the Actor.sleep to dish out control, but how come I
didn't have to do that with the version of revactor actually released?


The semantics of how Actors are scheduled changed slightly in trunk.
Previous versions of Revactor would return control back to the root Actor /
Fiber in the event that all inter-Actor messages had been dispatched and
there were no pending network events. The trunk version adds a thread-safe
message queue to allow Actors in one thread to send messages to Actors in
another thread, and implementing this required a number of changes to the
semantics of the scheduler.

There was a nasty idle loop bug in previous versions of Revactor, as well.
If you load Revactor 1.2 and call something to the effect of
Actor.receive { |f| f.when(:foo) {} }
in the root Actor, the scheduler will just sit and spin,
because all messages have been dispatched and there are no event sources.
Infinite loop.

The new scheduler semantics make use in irb (or in RSpec) a bit more
cumbersome, but they add thread safety, a ~10% performance improvement (per
tools/messaging_throughput.rb), and eliminate a potential infinite loop.
And that infinite loop isn't just hypothetical: you'll encounter it in any
system which has run out of events to process, and it will immediately get
in the way when implementing distributed systems which need to wait for
remote events.

Next release or next svn update? I'm running the svn version.

The next release. The svn version has the thread-safety improvements in
place, but they're not speced yet and are definitely buggy. I wouldn't try
using them yet, but you won't encounter any problems unless you try to send
messages between Actors across threads. The idle loop bug is also fixed.
 
Tony Arcieri

[Note: parts of this message were removed to make it a legal post.]

So I need to do the Actor.sleep to dish out control, but how come I
didn't have to do that with the version of revactor actually released?


Oh, also note: I have tried to restore the semantics of the original
scheduler by making the current Actor reschedule itself and relinquish
control to the event loop whenever sending messages, but my initial attempts
at doing this introduced a number of scheduling bugs and broke several of
the specs. I might give it another try...
 
