Choosing the most appropriate Ruby version and programming model to develop a SIP server

  • Thread starter Iñaki Baz Castillo

Iñaki Baz Castillo

Hi, I need to choose between the various Ruby versions (1.8, 1.9, JRuby,
Rubinius...) and programming models (Reactor Model, Actor Model...) to
build a SIP (VoIP) server.

I've already started it, and so far I've written a complete SIP grammar
parser using the Treetop parser, but I will probably have to migrate to
Ragel in order to get better performance.

But what I really can't work out is which paradigm and framework to use.
First of all, let me explain something about the SIP protocol (which I
know very well):

SIP is really complex. Its syntax looks like HTTP, but the protocol itself
includes various layers and is much more difficult than HTTP or SMTP.
Unfortunately I'll only implement the TCP transport (SIP also allows UDP),
but it remains complex anyway. There are various layers in SIP:
- Transport Layer: receives data (requests or responses) via any socket,
  performs basic parsing and sends the extracted data to the:
- Transaction Layer: deals with retransmissions of requests (if no
  response was sent in a predefined interval), matches responses against
  their requests, deals with CANCEL and ACK requests...
- Core Layer: receives the request/response and handles it (a phone would
  ring, a SIP proxy would route it, a SIP server would do "something"
  cool...).
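As a rough illustration of the "basic parsing" step in the transport layer: over TCP, SIP messages arrive as a byte stream, so the receiver has to find message boundaries itself, typically by reading the header block up to the blank line and then using Content-Length for the body. The sketch below assumes a modern Ruby; the `SipFramer` class and its `feed` method are hypothetical names, not part of any library mentioned in this thread, and it does no real SIP parsing.

```ruby
# Hypothetical helper: splits a TCP byte stream into complete SIP messages.
class SipFramer
  def initialize
    @buffer = String.new # empty binary buffer for bytes not yet consumed
  end

  # Feed raw bytes read from the socket. Returns an array with every
  # complete message (header block plus body) available so far.
  def feed(data)
    @buffer << data.b
    messages = []
    loop do
      header_end = @buffer.index("\r\n\r\n")
      break unless header_end # header block not complete yet

      headers = @buffer[0, header_end]
      # Content-Length (compact form "l") gives the body size; 0 if absent.
      length = headers[/^(?:Content-Length|l)\s*:\s*(\d+)/i, 1].to_i
      total = header_end + 4 + length
      break if @buffer.bytesize < total # body not complete yet

      messages << @buffer.slice!(0, total)
    end
    messages
  end
end
```

Each complete message returned by `feed` could then be handed to a full grammar parser and passed up to the transaction layer.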

There are a lot of timers in all the layers, so if a timer fires, the
whole call/dialog/transaction could be discarded (for example).


I've already read about various Ruby frameworks to achieve it:

- EventMachine: It's a Reactor Pattern implementation. It's based on
  events. It could be a solution, but it seems that the resulting code
  would be a real pain to debug and not very "visible". An experienced
  Ruby coder recommended that I not use EventMachine in my project
  because of my project's complexity. It's actively maintained.

- Revactor: It's an Actor Model implementation. It offers a "Ruby-like"
  structured programming style. It allows "non-blocking" functions to
  behave as "blocking", so the code is cleaner than with an event-based
  style. It seems to need Ruby 1.9 (I can wait) and AFAIK it is not
  actively maintained.

- Omnibus and Dramatis: More Actor Model implementations in Ruby. Not sure
  if they are maintained.

- Threads: I could build my application using threads ("green" threads in
  Ruby or native threads if I choose JRuby). But I think the above
  solutions are more elegant and secure.



I must also choose the Ruby version:

- Ruby 1.8 and 1.9: It seems that the garbage collector is not good
  enough, so I would experience memory issues.

- Rubinius: Better garbage collector, but it doesn't seem mature enough
  for now. It implements a built-in Actor Model. For now it cannot use
  native threads.

- JRuby: Good garbage collector and mature, but I'm not sure if it allows
  an Actor Model using "something".



Well, as you see, there are so many possibilities that it's really
difficult to make the choice. Also note that, for now, I'm not an expert
programmer and only know the above concepts because I've read about them
during my search.

Could I receive a suggestion about it please? Thanks a lot.


-- 
Iñaki Baz Castillo
 

Brian Candler

Iñaki Baz Castillo said:
Hi, I need to choose between the various Ruby versions (1.8, 1.9, JRuby,
Rubinius...) and programming models (Reactor Model, Actor Model...) to
build a SIP (VoIP) server.

I've already started it, and so far I've written a complete SIP grammar
parser using the Treetop parser, but I will probably have to migrate to
Ragel in order to get better performance.

Go back one step.

You are concerned about performance before you've even written
something. How critical is performance going to be to your finished
product? If the answer is "very" then maybe Ruby is not the best
platform in the first place. I'd suggest using Erlang instead: it's used
in massive telephony systems, and has extremely good support for
protocol handling.

You'll need to buy Joe Armstrong's Erlang book (pragprog.com), and then
you can make do with the on-line OTP documentation and looking at other
people's code (e.g. mochiweb and yaws, which are HTTP servers). I even
have a vague idea there is an existing Erlang SIP stack you can use. Ah,
here it is:
http://www.stacken.kth.se/project/yxa/

You can also hide Erlang very well. For example, couchdb and rabbitmq
are both built on Erlang, but users don't see any of this.

However, if you have some external requirement which means you *must*
use Ruby, then understand that you may be trading ease of development
against final performance and stability.

Personally, I would start with the simplest possible implementation
(e.g. using gserver.rb to accept incoming TCP connections in Ruby green
threads), and get something working. Once you need to scale up, it
should be pretty easy to refactor to some other model.
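To make the "simplest possible implementation" concrete, here is a minimal thread-per-connection sketch using only the standard `socket` library rather than gserver.rb (which, as far as I know, is no longer bundled with current Ruby releases). The `serve` helper and the one-line echo handler are made up for illustration; a real server would parse SIP messages instead.

```ruby
require "socket"

# Accepts TCP connections on `server` and runs `handler` for each one in
# its own thread. Returns the accept-loop thread.
def serve(server, &handler)
  Thread.new do
    loop do
      client = server.accept
      Thread.new(client) do |sock|
        begin
          handler.call(sock)
        ensure
          sock.close # always release the connection
        end
      end
    end
  end
end

# Port 0 asks the OS for any free port; server.addr[1] reports which one.
server = TCPServer.new("127.0.0.1", 0)
serve(server) { |sock| sock.write("got: #{sock.gets}") }
```

Once the handler logic works, the accept loop is the part that would be swapped out when moving to another concurrency model.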
 

Iñaki Baz Castillo

2009/1/23 Brian Candler said:
Go back one step.

You are concerned about performance before you've even written
something. How critical is performance going to be to your finished
product?

It's important, but my aim is to design it so that it can scale by
adding more servers in parallel, all of them sharing some resources
(storing them in a DB as a first approach, using DRb or whatever).
There will be a SIP proxy in front of my server, so the proxy would
distribute the requests using some dispatcher algorithm.

If the answer is "very" then maybe Ruby is not the best
platform in the first place. I'd suggest using Erlang instead: it's used
in massive telephony systems, and has extremely good support for
protocol handling.

Yes, but I would like to implement it in Ruby because of the ease of
development it offers. Of course I'll take your suggestion into account.

You'll need to buy Joe Armstrong's Erlang book (pragprog.com), and then
you can make do with the on-line OTP documentation and looking at other
people's code (e.g. mochiweb and yaws, which are HTTP servers). I even
have a vague idea there is an existing Erlang SIP stack you can use. Ah,
here it is:
http://www.stacken.kth.se/project/yxa/

I already know of it. However, it's a SIP proxy implementation, not a SIP
server/B2BUA, which is very different.

However, if you have some external requirement which means you *must*
use Ruby, then understand that you may be trading ease of development
against final performance and stability.
Personally, I would start with the simplest possible implementation
(e.g. using gserver.rb to accept incoming TCP connections in Ruby green
threads), and get something working. Once you need to scale up, it
should be pretty easy to refactor to some other model.

This is exactly what I'm already doing (gserver and green threads),
and this is the reason I would like to migrate to a better and more
efficient model.

Thanks a lot for your help.




-- 
Iñaki Baz Castillo
 

Brian Candler

Iñaki Baz Castillo said:
my aim is to design it so that it can scale by
adding more servers in parallel, all of them sharing some resources
(storing them in a DB as a first approach, using DRb or whatever).
There will be a SIP proxy in front of my server, so the proxy would
distribute the requests using some dispatcher algorithm.

Well, you could use a Rails-like multi-process deployment model;
unfortunately, your SIP TCP connections are likely to be persistent, so
there will be a lot of open connections and threads being maintained in
each process. Considering this, it's easy to appreciate why so many SIP
implementations only support UDP :)

I already know it. However it's a SIP proxy implementation, not a SIP
server/B2BUA which is very different.

OK, but there's a lot of stack you should be able to rip out and re-use.
I think Erlang supports state machines very nicely too, which is good
for the various levels of protocol timeouts you referred to in your
initial post.

Ruby's timeout() mechanism is fundamentally broken - see posts passim -
and so you may end up writing your own timeout list plus job queue
implementation from scratch, when Erlang's mailboxes give you that
straight away.

As you say, there are many versions of Ruby to choose from, which may be
a blessing or a curse depending on how you look at it. I think you'll
certainly need to test them in your own application. When you say "Ruby
1.8 and 1.9: ... the garbage collector is not good enough", is this
based on your own experience?

Having said that: applications which handle hundreds of TCP connections
via hundreds of green threads in a single process are bound to stress
Ruby badly. Having N separate processes which handle 1/Nth of the
connections will certainly help.

It would be really good if you could map the problem domain onto HTTP
somehow, because there is so much infrastructure available for scaling
Ruby with HTTP. But then, building a front-end SIP proxy so that it
forwards requests over HTTP is probably as hard as making the SIP
backend scale.

Cheers,

Brian.
 

Iñaki Baz Castillo

2009/1/23 Brian Candler said:
Well, you could use a Rails-like multi-process deployment model;
unfortunately, your SIP TCP connections are likely to be persistent, so
there will be a lot of open connections and threads being maintained in
each process. Considering this, it's easy to appreciate why so many SIP
implementations only support UDP :)

No, the SIP proxy will use a single TCP connection to my server (if
there are N server processes in my server, then there would be just N
persistent TCP connections between the SIP proxy and my server).
Also, allowing UDP in SIP was a bad decision. Today there are many
SIP features that require larger messages, so a UDP datagram is
sometimes not enough. Also, by using TCP you can discard some SIP timers
(as RFC 3261 states).
That is, SIP is easier with TCP (except when you are dealing with NAT
and so on, which is not my case).

OK, but there's a lot of stack you should be able to rip out and re-use.
I think Erlang supports state machines very nicely too, which is good
for the various levels of protocol timeouts you referred to in your
initial post.

Ok.


Ruby's timeout() mechanism is fundamentally broken - see posts passim -

Yes, I know :(


As you say, there are many versions of Ruby to choose from, which may be
a blessing or a curse depending on how you look at it. I think you'll
certainly need to test them in your own application.

Well, it's not so easy. For example, AFAIK Rubinius implements an
Actor model by itself. If I write my server using Rubinius I cannot test
it in another Ruby environment (and I wouldn't like to have to build
my server as many times as there are Ruby versions XD).

When you say "Ruby
1.8 and 1.9: ... the garbage collector is not good enough", is this
based on your own experience?

No, it's an opinion I've heard from some developers.

Having said that: applications which handle hundreds of TCP connections
via hundreds of green threads in a single process are bound to stress
Ruby badly. Having N separate processes which handle 1/Nth of the
connections will certainly help.

As I explained above, there will be very few TCP connections, and they
will be persistent.

It would be really good if you could map the problem domain onto HTTP
somehow, because there is so much infrastructure available for scaling
Ruby with HTTP. But then, building a front-end SIP proxy so that it
forwards requests over HTTP is probably as hard as making the SIP
backend scale.

Bufff, that sounds terribly difficult... Honestly, I would like to
program it "from scratch" (but using some good platform and existing
programming model). Trying to transform an existing HTTP implementation
into SIP would be too much for me.



Really thanks a lot for all your help.



-- 
Iñaki Baz Castillo
 

Brian Candler

Iñaki Baz Castillo said:
the SIP proxy will use a single TCP connection to my server (if
there are N server processes in my server, then there would be just N
persistent TCP connections between the SIP proxy and my server).

Ah, then in principle it should scale Rails-like:
- each process accepts *one* inbound TCP connection (*)
- it listens for SIP messages, and sends SIP responses, on that one
socket
- call state is held in regular objects (rather than threads)
- timeouts can be done by means of a timer queue

This to me seems simpler, more robust and portable than something like
EventMachine.

As you've already said, you need some way to share state between these
processes. You might want to consider if your SIP-aware front-end proxy
can be "sticky", sending all messages relating to the same call down the
same TCP connection.
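One simple way to picture "sticky" dispatch is hashing the SIP Call-ID to pick a backend, so that every message carrying the same Call-ID maps to the same process. This is only a sketch of the idea, not a description of how any particular SIP proxy implements it; `backend_for` and the backend addresses are invented for the example.

```ruby
require "zlib"

# Picks a backend for a call by hashing its Call-ID. The same Call-ID
# always yields the same backend as long as the backend list is unchanged.
def backend_for(call_id, backends)
  backends[Zlib.crc32(call_id) % backends.size]
end

backends = ["10.0.0.1:5060", "10.0.0.2:5060", "10.0.0.3:5060"] # made-up addresses
first  = backend_for("a84b4c76e66710@pc33.example.com", backends)
second = backend_for("a84b4c76e66710@pc33.example.com", backends)
# first and second refer to the same backend
```

A plain modulo hash like this reshuffles most calls whenever a backend is added or removed, which is one reason shared state between processes is still needed.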

Regards,

Brian.

(*) Or even just talks on stdin/stdout. This is easy to test standalone,
and then you can launch the whole process from inetd.

Otherwise, you run N different processes running on N different ports.
Maybe your front-end SIP proxy can distribute the load between them
itself. If not, you can point it at a simple TCP proxy like pen, which
will redirect the connections. pen can be configured not to make more
than one connection to any particular backend process.
http://siag.nu/pen/
 

Roger Pack

I've already read about various Ruby frameworks to achieve it:
- EventMachine: It's a Reactor Pattern implementation. It's based on
  events. It could be a solution, but it seems that the resulting code
  would be a real pain to debug and not very "visible". An experienced
  Ruby coder recommended that I not use EventMachine in my project
  because of my project's complexity. It's actively maintained.

From my own experience writing a bittorrent-like protocol, it ends up
being easier to write it in EventMachine than a threaded model [and
faster, and also since it's not multi-threaded, the garbage collector is
more efficient]. I suppose Ruby 1.9 + revactor would also be somewhat
efficient.

Rev is maintained by its author [though it has a pretty small
community]; I know of no outstanding bugs.



- Ruby 1.8 and 1.9: It seems that the garbage collector is not good
  enough, so I would experience memory issues.


There's a patch on ruby core currently that will make 1.8.7 GC
better--and it actually works, but for now yeah the 1.8.x GC is not
efficient.


- JRuby: Good garbage collector and mature, but I'm not sure if it allows
  an Actor Model using "something".

There is EM for JRuby as well. I'd assume it works; I've never used it.
I'd probably just worry about garbage collection and memory problems
when they occur. In my own experience, a multi-threaded socket app
running on 1.8 used about 120MB RSS, which to some is too much, and to
some is something along the lines of "who cares?" The EM-style version
used about 30MB. Again, with the recently introduced MBARI patches it
would probably go down to something like 50MB RSS.

Premature optimization is... :)

Good luck.
-=r
 

Iñaki Baz Castillo

On Friday, 23 January 2009, Brian Candler wrote:
Ah, then in principle it should scale Rails-like:
- each process accepts *one* inbound TCP connection (*)
- it listens for SIP messages, and sends SIP responses, on that one
socket
- call state is held in regular objects (rather than threads)

Could you please explain this last point?

- timeouts can be done by means of a timer queue

Any example of it?

This to me seems simpler, more robust and portable than something like
EventMachine.

I don't need it to be portable, but I'm afraid of how complex the code
would get if it's done with EventMachine. Should I really discard
EventMachine for this project, then?

As you've already said, you need some way to share state between these
processes. You might want to consider if your SIP-aware front-end proxy
can be "sticky", sending all messages relating to the same call down the
same TCP connection.

Yes, the SIP proxy (OpenSer) does this task very well; it can dispatch all
the transactions of a dialog to the same server. But I still need to share
other state between all the servers (not just SIP-dialog-related state).


Otherwise, you run N different processes running on N different ports.
Maybe your front-end SIP proxy can distribute the load between them
itself. If not, you can point it at a simple TCP proxy like pen, which
will redirect the connections. pen can be configured not to make more
than one connection to any particular backend process.
http://siag.nu/pen/

This sounds great but I don't need it, the SIP proxy already does it :)


Thanks again.

PS: Would you *strongly* recommend that I avoid Ruby for this task?
PPS: If not, is there any Ruby "version" and programming model
implementation (maintained and robust) running on that Ruby version?
(I know my question is very difficult to answer, but I would really
appreciate it.)



-- 
Iñaki Baz Castillo
 

Brian Candler

Iñaki Baz Castillo said:
Could you please explain this last point?

Well, your main loop could be something like this:

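  while true
    # Deadline of the earliest pending timer (assumes the queue is never empty)
    next_timeout = timer_queue.first.timestamp
    t = next_timeout - Time.now
    # Wait for data until the deadline; just poll if it has already passed
    if select([sock], nil, nil, t < 0 ? 0 : t)
      # ... read and handle an incoming message
    end
    if t <= 0
      # ... handle a timeout
    end
  end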

Then, when you handle each incoming message, you will have to look at
attributes of this message to see if it's a new call, or references an
existing call. If it's an existing call, then you look it up in some
data structure:

calls = {} # callid => CallState

You process the incoming message based on its contents combined with the
stored state, update the state, send a reply if necessary, and loop
round.
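A small sketch of the "call state held in regular objects" idea follows. The `CallState` class, its states and the `dispatch` helper are hypothetical and deliberately far simpler than real SIP dialog and transaction handling; the point is only that each message is combined with stored state looked up by Call-ID.

```ruby
# Hypothetical per-call state object; states and transitions are
# illustrative only.
class CallState
  attr_reader :call_id, :state

  def initialize(call_id)
    @call_id = call_id
    @state = :new
  end

  # Advance the state based on the current state and the request method.
  def handle(method)
    @state =
      case [@state, method]
      when [:new, "INVITE"]      then :ringing
      when [:ringing, "ACK"]     then :established
      when [:ringing, "CANCEL"]  then :terminated
      when [:established, "BYE"] then :terminated
      else @state # ignore anything unexpected in this sketch
      end
  end
end

calls = {} # Call-ID => CallState

# Look up (or create) the call, apply the message, drop finished calls.
def dispatch(calls, call_id, method)
  call = (calls[call_id] ||= CallState.new(call_id))
  call.handle(method)
  calls.delete(call_id) if call.state == :terminated
  call.state
end
```

Because the state lives in plain objects rather than in blocked threads, the main loop can serve any number of calls from a single socket.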

It's still bog-standard event-based programming, but since you are not
talking to any untrusted party (the upstream is your own trusted SIP
proxy) you probably don't have to worry about protocol violations or
long pauses mid-way through a message.

(And even if you did, you could just have one Thread reading and
decoding inbound messages on the socket, and pushing the completed
messages into a Queue for another Thread to pick off and process)
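The reader-thread-plus-Queue arrangement could look roughly like this. It assumes a modern Ruby where `Queue` is built in, uses one line per message as a stand-in for real SIP framing, and `start_pipeline` is an invented name. An `IO.pipe` plays the role of the socket so the example runs on its own.

```ruby
# One thread reads messages from `io` and pushes them onto a queue;
# a second thread pops them and processes them in arrival order.
def start_pipeline(io, results)
  inbox = Queue.new

  reader = Thread.new do
    while (line = io.gets)
      inbox << line.chomp
    end
    inbox << :eof # tell the worker that the peer closed the connection
  end

  worker = Thread.new do
    while (msg = inbox.pop) != :eof
      results << msg.upcase # stand-in for real message handling
    end
  end

  [reader, worker]
end

reader_io, writer_io = IO.pipe # stands in for the TCP socket in this demo
results = []
threads = start_pipeline(reader_io, results)
writer_io.puts "invite"
writer_io.puts "bye"
writer_io.close
threads.each(&:join)
```

`Queue#pop` blocks while the queue is empty, so the worker needs no polling and a slow sender only stalls the reader thread.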

What you could gain from Fiber-based models (Revactor?) in 1.9 is the
ability to write your code in a more linear style, rather than as a
state machine. I don't know how well SIP would map onto that.
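To show what "a more linear style" means in practice, here is a toy example with plain Ruby Fibers (available from 1.9 on). It is not Revactor's API, and the three-message call flow and the `new_call_fiber` helper are invented for illustration. Each call is a Fiber that pauses at `Fiber.yield` until the event loop resumes it with the next message.

```ruby
# Builds a Fiber that handles one call as straight-line code. Every
# Fiber.yield is a point where the call waits for its next message.
def new_call_fiber(log)
  Fiber.new do |first|
    log << "got #{first}, ringing"
    ack = Fiber.yield # wait for the next message of this call
    log << "got #{ack}, call established"
    bye = Fiber.yield
    log << "got #{bye}, call finished"
    :done
  end
end

log = []
call = new_call_fiber(log)
# In a real server the event loop would resume the fiber as messages arrive.
call.resume("INVITE")
call.resume("ACK")
result = call.resume("BYE")
```

The per-call state (where the call is in its flow) is implicit in where the Fiber is suspended, instead of being an explicit state variable.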
Any example of it?

Not to hand. A good queue needs to make it easy to:
- pop items off the front
- locate and drop items in the middle (cancelling timers)
- insert new timers in the middle and locate the correct place to put
them

A doubly-linked list is fine for small numbers of timers, but when you
get into the hundreds you'll want something more like a priority queue.
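A possible shape for such a timer queue is sketched below: a binary min-heap ordered by deadline. Note one substitution: instead of locating and removing a cancelled timer in the middle, this version uses lazy cancellation, marking the timer and discarding it when it reaches the front. The `TimerQueue` class and its method names are hypothetical.

```ruby
# Timer queue backed by a binary min-heap, with lazy cancellation.
class TimerQueue
  Timer = Struct.new(:at, :payload, :cancelled)

  def initialize
    @heap = []
  end

  # Schedule `payload` to fire at `at` (a Time or a number).
  # Returns a handle that can be passed to #cancel.
  def add(at, payload)
    timer = Timer.new(at, payload, false)
    @heap << timer
    sift_up(@heap.size - 1)
    timer
  end

  # O(1): only marks the timer; it is skipped once it reaches the front.
  def cancel(timer)
    timer.cancelled = true
  end

  # Deadline of the earliest live timer, or nil if there is none.
  def next_at
    pop while @heap.first && @heap.first.cancelled
    @heap.first && @heap.first.at
  end

  # Remove and return the payloads of all live timers due at or before `now`.
  def pop_due(now)
    due = []
    due << pop.payload while next_at && next_at <= now
    due
  end

  private

  # Remove and return the root of the heap.
  def pop
    top = @heap.first
    last = @heap.pop
    unless @heap.empty?
      @heap[0] = last
      sift_down(0)
    end
    top
  end

  def sift_up(i)
    while i > 0
      parent = (i - 1) / 2
      break if @heap[parent].at <= @heap[i].at
      @heap[parent], @heap[i] = @heap[i], @heap[parent]
      i = parent
    end
  end

  def sift_down(i)
    loop do
      smallest = i
      [2 * i + 1, 2 * i + 2].each do |child|
        smallest = child if child < @heap.size && @heap[child].at < @heap[smallest].at
      end
      break if smallest == i
      @heap[i], @heap[smallest] = @heap[smallest], @heap[i]
      i = smallest
    end
  end
end
```

In a select-based main loop, `next_at` would supply the select timeout and `pop_due(Time.now)` the timers to handle after each wake-up.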
Should then I really discard
EventMachine for this project?

I only have minimal experience of it, with Swiftiply. Unfortunately that
was a bad experience, but that may have been Swiftiply's fault rather
than EM's. (I had an easily reproducible crash, but it was ignored in
the Swiftiply mailing list and tracker)

But I don't see that EM gives you much benefit, if you're only handling
a single TCP connection per process. It might give you a decent timer
queue implementation I suppose; I haven't looked at it.

PD: Would you *strongly* recommend me to avoid Ruby for this task?

I can only speak for myself.

I have a lot of experience with Ruby, and have only dabbled with Erlang,
but this looks like such a hand-in-glove fit for Erlang that it would
prompt me to go that way. If there's a half-decent SIP stack already
written, I think that would compensate at least partly for the
additional learning curve.

But there could be other overriding concerns to lean towards Ruby (e.g.
time to market, team constraints)

PPD: If not, any Ruby "version" and programming model implementation
(maintained and robust) running on that Ruby version? (I know my question
is very difficult to answer, but I would really appreciate it).

I don't have enough experience with a wide enough range of
implementations to answer that properly.

1.8.6p114 has been good to me. You have to be very careful with later
1.8.6's to avoid the broken ones.

I tried JRuby once, taking an existing Rails app and packaging it as a
war file with warbler. With no clients using it, the JVM took about
600MB of RSS, and the response time was rubbish. However that was a year
or so ago, and there are plenty of people who swear by (rather than at)
the J-word.

At least with MRI, I know that if it breaks, at worst I can debug it
with gdb. The whole (1.8) codebase is only a few megs. I wouldn't know
where to start fixing a problem with the JVM.

Regards,

Brian.
 

Iñaki Baz Castillo

Thanks a lot, I really appreciate all your help.



-- 
Iñaki Baz Castillo
 

Michal Suchanek

I don't have enough experience with a wide enough range of
implementations to answer that properly.

1.8.6p114 has been good to me. You have to be very careful with later
1.8.6's to avoid the broken ones.

The latest 1.8.6 should be as good as 1.8 Ruby gets. There was a 1.8.6
release just after a vulnerability report that fixed the vulnerability
but broke other things, but hopefully that is not going to happen again.

I tried JRuby once, taking an existing Rails app and packaging it as a
war file with warbler. With no clients using it, the JVM took about
600MB of RSS, and the response time was rubbish. However that was a year
or so ago, and there are plenty of people who swear by (rather than at)
the J-word.

JRuby is a good option if you do not require any C extensions. The
startup is slow (and hence the first few requests) but the JVM is
better built than MRI (they had much more time for that). You get
optimization of long running processes and sound memory management.
You also get real threads and some Java libraries if you need that.

Debugging problems with either VM is hard, and it depends on the
language you are familiar with (Java vs C). Recently MRI has improved
as well, though. Some memory leak that caused my processes to get
twice as large with MRI compared to JRuby is gone, at least in latest
1.8.7 (which I don't recommend because it differs significantly from
earlier 1.8 and there is no sane packaging of it for some platforms).

I currently don't run 1.8.6 because 1.8.7 is packaged in Debian so I
updated my code to support that.

Thanks

Michal
 
