DRb connection error with more than 250 DRb services


J. Wook

Here is some simple test code:


require 'drb/drb'

class TestServer
  include DRb::DRbUndumped

  def initialize
    @obj = Array.new
  end

  def register(addr)
    @obj << addr
  end

  def sum
    s = 0
    @obj.each do |t|
      v = DRbObject.new_with_uri(t).get_value
      if v.nil?
        puts s.to_s + " & error"
        next                  # skip unreadable clients so the sum can go on
      end
      s += v
    end
    return s
  end
end

class TestClient
  include DRb::DRbUndumped

  def initialize(addr, server, value)
    DRbObject.new_with_uri(server).register(addr)
    @value = value
  end

  def get_value
    @value
  end
end

uri = "druby://localhost:"
server_uri = uri + "40000"
server = DRb.start_service(server_uri, TestServer.new)

max_size = 300

(1..max_size).each do |t|
  client_uri = uri + (40000 + t).to_s
  DRb.start_service(client_uri, TestClient.new(client_uri, server_uri, t))
end

sum = DRbObject.new_with_uri(server_uri).sum
puts sum


For max_size = 10, sum = 55
For max_size = 100, sum = 5050

...

but for max_size = 300, a DRb::DRbConnError exception is raised.

If I try to start another DRb service after that error in the same process,
#<Errno::EMFILE: Too many open files - socket(2)> is raised.


How can I open more than 300 DRb connections?
(I need to make about 1,000 connections ....)
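
To see where the descriptors go, one can watch the count of open file
descriptors as the services start (a Linux-only sketch; it reads
/proc/self/fd, so it won't work elsewhere):

def open_fd_count
  Dir.entries("/proc/self/fd").size - 2   # drop "." and ".."
end

(1..max_size).each do |t|
  client_uri = uri + (40000 + t).to_s
  DRb.start_service(client_uri, TestClient.new(client_uri, server_uri, t))
  puts "#{t} services: #{open_fd_count} open descriptors" if t % 50 == 0
end

Since everything runs in one process here, each client service costs a
listening socket plus both ends of any cached connection, which would
explain why roughly three to four descriptors go per client.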
 

Jung wook Son

Brian said:
What does ulimit -a show? In particular, look at max open files.

dev@seoul$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
max nice (-e) 20
file size (blocks, -f) unlimited
pending signals (-i) unlimited
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) unlimited
max rt priority (-r) unlimited
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) unlimited
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
 

Brian Candler

dev@seoul$ ulimit -a
[...]
open files (-n) 1024
[...]

Hmm. Should be enough, unless you're opening other file-like objects in your
program. But you could try raising/lowering ulimit -n to see if it makes a
difference.
 

Brian Candler

I am opening 1024+ file-like objects.
That's exactly the answer.


One more question.
When I try to change the ulimit value, I get an error:


dev@seoul$ ulimit -n 2048
-bash: ulimit: open files: cannot modify limit: Operation not permitted

dev@seoul$ sudo ulimit -Sn 2048
sudo: ulimit: command not found

You have to be root to increase it.

There is some config file which sets the default at login time. I can't
remember offhand what it is - you may need to google.
 

Brian Candler

You have to be root to increase it.

There is some config file which sets the default at login time. I can't
remember offhand what it is - you may need to google.

in /etc/security/limits.conf, something like:

brian hard nofile 2048

Then the user must logout and login again.
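
If editing the system config is not possible, the Ruby process can also
raise its own soft limit toward the hard limit at startup (a sketch;
Process.setrlimit is a core Ruby method on Unix, but only root can raise
the hard limit, so if the hard limit is also 1024 the limits.conf change
is still required):

soft, hard = Process.getrlimit(Process::RLIMIT_NOFILE)
Process.setrlimit(Process::RLIMIT_NOFILE, hard, hard)  # soft limit -> hard limit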
 

Marcin Raczkowski

[...]

How can I open more than 300 DRb connections?
(I need to make about 1,000 connections ....)

The question probably everyone wants to ask: why the hell do you need 1000
connections? Don't tell me you have 1000 servers with DRb services running
somewhere.
 

Michal Suchanek

The question probably everyone wants to ask: why the hell do you need 1000
connections? Don't tell me you have 1000 servers with DRb services running
somewhere.

It's been said that it fails at some 250 connections. For one, the
application might act as some sort of proxy, which would double the
number of sockets.

Apparently it either uses some other files or uses about 4 handles per
connection. That looks like quite a lot and could probably be lowered,
but it does not change the fact that 1024 is a safe default for
non-server applications yet can easily be reached by servers.

I wonder why such limits are imposed. It is probably some workaround
for a flaw in UNIX design that introduces a possible DoS by exhausting
kernel memory/structures. Maybe it was fixed in some kernels (if
that's even possible) but nobody cared to fix the limit as well.

Thanks

Michal
 

Brian Candler

But aren't most socket-type connections really two file descriptors
(coming and going)?
So the 250 would already be 500...?

No - a socket is bi-directional.
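
To see the bi-directionality with your own eyes, here is a minimal sketch
using a Unix-domain socket pair (each endpoint is a single descriptor that
both reads and writes):

require 'socket'

a, b = UNIXSocket.pair         # two endpoints, one descriptor each
a.write("ping"); p b.recv(4)   # => "ping"
b.write("pong"); p a.recv(4)   # => "pong"  (same descriptors, other direction)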
 

John Joyce

No - a socket is bi-directional.
OK, I had that mixed up with some other networking thing...
(Good thing I don't do sockets. Or maybe I should play with them soon.)
 

Thomas Hurst

* Michal Suchanek ([email protected]) said:
I wonder why such limits are imposed. It is probably some workaround
for a flaw in UNIX design that introduces possible DoS by exhausting
kernel memory/structures. Maybe it was fixed in some kernels (if
that's even possible) but nobody cared to fix the limit as well.

How would you propose to avoid letting users exhaust system resources
without limits of some sort? Sadly, real-world implementations of Turing
machines don't generally include unlimited tape...

Modest default limits are a good thing, since they help reduce the
impact of runaways, leaks and, indeed, DoS attempts. This is as true in
*ix as it is in any other system.
 

Michal Suchanek

How would you propose to avoid letting users exhaust system resources
without limits of some sort? Sadly, real-world implementations of Turing
machines don't generally include unlimited tape...

Modest default limits are a good thing, since they help reduce the
impact of runaways, leaks and, indeed, DoS attempts. This is as true in
*ix as it is in any other system.

Sure it is avoidable. In a system where memory is allocated to users
(not somehow vaguely pooled), and the networking service is able to
back its socket data structures with user memory, you only care about
memory, not what the user uses it for. The user can then store files,
sockets, or anything else he wishes.
Of course, this probably would not happen on a POSIX system.

Thanks

Michal
 

Michal Suchanek

Such limits have been around since long before inhibiting DoS attacks became
an important design goal. Every process has a "descriptor table" which
defaults to a certain size but can sometimes be increased. However, the
per-process descriptor table is nothing but an array of pointers to data
structures maintained by the kernel. (That's why in Unix a file descriptor
is always a low-valued integer: it's just an offset into the pointer array.)

What this means is that the system resources consumed by the file descriptor
(which is owned by a process) must be considered in distinction to the
*kernel* resources consumed by an actual open file or network connection,
which are managed separately and obey very different constraints. It's
normal in Unix for the same kernel object representing an open file to
appear in the descriptor tables of several different processes.

Just having the ability to represent 50,000 different file descriptors in a
single process, however, doesn't automatically mean you have that much more
I/O bandwidth available to your programs. Think about IBM mainframes, which
are designed for extremely high I/O loads. You can have 500 or more actual
open files on an IBM mainframe, all performing real live I/O. Intel-based
servers can't come anywhere near that kind of capacity. If your per-process
tables are large enough for thousands of open file descriptors, that says
something about the size of your I/O data structures (which are constrained
primarily by memory, a medium-inexpensive resource), but nothing at all
about the real throughput you'll get.
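
(A quick Ruby illustration of the low-valued-integer point; the file name
and printed numbers are only examples and will vary:)

f1 = File.open("/etc/hosts")
f2 = File.open("/etc/hosts")
p f1.fileno   # e.g. 5 -- an index into the per-process descriptor table
p f2.fileno   # e.g. 6 -- the next free slot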

Of course, the FDs are only there to organize your I/O. You can use TCP
and the OS-provided sockets, or you can use a single UDP socket and
maintain the connection state yourself.

Similarly, the FDs to open files give you organized access to space on
a disk drive, and you can always open the partition device and manage
the storage yourself.

The throughput of the structured I/O would usually be lower because the
OS already does some processing on the data to organize it neatly for
you.

Limiting the number of FDs per process does not do much to protect
kernel memory. Maybe a single process cannot exhaust it, but forking
more processes is easy. It is only a workaround for the poor design,
after all.
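
As a minimal sketch of the single-UDP-socket idea (the port and the
per-peer bookkeeping here are purely illustrative):

require 'socket'

sock = UDPSocket.new
sock.bind("0.0.0.0", 40000)            # one descriptor, any number of peers
peers = Hash.new(0)

loop do
  data, addr = sock.recvfrom(1024)     # addr = [family, port, host, ip]
  peers[[addr[3], addr[1]]] += 1       # per-peer state kept by the program
  sock.send("ack: " + data, 0, addr[3], addr[1])
end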

Thanks

Michal
 

Brian Candler

Of course, the FDs are only there to organize your I/O. You can use TCP
and the OS-provided sockets, or you can use a single UDP socket and
maintain the connection state yourself.

If you want to run your own TCP stack you'll need to open a raw socket, and
you can't do that unless you're root, because of the security implications
(e.g. ordinary users could masquerade as privileged services such as SMTP
on port 25).
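
(A minimal sketch of what happens when an unprivileged process tries it:)

require 'socket'

begin
  Socket.new(Socket::AF_INET, Socket::SOCK_RAW, Socket::IPPROTO_TCP)
rescue Errno::EPERM => e
  puts "raw sockets need root: #{e.message}"   # "Operation not permitted"
end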
Similarly, the FDs to open files give you organized access to space on
a disk drive, and you can always open the partition device and manage
the storage yourself.

Similarly, only root can have direct access to the raw device. Otherwise,
any user would be able to read any other user's files, modify any file at
whim, totally corrupt the filesystem, etc.
 

Michal Suchanek

If you want to run your own TCP stack you'll need to open a raw socket, and
you can't do that unless you're root [...]

Of course, using UDP is not the same as implementing your own TCP stack.
You just do not need an enormous number of sockets even if you communicate
with multiple hosts. And you have to keep track of the communication
yourself because the OS will not do it for you.

Similarly, only root can have direct access to the raw device. [...]

Also, you would want to decide for each partition whether to use it for a
filesystem and mount it, or to give access to the partition to a service
or user. Note that mtools uses raw floppy devices in exactly this way, and
given the access, it can be used for partitions as well.

Thanks

Michal
 

Michal Suchanek

I really should let this go because it has nothing much to do with Ruby, but
I don't agree that the Unix I/O-access design is a poor one. User programs
should never be accessing raw devices for any reason. It's absolutely not
the case that direct access to raw devices gives you better "performance,"
especially considering how much work is being done by well-optimized device
drivers, and also balanced against the damage you can do by accessing them
yourself. And the design has stood the test of time, having proved its
ability to easily accommodate a wide range of real devices and
pseudo-devices over the years. And Windows even copied the design of the
system calls (even though the underlying implementation appears to be quite
different, except of course for Windows' TCP, which was stolen from BSD).

Note that when accessing the partition and bypassing the filesystem layer
you still use the optimized and balanced drivers. But you do not get the
benefit of organizing your data in the hierarchical namespace the
filesystem provides.

The fact that half of a Unix-like system resides in the kernel causes
problems like this. Only filesystems compiled into the kernel can be
mounted. Only root can mount filesystems. Opening too many files is not
allowed because it drains kernel resources. Etc., etc.

Certainly many things have stood the test of time, but doubts exist about
their optimal design. For one, the QWERTY layout was designed for
mechanical typewriters that suffered from collisions of the mechanical
parts. It was optimized to make such collisions unlikely. Since then, new
layouts have emerged that were optimized for typing speed. But QWERTY has
stood the test of time, much longer than UNIX ;-)

Thanks

Michal
 

Robert Klemme

[...]

How can I open more than 300 DRb connections?
(I need to make about 1,000 connections ....)

Do you actually have 1000 client /processes/ or just 1000 client /objects/?
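
(If it is just objects, one DRb service per client process is enough,
because DRbUndumped objects are passed by reference over that single
connection. A minimal sketch, assuming TestClient is reduced to holding
just a value and the server's register method is changed to accept object
references instead of URIs:)

require 'drb/drb'

# One DRb service in this process serves all 1,000 objects; no
# per-object port or listening socket is needed.
DRb.start_service

server = DRbObject.new_with_uri("druby://localhost:40000")

clients = (1..1000).map { |v| TestClient.new(v) }   # plain DRbUndumped objects
clients.each { |c| server.register(c) }             # pass references, not URIs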

Kind regards

robert
 
