Threads troubles

S

Stefan Weiss

Hi,

I am trying to write a multiuser server (for games, chat, etc) in Perl.
In time, this program might have to service >500 users simultaneously,
who would be connected for some time. So I thought about using threads
instead of forking a new process for every user, in order to keep my
process table healthy and minimize inter-process communication overhead.

I wrote simple server and client scripts to test the theory, but it
does not work at all like I expected. The scripts are short enough,
and I have cut away any unnecessary parts, so I included them both at
the end of this posting.

First problem - threads seem to be working fine, but when I look at the
output of ps, there are still seperate processes for each connection.
Shouldn't the threads all be contained in one large process? I'm using
an ithreads-enabled current version of Perl with Linux 2.4.19.

Second problem - the process (group) just grows and grows with every
connection, and no memory seems to be released after a client
disconnects. I am aware that perl usually does not return freed memory
to the system, but I was hoping that the memory would at least be
reused for the next connection. This is what it looks like:

PID %CPU %MEM VSZ RSS TTY COMMAND

4176 1.3 1.2 4924 3168 pts/6 | | \_ perl svr_thr_x.pl
(server started, no connections yet)

4176 1.0 1.7 8288 4436 pts/6 | | \_ perl svr_thr_x.pl
4182 0.0 1.7 8288 4436 pts/6 | | \_ perl svr_thr_x.pl
4183 0.0 1.7 8288 4436 pts/6 | | \_ perl svr_thr_x.pl
(first client)

4176 1.0 2.0 11384 5356 pts/6 | | \_ perl svr_thr_x.pl
4182 0.0 2.0 11384 5356 pts/6 | | \_ perl svr_thr_x.pl
4183 0.0 2.0 11384 5356 pts/6 | | \_ perl svr_thr_x.pl
4187 0.0 2.0 11384 5356 pts/6 | | \_ perl svr_thr_x.pl
(first client + second client)

4176 0.6 2.0 11368 5348 pts/6 | | \_ perl svr_thr_x.pl
4182 0.0 2.0 11368 5348 pts/6 | | \_ perl svr_thr_x.pl
(both clients disconnected)

4176 0.5 2.4 14440 6244 pts/6 | | \_ perl svr_thr_x.pl
4182 0.0 2.4 14440 6244 pts/6 | | \_ perl svr_thr_x.pl
4193 0.0 2.4 14440 6244 pts/6 | | \_ perl svr_thr_x.pl
(third client)


Third problem - I have absolutely no idea why, but when I run the
client script in another terminal (or on another host even), and hit
Ctrl-C while the client is still receiving data from the server, the
server(!) will quit (and the client too, of course).


Can anybody explain to me what is happening? Is there a better approach
than using threads? Should I just fork away for each new client?


Thanks a lot in advance,
stefan



===[ file svr_thr_x.pl ]==================================================

#!/usr/bin/perl

use strict;
use warnings;

use threads qw(yield);
use threads::shared;

use Socket;
use POSIX qw(strftime);
use IO::Handle;

use constant EOL => "\015\012";

# configuration
my $SVR_NAME = "XXXXXXX v0.001"; # name string
my $SVR_ADDR = undef; # set to undef for INADDR_ANY
my $SVR_PORT = 4887; # default port is 4887
my $MAXCLIENTS = 100; # SOMAXCONN for system max

# listen on SVR_ADDR:SVR_PORT
$SVR_ADDR = $SVR_ADDR ? inet_aton($SVR_ADDR) : INADDR_ANY;
socket(SERVER, PF_INET, SOCK_STREAM, getprotobyname("tcp"))
or die "Error opening socket: $!\n";
setsockopt(SERVER, SOL_SOCKET, SO_REUSEADDR, pack("l", 1))
or die "Error trying to set socket reusable option: $!\n";
bind(SERVER, pack_sockaddr_in($SVR_PORT, $SVR_ADDR))
or die "Error on bind: $!\n";
listen(SERVER, $MAXCLIENTS)
or die "Error on listen: $!\n";

print "$SVR_NAME server running on port $SVR_PORT\n";

# main loop
my $curr_client_id = 0;
for (;;) {
my $cl_id = sprintf "CLIENT_%010d", $curr_client_id++;
my $cl_sock = accept(my $client, SERVER);
threads->create("serve_client", $client, $cl_id, $cl_sock);
}

# spawned threads use this sub
sub serve_client {
my ($client, $cl_id, $cl_sock) = @_;
my ($cl_port, $cl_addr) = unpack_sockaddr_in($cl_sock);
my $cl_name = gethostbyaddr($cl_addr, AF_INET);
print "connection from $cl_name:$cl_port [", inet_ntoa($cl_addr), "]\n";
autoflush $client 1;
for (1 .. 10) {
print $client "... talking to $cl_name [id $cl_id] ...", EOL;
sleep 1;
}
close $client;
threads->self->join;
}


===[ file client_x.pl ]===================================================

#!/usr/bin/perl

use strict;
use warnings;

use Socket;

my $host = "localhost";
my $port = 4887;
my $svr_sock = sockaddr_in($port, inet_aton($host));

socket(SERVER, PF_INET, SOCK_STREAM, getprotobyname("tcp"));
connect(SERVER, $svr_sock) or die "Error - could not connect: $!";
while (defined(my $line = <SERVER>)) {
print $line;
}
close (SERVER);
 
S

Sam Holden

Hi,

I am trying to write a multiuser server (for games, chat, etc) in Perl.
In time, this program might have to service >500 users simultaneously,
who would be connected for some time. So I thought about using threads
instead of forking a new process for every user, in order to keep my
process table healthy and minimize inter-process communication overhead. [snip]

First problem - threads seem to be working fine, but when I look at the
output of ps, there are still seperate processes for each connection.
Shouldn't the threads all be contained in one large process? I'm using
an ithreads-enabled current version of Perl with Linux 2.4.19.

Threads can show up as different processes. Check you ps documentation or
you kernel documentation for details on how to identify threads (and
I guess the perl docs too?)

[snip second problem of memory usage which I can't help with]
Third problem - I have absolutely no idea why, but when I run the
client script in another terminal (or on another host even), and hit
Ctrl-C while the client is still receiving data from the server, the
server(!) will quit (and the client too, of course).

That's probably something along of the lines of a SIGPIPE terminating
your process. Catch the signal you are getting when writing to the broken
socket, clean up, and continue.

Can anybody explain to me what is happening? Is there a better approach
than using threads? Should I just fork away for each new client?

I know very little about perl threads so I'm not going to try and comment
on the threading stuff.

A third solution, to forking and threading, is to do the multiplexing
yourself using something like select (perldoc -f select). Doing
that can often be simpler for things like chat servers where the
connections are not independant (data read from one will be sent to
others, common data will be updated, etc) since all the data is
shared and there are no race conditions or locking issues (since there's
only one 'thread' executing).
 
R

Rocco Caputo

Hi,

I am trying to write a multiuser server (for games, chat, etc) in Perl.
In time, this program might have to service >500 users simultaneously,
who would be connected for some time. So I thought about using threads
instead of forking a new process for every user, in order to keep my
process table healthy and minimize inter-process communication overhead.

I wrote simple server and client scripts to test the theory, but it
does not work at all like I expected. The scripts are short enough,
and I have cut away any unnecessary parts, so I included them both at
the end of this posting.

First problem - threads seem to be working fine, but when I look at the
output of ps, there are still seperate processes for each connection.
Shouldn't the threads all be contained in one large process? I'm using
an ithreads-enabled current version of Perl with Linux 2.4.19.

Surprise! Linux's "threads" are lightweight processes that share memory
rather than copy-on-write it.
Second problem - the process (group) just grows and grows with every
connection, and no memory seems to be released after a client
disconnects. I am aware that perl usually does not return freed memory
to the system, but I was hoping that the memory would at least be
reused for the next connection. This is what it looks like:

Which version of Perl are you using? I have it on good authority that
version 5.8.0 has a number of leaks that have since been fixed.

Then again, perhaps it's not good to join a thread from itself. I'm not
familiar enough with the intricacies of iThreads to say for sure.
Third problem - I have absolutely no idea why, but when I run the
client script in another terminal (or on another host even), and hit
Ctrl-C while the client is still receiving data from the server, the
server(!) will quit (and the client too, of course).

You should probably trace the source of the exit. The easiest way is to
throw in debugging warn or print statements to see what the server's
doing just before it stops.
Can anybody explain to me what is happening? Is there a better approach
than using threads? Should I just fork away for each new client?

Perl's threads use about half a megabyte apiece, and that's usually on
the low side of the estimate. 300 of them will require a minimum of
150MB of memory. That figure will rise quickly if you start building
objects and using modules.

You may want to consider a single-process, single-thread design, based
on four-argument select(), IO::Select, or a higher-level networking
library like POE.
===[ file svr_thr_x.pl ]==================================================

#!/usr/bin/perl

use strict;
use warnings;

use threads qw(yield);
use threads::shared;

use Socket;
use POSIX qw(strftime);
use IO::Handle;

use constant EOL => "\015\012";

# configuration
my $SVR_NAME = "XXXXXXX v0.001"; # name string
my $SVR_ADDR = undef; # set to undef for INADDR_ANY
my $SVR_PORT = 4887; # default port is 4887
my $MAXCLIENTS = 100; # SOMAXCONN for system max

# listen on SVR_ADDR:SVR_PORT

Consider IO::Socket::INET here.
$SVR_ADDR = $SVR_ADDR ? inet_aton($SVR_ADDR) : INADDR_ANY;
socket(SERVER, PF_INET, SOCK_STREAM, getprotobyname("tcp"))
or die "Error opening socket: $!\n";
setsockopt(SERVER, SOL_SOCKET, SO_REUSEADDR, pack("l", 1))
or die "Error trying to set socket reusable option: $!\n";
bind(SERVER, pack_sockaddr_in($SVR_PORT, $SVR_ADDR))
or die "Error on bind: $!\n";
listen(SERVER, $MAXCLIENTS)
or die "Error on listen: $!\n";

print "$SVR_NAME server running on port $SVR_PORT\n";

# main loop
my $curr_client_id = 0;
for (;;) {
my $cl_id = sprintf "CLIENT_%010d", $curr_client_id++;
my $cl_sock = accept(my $client, SERVER);
threads->create("serve_client", $client, $cl_id, $cl_sock);
}

# spawned threads use this sub
sub serve_client {
my ($client, $cl_id, $cl_sock) = @_;
my ($cl_port, $cl_addr) = unpack_sockaddr_in($cl_sock);
my $cl_name = gethostbyaddr($cl_addr, AF_INET);
print "connection from $cl_name:$cl_port [", inet_ntoa($cl_addr), "]\n";
autoflush $client 1;
for (1 .. 10) {

You don't test whether print() succeeds here. How do you know when the
client has disconnected?
print $client "... talking to $cl_name [id $cl_id] ...", EOL;
sleep 1;
}
close $client;

Is it okay for a thread to join itself? This seems circular; would it
be the cause of memory not being released?
 
S

Stefan Weiss

Hi,

many thanks to you and Sam Holden for your fast and very helpful
answers.

Rocco Caputo said:
Surprise! Linux's "threads" are lightweight processes that share memory
rather than copy-on-write it.

That explains a lot. I will take Sam's advice and familiarize myself
with the way Linux handles threads before I start using them again.
Which version of Perl are you using? I have it on good authority that
version 5.8.0 has a number of leaks that have since been fixed.

Spot on. Perl 5.8.0 running here - I thought that was a reasonably
current version but I guess I'm a bit out of touch.
Then again, perhaps it's not good to join a thread from itself. I'm not
familiar enough with the intricacies of iThreads to say for sure.

I'm not saying that I am familiar with iThreads, but calling join from
a thread does not seem to be problematic. I have found this to be the
only way to effectively close the client socket (filehandle), and I
have seen people-who-know-what-they-are-doing detach()ing threads
from inside. But then again, I am just starting to understand Perl
threads.
You may want to consider a single-process, single-thread design, based
on four-argument select(), IO::Select, or a higher-level networking
library like POE.

I am currently downloading POE, and it looks very promising. The reason
why I did not go for a high-level framework in the first place is that
I wanted to understand what exactly was happening, before I use
abstractions and high-level interfaces to hide the gory details :)
===[ file svr_thr_x.pl ]==================================================
[..]
Consider IO::Socket::INET here.

Same reason - understand the low level first, then use a specialized
module.
print "connection from $cl_name:$cl_port [", inet_ntoa($cl_addr),
"]\n"; autoflush $client 1;
for (1 .. 10) {

You don't test whether print() succeeds here. How do you know when the
client has disconnected?

Point taken.

Thanks again for your advice. I will try multiplexing with select()
now, and then probably use POE.


cheers,
stefan
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,432
Messages
2,571,680
Members
48,796
Latest member
Greg L.

Latest Threads

Top