Millions of Threads?

frankgerlach

Hello folks,
I am thinking about a telecom application which would potentially
handle millions of mobile phones (J2ME) as clients. Of course, I need
a server (J2SE), too.
The "easy" implementation uses TCP connections for the client/server
communication. The problem is that there are only 65000 sockets per IP
address of the server. I think I could solve that by configuring
multiple IP addresses per network card.
Still, two problems remain: the memory used by each TCP connection,
and the enormous number of threads (each client would have a server
thread in the "easy" implementation).
Because of all those issues I am considering the use of datagram
sockets and state machines (one per client) instead of one thread per
client. On the other hand, what is the difference between a state
machine called "Thread" and a "hand-crafted" state machine? Both
consume memory, and maybe I could configure the JVM to allocate very
little memory per Thread...
What do you think? What is typically used in large telecom
applications?
 
Arne Vajhøj

> I am thinking about a telecom application which would potentially
> handle millions of mobile phones (J2ME) as clients. Of course, I need
> a server (J2SE), too.
> The "easy" implementation uses TCP connections for the client/server
> communication. The problem is that there are only 65000 sockets per IP
> address of the server. I think I could solve that by configuring
> multiple IP addresses per network card.
> Still, two problems remain: the memory used by each TCP connection,
> and the enormous number of threads (each client would have a server
> thread in the "easy" implementation).
> Because of all those issues I am considering the use of datagram
> sockets and state machines (one per client) instead of one thread per
> client. On the other hand, what is the difference between a state
> machine called "Thread" and a "hand-crafted" state machine? Both
> consume memory, and maybe I could configure the JVM to allocate very
> little memory per Thread...
> What do you think? What is typically used in large telecom
> applications?

I would say something like:

< 1000 clients: one thread per client
1000-10000 clients: NIO
> 10000 clients: UDP

Usually threads are mapped to OS threads and have significant
overhead.
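
Roughly, the NIO approach multiplexes all connections over one selector
in a single thread instead of one thread per client. A minimal sketch
(the port and the echo handling are placeholders, not a real protocol):

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.*;
import java.util.Iterator;

// Minimal single-threaded NIO server: one selector watches every
// client connection, so no per-client thread is needed.
public class NioEchoServer {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.socket().bind(new InetSocketAddress(9000));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        ByteBuffer buf = ByteBuffer.allocate(1024);
        while (true) {
            selector.select();                  // block until some channel is ready
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    SocketChannel client = (SocketChannel) key.channel();
                    buf.clear();
                    if (client.read(buf) < 0) { // client closed the connection
                        key.cancel();
                        client.close();
                    } else {
                        buf.flip();
                        client.write(buf);      // echo back; stand-in for real work
                    }
                }
            }
        }
    }
}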

With a million clients I think you should also worry about
horizontal scalability!

Arne
 
frankgerlach

> With a million clients I think you should also worry about
> horizontal scalability!
Well, a quick calculation suggests that a single server (4 CPUs or so)
could handle a million clients: say each client sends 10 messages of
1000 bytes per day (the application in question is not very
data-intensive). That's 1,000,000 * 10 * 1,000 = 10 GByte per day. An
internet connection with 10 MByte/sec (100 Mbit Ethernet) could easily
handle this: 10,000,000 * 86,400 = 864 GByte per day. (Assuming there
are no significant traffic spikes.) If I used Gbit Ethernet I could
scale even higher, and then I drop in 4 to 10 Gbit Ethernet cards...
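
For what it's worth, a throwaway snippet to sanity-check those numbers
(the inputs are the assumptions above, not measurements):

// Back-of-the-envelope check of the traffic estimate above.
public class BandwidthCheck {
    public static void main(String[] args) {
        long clients = 1000000L;                 // one million clients
        long messagesPerDay = 10L;               // per client
        long bytesPerMessage = 1000L;
        long bytesPerDay = clients * messagesPerDay * bytesPerMessage;
        System.out.println("daily volume: " + bytesPerDay / 1e9 + " GB");       // 10.0

        long linkBytesPerSecond = 10000000L;     // 10 MByte/s ~ 100 Mbit Ethernet
        long linkBytesPerDay = linkBytesPerSecond * 86400L;
        System.out.println("link capacity: " + linkBytesPerDay / 1e9 + " GB");  // 864.0
        System.out.println("utilisation: "
                + (100.0 * bytesPerDay / linkBytesPerDay) + " %");              // ~1.2 %
    }
}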
 
frankgerlach

What about "green" threads? Could they scale to millions per machine
(with less than 10 CPU cores and less than 32 GByte main memory)?
 
Arne Vajhøj

> What about "green" threads? Could they scale to millions per machine
> (with less than 10 CPU cores and less than 32 GByte main memory)?

Green threads can only use one core and will likely be
even slower.

Arne
 
Arne Vajhøj

> Well, a quick calculation suggests that a single server (4 CPUs or so)
> could handle a million clients: say each client sends 10 messages of
> 1000 bytes per day (the application in question is not very
> data-intensive). That's 1,000,000 * 10 * 1,000 = 10 GByte per day. An
> internet connection with 10 MByte/sec (100 Mbit Ethernet) could easily
> handle this: 10,000,000 * 86,400 = 864 GByte per day. (Assuming there
> are no significant traffic spikes.) If I used Gbit Ethernet I could
> scale even higher, and then I drop in 4 to 10 Gbit Ethernet cards...

If we assume that you spend 20 milliseconds processing a request, then
a round trip over all connections, if they all need processing, will
take 83 minutes with 4 CPUs (1,000,000 * 20 ms / 4 = 5,000 seconds).

I hope it is not an interactive application.

:)

Arne
 
Patricia Shanahan

Arne said:
> If we assume that you spend 20 milliseconds processing a request, then
> a round trip over all connections, if they all need processing, will
> take 83 minutes with 4 CPUs (1,000,000 * 20 ms / 4 = 5,000 seconds).
>
> I hope it is not an interactive application.
>
> :)
>
> Arne

If each client sends 10 messages a day, there is a relatively low
probability that they all need service simultaneously. On the other
hand, it would be equally unreasonable to assume the traffic is
uniformly distributed.

I would model message arrivals with a Poisson distribution, but using
the rate associated with the busiest time of day, not the 24 hour
average. The problem is going to be working out where all the
bottlenecks are, and estimating the queue delays.

A thread has significant memory overhead, and appears on operating
system queues. How much do you really need to remember about each client
between messages? If it is small, you are better off keeping a client
state object for each client. When a message comes in, allocate a thread
and select the appropriate client state object.
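
A minimal sketch of that idea (ClientState and the pool size of 8 are
placeholders, not recommendations):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: one small state object per client; a pooled thread is
// borrowed only while a message is actually being processed.
public class StatefulDispatcher {
    static class ClientState {            // hypothetical per-client memory
        long lastSeen;
        int sequence;
    }

    private final ConcurrentHashMap<String, ClientState> states =
            new ConcurrentHashMap<String, ClientState>();
    private final ExecutorService workers = Executors.newFixedThreadPool(8);

    public void onMessage(final String clientId, final byte[] payload) {
        workers.execute(new Runnable() {
            public void run() {
                ClientState state = states.get(clientId);
                if (state == null) {
                    ClientState fresh = new ClientState();
                    ClientState prev = states.putIfAbsent(clientId, fresh);
                    state = (prev != null) ? prev : fresh;
                }
                synchronized (state) {    // serialize messages per client
                    state.lastSeen = System.currentTimeMillis();
                    state.sequence++;
                    // ... process payload against the client's state ...
                }
            }
        });
    }
}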

Patricia
 
Chris Uppal

> I am thinking about a telecom application which would potentially
> handle millions of mobile phones (J2ME) as clients. Of course, I need
> a server (J2SE), too.
> The "easy" implementation uses TCP connections for the client/server
> communication. The problem is that there are only 65000 sockets per IP
> address of the server. I think I could solve that by configuring
> multiple IP addresses per network card.

Assuming < 10e6 customers and an average of 10 messages per day, with a
target turn-around time of 0.5 seconds, you would have an average of
around 600 interactions "in-flight" at any one instant (10e6 * 10 /
86,400 is roughly 1,160 requests per second; at 0.5 seconds each, about
580 in flight). Obviously you would have to model/estimate how much the
peak load would exceed that. It also suggests that running out of port
numbers is not going to be one of your problems.

> Still, two problems remain: the memory used by each TCP connection,
> and the enormous number of threads (each client would have a server
> thread in the "easy" implementation).

On most OSs each thread consumes a lot of resources -- the stack space,
for instance. It also puts load on the scheduler. Don't use that many
threads unless either your JVM is specifically designed to handle that
many threads as lightweight entities (Sun's are not), or your OS is
specifically designed to provide ultra-lightweight threads (I believe
there is a Linux threads implementation which claims to have this
property -- I have no experience of it myself).

Otherwise, that many concurrent requests puts you squarely into NIO territory.
Then scale up your CPUs and/or load-balanced server boxes to make the target
turn-around time achievable. Same goes for the database (if any).

Notice that, on the above assumptions, you will be serving a request at
intervals of (on average) about 1 millisecond. That is substantially less than
the typical disk-seek time. If each request causes one disk read, and if there
is not enough temporal correlation between requests (which there might be),
then each request will also cause one physical disk seek -- so you need
sufficient spindles to allow that many seeks to be issued without blocking each
other.

-- chris
 
Thomas Hawtin

> I am thinking about a telecom application which would potentially
> handle millions of mobile phones (J2ME) as clients. Of course, I need
> a server (J2SE), too.
> The "easy" implementation uses TCP connections for the client/server
> communication. The problem is that there are only 65000 sockets per IP
> address of the server. I think I could solve that by configuring
> multiple IP addresses per network card.

That may be an OS problem, but it shouldn't be a problem for TCP or UDP.
Each connection is identified by four numbers: client IP address, client
port, server IP address and server port. Even with the last two
constant, you should get at least 32768 connections per client. That
should be enough.

Having said that, if all the connections go through the same opaque
proxy (or you try to do a 1:1 mapping to a back-end app server or
database), then you could cluster onto a single client IP address. I
really don't know how mobile gateways operate.

> Still, two problems remain: the memory used by each TCP connection,
> and the enormous number of threads (each client would have a server
> thread in the "easy" implementation).

If you want a million simultaneous connections, then ouch.

It's an area that has moved on a great deal over the last few years. So
a lot of what you read will be out of date.

The killer for threads is the amount of virtual address space used.
Therefore stick to 64-bit operating systems (shouldn't be a problem
these days). To handle large numbers of threads, OSes have moved to
scalable algorithms. On Linux, use a 2.6 rather than a 2.4 kernel. I
would check what Solaris 10 x64 can do, but my Ultra 20 is on the blink.

I guess large numbers of TCP connections will consume a lot of memory.
Perhaps tuning the window size will help (a large window helps
throughput with long latencies).

> Because of all those issues I am considering the use of datagram
> sockets and state machines (one per client) instead of one thread per
> client. On the other hand, what is the difference between a state
> machine called "Thread" and a "hand-crafted" state machine? Both
> consume memory, and maybe I could configure the JVM to allocate very
> little memory per Thread...

UDP does give you advantages in terms of not having to hold connections
open, and the ability to work around TCP latency issues. On the other
hand, using UDP you will have to reinvent a lot of TCP. Perhaps TCP and
NIO would be better.
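
To illustrate the datagram-plus-state-machine idea from the original
post, a sketch (ClientStateMachine is a placeholder, and the reliability
layer you would have to reinvent is deliberately left out):

import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;
import java.util.HashMap;
import java.util.Map;

// One socket, no connections: each datagram is routed to a per-client
// state machine keyed by the sender's address.
public class UdpStateMachineServer {
    static class ClientStateMachine {
        void onDatagram(ByteBuffer data) { /* advance the state machine */ }
    }

    public static void main(String[] args) throws Exception {
        DatagramChannel channel = DatagramChannel.open();  // blocking by default
        channel.socket().bind(new InetSocketAddress(9000));
        Map<SocketAddress, ClientStateMachine> clients =
                new HashMap<SocketAddress, ClientStateMachine>();
        ByteBuffer buf = ByteBuffer.allocate(1500);        // one MTU-sized datagram

        while (true) {
            buf.clear();
            SocketAddress from = channel.receive(buf);     // waits for next datagram
            buf.flip();
            ClientStateMachine sm = clients.get(from);
            if (sm == null) {
                sm = new ClientStateMachine();
                clients.put(from, sm);
            }
            sm.onDatagram(buf);
        }
    }
}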

Probably you are best off using someone else's infrastructure. Here's
where I don't know much about what is available. IIRC, Grizzly and Ember
are two abstractions of NIO. Sun's Project Darkstar is a server that
handles problems such as fail-over. If you've got millions of customers,
you are probably going to want to make it more reliable than can
reasonably be achieved with one machine.

Tom Hawtin
 
Brandon McCombs

> Well, a quick calculation suggests that a single server (4 CPUs or so)
> could handle a million clients: say each client sends 10 messages of
> 1000 bytes per day (the application in question is not very
> data-intensive). That's 1,000,000 * 10 * 1,000 = 10 GByte per day. An
> internet connection with 10 MByte/sec (100 Mbit Ethernet) could easily
> handle this: 10,000,000 * 86,400 = 864 GByte per day. (Assuming there
> are no significant traffic spikes.) If I used Gbit Ethernet I could
> scale even higher, and then I drop in 4 to 10 Gbit Ethernet cards...

I hope he has the money for a 10 MByte/sec UPLINK. Sure, it seems small,
and it is for an intranet link, but that is an expensive link when you
have to put it on the Internet *and* it has to be your uplink speed, not
your downlink.
 
Robert Klemme

> Hello folks,
> I am thinking about a telecom application which would potentially
> handle millions of mobile phones (J2ME) as clients. Of course, I need
> a server (J2SE), too.
> The "easy" implementation uses TCP connections for the client/server
> communication. The problem is that there are only 65000 sockets per IP
> address of the server. I think I could solve that by configuring
> multiple IP addresses per network card.
> Still, two problems remain: the memory used by each TCP connection,
> and the enormous number of threads (each client would have a server
> thread in the "easy" implementation).
> Because of all those issues I am considering the use of datagram
> sockets and state machines (one per client) instead of one thread per
> client. On the other hand, what is the difference between a state
> machine called "Thread" and a "hand-crafted" state machine? Both
> consume memory, and maybe I could configure the JVM to allocate very
> little memory per Thread...
> What do you think? What is typically used in large telecom
> applications?

Another option to deal with this is LWP (a lightweight processing
framework). You can find more info on this in Doug Lea's book [1]. The
basic pattern is that you have a thread pool and send requests down a
queue, from which the first free thread picks up the task, processes
it, and either sends the results back directly or puts them on another
queue, where they are read by one of another set of threads that
transfers the data back. You save the overhead of thread creation and
deletion at the cost of a slightly more complex architecture.
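
A minimal sketch of that pattern using java.util.concurrent (all names
are illustrative):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Requests go down one queue; the first free worker picks the next one
// up; replies travel over a second queue to a dedicated writer thread.
public class QueuedServer {
    private final BlockingQueue<String> requests = new LinkedBlockingQueue<String>();
    private final BlockingQueue<String> replies = new LinkedBlockingQueue<String>();

    public void start(int workerCount) {
        for (int i = 0; i < workerCount; i++) {
            new Thread(new Runnable() {
                public void run() {
                    try {
                        while (true) {
                            String request = requests.take();  // first free thread wins
                            replies.put("handled: " + request);
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();    // shut down politely
                    }
                }
            }).start();
        }
        new Thread(new Runnable() {  // single thread transfers results back
            public void run() {
                try {
                    while (true) {
                        System.out.println(replies.take());    // stand-in for a network write
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }).start();
    }

    public void submit(String request) throws InterruptedException {
        requests.put(request);
    }
}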

Of course the basic precondition is that the system at hand has enough
resources (main mem, CPU cycles, IO bandwidth) to actually deal with the
load.

HTH

Kind regards

robert


[1] "Concurrent Programming in Java"
http://www.awprofessional.com/bookstore/product.asp?isbn=0201310090&rl=1
[2] Doug Lea's home page http://g.oswego.edu/
 
Dale King

> Hello folks,
> I am thinking about a telecom application which would potentially
> handle millions of mobile phones (J2ME) as clients. Of course, I need
> a server (J2SE), too.
> The "easy" implementation uses TCP connections for the client/server
> communication. The problem is that there are only 65000 sockets per IP
> address of the server. I think I could solve that by configuring
> multiple IP addresses per network card.

I think others have addressed the thread issue, so I'll tackle your
misconception about IP. You need to understand the difference between
"sockets", which are a software concept, and "ports", which are part of
the IP protocol. There is a limit of 65,536 ports per IP address, but
there is really no limit, other than memory etc., on the number of
sockets. You do not use a port per connected user. You have a socket
connection with each user, but they would all probably be communicating
with a single IP port.

It is no different from a web server. There can be millions of users of
the web server, and they are all connected to port 80 on the server.
The server only has to use one port.
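
A few lines of code make the point: every accepted socket reports the
same local port, and it is the client's address and port that make each
connection unique. (Port 8080 is used here only because 80 usually
needs privileges.)

import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

// One listening port, arbitrarily many sockets: every accepted Socket
// is still bound to the same local port on the server.
public class OnePortManySockets {
    public static void main(String[] args) throws IOException {
        ServerSocket listener = new ServerSocket(8080);
        while (true) {
            Socket client = listener.accept();   // new socket, same server port
            System.out.println("local port: " + client.getLocalPort()
                    + "  peer: " + client.getInetAddress()
                    + ":" + client.getPort());   // this pair is the unique part
            client.close();
        }
    }
}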

You may have gotten confused by Sun itself! For many years, the Java
tutorial contained a totally incorrect and misleading explanation of
how TCP sockets work. It talked about the server switching from the
well-known port (e.g. 80 for a web server) to another socket bound to a
different port once the connection was accepted. This is false. Once
the connection is made, you move from the server socket that is
listening for new connections to another socket dedicated to that user,
but it is still bound to the same IP port on the server.

Here is a link to their old bogus tutorial page:

<http://web.archive.org/web/20000301...s/tutorial/networking/sockets/definition.html>

You can contrast this with the corrected version (which appears to have
only been corrected this year despite years of protests):

<http://java.sun.com/docs/books/tutorial/networking/sockets/definition.html>
 
