Available memory and limit on thread creation


Joseph Dionne

I am using Linux 2.4 with JVM 1.4.2_04. I have noticed a limit to the
number of threads that can be created. I tried many settings of "-Xmx"
and "-Xss" and the largest number of threads I can launch is 1015. So,
I have several questions.

First, does available memory limit the number of threads?
Second, if available memory does limit the number of threads, is there a
rule of thumb on how to calculate that limit?
Finally, exactly what effect do "-Xmx" and "-Xss" have on the VM, and do
they affect thread memory at all?

thanks
Joseph
 

Liz

Joseph Dionne said:
I am using Linux 2.4 with JVM 1.4.2_04. I have noticed a limit to the
number of threads that can be created. I tried many settings of "-Xmx"
and "-Xss" and the largest number of threads I can launch is 1015. So,
I have several questions.

First, does available memory limit the number of threads?
Obviously.

Second, if available memory does limit the number of threads, is there a
rule of thumb on how to calculate that limit?
Finally, exactly what effect do "-Xmx" and "-Xss" have on the VM, and do
they affect thread memory at all?

thanks
Joseph
 

Joseph Dionne

Thank you Liz; as a C developer I know this is a fact. As a decades-long
UNIX programmer, knowing the function of swap space, which I have set
rather large, I would expect to achieve a much better outcome. I don't
want to diminish Java, but I can model a lot more threads, using pthreads
in C, than I can in Java. I am looking for more detailed information
because, quite frankly, I prefer to work in Java.

I have found that multiple JVMs running my test application reach a
similar limit, leading me to believe that any limit is imposed by something
other than what I am considering.

Joseph
 

Michael Borgwardt

Joseph said:
Finally, exactly what effect do "-Xmx" and "-Xss" have on the VM, and do
they affect thread memory at all?

-Xmx sets the maximum heap memory the JVM can use; it should not affect
threads at all.

-Xss sets the stack memory per thread, which would obviously affect
the maximum number of threads if there were no other limit.
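A rough way to see where the ceiling sits on a given setup is to keep
starting threads that do nothing but sleep until thread creation fails, and
repeat with different "-Xss" values (e.g. java -Xss128k ThreadLimit). This
is only a sketch; the class name, daemon flag and sleep interval are
arbitrary:

// ThreadLimit.java -- start sleeping threads until creation fails, then
// report how many were started. Run with e.g.: java -Xss128k ThreadLimit
public class ThreadLimit {
    public static void main(String[] args) {
        int count = 0;
        try {
            while (true) {
                Thread t = new Thread(new Runnable() {
                    public void run() {
                        try {
                            Thread.sleep(Long.MAX_VALUE);  // just hold the thread alive
                        } catch (InterruptedException e) {
                            // ignore; only creation matters here
                        }
                    }
                });
                t.setDaemon(true);  // let the JVM exit once main returns
                t.start();
                count++;
            }
        } catch (Throwable failure) {
            // typically an OutOfMemoryError when a native thread cannot be created
            System.out.println("Created " + count + " threads before: " + failure);
        }
    }
}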
 

Michael Borgwardt

Joseph said:
Thank you Liz; as a C developer I know this is a fact. As a decades-long
UNIX programmer, knowing the function of swap space, which I have set
rather large, I would expect to achieve a much better outcome.

"Available memory" can also be meant to include swap space.
I have found that multiple JVMs running my test application reach a
similar limit, leading me to believe a any limit is biased by something
other than I am considering.

Indeed it is so, and it has nothing to do with Java. The limit is the
maximum number of threads allowed by the operating system, which in
your case is apparently 1024. The maximum number of Java threads should
decrease further when there are other applications running. You can
change this value via /proc/sys/kernel/threads-max.
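For what it's worth, the current value can also be read from inside Java
rather than from the shell; a small sketch, assuming the usual 2.4-kernel
/proc layout:

// PrintThreadsMax.java -- print the kernel's global thread limit on Linux
import java.io.BufferedReader;
import java.io.FileReader;

public class PrintThreadsMax {
    public static void main(String[] args) throws Exception {
        BufferedReader in = new BufferedReader(
                new FileReader("/proc/sys/kernel/threads-max"));
        System.out.println("threads-max = " + in.readLine());
        in.close();
    }
}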

As a "decades long UNIX programmer" this should be no news to you.
 

Joseph Dionne

Michael said:
"Available memory" can also be meant to include swap space.



Indeed it is so, and it has nothing to do with Java. The limit is the
maximum number of threads allowed by the operating system, which in
your case is apparently 1024. The maximum number of Java threads should
decrease further when there are other applications running. You can
change this value via /proc/sys/kernel/threads-max.

As a "decades long UNIX programmer" this should be no news to you.

Thanks Michael. But threading is something relatively new to UNIX; we
old guys used to spin off processes. But I adapt well, and my
threads-max setting is actually 8184, well above the number of threads I
am attempting to launch. And, as I have said, I can create more threads
using C/pthreads than I can using Java. There must be some other limit
that I am unaware of affecting the number of threads a JVM can launch.

Joseph
 

Roedy Green

There must be some other limit
that I am unaware of affecting the number of threads a JVM can launch.

One limit is the size of the stack for each thread. I think by
default it is a rather generous 1 MB.
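That suggests a back-of-the-envelope rule of thumb: on a 32-bit JVM the
bound is usually the process address space divided by the per-thread stack
reservation, not physical RAM or swap. The figures below are assumptions for
a 32-bit Linux process, purely for illustration:

// StackBudget.java -- rough estimate of how many thread stacks fit in the
// address space left over after the heap and JVM internals. All numbers
// here are assumptions, not measurements.
public class StackBudget {
    public static void main(String[] args) {
        long addressSpace = 3L * 1024 * 1024 * 1024;  // ~3 GB user space on 32-bit Linux
        long heapAndOther = 1L * 1024 * 1024 * 1024;  // assume ~1 GB for heap, JVM, libraries
        long stackSize    = 1024 * 1024;              // roughly the default -Xss of 1 MB
        System.out.println("Rough upper bound on threads: "
                + (addressSpace - heapAndOther) / stackSize);  // prints 2048
    }
}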
 

Joseph Dionne

Joseph said:
Thanks Michael. But threading is something relatively new to UNIX; we
old guys used to spin off processes. But I adapt well, and my
threads-max setting is actually 8184, well above the number of threads I
am attempting to launch. And, as I have said, I can create more threads
using C/pthreads than I can using Java. There must be some other limit
that I am unaware of affecting the number of threads a JVM can launch.

Joseph

Additional information:

Increasing threads-max to 16k has no effect; I am still only able to
launch 1015 threads from one JVM. However, I can start several JVMs,
and all of them are limited to 1015.

I also tried Sun JVM 1.5.0, and without any "-X" settings it too was
limited to 1015 threads. However, using the "-Xprof" option, lo and
behold, I was able to start all 1024 threads I am attempting to start.
Using this option in 1.4.2_04 does not seem to change the problem.

Since the profile option changed the symptoms, at least in 1.5.0, I am
led to believe the limit is specific to something in the JVM and not to
some external OS setting.

Joseph
 

Joseph Dionne

Roedy said:
One limit is the size of the stack for each thread. I think by
default it is a rather generous 1 MB.

I have tried several "-Xss" settings, with no effect. I have found that
128 KB is (close to) the smallest stack setting. Values below that
cause the VM to fail at startup, complaining about StackOverflowError.

Joseph
 

Roedy Green

the largest number of threads I can launch is 1015

That is suspiciously near the magic number 1024. It is also a rather
huge number of threads for a computer with a single CPU.

You might do an experiment to see what happens when you spin off some
threads from C in JNI.

You might do an experiment to see how many threads you can spin off in
an absolutely trivial program.

My gut reaction is that this is a reasonable upper bound for a JVM or
OS to impose.

You might consider some sort of thread pooling where you have a FIFO
queue of work packets to be done, and say 5 real threads feeding off
it, and then putting incomplete work back on the tail of the queue.

This means you have to maintain state yourself. It will almost
certainly be far more RAM-efficient.

The work packets must decide co-operatively when they have done enough
work for a slice by calling a have-a-conscience method at reasonable
intervals. You can make the work packet itself responsible for saving
and restoring state, rather than saving and restoring everything the
way the system does for a full thread.
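One possible shape for that idea, assuming each work packet knows how to do
a bounded slice of work and save its own state. All names here are invented
for illustration; five Worker instances started against one WorkQueue would
correspond to the "5 real threads" above:

// Sketch of the work-packet scheme: a FIFO queue of packets, a few real
// threads feeding off it, and unfinished packets re-queued at the tail.
import java.util.LinkedList;

interface WorkPacket {
    boolean runSlice();  // do a bounded chunk of work; true when fully done
}

class WorkQueue {
    private final LinkedList packets = new LinkedList();

    public synchronized void put(WorkPacket p) {
        packets.addLast(p);
        notify();                        // wake one waiting worker
    }

    public synchronized WorkPacket take() throws InterruptedException {
        while (packets.isEmpty()) {
            wait();                      // block until work arrives; no polling
        }
        return (WorkPacket) packets.removeFirst();
    }
}

class Worker extends Thread {
    private final WorkQueue queue;

    Worker(WorkQueue queue) { this.queue = queue; }

    public void run() {
        try {
            while (true) {
                WorkPacket p = queue.take();
                if (!p.runSlice()) {     // not finished yet: back on the tail
                    queue.put(p);
                }
            }
        } catch (InterruptedException e) {
            // treat interruption as a shutdown request
        }
    }
}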

I wrote a co-operative thread scheme for Windows 3.1 back before
Windows had threads. It was a lot simpler than I thought it would be.
It required some assembler, but not as much as you might think.
I think what you would do here would not be so general. Chances are
there is great regularity in what all those threads of yours are
doing, e.g. spidering a website.
 

Joseph Dionne

Roedy said:
That is suspiciously near the magic number 1024. It is also a rather
huge number of threads for a computer with a single CPU.

I think you are right. Using the "-server" option reduces the number of
threads by 3 to 1012, so that option must also spin off some threads.

Yes, one might consider 1024 threads a lot of threads, but in my
application, a financial transaction front-end, most of the threads will
be blocking on sockets, so the impact on the CPU is low; RAM, however,
is impacted greatly.

The target production servers are much larger than my test system.
Unfortunately, they have not been purchased yet, because I need to tell
the customer how many to purchase.
You might do an experiment to see what happens when you spin off some
threads from C in JNI.

Yes, my next approach is to put my C/pthreads model into JNI, where I
have more control of memory usage. My approach will be to roll my own
ServerSocketChannel, or something that approaches its functionality,
using native methods for my client socket threads doing socket I/O.
You might do an experiment to see how many threads you can spin off in
an absolutely trivial program.

My test application is very trivial, spinning off threads that just sleep.
My gut reaction is that this is a reasonable upper bound for a JVM or
OS to impose.

I am checking into all the tunable parameters for Linux, and any help on
sites that list them would be appreciated. I don't believe the limit is
OS-related, but rather within the Java VM.
You might consider some sort of thread pooling where you have a FIFO
queue of work packets to be done, and say 5 real threads feeding off
it, and then putting incomplete work back on the tail of the queue.

What I am trying to create is a thread pool that will service clients
accepted by the ServerSocketChannel. The traffic will be high volume with
low connection times, averaging about 2 to 4 seconds. The application
will be running in a server farm, and so the fewer threads each instance
can launch, the more servers the farm needs. One approach I am looking
into is using virtual IPs, starting several VMs on one box, and adding
the virtual IPs to the router hunt table. Since I can run several VMs,
all of which are limited to 1015 threads on my development system, this
might be the solution.
This means you have to maintain state yourself. It will almost
certainly be far more RAM-efficient.

The work packets must decide co-operatively when they have done enough
work for a slice by calling a have-a-conscience method at reasonable
intervals. You can make the work packet itself responsible for saving
and restoring state, rather than saving and restoring everything the
way the system does for a full thread.

I wrote a co-operative thread scheme for Windows 3.1 back before
Windows had threads. It was a lot simpler than I thought it would be.
It required some assembler, but not as much as you might think.
I think what you would do here would not be so general. Chances are
there is great regularity in what all those threads of yours are
doing, e.g. spidering a website.

Thanks for your help, Roedy.
 

Christophe Vanfleteren

Joseph said:
I think you are right. Using the "-server" option reduces the number of
threads by 3 to 1012, so that option must also spin off some threads.

Yes, one might consider 1024 threads a lot of threads, but in my
application, a financial transaction front-end, most of the threads will
be blocking on sockets, so the impact on the CPU is low; RAM, however,
is impacted greatly.

The target production servers are much larger than my test system.
Unfortunately, they have not been purchased yet, because I need to tell
the customer how many to purchase.


Yes, my next approach is to put my C/pthreads model into JNI, where I
have more control of memory usage. My approach will be to roll my own
ServerSocketChannel, or something that approaches its functionality,
using native methods for my client socket threads doing socket I/O.


My test application is very trivial, spinning off threads that just sleep.


I am checking into all the tunable parameters for Linux, and any help on
sites that list them would be appreciated. I don't believe the limit is
OS-related, but rather within the Java VM.


What I am trying to create is a thread pool that will service clients
accepted by the ServerSocketChannel. The traffic will be high volume with
low connection times, averaging about 2 to 4 seconds. The application
will be running in a server farm, and so the fewer threads each instance
can launch, the more servers the farm needs. One approach I am looking
into is using virtual IPs, starting several VMs on one box, and adding
the virtual IPs to the router hunt table. Since I can run several VMs,
all of which are limited to 1015 threads on my development system, this
might be the solution.

You do realize that with NIO in JDK 1.4 you have nonblocking sockets,
don't you?
Using those should allow you to scale much better than with just using
threads.
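For reference, the bare-bones shape of a selector loop. The port, buffer
size and echo-back are placeholders; a real server would hand completed
reads to a worker pool and add proper error handling:

// SelectorSketch.java -- one thread accepts and reads all connections via
// a Selector, so idle clients do not each hold a thread.
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class SelectorSketch {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.socket().bind(new InetSocketAddress(8080));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        ByteBuffer buf = ByteBuffer.allocate(4096);
        while (true) {
            selector.select();                         // blocks until a channel is ready
            Iterator it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = (SelectionKey) it.next();
                it.remove();
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept();
                    if (client != null) {
                        client.configureBlocking(false);
                        client.register(selector, SelectionKey.OP_READ);
                    }
                } else if (key.isReadable()) {
                    SocketChannel client = (SocketChannel) key.channel();
                    buf.clear();
                    if (client.read(buf) < 0) {
                        client.close();                // client disconnected
                    } else {
                        buf.flip();
                        client.write(buf);             // placeholder: hand off to workers instead
                    }
                }
            }
        }
    }
}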
 

Joseph Dionne

Christophe said:
You do realize that with NIO in JDK 1.4 you have nonblocking sockets,
don't you?
Using those should allow you to scale much better than with just using
threads.

Yes, I am using ServerSocketChannel, but right now only to accept
connections and dispatch them to the thread pool for processing.
Unfortunately, that processing requires a system call to a third-party
command-line application, which might run for a long time or return
quickly, in under 1 to 2 seconds.

But a redesign using a smaller pool of threads to read sockets, then
dispatch received data to another pool of threads for the external
processing as a detached application, and another pool of threads to
monitor the results of the external application, might provide the
desired throughput. One issue to overcome is how to keep the socket
available to receive the results when the third-party application
returns. However, this should be a trivial issue.

Thank you.
 

Roedy Green

What I am trying to create is a thread pool that will service clients
accepted by the ServerSocketChannel. The traffic will be high volume with
low connection times, averaging about 2 to 4 seconds. The application
will be running in a server farm, and so the fewer threads each instance
can launch, the more servers the farm needs. One approach I am looking
into is using virtual IPs, starting several VMs on one box, and adding
the virtual IPs to the router hunt table. Since I can run several VMs,
all of which are limited to 1015 threads on my development system, this
might be the solution.

Waiting for transactions on sockets is a common and specialised
problem. If your threads, most of the time, are not really maintaining
any state while they wait for I/O, just waiting for an HTTP request,
they don't really need a whole thread to themselves.

Surely there are socket packages or wombs out there that handle this
for you. You have not told us yet just what processing you are doing
on these sockets.

If I think about how I would do this at an assembler level, you would
only need one thread to read all the sockets. It gets interrupts from
the socket hardware, and it copies data into a FIFO queue. Other
threads service the queue.

Unless each thread represented something as complex as a separate
timesharing session, you don't really need the overhead of a fully
general thread to deal with it. You can represent each thread with
something much smaller and more specialised.
 

Carl Howells

Joseph said:
Yes, I am using ServerSocketChannel, but right now only to accept
connections and dispatch them to the thread pool for processing.
Unfortunately, that processing requires a system call to a third-party
command-line application, which might run for a long time or return
quickly, in under 1 to 2 seconds.

You don't need to spin off multiple threads for that... Using
Runtime.exec already spins off a new process to execute the command in.
There's absolutely no reason to block waiting for that process to finish.
 

Roedy Green

You don't need to spin off multiple threads for that... Using
Runtime.exec already spins off a new process to execute the command in.
There's absolutely no reason to block waiting for that process to finish.

Even if there is, you don't necessarily have to have a thread sit and
wait for the exec. You could do something like have the spawned stuff
report back when done through some other queuing mechanism.

The key is to reuse threads and keep them busy. The only reason they
should sleep is when they are waiting a few milliseconds for a disk
I/O. If they have to sleep longer than that, think about keeping track
of their state some other way. There are natural breaking points in a
thread's life where its state can be very compactly described. It
can be put into hibernation, so to speak, as a simple object, and
instantly awakened by some other already-running thread.
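A rough sketch of that shape: Runtime.exec starts the external program,
and a small watcher thread (tiny compared to a full request thread)
reports the result back through a callback or queue when it finishes. The
names here are invented for illustration:

// AsyncExec.java -- run an external command without tying up a request thread.
import java.io.IOException;

interface ExecCallback {
    void done(int exitCode);             // called when the external program finishes
}

public class AsyncExec {
    public static void run(final String[] command, final ExecCallback callback)
            throws IOException {
        final Process proc = Runtime.getRuntime().exec(command);
        Thread watcher = new Thread(new Runnable() {
            public void run() {
                try {
                    callback.done(proc.waitFor());   // only this small thread blocks
                } catch (InterruptedException e) {
                    proc.destroy();                  // give up on the external program
                }
            }
        });
        watcher.setDaemon(true);
        watcher.start();
    }
}

This still costs one small thread per in-flight command, but it keeps the
request threads free and avoids polling.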
 

Joseph Dionne

Roedy said:
Even if there is, you don't necessarily have to have a thread sit and
wait for the exec. You could do something like have the spawned stuff
report back when done through some other queuing mechanism.

The key is to reuse threads and keep them busy. The only reason they
should sleep is when they are waiting a few milliseconds for a disk
I/O. If they have to sleep longer than that, think about keeping track
of their state some other way. There are natural breaking points in a
thread's life where its state can be very compactly described. It
can be put into hibernation, so to speak, as a simple object, and
instantly awakened by some other already-running thread.

One reason to block on the system call is to eliminate the polling that
would otherwise be required to see whether the external application has
completed, and it eliminates the need to maintain a list mapping sockets
to external-application results.
 

Sudsy

Roedy Green wrote:
Even if there is, you don't necessarily have to have a thread sit and
wait for the exec. You could do something like have the spawned stuff
report back when done through some other queuing mechanism.

The key is to reuse threads and keep them busy. The only reason they
should sleep is when they are waiting a few milliseconds for a disk
I/O. If they have to sleep longer than that, think about keeping track
of their state some other way. There are natural breaking points in a
thread's life where its state can be very compactly described. It
can be put into hibernation, so to speak, as a simple object, and
instantly awakened by some other already-running thread.

Of course if you only had a J2EE server...
Seriously, this is the sort of situation where you could justify
the complexity and expense. Stateless or even stateful session
EJBs would fit the bill. Let the container manage the details.
Why re-invent the wheel?
 

Joseph Dionne

Sudsy said:
Roedy Green wrote:



Of course if you only had a J2EE server...
Seriously, this is the sort of situation where you could justify
the complexity and expense. Stateless or even stateful session
EJBs would fit the bill. Let the container manage the details.
Why re-invent the wheel?

Because I have a frugal customer who is already not terribly excited
about paying my bill? The real world sucks!
 

Roedy Green

One reason to block on the system call is to eliminate the polling that
would otherwise be required to see whether the external application has
completed, and it eliminates the need to maintain a list mapping sockets
to external-application results.

Think in terms of how an HTTP server works. Requests come in on
various sockets, but usually a request is a stateless GET. There is no
need to remember the state of what that socket was doing previously.
There does not need to be a thread waiting for that socket. As soon
as the I/O completes, the packet can be put into a queue and serviced
by a pool of threads.

There is no polling involved. The hardware gives you an interrupt
when the I/O completes. The queue of packets to be processed can be
empty, but even then that can be handled with a synchronization lock;
no polling required.

There are so many things you could potentially do with sockets. We
don't yet know what he is up to, so all we can do is toss out
possibilities.
 
