Process vs Thread: what are the consequences?

Qu0ll

I have a server-based "engine" and I would like to know the best way to
maximise its effectiveness. In the general sense, if I wish to deploy
multiple instances of this engine to improve processing throughput, would it
be better to have each engine running as a fully-fledged OS process in its
own JVM or as a separate thread in some master process (and thus sharing a
JVM)? I know this isn't much detail to go on but I am just after some
conceptual information at this stage.

--
And loving it,

-Q
_________________________________________________
(e-mail address removed)
(Replace the "SixFour" with numbers to email me)
 
Daniel Pitts

Qu0ll said:
I have a server-based "engine" and I would like to know the best way to
maximise its effectiveness. In the general sense, if I wish to deploy
multiple instances of this engine to improve processing throughput,
would it be better to have each engine running as a fully-fledged OS
process in its own JVM or as a separate thread in some master process
(and thus sharing a JVM)? I know this isn't much detail to go on but I
am just after some conceptual information at this stage.

I would think the most effective way is to have the engine in one JVM,
using multiple threads, although that could depend on the JVM you use.
If memory serves, JVMs on Linux will actually create an OS level
sub-process for every Thread spawned.

Alternatively, you might consider going a step further. One Server
instance can have multiple threads, and you can have multiple servers
living on different physical devices. This assumes that you can
effectively parallelize your computations across these separate instances.

In any case, I think that there would be at least as much overhead, if
not more, to creating two JVM instances than one JVM instance with two
threads.

Your final solution is more likely to depend on the amount of
inter-process/thread communication your engine is going to need.
 
Arne Vajhøj

Qu0ll said:
I have a server-based "engine" and I would like to know the best way to
maximise its effectiveness. In the general sense, if I wish to deploy
multiple instances of this engine to improve processing throughput,
would it be better to have each engine running as a fully-fledged OS
process in its own JVM or as a separate thread in some master process
(and thus sharing a JVM)? I know this isn't much detail to go on but I
am just after some conceptual information at this stage.

A JVM has some overhead and threading is quite efficient in most
Java implementations.

I would go for threads.
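
A minimal sketch of the threads-in-one-JVM approach, assuming each engine can be expressed as an independent task. The class EnginePool and the toy engine() workload (summing 1..n) are hypothetical stand-ins, not anything from the original poster's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class EnginePool {

    // Hypothetical stand-in for one "engine" instance: sums the integers 1..n.
    static Callable<Long> engine(int n) {
        return () -> {
            long sum = 0;
            for (int i = 1; i <= n; i++) {
                sum += i;
            }
            return sum;
        };
    }

    // Runs one engine per input on a fixed pool of threads inside this JVM
    // and returns the results in input order.
    static List<Long> runAll(int threads, int... inputs) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int n : inputs) {
                futures.add(pool.submit(engine(n)));
            }
            List<Long> results = new ArrayList<>();
            for (Future<Long> f : futures) {
                results.add(f.get()); // blocks until that engine finishes
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int threads = Runtime.getRuntime().availableProcessors();
        System.out.println(runAll(threads, 10, 100, 1000)); // prints [55, 5050, 500500]
    }
}
```

Sizing the pool from availableProcessors() is only a starting point; the right number depends on how much each engine blocks on I/O.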

Arne
 
Qu0ll

I would think the most effective way is to have the engine in one JVM,
using multiple threads, although that could depend on the JVM you use. If
memory serves, JVMs on Linux will actually create an OS level sub-process
for every Thread spawned.

Alternatively, you might consider going a step further. One Server
instance can have multiple threads, and you can have multiple servers
living on different physical devices. This assumes that you can
effectively parallelize your computations across these separate instances.

In any case, I think that there would be at least as much overhead, if not
more, to creating two JVM instances than one JVM instance with two
threads.

Your final solution is more likely to depend on the amount of
inter-process/thread communication your engine is going to need.

Thanks Daniel. I prefer the one-JVM, multiple-thread solution as it
facilitates easy communication between the engines. And yes, I have
considered the possibility of deploying across multiple machines, and for
this I will probably use one of the 3rd-party clustering Java libraries.
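
To illustrate what "easy communication between the engines" can look like inside one JVM, here is a rough sketch in which one engine thread hands work to another through an in-memory queue. EnginePipeline, the DONE marker and the queue capacity of 16 are all hypothetical choices made for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class EnginePipeline {

    private static final int DONE = -1; // hypothetical end-of-stream marker

    // A producer "engine" hands `count` work items to a consumer "engine"
    // through a bounded in-memory queue; returns the consumer's total.
    static long run(int count) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(16);
        long[] total = new long[1];

        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= count; i++) {
                    queue.put(i); // blocks while the queue is full
                }
                queue.put(DONE);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                for (int item = queue.take(); item != DONE; item = queue.take()) {
                    total[0] += item;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join(); // after join, the consumer's writes are visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return total[0];
    }

    public static void main(String[] args) {
        System.out.println(run(100)); // prints 5050
    }
}
```

Between separate JVMs the same hand-off would need sockets, pipes or files plus serialization, which is the extra cost the replies above are weighing.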

 
Qu0ll

A JVM has some overhead and threading is quite efficient in most
Java implementations.

I would go for threads.

Agreed - thanks.

 
Lew

That depends on which JVM you use. The current crop of Sun JVMs use native
threads on Linux and Solaris. I don't know what they do on Windows or Macs.


Java supports threads natively. Multiple-process coordination is harder.
Programming for multiple processes that don't communicate is probably easier.

A properly-designed multi-threaded Java program is going to have better
control and coordination than a multi-process solution, and will have less
overhead.

A badly-designed multi-threaded Java program will give heartaches.

A single JVM is able to optimize a lot of things among threads that multiple
processes will not be able to. For example, a number of scenarios that
involve synchronization can actually optimize away the synchronization.
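
As one hedged example of synchronization that a JVM may be able to optimize away: the StringBuffer below is only ever reachable from the thread that created it, so a JIT compiler that performs escape analysis is free to drop the locking done by its synchronized append() calls. Whether a particular JVM actually does so is an assumption that depends on its version and settings; ElisionCandidate is a made-up class name.

```java
class ElisionCandidate {

    // StringBuffer.append() is a synchronized method, but `sb` never leaves
    // this method, so no other thread can ever contend for its lock.
    static String join(String a, String b) {
        StringBuffer sb = new StringBuffer();
        sb.append(a);
        sb.append(b);
        return sb.toString();
    }

    public static void main(String[] args) {
        String last = "";
        // Call the method often enough that a JIT compiler is likely to compile it.
        for (int i = 0; i < 1_000_000; i++) {
            last = join("foo", "bar");
        }
        System.out.println(last); // prints foobar
    }
}
```

Nothing comparable is available across process boundaries, where every lock has to go through the operating system or some other shared medium.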
 
Kenneth P. Turvey

On Mon, 12 Nov 2007 16:56:12 -0800, Daniel Pitts wrote:

[Snip]
If memory serves, JVMs on Linux will actually create an OS level
sub-process for every Thread spawned.
[Snip]

I should note that this isn't true anymore. At least it isn't true
anymore based on my recent experimentation using Java 1.6. The mapping
between native threads and green threads is not one-to-one anymore. I
went to some effort to get the JVM to use a specific number of native
threads for some calculations I was doing and found that it would ignore
the advice I gave it and decide when it was a good idea to use another
native thread on its own.

I must admit that the JVM probably makes better decisions about this than
I would, but it was still a bit annoying.
 
Lew

Kenneth said:
I should note that this isn't true anymore. At least it isn't true
anymore based on my recent experimentation using Java 1.6. The mapping
between native threads and green threads is not one-to-one anymore.

Many JVMs do not use green threads at all.

Kenneth said:
I went to some effort to get the JVM to use a specific number of native
threads for some calculations I was doing and found that it would ignore
the advice I gave it and decide when it was a good idea to use another
native thread on its own.

Many JVMs will just create a thread in the OS for each thread in the JVM.

Kenneth said:
I must admit that the JVM probably makes better decisions about this than
I would, but it was still a bit annoying.

Java VMs from Sun since 1.4.2 have not used green threads, at least not on
Linux with the Native POSIX Thread Library (NPTL).

"The NPTL approach keeps the 1-on-1 thread mapping (1 user or Java thread to 1 kernel thread),

As others have pointed out in this (newsgroup) thread, the implementation of
Java (execution) threads is JVM-dependent. Some JVMs map threads one-to-one
with pthreads or the platform equivalent, as quoted for Sun's 1.4.2 Red Hat
implementation; others do not. JVMs generally seem to be moving toward using
native threads 1-to-1 to Java threads in order to leverage the burgeoning
prevalence of multi-processor / multi-core / multi-(hardware)threaded computers.
 
Lew

Esmond said:
Was it ever true? 'ps' on Linux shows threads as well as processes, but
that's just 'ps'.

It used to be that Sun's Linux JVMs would create a process for each thread,
back before Java 1.4.2.

'ps' needs options to show threads, like -L or -T.
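
For anyone who wants to check the mapping on their own system, a small sketch along these lines starts a few named Java threads, prints the JVM's process ID and then waits so that 'ps -L -p <pid>' can be run from another terminal. ThreadSpotter and the 30-second pause are hypothetical choices, and ProcessHandle needs Java 9 or later, which is newer than the JVMs discussed here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

class ThreadSpotter {

    // Starts n threads that wait on the latch until it is released.
    static List<Thread> startParked(int n, CountDownLatch release) {
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            Thread t = new Thread(() -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "engine-" + i);
            t.start();
            threads.add(t);
        }
        return threads;
    }

    // Releases the latch and waits for every thread; true if all have terminated.
    static boolean releaseAndJoin(List<Thread> threads, CountDownLatch release) {
        release.countDown();
        boolean allDone = true;
        for (Thread t : threads) {
            try {
                t.join(5_000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            allDone &= !t.isAlive();
        }
        return allDone;
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        List<Thread> threads = startParked(4, release);
        // ProcessHandle requires Java 9 or later.
        System.out.println("pid " + ProcessHandle.current().pid());
        System.out.println("Inspect with: ps -L -p <pid>");
        Thread.sleep(30_000); // time to run ps from another terminal
        releaseAndJoin(threads, release);
    }
}
```

The thread list will also include the JVM's own housekeeping threads (garbage collector, JIT compiler and so on), so expect more entries than the four started here.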
 
Kenneth P. Turvey

Java VMs from Sun since 1.4.2 have not used green threads, at least not on
Linux with the Native POSIX Thread Library (NPTL).

<http://java.sun.com/developer/technicalArticles/JavaTechandLinux/RedHat/index.html>


Just based on some experimentation I was doing, this doesn't seem to be
true. I'm running Linux with the Sun JVM, and it didn't map each Java
thread to a native thread until the Java thread was spending enough time
executing. I was actually trying to get this mapping (1 to 1) and found
it impossible to guarantee under Linux with the Sun JVM.

Under Solaris there is the -XX:UseBoundThreads (or something similar) to
get that behavior, but under Linux no such option exists.

I will freely admit that my experiment could have been flawed, but it
wasn't behaving as if it was using more than a single native thread. I
suspect that the article above is out of date.
 
Esmond Pitt

Lew said:
It used to be that Sun's Linux JVMs would create a process for each
thread, back before Java 1.4.2.

Surely you mean a native thread, not process?
 
Lew

Kenneth said:
Just based on some experimentation I was doing, this doesn't seem to be
true. I'm running Linux with the Sun JVM, and it didn't map each Java
thread to a native thread until the Java thread was spending enough time
executing. I was actually trying to get this mapping (1 to 1) and found
it impossible to guarantee under Linux with the Sun JVM.

Under Solaris there is the -XX:UseBoundThreads (or something similar) to
get that behavior, but under Linux no such option exists.

I will freely admit that my experiment could have been flawed, but it
wasn't behaving as if it was using more than a single native thread. I
suspect that the article above is out of date.

I might tend to trust your results, given that my conclusions are based on
Sun's literature and yours are based on experience.

The people that said, "It depends on the JVM" probably gave the best advice.
 
Daniel Pitts

Esmond said:
Surely you mean a native thread, not process?

Well, native threads are just about the same as processes, aren't they?
They have a PID and all.
 
Esmond Pitt

Lew said:

Then you must be mistaken about your assertion. A separate process, and
therefore a new JVM, couldn't possibly implement Java thread semantics.
Surely this is just the 'ps' observation artefact?
 
Lew

Esmond said:
Then you must be mistaken about your assertion. A separate process, and
therefore a new JVM, couldn't possibly implement Java thread semantics.
Surely this is just the 'ps' observation artefact?

I believe that you are correct.
 
Esmond Pitt

Daniel said:
Well, native threads are just about the same as processes, aren't they?
They have a PID and all.

A process has memory, code, a memory map, a set of file and socket
descriptors, and a set of threads. A thread has a PC, registers, and a
stack of its own, and shares the memory, code, memory map, and set of FDs
of its process.

No, they're 'not just about the same'.
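
The difference can be seen from inside Java with a short sketch like the one below, where every thread keeps a private local counter on its own stack while all of them update a single shared object on the heap. SharedHeapDemo and its return layout are invented for the illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SharedHeapDemo {

    // Returns an array whose element 0 is the shared total and whose
    // elements 1..threads are the per-thread local counts.
    static int[] run(int threads, int perThread) {
        AtomicInteger shared = new AtomicInteger(); // one heap object, seen by all threads
        int[] result = new int[threads + 1];
        Thread[] workers = new Thread[threads];

        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                int local = 0; // lives on this thread's own stack
                for (int i = 0; i < perThread; i++) {
                    local++;
                    shared.incrementAndGet();
                }
                result[id + 1] = local; // each thread writes only its own slot
            });
            workers[t].start();
        }

        for (Thread w : workers) {
            try {
                w.join(); // after join, that worker's writes are visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        result[0] = shared.get();
        return result;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run(4, 1000)));
        // prints [4000, 1000, 1000, 1000, 1000]
    }
}
```

Two separate processes running the same code would each get their own copy of the shared counter, so the totals would never combine without explicit inter-process communication.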
 
Kenneth P. Turvey

Just based on some experimentation I was doing, this doesn't seem to be
true. I'm running Linux with the Sun JVM, and it didn't map each Java
thread to a native thread until the Java thread was spending enough time
executing. I was actually trying to get this mapping (1 to 1) and found
it impossible to guarantee under Linux with the Sun JVM.

Under Solaris there is the -XX:UseBoundThreads (or something similar) to
get that behavior, but under Linux no such option exists.

I will freely admit that my experiment could have been flawed, but it
wasn't behaving as if it was using more than a single native thread. I
suspect that the article above is out of date.

I hate to follow up my own post, but I've been looking at this problem
again and I'm really just unhappy with how it works. Since this can so
easily be solved under Solaris, and Lew (I think?) mentioned that this is
all JVM-dependent, I was hoping somebody could point me to a JVM that
runs under Linux and supports the -XX:UseBoundThreads option or something
similar. I want a 1:1 mapping between native threads and Java threads and
I just can't seem to get it.

Does anyone have any idea? (BTW, I checked IBM's JVM).
 
