JVM optimization


ilkinulas

Hi,
The two functions test1 and test2 do the same thing, but test2 performs nearly 20 times better than test1. The JVM is unable to optimize the code in test1. Is there a way to tell the Java virtual machine to do this kind of optimization at compile time or runtime?
NOTE: we are using Log4J and Java version "1.4.2_02".
private static final Logger log = Logger.getLogger(DebugTest.class);
private static final Logger log = Logger.getLogger(DebugTest.class);

--------------------------------------------------------------------------

public void test1() {
    for (int i = 0; i < 10000000; i++) {
        String s = "test" + i;
        if (log.isDebugEnabled()) {
            log.debug(s);
        }
    }
}

public void test2() {
    for (int i = 0; i < 10000000; i++) {
        if (log.isDebugEnabled()) {
            String s = "test" + i;
            log.debug(s);
        }
    }
}
--------------------------------------------------------------------------
 

bugbear

ilkinulas said:
The two functions test1 and test2 do the same thing, but test2 performs nearly 20 times better than test1.

What - even if log.isDebugEnabled() returns true?

The compiler has no way of knowing how "likely", let alone how
constant, the result of log.isDebugEnabled() is.

The reason *you* can optimise this is that you have
knowledge the compiler does not.

BugBear
 

Chris Uppal

bugbear said:
The compiler has no way of knowing how "likely", let alone how
constant, the result of log.isDebugEnabled() is.

The reason *you* can optimise this is because you have
knowledge the compiler does not.

This isn't entirely correct, although it's probably close enough for the
OP's purposes.

The runtime JITer /could/, in theory, analyse log.isDebugEnabled() and
determine that it would always return false (e.g. if it inlined it to an access
of a static boolean field that was declared final, or which it could "see" was
never written to). In that case, and if it could further determine that "s"
was not used elsewhere, and that the StringBuilder manipulations involved in
"test"+i had no side-effects, then it would be justified in removing that code.

The "server" JVM from Sun is certainly capable of performing that /kind/ of
optimisation. I don't know whether it would actually do so in this particular
case.

Note, BTW, that if this code was compiled with a version of javac before 1.5,
or compiled for a pre-1.5 platform, then "test"+i would be compiled into
StringBuffer manipulation, rather than StringBuilder. In that case it is /not/
true that "test"+i has no side-effects (since it involves crossing a
synchronisation barrier), so I wouldn't expect the JITer to be able to remove
it (unless it was buggy, or "knew" that synchronisation barriers didn't matter
for the particular code it generated for the particular machine it was running
on).

To the OP: in general there is no way of telling the runtime that you want it
to perform any particular optimisation -- other than doing it yourself by
changing the code. Any particular JVM implementation /may/ have options to do
so, but I don't know of any, and in any case it would be extremely obscure, and
probably unsupported.

-- chris
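[Editorial aside] One removal the toolchain is guaranteed to perform is for a compile-time constant guard: when a boolean flag is a static final compile-time constant, javac itself omits the guarded block from the generated bytecode (the JLS "conditional compilation" idiom). The catch is that such a constant cannot be flipped at runtime the way Log4J's debug level can, so this is a sketch of the idiom rather than a drop-in replacement; the class and field names are illustrative.

```java
public class ConstantGuard {
    // Compile-time constant: javac drops code guarded by "if (DEBUG)"
    // from the generated bytecode entirely when DEBUG is false.
    static final boolean DEBUG = false;

    static String lastMessage = null;

    static void work(int i) {
        if (DEBUG) {
            // With DEBUG == false this statement is not even present
            // in the compiled class file.
            lastMessage = "test" + i;
        }
    }

    public static void main(String[] args) {
        work(42);
        System.out.println(lastMessage); // null: the guarded code never ran
    }
}
```

Unlike log.isDebugEnabled(), which the JIT must treat as a value that can change, DEBUG is folded away before the JVM ever sees it.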
 

Tim Tyler

ilkinulas said:
The two functions test1 and test2 do the same thing, but test2 performs nearly 20 times better than test1. Is there a way to tell the Java virtual machine to do this kind of optimization at compile time or runtime?

There are a number of class file optimisers available that optimise
after compilation.

http://www.geocities.com/marcoschmidt.geo/java-class-file-optimizers.html

...has a list.

Whether any of them will deal with your example, I don't know -
and the answer may depend on what log.isDebugEnabled() actually does.
 

ilkinulas

We assume that log.isDebugEnabled() returns "false" every time.

If the tests are executed under JVM version 1.5, then test1 and test2
give almost the same results.
Is it possible for a virtual machine to prepare the String s just
before it needs it? I mean, "test"+i is calculated not at line 3 but just
before log.debug(s), if debug is enabled.
1 public void test1() {
2     for (int i = 0; i < 10000000; i++) {
3         String s = "test" + i;
4         if (log.isDebugEnabled()) {
5
6             log.debug(s);
7         }
8     }
9 }
 

Lee Fesperman

Chris said:
Note, BTW, that if this code was compiled with a version of javac before 1.5,
or compiled for a pre-1.5 platform, then "test"+i would be compiled into
StringBuffer manipulation, rather than StringBuilder. In that case it is /not/
true that "test"+i has no side-effects (since it involves crossing a
synchronisation barrier), so I wouldn't expect the JITer to be able to remove
it (unless it was buggy, or "knew" that synchronisation barriers didn't matter
for the particular code it generated for the particular machine it was running
on).

A smart JIT could know that the StringBuffer object was local and that its
reference wasn't passed to an external method. I would guess a JIT would do
'variable' usage analysis of this type. The machine architecture wouldn't
matter. I know of one JIT that discovers whether a reference's lifetime is
local and allocates the object on the stack.
 

Kevin McMurtrie

ilkinulas said:
The two functions test1 and test2 do the same thing, but test2 performs nearly 20 times better than test1. The JVM is unable to optimize the code in test1.


Do you mean that log.isDebugEnabled() returns false? If so, it's entirely
your fault that test1 is slower.

This:

    String s = "test" + i;

compiles to:

    String s = new StringBuffer("test").append(i).toString();

Lots of code is hidden in such a simple expression. It is beyond the
scope of the compiler to determine whether or not there are side effects
in all of that. It cannot omit its execution simply because the result
is not used.
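[Editorial aside] That desugaring can be written out by hand and checked against the literal: both forms produce the same String, which makes the hidden StringBuffer allocation (and its synchronized append/toString calls) explicit. Method names here are illustrative.

```java
public class ConcatDesugar {
    // Roughly what a pre-1.5 javac emits for: String s = "test" + i;
    static String desugared(int i) {
        return new StringBuffer("test").append(i).toString();
    }

    // The literal concatenation the source code actually contains.
    static String literal(int i) {
        return "test" + i;
    }

    public static void main(String[] args) {
        // Identical results; the cost difference is all in the hidden
        // object allocation and synchronized method calls.
        System.out.println(desugared(7));                    // test7
        System.out.println(literal(7).equals(desugared(7))); // true
    }
}
```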
 

ilkinulas

If I have a method for logging like this:

public void debug(String s) {
    if (log.isDebugEnabled()) {
        log.debug(s);
    }
}

I would like to use the method "debug" in this way:

debug("test" + someVariable);

The String s is constructed before checking if debug is enabled. If debug
is not enabled, there is no need to concatenate "test" + someVariable.
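[Editorial aside] On a 1.4-era JVM there are no lambdas, but the concatenation can still be deferred by handing the logger a small callback object, so the string is built only inside the isDebugEnabled() check. The trade-off is that the callback object itself is still allocated at the call site. Everything here is an illustrative sketch, not Log4J API.

```java
public class DeferredDebug {
    // Callback that defers building the message until it is needed.
    interface LazyMessage {
        String build();
    }

    static boolean debugEnabled = false; // stand-in for log.isDebugEnabled()
    static int buildCount = 0;           // how many messages were actually built

    static void debug(LazyMessage m) {
        if (debugEnabled) {
            // The concatenation happens only on this path.
            System.out.println(m.build());
        }
    }

    public static void main(String[] args) {
        final int someVariable = 5;
        debug(new LazyMessage() {
            public String build() {
                buildCount++;
                return "test" + someVariable;
            }
        });
        // With debug disabled, build() was never called.
        System.out.println(buildCount); // 0
    }
}
```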
 

Thomas Schodt

ilkinulas said:
If debug is not enabled, there is no need to concatenate "test" + someVariable.

You could do something like:

// covers the scalar primitives: byte, char, short, int, long
public void debug(String s, long l) {
    if (!log.isDebugEnabled()) return;
    log.debug(s + l);
}

public void debug(String s, boolean b) {
    if (!log.isDebugEnabled()) return;
    log.debug(s + b);
}

// covers the rest - might even cover primitives in 1.5?
public void debug(String s, Object o) {
    if (!log.isDebugEnabled()) return;
    log.debug(s + o.toString());
}

You can add as many debug() variants as you care to.
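[Editorial aside] A quick way to convince yourself that the overloads really defer the work is a stub with a counter in place of the real logger (all names illustrative): the caller passes the pieces separately, and with debug disabled no concatenation ever happens.

```java
public class OverloadDemo {
    static boolean debugEnabled = false; // stand-in for log.isDebugEnabled()
    static int concatenations = 0;       // counts messages actually built

    // Overload for scalar primitives, in the style suggested above.
    static void debug(String s, long l) {
        if (!debugEnabled) return;
        concatenations++;
        System.out.println(s + l);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 10; i++) {
            debug("test", i); // no string is built at the call site
        }
        System.out.println(concatenations); // 0 while debug is disabled
    }
}
```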
 

Chris Uppal

Lee said:
A smart JIT could know that the StringBuffer object was local and its
reference wasn't passed to an external method.

Doesn't make any difference -- entering or leaving a synchronised block has a
/global/ effect, and therefore cannot be optimised away.

(Unless, as I said, the JITer knows that the "global effect" is in fact zero
for that particular machine architecture and code-generation strategy -- which
may be the case, but which is not true in general.)

-- chris
 

Lee Fesperman

Chris said:
Doesn't make any difference -- entering or leaving a synchronised block has a
/global/ effect, and therefore cannot be optimised away.

You lost me here. How does synchronizing on a truly 'local' object have a global effect?
No other thread could possibly synchronize on the object. As I mentioned, the object
could even be on the stack.
 

Mark Thornton

Lee said:
You lost me here. How does synchronizing on a truly 'local' object have a global effect?
No other thread could possibly synchronize on the object. As I mentioned, the object
could even be on the stack.

Synchronizing causes any values 'cached' in thread local memory to be
written to main memory. It also invalidates any values previously read
from main memory --- they have to be reread in case they have changed.
This effect is not limited to the object used for synchronization.

Mark Thornton
 

Lee Fesperman

Mark said:
Synchronizing causes any values 'cached' in thread local memory to be
written to main memory. It also invalidates any values previously read
from main memory --- they have to be reread in case they have changed.
This effect is not limited to the object used for synchronization.

Sure, but how does optimizing that away invalidate the correctness of the execution
here? The memory barrier would only be important for values in the object being
synchronized, which no other thread could see or affect.

In the case we're discussing, the 1.5 compiler actually does optimize it away ;^)
 

Mark Thornton

Lee said:
Sure, but how does optimizing that away invalidate the correctness of the execution
here? The memory barrier would only be important for values in the object being
synchronized, which no other thread could see or affect.

This is not true. Memory barriers affect all values, otherwise the
common technique of synchronizing on a 'lock' object (which is often
just an instance of Object) wouldn't be valid.
Lee said:
In the case we're discussing, the 1.5 compiler actually does optimize it away ;^)

The spec doesn't require the use of StringBuffer to implement string
concatenation, so it is legitimate for it to be replaced by StringBuilder.
So any code relying on the use of StringBuffer (and its synchronization
is the only effect that might be visible) would be invalid. However, if
the use of StringBuffer had remained, it would be extremely difficult
for a JIT to correctly remove it, as it would have to prove that the
memory barrier was not required.

Mark Thornton
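[Editorial aside] The "lock object" technique mentioned above can be sketched as follows: the lock is a bare Object, yet synchronizing on it publishes a write to an unrelated shared field. (In this toy example the join() would already guarantee visibility on its own; the synchronized blocks are there to show the idiom.)

```java
public class LockObjectDemo {
    // A plain Object used only as a lock; the field it guards is unrelated to it.
    static final Object lock = new Object();
    static int sharedValue = 0;

    static int publish() throws InterruptedException {
        Thread writer = new Thread(new Runnable() {
            public void run() {
                synchronized (lock) {
                    sharedValue = 42; // flushed to main memory at block exit
                }
            }
        });
        writer.start();
        writer.join();
        synchronized (lock) {         // re-read from main memory at block entry
            return sharedValue;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(publish()); // 42
    }
}
```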
 

John C. Bollinger

Lee said:
Mark Thornton wrote:
[...]
Synchronizing causes any values 'cached' in thread local memory to be
written to main memory. It also invalidates any values previously read
from main memory --- they have to be reread in case they have changed.
This effect is not limited to the object used for synchronization.


Sure, but how does optimizing that away invalidate the correctness of the execution
here? The memory barrier would only be important for values in the object being
synchronized, which no other thread could see or affect.

You misunderstand. After the initial memory barrier (at entry to the
synchronized block) the thread must reload from main memory *every*
variable whose value it subsequently wants to use. Before the _second_
memory barrier (at exit from the synchronized block) the thread is
obliged to write to main memory *all* externally-visible variables that
it has modified since their load. The identity of the object
synchronized is completely irrelevant: synchronized(new Object()) {} is
exactly the same as synchronized(this) {} or synchronized(anythingElse)
{} in this regard. This may affect shared variables used by methods
further down the stack frame, which may belong to different classes, so
it is impossible to determine at compile time that it is safe to remove
the barrier.

The VM, with the whole program to work on, has a better chance of being
able to determine whether the memory barrier can be removed. It may
well still be that the barrier _cannot_ be removed without altering
program semantics, however, and in any case a non-trivial analysis is
required to make the determination.
Lee said:
In the case we're discussing, the 1.5 compiler actually does optimize it away ;^)

But in 1.5 a StringBuilder is used to implement the concatenation instead
of a StringBuffer (as I understand it); one of the key advantages of the
former is that it does not have an internal memory barrier.
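[Editorial aside] That difference is easy to confirm with reflection: in the Sun/Oracle JDK, StringBuffer's append methods are declared synchronized while StringBuilder's are not.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

public class SyncCheck {
    // Reports whether a class's public append(String) method is declared synchronized.
    static boolean appendIsSynchronized(Class<?> c) throws NoSuchMethodException {
        Method m = c.getMethod("append", String.class);
        return Modifier.isSynchronized(m.getModifiers());
    }

    public static void main(String[] args) throws NoSuchMethodException {
        System.out.println(appendIsSynchronized(StringBuffer.class));  // true
        System.out.println(appendIsSynchronized(StringBuilder.class)); // false
    }
}
```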
 

Lee Fesperman

Mark said:
This is not true. Memory barriers affect all values, otherwise the
common technique of synchronizing on a 'lock' object (which is often
just an instance of Object) wouldn't be valid.

Ok, I wasn't considering that side effect.
Mark said:
The spec doesn't require the use of StringBuffer to implement string
concatenation, so it is legitimate for it to be replaced by StringBuilder.
So any code relying on the use of StringBuffer (and its synchronization
is the only effect that might be visible) would be invalid. However, if
the use of StringBuffer had remained, it would be extremely difficult
for a JIT to correctly remove it, as it would have to prove that the
memory barrier was not required.

Right; because of the nature of bytecode, the JIT couldn't know whether this
was an explicit or an implicit use of concatenation, so it couldn't tell
whether it was being used specifically for this side effect. It would have
to prove there was nothing affected by the memory barrier.
 

Lee Fesperman

Lee said:
Ok, I wasn't considering that side effect.

Oops, I'm going to have to take that back! I shoulda trusted my original intuition ;^)

Yes, the normal memory barrier does do this. However, it is inappropriate to
depend on those effects outside of the synchronized section. The only variables
referenced in the synchronized section we were discussing (those in the 'local'
copy of the StringBuffer) can never be accessed externally. It would be
incorrect to assume the synchronization affects any other external variables.
Besides, depending on this effect for variables *outside* the synchronization
would be non-deterministic ... a 'was the variable changed before the
synchronization or after?' kind of situation.

I can envision a JIT that would determine all external variables accessed while
synchronized on a specific reference and would only apply the memory barrier to those.
IMO, that type of optimization should be allowed.
 

John C. Bollinger

Lee said:
Oops, I'm going to have to take that back! I shoulda trusted my original intuition ;^)

Yes, the normal memory barrier does do this. However, it is inappropriate to depend on
those effects outside of the synchronized section. The only variables referenced in the
synchronized section we were discussing (those in the 'local' copy of StringBuffer) can
never be accessed externally.

The only question of variable scope that is relevant is whether a
variable is local (to a method) or not (in which case it is "shared").
Whether or not a shared variable is referenced in the synchronized
section does not affect the fact that the thread needs to reload it from
main memory the next time its value is used after the barrier, whether
that happens to occur inside the synchronized section or not. This can
result in the thread "seeing" a different value for *any* shared
variable after the barrier than it did before the barrier.
It would be incorrect to assume the synchronization
affects any other external variables.

Yes, but more to the point, it is also incorrect to assume that the
synchronization *does not* affect any particular shared variable.
That's why the compiler cannot remove it, and why it's non-trivial for
the JIT to remove it.
Besides, depending this effect on variables
*outside* the synchronization would be non-deterministic ... a 'was the variable changed
before the synchronization or after?' kind of situation.

Multi-threaded programming *is* nondeterministic. The point of
synchronization is to constrain the possible global sequence of events,
but you cannot make it totally deterministic and retain "simultaneous"
execution. Chapter 17 of the JLS (second edition) is entirely devoted
to this topic.
I can envision a JIT that would determine all external variables accessed while
synchronized on a specific reference and would only apply the memory barrier to those.
IMO, that type of optimization should be allowed.

That would not be sufficient to comply with the language's requirements.
The JIT conceivably could perform an analysis proving that it could
remove the barrier, but it would be considerably more complicated than
you suggest, as it would involve visibility and use analysis for all
variables visible to the thread at the barrier, relative to *all other
live threads*.

Lee, you seem to have some ideas about synchronization that are
inconsistent with the Java specs (or perhaps I'm totally wrong, but
either way ...). I suggest you read over JLS(2e).17 to see whether you
think it supports your position.
 

Chris Uppal

John said:
The JIT conceivably could perform an analysis proving that it could
remove the barrier, but it would be considerably more complicated than
you suggest, as it would involve visibility and use analysis for all
variables visible to the thread at the barrier, relative to *all other
live threads*.

Or, as I suggested above but didn't expand on, the JIT might know that the
synchronisation barriers had no effect.

If it was running on a machine/architecture/mode where the hardware provided
full synchronisation between the memory seen by different processors (e.g. a
single processor box ;-) then there'd be no need to worry about hardware
synchronisation. If it also knew that its own code-generation strategy didn't
involve caching "global" data in thread-local store (such as the threads'
stacks) then it would not need to worry about flushing that data into main
store. I suppose it would still have to ensure that registers were flushed
back to the stack/main-memory, but that can be achieved with a purely local
analysis.

-- chris
 
