Java 8 up to 6x slower than Java 7


Dennis Haupt

I wrote a little pathfinder. I can run it under JDK 7 update 51 and it is done in ~20 seconds. Under Java 8, and this has been the case for a long series of builds now, a certain part involving very deep recursion takes ~60 seconds instead of 10.
I suspected the HashMap change in Java 8 to be the culprit, since I perform a lot of put operations, but that's not it: I used the standard HashMap from Scala to be 100% sure whether the JDK HashMap is the reason or not, and indeed I get the same performance difference between Java 7 and 8 with the same HashMap.
My profiler (YourKit) shows that Java 8 spends much more time on garbage collection and allocates more objects. If I take a memory snapshot, I can see that the final results are equal, so I must have more temporary objects.
What is up with that?
All I do in the critical method is create ArrayLists, perform put, contains and get on a HashMap, and make the recursive call. The time is, according to YourKit, spent in the method itself, not inside the HashMap or ArrayList constructors.
Any ideas?
 

Jan Burse

Dennis said:
I wrote a little pathfinder. I can run it under JDK 7 update 51 and it is done in ~20 seconds. Under Java 8, and this has been the case for a long series of builds now, a certain part involving very deep recursion takes ~60 seconds instead of 10.
I suspected the HashMap change in Java 8 to be the culprit, since I perform a lot of put operations, but that's not it: I used the standard HashMap from Scala to be 100% sure whether the JDK HashMap is the reason or not, and indeed I get the same performance difference between Java 7 and 8 with the same HashMap.
My profiler (YourKit) shows that Java 8 spends much more time on garbage collection and allocates more objects. If I take a memory snapshot, I can see that the final results are equal, so I must have more temporary objects.
What is up with that?
All I do in the critical method is create ArrayLists, perform put, contains and get on a HashMap, and make the recursive call. The time is, according to YourKit, spent in the method itself, not inside the HashMap or ArrayList constructors.
Any ideas?

I found that JDK 8 is speedier than JDK 7, around 5%
better, which is nice. But my testing might not be
representative:

Jekejeke Prolog Runtime on JDK 8
https://plus.google.com/103259555581227445618/posts/Mjw49qhBWku

I didn't yet have a closer look at one of my apps. Usually
I also look at how the IDE behaves. But unfortunately IntelliJ
IDEA 13 refuses to run under JDK 8 as of now, and I haven't
yet figured out how to start it with JDK 8. Compiling for
JDK 8, on the other hand, works fine.

Bye
 

Jan Burse

Jan said:
I found that JDK 8 is speedier than JDK 7, around 5%
better, which is nice. But my testing might not be
representative:

But I somehow have the feeling that the JIT is slower.
It took me more iterations until I had a warm run
where the JIT had fully kicked in.

Bye
 

Stefan Ram

Dennis Haupt said:
Any ideas?

Maybe they rewrote parts of the library to use closures, and
closures can create objects behind the scenes while the source
code looks inconspicuous.
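
For example, in a sketch like the following (made up by me, not
code from the library), the capturing lambda may be allocated anew
each time its expression is evaluated, while the source looks
perfectly harmless:

import java.util.function.IntSupplier;

public class CaptureDemo {
    public static void main(String[] args) {
        // Non-capturing lambda: the JVM may hand out one cached instance.
        Runnable greeter = () -> System.out.println("hello");

        int x = 42;  // captured local variable
        // Capturing lambda: the object must carry a copy of x, so each
        // evaluation of this expression (e.g. inside a loop) may allocate
        // a fresh object behind the scenes.
        IntSupplier plusOne = () -> x + 1;

        greeter.run();
        System.out.println(plusOne.getAsInt());  // prints 43
    }
}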
 

Jan Burse

Dennis said:
I wrote a little pathfinder. I can run it under JDK 7 update 51 and it
is done in ~20 seconds. Under Java 8, and this has been the case for a
long series of builds now, a certain part involving very deep recursion
takes ~60 seconds instead of 10.
I suspected the HashMap change in Java 8 to be the culprit, since I
perform a lot of put operations, but that's not it: I used the standard
HashMap from Scala to be 100% sure whether the JDK HashMap is the
reason or not, and indeed I get the same performance difference between
Java 7 and 8 with the same HashMap.
My profiler (YourKit) shows that Java 8 spends much more time on
garbage collection and allocates more objects. If I take a memory
snapshot, I can see that the final results are equal, so I must have
more temporary objects.
What is up with that?
All I do in the critical method is create ArrayLists, perform put,
contains and get on a HashMap, and make the recursive call. The time
is, according to YourKit, spent in the method itself, not inside the
HashMap or ArrayList constructors.
Any ideas?

I must admit that I don't use the Java HashMap and have written
a custom version. The reason for this custom version was basically
to allow faster iteration without allocating an iterator. This
was already done before JDK 8 came out, independently of JDK 8.

But I now read that JDK 8 has introduced a new implementation
of HashMap. See for example:

http://blog.credera.com/technology-insights/java/java-8-part-3-hashmap-java-time/

So I understand that you could see dramatic negative changes.
Sorry to hear that.

I will check my code more closely now, and try to replace more
uses of HashMap with my custom code where appropriate.

Bye
 

Robert Klemme

Dennis Haupt said:
I wrote a little pathfinder. I can run it under JDK 7 update 51 and it
is done in ~20 seconds. Under Java 8, and this has been the case for a
long series of builds now, a certain part involving very deep recursion
takes ~60 seconds instead of 10.
I suspected the HashMap change in Java 8 to be the culprit, since I
perform a lot of put operations, but that's not it: I used the standard
HashMap from Scala to be 100% sure whether the JDK HashMap is the
reason or not, and indeed I get the same performance difference between
Java 7 and 8 with the same HashMap.
My profiler (YourKit) shows that Java 8 spends much more time on
garbage collection and allocates more objects. If I take a memory
snapshot, I can see that the final results are equal, so I must have
more temporary objects.
What is up with that?
All I do in the critical method is create ArrayLists, perform put,
contains and get on a HashMap, and make the recursive call. The time
is, according to YourKit, spent in the method itself, not inside the
HashMap or ArrayList constructors.
Any ideas?

Does your profiler not show allocation hotspots? Other than that it's
just wild speculation without seeing the code.

Cheers

robert
 

Stefan Ram

Dennis Haupt said:
All I do in the critical method is create ArrayLists,
perform put, contains and get on a HashMap, and make the
recursive call. The time is, according to YourKit, spent
in the method itself, not inside the HashMap or ArrayList
constructors.

When writing my previous reply, I had not yet read your post
to the end. If it's not in the library, it must be in the JVM.
Possibly, they've added some checks for security reasons?
Or possibly, the profiler is not adapted to Java 8 and shows
spurious results? Maybe they have added some »optimizations«
that often make code faster but sometimes make it slower.

Maybe they changed the default optimization settings for the
JVM. You might try to hand-craft the JVM options. For example,
there are options like:

-XX:+UseFastAccessorMethods
-XX:+AggressiveOpts
-XX:CompileThreshold=1
 

Roedy Green

Dennis Haupt said:
Any ideas?

What did you configure for the memory pool size in each?

A benchmark should "warm up" for a while before it starts timing.
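
For example (a rough sketch; the workload and the iteration counts
are plucked out of the air):

public class WarmBench {
    // Stand-in for the code under test.
    static long workload() {
        long sum = 0;
        for (int i = 0; i < 1_000_000; i++) sum += i;
        return sum;
    }

    public static void main(String[] args) {
        long sink = 0;

        // Warm-up phase: give the JIT time to compile the hot path.
        for (int i = 0; i < 1_000; i++) sink += workload();

        // Timed phase.
        long start = System.nanoTime();
        for (int i = 0; i < 1_000; i++) sink += workload();
        long elapsed = System.nanoTime() - start;

        // Print the sink so the JIT cannot discard the work entirely.
        System.out.printf("%.1f ms (sink=%d)%n", elapsed / 1e6, sink);
    }
}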

They may differ when they take time out to generate machine code, or
how much time they spend optimising it.

Java 8 is not released yet. It may contain all kinds of debug code
that will be stripped for release.
 

Robert Klemme

Jan Burse said:
I must admit that I don't use the Java HashMap and have written
a custom version. The reason for this custom version was basically
to allow faster iteration without allocating an iterator. This
was already done before JDK 8 came out, independently of JDK 8.

I also once wrote a Map using double hashing. The reason at the time
was to avoid the many Entry instances. But we only replaced certain
uses of HashMap which we knew would create and GC lots of Entry
instances and were rarely iterated (Entry instances were created
during iteration only, which was the downside).
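
To sketch the idea (this is not our production code, just a minimal
variant with made-up names: fixed power-of-two capacity, no resize,
no remove, and the table must never fill up):

// Open-addressing map with double hashing: keys and values live in
// parallel arrays, so no Entry objects are ever allocated.
final class CompactMap<K, V> {
    private final Object[] keys;
    private final Object[] vals;
    private final int mask;  // capacity - 1, capacity a power of two

    CompactMap(int capacityPowerOfTwo) {
        keys = new Object[capacityPowerOfTwo];
        vals = new Object[capacityPowerOfTwo];
        mask = capacityPowerOfTwo - 1;
    }

    // Second hash, forced odd so the probe step is coprime with the
    // power-of-two capacity and the sequence visits every slot.
    private static int step(int h) {
        return (h >>> 16) | 1;
    }

    void put(K key, V val) {
        int h = key.hashCode();
        int i = h & mask;
        int s = step(h);
        while (keys[i] != null && !keys[i].equals(key)) {
            i = (i + s) & mask;  // probe the next slot
        }
        keys[i] = key;
        vals[i] = val;
    }

    @SuppressWarnings("unchecked")
    V get(K key) {
        int h = key.hashCode();
        int i = h & mask;
        int s = step(h);
        while (keys[i] != null) {
            if (keys[i].equals(key)) {
                return (V) vals[i];
            }
            i = (i + s) & mask;
        }
        return null;
    }
}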
Jan said:
But I now read that JDK 8 has introduced a new implementation
of HashMap. See for example:

http://blog.credera.com/technology-insights/java/java-8-part-3-hashmap-java-time/

How is this supposed to work for keys which do not implement Comparable?
Or are they using the hash code only for the tree arrangement? Also,
the article does not mention that the load factor limits the average
length of the bucket chains.

It's also not mentioned in the article how they create the tree. You
can use an object per node but you can sometimes also use an array.
Jan said:
So I understand that you could see dramatic negative changes.
Sorry to hear that.

I am surprised that they make such a dramatic change to a standard
library class. In the past, Sun/Oracle always seemed to try to
minimize the impact on existing programs.
Jan said:
I will check my code more closely now, and try to replace more
uses of HashMap with my custom code where appropriate.

I would first do some analysis to determine how long the chains
get on average before going there.

Kind regards

robert
 

Jan Burse

Robert said:
Also, the article does not mention that the load factor limits the
average length of the bucket chains.

Yes, that is also my idea. I guess the problem comes down
to bad hashing in the objects that are used as keys, or
hash codes that keep colliding even when the table grows.

I wonder why the web server or any of the business cases
that need that improvement do not implement a combination
of HashMap and TreeMap themselves, so as to leave
HashMap a more primitive construct.

The source code of the new HashMap is found in the
early access distribution of JDK 8. It seems that
they use a constant threshold:

static final int MIN_TREEIFY_CAPACITY = 64;

And then there is a lot of copying and creating new
objects going on:

final void treeify(Node<K,V>[] tab) {

And the tree seems to be sorted first by the hash and
then by compareTo(), whereby they use reflection to
figure out whether the key class implements the
generic Comparable interface.

I guess the above can lead to performance degradation,
especially in border cases, for example when one
unintentionally runs into the treeification but then
doesn't make much use of it.
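
For illustration, here is a made-up key class with a degenerate
hash code; with enough of these keys every entry lands in the same
bin, the bin gets treeified (TREEIFY_THRESHOLD is 8 in the sources),
and since the key is not Comparable the tree code falls back to the
reflection checks and tie-breaking mentioned above:

import java.util.HashMap;
import java.util.Map;

public class TreeifyDemo {
    // Hypothetical key with a deliberately constant hash code:
    // every instance collides into the same bin.
    static final class BadKey {
        final int id;
        BadKey(int id) { this.id = id; }
        @Override public int hashCode() { return 1; }
        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }
    }

    public static void main(String[] args) {
        Map<BadKey, Integer> map = new HashMap<>();
        // Once the table has grown to MIN_TREEIFY_CAPACITY (64) and the
        // bin holds enough entries, the bin is converted to a tree.
        for (int i = 0; i < 100; i++) {
            map.put(new BadKey(i), i);
        }
        System.out.println(map.size());  // 100
    }
}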

Maybe for non-border cases the new HashMap is OK,
probably with a bigger memory footprint. I made some
application tests over the last days: one use case is
faster, and two use cases are slower.

For the slower use cases I am planning to use my custom
HashMap. The custom HashMap has some further advantages:
besides sparing the iterator, I also don't do mod
counting.

One trick to develop a custom HashMap is to take the
JDK source code and create a stripped-down variant.
I'm not sure whether copyright really allows this, so
let's say I didn't mention that here.

Bye
 

Jan Burse

Jan said:
For the slower use cases I am planning to use my custom
HashMap. The custom HashMap has some further advantages:
besides sparing the iterator, I also don't do mod
counting.

Last but not least, one feature I am constantly missing
from HashMap is that it should shrink back in size.
HashMap only grows and grows, and never
automatically shrinks.

Jan said:
One trick to develop a custom HashMap is to take the
JDK source code and create a stripped-down variant.
I'm not sure whether copyright really allows this, so
let's say I didn't mention that here.

The shrinking I didn't find in the JDK source code,
so this part is my own implementation. I use a lower
watermark of 1/4 of the capacity and shrink to 1/2.
Works pretty well so far.
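
Sketched as a wrapper around a plain map (java.util.HashMap doesn't
expose its capacity, so this made-up version tracks the peak size
instead; in my real code I work on the table directly):

import java.util.HashMap;
import java.util.Map;

// Sketch of the watermark policy: once the map drops below a
// quarter of its peak size, copy into a fresh, right-sized map.
final class ShrinkingMap<K, V> {
    private Map<K, V> inner = new HashMap<>();
    private int peak = 0;  // largest size seen since the last shrink

    void put(K k, V v) {
        inner.put(k, v);
        peak = Math.max(peak, inner.size());
    }

    V get(K k) { return inner.get(k); }

    V remove(K k) {
        V v = inner.remove(k);
        // Lower watermark 1/4: compact by copying; the copy's table
        // is sized for the remaining entries only.
        if (peak >= 8 && inner.size() < peak / 4) {
            inner = new HashMap<>(inner);
            peak = inner.size();
        }
        return v;
    }
}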
 

Robert Klemme

Jan Burse said:
Last but not least, one feature I am constantly missing
from HashMap is that it should shrink back in size.
HashMap only grows and grows, and never
automatically shrinks.

Usually there is a good reason not to do this: if a long-lived
collection reaches a particular size at some point in time, chances are
that it will do so again in the future. Shrinking will cause allocation
overhead and, at a later point, another allocation for the larger size,
which carries a higher risk of slow GC because of memory fragmentation.
Also, if a long-lived collection resizes internally, then at least for
some GC cycles there is a reference from the old gen to the new gen.

For short-lived objects, additional allocations to save memory are even
worse. There it's usually best to just grow.

For the long-lived case you can still manually shrink by creating a new
collection, if the nature of your application is such that the
collection temporarily grows huge and afterwards shrinks again.
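
For example, the HashMap copy constructor already gives you a table
sized for the current contents (names here are illustrative):

import java.util.HashMap;
import java.util.Map;

class ShrinkByCopy {
    static <K, V> Map<K, V> shrink(Map<K, V> map) {
        // The copy is sized for the entries actually present; the
        // old, oversized table becomes garbage.
        return new HashMap<>(map);
    }
}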

Kind regards

robert
 

Kevin McMurtrie

Jan Burse said:
I must admit that I don't use the Java HashMap and have written
a custom version. The reason for this custom version was basically
to allow faster iteration without allocating an iterator. This
was already done before JDK 8 came out, independently of JDK 8.

But I now read that JDK 8 has introduced a new implementation
of HashMap. See for example:

http://blog.credera.com/technology-insights/java/java-8-part-3-hashmap-java-time/

So I understand that you could see dramatic negative changes.
Sorry to hear that.

I will check my code more closely now, and try to replace more
uses of HashMap with my custom code where appropriate.

Bye


Wow, that sounds like an intense optimization, though I can't find the
source code online to see it. Much like the String change, it's going
to have cases where performance becomes extremely bad.

As for profiling: don't trust it for small bits of code. Sampling can
only see safepoints, and instrumentation breaks local optimizations. My
favorite glitch is that Integer.hash(), which contains no code, turns up
as a top hot spot in some of my code. It's actually the caller eating
CPU time, but that's not seen by any profiler.
 

Jan Burse

Hi,

Robert said:
For the long-lived case you can still manually shrink by creating a new
collection, if the nature of your application is such that the
collection temporarily grows huge and afterwards shrinks again.

You get this effect, and can control it, through an appropriate
choice of the lower watermark, without putting special compaction
points into your code. The overhead is only a compare instruction,
besides the compaction itself.

It should be noted that you are right that the compaction is not
necessary. It even makes the hashing less good, since you
have fewer bins. Normally a hash map that only grows will
nevertheless have shorter bin chains when elements are
removed again.

The main reason I opted for automatic compaction of my hash maps was
that my application has myriads of hash maps. Well, not myriads,
but a lot of hash maps. They are used for some just-in-time indexing,
and are even recursively built, i.e. hash values that are again hash
maps, and so on.

So the requirement was really to have very lightweight hash
maps for this data structure, even if there is a small time overhead.
The problem was that the application crashed without compaction, since
memory consumption was too high.

Bye
 

Jan Burse

Jan said:
So the requirement was really to have very lightweight hash
maps for this data structure, even if there is a small time overhead.
The problem was that the application crashed without compaction, since
memory consumption was too high.

The requirement is that the following code should
eat up O(m) space and not O(m*n) space:

HashMap[] maps = new HashMap[m];
for (int i = 0; i < m; i++) {
    maps[i] = new HashMap();
}
for (int i = 0; i < m; i++) {
    for (int j = 0; j < n; j++) {
        maps[i].put(j, j);
    }
    for (int j = 0; j < n; j++) {
        maps[i].remove(j);
    }
}
 

Robert Klemme

Jan said:
So the requirement was really to have very lightweight hash
maps for this data structure, even if there is a small time overhead.
The problem was that the application crashed without compaction, since
memory consumption was too high.

The requirement is that the following code should
eat up O(m) space and not O(m*n) space:

HashMap[] maps = new HashMap[m];
for (int i = 0; i < m; i++) {
    maps[i] = new HashMap();
}
for (int i = 0; i < m; i++) {
    for (int j = 0; j < n; j++) {
        maps[i].put(j, j);
    }
    for (int j = 0; j < n; j++) {
        maps[i].remove(j);
    }
}


What is n here? Is that some arbitrary input figure otherwise unrelated
to the shown code?

A small change which would help this code and avoid too much overhead
is to set the table to null once the map is empty. Since all your maps
are empty at the end, this would help dramatically.
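
Since java.util.HashMap doesn't let you null out its table from the
outside, the same effect can be sketched with a thin wrapper (the
class and method names here are made up):

import java.util.HashMap;
import java.util.Map;

// Sketch: drop the backing map entirely once it is empty, so an
// empty map costs only one null reference instead of a full table.
final class LazyMap<K, V> {
    private Map<K, V> inner;  // null while empty

    void put(K k, V v) {
        if (inner == null) inner = new HashMap<>();
        inner.put(k, v);
    }

    V get(K k) {
        return inner == null ? null : inner.get(k);
    }

    V remove(K k) {
        if (inner == null) return null;
        V v = inner.remove(k);
        if (inner.isEmpty()) inner = null;  // release the table
        return v;
    }
}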

Cheers

robert
 

Jan Burse

Wanja said:
I'd say it's hard to help Jan, if he doesn't provide some real code with
some "real" (or generated) data to chew on.

Better help Dennis Haupt, the original poster of the
thread, not me.

I myself am a hopeless case anyway. I just discovered
ConcurrentHashMap in JDK 8. Very interesting: it doesn't
use a lock on the outside, but I counted nine
synchronized statements inside the code.

Still thinking about what the implications of
ConcurrentHashMap could be for my further development.

Also, I am stuck with the following problem:
http://stackoverflow.com/questions/...ique-objects-for-names-with-concurrent-delete

Bye
 

Joerg Meier

Jan Burse said:
I myself am a hopeless case anyway. I just discovered
ConcurrentHashMap in JDK 8. Very interesting: it doesn't
use a lock on the outside, but I counted nine
synchronized statements inside the code.

From
<http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ConcurrentHashMap.html>

Since:
1.5

Not exactly new ;)

As for your problem, I didn't quite understand your problem description,
but just in case you aren't aware, you don't have to synchronize on the
object you deal with in the block - you can synchronize on anything you
want, and many people use a helper field like

private final Object $LOCK = new Object();

Kind regards,
Joerg
 

Jan Burse

Hi,

It seems to me that the ConcurrentHashMap of JDK 8
adopts the tree structure for large bins already
found in the HashMap of JDK 8. Namely I find:

ConcurrentHashMap JDK 5:
No method treeifyBin found in the source code.
(Related to this thread, i.e. is JDK 8 slower?)
No method computeIfAbsent found in the source code.
(Not directly related, but also interesting.)

ConcurrentHashMap JDK 8:
A method treeifyBin found in the source code.
(Related to this thread, i.e. is JDK 8 slower?)
A method computeIfAbsent found in the source code.
(Not directly related, but also interesting;
see also:
http://stackoverflow.com/questions/19278443/how-do-i-use-the-new-computeifabsent-function)
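
For example, computeIfAbsent folds the usual check-then-put into one
atomic call, without any outer lock (a minimal made-up example):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ComputeIfAbsentDemo {
    private static final Map<String, Integer> cache = new ConcurrentHashMap<>();

    // Hypothetical expensive computation.
    private static Integer load(String key) {
        return key.length();
    }

    public static void main(String[] args) {
        // Atomic check-then-insert: the mapping function runs at most
        // once per key, with no outer synchronized block needed.
        Integer v = cache.computeIfAbsent("hello", ComputeIfAbsentDemo::load);
        System.out.println(v);  // 5
    }
}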

Since ConcurrentHashMap makes such effective use of inner
locks, I would destroy all this efficiency if I
used an outer lock.

This is my concern if I were to use the concurrent
hash map as follows:

synchronized (map) {
    /* do more than one operation on map,
       atomically */
}

This concern also holds if I use an extra lock,
for example:

public final Object lock = new Object();

synchronized (lock) {
    /* do more than one operation on map,
       atomically */
}

The only further difference with the extra lock, rather a
drawback, is that more memory is used.

But nevertheless the pattern of an extra lock
is often seen, for various reasons.

Bye
 

Lew

private final Object $LOCK = new Object();

That violates the naming conventions in both the Code Conventions document and the JLS:

http://docs.oracle.com/javase/specs/jls/se7/html/jls-3.html#jls-3.8
"The $ character should be used only in mechanically generated source code or,
rarely, to access pre-existing names on legacy systems."

http://www.oracle.com/technetwork/java/javase/documentation/codeconventions-135099.html#367
"Variable names should not start with underscore _ or dollar sign $ characters,
even though both are allowed."
 
