string case clauses

Roedy Green · Sep 11, 2011

Has anyone looked at he code JavaC generates for the new String case
clauses? It is a simple linear search, a binary search, a HashMap?
--
Roedy Green Canadian Mind Products
http://mindprod.com
The modern conservative is engaged in one of man's oldest exercises in moral philosophy; that is,
the search for a superior moral justification for selfishness.
~ John Kenneth Galbraith (born: 1908-10-15 died: 2006-04-29 at age: 97)

Joshua Cranmer · Sep 11, 2011

Has anyone looked at he code JavaC generates for the new String case
clauses? It is a simple linear search, a binary search, a HashMap?

<http://blogs.sun.com/darcy/entry/project_coin_string_switch_break>
discusses the implementation a bit.

In short:
1. Switch on the hashcode.
2. Verify string equalities using a linear switch for all strings that
have the same hashcode.

Roedy Green · Sep 11, 2011

1. Switch on the hashcode.
2. Verify string equalities using a linear switch for all strings that
have the same hashcode.

ouch. That amounts to a linear search comparing hashcodes. So it is
not a good idea to replace HashMap lookups with String switches.

--
Roedy Green Canadian Mind Products
http://mindprod.com
The modern conservative is engaged in one of man's oldest exercises in moral philosophy; that is,
the search for a superior moral justification for selfishness.
~ John Kenneth Galbraith (born: 1908-10-15 died: 2006-04-29 at age: 97)

Arne Vajhøj · Sep 12, 2011

ouch. That amounts to a linear search comparing hashcodes. So it is
not a good idea to replace HashMap lookups with String switches.

Is anything different from int in pre-1.7?

Arne

Lew · Sep 12, 2011

So? How long is your switch statement? Your advice doesn't make sense.

I thought switches worked by compiled-in jumps to a case label. Was I mistaken? So the only linear search would be on the very rare hash collision, and typically no more than two deep.

Even if I'm mistaken, a linear search through a handful of cases is not going to take significantly longer than a HashMap lookup, and will avoid the overhead of creating all those objects and concomitant GC pressure. So calmdown your panic, big boy.

Is anything different from int in pre-1.7?

Not really. If your switch is so long that a linear search is a performance problem, you have deeper problems anyway.

Arne Vajhøj · Sep 12, 2011

Not really.

And I do not recall anyone complaining about switch on int, so
yet another non-problem.

Arne

Roedy Green · Sep 14, 2011

Huh? How so? It's actually basically the same algorithm a full-on hash
table would use. Specifically: hash value gets you to a very small
subset of the possible values (very often just one, as long as the hash
code algorithm is good), then linear search on collisions.

Before case strings, there were two JVM instructions for implementing
an int switch, tableswitch and lookupswitch. The efficient tableswitch
is a jump table used when the switch values are reasonably dense e.g.
numbered 0..N. The less efficient lookupswitch is a binary search or
linear search used when the switch values all over the map.

I understood the way string case labels worked posted earlier was to
use the hashCode as the switch variable, which would necessitate the
use of the inefficient lookupswitch plus a little kludge to deal with
hashCode collisions.

In contrast a String case lookup done manually with HashMap does not
involve any binary search or linear scan. It folds the hashCode to a
reasonable value and does a table lookup.

As usual, there is nothing in the JVM spec that says how lookupswitch
has to be implemented. For maximum speed (but extravagant RAM), the
JIT could analyse the values, and construct a perfect or near perfect
hash table. It has the advantage the values are known at load time and
cannot change. I have seen reference to a number of C algorithms for
doing this. The JIT has to generate thread-safe code. HashMap does
not.

--
Roedy Green Canadian Mind Products
http://mindprod.com
The modern conservative is engaged in one of man's oldest exercises in moral philosophy; that is,
the search for a superior moral justification for selfishness.
~ John Kenneth Galbraith (born: 1908-10-15 died: 2006-04-29 at age: 97)

Lew · Sep 15, 2011

Peter said:
Roedy said:

[...]
I understood the way string case labels worked posted earlier was to
use the hashCode as the switch variable, which would necessitate the
use of the inefficient lookupswitch plus a little kludge to deal with
hashCode collisions.

In contrast a String case lookup done manually with HashMap does not
involve any binary search or linear scan. It folds the hashCode to a
reasonable value and does a table lookup. [...]

Click to expand...

I agree with Patricia's assessment. There is no proof that the current
implementation of an integral switch would be less efficient than a full
hash table implementation, and in fact for relatively few values (as a
switch really ought to be), it's likely to be more efficient. At the
very least, it's difficult to easily get a small hash table without also
having collisions.

I see no reason to be concerned whatsoever about the performance
implications of the choice of implementation for string case labels in
Java. If and when you have a program where that turns out to be a
bottleneck, then of course you can look at alternatives. I doubt you'll
run into that situation though. And if you do, it's quite likely that
you should have been using a hash table structure for other reasons anyway.

These points or similar were made fairly early in this thread by a couple of people.

A 'switch' by any other name is still a 'goto'.

But at least it's forward only.

cropping images	13	Aug 19, 2011
Locale.getDefault bug in JDK 1.7	28	Aug 22, 2011
Fundamentals of Financial Management Concise 7e Brigham Houston	0	May 1, 2011
plz read it	2	Feb 21, 2008
comp.lang.vhdl FAQ part 1 of 4: general	0	Jul 8, 2003

string case clauses

Roedy Green

Joshua Cranmer

Roedy Green

Arne Vajhøj

Lew

Arne Vajhøj

Roedy Green

Lew

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads