string case clauses

R

Roedy Green

Has anyone looked at he code JavaC generates for the new String case
clauses? It is a simple linear search, a binary search, a HashMap?
--
Roedy Green Canadian Mind Products
http://mindprod.com
The modern conservative is engaged in one of man's oldest exercises in moral philosophy; that is,
the search for a superior moral justification for selfishness.
~ John Kenneth Galbraith (born: 1908-10-15 died: 2006-04-29 at age: 97)
 
R

Roedy Green

1. Switch on the hashcode.
2. Verify string equalities using a linear switch for all strings that
have the same hashcode.

ouch. That amounts to a linear search comparing hashcodes. So it is
not a good idea to replace HashMap lookups with String switches.

--
Roedy Green Canadian Mind Products
http://mindprod.com
The modern conservative is engaged in one of man's oldest exercises in moral philosophy; that is,
the search for a superior moral justification for selfishness.
~ John Kenneth Galbraith (born: 1908-10-15 died: 2006-04-29 at age: 97)
 
A

Arne Vajhøj

ouch. That amounts to a linear search comparing hashcodes. So it is
not a good idea to replace HashMap lookups with String switches.

Is anything different from int in pre-1.7?

Arne
 
L

Lew

So? How long is your switch statement? Your advice doesn't make sense.

I thought switches worked by compiled-in jumps to a case label. Was I mistaken? So the only linear search would be on the very rare hash collision, and typically no more than two deep.

Even if I'm mistaken, a linear search through a handful of cases is not going to take significantly longer than a HashMap lookup, and will avoid the overhead of creating all those objects and concomitant GC pressure. So calmdown your panic, big boy.
Is anything different from int in pre-1.7?

Not really. If your switch is so long that a linear search is a performance problem, you have deeper problems anyway.
 
R

Roedy Green

Huh? How so? It's actually basically the same algorithm a full-on hash
table would use. Specifically: hash value gets you to a very small
subset of the possible values (very often just one, as long as the hash
code algorithm is good), then linear search on collisions.

Before case strings, there were two JVM instructions for implementing
an int switch, tableswitch and lookupswitch. The efficient tableswitch
is a jump table used when the switch values are reasonably dense e.g.
numbered 0..N. The less efficient lookupswitch is a binary search or
linear search used when the switch values all over the map.

I understood the way string case labels worked posted earlier was to
use the hashCode as the switch variable, which would necessitate the
use of the inefficient lookupswitch plus a little kludge to deal with
hashCode collisions.

In contrast a String case lookup done manually with HashMap does not
involve any binary search or linear scan. It folds the hashCode to a
reasonable value and does a table lookup.

As usual, there is nothing in the JVM spec that says how lookupswitch
has to be implemented. For maximum speed (but extravagant RAM), the
JIT could analyse the values, and construct a perfect or near perfect
hash table. It has the advantage the values are known at load time and
cannot change. I have seen reference to a number of C algorithms for
doing this. The JIT has to generate thread-safe code. HashMap does
not.

--
Roedy Green Canadian Mind Products
http://mindprod.com
The modern conservative is engaged in one of man's oldest exercises in moral philosophy; that is,
the search for a superior moral justification for selfishness.
~ John Kenneth Galbraith (born: 1908-10-15 died: 2006-04-29 at age: 97)
 
L

Lew

Peter said:
Roedy said:
[...]
I understood the way string case labels worked posted earlier was to
use the hashCode as the switch variable, which would necessitate the
use of the inefficient lookupswitch plus a little kludge to deal with
hashCode collisions.

In contrast a String case lookup done manually with HashMap does not
involve any binary search or linear scan. It folds the hashCode to a
reasonable value and does a table lookup. [...]

I agree with Patricia's assessment. There is no proof that the current
implementation of an integral switch would be less efficient than a full
hash table implementation, and in fact for relatively few values (as a
switch really ought to be), it's likely to be more efficient. At the
very least, it's difficult to easily get a small hash table without also
having collisions.

I see no reason to be concerned whatsoever about the performance
implications of the choice of implementation for string case labels in
Java. If and when you have a program where that turns out to be a
bottleneck, then of course you can look at alternatives. I doubt you'll
run into that situation though. And if you do, it's quite likely that
you should have been using a hash table structure for other reasons anyway.

These points or similar were made fairly early in this thread by a couple of people.

A 'switch' by any other name is still a 'goto'.

But at least it's forward only.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,755
Messages
2,569,539
Members
45,024
Latest member
ARDU_PROgrammER

Latest Threads

Top