Using Enumerated Types as Array Indexes


Arne Vajhøj

No, in general.

The original post was:

# In Java array indexes are
# int and you have byte/short/int/long types and that is it.

and it should have been:

# In Java array indexes are
# int and you have byte/short/int/long integer types and that is it.

Arne
 

Eric Sosman

I'd be shocked if an int actually overflowed when counting requests like
this, but that's all I can think of. 2^31 is such a huge number, I
didn't think you could get there in practical cases.

He *did* write "two months of uninterrupted [...] usage." Two
months ~= 60 days ~= 5 megaseconds, so a 2G counter overflows if it
increments at ~400 Hz or more, give or take a few backs of envelopes.
Assuming that the AtomicInteger did overflow, then n % y produces
negative numbers if n is negative, a clear AIOOB.

Try : arr[ Math.abs( n ) % arr.length ];

Bzzzzt! Thank you for playing, go back to the rear of the line.
While awaiting your next opportunity, ponder: What is the sign of
Math.abs(Integer.MIN_VALUE)?
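For the record, the answer to Eric's puzzle: Integer.MIN_VALUE has no positive counterpart in 32-bit two's complement, so Math.abs returns it unchanged, and the "fixed" expression can still produce a negative index. A minimal demonstration (the divisor 7 is arbitrary):

```java
public class AbsGotcha {
    public static void main(String[] args) {
        // 2^31 is not representable as an int, so negating MIN_VALUE wraps
        // right back to MIN_VALUE and Math.abs stays negative.
        System.out.println(Math.abs(Integer.MIN_VALUE));     // prints -2147483648
        System.out.println(Math.abs(Integer.MIN_VALUE) % 7); // prints -2
    }
}
```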
 

Arne Vajhøj

Sure -- but if Enums had been in the language from the beginning, then
it might have occurred to people that array indexes could be "any type
with a method 'int ordinal()'" (with the appropriate caveat about being
in range 0..length-1)

Non-int array indexes existed in many languages before Java, so the
designers knew about them, but decided to keep it simple.

Arne
 

Arne Vajhøj

No it didn't. The Ada language has been extended and updated regularly;
a new standard is due to appear in 2012. Several compilers are
available, one of the best being gnat, which is a part of gcc and is
freely available on many platforms. See libre.adacore.com and
http://en.wikipedia.org/wiki/Ada_(programming_language).


The number of people working with Ada is certainly smaller than for
Java, but far from zero. Even the USENET group comp.lang.ada is quite
active.

Ada is not completely dead.

But its usage has declined to a small fraction of what it once was.

If you look for Ada jobs at job sites there are very few.

A shame that it ended up so, but it did.

Arne
 

markspace

and it should have been:

# In Java array indexes are
# int and you have byte/short/int/long integer types and that is it.


OK, you still left char out. char is an integer type. And I personally
don't feel that long is correct -- it's an integer type but it has
potential loss of precision. Normally you should use int, since that
has all the precision you will need.

That's all I'm saying.
 

Roedy Green

Java is a pretty handy language in its own right. But in Ada one
could define arrays to be indexed by enumerated types. Can Java do
that? If not, why not?

because you can use a[e.ordinal()]

The general rule has been any language feature that can be kludged
with sufficient verbiage is not considered for inclusion.
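A sketch of the kludge Roedy mentions, next to the library alternative Java eventually shipped (the Day enum is made up for illustration):

```java
import java.util.EnumMap;

public class EnumIndexed {
    enum Day { MON, TUE, WED }

    public static void main(String[] args) {
        // The "sufficient verbiage" version: index a plain array by ordinal()
        int[] hits = new int[Day.values().length];
        hits[Day.TUE.ordinal()]++;

        // The supported alternative: EnumMap, backed internally by an array
        EnumMap<Day, Integer> hitsByDay = new EnumMap<>(Day.class);
        hitsByDay.merge(Day.TUE, 1, Integer::sum);

        System.out.println(hits[Day.TUE.ordinal()] + " " + hitsByDay.get(Day.TUE));
    }
}
```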

The focus was in the early days completely on the JVM. The language
itself was considered a kludge for generating byte code.

I got a chance to talk at length with Bill Joy, one of the designers of
Java. One of the things he explained was that various licensees of the
JVM would revolt if too many changes were made for a new version of
Java. This led the designers to use implementations that could be
handled with whatever was already present in the JVM.

In Java 7 there was some sugar: underscores in literals, binary
literals, String case labels; so perhaps we could see enumerated-type
arrays and hash arrays some day.
 

Robert Klemme

Andreas Leitgeb wrote:
Lew wrote:
JLS 10: [...] IOW, you never index arrays with a byte, char or
short. [...]
I'm feeling slow this morning: how do we get problems with
byte->int, short->int, or char->int again?

One gets problems with byte->int and short->int with array indexes
the way one gets into trouble with those widening conversions
generally. The usual suspect is the lack of unsigned versions, so
widening a (byte)0xA0, for example, would result in a negative
index.

One gets into trouble generally in programming when one thinks one
thing is going on ("index takes a byte") whilst ignoring what's
really going on (there's a widening conversion involved). You might
get away with it most of the time, but occasionally such things trip
you up.

We had a nice issue recently which exactly fits this bill: after over
two months of uninterrupted, completely error-free production usage the
application suddenly stopped working, giving weird error messages (there
was an issue with the error reporting as well, but let's ignore that for
the moment). It turns out this was the setup:

1. There was an AtomicInteger initialized at startup with 0 which for
every request coming into that system was incremented in a thread safe
manner (incrementAndGet()).

2. The result was used to index into an array with the quite obvious
"arr[n % arr.length]".

3. (left as exercise for the reader)

Anybody who now thinks all is fine should stop coding Java immediately
and go reading the language spec.

Yes, but what has this error to do with widening conversions,
or conversions of any kind? It's GIGO, pure and simple.

It has to do with "One gets into trouble generally in programming when
one thinks one thing is going on [...] whilst ignoring what's really
going on [...]." And of course it has to do with a specialty of int
math in Java which is probably considered too rarely.
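The setup in points 1 and 2 can be reduced to a few lines; seeding the counter near Integer.MAX_VALUE shows the failure without waiting two months (the array size is illustrative):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CounterOverflow {
    public static void main(String[] args) {
        Object[] arr = new Object[10];
        AtomicInteger counter = new AtomicInteger(Integer.MAX_VALUE - 1);

        int n = counter.incrementAndGet();  // n == Integer.MAX_VALUE, still fine
        System.out.println(n % arr.length); // prints 7

        n = counter.incrementAndGet();      // wraps silently to Integer.MIN_VALUE
        System.out.println(n % arr.length); // prints -8: as an index, AIOOBE
    }
}
```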

Kind regards

robert
 

Niklas Holsti

Arne said:
Ada is not completely dead.

But its usage has declined to a small fraction of what it once was.

True when considering the proportion of Ada usage versus all other
programming languages, because of the huge expansion in internet and web
applications that rarely use Ada. But I doubt that Ada usage in absolute
terms has declined so radically.

It is difficult to find reliable figures on language usage, but Ada has
been in the "top 20" category on the TIOBE Programming Community index
for several years, with ups and downs in its rating. See
http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html.

Whatever the current level of usage, my main point is that Ada is
*alive* in the sense that good compilers are readily available and the
language is steadily evolving to provide modern features while remaining
true to its original design goals.

(Apologies for promoting Ada in a Java group, but I felt I had to
protest against the OP's claims of deadness, even if these claims were
incidental to the OP's question.)
 

Arved Sandstrom

Andreas said:
Arved Sandstrom said:
Lew wrote:
JLS 10: [...]
IOW, you never index arrays with a byte, char or short. [...]
I'm feeling slow this morning: how do we get problems with byte->int,
short->int, or char->int again?

One gets problems with byte->int and short->int with array indexes the way one gets into trouble with those widening conversions generally. The usual suspect is the lack of unsigned versions, so widening a (byte)0xA0, for example, would result in a negative index.

One gets into trouble generally in programming when one thinks one thing is going on ("index takes a byte") whilst ignoring what's really going on (there's a widening conversion involved). You might get away with it most of the time, but occasionally such things trip you up.

Why think about it imprecisely? Tell us the advantage of that, please.
array-indexing takes integers. That you can also throw byte/short/char
values at them without cast doesn't mean that arrays "take" those, but
only that such values get automatically converted to int before being
taken by the array for indexing.

An example for something really "taking" byte-values:
foo(byte b) { ... }
I'll try my answer again, it seems not to have hit the airwaves.

I don't see my imprecision: the JLS very clearly says that integral
widening primitive conversions result in *no* loss of information; this
is just common sense when you consider what's going on with char->int,
byte-int and short->int.

I reiterate: the _JLS_ says that there is *no* loss of information. I
don't see why you believe the JLS is wrong. Exactly how are you getting
into trouble with integral widening primitive conversions?

Furthermore, your assertion about indexing into an array with (byte)0x0A
(or any other legal byte value for that matter) is just plain incorrect,
and can be disproved in a minute or two. Indexing into any array with
(byte)0x0A returns the 11th element of a (sufficiently large) array.

AHS
 

RedGrittyBrick

Andreas said:
Lew wrote:
JLS 10: [...]
IOW, you never index arrays with a byte, char or short. [...]
I'm feeling slow this morning: how do we get problems with byte->int,
short->int, or char->int again?

One gets problems with byte->int and short->int with array indexes the way one gets into trouble with those widening conversions generally. The usual suspect is the lack of unsigned versions, so widening a (byte)0xA0, for example, would result in a negative index.

One gets into trouble generally in programming when one thinks one thing is going on ("index takes a byte") whilst ignoring what's really going on (there's a widening conversion involved). You might get away with it most of the time, but occasionally such things trip you up.

Why think about it imprecisely? Tell us the advantage of that, please.
array-indexing takes integers. That you can also throw byte/short/char
values at them without cast doesn't mean that arrays "take" those, but
only that such values get automatically converted to int before being
taken by the array for indexing.

An example for something really "taking" byte-values:
foo(byte b) { ... }
I'll try my answer again, it seems not to have hit the airwaves.

I don't see my imprecision: the JLS very clearly says that integral
widening primitive conversions result in *no* loss of information; this
is just common sense when you consider what's going on with char->int,
byte->int and short->int.

I reiterate: the _JLS_ says that there is *no* loss of information. I
don't see why you believe the JLS is wrong. Exactly how are you getting
into trouble with integral widening primitive conversions?

Furthermore, your assertion about indexing into an array with (byte)0x0A
(or any other legal byte value for that matter) is just plain incorrect,
and can be disproved in a minute or two. Indexing into any array with
(byte)0x0A returns the 11th element of a (sufficiently large) array.

I think you misread Lew, he said (byte)0xA0 (not your 0x0A) doesn't get
you the 161st element of a (sufficiently large) array.
 

markspace

On 8/17/2011 4:48 PM, Andreas Leitgeb wrote:
...
Rather: arr[ (n%arr.length + arr.length) % arr.length ]

I believe the root cause of the problem may have been thinking of % as a
modulo operator when it is really the remainder operator. Has any
thought been given to create Math.mod, for both int and long, with
implementation as above?


I think I suggested a modulo operator %% some time ago. It would be
handy. The derivation of the formula above escapes me.
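For what it's worth, the proposed Math.mod is a one-liner over the formula above; Java 8 later added exactly this behavior (for a positive divisor) as Math.floorMod(int, int):

```java
public class Mod {
    // (n % m + m) % m: the first % may be negative but lies in (-m, m);
    // adding m shifts it into (0, 2m), and the second % lands in [0, m-1].
    static int mod(int n, int m) {
        return (n % m + m) % m;
    }

    public static void main(String[] args) {
        System.out.println(-1 % 3);     // prints -1: remainder, not modulo
        System.out.println(mod(-1, 3)); // prints 2
        System.out.println(mod(7, 3));  // prints 1
    }
}
```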
 

Lew

RedGrittyBrick said:
Arved said:
Lew said:
Andreas Leitgeb wrote:
Lew wrote:
JLS 10: [...]
IOW, you never index arrays with a byte, char or short. [...]
I'm feeling slow this morning: how do we get problems with byte->int,
short->int, or char->int again?

One gets problems with byte->int and short->int with array indexes the way one gets into trouble with those widening conversions generally. The usual suspect is the lack of unsigned versions, so widening a (byte)0xA0, for example, would result in a negative index.

One gets into trouble generally in programming when one thinks one thing is going on ("index takes a byte") whilst ignoring what's really going on (there's a widening conversion involved). You might get away with it most of the time, but occasionally such things trip you up.

Why think about it imprecisely? Tell us the advantage of that, please.

array-indexing takes integers. That you can also throw byte/short/char
values at them without cast doesn't mean that arrays "take" those, but
only that such values get automatically converted to int before being
taken by the array for indexing.

An example for something really "taking" byte-values:
foo(byte b) { ... }
I'll try my answer again, it seems not to have hit the airwaves.

I don't see my imprecision: the JLS very clearly says that integral
widening primitive conversions result in *no* loss of information; this

The imprecision is that array indexes do not take byte, short, char or long indexes, only int, so saying that they do is imprecise. Ignoring that there is a widening conversion is imprecise.

The precise statement is that array indexes take an int value. Providing a different type results in a conversion. I did not speak to loss of information.

Who said that I believe the JLS is wrong?

Already answered that.

Not my assertion.
I think you misread Lew, he said (byte)0xA0 (not your 0x0A) doesn't get
you the 161st element of a (sufficiently large) array.

He also failed to answer my question: What is the advantage of thinking imprecisely in this matter?

Stating (let alone wrongly) that there is no disadvantage is not the same as stating that there is an advantage.
 

Andreas Leitgeb

markspace said:
On 8/17/2011 4:48 PM, Andreas Leitgeb wrote:
...
Rather: arr[ (n%arr.length + arr.length) % arr.length ]

I believe the root cause of the problem may have been thinking of % as a
modulo operator when it is really the remainder operator. Has any
thought been given to create Math.mod, for both int and long, with
implementation as above?

I think I suggested a modulo operator %% some time ago. It would be
handy. The derivation of the formula above escapes me.

Actually, now that I review it, it suffers a weakness: if arr.length
exceeds 2^30, then the formula may suffer from wraparound for certain
values of n.

Apart from that, it is a dirty hack to avoid conditionals. It relies on
the fact that the remainder operation for a positive divisor N "almost
does the right thing": it just picks a representative of each
modulo-class that is too small by N whenever the dividend was negative
and not a multiple of N. Adding N to the remainder pushes the result
into the range [1 .. 2*N-1] while preserving the modulo-class. Then,
redoing the remainder op yields the expected range [0 .. N-1].
Alternatively, one could check the first remainder for its sign, and
add N only if it is negative. That would also avoid the possible
overflow.
 

markspace

Apart from that, it is a dirty hack to avoid conditionals.


One thing to keep in mind here I think is the relative execution time
for each operation:

1. % and / are division. Some CPUs execute this as quickly as addition
and subtraction; some require a large number of CPU cycles.

2. Conditionals tend to be a subtraction, followed by a branch. For a
simple branch like this:

(n < 0) ? -n : n

This might be implemented like this:

CMPZ n ;; these are not byte codes, just pseudo-x86
JLT +2
NEG n
... ;; result of JLT

In other words, there's little opportunity for a pipeline stall here.
The instruction after NEG n is executed regardless, and should be in the
pipe no matter what. I'm pretty sure that most CPUs will deal with
executing one conditional instruction like NEG easily (most will compute
the result and then just not use it if the conditional JLT is taken).

So preferring modulus to ?: or Math.abs() might be a misoptimization.
 

Andreas Leitgeb

markspace said:
Apart from that, it is a dirty hack to avoid conditionals.

One thing to keep in mind here I think is the relative execution time
for each operation: [...]

So preferring modulus to ?: or Math.abs() might be a misoptimization.

Math.abs() does an entirely wrong job for the original context.

Now, let's say N is the array's length, and n the result
from the latest increment:

( n % N + N ) % N
versus
( n % N < 0 ) ? n % N + N : n % N
or with a helper var r:
( ( r = n % N ) < 0 ) ? r + N : r

PS: I didn't previously say *why* I avoided conditionals.
 

Arved Sandstrom

[ SNIP ]

Fair cop, I am semi-dyslexic I guess. I don't think it's a particularly
valid point, though: arguing that someone might screw up and not know
the valid range for bytes or shorts, for example, while not arguing that
this numerically-challenged person will also have problems with ints, is
somewhat disingenuous.

Or is it just that there's so much more room with int that a sloppy
person is much less likely to have problems? That approach doesn't fill
me with confidence.

Furthermore, once this hypothetical coder starts getting negative array
indexes the odds are pretty decent that they may figure out what the
issue is.
He also failed to answer my question: What is the advantage of thinking imprecisely in this matter?

Why did you ask the question? I wasn't thinking imprecisely. I know that
byte/short/char values will get widened when used as array indexes.
Personally I'll also ensure that these values are valid
bytes/shorts/chars...just like I would do for int.

Where's the imprecise thinking? My original reply didn't mention
anything whatsoever about array indexes _not_ being ints: in fact I
started talking about widening conversions right away. If your original
byte/short/char value is valid, then you've got no problems at all. If
you want to suppose that people are using invalid values, well, hell,
that's a bigger problem anyhow.
Stating (let alone wrongly) that there is no disadvantage is not the same as stating that there is an advantage.
I agree.

AHS
 

markspace

markspace said:
Apart from that, it is a dirty hack to avoid conditionals.

One thing to keep in mind here I think is the relative execution time
for each operation: [...]

So preferring modulus to ?: or Math.abs() might be a misoptimization.

Math.abs() does an entirely wrong job for the original context.

Now, let's say N is the array's length, and n the result
from the latest increment:

( n % N + N ) % N
versus
( n % N < 0 ) ? n % N + N : n % N
or with a helper var r:
( ( r = n % N ) < 0 ) ? r + N : r

PS: I didn't previously say *why* I avoided conditionals.


Yeah, your requirements are unspecified. I was thinking:

arr[ Math.abs( n % N ) ]

Would at least prevent overflow, although I don't think it would provide
a strictly incrementing index for arr.

Or you could just strip off the sign bit:

arr[ (n&0x7FFFFFFF) % N ]

That gets rid of those pesky negative numbers right quick. Though I'm
still not convinced that this is "better."
 

Andreas Leitgeb

markspace said:
Math.abs() does an entirely wrong job for the original context.
[...]

Yeah, your requirements are unspecified. I was thinking:
arr[ Math.abs( n % N ) ]
Would at least prevent overflow, although I don't think it would provide
a strictly incrementing index for arr.

The situation wasn't mine. It seemed to stem from someone's
requirement to use the slots of an array in an LRU-way (least
recently used), which boils down to just cycle through them.
Or you could just strip off the sign bit:
arr[ (n&0x7FFFFFFF) % N ]

If N is a power of 2, then you get by with just masking with (N-1).
Otherwise, the array will not always be cyclically used, anyway.
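The power-of-two case in a line, for reference (N must be a power of two for the mask to equal the modulo):

```java
public class MaskIndex {
    public static void main(String[] args) {
        final int N = 8;                 // power of two
        int n = -3;                      // even a negative counter is safe
        // n & (N - 1) keeps just the low bits, always landing in [0, N-1]
        System.out.println(n & (N - 1)); // prints 5, same as -3 mod 8
    }
}
```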
 

Robert Klemme

I'd be shocked if an int actually overflowed when counting requests like
this, but that's all I can think of. 2^31 is such a huge number, I
didn't think you could get there in practical cases.

He *did* write "two months of uninterrupted [...] usage." Two
months ~= 60 days ~= 5 megaseconds, so a 2G counter overflows if it
increments at ~400 Hz or more, give or take a few backs of envelopes.

Exactly. A good example where it pays off to do the basic math instead
of relying on gut feelings. :)
Assuming that the AtomicInteger did overflow, then n % y produces
negative numbers if n is negative, a clear AIOOB.

Try : arr[ Math.abs( n ) % arr.length ];

Bzzzzt! Thank you for playing, go back to the rear of the line.
While awaiting your next opportunity, ponder: What is the sign of
Math.abs(Integer.MIN_VALUE)?

All solutions to fix remainder into modulo but keep the wrap around
which have been proposed so far share a common disadvantage: if the
number range is not a multiple of the divisor the value distribution
will not be uniform. Example:

number range: [0,3]
divisor: 3

value sequence (n : n % 3):
0 : 0
1 : 1
2 : 2
3 : 0
wrap to 0

resulting distribution:
0 : 50%
1 : 25%
2 : 25%

Now, with Integer.MAX_VALUE and realistic divisors, which are usually
much smaller than Integer.MAX_VALUE, the effects are not so dramatic,
but the repetition of a single value (0 in the example) might cause
trouble for a round-robin scheme - even if only once every two months
(in other installations that would mean once every two weeks or even
less).

The solution I used to fix this employed a do-while loop and
compareAndSet() to reset the value to the beginning of the range (0 in
this case) in a thread-safe manner.

http://download.oracle.com/javase/6...ic/AtomicInteger.html#compareAndSet(int, int)
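A sketch of how such a fix might look (illustrative; Robert's actual code is not shown in the thread). The CAS loop wraps the counter to 0 at the top of the range, so every slot is visited equally often:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CyclicCounter {
    private final AtomicInteger counter = new AtomicInteger(0);
    private final int limit;

    CyclicCounter(int limit) { this.limit = limit; }

    // Thread-safe increment that wraps to 0 instead of overflowing int
    int next() {
        int cur, next;
        do {
            cur = counter.get();
            next = (cur + 1) % limit;
        } while (!counter.compareAndSet(cur, next));
        return next;
    }

    public static void main(String[] args) {
        CyclicCounter c = new CyclicCounter(3);
        for (int i = 0; i < 5; i++) {
            System.out.print(c.next() + " "); // prints 1 2 0 1 2
        }
    }
}
```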

Kind regards

robert
 
