NegativeArraySizeException ... IndexOutOfBoundsException ...

  • Thread starter Albretch Mueller

Albretch Mueller

~
Java only uses int array indexes, which is giving me a hard time since I
need to index values over Integer.MAX_VALUE. The size of the data
array itself is not a problem at all, and I am using a BitSet.
~
I need indexes greater than Integer.MAX_VALUE in connection with
dynamic sieves to calculate prime numbers. You are going to get bitten by
these exceptions if you go as far as:
~
( 2 * 3 * 5 * 7 * 11 * 13 * 17 * 19 * 23 * 29 ) = 6469693230
~
That number is risibly small but you cannot use it as an index in
Java ...
~
How do you deal with such issues?
~
Thanks
lbrtchx
 

Arne Vajhøj

Albretch Mueller said:

~
Java only uses int array indexes, which is giving me a hard time since I
need to index values over Integer.MAX_VALUE. The size of the data
array itself is not a problem at all, and I am using a BitSet.
~
I need indexes greater than Integer.MAX_VALUE in connection with
dynamic sieves to calculate prime numbers. You are going to get bitten by
these exceptions if you go as far as:
~
( 2 * 3 * 5 * 7 * 11 * 13 * 17 * 19 * 23 * 29 ) = 6469693230
~
That number is risibly small but you cannot use it as an index in
Java ...
~
How do you deal with such issues?
~

There is no really good solution. They made a mistake in the
design 15 years ago by not making array indexes long.

The most obvious workaround must be to segment the
data in multiple arrays.

Arne
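The segmentation Arne suggests can be sketched as a long-indexed array backed by smaller segments. Everything here (the class name, the 2^30-element segment size) is my own illustration, not code from the thread:

```java
// Hypothetical sketch of segmenting data across multiple arrays so it
// can be addressed with a long index. A power-of-two segment size keeps
// index translation down to a shift and a mask.
public class LongByteArray {
    private static final int SEGMENT_BITS = 30;                // 2^30 bytes per segment
    private static final int SEGMENT_SIZE = 1 << SEGMENT_BITS;
    private static final int SEGMENT_MASK = SEGMENT_SIZE - 1;

    private final byte[][] segments;
    private final long length;

    public LongByteArray(long length) {
        this.length = length;
        int n = (int) ((length + SEGMENT_SIZE - 1) >>> SEGMENT_BITS);
        segments = new byte[n][];
        long remaining = length;
        for (int i = 0; i < n; i++) {
            // last segment may be shorter than SEGMENT_SIZE
            segments[i] = new byte[(int) Math.min(remaining, SEGMENT_SIZE)];
            remaining -= SEGMENT_SIZE;
        }
    }

    public byte get(long index) {
        return segments[(int) (index >>> SEGMENT_BITS)][(int) (index & SEGMENT_MASK)];
    }

    public void set(long index, byte value) {
        segments[(int) (index >>> SEGMENT_BITS)][(int) (index & SEGMENT_MASK)] = value;
    }

    public long length() { return length; }
}
```

The same trick generalizes to any element type; the element lookup costs one extra array dereference plus a shift and a mask.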
 

markspace

Albretch said:
I need indexes greater than Integer.MAX_VALUE in connection with
dynamic sieves to calculate prime numbers. You are going to get bitten by
these exceptions if you go as far as:


If you're calculating really large primes, I suspect you'll need
BigInteger eventually. (Hint: BigInteger has a nextProbablePrime()
method. Cool, huh?)

How do you deal with such issues?

Probably

Map<BigInteger,something> ....

Or a Set<BigInteger>. But I guess you might need a LinkedList or
TreeMap too. Don't forget about LinkedHashMap.
 

Arne Vajhøj

If you're calculating really large primes, I suspect you'll need
BigInteger eventually. (Hint: BigInteger has a nextProbablePrime()
method. Cool, huh?)


Probably

Map<BigInteger,something> ....

You don't think that uses an array as backing storage?
Or a Set<BigInteger>. But I guess you might need a LinkedList or TreeMap
too. Don't forget about LinkedHashMap.

Those probably don't, but big O for access by index is bad.

Arne
 

John B. Matthews

Albretch Mueller said:
~
Java only uses int array indexes, which is giving me a hard time since I
need to index values over Integer.MAX_VALUE. The size of the data
array itself is not a problem at all, and I am using a BitSet.
~
I need indexes greater than Integer.MAX_VALUE in connection with
dynamic sieves to calculate prime numbers. You are going to get bitten by
these exceptions if you go as far as:
~
( 2 * 3 * 5 * 7 * 11 * 13 * 17 * 19 * 23 * 29 ) = 6469693230
~
That number is risibly small but you cannot use it as an index in
Java ...
~
How do you deal with such issues?

If only one bit is needed for each sieve element, you can pack 32 in an
int to get 2^63 elements, which is indexable by long.

Alternatively, this looks promising: Dunten, Jones, Sorenson, "A
Space-Efficient Fast Prime Number Sieve."
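The packing John describes could look something like this (my own sketch, not code from the thread; note that a single long[] is still limited to 2^31 words, i.e. roughly 2^37 bits, so a truly huge sieve would need this combined with segmentation):

```java
// Sketch of packing sieve flags into the bits of a long[], addressed by
// a long bit index: word = index >>> 6, bit within word = index & 63.
public class PackedBits {
    private final long[] words;

    public PackedBits(long nbits) {
        // the word count must still fit in an int, since Java arrays are int-indexed
        words = new long[(int) ((nbits + 63) >>> 6)];
    }

    public void set(long bitIndex) {
        words[(int) (bitIndex >>> 6)] |= 1L << (bitIndex & 63);
    }

    public boolean get(long bitIndex) {
        return (words[(int) (bitIndex >>> 6)] & (1L << (bitIndex & 63))) != 0;
    }
}
```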
 

markspace

Arne said:
You don't think that uses an array as backing storage?


Well, I'm assuming he's not storing every single integer, just some of
them. Maybe only primes, since non-primes could be determined by being
absent from the list.
 

rem

Albretch Mueller said:

~
Java only uses int array indexes, which is giving me a hard time since I
need to index values over Integer.MAX_VALUE. The size of the data
array itself is not a problem at all, and I am using a BitSet.
~
I need indexes greater than Integer.MAX_VALUE in connection with
dynamic sieves to calculate prime numbers. You are going to get bitten by
these exceptions if you go as far as:
~
 ( 2 * 3 * 5 * 7 * 11 * 13 * 17 * 19 * 23 * 29 ) = 6469693230
~
 That number is risibly small but you cannot use it as an index in
Java ...
~
 How do you deal with such issues?
~
 Thanks
 lbrtchx

Well, if you really need such big arrays, you can model them using
several smaller arrays that each fit within Integer.MAX_VALUE.

Related to the original problem of finding prime numbers with a sieve:
can you describe the problem in more detail? Why do you multiply all
the primes up to 29?

If the problem is to find all primes up to some N greater than
Integer.MAX_VALUE, then you can split the interval 1..N into smaller
subintervals of size, for example, <= 1000000, and then apply the sieve to
each such interval. To apply the sieve to an interval A..B you need to know
all primes up to sqrt(B), so initially precalculate all primes up to
sqrt(N).
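The windowed approach rem outlines might be sketched like this (a toy implementation of my own, with deliberately small numbers; the method name and window size are illustrative, not from the thread):

```java
import java.util.ArrayList;
import java.util.List;

// Segmented sieve sketch: precompute the "base" primes up to sqrt(n) with a
// plain sieve, then sieve each window [lo, hi] by crossing off multiples of
// the base primes. Only one window's flags are in memory at a time.
public class SegmentedSieve {
    public static List<Long> primesUpTo(long n, int windowSize) {
        // 1) simple sieve for the base primes up to sqrt(n)
        int limit = (int) Math.sqrt((double) n) + 1;
        boolean[] composite = new boolean[limit + 1];
        List<Long> base = new ArrayList<>();
        for (int i = 2; i <= limit; i++) {
            if (!composite[i]) {
                base.add((long) i);
                for (long j = (long) i * i; j <= limit; j += i) {
                    composite[(int) j] = true;
                }
            }
        }
        // 2) sieve each window using the base primes
        List<Long> primes = new ArrayList<>();
        for (long lo = 2; lo <= n; lo += windowSize) {
            long hi = Math.min(lo + windowSize - 1, n);
            boolean[] window = new boolean[(int) (hi - lo + 1)];
            for (long p : base) {
                if (p * p > hi) break;
                // first multiple of p inside the window, but never p itself
                long start = Math.max(p * p, ((lo + p - 1) / p) * p);
                for (long m = start; m <= hi; m += p) {
                    window[(int) (m - lo)] = true;
                }
            }
            for (long v = lo; v <= hi; v++) {
                if (!window[(int) (v - lo)]) primes.add(v);
            }
        }
        return primes;
    }
}
```

With a window of around 10^6 the flags fit comfortably in an int-indexed array even when n itself is far beyond Integer.MAX_VALUE.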
 

Albretch Mueller

If you're calculating really large primes, I suspect you'll need ...

You don't think that uses an array as backing storage?
~
;-)
~
Dunten, B., Jones, J., Sorenson, J. P.: A space-efficient fast prime number sieve.
~
Unfortunately I can't get access to that article (I am not
affiliated with any institution)
~
Well, I'm assuming he's not storing every single integer
~
Well, your assumptions are right, but you will need to somehow "mark"
your sieve and for that you need "every single integer" within a sieve
window. Don't you?
~
Can you describe the problem in more details? Why do you multiply all primes up to 29?
~
The non-naive (faster/more efficient ;-)) algorithm I am coding uses
as sieve windows the gaps between the squares of consecutive primes,
which you clobbered by multiplying consecutive primes
~
This discussion started with comp.lang.java.programmer:
"java.lang.Rational". I will publish the algorithm/results of my code
within a week. Some of us believe such a library should be included
by Sun
~
I think I will have to deal with this annoying "only ints for array
indexes" thing from Sun by using a memory-mapped file and marking the
bits within each byte
~
lbrtchx
 

Tom Anderson

~
Unfortunately I can't get access to that article (I am not
affiliated with any institution)

You don't have to be:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.31.3936&rep=rep1&type=pdf
~
Well, your assumptions are right, but you will need to somehow "mark"
your sieve and for that you need "every single integer" within a sieve
window. Don't you?

Ah, there might be a more efficient representation of the windows where
you just store a base and a size. Keep them in an interval tree. That
gives you a compact representation, but still pretty fast lookup.

This discussion started with comp.lang.java.programmer:
"java.lang.Rational". I will publish the algorithm/results of my code
within a week. Some of us believe such a library should be included by
Sun

The problem of a bigger-than-int list has certainly come up before, and a
general-purpose implementation would be useful. In fact, i'd be mildly
surprised if there wasn't already one in either the Apache or Google
collections libraries - have you checked those?

tom
 

John B. Matthews

John B. Matthews said:
If only one bit is needed for each sieve element, you can pack 32 in an
int to get 2^63 elements, which is indexable by long.

Oops, you can pack 32 in an int to get (2^5)(2^31) elements or 64 in a
long to get (2^6)(2^31) elements.
 

Tom Anderson

The problem of a bigger-than-int list has certainly come up before, and
a general-purpose implementation would be useful. In fact, i'd be mildly
surprised if there wasn't already one in either the Apache or Google
collections libraries - have you checked those?

Hmm, it seems they don't. Nor does Javolution.

If anyone's interested in writing a long-indexed list, i've had a quick
cut at the groundwork (compiled, but neither tested nor documented!):

http://urchin.earth.li/~twic/Code/BigCollections/

All you'd need to do to implement a big list would be to subclass
AbstractBigList and fill in the blanks.

tom
 

Albretch Mueller

~
Now, I wonder if you will go LOL (as I did) or cry about this. As a
straightforward way to index over Integer.MAX_VALUE I wishfully
thought of using a memory-mapped file as some sort of (very expensive)
buffer, but well, Sun Microsystems is at least consistent in their
nonsense
~
Using a memory-mapped file and doing the bit crunching I would have my
cake and eat it too (at least for all 10 primes less than 30), since
~
( 2 * 3 * 5 * 7 * 11 * 13 * 17 * 19 * 23 * 29 ) = 6469693230 <
17179869176
~
but I will be in for some voodoo and hacks to reach past that
number
~
Especially now that 64-bit machines are common, if any of you has a
say at Sun, could you please raise this issue with them? Yeah, sure, I
can switch to ANSI C/C++ and leave this whole nonsense behind, but I
want to code/test implementations in all three C-family languages
~
lbrtchx
~
$ java MemMappedIO00Test
/media/hdb2/tmp/test.dat
// __ Setting up Random Access Mapped Byte Buffer for 25769803760 bits
Exception in thread "main" java.lang.IllegalArgumentException: Size
exceeds Integer.MAX_VALUE
at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:704)
at MemMappedIO00Test.main(MemMappedIO00Test.java:52)
~
/*
// clobbered from thinking_in_java/TIJ314_029.htm
*/

import java.io.*;
import java.util.*;
import java.text.*;
import java.nio.*;
import java.nio.channels.*;

// __
public class MemMappedIO00Test{
 private static final String aNwLn = System.getProperty("line.separator");
 private static final String aEnc = "UTF-8";
 // __
 private static final void setOutErr(File ODir, String aPrfx, long lTm00, boolean IsOut, boolean IsErr){
  String aLogFl = aPrfx + "_" + (new SimpleDateFormat("yyyyMMddHHmmss")).format(new Date(lTm00));
  try{
   if(IsOut){ PrintStream POS = new PrintStream(new FileOutputStream(new File(ODir, aLogFl + ".out.txt")), true, aEnc); System.setOut(POS); }
   if(IsErr){ PrintStream PES = new PrintStream(new FileOutputStream(new File(ODir, aLogFl + ".err.txt")), true, aEnc); System.setErr(PES); }
  }catch(FileNotFoundException FNFX){ FNFX.printStackTrace(); }
  catch(IOException IOX){ IOX.printStackTrace(); }
 }
 // __
 public static void main(String[] args) throws Exception{
  long lTm00 = System.currentTimeMillis();
  String aKNm = "MemMappedIO00Test";
  File ODir = new File(new File(new File(".").getAbsolutePath()).getParent());
  setOutErr(ODir, aKNm, lTm00, true, false);
  // __
  ODir = new File(".");
  String aOFl = "_.dat";
  File OFl = new File(ODir, aOFl);
  System.err.println(OFl.getAbsolutePath());
  long lL = (long)Integer.MAX_VALUE;
  lL += (long)Integer.MAX_VALUE/2;
  System.out.println("// __ Setting up Random Access Mapped Byte Buffer for " + (8*lL) + " bits");
  System.err.println("// __ Setting up Random Access Mapped Byte Buffer for " + (8*lL) + " bits");
  // __ this is the call that throws: map() rejects sizes over Integer.MAX_VALUE
  MappedByteBuffer out = new RandomAccessFile(OFl, "rw").getChannel().map(FileChannel.MapMode.READ_WRITE, 0, lL);
  for(long i = 0; i < lL; ++i){ out.put((byte)255); } // counter must be long: an int would overflow before reaching lL
  // __
  long lTm02 = System.currentTimeMillis();
  System.out.println("// __ in " + (lTm02 - lTm00) + " (ms)");
  System.err.println("// __ in " + (lTm02 - lTm00) + " (ms)");
 }
}
 

Joshua Cranmer

There is no really good solution. They made a mistake in the
design 15 years ago by not making array indexes long.

15 years ago, I don't think computers existed that could hold enough
memory to make a 63-bit index anything more than redundant. IIRC, it
wasn't until the beginning of this decade or so that arrays grew large
enough to uncover a bug in their binary search implementation. So, 15
years ago, it wasn't a mistake.
 

Arne Vajhøj

15 years ago, I don't think computers existed that could hold enough
memory to make a 63-bit index anything more than redundant. IIRC, it
wasn't until the beginning of this decade or so that arrays grew large
enough to uncover a bug in their binary search implementation. So, 15
years ago, it wasn't a mistake.

I think it was.

Because when designing a language one should not just consider
the requirements for today but also the requirements for the
future for all the years the language is expected to be used.

If we say that memory sizes double every two years, then
it is easy to make a prediction.

BTW, even long would be too small for indexes if Java
is still in use after 2074, but somehow I doubt that will
be the case. And besides, we do not have the verylong datatype
yet.

Arne
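Arne's cutoff year is easy to sanity-check with back-of-the-envelope arithmetic. Assuming (my assumption, not a figure from the thread) roughly 2^32 bytes of RAM in a large machine around 2009 and a doubling every two years, each doubling adds one to the exponent, so a 2^63-byte memory arrives about 31 doublings later:

```java
// Back-of-the-envelope doubling arithmetic. The 2009/2^32 baseline is an
// illustrative assumption; the doubling period comes from the post above.
public class DoublingEstimate {
    public static int yearReaching(int startYear, int startExponent, int targetExponent) {
        // each doubling adds 1 to the exponent and 2 years to the clock
        return startYear + 2 * (targetExponent - startExponent);
    }

    public static void main(String[] args) {
        // 2009 + 2 * (63 - 32) = 2071, the same ballpark as the 2074 above
        System.out.println(yearReaching(2009, 32, 63));
    }
}
```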
 

Arne Vajhøj

Well, I'm assuming he's not storing every single integer, just some of
them. Maybe only primes, since non-primes could be determined by being
absent from the list.

Usually sieve means all, but maybe you are right.

Arne
 

markspace

Arne said:
Usually sieve means all, but maybe you are right.


I don't think so, this time. I built a small test program and the
performance was horrible. I do wonder if it was technically a sieve at all.

For posterity, here's a lousy way to sieve primes.



public static void sieve1()
{
    BigInteger num = BigInteger.ZERO;
    final BigInteger LIMIT = BigInteger.ONE.shiftLeft( 16 );
    List<BigInteger> primes = new LinkedList<BigInteger>();
    int loopCount = 0;
    long startTime = System.nanoTime();
    mainloop:
    while( num.compareTo( LIMIT ) < 0 )
    {
        if( ++loopCount % 1000 == 0 )
        {
            System.out.println( primes.size() + " primes @ " +
                ((System.nanoTime() - startTime) / 1000000) / 1000.0f + "s" );
        }
        num = num.nextProbablePrime();
        for( BigInteger prime : primes )
        {
            if( num.remainder( prime ).equals( BigInteger.ZERO ) )
            {
                continue mainloop;
            }
        }
        primes.add( num );
    }
    System.out.println( primes.size() );
}
 

Maarten Bodewes

Arne said:
BTW, even long would be too small for indexes if Java
will be used after 2074, but somehow I doubt that would
be the case. And besides we do not have the verylong datatype
yet.

You are expecting memory sizes of 9,223,372,036,854,775,807 bytes????

That's 9,223 PETA bytes. Hmm, weird, may happen. But it is certainly
rather large.

Maarten
 

Arne Vajhøj

You are expecting memory sizes of 9,223,372,036,854,775,807 bytes????

That's 9,223 PETA bytes. Hmm, weird, may happen. But it is certainly
rather large.

In 2074 ? Yes !

Arne
 

Roedy Green

~
java only uses int indexes which is giving me a hard time since I
need to index values over Integer.MAX_VALUE.

Even with a BitSet, that is one heck of a lot of RAM. If the list is
sufficiently sparse it may be more efficient to use a HashSet.

The docs say you can index with any positive number. Have you got a
SSCCE to demonstrate that does not work? See
http://mindprod.com/jgloss/sscce.html

If it can't handle it, you can create a FatBitSet that uses an array
of BitSets to simulate a big BitSet indexed by long.
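Roedy's FatBitSet might be sketched as follows (the class name is his; this implementation, and the 2^30-bit chunk size, are my own guesses at it):

```java
import java.util.BitSet;

// An array of BitSets pretending to be one big bit set indexed by long.
// Chunks are allocated lazily, so a sparse set stays cheap.
public class FatBitSet {
    private static final int CHUNK_BITS = 30;                 // 2^30 bits per chunk
    private static final long CHUNK_MASK = (1L << CHUNK_BITS) - 1;

    private final BitSet[] chunks;

    public FatBitSet(long nbits) {
        chunks = new BitSet[(int) ((nbits + CHUNK_MASK) >>> CHUNK_BITS)];
    }

    public void set(long index) {
        int c = (int) (index >>> CHUNK_BITS);
        if (chunks[c] == null) chunks[c] = new BitSet();      // lazy allocation
        chunks[c].set((int) (index & CHUNK_MASK));
    }

    public boolean get(long index) {
        BitSet b = chunks[(int) (index >>> CHUNK_BITS)];
        return b != null && b.get((int) (index & CHUNK_MASK));
    }
}
```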
 

Wojtek

Arne Vajhøj wrote :
In 2074 ? Yes !

I am curious. Just what WOULD need such a large index? Every sand grain
on earth? Every star in every galaxy in the universe?
 
