Creating a byte[] of long size

Boris Punk

long size = Integer.MAX_VALUE+1;
byte [] b = new byte[size];

-possible loss of precision

How can we make an array of long size?
 
Eric Sosman

long size = Integer.MAX_VALUE+1;
byte [] b = new byte[size];

-possible loss of precision

How can we make an array of long size?

Array elements are indexed from [0] through [Integer.MAX_VALUE-1],
and that's that. You can make an array whose elements are larger
than a single byte (`new long[N]', for example) and thus get an array
of more than Integer.MAX_VALUE-1 bytes, but you cannot have more than
that many elements.

Incidentally, you might want to print the value of `size' that
you calculate in your example. It might surprise you ...
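
A minimal sketch of what that hint points at (the class and method names here are illustrative, not from the thread): `Integer.MAX_VALUE+1` is evaluated in int arithmetic and wraps around before it is ever widened to long. Making one operand a long avoids the wrap.

```java
class SizeOverflowDemo {
    // int + int is computed in 32-bit arithmetic, so the sum wraps around
    // to Integer.MIN_VALUE before it is widened to long.
    static long wrappedSize() {
        return Integer.MAX_VALUE + 1;
    }

    // With a long operand the addition is done in 64-bit arithmetic.
    static long intendedSize() {
        return Integer.MAX_VALUE + 1L;
    }

    public static void main(String[] args) {
        System.out.println(wrappedSize());   // -2147483648
        System.out.println(intendedSize());  // 2147483648
    }
}
```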

(By the way: Yes, I really did intend the `-1's above. An int
index value could go one higher, but how could you create the array?
In `new byte[N]', N cannot exceed Integer.MAX_VALUE, which means that
the index cannot exceed N-1, or Integer.MAX_VALUE-1. You could try
to write `new byte[] { 0,0,0,... }' and so on for two giga-values,
but you'd exceed .class file limits long before you got there.)
 
Eric Sosman

long size = Integer.MAX_VALUE+1;
byte [] b = new byte[size];

-possible loss of precision

How can we make an array of long size?

Array elements are indexed from [0] through [Integer.MAX_VALUE-1],
and that's that. You can make an array whose elements are larger
than a single byte (`new long[N]', for example) and thus get an array
of more than Integer.MAX_VALUE-1 bytes, but you cannot have more than
that many elements.
[...]
(By the way: Yes, I really did intend the `-1's above.[...]

Oh, drat. I really did mean the first one, but not the second.
It's been really hot here for the last few days, and my brain is
starting to resemble a poached egg.
 
Lew

Boris said:
long size = Integer.MAX_VALUE+1;
byte [] b = new byte[size];

-possible loss of precision

How can we make an array of long size?

You can't.

Why do you want to?

From the JLS, which I strongly urge you to study:
"The type of each dimension expression within a DimExpr must be a type
that is convertible (§5.1.8) to an integral type, or a compile-time
error occurs. Each expression undergoes unary numeric promotion (§).
The promoted type must be int, or a compile-time error occurs; this
means, specifically, that the type of a dimension expression must not
be long."
15.10 Array Creation Expressions
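
In practice the quoted rule plays out roughly as in the sketch below (class and method names are made up for illustration). A short or char dimension is promoted to int and accepted. A long dimension is rejected by the compiler unless it is narrowed explicitly, and `Math.toIntExact` is one way to narrow without silently truncating.

```java
class DimensionTypes {
    // short is promoted to int by unary numeric promotion: this compiles.
    static byte[] fromShort(short n) {
        return new byte[n];
    }

    // char is promoted to int as well: this compiles.
    static byte[] fromChar(char n) {
        return new byte[n];
    }

    // "new byte[n]" with a long n does not compile. Narrowing with
    // Math.toIntExact throws ArithmeticException if n does not fit in an int,
    // instead of silently wrapping the way a plain (int) cast would.
    static byte[] fromLong(long n) {
        return new byte[Math.toIntExact(n)];
    }
}
```

This only makes the code compile for sizes that fit in an int; it does not lift the limit itself.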

Didn't you get a compiler error?

You can make some other data structure that would hold that much,
assuming you have the address space for it. For sure in a 32-bit
machine you'd have trouble even with 'new byte [Integer.MAX_VALUE]'.

But really, why do you want to?
 
Boris Punk

Integer.MAX_VALUE = 2147483647

I might need more items than that. I probably won't, but it's nice to have
extensibility.
 
Eric Sosman

Integer.MAX_VALUE = 2147483647

I might need more items than that. I probably won't, but it's nice to have
extensibility.

Then Java is not the language for you. Arrays have int sizes.
Not even Collections can get you out of the woods, because their
.size() method must return an int.

I suppose you could write your own BigList class, with a .size()
method returning long and with .get() and .set() taking long arguments.
Such a class could not implement the List interface (because the method
signatures wouldn't be right), but you could probably implement
Iterable.
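
One possible shape for such a class, as a rough sketch only: the chunk size, the append-only `add` method, and the choice of nested ArrayLists as backing storage are all assumptions made for this example, not anything prescribed above.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical BigList: long-based size/get/set, Iterable but not a List.
final class BigList<T> implements Iterable<T> {
    private static final int CHUNK = 1 << 16;   // elements per inner list (arbitrary)
    private final List<List<T>> chunks = new ArrayList<>();
    private long size;

    public long size() {
        return size;
    }

    public void add(T value) {
        // Start a new inner list when the list is empty or the last one is full.
        if (size % CHUNK == 0) {
            chunks.add(new ArrayList<>());
        }
        chunks.get(chunks.size() - 1).add(value);
        size++;
    }

    public T get(long index) {
        return chunkFor(index).get((int) (index % CHUNK));
    }

    public T set(long index, T value) {
        return chunkFor(index).set((int) (index % CHUNK), value);
    }

    private List<T> chunkFor(long index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return chunks.get((int) (index / CHUNK));
    }

    @Override
    public Iterator<T> iterator() {
        return new Iterator<T>() {
            private long cursor = 0;

            @Override
            public boolean hasNext() {
                return cursor < size;
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return get(cursor++);
            }
        };
    }
}
```

Because it implements Iterable, it still works in a for-each loop even though it cannot be passed where a List is expected.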

Or, you could have BigList implement List but "lie" in its .size()
method, in somewhat the same way TreeSet "lies" about the Set contract.
Then you'd add .realSize(), .realGet(), and .realSet() methods to deal
with the long values (I've probably missed a few).
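
For what it's worth, the Javadoc for Collection.size() already allows for this: a collection holding more than Integer.MAX_VALUE elements is documented to return Integer.MAX_VALUE. A small sketch of the idea follows, using a made-up read-only "virtual" list (BigRange is hypothetical, chosen so the example needs no real storage); realSize() and realGet() follow the names suggested above.

```java
import java.util.AbstractList;

// Hypothetical read-only list of the values 0 .. realSize-1.
final class BigRange extends AbstractList<Long> {
    private final long realSize;

    BigRange(long realSize) {
        if (realSize < 0) {
            throw new IllegalArgumentException("negative size: " + realSize);
        }
        this.realSize = realSize;
    }

    public long realSize() {
        return realSize;
    }

    public Long realGet(long index) {
        if (index < 0 || index >= realSize) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + realSize);
        }
        return index;
    }

    // The "lie": saturate at Integer.MAX_VALUE instead of overflowing.
    @Override
    public int size() {
        return (int) Math.min(realSize, Integer.MAX_VALUE);
    }

    // The int-indexed view only reaches the first size() elements.
    @Override
    public Long get(int index) {
        if (index >= size()) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size());
        }
        return realGet(index);
    }
}
```

Code written against List sees at most the first Integer.MAX_VALUE elements; anything beyond that is reachable only through the long-based methods.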

... but I think you'll need a stronger motivation than "it's nice"
to justify the work, and the resulting ugliness. Your call, though.
 
Boris Punk

Eric Sosman said:
Then Java is not the language for you. Arrays have int sizes.
Not even Collections can get you out of the woods, because their
.size() method must return an int.

I suppose you could write your own BigList class, with a .size()
method returning long and with .get() and .set() taking long arguments.
Such a class could not implement the List interface (because the method
signatures wouldn't be right), but you could probably implement
Iterable.

Or, you could have BigList implement List but "lie" in its .size()
method, in somewhat the same way TreeSet "lies" about the Set contract.
Then you'd add .realSize(), .realGet(), and .realSet() methods to deal
with the long values (I've probably missed a few).

... but I think you'll need a stronger motivation than "it's nice"
to justify the work, and the resulting ugliness. Your call, though.

Is there no BigList/BigHash in Java?
 
Arne Vajhøj

Integer.MAX_VALUE = 2147483647

I might need more items than that. I probably won't, but it's nice to have
extensibility.

It is a lot of data.

I think you should assume YAGNI.

Arne
 
Arne Vajhøj

From the JLS, which I strongly urge you to study:

Unless the poster has solid programming experience,
the JLS may not be the best thing to study.

Sure, it is by definition correct, but it is written
to be detailed and correct, not to be easy to read.

Arne
 
Arne Vajhøj

Historically, each memory size has gone through a sequence of stages:

1. Nobody will ever need more than X bytes.

2. Some people do need to run multiple jobs that need a total of more
than X bytes, but no one job could possibly need that much.

3. Some jobs do need more than X bytes, but no one data structure could
possibly need that much.

4. Some data structures do need more than X bytes.

Any particular reason to believe 32 bit addressing will stick at stage
3, and not follow the normal progression to stage 4?

I am absolutely sure that 64 bit array indexes will be needed and
that it will not take many years.

But that is not the same as saying that the app the original poster
is working on will need them.

Arne
 
Lew

Unless the poster has solid programming experience,
the JLS may not be the best thing to study.

Sure, it is by definition correct, but it is written
to be detailed and correct, not to be easy to read.

Well, boo-hoo-hoo, programming is hard! Waaaahhh!
 
Kevin McMurtrie

Wayne said:
To me, it is unlikely your system will run well if this one data structure
consumes 2G of memory. (You didn't really state the application or system;
certainly there are exceptions to the rule.) I would suggest you use a
more flexible system, where you keep the data on storage (disk) and use
memory as a cache. Perhaps an ArrayList of soft references would work well.
It might even be possible in your particular case to run a daemon thread
that pre-fetches items into the cache.

Keep in mind a modern general-purpose computer will use virtual memory,
typically with 4kiB pages. Any data structure larger than that will
likely end up swapped to disk anyway. If you need the semantics of
a "BigList", try a custom class, a List of <pagesize> lists with
appropriate set and get methods to access the items.

Questions like yours are missing context. If you want a good answer,
you need to post the problem you are really trying to solve, rather
than posting a question about how to implement the solution you've
already decided on.

Hope this helps!

24GB of RAM is a standard server configuration this year. Even my
laptop has 8GB and can only run 64 bit Java. A Java array indexing
limit of 2147483647 is a growing problem, not a future problem.

Multiplexing to smaller arrays through a class isn't a great solution.
First, it's unlikely that an application needing a 2+ GB array can
tolerate the performance hit of not using an array directly. Some
critical JIT optimizations for memory caching and range checking won't
work because of the multiplexing logic. Second, such a class could not
be compatible with anything else because it can't support the Collection
design. Oracle can't define "Collection64 extends Collection" and be
done with it because such a design cannot be compatible in Java.
 
Mike Schilling

Arne Vajhøj said:
No.

But you can have a List<List<X>> which can then
store 4*10^18 X'es.

Or you could pretty easily build a class like

public class BigArray<T>
{
T get(long index);
void set(long index, T value);
}

backed by a two-dimensional array.[1] The reason I prefer to say "array"
rather than "List" is that random access into a sparse List is a bit dicey,
while arrays nicely let you set index 826727 even if you haven't touched any
of the earlier indices yet, and will tell you that the entry at 6120584 is
null, instead of throwing an exception.

1. Or two three-dimensional arrays, if you won't settle for 62 bits of
index.
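
A minimal sketch of how such a BigArray might be filled in, under assumptions of my own: the 2^20-element chunk size, the constructor taking a fixed length, the length() method, and lazy allocation of the inner arrays are not from the post above.

```java
// Hypothetical fixed-length array indexed by long, backed by an array of arrays.
final class BigArray<T> {
    private static final int CHUNK_BITS = 20;                // 2^20 elements per chunk
    private static final int CHUNK_SIZE = 1 << CHUNK_BITS;
    private static final int CHUNK_MASK = CHUNK_SIZE - 1;

    private final long length;
    private final Object[][] chunks;   // inner arrays are allocated on first write

    BigArray(long length) {
        if (length < 0) {
            throw new IllegalArgumentException("negative length: " + length);
        }
        long chunkCount = (length + CHUNK_SIZE - 1) >>> CHUNK_BITS;
        this.length = length;
        this.chunks = new Object[Math.toIntExact(chunkCount)][];
    }

    public long length() {
        return length;
    }

    @SuppressWarnings("unchecked")
    public T get(long index) {
        checkIndex(index);
        Object[] chunk = chunks[(int) (index >>> CHUNK_BITS)];
        // Untouched regions read as null, like a freshly created object array.
        return chunk == null ? null : (T) chunk[(int) (index & CHUNK_MASK)];
    }

    public void set(long index, T value) {
        checkIndex(index);
        int outer = (int) (index >>> CHUNK_BITS);
        if (chunks[outer] == null) {
            chunks[outer] = new Object[CHUNK_SIZE];
        }
        chunks[outer][(int) (index & CHUNK_MASK)] = value;
    }

    private void checkIndex(long index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index + ", length " + length);
        }
    }
}
```

Every access pays for a shift, a mask and an extra array dereference, which is the overhead Kevin McMurtrie objects to earlier in the thread.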
 
Mike Schilling

Arne Vajhøj said:
Unless the poster has solid programming experience,
the JLS may not be the best thing to study.

Sure, it is by definition correct,

mod typos and misstatements, of course.
 
Roedy Green

long size = Integer.MAX_VALUE+1;
byte [] b = new byte[size];

-possible loss of precision

How can we make an array of long size?

Longs are 64 bits. So if you want a byte array big enough to contain
a long, you need new byte[8].
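
If that reading of the question is the intended one, a sketch along these lines would do it (LongBytes and its method names are made up for the example; ByteBuffer writes in big-endian order by default):

```java
import java.nio.ByteBuffer;

class LongBytes {
    // Packs one long into an 8-byte array, most significant byte first.
    static byte[] toBytes(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    // Reads the first 8 bytes of the array back into a long.
    static long fromBytes(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong();
    }
}
```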

Arrays can only be indexed by ints, not longs. Even if they were,
even Bill Gates could not afford enough RAM for an array of bytes, one
for each possible long.
 
RedGrittyBrick

You have a list with up to 2*10^9 elements of type List<X> that

Thunderbird displays that rather nicely.
each can contain up to 2^10^9 elements of type X.

But that looks pretty weird. ITYM 2*10^9

Which demonstrates that attention to typography by Thunderbird's
programmers helps proof-reading by its users :)
 
