Array initial data gives "code too large" error

dcook

Hello,

I'm wondering if someone has a solution for my problem with initial
data for object arrays. I have a 2D vector animator program that
auto-generates data for Java code. All the vector object data is
output as structures, which I have converted to data classes. My
problem is that the data can get too large for Java, and I get a "code
too large" error from the compiler. If the 2D project has something
like 5,000 box vectors in it, it outputs four coordinate values (top,
left, bottom, right) and a string for each box's name, which are loaded
into a box object that is part of a box array of 5,000. At first, I
tried something like this:

public class project_data {
    public static final int box_count = 5,000;
    public static box[] box_array = { new box(0,0,10,10,"box1"),
        new box(20,20,40,40,"box2"), ....};
}

I couldn't have very many boxes in the array before I got a "code
too large" error, so I then changed it to output the data as a string
array that I parse into objects, which increased the number of
objects I can have. I created separate data classes for all data
types. I had to make them string arrays because the data is mixed.
Example:

public class box_data {
    public static final String[] data = { "0,0,10,10,\"box1\"",
        "20,20,40,40,\"box2\"", ...};

    // Builds one box per data string; assumes box has a constructor
    // that parses a single "top,left,bottom,right,name" string.
    public static box[] box_data(int count) {
        box[] dobj = new box[count];
        for (int i = 0; i < count; i++) {
            dobj[i] = new box(data[i]);
        }
        return dobj;
    }
}

class project_data {
    public static final int box_count = 5,000;
    public static box[] box_array = box_data.box_data(box_count);
}

Is there another way I can define the data to get past this limit in
Java, or any suggestion of a better way so I can support more objects?

Thanks,

Dave
 
John C. Bollinger

dcook said:
Is there another way I can define the data to get past this limit in
Java or any suggestion of a better way so I can support more objects.

Write the object data to a file instead of generating Java code with it.
Have your program read the file at startup and create the necessary
objects. This will get around your code size problem very nicely, and
will also solve the problem of needing to have a new program for every
model.
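
A minimal sketch of this approach, assuming a hypothetical text format with one box per line (`top,left,bottom,right,name`) and a hypothetical `Box` class. Neither the format nor the names come from the original program; they only illustrate reading the data at startup instead of compiling it in.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the poster's box class.
final class Box {
    final int top, left, bottom, right;
    final String name;

    Box(int top, int left, int bottom, int right, String name) {
        this.top = top;
        this.left = left;
        this.bottom = bottom;
        this.right = right;
        this.name = name;
    }
}

final class BoxLoader {
    // Reads one box per line in the assumed format "top,left,bottom,right,name".
    // The name is everything after the fourth comma, so it may contain commas.
    static List<Box> load(Reader source) throws IOException {
        List<Box> boxes = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // tolerate blank lines
                }
                String[] f = line.split(",", 5); // at most five fields
                boxes.add(new Box(Integer.parseInt(f[0]), Integer.parseInt(f[1]),
                                  Integer.parseInt(f[2]), Integer.parseInt(f[3]), f[4]));
            }
        }
        return boxes;
    }
}
```

At startup the program would pass in something like a `FileReader` for the generated data file. Because the list grows with the file, no separate count constant is needed, and the same compiled program can load any model.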
 
James McGill

dcook said:
Is there another way I can define the data to get past this limit in
Java or any suggestion of a better way so I can support more objects.

Do you get further if you don't declare all your data "static"?

I'm guessing your JVM has to put all that static data into a "text"
segment. (How this is done is largely up to the JVM implementation.)

You might be able to tune your JVM, if you really want your program to
work the way it's written. Or you might be able to put that data on the
heap, and be a lot less crowded.

What I take from this, is that there is apparently some hard limit on
class data, which means there's an effective limit on a class size.
I've never run into this or even considered it before.

I have to assume you know what you're doing. What's the rationale for
making all this data static, and does it still break if you make it
volatile?


James
 
Chris Uppal

dcook said:
Is there another way I can define the data to get past this limit in
Java or any suggestion of a better way so I can support more objects.

I second John's advice. I just wanted to mention that the reason for this
restriction is that initialisation expressions like the ones in your example
are compiled into code. (The resulting bytecode looks just like a long series
of assignments). There is a limit on the length of any method (which cannot be
circumvented since it's built into the classfile format), so if -- for some
unimaginable reason -- you /have/ to include the data in the code, then all you
can do is split the initialisation into several sub-methods.

-- chris
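
To illustrate the sub-method idea, a generator could emit one small method per chunk of data, so that no single method (including the static initialiser) approaches the per-method bytecode limit. This is only a sketch: the `Box` class, the chunk size of two, and the `init0`/`init1` names are hypothetical, and real generated code would put many more entries in each chunk.

```java
// Hypothetical stand-in for the poster's box class.
final class Box {
    final int top, left, bottom, right;
    final String name;

    Box(int top, int left, int bottom, int right, String name) {
        this.top = top;
        this.left = left;
        this.bottom = bottom;
        this.right = right;
        this.name = name;
    }
}

final class ProjectData {
    static final int BOX_COUNT = 4;
    static final Box[] BOXES = new Box[BOX_COUNT];

    // The static initialiser only calls the chunk methods, so it stays tiny.
    static {
        init0(BOXES);
        init1(BOXES);
    }

    // Each generated chunk method fills its own slice of the array and is
    // compiled as a separate method, subject to its own length limit.
    private static void init0(Box[] a) {
        a[0] = new Box(0, 0, 10, 10, "box1");
        a[1] = new Box(20, 20, 40, 40, "box2");
    }

    private static void init1(Box[] a) {
        a[2] = new Box(50, 50, 60, 60, "box3");
        a[3] = new Box(70, 70, 90, 90, "box4");
    }
}
```

Splitting this way only addresses the per-method limit. The other per-class limits of the classfile format still apply, so a very large data set might also have to be spread across several classes.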
 
Thomas Hawtin

Chris said:
I second John's advice.

Yup. To get close to the generated code solution, I'd suggest using a
resource rather than a loose file, and probably serialisation.

Chris said:
I just wanted to mention that the reason for this
restriction is that initialisation expressions like the ones in your example
are compiled into code. (The resulting bytecode looks just like a long series
of assignments). There is a limit on the length of any method (which cannot be
circumvented since it's built into the classfile format), so if -- for some
unimaginable reason -- you /have/ to include the data in the code, then all you
can do is split the initialisation into several sub-methods.

IIRC, you can exceed the limit. You just can't have debugging
information or try/catch/finally/synchronized blocks in the extended
area. Whether any particular compiler will generate such code is another
matter.

JSR202 was supposed to increase various size limits. However, it does
not appear to have done so. Perhaps huge JSPs from hell are out of fashion.

http://jcp.org/en/jsr/detail?id=202

Tom Hawtin
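
A rough sketch of the resource-plus-serialisation idea, assuming a hypothetical `Serializable` `Box` class and a hypothetical `BoxStore` helper; the resource name passed to `loadResource` is likewise an assumption. Note that the serialised file would have to be produced by a Java step (for example a small converter run at build time), since the generator in this thread is a C++ program.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;

// Hypothetical stand-in for the poster's box class, made serialisable.
final class Box implements Serializable {
    private static final long serialVersionUID = 1L;
    final int top, left, bottom, right;
    final String name;

    Box(int top, int left, int bottom, int right, String name) {
        this.top = top;
        this.left = left;
        this.bottom = bottom;
        this.right = right;
        this.name = name;
    }
}

final class BoxStore {
    // Build step: write the whole array as one serialised object.
    static void save(Box[] boxes, OutputStream out) throws IOException {
        try (ObjectOutputStream oos = new ObjectOutputStream(out)) {
            oos.writeObject(boxes);
        }
    }

    // Startup: read the array back from any stream.
    static Box[] load(InputStream in) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(in)) {
            return (Box[]) ois.readObject();
        }
    }

    // Loads from a classpath resource, e.g. a file bundled inside the JAR.
    static Box[] loadResource(String name) throws IOException, ClassNotFoundException {
        try (InputStream in = BoxStore.class.getResourceAsStream(name)) {
            if (in == null) {
                throw new IOException("resource not found: " + name);
            }
            return load(in);
        }
    }
}
```

Keeping the data as a resource means it ships inside the same JAR as the code, which is about as close to "data compiled into the program" as this approach gets.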
 
Chris Uppal

Thomas said:
IIRC, you can exceed the limit. You just can't have debugging
information or try/catch/finally/synchronized blocks in the extended
area. Whether any particular compiler will generate such code is another
matter.

The compiler would also have to generate wide goto's to supplement the branch
instructions which only come in narrow flavours.

I thought it worth a giggle to try out with some hand-crafted bytecode.
HelloWorld with 1e6 NOPs before the System.out.println(). The resulting
classfile seems to be structurally OK (four different parsers manage to decode
it), but the JVM throws it out with an error:

Exception in thread "main" java.lang.ClassFormatError:
Invalid method Code length 1000009 in class file HelloWorld

Which is a pity, 'cos that would have been another solution for the OP --
generating bytecode directly instead of messing around with Java source ;-)

-- chris
 
dcook

John,

Thanks for the reply. This C++ animation app can output source in C or
Java to allow customers to modify and customize it. The main customer base
is for C, but I see great potential for Java. Making the main C++ app
output Java-formatted data object files is not an option at this point.
Also, outputting all the data into a flat text file that I then parse into
class data would increase the Java app's startup time to an unreasonable level.
There has to be a better solution.

Thanks,

Dave
 
dcook

James,

Yes, the "5,000" was a syntax error. It would actually be:

public static final int box_count = 5000;


Dave
 
dcook

James,

Thanks for the reply. Making the data a string array allows me the
largest number of data elements. I have tried it as volatile, but I
was able to define a larger number of string arrays using static
final.

Thanks!

Dave
 
dcook

Chris,

Thanks for the reply. That is the same conclusion I came to, but since
this C++ animation app outputs source code for C or Java and the main
customer base is for C, I cannot change it to break up the data into
multiple sub-methods at this point. I think Java is a better solution
for the customer base, and if it catches on, this is the solution I
plan to use. I may have to support mobile Java (midlets) before I can
get the customer base I need to force a change in the C++ animation
app.

Thanks!

Dave
 
Chris Uppal

dcook said:
Also, outputting all the data into a flat text file that I then parse into
class data would increase the Java app's startup time to an unreasonable level.

Why? You already have text parsing code, and you are presumably happy with
its performance. Loading a classfile with the same data embedded as Strings
(if you were allowed to define that much data) would involve loading exactly
the same data from disk as from your datafile, and what's more it involves
loading it through a significantly more expensive code path.

dcook said:
There has to be a better solution.

Why?

-- chris
 
dcook

Chris,

Thanks for the reply. Well, it is a lot less code to parse through a
string array than to read from a flat file with an unknown number of
possible elements and class type elements. Also, using any
special tags to mark elements and/or class data elements would
complicate parsing, since the data could contain one of these tags. So
yes, it would greatly affect the startup time, which is already slow
because of the parsing.

Thanks!

Dave
 
James McGill

dcook said:
James,

Thanks for the reply. Making the data a string array allows me the
largest number of data elements. I have tried it as volatile, but I
was able to define a larger number of string arrays using static
final.


I'm guessing that's because the static strings are static *references*
to strings...

Cheers,

James
 
John C. Bollinger

James said:
What I take from this, is that there is apparently some hard limit on
class data, which means there's an effective limit on a class size.
I've never run into this or even considered it before.

The class file format has a number of hard limits on the sizes of
various parts. You can find all the gory details in the VM spec,
http://java.sun.com/docs/books/vmspec/2nd-edition/html/ClassFile.doc.html
Among them, each class's number of methods, number of fields, number of
interfaces implemented, and number of constant pool entries is limited
to 65534. The last is the biggest restriction (of those), as most class
file substructures require one or more entries in the constant pool.

If you ever actually run into any of these limits with code written by
hand then it will have long since been time to refactor or to completely
redesign. Code-generating tools (such as the OP's) do sometimes hit one
or another of the limits, however, which can be annoying.
 
Chris Uppal

dcook said:
Thanks for the reply. Well, it is a lot less code to parse through a
string array than to read from a flat file with an unknown number of
possible elements and class type elements. Also, using any
special tags to mark elements and/or class data elements would
complicate parsing, since the data could contain one of these tags. So
yes, it would greatly affect the startup time, which is already slow
because of the parsing.

I think you are not understanding the way that Java (or the JVM) works. If you
have a string in your program, then the route it takes to get into the runtime
is complicated. It goes something like:

JVM finds classfile in some JAR.
JVM decompresses classfile.
JVM /parses/ the classfile to find methods, constants, strings, etc.
... lots of other stuff that's irrelevant here...
JVM interns any strings.
JVM executes code (from the classfile) which creates arrays, and then
assigns String values to each element.

So, loading String data by including it in the Java source is just about the
/slowest/ way you could possibly get that data into the runtime. Given that
you have complete freedom over the datafile format, and that it's
machine-generated so readability is presumably not a big issue, you should be
able to load the data in a lot faster than the JVM can do it for you.

Java may look superficially like C++, but it is a completely different language
with a totally different model of execution. You have to be prepared for the
idea that you won't be able (unless you are lucky) to make a few small "tweaks"
to an existing C++ code-generator and expect everything to work properly
(although it would be easier than going in the opposite direction).

-- chris
 
dcook

Chris said:
you should be able to load the data in a lot faster than the JVM can do it for you.

Chris,

Thanks for explaining this. From the way you explained it, it should
be faster to parse the data myself than to let the JVM do it. I was assuming
that the JVM is usually written in C and executes faster than my
manual parsing in Java would. Isn't that why it has System functions
like System.arraycopy(), because it runs faster at the JVM level than
copying an array yourself in Java?

Thanks!

Dave
 
Chris Uppal

dcook said:
Thanks for explaining this. From the way you explained it, it should
be faster to parse the data myself than to let the JVM do it. I was assuming
that the JVM is usually written in C and executes faster than my
manual parsing in Java would.

The only real reason why parsing in Java might be noticeably slower than C
(assuming you are running a modern JVM on a "normal" machine) is that you'd
probably want to use Java Strings rather than raw data as byte[] arrays. That
(unless you are working in UTF-16 anyway -- which I doubt) would involve
doubling the amount of memory used, and hence taking longer to process. (BTW,
the JVM has to perform an 8- to 16-bit conversion when it's reading the Strings
from the classfile too -- I forgot to mention that before).

I'd be inclined, as a first step, just to switch to some simple parsing, with a
format defined to make parsing easy, and see how that goes. Or you could take
Thomas Hawtin's suggestion of using serialisation as the on-disk form. In
either case you can study the /actual/ performance much better once there is
code there to execute ;-) If the parsing time does turn out to be a major
problem (which I rather doubt assuming the parser is sensibly defined and
written) then I'd try using a binary format instead and probably use
file-mapping to load the data in.
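
As a sketch of the binary-format idea: each record below is four `int`s followed by a length-prefixed UTF-8 name, so no delimiter or tag can collide with the data, and the file is read through a memory mapping. The record layout, the `Box` class, and the `BinaryBoxFile` helper are all assumptions made for illustration, not anything the original tool produces.

```java
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical stand-in for the poster's box class.
final class Box {
    final int top, left, bottom, right;
    final String name;

    Box(int top, int left, int bottom, int right, String name) {
        this.top = top;
        this.left = left;
        this.bottom = bottom;
        this.right = right;
        this.name = name;
    }
}

final class BinaryBoxFile {
    // Assumed layout (big-endian): int count, then per box four ints,
    // an int name length in bytes, and the name encoded as UTF-8.
    static void write(Path file, Box[] boxes) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(file)))) {
            out.writeInt(boxes.length);
            for (Box b : boxes) {
                out.writeInt(b.top);
                out.writeInt(b.left);
                out.writeInt(b.bottom);
                out.writeInt(b.right);
                byte[] name = b.name.getBytes(StandardCharsets.UTF_8);
                out.writeInt(name.length);
                out.write(name);
            }
        }
    }

    // Maps the file read-only and decodes it before the channel is closed.
    static Box[] read(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            return decode(ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size()));
        }
    }

    // ByteBuffer defaults to big-endian, matching DataOutputStream.
    static Box[] decode(ByteBuffer buf) {
        Box[] boxes = new Box[buf.getInt()];
        for (int i = 0; i < boxes.length; i++) {
            int top = buf.getInt();
            int left = buf.getInt();
            int bottom = buf.getInt();
            int right = buf.getInt();
            byte[] name = new byte[buf.getInt()];
            buf.get(name);
            boxes[i] = new Box(top, left, bottom, right,
                               new String(name, StandardCharsets.UTF_8));
        }
        return boxes;
    }
}
```

In practice the writing side would live in the generator (here a C++ program), which would only need to emit the same byte layout; the Java `write` method is included so the round trip can be tried out.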

dcook said:
Isn't that why it has System functions
like System.arraycopy(), because it runs faster at the JVM level than
copying an array yourself in Java?

Not really. It's more that (A) that routine was defined back in the days when
Java was always interpreted. (B) there may still be interpretive JVMs (for
resource-starved devices, perhaps). And (C) it was thought important to make
that routine as fast as possible, and the way to do that is to include it in
the platform, so that the implementation can use whatever "tricks" are
appropriate to that platform's implementation without making users of the
routine platform-dependent themselves.

-- chris
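
For reference, the two ways of copying an array discussed above look like this. The `CopyDemo` class and its method names are hypothetical; the sketch only shows that both forms produce the same result and makes no claim about their relative speed on any particular JVM.

```java
final class CopyDemo {
    // Element-by-element copy written in plain Java.
    static int[] copyByLoop(int[] src) {
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
        return dst;
    }

    // The same copy delegated to the platform routine.
    static int[] copyByArraycopy(int[] src) {
        int[] dst = new int[src.length];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }
}
```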
 
