Differences between one-dimensional arrays in Java and C

Paul Morrison

Hi all,

I need to come up with some differences between arrays in Java and C, I have
searched Google and so far all I have found is the following:

Arrays in Java are reference types with automatic allocation of memory. In
C, arrays are groups of variables of the same type in adjacent memory.
Allocation for dynamic arrays is handled by the programmer.

This is an 8 mark question in an old exam paper, so I am assuming there are
more differences, but where can I find them?!

Thank you for your help.
 
osmium

Paul Morrison said:
I need to come up with some differences between arrays in Java and C, I
have searched Google and so far all I have found is the following:

Arrays in Java are reference types with automatic allocation of memory. In
C, arrays are groups of variables of the same type in adjacent memory.
Allocation for dynamic arrays is handled by the programmer.

This is an 8 mark question in an old exam paper, so I am assuming there
are more differences, but where can I find them?!

I would try this search target on google:
<c arrays shortcomings>
 
Malcolm

Paul Morrison said:
I need to come up with some differences between arrays in Java and C, I
have searched Google and so far all I have found is the following:

Arrays in Java are reference types with automatic allocation of memory. In
C, arrays are groups of variables of the same type in adjacent memory.
Allocation for dynamic arrays is handled by the programmer.

This is an 8 mark question in an old exam paper, so I am assuming there
are more differences, but where can I find them?!

A Java array is basically this:

struct jdouble1d
{
double *mem;
int size;
};

Every time you read or write an element, the Java runtime checks the index
against the size member and throws an exception if it is out of bounds.
Internally it calls malloc() to allocate the memory, and by some magic the
garbage collector knows when the array is no longer reachable and calls
free().
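The struct analogy above can be made concrete with a minimal sketch. The helper names (jd_init, jd_set, jd_get, jd_free) are hypothetical, and since C has no exceptions the checked accessors return -1 where Java would throw. This only illustrates the idea; it is not how any particular JVM is implemented.

```c
#include <stdlib.h>

/* Hypothetical C analogue of a Java double[]: storage plus its length. */
struct jdouble1d {
    double *mem;
    int size;
};

/* Allocate a zero-filled array (all-bits-zero, which is 0.0 on IEEE 754
   systems). Returns 0 on success, -1 if n is negative or allocation fails. */
int jd_init(struct jdouble1d *a, int n)
{
    if (n < 0)
        return -1;
    a->mem = calloc(n > 0 ? (size_t)n : 1, sizeof *a->mem);
    a->size = a->mem ? n : 0;
    return a->mem ? 0 : -1;
}

/* Checked write: refuses an out-of-range index instead of corrupting memory. */
int jd_set(struct jdouble1d *a, int i, double v)
{
    if (i < 0 || i >= a->size)
        return -1;
    a->mem[i] = v;
    return 0;
}

/* Checked read: stores the element in *out, or returns -1 if i is out of range. */
int jd_get(const struct jdouble1d *a, int i, double *out)
{
    if (i < 0 || i >= a->size)
        return -1;
    *out = a->mem[i];
    return 0;
}

/* No garbage collector here: the caller must release the storage. */
void jd_free(struct jdouble1d *a)
{
    free(a->mem);
    a->mem = NULL;
    a->size = 0;
}
```

With a three-element array, jd_set(&a, 2, 4.5) succeeds while jd_set(&a, 3, 1.0) is rejected, which is the check a plain C array never performs.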

A Java 2d array is this

struct jdouble2d
{
struct jdouble1d *mem;
int size;
};

You will see that this has the quirk that not all the sub-arrays need be of
the same length.
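To show the ragged-rows quirk in C terms, here is a small sketch that builds a triangular table in which row r has r + 1 elements. The type and function names (struct row, struct jagged, jagged_triangle, jagged_free) are made up for this example.

```c
#include <stdlib.h>

/* One sub-array that carries its own length (hypothetical name). */
struct row {
    double *mem;
    size_t size;
};

/* The outer array: a table of rows plus the row count. */
struct jagged {
    struct row *rows;
    size_t size;
};

/* Build a triangular table: row r holds r + 1 zero-filled elements.
   Returns 0 on success, -1 on allocation failure (nothing is leaked). */
int jagged_triangle(struct jagged *j, size_t nrows)
{
    size_t r;

    j->rows = calloc(nrows ? nrows : 1, sizeof *j->rows);
    if (j->rows == NULL) {
        j->size = 0;
        return -1;
    }
    j->size = nrows;
    for (r = 0; r < nrows; r++) {
        j->rows[r].mem = calloc(r + 1, sizeof(double));
        if (j->rows[r].mem == NULL) {
            while (r-- > 0)
                free(j->rows[r].mem);
            free(j->rows);
            j->rows = NULL;
            j->size = 0;
            return -1;
        }
        j->rows[r].size = r + 1;
    }
    return 0;
}

/* Release every row, then the table of rows. */
void jagged_free(struct jagged *j)
{
    size_t r;

    for (r = 0; r < j->size; r++)
        free(j->rows[r].mem);
    free(j->rows);
    j->rows = NULL;
    j->size = 0;
}
```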

What is a C array - just a chunk of memory that the program interprets as
being doubles, or chars, or whatever.
To allocate on the stack you use the notation
double buff[123];

To allocate on the heap you call malloc()

double *buff = malloc(123 * sizeof(double));

If you call malloc() you must call free() manually, the compiler doesn't
have a magic garbage collector that does it for you.

Finally, C 2d arrays are a bit tricky. If we declare

double box[8][10];

we get a chunk of memory containing 80 doubles.

box[3][4] = 1;

and
double box2[80];
box2[3 * 10 + 4] = 1;

are essentially equivalent.
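The row-major layout behind this equivalence can be checked without indexing outside any row. The sketch below measures how far box[r][c] lies from the start of the array, in units of double; element_offset and the ROWS/COLS macros are names invented for the example.

```c
#include <stddef.h>

#define ROWS 8
#define COLS 10

/* Distance of box[r][c] from the start of box, counted in doubles.
   Works on byte addresses, so no row is indexed past its end. */
size_t element_offset(double (*box)[COLS], size_t r, size_t c)
{
    const char *base = (const char *)box;
    const char *elem = (const char *)&box[r][c];

    return (size_t)(elem - base) / sizeof(double);
}
```

For a double box[ROWS][COLS], element_offset(box, 3, 4) evaluates to 3 * COLS + 4, the same index that the flat array uses.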
However you can also construct a 2d array by calling lots of mallocs().

double **box = malloc(8 * sizeof(double *));
for(i=0;i<8;i++)
    box[i] = malloc(10 * sizeof(double));

and you use the same notation
box[3][4] = 1;

as for the 2d array, despite the fact that the underlying representation is
quite different.
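For completeness, a sketch of the many-mallocs approach with error checking and the matching cleanup. The function names box_alloc and box_free are hypothetical, and the sketch does not guard against overflow in the size multiplications.

```c
#include <stdlib.h>

/* Allocate rows x cols doubles as a table of row pointers.
   Returns NULL on failure, releasing anything already allocated. */
double **box_alloc(size_t rows, size_t cols)
{
    size_t i;
    double **box = malloc(rows * sizeof *box);

    if (box == NULL)
        return NULL;
    for (i = 0; i < rows; i++) {
        box[i] = malloc(cols * sizeof **box);
        if (box[i] == NULL) {
            while (i-- > 0)
                free(box[i]);
            free(box);
            return NULL;
        }
    }
    return box;
}

/* Every malloc() needs a matching free(): rows first, then the table. */
void box_free(double **box, size_t rows)
{
    size_t i;

    if (box == NULL)
        return;
    for (i = 0; i < rows; i++)
        free(box[i]);
    free(box);
}
```

After double **box = box_alloc(8, 10), the expression box[3][4] = 1 works as before, but each row is a separate allocation and the rows need not be adjacent in memory.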

The advantage of Java is that an out-of-bounds access cannot corrupt the
computer's memory, and you do not need to store the array size separately.
The advantage of C is that array accesses compile to direct memory reads and
writes, and so are faster.
 
Axter

Malcolm said:
A Java array is basically this

struct jdouble1d
{
double *mem;
int size;
};

Every time you read or write an element, the Java runtime checks the index
against the size member and throws an exception if it is out of bounds.
Internally it calls malloc() to allocate the memory, and by some magic the
garbage collector knows when the array is no longer reachable and calls
free().

A Java 2d array is this

struct jdouble2d
{
struct jdouble1d *mem;
int size;
};

You will see that this has the quirk that not all the sub-arrays need be of
the same length.

What is a C array - just a chunk of memory that the program interprets as
being doubles, or chars, or whatever.
To allocate on the stack you use the notation
double buff[123];

To allocate on the heap you call malloc()

double *buff = malloc(123 * sizeof(double));

If you call malloc() you must call free() manually, the compiler doesn't
have a magic garbage collector that does it for you.

Finally, C 2d arrays are a bit tricky. If we declare

double box[8][10];

we get a chunk of memory containing 80 doubles.

box[3][4] = 1;

and
double box2[80];
box2[3 * 10 + 4] = 1;

are essentially equivalent.
However you can also construct a 2d array by calling lots of
mallocs().

You can also construct a 2D array by just calling malloc twice.
See C FAQ item 6.16, and look at the second example code.
http://www.eskimo.com/~scs/C-faq/q6.16.html

Also take a look at the following links:
http://code.axter.com/allocate2darray.h
http://code.axter.com/allocate2darray.c
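A rough sketch of the two-malloc idea, written from the description rather than copied from the FAQ or the linked files: one allocation holds the row pointers, the other holds all the elements contiguously. The names box2_alloc and box2_free are hypothetical, and size overflow is not checked.

```c
#include <stdlib.h>

/* Two allocations: a table of row pointers and one contiguous element block.
   Returns NULL on failure or if either dimension is zero. */
double **box2_alloc(size_t rows, size_t cols)
{
    size_t i;
    double **box;
    double *cells;

    if (rows == 0 || cols == 0)
        return NULL;
    box = malloc(rows * sizeof *box);
    if (box == NULL)
        return NULL;
    cells = malloc(rows * cols * sizeof *cells);
    if (cells == NULL) {
        free(box);
        return NULL;
    }
    for (i = 0; i < rows; i++)
        box[i] = cells + i * cols;   /* row i starts i * cols elements in */
    return box;
}

/* Two frees undo the two mallocs: box[0] is the start of the element block. */
void box2_free(double **box)
{
    if (box == NULL)
        return;
    free(box[0]);
    free(box);
}
```

Because the elements are contiguous, the whole block can be handed to code that expects a flat array, which the one-malloc-per-row version cannot offer.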

double **box = malloc(8 * sizeof(double *));
for(i=0;i<8;i++)
    box[i] = malloc(10 * sizeof(double));

and you use the same notation
box[3][4] = 1;

as for the 2d array, despite the fact that the underlying representation is
quite different.

The advantage of Java is that an out-of-bounds access cannot corrupt the
computer's memory, and you do not need to store the array size separately.
The advantage of C is that array accesses compile to direct memory reads and
writes, and so are faster.
 
Christian Kandeler

Malcolm said:
What is a C array - just a chunk of memory that the program interprets as
being doubles, or chars, or whatever.
To allocate on the stack you use the notation
double buff[123];

To allocate on the heap you call malloc()

double *buff = malloc(123 * sizeof(double));

The latter is not an array.


Christian
 
Keith Thompson

Christian Kandeler said:
Malcolm said:
What is a C array - just a chunk of memory that the program interprets as
being doubles, or chars, or whatever.
To allocate on the stack you use the notation
double buff[123];

To allocate on the heap you call malloc()

double *buff = malloc(123 * sizeof(double));

The latter is not an array.

Sure it is. Specifically, buff is not an array (it's a pointer), but
the object allocated by malloc() is an array (or at least can be
treated as one).
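One place where the pointer/array distinction is visible is sizeof. The sketch below uses a hypothetical ARRAY_LEN macro and two demo functions to contrast the two declarations from the quoted post.

```c
#include <stddef.h>
#include <stdlib.h>

/* Element count of a true array object; gives a wrong answer for a pointer. */
#define ARRAY_LEN(a) (sizeof (a) / sizeof (a)[0])

/* buff is an array here, so sizeof sees all 123 elements. */
size_t demo_array_len(void)
{
    double buff[123];

    return ARRAY_LEN(buff);
}

/* buff is a pointer here, so sizeof reports the pointer's own size. */
size_t demo_pointer_size(void)
{
    double *buff = malloc(123 * sizeof *buff);
    size_t n = sizeof buff;   /* not 123 * sizeof(double) */

    free(buff);
    return n;
}
```

demo_array_len() returns 123, while demo_pointer_size() returns sizeof(double *) no matter how much memory malloc() handed back, so the length of a malloc'd block has to be tracked by the programmer.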
 
Tim Rentsch

Axter said:
Malcolm said:
[unrelated stuff snipped]

However you can also construct a 2d array by calling lots of
mallocs().

You can also construct a 2D array by just calling malloc twice.
See C FAQ item 6.16, and look at the second example code.
http://www.eskimo.com/~scs/C-faq/q6.16.html

Now here's an interesting question. I believe it's possible to
construct a 2D array (with both dimensions non-constant) by calling
malloc only once. For example:

T *elements, **array;
size_t element_size = sizeof *elements;
size_t pointer_size = sizeof *array;

size_t element_space = element_size * M * N; /* array is [M][N] */
size_t pointer_space = pointer_size * M; /* only need M pointers */

size_t extra_space = pointer_size-1 - (element_space-1) % pointer_size;
size_t space_needed = element_space + extra_space + pointer_space;

void *memory = malloc( space_needed );

size_t i;

if( memory == 0 ) exit(1);

elements = memory;

array = (T**)memory + (element_space + extra_space) / pointer_size;

for( i = 0; i < M; i++ ){
    array[i] = & elements[N*i];
}

/* array now can be used as a 2D array */
/* to free, use 'free( & array[0][0] )' */


Since the memory returned by malloc is suitably aligned for all types,
both 'elements' and 'array' are suitably aligned for objects of their
respective types. Note that 'extra_space' is calculated so that
'element_space + extra_space' is a multiple of 'pointer_size'.

Each of the two areas of memory (at 'elements' and 'array') is used
only to store/retrieve objects of their respective types.

According to my best understanding the code above should work in
standard C. But I put it as a question - does anyone have an argument
to the contrary?

If someone has a supporting argument I'm happy to hear that also. :)
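For anyone who wants to try the scheme, here is the same layout wrapped in a pair of functions, with T fixed to double as an assumption for the example. The names alloc2d and free2d are hypothetical, and the size arithmetic is not checked for overflow.

```c
#include <stdlib.h>

/* One malloc: m * n doubles, then padding, then m row pointers.
   Returns NULL on failure or if either dimension is zero. */
double **alloc2d(size_t m, size_t n)
{
    size_t pointer_size = sizeof(double *);
    size_t element_space, pointer_space, extra_space;
    size_t i;
    char *memory;
    double *elements;
    double **array;

    if (m == 0 || n == 0)
        return NULL;

    element_space = sizeof(double) * m * n;
    pointer_space = pointer_size * m;
    /* Pad so that the pointer table starts at a multiple of pointer_size. */
    extra_space = pointer_size - 1 - (element_space - 1) % pointer_size;

    memory = malloc(element_space + extra_space + pointer_space);
    if (memory == NULL)
        return NULL;

    elements = (double *)memory;
    array = (double **)(memory + element_space + extra_space);
    for (i = 0; i < m; i++)
        array[i] = &elements[n * i];
    return array;
}

/* The elements sit at the start of the block, so that address is freed. */
void free2d(double **array)
{
    if (array != NULL)
        free(&array[0][0]);
}
```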
 
Eric Sosman

Tim said:
Now here's an interesting question. I believe it's possible to
construct a 2D array (with both dimensions non-constant) by calling
malloc only once. For example:

[allocate space for elements, padding, and pointers]

Since the memory returned by malloc is suitably aligned for all types,
both 'elements' and 'array' are suitably aligned for objects of their
respective types. Note that 'extra_space' is calculated so that
'element_space + extra_space' is a multiple of 'pointer_size'.

Each of the two areas of memory (at 'elements' and 'array') is used
only to store/retrieve objects of their respective types.

According to my best understanding the code above should work in
standard C. But I put it as a question - does anyone have an argument
to the contrary?

Looks all right. A type's alignment requirement must be
a divisor of its size (otherwise arrays wouldn't work), so the
`extra_space' will be enough padding to get the pointers to
align properly. In fact, `extra_space' may be more than is
needed; if you really want to be parsimonious you can use a
dodge that someone posted here a year or two ago:

#include <stddef.h>
#define alignof(T) offsetof(struct {char c; T t;}, t)

Manually-inserted padding can also be used to write portable
versions of "the struct hack" for C90 implementations (C99
introduced an easier notation).
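A small sketch of how the macro might be used to compute tighter padding. It is renamed ALIGN_OF here, an assumption made to avoid clashing with the alignof that C11 provides through <stdalign.h> (C11 also has the _Alignof operator, which makes the trick unnecessary on newer compilers). The round_up helper is hypothetical.

```c
#include <stddef.h>

/* Offset of t after a single char: a sufficient alignment for T. */
#define ALIGN_OF(T) offsetof(struct { char c; T t; }, t)

/* Round n up to the next multiple of align (align must be nonzero). */
size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) / align * align;
}
```

In the single-malloc layout, the offset of the pointer table could then be computed as round_up(element_space, ALIGN_OF(T *)) instead of padding to a multiple of the pointer size.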
 
pete

Tim said:
Axter said:
Malcolm said:
[unrelated stuff snipped]

However you can also construct a 2d array by calling lots of
mallocs().

You can also construct a 2D array by just calling malloc twice.
See C FAQ item 6.16, and look at the second example code.
http://www.eskimo.com/~scs/C-faq/q6.16.html

Now here's an interesting question. I believe it's possible to
construct a 2D array (with both dimensions non-constant) by calling
malloc only once. For example:

T *elements, **array;
size_t element_size = sizeof *elements;
size_t pointer_size = sizeof *array;

size_t element_space = element_size * M * N; /* array is [M][N] */
size_t pointer_space = pointer_size * M; /* only need M pointers */

size_t extra_space = pointer_size-1 - (element_space-1) % pointer_size;
size_t space_needed = element_space + extra_space + pointer_space;

void *memory = malloc( space_needed );

size_t i;

if( memory == 0 ) exit(1);

elements = memory;

array = (T**)memory + (element_space + extra_space) / pointer_size;

for( i = 0; i < M; i++ ){
    array[i] = & elements[N*i];
}

/* array now can be used as a 2D array */
/* to free, use 'free( & array[0][0] )' */

Since the memory returned by malloc is suitably aligned for all types,
both 'elements' and 'array' are suitably aligned for objects of their
respective types. Note that 'extra_space' is calculated so that
'element_space + extra_space' is a multiple of 'pointer_size'.

Each of the two areas of memory (at 'elements' and 'array') is used
only to store/retrieve objects of their respective types.

According to my best understanding the code above should work in
standard C. But I put it as a question - does anyone have an argument
to the contrary?


Maybe.
Since pointer arithmetic is being done with array,
is it enough for array to be lined up with (T *),
or is the alignment requirement that array be lined up with an
array of (T *), which might have a larger alignment requirement?

I seem to recall that the case of

int i_array_2[2][2] = {0};
int *ptr = (int *)&i_array_2;

ptr[3] = 0;

wasn't acceptable to Doug Gwyn and several others on comp.std.c
based on the fact that there was no integer array with four elements,
even though there was certainly an object big enough
and also aligned for int.
 
Tim Rentsch

Eric Sosman said:
Tim said:
Now here's an interesting question. I believe it's possible to
construct a 2D array (with both dimensions non-constant) by calling
malloc only once. For example:

[allocate space for elements, padding, and pointers]

[yada yada yada]

Looks all right. A type's alignment requirement must be
a divisor of its size (otherwise arrays wouldn't work), so the
`extra_space' will be enough padding to get the pointers to
align properly. In fact, `extra_space' may be more than is
needed;

Yes it might! Normally the unneeded extra space (and even 'extra_space'
itself) won't be too big: I deliberately put the pointer array second
so that at most sizeof(T*)-1 extra space is needed. Usually a few
bytes at most.

if you really want to be parsimonious you can use a
dodge that someone posted here a year or two ago:

#include <stddef.h>
#define alignof(T) offsetof(struct {char c; T t;}, t)

Manually-inserted padding can also be used to write portable
versions of "the struct hack" for C90 implementations (C99
introduced an easier notation).

That's a nice trick. I'll definitely have to add that to my set of
arcane C knowledge.

Strictly speaking the definition shown isn't guaranteed to work. If
sizeof(T) is 8 and alignment_of(T) is 2, the result of alignof(T)
might be 2 or 4 or 6, or even 22. Using GCD( alignof(T), sizeof(T) )
should at least produce a result that is guaranteed to work (right?).
But using GCD still isn't enough to guarantee that the minimum
alignment necessary will result.

Furthermore I can envision cases where it might reasonably not yield
the minimum alignment needed; a compiler might choose to put the 't'
member of the struct above at offset 4 (rather than 2) in an attempt
to get better cache performance, for example.

Despite my protestations, a good technique to know. Thanks.
 
Tim Rentsch

pete said:
Tim said:
Now here's an interesting question. I believe it's possible to
construct a 2D array (with both dimensions non-constant) by calling
malloc only once. For example:

T *elements, **array;
[rest of code snipped]

[incidental snippage]

According to my best understanding the code above should work in
standard C. But I put it as a question - does anyone have an argument
to the contrary?

Maybe.
Since pointer arithmetic is being done with array,
is it enough for array to be lined up with (T *),
or is the alignment requirement that array be lined up with an
array of (T *), which might have a larger alignment requirement?

The alignment of types 'T*' and 'T*[]' can't be any larger than
sizeof(T*). If the memory had been used as an array with a definite
size greater than one (such as 'T *(*)[2]') that would be a different
story. But since no array bounds are used in the code that accesses
the array, the granularity of sizeof(T*) is enough.

I seem to recall that the case of

int i_array_2[2][2] = {0};
int *ptr = (int *)&i_array_2;

ptr[3] = 0;

wasn't acceptable to Doug Gwyn and several others on comp.std.c
based on the fact that there was no integer array with four elements,
even though there was certainly an object big enough
and also aligned for int.

That's because an array with known bounds is allowed to "know" how big
it is, not because of alignment. When bounds are not known, as in the
(previously) posted code, it's ok to access any memory that's actually
there (assuming suitable alignment, which was mentioned above).
 
Eric Sosman

Tim said:
That's a nice trick. I'll definitely have to add that to my set of
arcane C knowledge.

Not Invented By Me; one of many clever things NIBM.

Strictly speaking the definition shown isn't guaranteed to work. If
sizeof(T) is 8 and alignment_of(T) is 2, the result of alignof(T)
might be 2 or 4 or 6, or even 22. Using GCD( alignof(T), sizeof(T) )
should at least produce a result that is guaranteed to work (right?).
But using GCD still isn't enough to guarantee that the minimum
alignment necessary will result.

Yah. It is "guaranteed to work" in the sense that it will
compute an alignment that suffices for type T. As you point out,
though, it is not guaranteed to compute the *minimal* alignment
for type T. (On the other hand, "minimal alignment" is something
that -- as far as I can see -- is not testable in a conforming C
program.)

GCD might improve the answer, but it still isn't guaranteed
to be minimal -- also, it's difficult to compute in the form of
a constant expression, which is often desirable in contexts where
games of this sort are played. Myself, I generally stick with
sizeof(T) as a reasonable approximation to the alignment; it may
well overstate the requirement, but not by much (so long as I
avoid using really silly types for T).
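For comparison, the GCD variant is easy enough at run time, even though it is awkward as a constant expression. The sketch below assumes the offsetof-based macro from earlier in the thread, renamed ALIGN_OF to stay clear of C11's alignof; gcd and ALIGN_ESTIMATE are hypothetical names.

```c
#include <stddef.h>

/* Offset of t after a single char: a sufficient alignment for T. */
#define ALIGN_OF(T) offsetof(struct { char c; T t; }, t)

/* Euclid's algorithm; gcd(a, 0) is a. */
size_t gcd(size_t a, size_t b)
{
    while (b != 0) {
        size_t r = a % b;
        a = b;
        b = r;
    }
    return a;
}

/* Run-time estimate that always divides sizeof(T); not a constant expression. */
#define ALIGN_ESTIMATE(T) gcd(ALIGN_OF(T), sizeof(T))
```

With the numbers from the example above, a size of 8 and a reported offset of 6 or 22 both reduce to 2, which divides the size as a true alignment must.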
 
Lawrence Kirby

Paul Morrison said:
Hi all,

I need to come up with some differences between arrays in Java and C, I have
searched Google and so far all I have found is the following:

Arrays in Java are reference types

OK

with automatic allocation of memory. In

I'm not sure what you are trying to say here. You have to allocate memory
for arrays explicitly with the new operator in Java and the size is
fixed once allocated. I don't see much "automatic" here. I suggest you
discuss this further in a Java related newsgroup.

C, arrays are groups of
variables of the same type in adjacent memory.

True. It is also true for arrays of arrays.

Allocation for dynamic arrays is handled by the programmer.

Java doesn't really have dynamic arrays in the sense that C does (with
realloc).
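As an illustration of what "dynamic" means on the C side, here is a minimal sketch of an array that grows through realloc(). The struct dvec type, the dvec_push function, and the doubling policy are assumptions made for the example.

```c
#include <stdlib.h>

/* A growable array of doubles; start with { NULL, 0, 0 }. */
struct dvec {
    double *data;
    size_t len;   /* elements in use */
    size_t cap;   /* elements allocated */
};

/* Append x, doubling the capacity when full. Returns 0 on success, -1 if
   realloc() fails, in which case the existing contents are left intact. */
int dvec_push(struct dvec *v, double x)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        double *p = realloc(v->data, new_cap * sizeof *p);

        if (p == NULL)
            return -1;
        v->data = p;
        v->cap = new_cap;
    }
    v->data[v->len++] = x;
    return 0;
}
```

The caller still owns the storage and releases it with free(v.data) when done. A Java array, by contrast, keeps the length it was created with, and growing it means allocating a new array and copying the elements across.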
This is an 8 mark question in an old exam paper, so I am assuming there
are more differences, but where can I find them?!

C's arrays are very simple, Java's are more complex. Your best bet is to
understand C's arrays and investigate Java's noting the differences. It
makes more sense to discuss this in a Java related newsgroup.

Lawrence
 
Malcolm

Lawrence Kirby said:
I'm not sure what you are trying to say here. You have to allocate memory
for arrays explicitly with the new operator in Java and the size is
fixed once allocated. I don't see much "automatic" here. I suggest you
discuss this further in a Java related newsgroup.

I think what the writer is trying to say is that in C you have direct access
to the memory. In Java you have no control over what the virtual machine is
doing - it may be written in C and calling malloc() internally, or it may be
swapping data in and out of the cache in some wonderful optimisation scheme.
So in Java the memory is managed "automatically".
 
Lawrence Kirby

Malcolm said:
I think what the writer is trying to say is that in C you have direct access
to the memory. In Java you have no control over what the virtual machine is
doing - it may be written in C and calling malloc() internally, or it may be
swapping data in and out of the cache in some wonderful optimisation scheme.
So in Java the memory is managed "automatically".

The same is true in C, you don't control the internal workings of
malloc(). That could call other allocators or whatever magic it wants.
Memory mapping, swapping is nothing unusual for environments running C
programs, and it happens without the C program being aware of it.

Lawrence
 
Tim Rentsch

Eric Sosman said:
Yah. It is "guaranteed to work" in the sense that it will
compute an alignment that suffices for type T. As you point out,
though, it is not guaranteed to compute the *minimal* alignment
for type T. (On the other hand, "minimal alignment" is something
that -- as far as I can see -- is not testable in a conforming C
program.)

Right, the result of 'alignof' suffices. When I said the definition
isn't guaranteed to work what I meant was it isn't guaranteed to
produce a result that divides sizeof(T), which the "real" alignment
must do. Similarly using the GCD will produce a result that is
guaranteed to divide sizeof(T), is a multiple of the "real" alignment,
and is the best information available under the circumstances.

I agree with your comment that the "minimal alignment" of a type (the
same as what I called "real" alignment) is not discoverable in a
conforming C program (assuming that it's greater than 1 of course).

GCD might improve the answer, but it still isn't guaranteed
to be minimal -- also, it's difficult to compute in the form of
a constant expression, which is often desirable in contexts where
games of this sort are played. Myself, I generally stick with
sizeof(T) as a reasonable approximation to the alignment; it may
well overstate the requirement, but not by much (so long as I
avoid using really silly types for T).

Yes, I agree, at least for base types; for structs or arrays it
seems like it can be worthwhile in some circumstances to get a better
estimate using an alignof-like technique.

You're definitely right that it's difficult to compute GCD in the form
of a constant expression. I played around with various approximate
forms, hoping that some approximate form would produce accurate
results in most circumstances of practical interest, but it's not that
easy. So if one wants a "compile time" result I think the best way
to get it is to compile a small program that computes the answer and
feed that back in to a subsequent compile via a generated header or
something similar. What a pain.

Just out of curiosity, has there been any serious discussion about
having an 'alignof( type name )' capability be added to the standard?
 
