initialize for a 2-D dynamic array.

M

MBALOVER

HI all,

I want to create a 2D dynamic array and then initialize all elements
with 0.

I did a search and found:

1. array = (** unsigned char) calloc(NROW, sizeof(unsigned char*));

2. for(i = 0; i < NROW ; i++)
3. {
4. array = calloc(NCOL, sizeof(unsigned char));
5. }


I am wondering if I can replace calloc in line 1 by malloc function
and still get a 2D array with all array[j]=0.

By the way, I am wondering if malloc will be faster than calloc?

In my understanding, calloc is equivalent to malloc plus zero-
initialization. Therefore calloc may be slower than malloc. Is it
right?

Thanks
 
S

Seebs

I want to create a 2D dynamic array and then initialize all elements
with 0.
Really?

Hmm.

I did a search and found:
1. array = (** unsigned char) calloc(NROW, sizeof(unsigned char*));

This is clearly incorrect.
2. for(i = 0; i < NROW ; i++)
3. {
4. array = calloc(NCOL, sizeof(unsigned char));
5. }


Are NROW and NCOL known values? If they are, then you can do this much more
simply.
I am wondering if I can replace calloc in line 1 by malloc function
and still get a 2D array with all array[j]=0.


No, because malloc doesn't guarantee initialized memory. In general, you
can't allocate a multidimensional array of unknown size.
By the way, I am wondering if malloc will be faster than calloc?

Maybe, maybe not.
In my understanding, calloc is equivalent to malloc plus zero-
initialization. Therefore calloc may be slower than malloc. Is it
right?

Maybe. Honestly: You are not at a stage where you should be trying to
figure things like that out. Understand what they do, don't get caught
up trying to figure out what's "faster". Write for clarity and
understanding for now -- you can work on speed once you can make programs
that do what you want them to do.

-s
 
I

Ike Naar

I want to create a 2D dynamic array and then initialize all elements
with 0.

I did a search and found:

1. array = (** unsigned char) calloc(NROW, sizeof(unsigned char*));

2. for(i = 0; i < NROW ; i++)
3. {
4. array = calloc(NCOL, sizeof(unsigned char));
5. }

I am wondering if I can replace calloc in line 1 by malloc function
and still get a 2D array with all array[j]=0.


You can; initializing each array with all-bytes-zero in line 1
is not useful since you immediately overwrite each array with a
pointer value in lines 2-5.
In face, even if you wouldn't overwrite each array with a pointer value,
initializing each array with all-bytes-zero would not be very useful,
because all-bytes-zero is not guaranteed to represent a valid pointer.

So I agree that it's better to replace calloc in line 1 by malloc.
 
I

Ike Naar

This is clearly incorrect.

Why? It looks like mbalover is constructing a "dynamic 2D array"
as an array of pointers, pointing to 1D arrays. What's incorrect
about that? It's a well-known idiom.
2. for(i = 0; i < NROW ; i++)
3. {
4. array = calloc(NCOL, sizeof(unsigned char));
5. }


Are NROW and NCOL known values? If they are, then you can do this much more
simply.


You mean, like "char array[NROW][NCOL]"? That's indeed a possibility,
but apparently not what mbalover wants (perhaps he wants to be able to
later resize his "dynamic 2D array", something he can't do with a
regular 2D array).
I am wondering if I can replace calloc in line 1 by malloc function
and still get a 2D array with all array[j]=0.


No, because malloc doesn't guarantee initialized memory. In general, you
can't allocate a multidimensional array of unknown size.


I think you are using the term "multidimensional array" in a too
restrictive way in this context.
Maybe, maybe not.


Maybe. Honestly: You are not at a stage where you should be trying to
figure things like that out. Understand what they do, don't get caught
up trying to figure out what's "faster". Write for clarity and
understanding for now -- you can work on speed once you can make programs
that do what you want them to do.

Agreed; the speed difference between calloc and malloc (if there is one,
and if it is measurable) is probably the last thing to worry about.
 
B

Ben Bacarisse

Why? It looks like mbalover is constructing a "dynamic 2D array"
as an array of pointers, pointing to 1D arrays. What's incorrect
about that?

It has a syntax error. (** unsigned char) is not a valid cast operator.

<snip>
 
B

Ben Bacarisse

MBALOVER said:
I want to create a 2D dynamic array and then initialize all elements
with 0.

This is a FAQ: http://c-faq.com/aryptr/dynmuldimary.html

I like the second suggested method, though I tend to write it like this:

if (ptrs = malloc(NROWS * sizeof *ptrs))
if (ptrs[0] = malloc(NCOLS * NROWS * sizeof *ptrs[0]))
for (int r = 1; r < NROWS; r++)
ptrs[r] = &ptrs[r - 1][NCOLS];
else free(ptrs);

where 'ptrs' is of type T ** for some type T. Using only two malloc
calls make checking and freeing simpler.
I am wondering if I can replace calloc in line 1 by malloc function
and still get a 2D array with all array[j]=0.


To get a zero-initialised array insert a memset call in the inner if:

if (ptrs[0] = malloc(NCOLS * NROWS * sizeof *ptrs[0])) {
memset(ptrs[0], 0, NCOLS * NROWS * sizeof *ptrs[0]);
for (int r = 1; r < NROWS; r++)
ptrs[r] = &ptrs[r - 1][NCOLS];
}
else free(ptrs);

but this is only guaranteed to work for integer types (though it is
highly likely to work for floating-point types and pointer types).

By the way, if you really do need a 2D array of char (as in your
example) then you can use only one malloc call if you want since char
needs no special alignment.

<snip>
 
E

Ersek, Laszlo

HI all,

I want to create a 2D dynamic array and then initialize all elements
with 0.

Ignoring the initialization for a second (because you didn't specify the
element type), I think that the following enables the most flexible
usage, if your rows are all full:

#include <stdlib.h>

type *
get_type2d(size_t rows, size_t cols)
{
return (0u < cols && (size_t)-1 / sizeof(type) / cols >= rows)
? malloc(rows * cols * sizeof(type))
: 0;
}

You would index the allocated array like this:

{
type *arr;

arr = get_type2d(nrows, ncols);
if (0 != arr) {
/* Store ... in array[row][col], conceptually. */
arr[row * ncols + col] = ...;

free(arr);
}
}


This can be generalized for N dimensional arrays. The following function
takes "long unsigned" arguments instead of "size_t", because "size_t"
may be promoted by the default argument promotions, and that doesn't
play very well with va_arg().

#include <stdlib.h>
#include <stdarg.h>

/*
Usage:

get_ndim(el_size, dim_1, dim_2, ..., dim_n, 0LU);

Allocate memory for dim_1 * dim_2 * ... * dim_n elements, each being
of size "el_size". All arguments must be of type "long unsigned". The
variable argument list must be terminated with 0LU.

get_ndim(el_size, 0LU)

allocates a zero-dimensional array (= space for a single element).

The caller is responsible for not passing 0LU as el_size.

The function returns a pointer returned by malloc() (which can be 0),
or 0 if the wanted size (in bytes) cannot be represented in size_t.
*/

void *
get_ndim(long unsigned el_size, ...)
{
size_t max_el,
num_el;
va_list ap;
long unsigned next;

max_el = (size_t)-1 / el_size;
num_el = 1u;

va_start(ap, el_size);
while (0LU != (next = va_arg(ap, long unsigned))
&& max_el / next >= num_el) {
num_el *= next;
}
va_end(ap);

return 0LU == next ? malloc(num_el * el_size) : 0;
}


The access pattern implements the Horner scheme.

http://en.wikipedia.org/wiki/Horner_scheme

An example for the Horner scheme with auto arrays might be:

{
type arr[4][5][6]; /* 120 */

arr[1][2][3] = ...; /* [1 * (5*6) + 2 * (6) + 3], [45] */
}

-->

{
type arr[4 * (5 * (6))]; /* 120 */

arr[(1 * 5 + 2) * 6 + 3] = ...; /* [45] */
}


Thus

/*
If arr denotes the non-null return value of

get_ndim(el_size, dim_1, ... dim_n, 0LU)

that is, arr is conceptually

array[dim_1][dim_2] ... [dim_n]

then the conceptual element with valid subscripts

array[i_1][i_2] ... [i_n]

can be accessed with

arr[get_idx(i_1, dim_2, i_2, ..., dim_n, i_n, 0LU)]

For zero-dimensional arrays, the following is valid:

get_ndim(el_size, 0LU)[get_idx(0LU, 0LU)]

*/

size_t
get_idx(long unsigned outermost_idx, ...)
{
size_t sum;
va_list ap;
long unsigned next_size;

sum = outermost_idx;
va_start(ap, outermost_idx);
while (0LU != (next_size = va_arg(ap, long unsigned))) {
sum = sum * next_size + va_arg(ap, long unsigned);
}

return sum;
}


Usage:

int main(void)
{
double *mtr;

/* allocate 100 4x4 matrices */
mtr = get_ndim(sizeof *mtr, 100LU, 4LU, 4LU, 0LU);
if (0 != mtr) {

/* Set the 59th matrix to 0.0 -- starting at &matrix[58][0][0]. */
size_t base,
idx;
base = get_idx(58LU, 4LU, 0LU, 4LU, 0LU, 0LU);
for (idx = 0u; idx < 16u; ++idx) {
mtr[base + idx] = 0.0;
}

free(mtr);
}
return 0;
}


Cheers,
lacos
 
E

Ersek, Laszlo

size_t
get_idx(long unsigned outermost_idx, ...)
{
size_t sum;
va_list ap;
long unsigned next_size;

sum = outermost_idx;
va_start(ap, outermost_idx);
while (0LU != (next_size = va_arg(ap, long unsigned))) {
sum = sum * next_size + va_arg(ap, long unsigned);
}

return sum;
}

Obviously, I forgot va_end(ap) before returning.

Sorry,
lacos
 
M

Malcolm McLean

This can be generalized for N dimensional arrays. The following function
takes "long unsigned" arguments instead of "size_t", because "size_t"
may be promoted by the default argument promotions, and that doesn't
play very well with va_arg().
Ho ho, hadn't thought of that. Neither, I bet, had the committee.
size_t introduces all sorts of subtle difficulties.
 
E

Ersek, Laszlo

Ho ho, hadn't thought of that. Neither, I bet, had the committee.

I'm sure they did; after all, C99 has %z. Since any specific
implementation of the standard library knows intimately what size_t is
promoted to (if at all), they can use the promoted type in va_arg() in
the implementation of *printf().

Granted, if one wishes to take size_t's through the ellipsis in a
portable function, he/she has to transfer "long unsigned"'s instead --
in C90.

In C99, the previous paragraph works with s/long unsigned/uintmax_t/g,
but I think there is also a chance to figure out the type size_t is
promoted to -- C99 has SIZE_MAX.

(I wouldn't try it, though, with padding bits and whatnot :))

Cheers,
lacos
 
S

Seebs

Why? It looks like mbalover is constructing a "dynamic 2D array"
as an array of pointers, pointing to 1D arrays. What's incorrect
about that? It's a well-known idiom.

"(** unsigned char)".

I'm pretty sure the * would have to go after the base type.
You mean, like "char array[NROW][NCOL]"? That's indeed a possibility,
but apparently not what mbalover wants (perhaps he wants to be able to
later resize his "dynamic 2D array", something he can't do with a
regular 2D array).

Probably. But if he knows what NCOL is at compile time, he can allocate space
for an array of arrays of size NCOL, and that'll work.
Agreed; the speed difference between calloc and malloc (if there is one,
and if it is measurable) is probably the last thing to worry about.

It's also extremely hard to predict. I'm aware of a particular test case
where calloc() appeared to be much faster than using malloc() and then
initializing memory... But only if you allocated a large space and then
didn't touch most of it. If the space was smaller, or if you touched most
of it, calloc() was much slower.

(I'm not going to explain why right away, because I bet there's people who
would love a chance to kick that one around a bit.)

-s
 
B

Ben Bacarisse

This can be generalized for N dimensional arrays. The following function
takes "long unsigned" arguments instead of "size_t", because "size_t"
may be promoted by the default argument promotions, and that doesn't
play very well with va_arg().

That's a bit vague. What is the problem with size_t that unsigned
long solves?

<snip>
 
E

Ersek, Laszlo

Ben Bacarisse said:
(e-mail address removed) (Ersek, Laszlo) writes:

That's a bit vague. What is the problem with size_t that unsigned
long solves?

(I chose long unsigned instead of uintmax_t for C90's sake, but that's
not significant now.)

C90 7.8.1.2 The va-arg macro

----v----
Synopsys

#include <stdarg.h>

type va_arg(va_list ap, type);

Description

The *va_arg* macro expands to an expression that has the type and value
of the next argument in the call. The parameter *ap* shall be the same
as the *va_list ap* initialized by *va_start*. Each invocation of
*va_arg* modifies *ap* so that the values of successive arguments are
returned in turn. The parameter /type/ is a type name specified such
that the type of a pointer to an object that has the specified type can
be obtained simply by postfixing a * to /type/. If there is no actual
next argument, or if /type/ is not compatible with the type of the
actual next argument (as promoted according to the default argument
promotions), the behavior is undefined.

[...]
----^----

C90 7.1.6 Common definitions <stddef.h>

----v----
[...]

size_t

which is the unsigned integral type of the result of the *sizeof*
operator

[...]
----^----

C90 6.3.3.4 The sizeof operator

----v----
[...]

The result is an integer constant.

[...]
----^----

In my interpretation all these permit size_t to be "short unsigned".
"short unsigned" is promoted to "int" or "unsigned" by the integral
promotions as part of the default argument promotions ("integral" is the
word in C90). Hence,

- size_t may be "short unsigned",
- it is then promoted to "int" or "unsigned",
- making size_t and its promoted type incompatible,
- which makes va_arg(ap, size_t) undefined behavior.

Choosing "long unsigned" (or uintmax_t under C99) solves this: they are
unaffected by integral (integer) promotions, and they can hold any
size_t value (no matter what size_t is).


(In my interpretation, C90 7.8.1.2 wishes to allow the implementation to
include a dereference like

*(type *)ap

in the implementation of va_arg(). Now if size_t is "short unsigned",
and therefore it is promoted to "int" (or "unsigned"), and if "short
unsigned" and its promoted type actually have different representations
(eg. by having different sizes), then the access above may translate to
something like

{
uint32_t u;

*(uint16_t *)&u;
}

which is kind of lossy, especially on big-endian.)

Cheers,
lacos
 
E

Ersek, Laszlo

C90 6.3.3.4 The sizeof operator

----v----
[...]

The result is an integer constant.

[...]
----^----

In my interpretation all these permit size_t to be "short unsigned".

Oh no! After looking up C90 6.1.3.2 "Integer constants", I realize
nothing with a lower conversion rank than that of "int" can be size_t.
And thus size_t is not affected by the integer promotions.

Sorry :)

lacos
/facepalm
 
E

Ersek, Laszlo

Seebs said:
I'm aware of a particular test case
where calloc() appeared to be much faster than using malloc() and then
initializing memory... But only if you allocated a large space and then
didn't touch most of it. If the space was smaller, or if you touched most
of it, calloc() was much slower.

(I'm not going to explain why right away, because I bet there's people who
would love a chance to kick that one around a bit.)

You are a skilled manipulator (and I'm easy game).

If calloc() sees that the size of the region to allocate exceeds (or
reaches?) the mmap threshold of the malloc() implementation, then it
mmap()'s a new VMA instead of picking an already mapped (and previously
used) part of the heap VMA. (Or it extends the heap VMA with new pages
by way of sbrk() or whatever.) The kernel must ensure (for, among other
reasons, security) that when read, all such pages act as if they had
been initialized with 0. calloc() knows this and can omit the explicit
stores, which saves CPU cycles and page faults (and even initial page
assignments to the VMA or so).

I hope this is inaccurate / wrong enough that others still feel like
kicking it around (supposing I'm the first biting the bait, which is
unlikely).

Cheers,
lacos
 
K

Keith Thompson

C90 6.3.3.4 The sizeof operator

----v----
[...]

The result is an integer constant.

[...]
----^----

In my interpretation all these permit size_t to be "short unsigned".

Oh no! After looking up C90 6.1.3.2 "Integer constants", I realize
nothing with a lower conversion rank than that of "int" can be size_t.
And thus size_t is not affected by the integer promotions.

Sorry :)

I think you've been led astray by some incorrect wording.

C99 6.5.3.4p2 has the same problem; it says:

If the type of the operand is a variable length array type, the
operand is evaluated; otherwise, the operand is not evaluated
and the result is an integer constant.

But the phrase "integer constant" cannot refer to the
syntactic category "integer-constant" defined in C99 6.4.4.1.
An "integer-constant" is a decimal, octal, or hexadecimal constant,
optionally followed by a suffix such as "UL"; it's a single token.
Even "-1" isn't an integer constant; it's a "-" token followed by
an integer constant. Clearly anything that includes the "sizeof"
keyword cannot be an "integer-constant".

What it probably should say is that "sizeof ..." is an "integer
constant expression" as defined in C99 6.6p6. The point is that the
result is constant (evaluated at compile time). The tricky part
is saying that without misusing the word "constant" as defined in
C99 6.4.4.
 
B

Barry Schwarz

HI all,

I want to create a 2D dynamic array and then initialize all elements
with 0.

I did a search and found:

1. array = (** unsigned char) calloc(NROW, sizeof(unsigned char*));

I wonder why you felt the need to cast this call to calloc and not the
one in 4 below.
2. for(i = 0; i < NROW ; i++)
3. {
4. array = calloc(NCOL, sizeof(unsigned char));


calloc can be used to initialize elements of various integer types to
zero but it is not a portable method for non-integer types such as
float, double, and pointer.
5. }


I am wondering if I can replace calloc in line 1 by malloc function
and still get a 2D array with all array[j]=0.


You would have to update the argument also.
By the way, I am wondering if malloc will be faster than calloc?

In my understanding, calloc is equivalent to malloc plus zero-
initialization. Therefore calloc may be slower than malloc. Is it
right?

Yes, calloc does more work than malloc. (It may also have to compute
the total space from the two arguments.) Whether it takes more time
is partly a quality of implementation issue and partly system
specific. Whether it takes more time on your system tells you almost
nothing about what will happen if you change systems or
implementations.
 
E

Ersek, Laszlo

C90 6.3.3.4 The sizeof operator

----v----
[...]

The result is an integer constant.

[...]
----^----

In my interpretation all these permit size_t to be "short unsigned".

Oh no! After looking up C90 6.1.3.2 "Integer constants", I realize
nothing with a lower conversion rank than that of "int" can be size_t.
And thus size_t is not affected by the integer promotions.

Sorry :)

I think you've been led astray by some incorrect wording.

[lotsa insight making me thankful]

What it probably should say is that "sizeof ..." is an "integer
constant expression" as defined in C99 6.6p6.

Yes! I was hoping somebody would say "integer constant expression"! So
it wasn't in vain to type up all that reasoning. From the location you
designate,

"Cast operators in an integer constant expression shall only convert
arithmetic types to integer types [...]"

so "sizeof(struct dummy)" can perfectly well return "((short
unsigned)16)".

Thanks!
lacos
 
B

Ben Bacarisse

C90 7.8.1.2 The va-arg macro

Synopsys

type va_arg(va_list ap, type);

Description

The *va_arg* macro expands to an expression that has the type and value
of the next argument in the call. The parameter *ap* shall be the same
as the *va_list ap* initialized by *va_start*. Each invocation of
*va_arg* modifies *ap* so that the values of successive arguments are
returned in turn. The parameter /type/ is a type name specified such
that the type of a pointer to an object that has the specified type can
be obtained simply by postfixing a * to /type/. If there is no actual
next argument, or if /type/ is not compatible with the type of the
actual next argument (as promoted according to the default argument
promotions), the behavior is undefined.

C99 has the same words with a couple of extra permissions, neither of
which applies in the case in question.

OK, thanks. I get it. I won't loose a lot of sleep over systems where
size_t promotes, but I am now better informed.

<snip>
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,013
Latest member
KatriceSwa

Latest Threads

Top