problem with memcpy and pointers/arrays confusion - again

Martin Jørgensen · Mar 9, 2006

Hi,

I'm relatively new with C-programming and even though I've read about
pointers and arrays many times, it's a topic that is a little confusing
to me - at least at this moment:

----
1) What's the difference between these 3 statements:

(i) memcpy(&b, &KoefD, n); // this works somewhere in my code

(ii) memcpy(&b[0], &KoefD[0], n); // but this doesn't

(iii) memcpy(b, KoefD, n); // what's the meaning of this then?

N.B: b is defined such that:

double *b = (double *) malloc(sizeof(double) * (n+1));

and

KoefD is an argument to a sub-function that is called from within
main(). Its prototype (in main-function) is something like:

void sub_function(int n, const double *KoefD);

I understand that this "&" gives the physical address of the pointer
itself... I would assume that was the same thing as &b[0] (the first
element) - but it doesn't seem like it is....

2) Why does this work:

free(indx); // free 1D array

But this doesn't?

free(b);

---------

indx is:
unsigned long *indx = (unsigned long *) malloc( (n+1)*sizeof(unsigned
long) );

b is:
double *b = (double *) malloc(sizeof(double) * (n+1));

I hope I don't have to cut down my program in small pieces to show that
these questions are actually some problems I struggle with now and it
might be I misunderstood something, but any comments would be greatly
appreciated...

If I have to, I'll cut my program down so you can just copy/paste, but
it would be much easier if I could figure out the error myself.

Best regards / Med venlig hilsen
Martin Jørgensen

Michael Mair · Mar 9, 2006

Martin said:
Hi,

I'm relatively new with C-programming and even though I've read about
pointers and arrays many times, it's a topic that is a little confusing
to me - at least at this moment:

Please provide the types before the question -- otherwise,
it is hard to answer.

double *b;
const double *KoefD;
int n;

(i) memcpy(&b, &KoefD, n); // this works somewhere in my code

Copies n bytes starting at the address of the pointer KoefD
to the address of b. If n == sizeof KoefD (i.e. the size of
the pointer), then this is equivalent to b = KoefD.
If n > sizeof KoefD, you may run off copying from or to
storage which does not belong to your programme -- if you
are lucky, you get a segfault/access violation or similar.
If n < sizeof KoefD, you may end up with a trap
representation or an address not belonging to your programme
stored in b.
Note: If KoefD were of type array N of double with
N >= n/sizeof (double) and (n%sizeof (double)) == 0 and
b were of type array M of double with M >= n/sizeof (double),
then this would work.

(ii) memcpy(&b[0], &KoefD[0], n); // but this doesn't

(iii) memcpy(b, KoefD, n); // what's the meaning of this then?

These two are equivalent and should work as long as n is the
size of the storage pointed to by KoefD and
0 == (n%sizeof (double)).
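
To make the difference between the three forms concrete, here is a minimal sketch. The helper name compare_pointer_forms is made up for illustration and is not from Martin's code. For a pointer b, the expressions b and &b[0] yield the same address, while &b is the address of the pointer variable itself:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only.
   Returns 1 if b and &b[0] agree while &b differs from both. */
int compare_pointer_forms(double *b)
{
    const void *block = b;      /* where b points */
    const void *first = &b[0];  /* address of the first element */
    const void *var   = &b;     /* address of the pointer variable b itself */

    assert(b != NULL);
    return block == first && var != block;
}
```

So memcpy(&b, ...) writes over the pointer variable, not over the storage it points to.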

N.B: b is defined such that:

double *b = (double *) malloc(sizeof(double) * (n+1));

1) Please do not cast the return value of malloc().
This has been explained to you at least once. Use malloc()
like this:
double *b = malloc((n+1) * sizeof *b);
2) Note that you are memcpy()ing only part of the allocated
storage (you forget the n+1st element) and that if
sizeof(double) != 1, you probably do not achieve what you
want.
memcpy(b, KoefD, (n+1) * sizeof KoefD[0])
should be about right for you.
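
As a sketch of the same idea wrapped in a function (copy_doubles is a hypothetical name, and I am assuming the caller passes the element count, not the byte count):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'ed copy of count doubles from src,
   or NULL if the allocation fails. The caller must free() the result. */
double *copy_doubles(const double *src, size_t count)
{
    double *dst;

    assert(src != NULL);
    dst = malloc(count * sizeof *dst);
    if (dst != NULL)
        memcpy(dst, src, count * sizeof *dst);  /* size in bytes, not elements */
    return dst;
}
```

With your names, the call would be something like b = copy_doubles(KoefD, n + 1); followed by a check for NULL.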

and

KoefD is an argument to a sub-function that is called from within
main(). Its prototype (in main-function) is something like:

void sub_function(int n, const double *KoefD);

I understand that this "&" gives the physical address of the pointer
itself... I would assume that was the same thing as &b[0] (the first
element) - but it doesn't seem like it is....

No. (i) gives only the same address as (ii) and (iii) if KoefD and
b are arrays -- the "type" of the address is still different, though.

2) Why does this work:

free(indx); // free 1D array

But this doesn't?

free(b);

---------

indx is:
unsigned long *indx = (unsigned long *) malloc( (n+1)*sizeof(unsigned
long) );

b is:
double *b = (double *) malloc(sizeof(double) * (n+1));

Same about malloc().
If you need the cast, then you either forgot to #include <stdlib.h>
or you are compiling in C++ mode -- both can lead to subtle bugs.

Another thing: Allocating "size + 1" usually only is necessary for
character arrays intended to hold strings of length <= size.
If you need it for your code (e.g. because you prefer 1-based arrays),
you may be in for giving yourself a heavy headache as soon as you
get to arrays of arrays -- you are practically inviting off by one
type of errors.

If free(pointer) does not work, then you usually have either
- changed pointer between pointer = malloc(....) and free(pointer); or
- have corrupted allocated memory handling by writing over allocated
storage bounds or free()ing the same pointer value more than once.

To debug, printf("%p\n", (void *)b); directly after allocating b and
directly before free(b); to make sure it is not the first error.
Then start checking your accesses of allocated storage. If it makes
sense, set free()ed pointers to NULL directly after the free() call.
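
A minimal sketch of both suggestions combined, assuming a made-up helper named free_and_clear (not a standard function). Because free(NULL) is defined to do nothing, clearing the pointer turns an accidental second free of the same variable into a harmless no-op:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: prints the pointer value, frees *pp and sets it
   to NULL so that a later free of the same variable does nothing. */
void free_and_clear(double **pp)
{
    assert(pp != NULL);
    printf("freeing %p\n", (void *)*pp);  /* compare with the value printed after malloc */
    free(*pp);                            /* free(NULL) is a no-op */
    *pp = NULL;
}
```

Call it as free_and_clear(&b); instead of free(b);. It does not help if another copy of the old pointer value is still around somewhere.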

I hope I don't have to cut down my program in small pieces to show that
these questions are actually some problems I struggle with now and it
might be I misunderstood something, but any comments would be greatly
appreciated...

If I have to, I'll cut my program down so you can just copy/paste, but
it would be much easier if I could figure out the error myself.

In this case, a probable error cause offers itself but I may be
completely off -- so, yes, go for the minimal compiling example.

Cheers
Michael

Martin Jørgensen · Mar 10, 2006

Michael said:
Martin Jørgensen wrote: -snip-

Please provide the types before the question -- otherwise,
it is hard to answer.

Ok, sorry for that. As you know, I'm relatively new in this group so
nobody ever told me that before I think, but I'll remember that from now on.

-snip-

1) Please do not cast the return value of malloc().
This has been explained to you at least once. Use malloc()
like this:
double *b = malloc((n+1) * sizeof *b);

Sorry, but I forgot. AFAIR only one person told me that.
And he didn't explain the problem, so I'm citing now: (Robin Haigh) -
"You've got irrelevant complications here because of the change of
subscript base from 1 to 0. You can drop the casts".

So I'm sorry to ask but what was wrong with doing this cast?

I can tell you why it's in my code: It's because I have some source code
from a guy who is a lot more experienced than me in C-programming. I can
change it however (and will do so, because I'm sure you're right even
though I don't know the explanation)...

2) Note that you are memcpy()ing only part of the allocated
storage (you forget the n+1st element) and that if
sizeof(double) != 1, you probably do not achieve what you
want.
memcpy(b, KoefD, (n+1) * sizeof KoefD[0])
should be about right for you.

D'oh! Of course I forgot to multiply the size with sizeof KoefD[0]...
Stupid beginner's mistake, I guess.

KoefD is an argument to a sub-function that is called from within
main(). Its prototype (in main-function) is something like:

void sub_function(int n, const double *KoefD);

I understand that this "&" gives the physical address of the pointer
itself... I would assume that was the same thing as &b[0] (the first
element) - but it doesn't seem like it is....

No. (i) gives only the same address as (ii) and (iii) if KoefD and
b are arrays -- the "type" of the address is still different, though.

They are arrays...

Same about malloc().
If you need the cast, then you either forgot to #include <stdlib.h>
or you are compiling in C++ mode -- both can lead to subtle bugs.

I think the error was that I've experimented with memcpy(&b, &KoefD,
....) which of course is wrong... The error disappeared by itself now
after I changed it to memcpy(b, KoefD, ...)

Another thing: Allocating "size + 1" usually only is necessary for
character arrays intended to hold strings of length <= size.
If you need it for your code (e.g. because you prefer 1-based arrays),
you may be in for giving yourself a heavy headache as soon as you
get to arrays of arrays -- you are practically inviting off by one
type of errors.

Yeah, I've been struggling with arrays of arrays that start at offset 1
instead of offset 0. But I think I can manage it now (although you're
right that it is a little complicated)... I just remember to add
row*cols*sizeof(type) to the memory location pointed to.

If free(pointer) does not work, then you usually have either
- changed pointer between pointer = malloc(....) and free(pointer); or
- have corrupted allocated memory handling by writing over allocated
storage bounds or free()ing the same pointer value more than once.

Think it must have been corrupt memory due to memcpy to the wrong
location...

To debug, printf("%p\n", (void *)b); directly after allocating b and
directly before free(b); to make sure it is not the first error.

Good tip - I'll remember that. Might be necessary another time.

Then start checking your accesses of allocated storage. If it makes
sense, set free()ed pointers to NULL directly after the free() call.

What effect will that have?

In this case, a probable error cause offers itself but I may be
completely off -- so, yes, go for the minimal compiling example.

You were completely right. As told: The memcpy(&, &, n) was not what I
needed but instead I needed memcpy(blabla, blabla, n*sizeof(something))
so that was basically causing me some trouble since I forgot to multiply
n with the sizeof(double) which is 8 bytes.

Another thing:

In order to save space in my source code, wouldn't it be clever/possible
for me to do something like (untested - using suggestions from a
previous thread):

- - - - - - - - - - - - - - - - - - - -
#include <stdio.h>
#include <stdlib.h>

int main()
{
unsigned long int total_mem_used = 0; // counter
int N = 20; // whatever number of elements

int *int_array = int_allocate_mem((N+1)*sizeof(int), __FILE__,
__LINE__, &total_mem);
printf("int_array takes up: %i bytes in memory", *int_array[N]);

printf("Total memory occupation is: %li bytes", *total_mem);

free_mem(int_array);

exit(0);
}

//(return type void??? shouldn't it be int*?)

(return type) int_allocate_mem(size_t size, char *file, int line,
*total_mem)
{
int *int_ptr = malloc(size);
if (int_ptr == NULL)
{
fprintf(stderr, "%s: line %d, malloc(%ld) failed.\n", file,
line, size);
exit(EXIT_FAILURE);
}

int_ptr[N] = size; // save memory allocated for this pointer
total_mem += size; // update total memory allocated until now

return int_ptr;
}
- - - - - - - - - - - - - - - - - - - -

I would also like to have a double_allocate_mem((N+1)*sizeof(double),
__FILE__, __LINE__, &total_mem);

I just have to remember to reserve space for one extra element, at
location pointer[N] (could also be in the beginning of the array, but I
think it's easier to put it in the end).

That way, I can more easily debug and keep track of memory occupation
throughout the whole program at all times.. That would *really* be nice
I think... Suggestions/comments? I hope...

Best regards / Med venlig hilsen
Martin Jørgensen

Richard Heathfield · Mar 10, 2006

Martin Jørgensen said:

So I'm sorry to ask but what was wrong with doing this cast?

I can tell you why it's in my code: It's because I have some source code
from a guy who is a lot more experienced than me in C-programming. I can
change it however (and will do so, because I'm sure you're right even
though I don't know the explanation)...

Here's an explanation for you:

----------------------------------------------------------------------

Casting

Implicit conversions

In the C programming language, there are two kinds of conversions between
expressions of different types. The first is supplied by the compiler
itself. This is known as an implicit conversion. Here's an example of its
use:

long int foo = 314159265L;
double bar = foo;

This kind of conversion is, of course, very natural. In fact, an example of
such a conversion occurs as early as page 12 of K&R2 (where the conversion
is from float to double). Such conversions are woven into the very fabric
of the language, and are so common and natural that we often fail to notice
them.

Explicit conversions (casts)

There is, however, another way to convert an expression from one type into
another; this second method is known as an explicit conversion, or cast.
Here's an example of a cast:

IsLowerCase = islower((unsigned char)*p);

This cast is a good one, by the way. There are circumstances in which
explicit conversions, or casts, are a convenience which allows us to write
better, cleaner code than would otherwise be the case. And yet casts are
poorly understood.

For some reason, casting is very popular among C programmers. It's as if a
cast is a magic wand, that can magically transform one type into another,
irrespective of semantics, logic, or common sense. In fact, there is very
little (if any) sense in casting an expression for which a perfectly
adequate implicit conversion already exists. There are very few exceptions
to this rule of thumb. For example, consider this code:

#include <math.h>
long GetHypotenuse(long height, long base)
{
return sqrt(height * height + base * base);
}

Some compilers will complain about this code, arguing that sqrt returns a
double, and that assigning such a value to a long can result in a loss of
information. Well, that's true. But it's also true that, if such a loss of
information is intended, then there's no problem (because we're only
interested in the integer part of the result); and if such a loss of
information is not intended, then we really ought to pay attention to the
compiler's complaint.

Casts as diagnostic suppressors

We can often suppress the warning by using a cast:

#include <math.h>
long GetHypotenuse(long height, long base)
{
return (long)sqrt(height * height + base * base);
}

The cast, in effect, tells the compiler: "I know what I'm doing! So shut up
already!" Many C programmers do this (even those who only think that they
know what they're doing). But is the cast justified? Well, in this case,
there is a (perhaps spurious) justification, not in technical terms but
purely in terms of getting a clean compilation. Alas, this excuse is used
to justify a great many unnecessary casts. Yes, it's true that a nice clean
compilation is good to see. But if we achieve that state only by gagging
the compiler, without understanding the possible implications of such
gagging, then we are running the risk of suppressing important and useful
diagnostic information. In fact, we are merely indulging, and maybe even
deceiving, ourselves. That first example is innocuous enough, but precisely
the same logic (getting a clean compilation) can lure us into adding casts
that actually hide problems, instead of fixing them.

Casts as bug-hiders

The canonical example of this counter-productive diagnostic suppression is
the malloc function. As I'm sure you know, malloc is prototyped in
<stdlib.h> as void *malloc(size_t); -- that is, malloc is declared to be a
function taking a size_t as a parameter and returning a void pointer (i.e.
a pointer to an object whose type is not known).

Many years ago, before C was standardised by ANSI, malloc returned char *,
rather than void *; this was the most rational choice at the time, because
the void type didn't actually exist then (except, perhaps, as an extension
on some compilers). No implicit conversions between various pointer types
were supplied, so it was necessary to cast the return value of malloc into
the pointer type that you required:

#include <stdlib.h>

T *foo(int x)
{
T *new = (T *)malloc(sizeof *new); /* ancient history */
if(new)
{
new->zog = x;
}
return new;
}

But this ceased to be true in 1989 -- a good 15 years ago as I write this.
ANSI C introduced the void * pointer type, and gave it the very useful
property of being able to represent (without loss of information) any
object pointer whatsoever.

Consequently, it is no longer necessary to cast the return value of malloc.
But is it wise?

Clearly, if your code must be portable to compilers that pre-date the ANSI C
Standard of 1989, then you have no choice but to cast. Fair enough. But
this is true only in a vanishingly small number of cases. By far the
majority of C code written today does not need to cater for prehistoric
compilers. So we can, for the most part, ignore that reason for adding the
cast, and look for other advantages and disadvantages.

All code should either do something good, or stop something bad from
happening. Now, what good does a malloc cast do? One argument that is
occasionally raised in defence of the cast is that "the cast indicates the
type to which the return value is being assigned, so it makes the code more
self-documenting". But if that is true, then why do we not use casts more
often? Consider this example program from K&R2, page 12:

#include <stdio.h>

/* print Fahrenheit-Celsius table
for fahr = 0, 20, ..., 300; floating-point version */
main()
{
float fahr, celsius;
int lower, upper, step;

lower = 0; /* lower limit of temperature table */
upper = 300; /* upper limit */
step = 20; /* step size */

fahr = lower;
while(fahr <= upper) {
celsius = (5.0/9.0) * (fahr-32.0);
printf("%3.0f %6.1f\n", fahr, celsius);
fahr = fahr + step;
}
}

Now let's add those "self-documenting" casts:

#include <stdio.h>

/* print Fahrenheit-Celsius table
for fahr = 0, 20, ..., 300; floating-point version */
main()
{
float fahr, celsius;
int lower, upper, step;

lower = (int)0; /* lower limit of temperature table */
upper = (int)300; /* upper limit */
step = (int)20; /* step size */

fahr = (float)lower;
while((float)fahr <= (int)upper) {
celsius = (((float)
((float)5.0/(float)9.0)) *
(float)((float)
(((float)fahr-(float)32.0))));

(int)((int (*)(const char *, ...))
printf((const char *)"%3.0f %6.1f\n",
(float)fahr, (float)celsius));

fahr = (float)((float)fahr + (float)step);
}
}

Believe it or not, those casts are all "correct". And yes, the code works
just fine. But suddenly the code isn't quite as easy to read, is it? So
much for self-documentation.

Well, all right -- what about C++? In C++, it is necessary to cast void *
into an object pointer type, because the implementation is forbidden from
providing an implicit conversion.

Yes, that's absolutely true, but it's also utterly irrelevant. C and C++ are
very different languages! They are divided by a common syntax. Nobody in
their right mind would dream of saying "I always wrap printf in a function
named writeln, to maintain compatibility with Pascal", would they? But
because C and C++ have superficial similarities at the syntax level, some
people seem to think it's necessary to write C code that compiles with a
C++ compiler. Well, it isn't.

If you are using a C++ compiler, then whether you like it or not, you're
writing C++ code, not C code. The rules are different. If you wish to write
C++ code that casts malloc (instead of using the perfectly serviceable new
allocator, or the STL's std::vector template), then that's entirely up to
you; good luck to you, and I wish you all joy in your use of C++. This
discussion is not directed at C++ users (except, perhaps, to remind them
that they are not writing in C, even if they think they are).

If you wish to use C code in a C++ project, that's easy to do, without
casting malloc. Use a C compiler to compile the C code (duh!), and then use
a linker to link the C code to the C++ code. C++ supplies the extern "C"
construct for precisely this purpose.

So far, we have found no good reasons for casting. But are there any good
reasons why we should not cast? Yes, there are.

Firstly, as I said earlier, all code should either do something good or stop
something bad happening. Casting malloc does neither, so the cast is dead
code, and dead code has no place in a C program.

Secondly, casting malloc can actually hide a serious bug. Let me say quickly
that the cast doesn't cause the bug. But if the bug is there, the cast can
conceal its presence.

The bug in question is that of failing to provide a valid function prototype
for malloc. The function returns a void *, of course, but the C compiler
doesn't know that, unless you tell it. The best way to tell it is to
#include <stdlib.h> which provides a prototype which the compiler uses to
do type-checking and which it can exploit for code generation purposes.

Let's consider a couple of ways in which things can go wrong. They both
hinge on the wording of section 3.3.2.2 of the ANSI C Standard of 1989,
which is as follows:

If the expression that precedes the parenthesized argument list
in a function call consists solely of an identifier, and if no
declaration is visible for this identifier, the identifier is
implicitly declared exactly as if, in the innermost block
containing the function call, the declaration

extern int identifier();

appeared.

The following footnote applies to the above text:

30. That is, a function with external linkage and no information
about its parameters that returns an int. If in fact it is
not defined as having type "function returning int ," the
behavior is undefined.

Whilst footnotes are not normative text, they are useful in helping us to
understand the intent of the committee, and sensible implementors will
generally observe them, so it makes sense for us to take what they say very
seriously.

Or, if you're not convinced by that, consider 3.1.6.2(2) of the Standard,
which says: "All declarations that refer to the same object or function
shall have compatible type; otherwise the behavior is undefined."

Consider the situation from the point of view of the compiler writer. As
part of his implementation, he provides a standard C library. As part of
his C library, he implements the malloc function. He almost certainly uses
his own C compiler to do this. The C compiler generates object code for
malloc, and this object code is placed into the standard C library, which
he then releases. Of course, this object code is based firmly on a malloc
that returns void *.

When you write a C program that calls malloc, the C implementation doesn't
have to compile malloc, because it is already compiled. All it has to do is
link the standard library (or at least, the parts of it that you actually
use) to your program. So nothing has changed, as far as the library is
concerned. The malloc function returns void *, and that's that.

In your program, let's just hypothesise for a moment that you forgot to
#include <stdlib.h>, so you have no prototype for malloc. The C compiler,
on encountering your malloc call, will therefore follow the wording of
3.3.2.2 of the ANSI C Standard, and presume that malloc returns int. It
will therefore generate code that assumes the return value of malloc is an
int. But you don't assign that value to an object of type int; rather, you
assign it to a pointer!

Under normal circumstances, this would require the compiler to issue a
diagnostic. That's because the code would violate a constraint (see
3.3.16.1), and the compiler must issue a diagnostic if the program contains
any constraint violations. But the cast forces the code to satisfy, rather
than violate, the constraint. Consequently, no diagnostic is required.

The wording of the diagnostic is sometimes rather unfortunate. Consider the
wording of the gcc diagnostic for this situation:

initialization makes pointer from integer without a cast

The wording is actually correct, because (by 3.3.2.2) gcc is right to assume
that an undeclared function returns int, and it's right in thinking that
you are trying to stick this int value into a pointer, but it's a mite
misleading, because it leads you to think that the correct fix is to add a
cast!

With the cast in place, you may not get a diagnostic at all. So what will
happen?

Well, of course, it might just work swimmingly well despite the lack of a
prototype. But we can't know that. And even if it does, we have no
guarantee that the same code will also work correctly if we were to switch
to a different compiler.

What sorts of things can go wrong? I offer you two (but by no means the only
two) possibilities. Firstly, what if sizeof(void *) > sizeof(int)? This is
not just a theoretical possibility. It is certainly true for, say, typical
MS-DOS programs using a large memory model. Here's how it would break:
malloc returns a void *, but the compiler is required to turn this value
into an int. There aren't enough bits in the int to store the whole value,
so some information is lost. The int is then coerced back into a pointer,
but the lost information cannot now be retrieved, and if those lost bits
actually affect the value (say, they weren't just a bunch of 0s), then the
effect is that the pointer object receives an incorrect value -- that is,
instead of pointing to the allocated memory block, it points somewhere
completely different instead!

What else could go wrong? Well, consider an implementation which has
separate registers for pointers and integers. On such an implementation,
the library code for malloc will, of course, store the return value, a
pointer, in a pointer register. The compiler, however, doesn't know this
(because we didn't have a prototype for malloc), so it will actually
collect its return value from an integer register. This is a bit like going
to the wrong Post Office when collecting a parcel, seeing a parcel that
looks about the right size, and grabbing it on the assumption that it's the
right one. And yet it can't be the right one, because you're looking in the
wrong place.

Again, the outcome is that your pointer object does not get the correct
value.

As a consequence, we must conclude that to cast malloc is dangerous, and
that a competent C programmer simply should not do it.

Incidentally, precisely the same argument applies to any function that
returns a pointer; functions such as bsearch, strcpy, memmove, other
standard library functions returning pointers, and of course any of your
own custom functions that return pointers.

Under what circumstances is casting correct?

Very few. Casting is almost always wrong, and the places in which it is
correct are rarely the ones you would guess.

One situation in which casting is a good idea is when you are calling any of
the functions prototyped in <ctype.h>. These functions take an int as
input, but the value stored in that int must either be EOF or a value that
can be represented in an unsigned char. Assuming that you're not daft
enough to pass EOF to such a function, then, it makes sense to cast the
value you are passing, unless you have some excellent reason for knowing
that it's bound to be in the appropriate range. So, for example, you could
reasonably call toupper in this way:

ch = toupper((unsigned char)ch);

What you don't have to worry about is casting to int (in this case).
Let the ordinary C promotion mechanism handle that for you.
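
As a small sketch of that idiom applied to a whole string (upcase is a hypothetical helper name, not a standard function):

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: upper-cases a string in place. */
void upcase(char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        /* The cast keeps a negative char value from reaching toupper;
           the conversion of the int result back to char is implicit. */
        *s = toupper((unsigned char)*s);
    }
}
```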

When you are passing a value to the "tail" of a variadic function, you must
get the type just right, because no conversion to a declared parameter type
is done (only the default argument promotions apply), which is in turn
because the compiler has no type information to work with. If
the variadic function takes a T *, and you have a void * which you happen
to know points to an object of type T, that's fine, but you must cast the
pointer, to yield an expression of type T *. Conversely, if the function
expects a void *, you should cast to void * unless the pointer you have is
already of that type. Thus, when you call printf with a %p format
specifier, your matching pointer either should be a void * already, or
should be cast to one:

printf("Pointer value: %p\n", (void *)MyTPointer);

For the same basic reason, you should cast a size_t when printing it. I
generally use unsigned long for this purpose:

printf("Size: %lu\n", (unsigned long)sizeof MyTObject);
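
Here is a minimal sketch of the size_t case as a function. The name format_size is made up, and the caller is assumed to supply a buffer large enough for the digits. (C99 added the %zu conversion for size_t, which makes the cast unnecessary on compilers that support it.)

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes size into buf as decimal text using the
   C89-compatible cast to unsigned long. Returns the number of characters
   written, not counting the terminating null character. */
int format_size(char *buf, size_t size)
{
    assert(buf != NULL);
    return sprintf(buf, "%lu", (unsigned long)size);
}
```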

Summary

One of the characteristics of an expert C programmer is that he or she knows
in what circumstances a cast is required and in what circumstances it is at
best redundant and at worst a source of problems. Most programmers,
however, are guilty of "cargo cult" programming where casts are concerned.
Don't do that. Be an expert. Know why you are casting, whenever you cast,
and remember when maintaining your own or other people's code that almost
all casts in existing code should not actually be there.

----------------------------------------------------------------------

The full text of this article can be found on my Web site, at:

<http://www.cpax.org.uk/prg/writings/casting.php>

Martin Jørgensen · Mar 11, 2006

Richard Heathfield wrote:
-snip-

Don't do that. Be an expert. Know why you are casting, whenever you cast,
and remember when maintaining your own or other people's code that almost
all casts in existing code should not actually be there.

Ok, thanks a lot - then it all makes sense. I see that I should avoid
these unnecessary castings because they suppress compiler warnings...

Isn't it possible to allocate memory somewhat like this (untested still):

- - - - - - - - - - - - - - - - - - - -
#include <stdio.h>
#include <stdlib.h>

int main()
{
unsigned long int total_mem_used = 0; // counter
int N = 20; // whatever number of elements

int *int_array = allocate_mem((N+1)*sizeof(int), __FILE__,
__LINE__, &total_mem);
printf("int_array takes up: %i bytes in memory", *int_array[N]);

double *double_array = allocate_mem((N+1)*sizeof(double), __FILE__,
__LINE__, &total_mem);
printf("double_array takes up: %i bytes in memory", *double_array[N]);

printf("Total memory occupation is: %li bytes", *total_mem);

free(int_array);
free(double_array);

exit(0);
}

void *allocate_mem(size_t size, char *file, int line, *total_mem)
{
int *void_ptr = malloc(size);
if (void_ptr == NULL)
{
fprintf(stderr, "%s: line %d, malloc(%ld) failed.\n", file,
line, size);
exit(EXIT_FAILURE);
}

void_ptr[N] = size; // save memory allocated for this pointer
total_mem += size; // update total memory allocated until now

return void_ptr;
}

- - - - - - - - - - - - - - - - - - - -

I changed a little from my post yesterday because now I just made the
return pointer of allocate_mem() void * so I hope it can handle double *
as well as int * pointers at the same time... Not sure, though...

I'm wondering if there's anything I have mistakenly overlooked and
whether or not it is necessary to make a free_mem() function or if it is
ever possible for free() to fail somehow?

Best regards / Med venlig hilsen
Martin Jørgensen

Michael Mair · Mar 11, 2006

Martin said:
Richard Heathfield wrote:
-snip-

Ok, thanks a lot - then it all makes sense. I see that I should avoid
these unnecessary castings because they suppress compiler warnings...
Good.

Isn't it possible to allocate memory somewhat like this (untested still):

As you are proceeding, certain tools become more useful; consider
using a lint tool, e.g. splint or, if you are ready to pay for it,
PCLint. As your code cannot be compiled, I fixed that and repost
the compiling version; I will comment on this version.

- - - - - - - - - - - - - - - - - - - -
#include <stdio.h>
#include <stdlib.h>

void *allocate_mem (size_t size, char *filename,
int line, unsigned long *total_mem);

int main (void)
{
unsigned long int total_mem_used = 0; /* counter */
int N = 20; /* whatever number of elements */
double *double_array = NULL;

int *int_array = allocate_mem((N+1)*sizeof(int),
__FILE__, __LINE__,
&total_mem_used);

Explicitly controlling the number of allocated bytes
is not a good idea. In addition, you want to use this
for double *, too. Consider passing number of elements
and element size instead.
As you already _are_ using size_t, make total_mem_used
a size_t, too (and accordingly the total_mem parameter
of allocate_mem).

printf("int_array takes up: %i bytes in memory",
int_array[N]);

double_array = allocate_mem((N+1)*sizeof(double),
__FILE__, __LINE__,
&total_mem_used);
printf("double_array takes up: %i bytes in memory",
(int) double_array[N]);

The cast is necessary because otherwise, you would be
accessing the representation of the double at
&double_array[N] and interpreting it as an int.

printf("Total memory occupation is: %li bytes",
(long) total_mem_used);

The same situation. This is harmless on most modern
host machines, but try to keep conversion specifiers and
passed argument types consistent.

free(int_array);
free(double_array);

exit(0);
}

void *allocate_mem (size_t size, char *filename,
int line, unsigned long *total_mem)
{
int *void_ptr = malloc(size);

Calling a pointer to int a void_ptr is not exactly a good
idea. In addition, you want to be able to use the memory
allocation for arbitrary types.
Use a void *.

if (void_ptr == NULL)
{
fprintf(stderr, "%s: line %d, malloc(%ld) failed.\n",
filename, line, (long) size);
exit(EXIT_FAILURE);
}

void_ptr[size/sizeof *void_ptr - 1] = size;

Here, you originally had [N]. This is a bad idea at best.
Even if you passed the number of elements separately,
you would here access the N+1st element of an array of
int -- which might not be at all the location of the
Nth element of the type of array you are allocating memory
for.

/* save memory allocated for this pointer */
*total_mem += size; /* update total memory allocated until now */

return void_ptr;
}
- - - - - - - - - - - - - - - - - - - -

I changed a little from my post yesterday because now I just made the
return pointer of allocate_mem() void * so I hope it can handle double *
as well as int * pointers at the same time... Not sure although...

No, it cannot.
You _could_ try to salvage the idea with something like

void *alloc_mem (size_t num_elems, size_t elem_size,
char *filename, int line,
size_t *total_mem)
{
void *mem;
size_t size = num_elems*elem_size;
size += (sizeof (size_t) <= elem_size) ? elem_size
: sizeof (size_t);
mem = malloc(size);

if (!mem)
{
fprintf(stderr, "%s: line %d, malloc(%lu) failed.\n",
filename, line, (unsigned long) size);
exit(EXIT_FAILURE);
}

/* save memory allocated for this pointer */
memcpy(((char *)mem)+num_elems*elem_size,
&size, sizeof size);
*total_mem += size; /* update total memory allocated until now */

return mem;
}

size_t retrieve_memsize (void *mem, size_t num_elems,
size_t elem_size)
{
size_t size = 0;
if (mem) {
memcpy(&size, ((char *)mem)+num_elems*elem_size,
sizeof size);
}
return size;
}

void free_mem (void *mem, size_t num_elems,
size_t elem_size, size_t *total_mem)
{
if (mem) {
size_t size = retrieve_memsize(mem, num_elems, elem_size);
free(mem);
*total_mem -= size;
}
}

but this is ugly beyond belief and means you have to
keep track of the number of elements nonetheless.
In addition, you have the problem that off-by-one errors
on working with your allocated storage cannot be easily
detected by tools specialized in this because you have
one additional "element" (namely the storage for the size).
So, you could just foul up your allocated sizes but never
have an access violation or segfault to help you find the
mistake.

A better approach, if you need the size, is making the
information explicit:

struct attributed_mem {
size_t size;
void *mem;
};

This means you only have to pass and receive
struct attributed_mem * arguments.

allocated_mem() might have the following signature:

size_t allocated_mem(struct attributed_mem *memstruct,
size_t size,
size_t *total_mem);

returning the number of allocated bytes on success or 0
on failure, expecting the address of a struct attributed_mem
object and storing in it the size and address of the
allocated memory.
free_mem() then could look like this:
void free_mem (struct attributed_mem *memstruct,
size_t *total_mem)
{
if (memstruct) {
*total_mem -= memstruct->size;
memstruct->size = 0;
free(memstruct->mem);
}
else {
/* Your error handling here */
}
}

I'm wondering whether there's anything I have mistakenly overlooked, and
whether it is necessary to write a free_mem() function, or whether it is
ever possible for free() to fail somehow?

Yes, if you want to keep track of the total_mem, you need free_mem.
This way, you can check whether total_mem == 0 at the end of your
programme. Otherwise you have a problem.
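
To make the struct-based idea concrete, here is a minimal sketch of how allocated_mem() could be written to match the signature above, together with the free_mem() shown earlier. It is an illustration only, not tested code from this thread: treating a zero-byte request as a failure and resetting mem to NULL after free() are my own assumptions.

```c
#include <stdlib.h>
#include <stddef.h>

struct attributed_mem {
    size_t size;   /* number of bytes allocated */
    void  *mem;    /* start of the allocated block */
};

/* Returns the number of bytes allocated, or 0 on failure.
   Assumption: a request for 0 bytes is reported as a failure. */
size_t allocated_mem(struct attributed_mem *memstruct,
                     size_t size,
                     size_t *total_mem)
{
    if (memstruct == NULL || total_mem == NULL)
        return 0;

    memstruct->mem = NULL;
    memstruct->size = 0;
    if (size == 0)
        return 0;

    memstruct->mem = malloc(size);
    if (memstruct->mem == NULL)
        return 0;

    memstruct->size = size;
    *total_mem += size;      /* running total of live bytes */
    return size;
}

void free_mem(struct attributed_mem *memstruct, size_t *total_mem)
{
    if (memstruct) {
        *total_mem -= memstruct->size;
        memstruct->size = 0;
        free(memstruct->mem);
        memstruct->mem = NULL;   /* assumption: guard against reuse */
    }
}
```

A caller would allocate with allocated_mem(&m, n * sizeof *p, &total), work through m.mem, and release with free_mem(&m, &total). If total is not 0 just before the program ends, some block was never released.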

Note: Some time ago, I posted a link to a hopefully standard conforming
version of the "store the size in the allocated memory" idea and asked
for a peer review; as I did not get one and have not checked it again,
it may not work correctly; see
<[email protected]>
for more details but follow the discussion in the thread and, of course,
read the documentation

Cheers
Michael

Martin Jørgensen · Mar 12, 2006

Michael said:
Martin Jørgensen wrote: -snip-

As you are proceeding, certain tools become more useful; consider
using a lint tool, e.g. splint or, if you are ready to pay for it,
PCLint. As your code cannot be compiled, I fixed that and repost
the compiling version; I will comment on this version.

Ok, thanks. I might consider trying out splint at some moment...

Explicitly controlling the number of allocated bytes
is not a good idea. In addition, you want to use this
for double *, too. Consider passing number of elements
and element size instead.
Ok.

As you already _are_ using size_t, make total_mem_used
a size_t, too (and accordingly the total_mem parameter
of allocate_mem).

Ok. But it doesn't really change anything, does it? I mean, whether it's
size_t or unsigned int or whatever, how/why does it make any difference?

printf("int_array takes up: %i bytes in memory",
int_array[N]);

double_array = allocate_mem((N+1)*sizeof(double),
__FILE__, __LINE__,
&total_mem_used);
printf("double_array takes up: %i bytes in memory",
(int) double_array[N]);

The cast is necessary because otherwise, you would be
accessing the representation of the double at
&double_array[N] and interpreting it as an int.

I think I understand that. You probably meant: Otherwise the data at
double_array[N] would be double (instead of int), right?

The same situation. This is uncritical on most modern
host machines but try to keep conversion specifiers and
passed argument types consistent.

Ok. size_t I guess...

Calling a pointer to int a void_ptr is not exactly a good
idea. In addition, you want to be able to use the memory
allocation for arbitrary types.
Use a void *.

Oops, that was a mistake that came because I was considering how I
solved the problem with making a function that could both be used for
malloc'ing int * and double * data types (guess it also works for int
**, double ** types and perhaps also with different float-types although
I seldom use these). I started with two functions (int_ and double_
prefix) and forgot to remove this int *-thing from the code...

if (void_ptr == NULL)
{
fprintf(stderr, "%s: line %d, malloc(%ld) failed.\n",
filename, line, (long) size);
exit(EXIT_FAILURE);
}

void_ptr[size/sizeof *void_ptr - 1] = size;

Here, you originally had [N]. This is a bad idea at best.
Even if you passed the number of elements separately,
you would here access the N+1st element of an array of
int -- which might not be at all the location of the
Nth element of the type of array you are allocating memory
for.

Why not? I thought malloc'ing space for N+1 made memory available so I
could access element [N] like I normally could access anything between
[0 < (N+1)].

No, it cannot.
You _could_ try to salvage the idea with something like

void *alloc_mem (size_t num_elems, size_t elem_size,
char *filename, int line,
size_t *total_mem)
{
void *mem;
size_t size = num_elems*elem_size;
size += (sizeof (size_t) <= elem_size) ? elem_size
: sizeof (size_t);

Hmmm. A stupid question, but isn't there something wrong there? sizeof
(size_t) less than or equal to elem_size gives size += elem_size if true
and sizeof (size_t) else.

According to my C-book, size_t is just a data type and not a specific
number. My book says that size_t is an unsigned integer type and:
"Instead, like the portable types (int32_t and so on), it is defined in
terms of the standard types".

mem = malloc(size);

if (!mem)

Is that the same as testing for if (mem == NULL) ?

{
fprintf(stderr, "%s: line %d, malloc(%lu) failed.\n",
filename, line, (unsigned long) size);
exit(EXIT_FAILURE);
}

/* save memory allocated for this pointer */
memcpy(((char *)mem)+num_elems*elem_size,
&size, sizeof size);

Ok, on second thought, perhaps this is connected to the size += (sizeof
(size_t) <= elem_size) ? elem_size...-stuff, since you're copying the
address of &size to location ((char *)mem)+num_elems*elem_size ?

*total_mem += size; /* update total memory allocated until now */

return mem;
}

size_t retrieve_memsize (void *mem, size_t num_elems,
size_t elem_size)
{
size_t size = 0;
if (mem) {
memcpy(&size, ((char *)mem)+num_elems*elem_size,

And (char *)mem does cast the mem-pointer to pointer to char???

sizeof size);
}
return size;
}

void free_mem (void *mem, size_t num_elems,
size_t elem_size, size_t *total_mem)
{
if (mem) {
size_t size = retrieve_memsize(mem, num_elems, elem_size);
free(mem);
*total_mem -= size;
}
}

but this is ugly beyond belief and means you have to
keep track of the number of elements nonetheless.

Yeah, okay... At least I think I would learn something more about
pointers and understand them better

In addition, you have the problem that off-by-one errors
on working with your allocated storage cannot be easily
detected by tools specialized in this because you have
one additional "element" (namely the storage for the size).
Yep.

So, you could just foul up your allocated sizes but never
have an access violation or segfault to help you find the
mistake.

A better approach, if you need the size, is making the
information explicit:

struct attributed_mem {
size_t size;
void *mem;
};

That was a good idea.

This means you only have to pass and receive
struct attributed_mem * arguments.

Then I can also practice working with structures a bit

allocated_mem() might have the following signature:

size_t allocated_mem(struct attributed_mem *memstruct,
size_t size,
size_t *total_mem);

returning the number of allocated bytes on success or 0
on failure, expecting the address of a struct attributed_mem
object and storing in it the size and address of the
allocated memory.
free_mem() then could look like this:
void free_mem (struct attributed_mem *memstruct,
size_t *total_mem)
{
if (memstruct) {
*total_mem -= memstruct->size;
memstruct->size = 0;
free(memstruct->mem);
}
else {
/* Your error handling here */
}
}

Yes, if you want to keep track of the total_mem, you need free_mem.
This way, you can check whether total_mem == 0 at the end of your
programme. Otherwise you have a problem.

Yep... At least I would now have some indication about if everything
behaves as expected and I can debug step-by-step to investigate program
run and malloc'ing/free'ing memory blocks.

Note: Some time ago, I posted a link to a hopefully standard conforming
version of the "store the size in the allocated memory" idea and asked
for a peer review; as I did not get one and have not checked it again,
it may not work correctly; see
<[email protected]>
for more details but follow the discussion in the thread and, of course,
read the documentation

I just shortly read something in the thread but must get back to it
later today... I hope it's not too advanced

Best regards / Med venlig hilsen
Martin Jørgensen

Barry Schwarz · Mar 12, 2006

#include <stdio.h>
#include <stdlib.h>

int main()
{
unsigned long int total_mem_used = 0; // counter
int N = 20; // whatever number of elements

int *int_array = allocate_mem((N+1)*sizeof(int), __FILE__,
__LINE__, &total_mem);

There is no prototype in scope for allocate_mem. The compiler is
forced to assume it returns int. Error 1 is it doesn't so you invoke
undefined behavior. Error 2 is there is no implied conversion between
the assumed return type of int and the pointer on the left of the =
sign. Error 3 is you apparently chose to ignore the warning this
assignment generated. While you are free to ignore warnings (just as
compilers are free to produce meaningless ones), you really should
know why you are ignoring it before you decide to do so.
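
A minimal sketch of the fix for the first point: put a prototype in scope before the first call, so the compiler knows the function returns void * and can check the arguments. The function is reduced to its size parameter here, and make_ints() is a hypothetical caller added only for illustration.

```c
#include <stdlib.h>

/* Prototype in scope before any call: the compiler now knows the
   return type is void * and no longer has to assume int. */
void *allocate_mem(size_t size);

/* Hypothetical caller, for illustration only. */
int *make_ints(size_t count)
{
    /* void * converts to int * without a cast in C */
    return allocate_mem(count * sizeof(int));
}

void *allocate_mem(size_t size)
{
    return malloc(size);   /* caller must check for NULL */
}
```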

printf("int_array takes up: %i bytes in memory", *int_array[N]);

[] has higher precedence than *. The expression is parsed as
*(int_array[N]). int_array is a pointer to int. int_array[N] is the
N-th int after the first (which would be int_array[0]). It is not
legal to apply the dereference operator to an int. Why did you ignore
this diagnostic?
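
The precedence rule is easy to see with two small functions. This is only a sketch with made-up names: *ptrs[1] is legal when the elements are themselves pointers, and parentheses are needed when the dereference should happen first.

```c
/* [] binds tighter than unary *, so *ptrs[1] means *(ptrs[1]):
   take element 1 of the array, then dereference that pointer. */
int second_target(int *ptrs[])
{
    return *ptrs[1];
}

/* To dereference first and index afterwards, parentheses are needed:
   arr points to a whole array of 3 ints. */
int second_of_pointed(int (*arr)[3])
{
    return (*arr)[1];
}
```

With a plain int *int_array, the element is simply int_array[N]; adding a * in front tries to dereference an int, which is what the compiler complains about.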

double *double_array = allocate_mem((N+1)*sizeof(double), __FILE__,
__LINE__, &total_mem);
printf("double_array takes up: %i bytes in memory", *double_array[N]);

double_array is a pointer to double. Once you fix the conflict
between the [] and * operators as noted above, the second argument
will end up a double. But your %i tells printf to expect an int. Lie
to the compiler and invoke undefined behavior.

printf("Total memory occupation is: %li bytes", *total_mem);

free(int_array);
free(double_array);

exit(0);
}

void *allocate_mem(size_t size, char *file, int line, *total_mem)
{
int *void_ptr = malloc(size);
if (void_ptr == NULL)
{
fprintf(stderr, "%s: line %d, malloc(%ld) failed.\n", file,
line, size);
exit(EXIT_FAILURE);
}

void_ptr[N] = size; // save memory allocated for this pointer

On the second call, you allocate space for N+1 doubles. But since
void_ptr is an int*, on systems where doubles are 8 and ints are 4
bytes, you are not storing data in the last double. Furthermore,
storing an int bit pattern in a double could possibly generate a trap
representation. Furthermore, it would only initialize half the
double, the other half is still indeterminate. Back in main, when you
try to print the N-th double, you invoke undefined behavior by
evaluating an uninitialized variable.

total_mem += size; // update total memory allocated until now

return void_ptr;
}

Remove del for email

Michael Mair · Mar 12, 2006

Martin said:
Ok. But it doesn't really change anything, does it? I mean, whether it's
size_t or unsigned int or whatever, how/why does it make any difference?

Consistency. As soon as you start using unsigned types, you
should use them consistently; this is one of the reasons why
many people rather use int for sizes and indices. As soon
as you mix signed and unsigned, bad things can happen when
you compare them.
If you already _have_ doomed yourself to consistency, be
consistent. Even if you "know" that on your machine, size_t
effectively _is_ unsigned int, this may be different elsewhere.
If they are different, you may very well get into trouble with
implicit conversions or type sizes once more -- so, use one and
only one type for sizes and loop indices throughout your
programme.
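
As a small illustration of the "bad things" when signed and unsigned values are compared (a sketch with hypothetical function names, assuming the usual case where size_t is at least as wide as int):

```c
#include <stddef.h>

/* Compares an int with a size_t the naive way. The int is converted
   to size_t first, so a negative i becomes a huge positive value. */
int naive_less(int i, size_t n)
{
    return i < n;
}

/* Handles the negative case explicitly before comparing as unsigned. */
int safe_less(int i, size_t n)
{
    return i < 0 || (size_t) i < n;
}
```

With these, naive_less(-1, 10) yields 0 although -1 is mathematically less than 10, while safe_less(-1, 10) yields 1. Sticking to one type for sizes and indices avoids the question entirely.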

printf("int_array takes up: %i bytes in memory",
int_array[N]);

double_array = allocate_mem((N+1)*sizeof(double),
__FILE__, __LINE__,
&total_mem_used);
printf("double_array takes up: %i bytes in memory",
(int) double_array[N]);

The cast is necessary because otherwise, you would be
accessing the representation of the double at
&double_array[N] and interpreting it as an int.

I think I understand that. You probably meant: Otherwise the data at
double_array[N] would be double (instead of int), right?

No. You are passing something to a variable argument list
function. This function does not automatically convert arguments
to the right type of argument. The type of the argument is
communicated by _you_ (via %i). Imagine you have 64 bit doubles
and 32 bit ints. Then it is possible that printf() takes 32 bits
of the passed double and interprets the bit pattern as the bit
pattern of an int. This is not bad yet, but if you have other
arguments, then every other argument also may be "off" by 32 bits
-- your output is rubbish. If you are unlucky, it looks consistent.
The other way round is even worse: You passed an int and claimed it
was a double. With the sizes given above, you are accessing the int
and 32 arbitrary bits. This can lead to an access violation/segfault.
It may even lead to an invalid representation of double which kills
your programme. Maybe at the most inconvenient point.
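
A short sketch of keeping the conversion specifier and the argument type in step. sprintf() is used instead of printf() only so that the result can be inspected; the function names are hypothetical.

```c
#include <stdio.h>

/* %i expects an int, so the double is converted explicitly.
   buf must be large enough for the formatted text. */
int format_as_int(char *buf, double value)
{
    return sprintf(buf, "%i", (int) value);
}

/* %f expects a double, so the value is passed unchanged. */
int format_as_double(char *buf, double value)
{
    return sprintf(buf, "%.1f", value);
}
```

Either form is fine; what must not happen is "%i" paired with a double argument or "%f" paired with an int argument.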

Ok. size_t I guess...

Note: unsigned long may be easier to use. If you want to
printf() size_t, you always have to cast to unsigned long in
C89, as there is no conversion specifier for size_t (in C99,
you have 'z', i.e. %zu will do).
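
A sketch of both ways of printing a size_t, chosen at compile time. format_size() is a hypothetical helper, and the __STDC_VERSION__ test assumes that a compiler announcing C99 also ships a C99 library that understands %zu.

```c
#include <stdio.h>
#include <stddef.h>

/* Writes the decimal text of size into buf (which must be large enough). */
int format_size(char *buf, size_t size)
{
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
    return sprintf(buf, "%zu", size);                 /* C99 and later */
#else
    return sprintf(buf, "%lu", (unsigned long) size); /* C89: cast first */
#endif
}
```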

Oops, that was a mistake that came because I was considering how I
solved the problem with making a function that could both be used for
malloc'ing int * and double * data types (guess it also works for int
**, double ** types and perhaps also with different float-types although
I seldom use these). I started with two functions (int_ and double_
prefix) and forgot to remove this int *-thing from the code...

Happens often enough in real life, too.
It is often easier to write a function anew instead of trying to
adapt it.

if (void_ptr == NULL)
{
fprintf(stderr, "%s: line %d, malloc(%ld) failed.\n",
filename, line, (long) size);
exit(EXIT_FAILURE);
}

void_ptr[size/sizeof *void_ptr - 1] = size;

Here, you originally had [N]. This is a bad idea at best.
Even if you passed the number of elements separately,
you would here access the N+1st element of an array of
int -- which might not be at all the location of the
Nth element of the type of array you are allocating memory
for.

Why not? I thought malloc'ing space for N+1 made memory available so I
could access element [N] like I normally could access anything between
[0 < (N+1)].

Yes. Barry Schwarz already answered this one:
void_ptr[N] is effectively the same as
*((int *)(((char *)void_ptr) + N*sizeof *void_ptr))
This calculation goes wrong as soon as you are not working with
ints but with doubles. If sizeof (double) > sizeof (int) which is
very likely, then you are accessing someplace in the middle of
the array and store the size representation there. Even worse,
if you allocate for char with sizeof (char) < sizeof (int), you
are accessing a place way beyond your allocated memory.
Note: What I did above is nothing better -- it only is a replacement
for N.

This is the reason why you need the element size and number,
so you can calculate the correct address where you can store
the size.
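
The address arithmetic can be checked with plain numbers. The sketch below (all helper names are hypothetical) compares where an int-typed index n lands with where element n of the real array begins and where the allocation for n+1 elements ends.

```c
#include <stddef.h>

/* Byte offset of element n when each element is elem_size bytes. */
size_t byte_offset(size_t n, size_t elem_size)
{
    return n * elem_size;
}

/* 1 if writing through an int * at index n starts inside elements
   0..n-1 of an array whose elements are elem_size bytes each. */
int int_index_hits_payload(size_t n, size_t elem_size)
{
    return byte_offset(n, sizeof(int)) < byte_offset(n, elem_size);
}

/* 1 if that int write would end beyond an allocation of n+1 elements. */
int int_index_overruns(size_t n, size_t elem_size)
{
    return byte_offset(n, sizeof(int)) + sizeof(int)
           > byte_offset(n + 1, elem_size);
}
```

On a machine with 4-byte int and 8-byte double, int index 20 is byte 80, which is the start of double element 10, not element 20 at byte 160. For a char array of 21 bytes the same index points far past the end of the block.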

Hmmm. A stupid question, but isn't there something wrong there? sizeof
(size_t) less than or equal to elem_size gives size += elem_size if true
and sizeof (size_t) else.

According to my C-book, size_t is just a data type and not a specific
number. My book says that size_t is an unsigned integer type and:
"Instead, like the portable types (int32_t and so on), it is defined in
terms of the standard types".

Yes. I allocate enough memory to store a size_t in the "element"
beyond the last "official" element of the array.
Obviously, we need at least sizeof (size_t) bytes in case the element size
of the allocated storage is less than sizeof (size_t) -- otherwise
we would store our size outside the allocated storage.
To be on the safe side, I also allocated enough storage for at
least one more element, so that one cannot access storage that does
not belong to the allocated storage via the "last+1st" array element.
This is not strictly necessary but wastes only a few bytes.

Is that the same as testing for if (mem == NULL) ?

Yes, sorry for obscurity.

Ok, on second thought, perhaps this is connected to the size += (sizeof
(size_t) <= elem_size) ? elem_size...-stuff, since you're copying the
address of &size to location ((char *)mem)+num_elems*elem_size ?

Yes. It is perfectly possible that the element size is smaller than
the alignment required to store a size_t. The only safe way to store
and retrieve something not a char type[*] to or from an arbitrary
address is via bytewise copy. This is done via memcpy() or memmove().
This is the reason why we need at least sizeof size == sizeof(size_t)
extra bytes.
In order to "ward" against one's own stupidity (accessing array element
"N+1" even if you are not supposed to -- for this there is an access
function!) I made sure that we also have enough memory for a full
array element.

[*] i.e. signed char, unsigned char, or char
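
The bytewise-copy point in isolation, as a minimal sketch with hypothetical helper names: a size_t is written to and read back from an arbitrary byte offset, which may be misaligned for size_t, without ever forming a size_t * to that address.

```c
#include <string.h>
#include <stddef.h>

/* Stores value at base + offset, whatever the alignment of that address.
   The caller must guarantee sizeof(size_t) bytes are available there. */
void store_size(void *base, size_t offset, size_t value)
{
    memcpy((char *) base + offset, &value, sizeof value);
}

/* Reads a size_t back from base + offset the same way. */
size_t load_size(const void *base, size_t offset)
{
    size_t value;
    memcpy(&value, (const char *) base + offset, sizeof value);
    return value;
}
```

Writing *(size_t *)((char *) base + offset) = value instead would be undefined behaviour whenever the address is not suitably aligned for size_t.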

And (char *)mem does cast the mem-pointer to pointer to char???

Yes. You cannot perform pointer arithmetic on void *. As we really
have to calculate the byte number via the product as the element
sizes may vary, converting to char * is natural.
If we used another pointer type, then we could not necessarily access
each address and accessing the address becomes more difficult due
to pointer arithmetic.
You could also write
&(((char*)mem)[num_elems*elem_size])
where I inserted gratuitous parentheses for clarity.
In C99, you could cast mem to an appropriate variable length array
pointer type (char (*)[elem_size]) and get the address of its
num_elems element.
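
The three spellings side by side, as a sketch with hypothetical function names. The last one is compiled only where C99 variable length arrays are available, and it assumes elem_size is greater than 0.

```c
#include <stddef.h>

/* Address of the byte just past num_elems elements, by addition. */
char *end_by_addition(void *mem, size_t num_elems, size_t elem_size)
{
    return (char *) mem + num_elems * elem_size;
}

/* The same address, written with indexing and address-of. */
char *end_by_indexing(void *mem, size_t num_elems, size_t elem_size)
{
    return &((char *) mem)[num_elems * elem_size];
}

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L \
    && !defined(__STDC_NO_VLA__)
/* C99 only: view mem as an array of elem_size-byte rows.
   rows[num_elems] decays to a char * at the start of that row. */
char *end_by_vla(void *mem, size_t num_elems, size_t elem_size)
{
    char (*rows)[elem_size] = mem;
    return rows[num_elems];
}
#endif
```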

That was a good idea.

Then I can also practice working with structures a bit

Indeed

I just shortly read something in the thread but must get back to it
later today... I hope it's not too advanced

The same concepts as discussed here; I decided to "hide" the
size in the "negative" regions of the resulting array; rationale:
Most people write beyond the "end" of their allocated space.
This, of course has the side effect that free() will fail.
And as I stated in the original thread: It is a bad idea
nonetheless.
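
For readers who want to see the shape of the "negative region" idea, here is a minimal sketch. It is not the code from the linked post; all names are hypothetical, and the union members are only my assumption about what is needed to keep the user block suitably aligned (C89 offers no portable way to ask for maximum alignment).

```c
#include <stdlib.h>
#include <stddef.h>

/* Header placed in front of the user block. The extra members are an
   assumption meant to force alignment suitable for common types. */
union header {
    size_t      size;
    long        l;
    long double ld;
    void       *p;
};

/* Allocates n user bytes and returns a pointer just past the header. */
void *sized_alloc(size_t n)
{
    union header *h;
    if (n > (size_t) -1 - sizeof *h)   /* would overflow */
        return NULL;
    h = malloc(sizeof *h + n);
    if (h == NULL)
        return NULL;
    h->size = n;
    return h + 1;
}

/* Reads the size stored in front of a block from sized_alloc(). */
size_t sized_size(const void *p)
{
    return p ? ((const union header *) p)[-1].size : 0;
}

/* Steps back to the real start of the allocation before freeing. */
void sized_free(void *p)
{
    if (p)
        free((union header *) p - 1);
}
```

The pointer handed to the caller is not the one malloc() returned, so it must go to sized_free() and never directly to free().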
If you have questions, you are of course welcome to ask.

Cheers
Michael

Martin Jørgensen · Mar 12, 2006

Barry said:
There is no prototype in scope for allocate_mem. The compiler is
forced to assume it returns int. Error 1 is it doesn't so you invoke
undefined behavior. Error 2 is there is no implied conversion between
the assumed return type of int and the pointer on the left of the =
sign. Error 3 is you apparently chose to ignore the warning this
assignment generated. While you are free to ignore warnings (just as

Error 3: I can't ignore anything I didn't see yet.

compilers are free to produce meaningless ones), you really should
know why you are ignoring it before you decide to do so.

As I wrote, that code was/is untested. I didn't/I don't expect people to
copy/paste something that is "untested" into their compiler, just to
look at the code and provide some comments so I can implement it myself.

I can, however, see that Michael is doing a great job and helping me out
with good suggestions. That is actually more than I hoped for, so
of course I have no problem with that (on the contrary, I highly appreciate
the help I get here).

printf("int_array takes up: %i bytes in memory", *int_array[N]);

[] has higher precedence than *. The expression is parsed as
*(int_array[N]). int_array is a pointer to int. int_array[N] is the
Ok.

N-th int after the first (which would be int_array[0]). It is not
legal to apply the dereference operator to an int. Why did you ignore
this diagnostic?

I didn't ignore it because I haven't seen it. I've had a cold this
weekend and am still a bit ill...

double *double_array = allocate_mem((N+1)*sizeof(double), __FILE__,
__LINE__, &total_mem);
printf("double_array takes up: %i bytes in memory", *double_array[N]);

double_array is a pointer to double. Once you fix the conflict
between the [] and * operators as noted above, the second argument
will end up a double. But your %i tells printf to expect an int. Lie
to the compiler and invoke undefined behavior.

Yes. That was a mistake. I thought of saving the memory occupation as
integer (should actually be size_t, as Michael proposes). But that is
probably a bit more complicated when dealing with double *-arrays, as I
can see Michael Mair also discusses.

printf("Total memory occupation is: %li bytes", *total_mem);

free(int_array);
free(double_array);

exit(0);
}

void *allocate_mem(size_t size, char *file, int line, *total_mem)
{
int *void_ptr = malloc(size);
if (void_ptr == NULL)
{
fprintf(stderr, "%s: line %d, malloc(%ld) failed.\n", file,
line, size);
exit(EXIT_FAILURE);
}

void_ptr[N] = size; // save memory allocated for this pointer

On the second call, you allocate space for N+1 doubles. But since
void_ptr is an int*, on systems where doubles are 8 and ints are 4
bytes, you are not storing data in the last double. Furthermore,

Yeah, having void_ptr an int* is completely wrong if it should be
universal for all pointer types. Not sure I got that part about not
storing data in the last double, though...

As I see it (which is probably wrong) I would perhaps not be storing
data in the last int, but size_t size would still be written to the last
element in void_ptr???

storing an int bit pattern in a double could possibly generate a trap
representation. Furthermore, it would only initialize half the
double, the other half is still indeterminate. Back in main, when you

I understand that it would only initialize half of the double, but
that's not a big problem is it?

try to print the N-th double, you invoke undefined behavior by
evaluating an uninitialized variable.

Actually I think I'll start with the code Michael provided... This code
was an early "pre-alpha-version" and Michael came with some better
suggestions so it's clearly a better idea to start with them...

Best regards / Med venlig hilsen
Martin Jørgensen

Martin Jørgensen · Mar 12, 2006

Michael said:
Martin Jørgensen wrote:

-snip-

[Understood and agreed everything till now]

Note: unsigned long may be easier to use. If you want to
printf() size_t, you always have to cast to unsigned long in
C89, as there is no conversion specifier for size_t (in C99,
you have 'z', i.e. %zu will do).

Is size_t always unsigned long or can it on some system "just be"
unsigned int too? Because my book is a little unclear whether I have to
use %u or %lu on a system where %zu doesn't work?

I would prefer the unsigned long type, just to be sure it can handle
large memory numbers too...

Happens often enough in real life, too.
It is often easier to write a function anew instead of trying to
adapt it.

Yeah, probably if you're as experienced as you I guess

I still have to look at something (examples or old code), when it comes
to dealing with casts + pointers + malloc + free-stuff

if (void_ptr == NULL)
{
fprintf(stderr, "%s: line %d, malloc(%ld) failed.\n",
filename, line, (long) size);
exit(EXIT_FAILURE);
}

void_ptr[size/sizeof *void_ptr - 1] = size;

Here, you originally had [N]. This is a bad idea at best.
Even if you passed the number of elements separately,
you would here access the N+1st element of an array of
int -- which might not be at all the location of the
Nth element of the type of array you are allocating memory
for.

Why not? I thought malloc'ing space for N+1 made memory available so I
could access element [N] like I normally could access anything between
[0 < (N+1)].

Yes. Barry Schwarz already answered this one:
void_ptr[N] is effectively the same as
*( (int *) ( ( (char *) void_ptr) + N*sizeof *void_ptr) )

So one can write:

*( (int *) ( ( (char *) void_ptr) + N*sizeof *void_ptr) ) = size; ???

That's a long pointer address... But this explanation is nice to look at
I think, although it has a lot of confusing parentheses... I had to fill
in space, as you can see to understand it better...

This calculation goes wrong as soon as you are not working with
ints but with doubles. If sizeof (double) > sizeof (int) which is

I see that...

very likely, then you are accessing someplace in the middle of
the array and store the size representation there. Even worse,
if you allocate for char with sizeof (char) < sizeof (int), you
are accessing a place way beyond your allocated memory.

Let me make sure I understood this. Lets say N=3: 3 * sizeof (char) = 3
bytes. And sizeof(int) is 4 bytes, AFAIR, right?

So, looking at:

*( (int *) ( ( (char *) void_ptr) + N*sizeof *void_ptr) ) = size;

The problem would be that *( (int *) (.... + 3 bytes) ) would be cast
to.... Hmm... would that be 0 or 1 due to the (int *)? My guess is that
would be the first element of void_ptr because it's probably like
dividing the offset 3 bytes by sizeof(int) = 4 bytes, and throwing the
remainder away?

So the problem is that the 3 first bytes (first N=3 char elements) would
be overwritten? Did I understand it?

Note: What I did above is nothing better -- it only is a replacement
for N.

Hm.... I can see there is a problem... I'll have to look at your proposal.

This is the reason why you need the element size and number,
so you can calculate the correct address where you can store
the size.

Ok, I see.

Yes. I allocate enough memory to store a size_t in the "element"
beyond the last "official" element of the array.

Ok, having looked at the code for very long I think it looks okay...

Obviously, we need at least sizeof (size_t) bytes in case the element size
of the allocated storage is less than sizeof (size_t) -- otherwise
we would store our size outside the allocated storage.
Yep.

To be on the safe side, I also allocated enough storage for at
least one more element, so that one cannot access storage that does
not belong to the allocated storage via the "last+1st" array element.
This is not strictly necessary but wastes only a few bytes.

Where did you do that?

From my point of view it looks like the allocated space is exactly what
is needed... I probably overlooked that...

Yes, sorry for obscurity.

That's okay. Nice to learn something new.

Ok, on second thought, perhaps this is connected to the size +=
(sizeof (size_t) <= elem_size) ? elem_size...-stuff, since you're
copying the address of &size to location ((char
*)mem)+num_elems*elem_size ?

Yes. It is perfectly possible that the element size is smaller than
the alignment required to store a size_t. The only safe way to store
and retrieve something not a char type[*] to or from an arbitrary
address is via bytewise copy. This is done via memcpy() or memmove().
This is the reason why we need at least sizeof size == sizeof(size_t)
extra bytes.

I think I understand that now... That is really great coding... I would
never have figured that out myself, but it is actually very logical now
I'm reading your explanation...

And I feel I'm taking a giant leap towards learning "pointer
acrobatics", by watching that code

In order to "ward" against one's own stupidity (accessing array element
"N+1" even if you are not supposed to -- for this there is an access
function!) I made sure that we also have enough memory for a full
array element.

And here you're talking about the "retrieve_memsize"-function? That is
nice... Probably saved me from some trouble

[*] i.e. signed char, unsigned char, or char

And (char *)mem does cast the mem-pointer to pointer to char???

Yes. You cannot perform pointer arithmetic on void *. As we really
have to calculate the byte number via the product as the element
sizes may vary, converting to char * is natural.
If we used another pointer type, then we could not necessarily access
each address and accessing the address becomes more difficult due
to pointer arithmetic.

That is because you know that char is exactly 1 byte, right (is it that
way on all systems???) ?

That was exactly what I needed to understand this, I think...

You could also write
&(((char*)mem)[num_elems*elem_size])
where I inserted gratuitous parentheses for clarity.

So you're saying that instead of:

( (char *)mem ) + num_elems*elem_size

I could write:

&( ( (char*)mem ) [num_elems*elem_size] )

Then I can see you're converting mem-pointer to "pointer-to-char" and in
both cases you're adding what corresponds to element number [N]. I think
I understand that too now...

In C99, you could cast mem to an appropriate variable length array
pointer type (char (*)[elem_size]) and get the address of its
num_elems element.

I didn't understand this part about variable length arrays, but luckily
it's not that important, I guess. You would still need something with
[num_elems*elem_size], right?

I'll still have to get back to this later...
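
For what it's worth, here is a sketch of what the C99 variant might look like. The function name end_by_vla is made up; it assumes a compiler with variable length array support and elem_size greater than zero. Note that no explicit num_elems*elem_size product is written: the pointer arithmetic on the row type does the multiplication.

```c
#include <stddef.h>

/* C99 sketch: view mem as an array of rows, each elem_size bytes wide.
   Assumes elem_size > 0; a zero-sized VLA type is not allowed. */
char *end_by_vla(void *mem, size_t num_elems, size_t elem_size)
{
    char (*rows)[elem_size] = mem;      /* pointer to a variable length array */
    return (char *)(rows + num_elems);  /* same address as &rows[num_elems] */
}
```

The result is the same address as ((char *)mem) + num_elems*elem_size.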

But why not take this struct idea a step further, like this:

struct attributed_mem {
    void *mem;
    size_t num_elems;
    size_t elem_size;
    size_t size;
};

?

Then at the end of the week, when I can play with this again, I could
change the program so one of the prototypes would become:

size_t retrieve_memsize(struct attributed_mem attributes)
{
    /* code that gets the properties from the struct attributed_mem
       variable "attributes" and does whatever */
}
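
One way the struct idea could be fleshed out, as a rough sketch only. The helper attributed_alloc is hypothetical and not from the thread, passing the struct by pointer is a choice made here, and there is no overflow check on the multiplication. With the size kept in the struct, nothing has to be stashed past the end of the array.

```c
#include <stdlib.h>

struct attributed_mem {
    void  *mem;        /* the allocated block, or NULL on failure */
    size_t num_elems;
    size_t elem_size;
    size_t size;       /* total bytes requested from malloc */
};

/* Hypothetical helper: allocate and record the attributes in one place. */
struct attributed_mem attributed_alloc(size_t num_elems, size_t elem_size)
{
    struct attributed_mem a;
    a.num_elems = num_elems;
    a.elem_size = elem_size;
    a.size = num_elems * elem_size;   /* no overflow check in this sketch */
    a.mem = malloc(a.size);
    if (a.mem == NULL)
        a.size = 0;
    return a;
}

/* The size is simply read from the struct. */
size_t retrieve_memsize(const struct attributed_mem *attributes)
{
    return attributes->mem ? attributes->size : 0;
}
```

A caller would write something like struct attributed_mem a = attributed_alloc(20, sizeof(int)); and later free(a.mem);.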

-snip-

If you have questions, you are of course welcome to ask.

Thanks, I just have these comments in this post...

Best regards / Med venlig hilsen
Martin Jørgensen

Martin Joergensen · Mar 12, 2006

Martin Jørgensen wrote:

The code below seems to work. Actually, I never understood why I can't have a
printf("something here"); statement between two mallocs at the beginning of my
program. Why is that so? The compiler fails when it comes to malloc number
two...

My code - based on the suggestions from this thread (just copy/paste +
compile + run) - my compiler didn't accept %zd in printf-statements for
size_t-types:
- - - - - - -
#include <stdio.h>
#include <stdlib.h>
#include <string.h> // for memcpy

/* prototypes */
void *alloc_mem (size_t num_elems, size_t elem_size,
                 char *filename, int line,
                 size_t *total_mem);
size_t retrieve_memsize (void *mem, size_t num_elems,
                         size_t elem_size);
void free_mem (void *mem, size_t num_elems,
               size_t elem_size, size_t *total_mem);

////////////////////////////////////////////////////////

int main()
{
    size_t total_mem = 0; // counter
    int N = 20; // whatever number of elements

    int *int_array = (int *) alloc_mem( N, sizeof(int_array[0]), __FILE__,
                                        __LINE__, &total_mem);

    // Here I can't have a printf-statement, but I never understood why not?

    double *double_array = (double *) alloc_mem( N, sizeof(double_array[0]),
                                                 __FILE__, __LINE__, &total_mem);

    printf("int_array takes up: %lu bytes in memory\n",
           (unsigned long) retrieve_memsize(int_array, N, sizeof(int_array[0])));
    printf("double_array takes up: %lu bytes in memory\n",
           (unsigned long) retrieve_memsize(double_array, N, sizeof(double_array[0])));
    printf("\nTotal memory occupation is: %lu bytes\n", (unsigned long) total_mem);

    free_mem (int_array, N, sizeof(int_array[0]), &total_mem);
    printf("Total memory occupation is: %lu bytes\n", (unsigned long) total_mem);

    free_mem(double_array, N, sizeof(double_array[0]), &total_mem);
    printf("Total memory occupation is: %lu bytes\n", (unsigned long) total_mem);

    exit(0);
}

///////////////////////////////////////////////////////

void *alloc_mem (size_t num_elems, size_t elem_size,
                 char *filename, int line,
                 size_t *total_mem)
{
    void *mem;
    size_t size = num_elems*elem_size;

    /* room for the stored size, and at least one full extra element */
    size += (sizeof (size_t) <= elem_size) ? elem_size
                                           : sizeof (size_t);
    mem = malloc(size);
    if (!mem)
    {
        fprintf(stderr, "%s: line %d, malloc(%lu) failed.\n",
                filename, line, (unsigned long) size);
        exit(EXIT_FAILURE);
    }

    /* save memory allocated for this pointer */
    memcpy(((char *)mem)+num_elems*elem_size,
           &size, sizeof size);
    *total_mem += size; /* update total memory allocated until now */
    return mem;
}

size_t retrieve_memsize (void *mem, size_t num_elems,
                         size_t elem_size)
{
    size_t size = 0;

    if (mem) {
        memcpy(&size, ((char *)mem)+num_elems*elem_size,
               sizeof size);
    }
    return size;
}

void free_mem (void *mem, size_t num_elems,
               size_t elem_size, size_t *total_mem)
{
    if (mem) {
        size_t size = retrieve_memsize(mem, num_elems, elem_size);

        free(mem);
        *total_mem -= size;
    }
}
- - - - - -
- - - - - -

Best regards / Med venlig hilsen
Martin Jørgensen

Jordan Abel · Mar 12, 2006

Is size_t always unsigned long or can it on some system "just be"
unsigned int too? Because my book is a little unclear whether I have to
use %u or %lu on a system where %zu doesn't work?

I would prefer the unsigned long type, just to be sure it can handle
large memory numbers too...

cast to unsigned long and use %lu.

Keith Thompson · Mar 12, 2006

Martin Jørgensen said:
Is size_t always unsigned long or can it on some system "just be"
unsigned int too? Because my book is a little unclear whether I have
to use %u or %lu on a system where %zu doesn't work?

I would prefer the unsigned long type, just to be sure it can handle
large memory numbers too...

size_t can be either unsigned int or unsigned long; it's up to the
compiler to decide which one to use. (Theoretically, I suppose it can
be unsigned short.) In C99, it can be unsigned long long, though it's
been argued that it should never be longer than unsigned long.

If you want to print a size_t value, and you can't be sure the
implementation supports "%zu", you should use "%lu" *and* cast the
value to unsigned long:

size_t s = whatever;
printf("s = %lu\n", (unsigned long)s);

This will work regardless of how size_t is defined (unless it's bigger
than unsigned long *and* the value happens to exceed ULONG_MAX).

What you can't safely do is use either "%u" or "%lu" with an
expression of type size_t *unless* you convert it to unsigned int or
unsigned long, respectively.

[...]

Let me make sure I understood this. Lets say N=3: 3 * sizeof (char) =
3 bytes. And sizeof(int) is 4 bytes, AFAIR, right?

sizeof(int) varies from one platform to another. With 8-bit bytes,
I've seen systems with sizeof(int) equal to 2, 4, and 8; there are
probably other cases as well.

[...]

That is because you know that char is exactly 1 byte, right (is it
that way on all systems???) ?

Yes, that's C's definition of "byte". The number of bits in a byte is
specified by CHAR_BIT, which must be at least 8 (it's commonly exactly
8, but it can be larger).
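
A tiny sketch of how the two facts fit together: sizeof counts in units of char, and CHAR_BIT from <limits.h> gives the number of bits in each unit. The function name bits_in is made up for illustration.

```c
#include <limits.h>
#include <stddef.h>

/* Number of bits in an object that occupies nbytes bytes.
   sizeof(char) is 1 by definition, so a "byte" here is one char. */
size_t bits_in(size_t nbytes)
{
    return nbytes * CHAR_BIT;
}
```

For example, bits_in(sizeof(int)) gives the width of an int's storage on the platform at hand, whatever sizeof(int) happens to be there.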

Jordan Abel · Mar 12, 2006

size_t can be either unsigned int or unsigned long; it's up to the
compiler to decide which one to use. (Theoretically, I suppose it can
be unsigned short.) In C99, it can be unsigned long long, though it's
been argued that it should never be longer than unsigned long.

If you want to print a size_t value, and you can't be sure the
implementation supports "%zu", you should use "%lu" *and* cast the
value to unsigned long:

size_t s = whatever;
printf("s = %lu\n", (unsigned long)s);

This will work regardless of how size_t is defined (unless it's bigger
than unsigned long *and* the value happens to exceed ULONG_MAX).

/* SIZE_MAX is a C99 <stdint.h> macro; before C99 the implementation
   would have to supply an equivalent. sizeof cannot be used in #if. */
#if __STDC_VERSION__ >= 199901L
#define prntZ "z"
#elif SIZE_MAX <= UINT_MAX
#define prntZ ""
#elif SIZE_MAX == ULONG_MAX
#define prntZ "l"
#else
#error size_t too big
#endif

size_t s = whatever;
printf("s = %"prntZ"u", s);

What you can't safely do is use either "%u" or "%lu" with an
expression of type size_t *unless* you convert it to unsigned int or
unsigned long, respectively.

If it's required to be a standard unsigned integer type, and I believe
it is in c89, yes you can. see above.

[i think that a PRIuSIZE (and friends) from inttypes.h would have been a
better solution than putting in three extra format things, but that's
just me.]

Richard Heathfield · Mar 13, 2006

Keith Thompson said:

size_t can be either unsigned int or unsigned long; it's up to the
compiler to decide which one to use. (Theoretically, I suppose it can
be unsigned short.)

Or even unsigned char.

Keith Thompson · Mar 13, 2006

Richard Heathfield said:
Keith Thompson said:

Or even unsigned char.

But not _Bool, I think.

Martin Jørgensen · Mar 13, 2006

Jordan said:
cast to unsigned long and use %lu.

Ok, seems like you all agree on that.

Best regards / Med venlig hilsen
Martin Jørgensen

Michael Mair · Mar 15, 2006

Martin said:
The code below seems to work. Actually, I never understood why I can't have a
printf("something here"); statement between two mallocs at the beginning of my
program. Why is that so? The compiler fails when it comes to malloc number
two...

My code - based on the suggestions from this thread (just copy/paste +
compile + run) - my compiler didn't accept %zd in printf-statements for
size_t-types:
- - - - - - -

void *alloc_mem (size_t num_elems, size_t elem_size,

int *int_array = (int *) alloc_mem( N, sizeof(int_array[0]), __FILE__,
__LINE__, &total_mem);

Why are you casting? void* can be converted to int* implicitly.
Same argument as for malloc() itself.

// Here I can't have a printf-statement, but I never understood why not?

double *double_array = (double *) alloc_mem( N, sizeof(double_array[0]),
__FILE__, __LINE__, &total_mem);

Because you cannot mix declarations and statements in C89.

Use
int *int_array = NULL;
double *double_array = NULL;

int_array = ....
if (int_array) printf("%p\n", (void *)int_array);
else puts("NULL");
double_array = ....
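
Spelled out as a compilable sketch, the C89 pattern could look like this. The function name allocate_both is invented, and plain malloc() stands in for alloc_mem() so the example is self-contained.

```c
#include <stdio.h>
#include <stdlib.h>

/* Returns 0 on success, 1 if either allocation failed. */
int allocate_both(size_t n)
{
    /* C89: every declaration in the block comes first ... */
    int *int_array = NULL;
    double *double_array = NULL;
    int failed;

    /* ... and statements only after the last declaration. */
    int_array = malloc(n * sizeof *int_array);
    if (int_array) printf("%p\n", (void *)int_array);
    else puts("NULL");
    double_array = malloc(n * sizeof *double_array);

    failed = (int_array == NULL || double_array == NULL);
    free(int_array);      /* free(NULL) is harmless */
    free(double_array);
    return failed;
}
```

The printf between the two allocations is now legal in C89 because both pointers were declared before the first statement.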

<snip>

BTW: Your code is horribly formatted. Please consider running it
through usenetify2.c (http://www.contrib.andrew.cmu.edu/~ajo/)
or similar; you really lose helpful comments there.
In addition, be careful about // comments -- they do not go well
with usenet.

Cheers
Michael

Martin Jørgensen · Mar 15, 2006

Michael said:
Martin Joergensen wrote:

void *alloc_mem (size_t num_elems, size_t elem_size,

int *int_array = (int *) alloc_mem( N, sizeof(int_array[0]), __FILE__,
__LINE__, &total_mem);

Why are you casting? void* can be converted to int* implicitly.
Same argument as for malloc() itself.

Hmmm. I don't know. Hopefully it's because I got a compiler warning and
then wanted to get rid of it... Can't exactly remember if there was any
reason at all...

// Here I can't have a printf-statement, but I never understood why not?

double *double_array = (double *) alloc_mem( N,
sizeof(double_array[0]), __FILE__, __LINE__, &total_mem);

Because you cannot mix declarations and statements in C89.

Oh.

Use
int *int_array = NULL;
double *double_array = NULL;

int_array = ....
if (int_array) printf("%p\n", (void *)int_array);
else puts("NULL");
double_array = ....

<snip>

BTW: Your code is horribly formatted. Please consider running it
through usenetify2.c (http://www.contrib.andrew.cmu.edu/~ajo/)
or similar; you really lose helpful comments there.
In addition, be careful about // comments -- they do not go well
with usenet.

Ok, agreed. I think I just made MS Visual Studio 2005 use spaces instead
of tabs now, and looking at the code I posted earlier today shows that as
long as I copy/paste the code to Notepad and then copy/paste it here
again, I won't get any problems.

Best regards / Med venlig hilsen
Martin Jørgensen
