Question regarding safety of a malloc()

S

Stephan Beal

Hello group!

i'm currently working on code where i find myself repeatedly having to
allocating two objects to store some private data, e.g.:

whio_stream * whio_stream_for_dev( struct whio_dev * dev, bool
takeOwnership )
{
if( ! dev ) return 0;
whio_stream * str = (whio_stream*)malloc( sizeof(whio_stream) );
if( ! str ) return 0;
*str = whio_stream_dev_init;
whio_stream_dev * sd = (whio_stream_dev*) (str + sizeof
(whio_stream));
if( ! sd )
{
free(str);
return 0;
}
str->implData = sd;
sd->dev = dev;
sd->ownsDev = takeOwnership;
return str;
}

The significant bit there is the two mallocs. Today i figured i'd try
to consolidate that, and turned the above code into:


whio_stream * whio_stream_for_dev( struct whio_dev * dev, bool
takeOwnership )
{
if( ! dev ) return 0;
whio_stream * str = (whio_stream*)malloc( sizeof(whio_stream) +
sizeof(whio_stream_dev) );
if( ! str ) return 0;
*str = whio_stream_dev_init;
whio_stream_dev * sd = (whio_stream_dev*) (str + sizeof
(whio_stream));
str->implData = sd;
sd->dev = dev;
sd->ownsDev = takeOwnership;
return str;
}

and i changed my cleanup routine to not free the sd object (since free
(str) will now free that). That "seems to work", but now to my
question:

Is that second approach safe or am i just asking for alignment-related
(or platform-specific) problems?

If it is NOT safe, can you recommend a portable workaround which will
achieve the goal of using only one malloc instead of two?

Many thanks in advance for your insights!
 
T

Tomás Ó hÉilidhe

    whio_stream * str = (whio_stream*)malloc( sizeof(whio_stream) );



Better not to have the cast:

whio_stream *const str = malloc( sizeof(*str) );


    if( ! str ) return 0;
    *str = whio_stream_dev_init;



I'll presume that "whio_stream_dev_init is a global object you use for
initialising objects of the type whio_stream_dev.


    whio_stream_dev * sd = (whio_stream_dev*) (str + sizeof
(whio_stream));


Here I presume you mean:

whio_stream_dev *sd = (whio_stream_dev*)(str + 1);

Pointer arithmetic takes into account the size of the type (if you
have a pointer to an 8-byte double, then adding 1 to the pointer adds
on 8 bytes).

str points to a region of memory that is big enough for one object of
type "whio_stream". Here, you get the address of the first byte after
the allocated region. It's OK to take the address of this byte, but
don't dereference that pointer.


    if( ! sd )


This will always be true*.

    {
        free(str);
        return 0;
    }
    str->implData = sd;



The next two statements write to memory that you don't own.


    sd->dev = dev;
    sd->ownsDev = takeOwnership;
    return str;

}

The significant bit there is the two mallocs.



I only see one malloc call.


Today i figured i'd try
to consolidate that, and turned the above code into:
    whio_stream * str = (whio_stream*)malloc( sizeof(whio_stream) +
sizeof(whio_stream_dev) );


I'd write:

whio_stream *str = malloc(sizeof(*str) + sizeof(whio_stream_dev));


    if( ! str ) return 0;
    *str = whio_stream_dev_init;
    whio_stream_dev * sd = (whio_stream_dev*) (str + sizeof
(whio_stream));


Again here you want:

whio_stream_dev *sd = (whio_stream_dev*)(str + 1);

You have to be sure though that you're meeting the "alignment
requirements" of your target platform.

    str->implData = sd;
    sd->dev = dev;
    sd->ownsDev = takeOwnership;
    return str;

}

and i changed my cleanup routine to not free the sd object (since free
(str) will now free that). That "seems to work", but now to my
question:

Is that second approach safe or am i just asking for alignment-related
(or platform-specific) problems?

If it is NOT safe, can you recommend a portable workaround which will
achieve the goal of using only one malloc instead of two?


The following will definitely work:

struct Couple {
whio_stream a;
whio_stream_dev b;
};

struct Couple *const p = malloc(sizeof *p);

The C compiler you use will automagically insert padding between A and
B to meet the alignment requirements.
 
B

Barry Schwarz

Hello group!

i'm currently working on code where i find myself repeatedly having to
allocating two objects to store some private data, e.g.:

whio_stream * whio_stream_for_dev( struct whio_dev * dev, bool
takeOwnership )

It certianly would help if we knew what a whio_stream was.
{
if( ! dev ) return 0;

While technically correct, it makes your program easier to read if you
use NULL in this context.
whio_stream * str = (whio_stream*)malloc( sizeof(whio_stream) );

Casting malloc like this serves only to hide (not eliminate) undefined
behavior. If you include stdlib.h, it is pointless. If you don't,
the behavior is undefined and if you remove the cast the compiler will
issue a diagnostic to remind you.
if( ! str ) return 0;
*str = whio_stream_dev_init;

Is this defined in some header we don't know about.
whio_stream_dev * sd = (whio_stream_dev*) (str + sizeof
(whio_stream));

Unless you have C99, declarations must precede executable statements.

Unless the alignment for a whio_stream is at least as restrictive as a
whio_stream_dev, this could invoke undefined behavior.

sd now points exactly 1 byte past the end of the allocated area.
if( ! sd )

How do you think this can ever evaluate to true?
{
free(str);
return 0;
}
str->implData = sd;
sd->dev = dev;

And this invokes undefined behavior because you don't own the memory
sd points to.
sd->ownsDev = takeOwnership;
return str;
}

The significant bit there is the two mallocs. Today i figured i'd try

I give up. Where is the second call to malloc?
to consolidate that, and turned the above code into:


whio_stream * whio_stream_for_dev( struct whio_dev * dev, bool
takeOwnership )
{
if( ! dev ) return 0;
whio_stream * str = (whio_stream*)malloc( sizeof(whio_stream) +
sizeof(whio_stream_dev) );
if( ! str ) return 0;
*str = whio_stream_dev_init;
whio_stream_dev * sd = (whio_stream_dev*) (str + sizeof
(whio_stream));

You now own the memory sd points to but the alignment problem remains.
str->implData = sd;
sd->dev = dev;
sd->ownsDev = takeOwnership;
return str;
}

and i changed my cleanup routine to not free the sd object (since free

Since your first code doesn't allocate sd, you better not pass it to
free().
(str) will now free that). That "seems to work", but now to my
question:

Is that second approach safe or am i just asking for alignment-related
(or platform-specific) problems?

No - Yes.
If it is NOT safe, can you recommend a portable workaround which will
achieve the goal of using only one malloc instead of two?

Create a new struct that consists of a whio_stream member and a
whio_stream_dev member (the order doesn't matter but assume this order
to make the code a little simpler). Allocate space for the new
structure and store the address in str (since the first member always
starts at the beginning of the structure). You can then compute the
address of the whio_stream_dev with
sd =
(whio_stream_dev*)&((new_struct*)str)->whio_stream_dev_member;
 
S

Stephan Beal

     If the cast is needed, it means you're doing something wrong.
(Likely candidates: Forgetting to include <stdlib.h>, or running
C code through a C++ compiler.)

The cast is an old habit. It makes me queezy to rely on implicit casts
from (void*).

That arithmetic was abviously bogus.
     Sorry; I can't seem to count that high.

That was my fault - i mis-pasted the first copy and ended up pasting
the second copy twice.

     You are not just asking for alignment problems, you are begging
for them.

That was my suspicion.
        struct { whio_stream str; whio_stream_dev sd; } *both;
        both = malloc(sizeof *both);

An excellent idea. Thank you very much.
     Insight #1: Don't try to out-clever the compiler, especially if
the compiler knows the language better than you do.  You will wind
up lying to the compiler -- and do you know what happens then?  "If
you lie to the compiler, it will get its revenge."

That's why i asked - i wasn't sure if i was lying to the compiler or
not.
     Insight #2: People who post half-baked approximations to their
code risk getting what they deserve: Half-baked answers that address
the problems with the half-baked approximations and have little to
do with the problems they actually care about.  Where's my ruler?
Ah, here it is.  Extend your wrist, please.

My apologies for the confusion caused by the mis-paste. i think the
intent would have been clearer had i not made that mistake.

Thank you very much for the suggestion - that's exactly what i was
looking for (but it was too obvious for me to see, apparently).
 
K

Keith Thompson

Stephan Beal said:
The cast is an old habit. It makes me queezy to rely on implicit casts
from (void*).
[...]

Then you shouldn't have a problem, since there's no such thing as an
"implicit cast". :cool:}

A cast is an operator that specifies a conversion; it cannot be
implicit. Some conversions are specified by casts, others are
implicit. What you meant (or should have meant) is that relying on
implicit *conversions* makes you queasy. Yes, I'm being picky, but
correct terminology can be important.

Having said that, I understand your queasiness. The problem with
casts (explicit conversions) is that they're too powerful. A cast
converts an expression to the type that you specify, and doesn't allow
the compiler to check whether you've specified the correct type.
You'll find that in the vast majority of cases, the implicit
conversions provided by the language are just what you want; adding a
cast overrides what the compiler would have done for you, and often
the result of the cast will then be subject to an implicit conversion
anyway.

The recommended idiom for calling malloc is:
ptr = malloc(sizeof *ptr);
or
ptr = malloc(COUNT * sizeof *ptr);
Note that you didn't have to specify the type; this form guarantees
that the correct type will be used. It also triggers a diagnostic if
you've forgotten the "#include <stdlib.h>".

See also the comp.lang.c FAQ, <http://www.c-faq.com/>, particularly
questions 7.6, 7.7, 7.7b, 7.7c. See also the rest of section 7,
section 6, and, well, the whole thing.
 
F

Flash Gordon

Malcolm McLean wrote, On 25/12/08 08:53:
You mean the idiom some people recommend. I don't, on the grounds that
the dereference is hard to do mentally,

For you, maybe. Most people don't seem to have a problem with it.
and calls to malloc often
contain the * operator as a multiply, making the whole thing difficult
to read.

Not really.

After a couple of readings it all becomes second nature to most people.
However the cast is unnecesssary and adds visual clutter, so I'm happy
to get rid of it. The stdlib inclusion check is a bonus.

It isn't bogus since by default a significant number of compilers won't
generate a warning otherwise.
 
H

Harald van Dijk

Malcolm McLean wrote, On 25/12/08 08:53:

It isn't bogus since by default a significant number of compilers won't
generate a warning otherwise.

Bonus, not bogus.
 
A

August Karlstrom

Malcolm said:
You mean the idiom some people recommend. I don't, on the grounds that
the dereference is hard to do mentally, and calls to malloc often
contain the * operator as a multiply, making the whole thing difficult
to read.

If you add parentheses around the dereferenced pointer there will be no
multiplication/dereference confusion:

ptr = malloc(COUNT * sizeof (*ptr));


August
 
T

Tomás Ó hÉilidhe

You mean the idiom some people recommend. I don't, on the grounds that the
dereference is hard to do mentally, and calls to malloc often contain the *
operator as a multiply, making the whole thing difficult to read.



If you don't want to confuse a unary operator with a binary operator,
then it's as simple as using parentheses:

malloc(5 * sizeof(*p));

You'll find that the vast majority of C programmers, but proficient
and deficient, use parentheses with sizeof all the time. Most C
programmers nowadays in 2008 are surprised to learn that you can use
sizeof without parentheses (when it's used on expressions as opposed
to types).

As regards the dereference being difficult to comprehend, well I
myself don't find it difficult, nor do many people here, but if you
think so then that's your choice.

I think it's really handy to mention the type in one place, e.g.:

MyFancyType *p;

And then to use stuff like:

malloc(sizeof(*p));

It makes less legwork for when you change the type (not to mention
removes the possibility of introducing bugs).

However the cast is unnecesssary and adds visual clutter, so I'm happy to
get rid of it. The stdlib inclusion check is a bonus.


The cast is redundant as far as I'm concerned. Not only redundant, but
it makes the code less flexible because you have to change the cast if
you change the type. Also you have the stdlib inclusion benefit (but
this should really be a non-issue because everybody should be setting
their compiler options to outlaw implicit declarations).
 
K

Keith Thompson

Malcolm McLean said:
You mean the idiom some people recommend. I don't, on the grounds that
the dereference is hard to do mentally, and calls to malloc often
contain the * operator as a multiply, making the whole thing difficult
to read.
However the cast is unnecesssary and adds visual clutter, so I'm happy
to get rid of it. The stdlib inclusion check is a bonus.

If you prefer to write
ptr = malloc(sizeof (*ptr));
I won't object.

Personally, I prefer *not* to use parentheses for "sizeof
<expression>" (unless they're otherwise necessary to control
precedence). I like to remind myself that "sizeof" is a unary
operator, not a function name, so I write sizeof expressions in a form
that doesn't look like a function call. (The confusion comes from the
fact that "sizeof" is the only operator in the language whose name is
a keyword rather than a symbol. If the symbol were $, it probably
wouldn't occur to most programmers to use parentheses.) It's the same
rationale as the one for not using extra parentheses on return
statements.

Even in the case where the operand is a parenthesized function name, I
like to visually distinguish it from a function call:
func(argument)
sizeof (type_name)
 
S

s0suk3

Having said that, I understand your queasiness. The problem with
casts (explicit conversions) is that they're too powerful. A cast
converts an expression to the type that you specify, and doesn't allow
the compiler to check whether you've specified the correct type.

Exactly, which is why I'd call it "unsafe", not "powerful". Consider
this example (explained in http://www.research.att.com/~bs/bs_faq2.html#void-ptr):

#include<stdio.h>

int main()
{
char i = 0;
char j = 0;
char* p = &i;
void* q = p;
int* pp = q; /* unsafe, legal C, not C++ */

printf("%d %d\n",i,j);
*pp = -1; /* overwrite memory starting at &i */
printf("%d %d\n",i,j);
}

(The conversion from the pointer returned by malloc, of course, is
indeed safe -- provided that the type referenced by the pointer on the
left side of the assignment is at least as wide as the block of memory
allocated by malloc.)
The recommended idiom for calling malloc is:
ptr = malloc(sizeof *ptr);
or
ptr = malloc(COUNT * sizeof *ptr);
Note that you didn't have to specify the type; this form guarantees
that the correct type will be used.

I prefer the sizeof(type-name) form, for the same reasons that Malcolm
explains. In addition: If the pointer's declaration is too far above
the call of malloc (as is sometimes the case in C89, where
declarations must precede statements), the call of malloc provides no
clue about the type of the object you're allocating, and you have to
scroll up to tell.

Sebastian
 
K

Keith Thompson

I prefer the sizeof(type-name) form, for the same reasons that Malcolm
explains.

I think Malcolm's objection is (mostly) addressed by adding extraneous
parentheses around the expression:

ptr = malloc(sizeof (*ptr));

The other issue is that *ptr appears to dereference the pointer, which
is invalid if ptr doesn't point to anything -- except of course that
the operand of sizeof isn't evaluated. I admit that understanding how
this works is moderately complicated -- but once you've figured it
out, it's a very useful idiom.
In addition: If the pointer's declaration is too far above
the call of malloc (as is sometimes the case in C89, where
declarations must precede statements), the call of malloc provides no
clue about the type of the object you're allocating, and you have to
scroll up to tell.

And why *should* the malloc call tell you what type of object you're
allocating? Most expressions don't specify types; if they did, every
expression would be burdened with huge amounts of type information.
The biggest advantage of the "sizeof *ptr" idiom is that you don't
have to know the type.
 
C

c prog fan

Eric said:
You are not just asking for alignment problems, you are begging
for them. You are standing at a bus station playing Christmas carols
on your flugelhorn, hoping the passengers will fill your horn case
with alignment problems. They will -- be sure to thank them -- and
then they'll board the bus while you take the SIGBUS.

[...]

I am afraid if he is looking through the WINDOWS of the nearby MICROshop
he may miss SIGBUS too. :)
 
E

Eric Sosman

Exactly, which is why I'd call it "unsafe", not "powerful". Consider
this example (explained in http://www.research.att.com/~bs/bs_faq2.html#void-ptr):

#include<stdio.h>

int main()
{
char i = 0;
char j = 0;
char* p = &i;
void* q = p;
int* pp = q; /* unsafe, legal C, not C++ */

Unsafe, *not* legal C (6.3.2.3p7, 6.5p7). Admittedly,
few compilers will protest (and they are not obliged to).
I prefer the sizeof(type-name) form, for the same reasons that Malcolm
explains.

I prefer not to take C advice from Malcolm McLean. He's
put a few chapters of his book on-line; go read them, ponder,
and make up your own mind about his qualifications.
In addition: If the pointer's declaration is too far above
the call of malloc (as is sometimes the case in C89, where
declarations must precede statements), the call of malloc provides no
clue about the type of the object you're allocating, and you have to
scroll up to tell.

Strange. The same issue -- physical distance between
declaration and use -- leads me to the opposite conclusion.
Here we go:

p = malloc(sizeof(struct mairzy_doats));
q = malloc(sizeof *q);

Which of these requires scrolling (assuming no complaints
from the compiler)?
 
E

Eric Sosman

Keith said:
Eric Sosman said:
Unsafe, *not* legal C (6.3.2.3p7, 6.5p7). Admittedly,
few compilers will protest (and they are not obliged to).
[...]

How is it not legal (especially given that the C standard doesn't
define the term "legal")?

As it happens, I considered raising the point about "legal"
but decided it would just be obfuscatory. If we want to get
pedantic about it, then yes: "legal" is not defined, and we
should avoid the term. To interpret the inexact language of my
post, for "legal" read "not generative of undefined behavior."
Other definitions are possible (for example, we might want to
avoid unspecified or implementation-defined behavior, too), but
"N.G.O.U.B." is the one I was thinking of when I composed my
earlier message.
6.3.2.3p7 says that the (implicit) conversion from void* to int*
invokes undefined behavior if the void* pointer is not properly
aligned for type int. But it's entirely possible that the value of q
(which is (void*)&i, where i is of type char) *is* properly aligned
for type int. (It's also entirely possible that it isn't.)

Yes, it invokes undefined behavior under circumstances that are not
under the control of the programmer, so Don't Do That. But I don't
think I'd call it illegal.

Call it "immoral," then, or "a nice knock-down argument."

6.3.2.3p7 does not say that the construct *is* illegal (see
above), but says it *may be* illegal. In my view and in Murphy's,
anything that can be illegal will be, and at the worst possible
time.

Also, 6.5p7 says the access *is* illegal, with no ifs, ands,
or buts. Unless, of course, U.B. has already occurred, in which
case the earlier construct was illegal.

Oh, by the way: You're under arrest. "I'm taking you in on
a five-oh-two."

"What's the charge?"

"Devouring maidens out of season."

"Out of season? You'll *never* pin that rap on me! Do you
*hear* me, cop?"

"Yeah, I hear you. I got you on a four-twelve too."

"A *four-twelve*? What's a four-twelve?"

"Overacting. Let's go."
 
K

Keith Thompson

Eric Sosman said:
Keith said:
Eric Sosman said:
(e-mail address removed) wrote:
Having said that, I understand your queasiness. The problem with
casts (explicit conversions) is that they're too powerful. A cast
converts an expression to the type that you specify, and doesn't allow
the compiler to check whether you've specified the correct type.
Exactly, which is why I'd call it "unsafe", not
"powerful". Consider this example (explained in
http://www.research.att.com/~bs/bs_faq2.html#void-ptr):
#include<stdio.h>
int main()
{
char i = 0;
char j = 0;
char* p = &i;
void* q = p;
int* pp = q; /* unsafe, legal C, not C++ */
Unsafe, *not* legal C (6.3.2.3p7, 6.5p7). Admittedly,
few compilers will protest (and they are not obliged to).
[...]
How is it not legal (especially given that the C standard doesn't
define the term "legal")?

As it happens, I considered raising the point about "legal"
but decided it would just be obfuscatory. If we want to get
pedantic about it, then yes: "legal" is not defined, and we
should avoid the term. To interpret the inexact language of my
post, for "legal" read "not generative of undefined behavior."
Other definitions are possible (for example, we might want to
avoid unspecified or implementation-defined behavior, too), but
"N.G.O.U.B." is the one I was thinking of when I composed my
earlier message.
6.3.2.3p7 says that the (implicit) conversion from void* to int*
invokes undefined behavior if the void* pointer is not properly
aligned for type int. But it's entirely possible that the value of q
(which is (void*)&i, where i is of type char) *is* properly aligned
for type int. (It's also entirely possible that it isn't.)
Yes, it invokes undefined behavior under circumstances that are not
under the control of the programmer, so Don't Do That. But I don't
think I'd call it illegal.

Call it "immoral," then, or "a nice knock-down argument."

6.3.2.3p7 does not say that the construct *is* illegal (see
above), but says it *may be* illegal. In my view and in Murphy's,
anything that can be illegal will be, and at the worst possible
time.

Agreed, but ...
Also, 6.5p7 says the access *is* illegal, with no ifs, ands,
or buts. Unless, of course, U.B. has already occurred, in which
case the earlier construct was illegal.

I don't think 6.5p7 applies to the portion of the code that you
quoted, though I think it does apply to later portions of the code.
Converting a (possibly misaligned) void* value to int* may invoke
undefined behavior, but 6.5p7 doesn't apply until you try to
dereference the resulting pointer. (And since 6.5p7 uses the word
"shall" outside a constraint, violating it invokes undefined behavior;
I shall briefly ignore the question of whether that makes it
"illegal".)
Oh, by the way: You're under arrest. "I'm taking you in on
a five-oh-two."

"What's the charge?"

"Devouring maidens out of season."

"Out of season? You'll *never* pin that rap on me! Do you
*hear* me, cop?"

"Yeah, I hear you. I got you on a four-twelve too."

"A *four-twelve*? What's a four-twelve?"

"Overacting. Let's go."

"But ocifer, if I'm *legally* drunk, why are you arresting me?"
 
J

James Kuyper

Eric said:
Keith Thompson wrote: ....

As it happens, I considered raising the point about "legal"
but decided it would just be obfuscatory. If we want to get
pedantic about it, then yes: "legal" is not defined, and we
should avoid the term. To interpret the inexact language of my
post, for "legal" read "not generative of undefined behavior."

I prefer "does not have behavior defined by the standard".
Call it "immoral," then, or "a nice knock-down argument."

6.3.2.3p7 does not say that the construct *is* illegal (see
above), but says it *may be* illegal. In my view and in Murphy's,
anything that can be illegal will be, and at the worst possible
time.

The problem with that formulation is that an implementation is permitted
to provide a definition for the behavior, and code that has a legitimate
reason for doing so could rely upon that definition (at a cost in
portability - but that might be a cost that can legitimately be worth
paying). That's rather unlikely in this particular case, but it's better
to keep the concept of "illegal" out of your thoughts with respect to C;
it's deliberately been constructed in a fashion that makes that a
concept a bad fit.
 
G

Guest

Not really. Hobby code and some minor utility type programs you might write
by yourself and for yourself, but most serious code has to be shared, and
several programmers work on it.

it's a common idiom, explain it to 'em. They have to deal with stuff
like

if ((file = fopen(filename, "r")) == NULL)
return BAD_FILE;
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,743
Messages
2,569,478
Members
44,899
Latest member
RodneyMcAu

Latest Threads

Top