gcc knows about malloc()

Rafael Almeida

There are two successive instances of UB which might seem to compensate
for each other... but that might be false.

In mathematics you can't say: "I divided by zero, but just after that,
I multiplied by zero to compensate".

Is multiplying by zero a ub in math?
 
Richard Heathfield

Rafael Almeida said:
Is multiplying by zero a ub in math?

We might reasonably replace "exhibits undefined behaviour" with "is
meaningless" for the purposes of a mathematical analogy.

So: if X is the (meaningless) result of a division by zero, multiplying X by
anything, including zero, is meaningless.
 
Old Wolf

James said:
The gcc compiler treats malloc() specially!

gcc assumes that malloc()'s return value is not an
alias for another pointer. This can be confirmed by compiling
and running the following code twice, once with a
-Dgetnode=malloc
option.
void *getnode(int x)
{
    return &A[x];
}

Defining an external symbol with the same name as a
standard library function (whether or not you have included
the relevant header) causes undefined behaviour.

Since your code has undefined behaviour, any bizarre
results should not be considered a compiler bug.
To avoid the problem (assuming the <stdlib.h> file has been
accidentally erased, but one knows malloc's correct declaration)
one could write:
void *malloc();
...
foo = malloc(TWELVE_MILLION);

Or even (if your employer docks your pay for explicitly declaring
library functions) :

if (0) malloc(); /* tell gcc that malloc is a function */
foo = ((void *(*)())malloc)(TWELVE_MILLION);

I can't imagine an employer rejecting the first example
but accepting the second!

In either case it is hard to see why you would do anything
other than:
#include <stdlib.h>

If your compiler does not have that file then it is non-conforming
so this whole discussion is moot!
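
For reference, a minimal sketch of that ordinary approach: include <stdlib.h> and call malloc() directly, so the compiler sees the real prototype. The helper name alloc_ints is hypothetical, chosen only for this illustration.

```c
#include <stdlib.h>   /* provides the prototype void *malloc(size_t) */

/* Hypothetical helper: room for n ints, or NULL on failure. */
int *alloc_ints(size_t n)
{
    /* No cast needed: void * converts implicitly to int * in C. */
    int *p = malloc(n * sizeof *p);
    return p;
}
```

The caller is expected to check the result for NULL and to pass it to free() when done.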
 
Keith Thompson

SOMEBODY said:
[Eric Sosman wrote]:
And even that unlikely
problem could be corrected with a small change:

if (0) malloc(); /* tell gcc that malloc is a function */
foo = ((void *(*)(size_t))malloc)(TWELVE_MILLION);

Actually, I think this approach would also fix the "uninitialized
garbage from wrong return register" issue, since when the call is
made, the compiler has the correct return type.

It might happen to work, but it still invokes undefined behavior,
because you're still lying to the compiler. Without a prototype in
scope, there's no guarantee that the compiler knows how to convert
malloc (which it will assume is a pointer to function returning int)
to type void *(*)(size_t).
 
Keith Thompson

Jordan Abel said:
READ AGAIN. He didn't cast the result of the call. He cast the word
"malloc" and then called the result of the cast. Only one person here
even has a coherent argument. He's not talking about casting the
expression malloc(whatever), he's talking about casting the bare word
"malloc".

Ok, I read again. I was responding to the material that I quoted, not
(at that time) to James's proposed solution.

And as I said elsethread, casting malloc itself, as opposed to casting
the result of calling malloc(), *still* doesn't avoid undefined
behavior.
 
jaysome

You understate the problem. The int being converted is not guaranteed
to be meaningful, whether you convert it or not. The call itself
invokes undefined behavior before you even look at the result.

(On some systems, where int and void* happen to be the same size, and
    ^^^^^^^^^^^^
that should be "nearly all systems", by my accounts.
 
Richard Heathfield

jaysome said:
that should be "nearly all systems", by my accounts.

I have at least two C compilers right here which use 16-bit int and 32-bit
void *. Both are installed and in reasonably frequent use.

Of course, you might think that Richard's an awkward git who keeps these
things around just to win arguments with, but in fact the real reason is
that some of the people who use my code *require* it to work on such
architectures, and I do so hate to disappoint people.
 
jaysome

jaysome said:


I have at least two C compilers right here which use 16-bit int and 32-bit
void *. Both are installed and in reasonably frequent use.

Of course, you might think that Richard's an awkward git who keeps these
things around just to win arguments with, but in fact the real reason is
that some of the people who use my code *require* it to work on such
architectures, and I do so hate to disappoint people.

I did say "nearly all systems", and not "all systems". I think you
helped to confirm my assertion.
 
jaysome

Not really, less than half of mine (most being 64 bit).

I don't doubt you, but you're only one in a thousand, if not a
million. What do you think about that assertion?
 
Richard Heathfield

jaysome said:
I did say "nearly all systems", and not "all systems". I think you
helped to confirm my assertion.

Not at all. In fact, 100% of all systems that have been mentioned in
response to your assertion have been counter-examples. 0% is not "nearly
all" by any stretch of the imagination.
 
Richard Heathfield

jaysome said:
I don't doubt you, but you're only one in a thousand, if not a
million.

No, he's 1 in 2, so far. And his "less than half", coupled with my own
reply, still means that less than half the systems mentioned so far are in
line with your assertion.
What do you think about that assertion?

I think you need to learn that 2 != 1000, let alone 1000000.
 
SuperKoko

Rafael said:
Is multiplying by zero a ub in math?
No. Multiplying by zero is correct in math (so my analogy is not
perfect).
Here, in math, there is a single "undefined thing": division by zero.
But the idea is that you can't compensate for UB in any way.
Trying to compensate with well-defined behavior (the multiplication
by zero) doesn't help, and trying to compensate with undefined
behavior is no better.



Jordan Abel:
The part you quoted doesn't support that. I'm not saying you're wrong,
i'm saying the quote doesn't support it. It says you can't _call_
a function through a pointer of the wrong type.
No, the standard doesn't say that.
It says that you can convert (with an explicit C cast) a pointer to
function to a pointer to function of an incompatible type, and cast it
back to a compatible type. Otherwise, taking a pointer converted (with
an explicit C cast, not by some buggy reinterpretation of the bytes of
the pointer's representation) from an original int(*)() pointer to a
void*(*)() pointer, and calling a function through this new pointer,
is UB.
Strictly speaking, your program contains a single pointer conversion:
from int(*)() to void*(*)().

The original pointer is int(*)(). Nowhere in your code does an
explicit conversion from void*(*)() to int(*)() appear. You just have a
buggy means of getting an int(*)() pointer from nowhere sensible (from
a symbol which should refer to an int() function but which you
interpret as a void*() function).

Suppose, for instance, that the compiler documents that the first cast
yields a valid pointer to function: the compiler (with an extension)
has two different "overloaded" functions, an "int malloc(void)" and a
"void* malloc(size_t)", and it documents that the explicit declaration
of "int malloc()" is valid and refers to a function which is quite
different from "void* malloc()". This "int malloc()" function could,
for example, be an "atom allocator", yielding a different int at each
call (until they are released with "void free(int)").

Such compiler extension doesn't affect any strictly conforming program
(and thus, the compiler remains ISO-compliant) and seems even quite
sensible (except that overloading malloc is not a good idea).
In that case, the declaration does not have UB, but the conversion
int(*)() -> void*(*)() yields a pointer that you can't use. It would be
OK if converted back to int(*)().

Combining UB is not a good idea.
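
A small sketch of the distinction drawn above, using only standard C: converting a function pointer to an incompatible function pointer type and back is allowed, and only a call through the incompatible type is undefined. The names int_fn, ptr_fn, forty_two and round_trip_call are made up for this illustration and have nothing to do with malloc itself.

```c
#include <stddef.h>   /* size_t */

typedef int   (*int_fn)(void);    /* hypothetical type names */
typedef void *(*ptr_fn)(size_t);

static int forty_two(void) { return 42; }

int round_trip_call(void)
{
    /* The conversion to an incompatible function pointer type is allowed. */
    ptr_fn wrong = (ptr_fn)forty_two;

    /* Calling wrong(...) at this point would be undefined behaviour. */

    /* Converted back, the pointer compares equal to the original... */
    int_fn back = (int_fn)wrong;

    /* ...and calling through the compatible type is well-defined. */
    return back();
}
```

Some compilers warn about the first cast, which seems fair given how little can legitimately be done with the result.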
 
Ian Collins

jaysome said:
I don't doubt you, but you're only one in a thousand, if not a
million. What do you think about that assertion?
There are at least as many 64-bit desktop systems shipping these days as
there are 32-bit, not including the server space, which has an even
greater bias to 64-bit.
 
James Dow Allen

[A poor choice of words: I used "cast" to mean "declare."
But I did show a specific code sample to clarify my intent.
And in one sample, I did "cast" malloc() rather than its
return value.]
This is completely nuts. Without a suitable declaration of malloc() in
scope, the return value is assumed to be an int.

Not by gcc, in sample code depicted.
Yeah. And who doesn't know that? Apparently, it's you,...

I didn't mean to start a flame-war.

I'll respond to some other comments:

Morris said:
James Dow Allen wrote:
| The gcc compiler treats malloc() specially!

You seem surprised. Gcc implementations can sometimes be enhanced to
take advantage of target processor architectural features.

Somewhat surprised, and also impressed. I wondered whether it was
documented and whether there was similar special handling of
any other functions. Since it's a clever type of "noalias"
optimization, it seemed like it might be of general interest even
for those not using gcc.

Only Morris actually responded to this, the major intent of my post.
(And he just used it as a plug for open-source :) )
Some may find this post ...
off-topic or confusing.

This was prescient! For example:

Mark said:
No. If you omit the header, the compiler must assume malloc returns an
int. It is a common mistake to think that casting the int to a pointer
"fixes" this. Not so.

Mark makes two points, one wrong, the other precisely a point I was
making.

The wrong point is "If you omit the header, the compiler must
assume malloc returns an int." The header provides a declaration
of what malloc does return. Omit the header and declare malloc
yourself for the same effect. This is *wrong* as a matter of
style and maintainability, but not as a matter of C syntax
or semantics.

Mark's correct point is that "casting the int" doesn't in general
work. I explained this, and showed how (at least for gcc)
to fix it.

Jordan said:
But he's NOT making the call until it's the right type.

I'm glad someone at least acknowledges that my peculiar
construction "works"! I'm sure the formula is non-conformant
(I think it's sad the beautiful "portable assembly language"
has been taken over by lawyer-like thinking: frankly I stick
to K&R C), but *with the gcc compiler* it works just like I
(and Jordan) imply.

James Dow Allen
 
SuperKoko

jaysome said:
that should be "nearly all systems", by my accounts.
I don't know how you "count" systems.
Must it be weighted by the number of CPUs sold on the market for
personal computers for each system?
If the answer is no, then your statement is false.

If the answer is yes (since mostly IA-32 CPUs are sold on the
market for personal computers), then you might be correct. But not for
long.
In the very near future, most personal computers will effectively use
64-bit architectures.
And the LP64 model (32-bit int, 64-bit long, 64-bit pointers) seems
to be the first choice of x86-64 based platforms (I'm quite sure that
PowerPC/OS-X and x86-64 Linux are using LP64 right now; correct me if
I'm wrong).
It looks like the next Microsoft Windows will use LLP64 (32-bit int,
32-bit long, and 64-bit pointers), mainly for backward compatibility
with programs which assumed that long was 32 bits.
Here too, sizeof(int)!=sizeof(void*).

In that case nearly all systems will have sizeof(int)!=sizeof(void*)
(by number of CPUs for personal computers).

It's funny to see how many programmers assumed
sizeof(int)==sizeof(void*) on the x86-32 architecture while it was
often false on x86-16 and will often be false on x86-64.
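
Rather than counting systems, one can simply ask the implementation at hand. A rough sketch, where data_model and int_matches_pointer are hypothetical helper names and the widths are storage widths computed from sizeof and CHAR_BIT:

```c
#include <limits.h>   /* CHAR_BIT */

/* Hypothetical helper: name the data model from the storage widths
   of int, long and void *. Anything unrecognised is "other". */
const char *data_model(void)
{
    unsigned i = (unsigned)(sizeof(int)    * CHAR_BIT);
    unsigned l = (unsigned)(sizeof(long)   * CHAR_BIT);
    unsigned p = (unsigned)(sizeof(void *) * CHAR_BIT);

    if (i == 32 && l == 64 && p == 64) return "LP64";
    if (i == 32 && l == 32 && p == 64) return "LLP64";
    if (i == 32 && l == 32 && p == 32) return "ILP32";
    return "other";
}

/* Nonzero when int and void * happen to have the same size. */
int int_matches_pointer(void)
{
    return sizeof(int) == sizeof(void *);
}
```

Under either LP64 or LLP64, int_matches_pointer() returns 0, which is the point being made about 64-bit platforms.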
 
Keith Thompson

James Dow Allen said:
I'm glad someone at least acknowledges that my peculiar
construction "works"! I'm sure the formula is non-conformant
(I think it's sad the beautiful "portable assembly language"
has been taken over by lawyer-like thinking: frankly I stick
to K&R C), but *with the gcc compiler* it works just like I
(and Jordan) imply.

Ok, it happens to work with gcc, but frankly that's no more
interesting than the fact that
int *ptr = (int*)malloc(sizeof(int));
(with no visible prototype for malloc()) happens to work on many
systems. Both invoke undefined behavior.
 
Eric Sosman

Gordon said:
It's also an error if the return type of malloc() is not int, which
it shouldn't be.

It depends on what you think "it" is. If "it" is the
original code fragment
> if (0) malloc(); /* tell gcc that malloc is a function */
> foo = ((void *(*)())malloc)(TWELVE_MILLION);

.... then there's no "implicit int" in play at the point of
the call. The call uses a function pointer whose type almost
matches that of malloc() -- the only discrepancy being the
omission of the prototype -- so all is well unless, as I
mentioned, size_t is promotable. (I overlooked something:
there's also trouble if, for example, TWELVE_MILLION is an
int or double or some such, because there's no prototype to
force its conversion to size_t.)
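
To illustrate that parenthetical point, here is a sketch in which the pointer type does carry a prototype, so an int argument is converted to size_t at the call. TWELVE_MILLION is defined here as a plain integer constant, which is an assumption: the thread never shows its definition. The function name alloc_twelve_million is likewise hypothetical.

```c
#include <stdlib.h>   /* malloc, size_t */

/* Assumed definition; the original post does not show one. */
#define TWELVE_MILLION 12000000

void *alloc_twelve_million(void)
{
    /* Fully prototyped pointer type: the parameter is known to be size_t. */
    void *(*alloc)(size_t) = malloc;

    /* The integer constant is converted to size_t because of the prototype.
       Through an unprototyped pointer, no such conversion would happen. */
    return alloc(TWELVE_MILLION);
}
```

The caller should check the result for NULL and free() it afterwards.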

Let's try a variation, shall we? In one file we'll have

#include <stdlib.h>
typedef void * (*fptr)(size_t); /* for readability */
fptr get_malloc_ptr(void) {
    return malloc;
}

This function just returns a pointer to malloc(), which in turn
is properly declared by the header. Now in a separately-compiled
file we'll have

#include <stddef.h> /* for size_t */
typedef void * (*fptr)(size_t);
fptr get_malloc_ptr(void); /* defined in the first file */
void *my_malloc(size_t bytes) {
    void * (*f)(size_t) = get_malloc_ptr();
    return f(bytes);
}

This function acquires a pointer to malloc(), calls via the pointer,
and returns the result. Note carefully that no malloc() prototype
is visible at the point of the call -- there isn't a declaration
of any kind visible, with or without a prototype. Yet there is no
error here, no undefined behavior, no contravention of anything
except good sense.

The original code does pretty much the same thing, except
that it uses a different way of acquiring the pointer to malloc().
Both examples make their actual call to malloc() via a function
pointer expression of the proper type.
It is possible (even likely on some architectures, like 680X0) that
a pointer value is returned in the Pointer Return Register (e.g.
a0) and that integers are returned in the Integer Return Register
(e.g. d0) or the Integer Pair Return Registers (e.g. (d1,d0)).

If you use malloc() without a prototype in scope, the compiler will
pick up the uninitialized garbage in the Integer Return Register,
and perhaps move it to the Pointer Return Register when you cast
it, thereby losing the valid result and replacing it with garbage.

I think you have misread "it:" The operand of the cast
is not what you seem to believe it is.
Actually, I think this approach would also fix the "uninitialized
garbage from wrong return register" issue, since when the call is
made, the compiler has the correct return type.

The only difference between this and the original is the
insertion of the prototype. There is no UGFWRR issue in either
the original or the modified version.
 
Erik Trulsson

Somewhat surprised, and also impressed. I wondered whether it was
documented and whether there was similar special handling of
any other functions. Since it's a clever type of "noalias"
optimization, it seemed like it might be of general interest even
for those not using gcc.

GCC knows about and has special handling for a large number of standard C functions
(as well as a smaller number of functions that are not part of Standard C.)

Since GCC knows how these functions are supposed to behave it can do
extra optimizations as well as give extra warnings for them.

Which functions it knows about is documented in the GCC manual (search for
'built-in functions'.) (GCC manuals can be found at http://gcc.gnu.org/onlinedocs/ )


Some other compilers do similar things. This is one reason it can be a very bad
idea to try to define your own versions of standard functions, since the compiler
might generate inlined code for the standard function and never generate a function call
at all, or might do some other optimizations.
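
On the original question of "noalias"-style handling: GCC also lets you mark your own allocator as malloc-like with the "malloc" function attribute, which the GCC manual describes as telling the compiler that the returned pointer does not alias any other pointer valid when the function returns. A minimal sketch, in which node_alloc and the MALLOC_LIKE macro are hypothetical names, and the attribute is only applied when the compiler defines __GNUC__:

```c
#include <stdlib.h>   /* malloc, size_t */

/* Hypothetical macro: expands to the GCC attribute where available. */
#if defined(__GNUC__)
#  define MALLOC_LIKE __attribute__((malloc))
#else
#  define MALLOC_LIKE
#endif

/* Hypothetical allocator with its own name, so it does not collide
   with any standard library function. */
MALLOC_LIKE void *node_alloc(size_t size);

void *node_alloc(size_t size)
{
    /* Fresh storage from malloc really is unaliased, so the promise holds. */
    return malloc(size);
}
```

Going the other way, GCC's -fno-builtin option (or -fno-builtin-malloc for a single function) asks the compiler not to apply its built-in knowledge of a function.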
 
