Function pointer assignment and memset'ing - What's happening here?

ggletestggle

What exactly is happening in the following example?
/*******************/
#include <stdio.h>
#include <stdlib.h>
int main()
{
int a;

a = puts;
printf("%d\n", a); /* outputs 4201472 here */

a = &puts;
printf("%d\n", a); /* outputs 4201472 here */

a = *puts;
printf("%d\n", a); /* outputs 4201472 here */

memcpy(&a, puts, sizeof(int));
printf("%d\n", a); /* outputs 16308783 here */

memcpy(&a, &puts, sizeof(int));
printf("%d\n", a); /* outputs 16308783 here */

memcpy(&a, *puts, sizeof(int));
printf("%d\n", a); /* outputs 16308783 here */

return 0;
}
/************************/

- I was hoping at least one of the memcpy lines would be equivalent to at least one of the assignment lines.
- The first 3 (as well as the last 3) being equivalent is kind of counterintuitive to me.
- How can I do a memcpy to &a that is equivalent to "make 'a' store the number corresponding to the address where 'puts' resides"?

Thanks, and sorry about my English.
 
Eric Sosman

What exactly is happening in the following example?

Errors, mostly. If you were to turn up the warning levels
on your compiler, it might issue messages like those I've
interspersed in your source.
/*******************/
#include <stdio.h>
#include <stdlib.h>
int main()
{
int a;

a = puts;
foo.c:7: warning: assignment makes integer from pointer without a cast
printf("%d\n", a); /* outputs 4201472 here */

a = &puts;
foo.c:10: warning: assignment makes integer from pointer without a cast
printf("%d\n", a); /* outputs 4201472 here */

a = *puts;
foo.c:13: warning: assignment makes integer from pointer without a cast
printf("%d\n", a); /* outputs 4201472 here */

memcpy(&a, puts, sizeof(int));
foo.c:16: warning: implicit declaration of function 'memcpy'
foo.c:16: warning: incompatible implicit declaration of built-in
function 'memcpy'
foo.c:16: warning: ISO C forbids passing argument 2 of 'memcpy' between
function pointer and 'void *'
foo.c:16: note: expected 'const void *' but argument is of type 'int
(*)(const char *)'
printf("%d\n", a); /* outputs 16308783 here */

memcpy(&a, &puts, sizeof(int));
foo.c:19: warning: ISO C forbids passing argument 2 of 'memcpy' between
function pointer and 'void *'
foo.c:19: note: expected 'const void *' but argument is of type 'int
(*)(const char *)'
printf("%d\n", a); /* outputs 16308783 here */

memcpy(&a, *puts, sizeof(int));
foo.c:22: warning: ISO C forbids passing argument 2 of 'memcpy' between
function pointer and 'void *'
foo.c:22: note: expected 'const void *' but argument is of type 'int
(*)(const char *)'
printf("%d\n", a); /* outputs 16308783 here */

return 0;
}
/************************/

- I was hoping at least one of the memcpy lines would be equivalent to at least one of the assignment lines.

They are! All six are illegal, and have no defined meaning
in the C language.
- The first 3 (as well as the last 3) being equivalent is kind of counterintuitive to me.

(Shrug.) Illegal is illegal is illegal.
- How can I do a memcpy to &a that is equivalent to "make 'a' store the number corresponding to the address where 'puts' resides"?

There is no well-defined way to do so.

For one thing, `a' is an `int', and there is no guarantee that
an `int' is capable of storing a function pointer's value. On a
system with 64-bit pointers and 32-bit `int', it is exceedingly
unlikely that converting a function pointer to `int' will produce
anything meaningful.

A deeper problem is that C's data pointers and function pointers
need not resemble each other at all: That's why C doesn't define any
way to convert a function pointer to or from `void*'. In principle,
functions and data might occupy entirely different kinds of memory
with different characteristics and addressing modes (some embedded
processors, I'm told, are built this way). Even when executable code
inhabits the same kind of memory that data does, function pointers
may be more complicated values than data pointers: I understand that
on some systems function pointers encode not only the address of a
function's preamble, but information about its signature. Again,
trying to convert such a pointer to an `int' is unlikely to be
useful.

What problem are you trying to solve?
 
Ben Bacarisse

What exactly is happening in the following example?
/*******************/
#include <stdio.h>
#include <stdlib.h>
int main()
{
int a;

a = puts;
printf("%d\n", a); /* outputs 4201472 here */

First, puts may be a macro. To be sure of exactly what's happening you
should #undef puts or write (puts) instead. Let's assume that puts is
not a macro or, if it is, it gets replaced by something that behaves
exactly like a function name...

'puts' (we are assuming) is a "function designator"; it gets converted
(in this and most contexts) to a pointer to the puts function and you
try to convert this function pointer to an int. This is implementation
defined. If 'int' and function pointers have the same size, you are
*probably* getting a decimal representation of the address of the
function.
a = &puts;
printf("%d\n", a); /* outputs 4201472 here */

This is the same as above. The operand of '&' is an exception to the
rule above -- puts is not converted to a pointer in this case, but, of
course, the & operator does that itself.
a = *puts;
printf("%d\n", a); /* outputs 4201472 here */

Again, the same. 'puts' gets converted to a function pointer. When the
operand of * is a function pointer the result is a function designator
and that gets converted to a pointer to the function.
memcpy(&a, puts, sizeof(int));
printf("%d\n", a); /* outputs 16308783 here */

The effect of memcpy when copying between things that are not both
objects is undefined, but you can make a reasonable guess as to what
your implementation is doing. 'puts' is converted to a pointer to the
function and copying *probably* takes place from there.
memcpy(&a, &puts, sizeof(int));
printf("%d\n", a); /* outputs 16308783 here */

memcpy(&a, *puts, sizeof(int));
printf("%d\n", a); /* outputs 16308783 here */

These two are the same for the same reasons given for the first three.
return 0;
}
/************************/

- I was hoping at least one of the memcpy lines would be equivalent to at least one of the assignment lines.
- The first 3 (as well as the last 3) being equivalent is kind of counterintuitive to me.
- How can I do a memcpy to &a that is equivalent to "make 'a' store the number corresponding to the address where 'puts' resides"?

You can copy a function pointer representation like this:

int a;
int (*fp)(const char *) = (puts);
assert(sizeof a == sizeof fp);
memcpy(&a, &fp, sizeof a);

In C99 and later you avoid the declared pointer object by using a
compound literal:

memcpy(&a, &(int (*)(const char *)){(puts)}, sizeof a);
 
Ben Bacarisse

Ben Bacarisse said:
(e-mail address removed) writes:
... This is implementation
defined.

Reading Eric's reply has reminded me that I should have said that this is
implementation defined only with a cast. It's undefined without one. I
don't think that makes very much difference to the behaviour you are
seeing, but I don't want you to think there is a dispute about that.

<snip>
 
Testing Tester

     What problem are you trying to solve?

First, thank you for the explanation. I'm on x86-32 and doing
something very specific, so I'm almost sure there's no way
to stay 'portable'. I'm loading an a.out file into memory, subject to
some constraints, and I need to write the 4-byte address of (say) the
puts function to a certain memory location. My code works as
expected using the following line:

*((int *)(data+curr_reloc->r_address)) = puts;

However, when I tried to replace it with a less ugly one, I wasn't
successful. I was confused about how "*puts", "&puts" and
"puts" produced the same result when trying to replace this line, and
especially about why I couldn't find an equivalent line using memcpy.
 
Keith Thompson

Ben Bacarisse said:
First, puts may be a macro. To be sure of exactly what's happening you
should #undef puts or write (puts) instead. Let's assume that puts is
not a macro or, if it is, it gets replaced by something that behaves
exactly like a function name...

puts may be a *function-like* macro. Even if it is, the name `puts` not
followed by a `(` will not expand that macro; it's a function designator
referring to the standard `puts` function. The standard doesn't permit
names of library functions to be redefined as object-like macros. If it
did, this:

(puts)("hello");

would not reliably call the actual puts function.
'puts' (we are assuming) is a "function designator"; it gets converted
(in this and most contexts) to a pointer to the puts function and you
try to convert this function pointer to an int. This is implementation
defined. If 'int' and function pointers have the same size, you are
*probably* getting a decimal representation of the address of the
function.

In fact, `a = puts;` is not merely implementation-defined or undefined
behavior, it's a constraint violation. There is no implicit conversion
from function pointers to integers. The assignment violates the
constraint specified in N1570 6.5.16.1 regarding the permitted operands
of a simple assignment.

After issuing the required diagnostic, an implementation *may* choose to
generate code equivalent to a conversion, as if you had written:

a = (int)puts;

But that conversion has undefined behavior, simply because the standard
does not define the behavior of such a conversion.

It's very likely that your compiler copies the representation of a
pointer to puts, or a part of it, into a. But (a) you shouldn't depend
on that, and (b) there's very little you can usefully do with the result
anyway.

The memcpy() calls are likewise constraint violations, because there's
no implicit conversion from a function pointer to void*.
 
Testing Tester

This is the same as above.  The operand of '&' is an exception to the
rule above -- puts is not converted to a pointer in this case, but, of
course, the & operator does that itself.
Again, the same.  'puts' gets converted to a function pointer.  When the
operand of * is a function pointer the result is a function designator
and that gets converted to a pointer to the function.

Ah, nice! It makes (some) sense now, although those implicit
conversions are very unintuitive to me.
You can copy a function pointer representation like this:

 int a;
 int (*fp)(const char *) = (puts);
 assert(sizeof a == sizeof fp);
 memcpy(&a, &fp, sizeof a);

Nice, that was what I was looking for!
 
James Kuyper

I don't know how everyone else was able to type their answers quicker
than me, but my answer is not completely redundant with theirs, so I'll
send it anyway.

What exactly is happening in the following example?
/*******************/
#include <stdio.h>
#include <stdlib.h>
int main()
{
int a;

a = puts;

"A function designator is an expression that has function type. Except
when it is the operand of the sizeof operator, the _Alignof operator,
or the unary & operator, a function designator with type 'function
returning type' is converted to an expression that has type 'pointer
to function returning type'." (6.3.2.1p4)

"Any pointer type may be converted to an integer type. Except as
previously specified, the result is implementation-defined. If the
result cannot be represented in the integer type, the behavior is
undefined. The result need not be in the range of values of any integer
type." (6.3.2.3p6)

While that conversion is allowed, it is not one the implementation is
required to perform implicitly. You have to request it explicitly,
otherwise you run afoul of the constraints that apply to simple
assignments (6.5.16.1p1). A conforming implementation of C is required
to generate a diagnostic message - if yours didn't complain about that
line, you need to either increase the warning levels, or change to use a
better compiler. If, after generating the message, your compiler chooses
to accept your code anyway, and if you choose to execute the compiled
program, the behavior is undefined.

Even if you had inserted an explicit cast to 'int', you've given us no
particular reason to expect that the implementation-defined result of
converting a pointer to puts to an integer is a value that can
be represented as an 'int', so the behavior of your code might be
undefined for that reason, too. Either way, there are no restrictions on
how your program is allowed to behave.
printf("%d\n", a); /* outputs 4201472 here */

a = &puts;

The implicit conversion of a function designator into a pointer to the
function, described in 6.3.2.1p4 which I quoted above, is explicitly
described as not occurring when the function designator is the operand
of a unary & operator. Therefore, &puts also results in a pointer to
puts, rather than a pointer to a pointer to puts. This produces the
counter-intuitive result that &puts and puts have the same type and
equivalent values.
printf("%d\n", a); /* outputs 4201472 here */

a = *puts;

In the expression *puts, the conversion of puts to a pointer DOES occur.
The result of applying the unary * operator is itself a function
designator; as such, it too gets implicitly converted into a pointer to
puts. The net result is REALLY counter-intuitive: *puts == &puts.

The C Rationale has this to say about it:
The treatment of function designators can lead to some curious, but valid, syntactic forms.
Given the declarations

int f(), (*pf)();

then all of the following expressions are valid function calls:
(&f)(); f(); (*f)(); (**f)(); (***f)();
pf(); (*pf)(); (**pf)(); (***pf)();

The first expression on each line was discussed in the previous paragraph. The second is
conventional usage. All subsequent expressions take advantage of the implicit conversion of a
function designator to a pointer value, in nearly all expression contexts. The C89 Committee
saw no real harm in allowing these forms; outlawing forms like (*f)(), while still permitting
*a for a[], simply seemed more trouble than it was worth.


Getting back to your message:
printf("%d\n", a); /* outputs 4201472 here */

memcpy(&a, puts, sizeof(int));

The behavior of memcpy() is defined only when both of its first two
arguments point at objects. The expression "puts" points at a function,
so the behavior of this code is undefined. It could print out 16308783,
or 4201472, or -13.45 or "You don't understand the distinction between
objects and functions in C".
printf("%d\n", a); /* outputs 16308783 here */

memcpy(&a, &puts, sizeof(int));
printf("%d\n", a); /* outputs 16308783 here */

memcpy(&a, *puts, sizeof(int));
printf("%d\n", a); /* outputs 16308783 here */

Because of the implicit conversions of function designators to function
pointers, those memcpy() calls have behavior that is just as undefined
as the first one, for precisely the same reason.
return 0;
}
/************************/

- I was hoping at least one of the memcpy lines would be equivalent to at least one of the assignment lines.
- The first 3 (as well as the last 3) being equivalent is kind of counterintuitive to me.
- How can I do a memcpy to &a that is equivalent to "make 'a' store the number corresponding to the address where 'puts' resides"?

There need not be any such address. There need not be any such number.
On implementations where such a number does exist, the expression
(intptr_t)puts is the one that is generically most likely to have that
value. On other implementations, that same expression could have
undefined behavior, so this should only be done in code that's meant to
be used only with that particular implementation.

On an implementation where (intptr_t)puts does generate the value you
want, you can't put that value into 'a' directly using memcpy(). You'd
first have to use assignment to put the value into a different object;
then you can use memcpy() to copy it from that other object into 'a':

intptr_t b = (intptr_t)puts; // Warning: possibly undefined
intptr_t a;
memcpy(&a, &b, sizeof a);

Are you sure you want to use memcpy()? Assignment is simpler.
 
Alan Curry

What exactly is happening in the following example?
/*******************/
#include <stdio.h>
#include <stdlib.h>
int main()
{
int a;

a = puts;

Ignoring the lack of cast, the possible size mismatch, and other issues that
have already been covered, I want to offer a simple way of understanding the
question: why isn't there a simple memcpy equivalent to the above assignment?

It's because the thing you want to put into `a' (the address of `puts') is
not located in any addressable object.

If your program uses puts normally (with a plain call, no function pointer),
like this:

#include <stdio.h>
int main(void)
{
puts("Hello, world!");
return 0;
}

then the compiler output on x86 might contain an instruction like this:

8048250: e8 eb 0a 00 00 call 8048d40

8048d40 is the address of puts in this program. But the bytes in the
instruction are "e8 eb 0a 00 00" - a relative address. The byte sequence you
want to put into `a' is "40 8d 04 08" (0x08048d40 in little-endian byte
order), which appears nowhere in the program.
You can't construct an appropriate source argument to memcpy because you're
trying to "copy" a value that isn't in memory.

Even if the call had been implemented with a jump to an absolute address, it
would be in an inconvenient place for memcpy'ing. There's no operator to
construct a pointer to the argument of a machine instruction implementing a
function call. Fortunately.
 
Eric Sosman

First, thank you for the explanation. I'm on x86-32 and doing
something very specific, so I'm almost sure there's no way
to stay 'portable'. I'm loading an a.out file into memory, subject to
some constraints, and I need to write the 4-byte address of (say) the
puts function to a certain memory location. My code works as
expected using the following line:

*((int *)(data+curr_reloc->r_address)) = puts;

However, when I tried to replace it with a less ugly one, I wasn't
successful. I was confused about how "*puts", "&puts" and
"puts" produced the same result when trying to replace this line, and
especially about why I couldn't find an equivalent line using memcpy.

Your older code still suffers from the problem that there is
no automatic conversion from a function pointer to an `int': The
left-hand side of your assignment is incompatible with the right.
That much could be fixed with a cast:

*( (int*)(data+curr_reloc->r_address) ) = (int)puts;

Alternatively, you could change the cast on the left-hand side
(I've used a typedef to clarify things):

typedef int (*FuncPtr)(const char*);
*( (FuncPtr*)(data+curr_reloc->r_address) ) = puts;

Either of these is likely to do what you want -- although, as I
wrote earlier, the C language does not define what will happen.

The reason you couldn't get memcpy() to work is that you
want to copy the value of a `puts' pointer, but memcpy() copies
bytes. You were trying something similar to

int answer;
memcpy(&answer, 42, sizeof answer);

.... whereas what you needed was something more like

int forty_two = 42;
int answer;
memcpy(&answer, &forty_two, sizeof answer);

That is, memcpy() needs someplace to copy *from*, as well as
someplace to copy to.

By the way, if you plan to execute the code you have loaded
from an `a.out' file, be aware that there may be other system-
specific problems. For example, many machines have separate
caches for instructions and for data, and writing something via
the data path (as you do when you load the `a.out' file into
memory) might not update the instruction cache. Or, you might
need to set an "executable" attribute on the memory pages after
loading them. Or, you might need to turn three times counter-
clockwise while throwing a pinch of henbane over your left shoulder
and calling upon Saint Bitzer -- all of this is well outside what
the C language can do for you on its own.
 
army1987

I was kind of confused on how "*puts", "&puts" and "puts"
produced the same result

From N1570, §6.3.2.1p4:

"A function designator is an expression that has function type. Except
when it is the operand of the sizeof operator, the _Alignof operator, or
the unary & operator, a function designator with type 'function returning
type' is converted to an expression that has type 'pointer to function
returning type'."

(This has the funny effect that
int main(void) { ***************puts("hello, world"); return 0; }
works.)
 
army1987

On Tue, 12 Feb 2013 00:08:32 +0000, army1987 wrote:

[snip]

And of course I should have noticed Ben Bacarisse making the exact same
point a few hours earlier. :-/
 
Ben Bacarisse

army1987 said:
From N1570, §6.3.2.1p4:

"A function designator is an expression that has function type. Except
when it is the operand of the sizeof operator, the _Alignof operator, or
the unary & operator, a function designator with type 'function returning
type' is converted to an expression that has type 'pointer to function
returning type'."

(This has the funny effect that
int main(void) { ***************puts("hello, world"); return 0; }
works.)

No, because the operand of the right-most * is a function call whose
return type is int. You meant to type (***************puts)("hello, world").
 
Ben Bacarisse

Testing Tester said:
First, thank you for the explanation. I'm on x86-32 and doing
something very specific, so I'm almost sure there's no way
to stay 'portable'. I'm loading an a.out file into memory, subject to
some constraints, and I need to write the 4-byte address of (say) the
puts function to a certain memory location. My code works as
expected using the following line:

*((int *)(data+curr_reloc->r_address)) = puts;

However, when I tried to replace it with a less ugly one, I wasn't
successful. I was confused about how "*puts", "&puts" and
"puts" produced the same result when trying to replace this line, and
especially about why I couldn't find an equivalent line using memcpy.

A couple of points...

First, I don't think memcpy has any advantage here -- you may as well
stick with something similar to what you have right now.

The second point is about tidying that up. You know now to include a
cast on the right hand side to get the implementation-defined conversion
from a pointer to an int:

*((int *)(data+curr_reloc->r_address)) = (int)puts;

but it may better convey your intent to use a function pointer type
instead. I'd use typedef for this. If you are not averse to hiding a
pointer type behind a typedef it might look like this:

typedef void (*code_ptr)(void);
...
*((code_ptr *)(data+curr_reloc->r_address)) = (code_ptr)puts;

If you prefer to make all the pointers explicit it will look fussier.

That might convey your meaning more clearly, unless the use of int to
hold addresses is important to the program in some other way.
 
Keith Thompson

[...]

[Context: `int a; a = puts;`]
In fact, `a = puts;` is not merely implementation-defined or undefined
behavior, it's a constraint violation. There is no implicit conversion
from function pointers to integers. The assignment violates the
constraint specified in N1570 6.5.16.1 regarding the permitted operands
of a simple assignment.

After issuing the required diagnostic, an implementation *may* choose to
generate code equivalent to a conversion, as if you had written:

a = (int)puts;

But that conversion has undefined behavior, simply because the standard
does not define the behavior of such a conversion.

Oops, my mistake. The result of a conversion from a function pointer to
an integer, or vice versa, is implementation-defined, not undefined.
See N1570 6.3.2.3 paragraphs 5 and 6.

A conversion from a function pointer to an object pointer, or vice
versa, does have undefined behavior.

*Assigning* an integer to a pointer (object or function), or vice versa
(except for the special case of a null pointer constant) is a constraint
violation; there is no such *implicit* conversion.
 
Rosario1903

What exactly is happening in the following example?
/*******************/
#include <stdio.h>
#include <stdlib.h>
int main()
{
int a;

I imagine one of two models:
1) unsigned puts = (unsigned) AddressWherePutsIs;
2) puts is the immediate address of the function
a = puts;
printf("%d\n", a); /* outputs 4201472 here */
1)ok
2)ok

a = &puts;
printf("%d\n", a); /* outputs 4201472 here */

1) this would not be the same address;
it would be the address of the puts variable
2) it has no address
a = *puts;
printf("%d\n", a); /* outputs 4201472 here */

1)this would not be
2)this would not be

If C says otherwise, it does not have a single machine model.
I would say that C imposes, if f() is a function: f = &f = *f.
I don't know why...
Then I'm against your IEEE floating-point model
and for a fixed-point model.
 
Tim Rentsch

Testing Tester said:
First, thank you for the explanation. I'm on x86-32 and doing
something very specific, so I'm almost sure there's no
way to stay 'portable'. I'm loading an a.out file into memory,
subject to some constraints, and I need to write the 4-byte
address of (say) the puts function to a certain memory location.
My code works as expected using the following line:

*((int *)(data+curr_reloc->r_address)) = puts;

However, when I tried to replace it with a less ugly one, I wasn't
successful. I was confused about how "*puts", "&puts" and
"puts" produced the same result when trying to replace this line,
and especially about why I couldn't find an equivalent line using
memcpy.

It's almost always a really bad idea to cast an expression to a
pointer type and then store through the resulting pointer value.
Presumably one or the other of 'data' and 'curr_reloc->r_address'
is a pointer, probably a pointer to a character type, and the
other is of integer type. If so it's probably better to use
memcpy, eg,

int (*f)() = puts;
memcpy( &data[curr_reloc->r_address], &f, sizeof f );

or similar, perhaps written inside an inline function of some
sort if reuse is expected.

Having said that, if you are intent on using the pointer casting
technique, one, at least make it clear that an address is what's
being cast, and two, simply use (a pointer to) the type of the
value being assigned. Thus,

* (int(**)()) &data[curr_reloc->r_address] = puts;

(again under the assumption that the LHS subexpressions' types
are suitable) is a cleaner way of writing a direct assignment.
 
Tim Rentsch

James Kuyper said:
[snip]
Getting back to your message:
printf("%d\n", a); /* outputs 4201472 here */

memcpy(&a, puts, sizeof(int));

This behavior of memcpy() is defined only when both of its
first two arguments point at objects. The expression "puts"
points at a function, so the behavior of this code is
undefined. [snip elaboration]

More importantly it has a constraint violation, and must
elicit a diagnostic when compiled.
 
Tim Rentsch

Eric Sosman said:
C's data pointers and function pointers need not resemble each
other at all: That's why C doesn't define any way to convert a
function pointer to or from `void*'. In principle, functions
and data might occupy entirely different kinds of memory with
different characteristics and addressing modes (some embedded
processors, I'm told, are built this way). Even when executable
code inhabits the same kind of memory that data does, function
pointers may be more complicated values than data pointers: I
understand that on some systems function pointers encode not
only the address of a function's preamble, but information about
its signature.

In C the word is type. Some other languages (notably Russell,
which coined the term) have signatures; what C has is types.
 
