Arithmetic on function address

Stephen Biggs

Given this code:
void f(void){}
int main(void){return (int)f+5;}

Is there anything wrong with this in terms of the standards? Is this legal
C code? One compiler I'm working with compiles this quietly, even with the
most stringent and pedantic ANSI and warning levels, but generates code
that only loads the address of "f" and fails to make the addition before
returning a value from "main".

GCC "does the right thing".

Is there something I'm missing?
 
Eric Sosman

Stephen said:
Given this code:
void f(void){}
int main(void){return (int)f+5;}

Is there anything wrong with this in terms of the standards?

Yes and no. The Standard permits you to cast a pointer
value (even a function pointer value) to an integer, but it
does not guarantee that the result is useful or even usable.

Is this legal C code?

Legal but useless.

One compiler I'm working with compiles this quietly, even with the
most stringent and pedantic ANSI and warning levels, but generates code
that only loads the address of "f" and fails to make the addition before
returning a value from "main".

GCC "does the right thing".

Is there something I'm missing?

Your marbles, perhaps. ;-) What are you trying to
accomplish with this ill-defined operation?
 
Stephen Biggs

Yes and no. The Standard permits you to cast a pointer
value (even a function pointer value) to an integer, but it
does not guarantee that the result is useful or even usable.


But, it then should allow you to add a value to that integer, as the
code says, no? Is this what you mean by no guarantees of it being
usable?

Legal but useless.


Ok, fine... I agree completely that it is useless, but shouldn't correct
code be generated for it?

Your marbles, perhaps. ;-) What are you trying to
accomplish with this ill-defined operation?

Thank you for that :)... I am not trying to accomplish anything besides
running the GCC testsuite on some other compiler that I am trying to
analyze. This is part of the testsuite and passes with GCC, no
problem... I was just wondering if this is a bug in the compiler that I
am trying to run on this code? I want to be sure that this should
generate code correctly before I cry "bug". That is, that it should
generate code to add the constant after the pointer is converted to an
integer.

Thanks for any help.
 
Eric Sosman

Stephen said:
But, it then should allow you to add a value to that integer, as the
code says, no? Is this what you mean by no guarantees of it being
usable?

If you're testing compilers (as you say later on), you really
ought to get yourself a copy of the Standard -- which says (in
section 6.3.2.3, paragraph 6):

Any pointer type may be converted to an integer type.
Except as previously specified [not relevant here],
the result is implementation-defined. If the result
cannot be represented in the integer type, the behavior
is undefined. The result need not be in the range of
values of any integer type.

So: The result of the conversion is implementation-defined, and
need not be a valid value for an `int', and any attempt to create
an invalid value causes undefined behavior. That's "unusable" in
my book. (Gurus: Contrast this paragraph with the apparently
stronger conditions of 6.3.1.3/3: pointer-to-int can generate
U.B. instead of raising an implementation-defined signal.)

Ok, fine... I agree completely that it is useless, but shouldn't correct
code be generated for it?

It's hard to understand what "correct" means when describing
what might be undefined behavior. I think you need to do two
things before concluding that the generated code is "incorrect:"
You need to consult the compiler's own documentation to find how
it defines the conversion result, and you then need to determine
whether the result is in range for an `int'. Then:

- If the conversion result is defined but out of range,
you have no grounds for complaint. Any and all behaviors
(hence any and all generated code) are "correct."

- If the conversion result is defined and in range, you
may have reason to complain.

- If the conversion result is not defined, you have reason
to complain about the documentation, but not (yet) about
the code generation.

One observation about testing compilers: They usually must
obey several standards, not just one, and some of these may be
just "usual practice" rather than formal standards. That is,
the C Standard represents a sort of "non-negotiable minimum"
for a C implementation, but a C implementation that did *only*
what the Standard required would not enjoy much success. The
prospective users will also want POSIX support and/or Windows
support, they'll want "friendly" behavior when they do things
like clear a pointer to all-bits-zero, they'll want CHAR_BIT
to equal (not exceed) 8, they'll want various guarantees about
signal() behavior, and so on. They will *not* want the strictly-
conforming but user-hostile C compiler of the DeathStation 9000!

Next time you climb into your car, pause and take a look
around. How many of the things you see could be removed without
actually removing the car's ability to get you from Here to
There? Rip out the radio, the air conditioner, the leather
seats, the back seat, the side and rear windows, the power
steering -- you'll still have a strictly-conforming car, but
you might not want to drive it much.
 
Alex Fraser

Stephen Biggs said:
Given this code:
void f(void){}
int main(void){return (int)f+5;}

Is there anything wrong with this in terms of the standards? Is this
legal C code? One compiler I'm working with compiles this quietly, even
with the most stringent and pedantic ANSI and warning levels, but
generates code that only loads the address of "f" and fails to make the
addition before returning a value from "main".

Perhaps it optimised the addition away by moving f() ;).
 
Arthur J. O'Dwyer

But, it then should allow you to add a value to that integer, as the
code says, no? Is this what you mean by no guarantees of it being
usable?

[This explanation based on N869 6.3.2.3#6.]

Not necessarily. The implementation might for instance map the
address of 'f' onto 'INT_MAX', thus producing signed-int overflow
when you try to add 5 to it. *Then*, and only then, is your program
allowed to defrost your refrigerator.
Alternatively, the implementation could map 'f' directly onto a
trap representation in 'int'; then, *any* attempt to use the value
of '(int)f' at all would trigger undefined behavior. (Note that
'(int)f' itself is still a valid construct on such systems; you
can take the 'sizeof ((int)f)' with impunity, but that's all you can
do.)
Finally, according to the word of the C99 draft standard, the
implementation is allowed to map the address of 'f' directly onto
a number so large that 'int' can't hold it. Instant undefined
behavior! (Except IMO in the case of 'sizeof', as above.)

Ok, fine... I agree completely that it is useless, but shouldn't correct
code be generated for it?

If it's completely useless --- and in fact could legitimately do
*anything at all* to your machine --- then what, pray tell, would be
the "correct code" you'd expect to see generated? "Garbage in,
garbage out" is the rule that applies here. Well, more precisely,
"Something that might or might not produce garbage in, something that
might or might not be garbage out."

You mean that on this compiler, the expressions

(int)f AND (int)f+5

compile to the same machine code? This is odd, but perfectly legitimate,
behavior for a conforming optimizing C compiler as long as it documents
its behavior in a conforming fashion. There's nothing wrong with your
compiler (although I would say it's a weird one); there is something
wrong with your test suite.

[BTW, if any experts could explain what 6.3.2.3 #6 means by
"except as previously specified," I'd love to hear it. I don't
recall any "previously specified" cases of the pointer-to-integer
cast's being defined.]

-Arthur
 
Malcolm

"Stephen Biggs"
Given this code:
void f(void){}
int main(void){return (int)f+5;}

One compiler ... generates code that only loads the address of "f"
and fails to make the addition before returning a value from "main".

Something funny is going on. As others have pointed out, legally the compiler
can do anything with such a construct, but if it allows a cast from f to int
then it should put the address of f() in an integer register and then
add five to it. However, how are you checking this? Is it by writing a second
program in which f() is possibly in a different position?
Why not see if you can get an assembly listing of the program to see what
code is being compiled?
 
Stephen Biggs

If you're testing compilers (as you say later on), you really
ought to get yourself a copy of the Standard


Yes... I should, but I am doing this as an employee of a company and
they are too cheap to buy it... :(

-- which says (in
section 6.3.2.3, paragraph 6):

Any pointer type may be converted to an integer type.
Except as previously specified [not relevant here],
the result is implementation-defined. If the result
cannot be represented in the integer type, the behavior
is undefined. The result need not be in the range of
values of any integer type.

So: The result of the conversion is implementation-defined, and
need not be a valid value for an `int', and any attempt to create
an invalid value causes undefined behavior. That's "unusable" in
my book. (Gurus: Contrast this paragraph with the apparently
stronger conditions of 6.3.1.3/3: pointer-to-int can generate
U.B. instead of raising an implementation-defined signal.)


Ok... I understand this... that's just it... the code that this compiler
generates does the conversion and actually returns the address of the
function as an integer. It just silently discards the addition. If
this is implementation behavior, shouldn't it (as you say below about
compilers needing to give more than just what the standard says) either
complain about an invalid value or do the addition also, since it
accepts the conversion in the first place? The function address is
converted to an integer since this is what is returned, so the integer
value should be available for more computation if needed.

Any other behavior, such as what is happening here, is a bug in the
compiler IMHO.

It's hard to understand what "correct" means when describing
what might be undefined behavior.


According to the definition above about "undefined behavior", as I read
it, since function addresses as well as all other pointers in this
compiler are the same size as an int (32 bits), then doing this
conversion is definitely defined. Thus, dropping the addition after
making the conversion is a bug.

I think you need to do two
things before concluding that the generated code is "incorrect:"
You need to consult the compiler's own documentation to find how
it defines the conversion result, and you then need to determine
whether the result is in range for an `int'. Then:

- If the conversion result is defined but out of range,
you have no grounds for complaint. Any and all behaviors
(hence any and all generated code) are "correct."

- If the conversion result is defined and in range, you
may have reason to complain.


That is exactly what is happening here... a function pointer or any
other pointer is the same size as an unsigned int (32 bits). Smaller
actually, here, so it fits in a signed int.

- If the conversion result is not defined, you have reason
to complain about the documentation, but not (yet) about
the code generation.


See above. If I am reading right what you quoted from the standard, then
the behavior is defined. If the conversion result was not defined, then
shouldn't, at least, a warning (or even a pedantic remark) be generated?
This compiler is extremely anal about other aspects of ANSI/ISO C.

One observation about testing compilers: They usually must
obey several standards, not just one, and some of these may be
just "usual practice" rather than formal standards. That is,
the C Standard represents a sort of "non-negotiable minimum"
for a C implementation, but a C implementation that did *only*
what the Standard required would not enjoy much success. The
prospective users will also want POSIX support and/or Windows
support, they'll want "friendly" behavior when they do things
like clear a pointer to all-bits-zero, they'll want CHAR_BIT
to equal (not exceed) 8, they'll want various guarantees about
signal() behavior, and so on. They will *not* want the strictly-
conforming but user-hostile C compiler of the DeathStation 9000!

Next time you climb into your car, pause and take a look
around. How many of the things you see could be removed without
actually removing the car's ability to get you from Here to
There? Rip out the radio, the air conditioner, the leather
seats, the back seat, the side and rear windows, the power
steering -- you'll still have a strictly-conforming car, but
you might not want to drive it much.


Yes... but what "standard" is this car "conforming" to?
Thanks for the help, Eric.
 
Keith Thompson

Stephen Biggs said:
Given this code:
void f(void){}
int main(void){return (int)f+5;}

Is there anything wrong with this in terms of the standards? Is this legal
C code? One compiler I'm working with compiles this quietly, even with the
most stringent and pedantic ANSI and warning levels, but generates code
that only loads the address of "f" and fails to make the addition before
returning a value from "main".

GCC "does the right thing".

Is there something I'm missing?

That's a very odd way to examine the value of an integer expression.
On many systems, the value returned from main doesn't directly map to
the status returned by the program (returning just the low-order 8
bits is common).

You shouldn't expect the result of casting a function address to int
to be at all meaningful, but it does seem odd that you're not seeing
the addition of 5 in the generated code. Is it possible that the
addition is being done during compilation or linking?

Try something like this:

#include <stdio.h>
void f(void){}
int main(void){printf("(int)f+5 = %d\n", (int)f+5);return 0;}

with and without the "+5" (but be aware that adding code to main()
could change the address of f()).
 
CBFalconer

Eric said:
.... snip ...

Next time you climb into your car, pause and take a look
around. How many of the things you see could be removed without
actually removing the car's ability to get you from Here to
There? Rip out the radio, the air conditioner, the leather
seats, the back seat, the side and rear windows, the power
steering -- you'll still have a strictly-conforming car, but
you might not want to drive it much.

Apart from the leather seats, that sounds like a 1954 MG TD. I
would be happy to drive one. :)
 
Mark McIntyre

Yes... I should, but I am doing this as an employee of a company and
they are too cheap to buy it... :(

If your employer can't afford an $18 download from the ANSI website, you
need a different job. At the very least buy it yourself.
 
Jack Klein

On Fri, 23 Apr 2004 23:01:14 +0000 (UTC), Stephen Biggs

[snip]
See above. If I am reading right what you quoted from the standard, then
the behavior is defined. If the conversion result was not defined, then
shouldn't, at least, a warning (or even a pedantic remark) be generated?
This compiler is extremely anal about other aspects of ANSI/ISO C.

You or your employer really need to spend the $18.00 to download a PDF
copy of the standard. You fail to understand the full meaning of
undefined behavior, which is a very important concept in C.

Undefined behavior relieves the implementation (i.e., compiler) of any
and all obligations as far as the C standard is concerned. Most
specifically, no diagnostic of any kind is ever required in the event
of undefined behavior.
 
Malcolm

Jack Klein said:
Most specifically, no diagnostic of any kind is ever required in the
event of undefined behavior.

Though a program that invokes UB is not a correct C program. So a compiler
should issue a diagnostic if possible, and the reason this is not mandated
is that it is often technically too hard for the compiler to do.
 
Keith Thompson

Mark McIntyre said:
If your employer can't afford an $18 download from the ANSI website, you
need a different job. At the very least buy it yourself.

But read the licensing terms first. Paying $18 for a copy of the
standard and making it generally available to employees would violate
the license. The safest thing to do would probably be to spend $18
per programmer and give each programmer his/her own copy.

IANAL, please do not trust any statements I make about copyright law
and licensing.
 
Dan Pop

Eric Sosman said:
Stephen said:
But, it then should allow you to add a value to that integer, as the
code says, no? Is this what you mean by no guarantees of it being
usable?

If you're testing compilers (as you say later on), you really
ought to get yourself a copy of the Standard -- which says (in
section 6.3.2.3, paragraph 6):

Any pointer type may be converted to an integer type.
Except as previously specified [not relevant here],
the result is implementation-defined. If the result
cannot be represented in the integer type, the behavior
is undefined. The result need not be in the range of
values of any integer type.

So: The result of the conversion is implementation-defined, and
need not be a valid value for an `int', and any attempt to create
an invalid value causes undefined behavior. That's "unusable" in
my book.

Not in mine. The feature works in an *implementation-defined* way.
If I want to test it on a given compiler, I *must* first read its
documentation. If the documentation says: "the feature is useless on
this compiler", then, there is no point in testing it. If it assigns
meaningful and useful semantics to it, then I can test whether it
really works as advertised.

The feature is far from useless, as pointed out in a different thread,
because it is the *only* way of printing the value of a pointer to
function without automatically invoking undefined behaviour, unless the
implementation explicitly makes it useless for this purpose.

Granted, it can't be used in strictly conforming programs. So what,
neither can fopen and friends, yet we keep recommending them all the time.

Dan
 
Eric Sosman

Stephen said:
Yes... I should, but I am doing this as an employee of a company and
they are too cheap to buy it... :(

I find this difficult to believe. If it is true, the
credibility of bug reports and other evaluations issuing
from your company will be essentially nil. Make up your
mind: Are you testing compilers, or are you just throwing
random source at them and drawing random conclusions?

[...]
Ok... I understand this... that's just it... the code that this compiler
generates does the conversion and actually returns the address of the
function as an integer. It just silently discards the addition. [...]

It occurs to me that you may have overlooked a compiler
optimization. How do you know that "fetch function pointer
and add five" has not been optimized into "fetch function
pointer plus five?" (I'm actually not interested in the
answer, since any attempt to determine it would necessarily
involve beyond-the-language off-topic mechanisms. Something
to consider, that's all.)
 
Old Wolf

main can't portably return anything other than 0, EXIT_SUCCESS, or
EXIT_FAILURE.

section 6.3.2.3, paragraph 6):

Any pointer type may be converted to an integer type.
Except as previously specified [not relevant here],
the result is implementation-defined. If the result
cannot be represented in the integer type, the behavior
is undefined. The result need not be in the range of
values of any integer type.

So: The result of the conversion is implementation-defined, and
need not be a valid value for an `int', and any attempt to create
an invalid value causes undefined behavior. That's "unusable" in
my book.

Some things to try (could still be undefined, but practically speaking,
might have more chance of working):

return (unsigned long long)f + 5;
return 5 + (int)f;

The feature is far from useless, as pointed out in a different thread,
because it is the *only* way of printing the value of a pointer to
function without automatically invoking undefined behaviour, unless the
implementation explicitly makes it useless for this purpose.

Didn't that thread conclude that this is well-defined:

void (*fp)(void) = f;
print_hex((char *)&fp, sizeof fp);

Granted, it can't be used in strictly conforming programs. So what,
neither can fopen and friends, yet we keep recommending them all the time.

What's wrong with fopen?
 
Sam Dennis

Old Wolf said:
[converting pointers to integers] is far from useless, as pointed out
in a different thread, because it is the *only* way of printing the
value of a pointer to function [without UB]

Didn't that thread conclude that this is well-defined:

void (*fp)(void) = f;
print_hex((char *)&fp, sizeof fp);

ITYM unsigned char; it's well defined, but that's not the value.
What's wrong with fopen?

I'm guessing that it's mainly 7.19.3p8: `The rules for composing valid
file names are implementation-defined.'

One can, technically, get around this with tmpnam, but that's not very
useful for most purposes.
 
Dan Pop

Old Wolf said:
main can't portably return anything other than 0, EXIT_SUCCESS, or
EXIT_FAILURE.

main can't portably return *anything*. No matter what you return from
main, it has implementation-defined semantics.

section 6.3.2.3, paragraph 6):

Any pointer type may be converted to an integer type.
Except as previously specified [not relevant here],
the result is implementation-defined. If the result
cannot be represented in the integer type, the behavior
is undefined. The result need not be in the range of
values of any integer type.

So: The result of the conversion is implementation-defined, and
need not be a valid value for an `int', and any attempt to create
an invalid value causes undefined behavior. That's "unusable" in
my book.

Some things to try (could still be undefined, but practically speaking,
might have more chance of working):

return (unsigned long long)f + 5;

Who said it was a C99 compiler?

return 5 + (int)f;


Didn't that thread conclude that this is well-defined:

void (*fp)(void) = f;
print_hex((char *)&fp, sizeof fp);

It's also meaningless, as it displays the *representation*, not the
*value*. Imagine a well behaved, little endian implementation, where
the actual value is 0x12345678. Your output will be 78563412.

What's wrong with fopen?

It can fail for unspecified reasons. Which basically means that all fopen
calls may fail on a conforming implementation. And this is true for most
of the <stdio.h>, so any program trying to produce output is relying on
unspecified behaviour (the success of the output function call(s) it
makes).

Dan
 
Old Wolf

main can't portably return *anything*. No matter what you return from
main, it has implementation-defined semantics.

Not at all. EXIT_SUCCESS is required to make the caller aware of
the fact that "successful termination" occurred. The question of which
bits of data to set at what memory locations to achieve this is of course
implementation-defined, but that doesn't equate to non-portability.

Who said it was a C99 compiler?

Who said it wasn't? The OP was looking for something to work on his
one particular system (that part of the discussion has long since
been snipped though)

It's also meaningless, as it displays the *representation*, not the
*value*. Imagine a well behaved, little endian implementation, where
*value*. Imagine a well behaved, little endian implementation, where
the actual value is 0x12345678. Your output will be 78563412.

How would you propose to display the *value* of any variable ?
Not:
unsigned int i = 42; printf("%u\n%x\n", i, i);
The squiggles '4' '2' juxtaposed are a representation. "%u" means to
present the value in base-10 form. "%x" means to present it in base-16
form. You cannot say that one of the two things printed here is
"the value" and the other is merely "the representation". You cannot
say that they are both "the value", because they are different.

The systems "%u", "%x", and print_hex((void *)&i, sizeof i), all have
the property that you can deduce the value given the representation.
With "%u" it is quite easy for humans to deduce the value, in fact,
we can "do it without thinking". With the print_hex() method, it would
still be possible to deduce the value (but might take a little longer).
The only thing you can say about the print_hex() version is that
the mapping from representation->value is not specified by the Standard.

Getting back to the OP's desires (ie. to find the value of his
function's address), he can in fact now do so without invoking UB,
by obtaining the representation of it with print_hex(), and deducing
from his system what the mapping from representation->value is (which
would not be beyond any of us, I'm sure).
 
