Odd behavior with odd code


Michael Speer

> The problem with casting a pointer-to-label to a pointer-to-function is that
> a label is not a function. The beginning of a function contains a prologue
> which builds the appropriate stack frame, retrieves arguments from wherever
> they are, etc.

In my understanding of the cdecl convention it states the code calling
the function sets up the arguments for the stack frame then calls the
function. Then the call command pushes the current base and stack
registers onto the stack. Then only local variables would require the
base and stack pointers to be moved again. The ret command undoes the
stack frame register push on the stack. Finally the calling code
undoes the argument push and the frame is back where it started.

So the code in the function itself need only create local variables.
So using my hack above should break on functions that have local
variables, but not ones that do not.

So it should work to jump in halfway, as long as there are no local
variables that would have missed being setup.

> The compiler doesn't know it needs to do that at a label
> used as a function entry point -- in fact it probably can't do it even if it
> does know it needs to, since you can (usually) arrive at the label when it's
> _not_ being used as a function entry point.

To the compiler it should not matter. I have told the compiler
through casting that my label points to a function entry taking an int
and char** argument set, so it should do a normal cdecl argument push
and call the address at the label.

> My guess is that this extension was created so that labels could be stored
> in variables and one could later do "goto variable;",
> ... which is so far out there it's
> not even just "wrong".

Correct on both accusations. In defense of the hack, a pointer to a
memory location is a pointer to a memory location. There aren't
actually pointers to objects, functions, labels, etc. Just pointers
that we inform the compiler will be used for these purposes so it
knows how to setup any pointer arithmetic as it needs to do. Outside
of that, one could goto 0x08374322 and it should work, if the code
they want to jump to is really located at that point.

> There's probably a portable way to implement whatever you're trying to do;
> give us the problem first, not the solution, and we can try to help.
>
> S

Unfortunately this is a solution without a problem. Or a problem
starting with the phrase "I wonder if I could...", depending on how
you look at it.
 

Michael Speer

> The problem with casting a pointer-to-label to a pointer-to-function is that
> a label is not a function. The beginning of a function contains a prologue
> which builds the appropriate stack frame, retrieves arguments from wherever
> they are, etc. The compiler doesn't know it needs to do that at a label
> used as a function entry point -- in fact it probably can't do it even if it
> does know it needs to, since you can (usually) arrive at the label when it's
> _not_ being used as a function entry point.
> My guess is that this extension was created so that labels could be stored
> in variables and one could later do "goto variable;",

> A compiler must treat
>
>     void f(void) {
>         void *p;
>         goto p;
>     p:
>         return;
>     }
>
> as a jump to label p, not to the address referenced by variable p.
>
> [OT] In "GNU C", the syntax for a jump to a stored address is goto *p;
> [/OT]

gcc allows `goto address' using the `&&' operator to take the address
of a label. It was intended to allow a programmer to store addresses
in an array and then `goto label_array[ 3 ] ;' which is presumably
faster than a switch would be as the assembly for a switch would have
to check a series of values for equivalence waiting to jump to the
address specified by the first match.

Of course, with this behavior being an extension of C, it is doubtful
to me that this sort of code is common anywhere at all. Merely
possible.
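
For illustration, the label-array use described above might look like the sketch below. It assumes gcc or clang, since `&&label` and `goto *expr;` are GNU C extensions rather than standard C. The function `run_program` and the opcode names are invented for this example. Whether this beats a switch depends on the compiler; a dense switch is often compiled to a jump table as well.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for a toy accumulator machine (not from the thread). */
enum { OP_INC, OP_DOUBLE, OP_HALT };

/* Runs a program of opcodes, dispatching through an array of label
 * addresses. GNU C only: &&label and goto *expr are extensions. */
int run_program(const unsigned char *program, size_t length)
{
    /* One label address per opcode, indexed by opcode value. */
    static void *const dispatch[] = { &&do_inc, &&do_double, &&do_halt };
    int acc = 0;
    size_t pc = 0;

    assert(program != NULL && length > 0);

next:
    if (pc >= length)
        return acc;                   /* ran off the end: stop */
    assert(program[pc] <= OP_HALT);   /* keep the table index in range */
    goto *dispatch[program[pc++]];    /* note the '*' */

do_inc:
    acc += 1;
    goto next;
do_double:
    acc *= 2;
    goto next;
do_halt:
    return acc;
}
```

Calling `run_program` on the bytes `OP_INC, OP_INC, OP_DOUBLE, OP_HALT` increments twice and doubles once before halting.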
 

Flash Gordon

Michael Speer wrote, On 17/02/07 20:59:
> In my understanding of the cdecl convention it states the code calling
> the function sets up the arguments for the stack frame then calls the
> function. Then the call command pushes the current base and stack

<snip>

The C standard does not define calling conventions and a number of
different calling conventions are used even on a PC running Windows let
alone when you move beyond that into the wide world.

> Correct on both accusations. In defense of the hack, a pointer to a
> memory location is a pointer to a memory location. There aren't
> actually pointers to objects, functions, labels, etc. Just pointers

Wrong. I've worked on processors where data and program were in
completely separate address spaces. Others have worked on even stranger
systems.

> that we inform the compiler will be used for these purposes so it
> knows how to setup any pointer arithmetic as it needs to do. Outside
> of that, one could goto 0x08374322 and it should work, if the code
> they want to jump to is really located at that point.

The world is far more complex than you imagine.

> Unfortunately this is a solution without a problem. Or a problem
> starting with the phrase "I wonder if I could...", depending on how
> you look at it.

If you want to discuss what hacks you can do in your implementation then
please take it to a group dealing with your implementation.
 

Guest

Michael said:
Stephen said:
>> Are you trying to implement coroutines?
> The whole of my immediate intent was to discern if it was even
> possible to take the address of a label and use casting to access the
> code in a function at an intermediate point. I had ideas ( the
> generator or partial recursion ) of things it might be used for. From
> what I know of them coroutines would be something that could be
> implemented using this type of behavior.
>> The problem with casting a pointer-to-label to a pointer-to-function is that
>> a label is not a function. The beginning of a function contains a prologue
>> which builds the appropriate stack frame, retrieves arguments from wherever
>> they are, etc. The compiler doesn't know it needs to do that at a label
>> used as a function entry point -- in fact it probably can't do it even if it
>> does know it needs to, since you can (usually) arrive at the label when it's
>> _not_ being used as a function entry point.
>> My guess is that this extension was created so that labels could be stored
>> in variables and one could later do "goto variable;",

>> A compiler must treat
>>
>>     void f(void) {
>>         void *p;
>>         goto p;
>>     p:
>>         return;
>>     }
>>
>> as a jump to label p, not to the address referenced by variable p.
>>
>> [OT] In "GNU C", the syntax for a jump to a stored address is goto *p;
>> [/OT]

> gcc allows `goto address' using the `&&' operator to take the address
> of a label.

No, it doesn't. That was the point of my previous message. It allows
`goto *address', not `goto address'.
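
The distinction can be seen side by side in the short sketch below. The first function is standard C; the second assumes gcc or clang, because it uses the GNU C labels-as-values extension. Both function names are made up for this illustration.

```c
/* Standard C: labels live in their own name space, so `goto p;`
 * always targets the label p, never the variable p. */
int goto_label_not_variable(void)
{
    int p = 0;
    goto p;        /* jumps to the label below */
    p = 1;         /* skipped */
p:
    return p;      /* still 0 */
}

/* GNU C extension: &&label yields a void *, and `goto *expr;`
 * jumps to whatever label address the expression holds. */
int goto_stored_address(int take_second)
{
    void *target = take_second ? &&second : &&first;
    goto *target;
first:
    return 1;
second:
    return 2;
}
```

Writing `goto target;` in the second function would be an error there, since no label named `target` exists.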
 

Stephen Sprunk

Michael Speer said:
> In my understanding of the cdecl convention

The calling convention varies by implementation, and some implementations
even allow you to set it as a compile-time option. Making generalizations
is very tricky here. Of course, you're playing with an off-topic extension,
so you're already playing with fire.

> it states the code calling the function sets up the arguments for the
> stack frame then calls the function. Then the call command pushes the
> current base and stack registers onto the stack.

That's not how the ABIs I'm most familiar with work. (Assume from here down
that we're talking x86 or x64)

The caller pushes the arguments onto the stack, or stuffs them in registers,
or both. Then the CALL instruction pushes the return address onto the
stack, and jumps to the target. The callee then sets up a new stack frame
by pushing the caller's frame pointer onto the stack and moving the new
stack pointer into the frame pointer register.

> Then only local variables would require the base and stack pointers to be
> moved again.

The frame (or base) pointer shouldn't be changed, only the stack pointer.

> The ret command undoes the stack frame register push on the stack.

Nope. The callee must pop the old frame pointer before using RET.

> Finally the calling code undoes the argument push and the frame is back
> where it started.

Correct.

> So the code in the function itself need only create local variables.
> So using my hack above should break on functions that have local
> variables, but not ones that do not.
>
> So it should work to jump in halfway, as long as there are no local
> variables that would have missed being setup.

No, it won't. Since the label isn't known to be a function entry point, it
won't have the prolog, which creates a stack frame, and it won't work as you
expect -- it'll keep using the callee's stack frame. When it returns,
provided you manage to somehow pop the correct number of locals off the
stack, it'll return to the callee, still in the same stack frame.

This is why your debugger is confused. No additional stack frames are being
created by your calls. Your callee's invocations, no matter how many times it
recurses, are all executing with the caller's stack frame. That also means
if you access arguments in the callee, you're likely to get the caller's
arguments since they're read via the frame pointer, not the stack pointer.
It doesn't surprise me that your function recurses infinitely, since the 0
you're passing doesn't have a way to get read -- you just push another copy
of argv, 0, and the return address onto the stack with each iteration until
you run out of stack space.
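
The effect described above can be mimicked with a toy model. This is only a sketch: it assumes an x86-style convention with frame pointers and stack-passed arguments as outlined earlier in this post, it simulates the stack with an array instead of touching the real one, and every name in it (`inner_sees`, `stack_is_balanced`, `RETURN_ADDR`) is hypothetical. Real compilers may omit the frame pointer or pass arguments in registers.

```c
#include <assert.h>

/* Toy model only: a downward-growing stack of int slots, with the
 * "registers" sp and fp as plain variables. All names are hypothetical. */
enum { STACK_SLOTS = 32, RETURN_ADDR = 1234 };

static int stack[STACK_SLOTS];
static int sp = STACK_SLOTS;   /* index of the current top slot */
static int fp = STACK_SLOTS;   /* frame pointer */

static void push(int v) { assert(sp > 0); stack[--sp] = v; }
static int pop(void) { assert(sp < STACK_SLOTS); return stack[sp++]; }

/* Nonzero when every push has been matched by a pop. */
int stack_is_balanced(void) { return sp == STACK_SLOTS && fp == STACK_SLOTS; }

/* Simulates an outer function (called with outer_arg) calling inner code
 * with inner_arg, and returns the "first argument" the inner code reads.
 * run_prologue != 0: inner code is entered at a real function entry.
 * run_prologue == 0: inner code is entered at a label, so no prologue. */
int inner_sees(int outer_arg, int inner_arg, int run_prologue)
{
    int seen;

    /* Outer call, entered normally. */
    push(outer_arg);       /* caller pushes the argument */
    push(RETURN_ADDR);     /* CALL pushes the return address */
    push(fp);              /* prologue saves the caller's frame pointer */
    fp = sp;               /* prologue starts the new frame */

    /* Inner call. */
    push(inner_arg);
    push(RETURN_ADDR);
    if (run_prologue) { push(fp); fp = sp; }

    /* Body: [fp] saved fp, [fp+1] return address, [fp+2] first argument. */
    seen = stack[fp + 2];

    if (run_prologue) { sp = fp; fp = pop(); }   /* epilogue */
    (void)pop();           /* RET pops the return address */
    (void)pop();           /* caller removes inner_arg */

    /* Outer epilogue and return. */
    sp = fp;
    fp = pop();
    (void)pop();
    (void)pop();
    return seen;
}
```

With the prologue, `inner_sees(7, 0, 1)` reads the inner argument. Without it, `inner_sees(7, 0, 0)` reads through the outer frame pointer and picks up the outer argument instead, which matches the behavior described for the 0 that never gets read.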

> To the compiler it should not matter. I have told the compiler
> through casting that my label points to a function entry taking an int
> and char** argument set, so it should do a normal cdecl argument push
> and call the address at the label.

A label, even if you could coerce it into being a function entry point, does
not take arguments. If you know the implementation details, you might be
able to retrieve them with inline asm, but there's no way for the compiler
to know to do that. Casting it to a function that takes no arguments (and
sending the data you need via one or more global variables) at least has a
tiny chance of working.

Correction; the syntax is "goto *variable;".

> Correct on both accusations. In defense of the hack, a pointer to a
> memory location is a pointer to a memory location. There aren't
> actually pointers to objects, functions, labels, etc. Just pointers
> that we inform the compiler will be used for these purposes so it
> knows how to setup any pointer arithmetic as it needs to do. Outside
> of that, one could goto 0x08374322 and it should work, if the code
> they want to jump to is really located at that point.

Correct at the machine level, at least for many implementations, but the
compiler needs to know what type something is so it knows what instructions
to use to do interesting things with that address. If you tell it an
address is something different than it really is, you're going to get bad
results in most cases.

A function call and a goto/label pair are very, very different beasts.
Trying to coerce one into acting like the other is unlikely to work, and I'm
not sure why you'd want to try anyways. I'd be more curious to see what the
pointer-to-label extension is successfully used for and what problems it
solves, and whether it's really cleaner than something that would be more
portable.
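
As one possibly more portable alternative for the generator idea raised earlier in the thread, a switch on saved state can resume a function partway through without taking label addresses. The sketch below uses only standard C; `struct counter_gen` and `counter_next` are invented names, and the squares it yields are just sample output.

```c
#include <assert.h>

/* Portable generator sketch: resumes where it left off by switching
 * on a saved state instead of jumping to a stored label address. */
struct counter_gen {
    int state;   /* which resume point to continue from */
    int i;       /* loop variable kept across calls */
    int limit;   /* number of values to produce */
};

/* Writes the next square to *out and returns 1, or returns 0 when done. */
int counter_next(struct counter_gen *g, int *out)
{
    assert(g && out);

    switch (g->state) {
    case 0:
        for (g->i = 0; g->i < g->limit; g->i++) {
            *out = g->i * g->i;
            g->state = 1;
            return 1;
    case 1:;    /* resume here, inside the loop body */
        }
        g->state = 2;
        /* fall through */
    default:
        return 0;
    }
}
```

A caller initializes the struct with `state` set to 0 and calls `counter_next` until it returns 0. Locals that must survive between calls have to live in the struct, which is the price of not relying on the stack layout.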

S
 

CBFalconer

Michael said:
.... snip ...

> In my understanding of the cdecl convention it states the code calling
> the function sets up the arguments for the stack frame then calls the
> function. Then the call command pushes the current base and stack
> registers onto the stack. Then only local variables would require the
> base and stack pointers to be moved again. The ret command undoes the
> stack frame register push on the stack. Finally the calling code
> undoes the argument push and the frame is back where it started.

There is no such thing as a cdecl convention in standard C. Try a
newsgroup that deals with the peculiar system you are using.

--
<http://www.cs.auckland.ac.nz/~pgut001/pubs/vista_cost.txt>
<http://www.securityfocus.com/columnists/423>

"A man who is right every time is not likely to do very much."
-- Francis Crick, co-discoverer of DNA
"There is nothing more amazing than stupidity in action."
-- Thomas Matthews
 

Michael Speer

Michael Speer wrote, On 17/02/07 20:59:





<snip>

> The C standard does not define calling conventions and a number of
> different calling conventions are used even on a PC running Windows let
> alone when you move beyond that into the wide world.

Point taken. My hack will only operate if the calling convention used
requires the caller to setup and take down the stack frame, leaving
the callee only having to `ret' in addition to the limitation of no
local variables I already described. In addition whatever cast is
used to force the label memory location to be treated as a function
memory location must specify the calling convention as being the same
as necessary and where possible.

> Wrong. I've worked on processors where data and program were in
> completely separate address spaces. Others have worked on even stranger
> systems.

In a separate data and program memory layout, if you could take ( or
have precomputed ) the address of a label you wanted to jump to in the
program address space and then pass that value to a jmp or call then
it would operate the same. My hack does not require data and code to
overlap. I do, however, understand your point. This is not standard C
and it will not work out in the world because there are many ways the
language and standard library are implemented.

> The world is far more complex than you imagine.

I agree. Aren't `labels' just markers in the code to be passed to the
assembler to be converted into memory addresses for the runtime to
jump or call to?

> If you want to discuss what hacks you can do in your implementation then
> please take it to a group dealing with your implementation.

santosh pointed this out and suggested I move to a gcc group when I
first arrived. I don't plan on continuing this indefinitely. I am
only responding to replies.
 

Keith Thompson

Stephen Sprunk said:
> The problem with casting a pointer-to-label to a pointer-to-function is that
> a label is not a function.
> [...]

No, the problem with casting a pointer-to-label to a
pointer-to-function is that there's no such thing in standard C as a
pointer-to-label.

If you want to ask about gcc's pointer-to-label construct, you'll need
to do so in a gcc-specific newsgroup, probably gnu.gcc.help.
 

Keith Thompson

Michael Speer said:
> I agree. Aren't `labels' just markers in the code to be passed to the
> assembler to be converted into memory addresses for the runtime to
> jump or call to?

Labels are C source code constructs that can be used as the targets of
goto statements. As far as C is concerned, there is no "address"
associated with a label.
 

Dik T. Winter

> In my understanding of the cdecl convention it states the code calling
> the function sets up the arguments for the stack frame then calls the
> function. Then the call command pushes the current base and stack
> registers onto the stack. Then only local variables would require the
> base and stack pointers to be moved again. The ret command undoes the
> stack frame register push on the stack. Finally the calling code
> undoes the argument push and the frame is back where it started.

Assuming one of the many ways C allows things to be done. I know also
variations where arguments are passed through registers and the function
prologue puts them on the stack. Or where arguments are passed through
a data block that is passed to the function, which the function unpacks
in the prologue. And many more variants.
> So the code in the function itself need only create local variables.

Or it may have to do with putting arguments on the stack, or whatever
else is needed.
> So using my hack above should break on functions that have local
> variables, but not ones that do not.
>
> So it should work to jump in halfway, as long as there are no local
> variables that would have missed being setup.

Perhaps if there are no local variables and no arguments. Moreover, it
will fail horribly if a function pointer is larger than a void pointer.
> To the compiler it should not matter. I have told the compiler
> through casting that my label points to a function entry taking an int
> and char** argument set, so it should do a normal cdecl argument push
> and call the address at the label.

What you *assume* is a normal argument push.
 

Michael Speer

> The caller pushes the arguments onto the stack, or stuffs them in registers,
> or both. Then the CALL instruction pushes the return address onto the
> stack, and jumps to the target. The callee then sets up a new stack frame
> by pushing the caller's frame pointer onto the stack and moving the new
> stack pointer into the frame pointer register.

Ah.


> The frame (or base) pointer shouldn't be changed, only the stack pointer.

Combined with your previous description, ah ha.

> Nope. The callee must pop the old frame pointer before using RET.

Makes sense given the above.

> No, it won't. Since the label isn't known to be a function entry point, it
> won't have the prolog, which creates a stack frame, and it won't work as you
> expect -- it'll keep using the callee's stack frame. When it returns,
> provided you manage to somehow pop the correct number of locals off the
> stack, it'll return to the callee, still in the same stack frame.
>
> This is why your debugger is confused. No additional stack frames are being
> created by your calls. Your callee's invocations, no matter how many times it
> recurses, are all executing with the caller's stack frame. That also means
> if you access arguments in the callee, you're likely to get the caller's
> arguments since they're read via the frame pointer, not the stack pointer.
> It doesn't surprise me that your function recurses infinitely, since the 0
> you're passing doesn't have a way to get read -- you just push another copy
> of argv, 0, and the return address onto the stack with each iteration until
> you run out of stack space.

That explains what it does in `-O0' mode. `-O6' must cause the label
to point to the memory location wherein the callee pushes the caller's
information instead of just after as it normally does. ( the label in
the test code directly followed the function name-line ) Hence why `-O6'
seemed to work while `-O0' did not. Perhaps `-O6' took note that
there was never a goto for the label and let it fall to the same
position as the main function since there would be nothing jumping
there. Having it there then made it nothing more than a complexly
gathered function pointer, and caused the code to work because it was
now calling a genuine function entry point.

Your entire post was very insightful, and you have directly shown the
error in my calling convention assumption and why it caused the
behavior it did. Bravo, sir.

Thank you.
 

Guest

Dik said:
> Perhaps if there are no local variables and no arguments. Moreover, it
> will fail horribly if a function pointer is larger than a void pointer.

If a function pointer is larger than a void pointer and void pointers
cannot meaningfully be converted to function pointers, yet the
compiler supports such conversions as an extension anyway, this is not
just useless, it's malicious. In the real world, I wouldn't worry
about such an implementation.
 

Dik T. Winter

>
> Point taken. My hack will only operate if the calling convention used
> requires the caller to setup and take down the stack frame, leaving
> the callee only having to `ret' in addition to the limitation of no
> local variables I already described.

And in addition the restriction that all arguments are passed on the stack,
and not through registers or by any other method. I do not know of *any*
implementation that follows that model.
>
> I agree. Aren't `labels' just markers in the code to be passed to the
> assembler to be converted into memory addresses for the runtime to
> jump or call to?

In most assemblers they are still labels. And most assemblers will convert
them to relative addresses with respect to the start of the function. Then,
when a jump is executed, the assembler will determine whether it is a short
jump or a long jump (when there is such a distinction), and generate a
relative or absolute jump, where the latter is resolved by the linker. The
assembler does not know about memory addresses.
 

Richard Heathfield

CBFalconer said:

> There is no such thing as a cdecl convention in standard C.

Yes there is - by convention, cdecl is a program for converting a type
description from C to English (or some other human language).
 
