Call function address stored in type of size_t?

Adam Warner

Hello all,

I'm very new to C but I have a number of years of Common Lisp programming
experience. I'm trying to figure out ways of translating higher order
concepts such as closures into C. The code will not be idiomatic C.

GCC has an extension to ISO C that permits nested functions:
<http://gcc.gnu.org/onlinedocs/gcc/Nested-Functions.html>

For implementing closures they have a serious limitation:

If you try to call the nested function through its address after the
containing function has exited, all hell will break loose. If you try
to call it after a containing scope level has exited, and if it refers
to some of the variables that are no longer in scope, you may be lucky,
but it's not wise to take the risk. If, however, the nested function
does not refer to anything that has gone out of scope, you should be
safe.

I'm hopeful that if I heap allocate all closed over variables that I will
simulate closures. I'm aware of the distaste some have for the extension:
<http://groups.google.co.nz/[email protected]>

At this stage my question is elementary: How do I make all hell break
loose in the code below?

#include <stdio.h>

size_t glfn1() {
    int x=1;
    int inc() {
        return printf("%d\n",x=x+1);
    }
    return (size_t) &inc;
}

int main() {
    size_t fn_address=glfn1();
    /* All hell will break loose */
    fn_address();
    return 0;
}

I understand size_t is the ideal type for storing a pointer because it
will also work upon 64-bit platforms that have 32-bit ints.

How do I induce the compiler to accept the address of the function inc as
a function type? The called object of type size_t is not a function type.

Regards,
Adam
 
Charlie Gordon

Adam Warner said:
Hello all,

I'm very new to C but I have a number of years of Common Lisp programming
experience. I'm trying to figure out ways of translating higher order
concepts such as closures into C. The code will not be idiomatic C.

GCC has an extension to ISO C that permits nested functions:
<http://gcc.gnu.org/onlinedocs/gcc/Nested-Functions.html>

Of course this question is off topic here; ask on a gcc forum.

For implementing closures they have a serious limitation:

If you try to call the nested function through its address after the
containing function has exited, all hell will break loose. If you try
to call it after a containing scope level has exited, and if it refers
to some of the variables that are no longer in scope, you may be lucky,
but it's not wise to take the risk. If, however, the nested function
does not refer to anything that has gone out of scope, you should be
safe.

I'm hopeful that if I heap allocate all closed over variables that I will
simulate closures. I'm aware of the distaste some have for the extension:
<http://groups.google.co.nz/[email protected]>

No, that won't work.
You can simulate closures with allocation by storing all variables referred to
by your function in a structure, and referring to structure members instead of
global identifiers, but that is not what you really want, and C++ will provide
the syntactic sugar to make it more appealing to you (but more distasteful to
most of us).

At this stage my question is elementary: How do I make all hell break
loose in the code below?

#include <stdio.h>

size_t glfn1() {
    int x=1;
    int inc() {
        return printf("%d\n",x=x+1);
    }
    return (size_t) &inc;
}

Easy: the function inc() increments a location on the stack relative to a frame
pointer passed as an implicit argument. The gcc implementation generates code
dynamically on the stack upon entering glfn1(), consisting of a register
assignment of the frame pointer and a jump to the code for inc(). This is
called a trampoline. &inc does indeed point to automatic storage in the glfn1
activation. Calling through this pointer after glfn1 returns may well crash!
If not, who knows what location will be changed by inc()?

int main() {
    size_t fn_address=glfn1();
    /* All hell will break loose */
    fn_address();
    return 0;
}

I understand size_t is the ideal type for storing a pointer because it
will also work upon 64-bit platforms that have 32-bit ints.

This is not a good assumption. intptr_t is the type you are referring to, but
that will not solve your problem.

How do I induce the compiler to accept the address of the function inc as
a function type? The called object of type size_t is not a function type.

with a cast:

(*(size_t(*)())fn_address)();

but again, this will not suffice for your closure to stand.
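
What would stand is the structure approach mentioned earlier in this reply. The following is only a minimal sketch in standard C, not the GCC extension: the names counter_env, counter_new, counter_inc and counter_free are invented for illustration. The closed-over x lives on the heap, so it outlives the function that created it, and the "closure" receives its environment as an explicit argument.

```c
#include <stdlib.h>

/* Hypothetical environment: holds the variable the "closure" captures. */
struct counter_env {
    int x;
};

/* Allocate the environment; returns NULL if malloc fails. */
struct counter_env *counter_new(int start)
{
    struct counter_env *env = malloc(sizeof *env);

    if (env != NULL)
        env->x = start;
    return env;
}

/* The closure body: it reaches x through the environment pointer. */
int counter_inc(struct counter_env *env)
{
    env->x = env->x + 1;
    return env->x;
}

/* The caller decides when the environment dies. */
void counter_free(struct counter_env *env)
{
    free(env);
}
```

A caller would pair counter_new(1) with repeated counter_inc(env) calls and a final counter_free(env). The cost compared with a real closure is that the environment has to be passed and released by hand.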
 
Gordon Burditt

I understand size_t is the ideal type for storing a pointer because it
will also work upon 64-bit platforms that have 32-bit ints.

Why will it work for storing a FUNCTION pointer? If a function pointer
takes more bits than a size_t, you have trouble.

Data Pointer   Code Pointer   MS-DOS Model Name   Works?
16 bits        16 bits        small               yes
32 bits        16 bits        compact             yes
16 bits        32 bits        medium              NO!
32 bits        32 bits        large               yes
64 bits        32 bits        humongous           yes
32 bits        64 bits        gigantic            NO!
64 bits        64 bits        intergalactic       yes

How do I induce the compiler to accept the address of the function inc as
a function type? The called object of type size_t is not a function type.

If ugly casts or assignment to a variable of type
pointer-to-function-returning-crap don't work, consider memmove()
between a size_t and such a function pointer. If you're going to
invoke undefined behavior anyway, this probably won't make it that
much worse.
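
A rough sketch of that byte-copy idea, for what it is worth. It assumes nothing beyond the standard library; answer and call_via_size_t are made-up names. It refuses to try when the two sizes differ, which is exactly the "NO!" rows in the table above, and even when the sizes match nothing in the standard promises that a size_t is a sensible home for those bytes.

```c
#include <stddef.h>
#include <string.h>

static int answer(void)
{
    return 42;
}

/* Returns the callee's result, or -1 if the sizes differ and the
   round trip cannot even be attempted. */
int call_via_size_t(void)
{
    int (*fp)(void) = answer;
    int (*back)(void) = NULL;
    size_t bits = 0;

    if (sizeof fp != sizeof bits)
        return -1;

    memcpy(&bits, &fp, sizeof fp);     /* pointer bytes -> integer */
    memcpy(&back, &bits, sizeof back); /* integer bytes -> pointer */
    return back();
}
```

Declaring the variable as a function pointer in the first place avoids all of this.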

Gordon L. Burditt
 
CBFalconer

Adam said:
I'm very new to C but I have a number of years of Common Lisp
programming experience. I'm trying to figure out ways of
translating higher order concepts such as closures into C. The
code will not be idiomatic C.

GCC has an extension to ISO C that permits nested functions:
<http://gcc.gnu.org/onlinedocs/gcc/Nested-Functions.html>

You cannot nest functions in ISO standard C (the only form
discussed here). Any non-standard C is, by definition,
non-portable and off-topic on c.l.c. You can get some of the
effects by suitable break-up into files, and still have standard
C. Such a discussion would be on-topic.

If you really want nested functions use a language whose ISO
standard supports them. Pascal, Extended Pascal, and Ada all come
to mind.
 
Charlie Gordon

Gordon Burditt said:
Why will it work for storing a FUNCTION pointer? If a function pointer
takes more bits than a size_t, you have trouble.

Data Pointer   Code Pointer   MS-DOS Model Name   Works?
16 bits        16 bits        small               yes
32 bits        16 bits        compact             yes
16 bits        32 bits        medium              NO!
32 bits        32 bits        large               yes
64 bits        32 bits        humongous           yes
32 bits        64 bits        gigantic            NO!
64 bits        64 bits        intergalactic       yes

Let's all have a thought for Gary Kildall who died ten years ago.
He was the creator of CP/M and the GEM Desktop GUI.
He was famed for having taken the most expensive vacation of all time in 1980,
but that is a legend.
His contribution to the PC industry is quite amazing, but he wasn't recognized
until after his death of uncertain causes.
Read all about him at http://en.wikipedia.org/wiki/Gary_Kildall
Incidentally, a lesser-known fact connects him to this OT post:
his company, Digital Research, Inc., was originally named "Intergalactic Digital
Research".

If ugly casts or assignment to a variable of type
pointer-to-function-returning-crap don't work, consider memmove()
between a size_t and such a function pointer. If you're going to
invoke undefined behavior anyway, this probably won't make it that
much worse.

In this case memcpy() would do as well, as the objects do not overlap ;-)
 
Stephen Sprunk

Adam Warner said:
I understand size_t is the ideal type for storing a pointer because it
will also work upon 64-bit platforms that have 32-bit ints.

There is no guarantee that size_t is large enough to hold an object pointer,
much less a function pointer.

How do I induce the compiler to accept the address of the function
inc as a function type? The called object of type size_t is not a
function type.

Declare fn_address as a function pointer, of course:

size_t (*fn_address)() = glfn1();

Of course, I can't figure out the right syntax to declare glfn1() as
returning that type, but surely someone else here will know...

S
 
Keith Thompson

Adam Warner said:
I understand size_t is the ideal type for storing a pointer because it
will also work upon 64-bit platforms that have 32-bit ints.

No, a pointer type is the ideal type for storing a pointer. In
particular, a function pointer type is the ideal type for storing a
pointer to a function.

How do I induce the compiler to accept the address of the function inc as
a function type? The called object of type size_t is not a function type.

Here's a version of your program using the correct type for the
function pointer. (I've also un-nested the inc function, since
standard C doesn't support nested functions.)

#include <stdio.h>

typedef int (*func_ptr)(void);

int x = 1;

int inc(void)
{
    return printf("%d\n", x++);
}

func_ptr glfn1(void)
{
    return inc; /* or "return &inc;" */
}

int main(void)
{
    func_ptr fn_address = glfn1();
    fn_address();
    return 0;
}
 
infobahn

Adam Warner wrote:

I understand size_t is the ideal type for storing a pointer because it
will also work upon 64-bit platforms that have 32-bit ints.

The ideal type for storing a pointer to a type T is T *.

If you don't know the type of an object to which you're pointing,
you can use void *. If you don't know the type of a function to
which you're pointing, but do at least know the return type T, use
T (*)().
 
Adam Warner

Hi CBFalconer,
You cannot nest functions in ISO standard C (the only form discussed
here). Any non-standard C is, by definition, non-portable and off-topic
on c.l.c.

OK. We discuss implementation-specific extensions of ANSI Common Lisp in
comp.lang.lisp so I was unaware this kind of discussion was verboten here.

You can get some of the effects by suitable break-up into files, and
still have standard C. Such a discussion would be on-topic.

I would love to see how this can be achieved!

Regards,
Adam
 
Adam Warner

Hi Charlie Gordon,
This is not a good assumption. intptr_t is the type you are referring to, but
that will not solve your problem.


with a cast:

(*(size_t(*)())fn_address)();

but again, this will not suffice for your closure to stand.

And who says Lisp has too many parentheses ;-)

Many thanks for the pointers. Now having found 7.18.1.4 of C99 I see it
declares the intptr_t and uintptr_t types as optional, so even their use
appears to potentially make my code non-portable. I suspect these types
will be widely defined by implementations.
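
A small sketch of how the optional type could be guarded, assuming a C99 <stdint.h>; the function name roundtrip_ok is invented here. C99 says UINTPTR_MAX is defined exactly when uintptr_t is, so it can serve as the feature test. Note that the guarantee covers pointers to objects (via void *), not pointers to functions.

```c
#include <stdint.h>

/* Returns 1 if an object pointer survives a round trip through
   uintptr_t, 0 if it does not, -1 if the type is not provided. */
int roundtrip_ok(void)
{
#ifdef UINTPTR_MAX
    int value = 7;
    void *p = &value;
    uintptr_t n = (uintptr_t)p;   /* pointer -> integer */
    void *q = (void *)n;          /* integer -> pointer */

    return q == p;
#else
    return -1;
#endif
}
```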

I've also discovered I have to make significant use of local variables
simply to define the order of evaluation. I've checked that, at least in
simple cases, an optimising compiler like GCC can remove these temporaries
from generated assembly when they turn out to be unnecessary.

Regards,
Adam
 
CBFalconer

Adam said:
OK. We discuss implementation-specific extensions of ANSI Common
Lisp in comp.lang.lisp so I was unaware this kind of discussion
was verboten here.


I would love to see how this can be achieved!

Please don't strip attribution lines for quoted material.

Basically you devote a complete file to what would be a global
procedure/function in Pascal. Within that you declare all the
local functions as static, thus hiding them from the rest of the
system when linked. You can't nest this sort of thing further.
The only thing you need to publish in the associated .h file is the
"global" function.
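
A minimal sketch of that layout, with everything in one hypothetical file (call it pyramid.c); the function names are invented for illustration. The helpers are static, so only pyramid_blocks would be declared in the matching header.

```c
/* "Local" helpers: static gives them internal linkage, so other
   translation units cannot call them by name. */
static int square(int n)
{
    return n * n;
}

static int sum_of_squares_upto(int n)
{
    int total = 0;
    int i;

    for (i = 1; i <= n; i++)
        total += square(i);
    return total;
}

/* The one "global" function; only its prototype would be published
   in the associated header. */
int pyramid_blocks(int levels)
{
    return sum_of_squares_upto(levels);
}
```

Unlike a genuinely nested function, the helpers cannot see the locals of pyramid_blocks; anything they need has to be passed as an argument or kept in file-scope static variables.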
 
Michael Wojcik

There is no guarantee that size_t is large enough to hold an object pointer,
much less a function pointer.

Gordon Burditt posted one counterexample; another is the set of
conforming implementations available for the AS/400, where all C
pointers are 128 bits, and the largest integer types are either 32 or
64 bits, depending on implementation. Fortunately, the AS/400
implementations have extremely polite Undefined Behavior in most
cases, including this one; overflowing a size_t, or copying a size_t's
contents into a pointer and then attempting to dereference it, will
suspend the job and send a message to the user's message queue, and
the user will be offered the options of cancelling or debugging it.

Thus size_t is only "ideal" for this purpose if your ideal is rather
flexible.

(And no, I don't expect the AS/400 implementations will get C99's
intptr_t and uintptr_t anytime soon, or indeed at all. They wouldn't
be useful there anyway.)

Declare fn_address as a function pointer, of course:

size_t (*fn_address)() = glfn1();

Of course, I can't figure out the right syntax to declare glfn1() as
returning that type, but surely someone else here will know...

Most people seem to prefer using typedefs for function pointers (I
don't, in all cases, but I seem to be in the minority), and that's
the obvious solution here:

typedef size_t (*fptr)();
fptr glfn1(void);

This can be done without a typedef. (There are cases where a typedef
is necessary, eg when extracting function pointer arguments in a
variadic function using va_arg.) It's a trifle hairy:

size_t (*glfn1(void))();

If the function pointer specified its parameters - which I'd prefer
(why subvert the type system?) - it might look like eg

size_t (*glfn1(void))(int, int);

if the function pointed to by the pointer returned by glfn1 took two
integer parameters. Basically, this means "glfn1 is a function of no
parameters which returns a pointer to a function taking two ints and
returning size_t". For this sort of thing I *would* use a typedef;
it's easy for a maintainer (who might be me) to be confused about
which parameter list is which.

K&R2 5.12 ("Complicated Declarations") has an example of a function
returning a pointer to an array of pointers to functions returning
char.
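
To see the two spellings side by side, here is a small sketch with definitions rather than just declarations. The names add, binop_ptr, get_add_typedef and get_add_raw are made up for the example; both getters have the same type and return the same pointer.

```c
#include <stddef.h>

static size_t add(int a, int b)
{
    return (size_t)(a + b);
}

/* With a typedef for the function pointer type. */
typedef size_t (*binop_ptr)(int, int);

binop_ptr get_add_typedef(void)
{
    return add;
}

/* Without a typedef: "get_add_raw is a function of no parameters
   returning a pointer to a function taking two ints and returning
   size_t". The inner (void) is get_add_raw's own parameter list. */
size_t (*get_add_raw(void))(int, int)
{
    return add;
}
```

A call such as get_add_raw()(2, 3) first fetches the pointer and then calls through it.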
 
Andrey Tarasevich

Adam said:
...
I understand size_t is the ideal type for storing a pointer because it
will also work upon 64-bit platforms that have 32-bit ints.
...

That's not true. 'size_t' is a type that is required to be able to store
the total size of any object. The concept of 'object size' is
significantly different from the concept of an 'address'. For example,
there's nothing that prevents the implementation from limiting object
sizes to 32-bit values even on 64-bit platform. C implementations on
hardware platforms with segmented memory will typically have object size
limited by segment size, meaning that most likely 'size_t' on such a
platform will be too small to store a pointer.

The type that fits your needs closer would be 'ptrdiff_t', not 'size_t',
although personally I wouldn't feel right using a signed integral type
for pointer storage. A good strategy would be to choose an unsigned
integral type which is at least as large as 'ptrdiff_t'.

The whole thing, of course, lies deep in the "implementation defined"
area.
 
Chris Torek

I'm very new to C but I have a number of years of Common Lisp programming
experience. I'm trying to figure out ways of translating higher order
concepts such as closures into C. The code will not be idiomatic C.

Various people have answered the original questions; but no one has
addressed the basic idea, "how to handle closures in C" (in strictly
conforming, 100% portable C, that is).

What you need is to collect up the closed-over variables into a
data structure, and pass an explicit pointer to that data structure.
This is, of course, what Common Lisp and other closure-containing
(and continuation-passing and so on) language implementations do
internally (perhaps with optimization, and perhaps using techniques
like cactus stacks instead of heap allocation).

Here is how I usually do it. Suppose we have a family of functions
f1 through fN that all use some group(s) of variables. Define a
structure:

/*
 * Variables for functions f1 through fN.
 */
struct f_context {
    int xyzzy;
    char plugh[100];
    /* etc */
};

Then define the functions as taking either "struct f_context *" or,
if they are to be called through "generic function pointer" interfaces,
"void *" as their first (and perhaps only) parameters:

void f1(void *context) {
    struct f_context *fc = context;

    fc->xyzzy++;
    fc->plugh[3] = 'a';
}

void f2(void *context) {
    struct f_context *fc = context;

    fc->xyzzy *= 3;
    fc->plugh[32] = '\0';
}

/* etc */

Now each function can be called through a function-pointer of the
correct type:

void (*fp)(void *);
void *context;

...
context = <some expression that obtains the correct f_context>;
fp = f2; /* or whichever fX function */
...
fp(context);

State machines are typical users of such functions, with an "init"
function creating the state and, e.g., the "next function to call"
being the return value of each. A destructor function would then
return NULL:

/*
 * I resort to a typedef here partly to avoid declaration ugliness.
 * At the same time, though, this particular type is shared by
 * whoever calls the state machine, so this typedef should probably
 * appear in a shared header.
 */
typedef void (*state_func_ptr)(void *);

/* the rest of this is NOT shared, and is private to one file */
struct state_context { ... };

/* forward declarations: */
static state_func_ptr st1(void *);
static state_func_ptr st2(void *);
...
static state_func_ptr st_end(void *);

/*
 * This code aborts if malloc fails -- other strategies are of
 * course possible. Note that the context to be passed to the
 * state functions is filled in via the supplied pointer.
 */
state_func_ptr state_init(void **vp) {
    struct state_context *sc = malloc(sizeof *sc);

    if (sc == NULL)
        panic("out of memory in state_init");
    *vp = sc;
    /* st1 always gets called first, as it is the initial state */
    return st1;
}

static state_func_ptr st1(void *context) {
    struct state_context *sc = context;
    ...
    return expr ? st2 : st3;
}

static state_func_ptr st2(void *context) {
    struct state_context *sc = context;
    ...
    return st_end;
}

...

static state_func_ptr st_end(void *context) {
    free(context); /* release the space */
    return NULL; /* and terminate the state machine */
}

Note that only the name "state_init" is exported from the entire
module -- the entire state machine is invisible to the caller,
who only knows to iterate until finished:

typedef void (*state_func_ptr)(void *); /* from shared header */

void call_it(void) {
    state_func_ptr fp;
    void *ctx;

    /* initialize state machine */
    fp = state_init(&ctx);

    while (fp != NULL) {
        /* any appropriate actions here */

        /* call state machine and get new state */
        fp = (*fp)(ctx);

        /* more actions here as needed */
    }
}
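
One wrinkle if the fragments are typed in literally: state_func_ptr is declared as pointing to a function returning void, while the state functions return a state_func_ptr, so a compiler will likely object to "fp = (*fp)(ctx)". C has no direct way to name a function type that returns a pointer to its own type. One possible workaround, sketched here under my own assumptions rather than taken from the code above, is to wrap the pointer in a struct. The names struct state, struct counter_context and run_machine are invented, and this version reports malloc failure by returning a null state instead of calling panic().

```c
#include <stdlib.h>

/* A struct may hold a pointer to a function returning that same
   struct, which sidesteps the self-referential typedef. */
struct state {
    struct state (*fn)(void *);
};

struct counter_context {
    int steps;   /* how many working states have run */
};

static struct state st1(void *);
static struct state st2(void *);
static struct state st_end(void *);

/* Returns the initial state; fn is NULL if malloc failed. */
struct state state_init(void **vp)
{
    struct state s = { NULL };
    struct counter_context *cc = malloc(sizeof *cc);

    if (cc != NULL) {
        cc->steps = 0;
        s.fn = st1;
    }
    *vp = cc;
    return s;
}

static struct state st1(void *context)
{
    struct counter_context *cc = context;
    struct state next = { st2 };

    cc->steps++;
    return next;
}

static struct state st2(void *context)
{
    struct counter_context *cc = context;
    struct state next = { st_end };

    cc->steps++;
    return next;
}

static struct state st_end(void *context)
{
    struct state done = { NULL };

    free(context);   /* release the space */
    return done;     /* and terminate the state machine */
}

/* Drive the machine to completion; returns the number of calls made. */
int run_machine(void)
{
    void *ctx;
    int calls = 0;
    struct state s = state_init(&ctx);

    while (s.fn != NULL) {
        s = s.fn(ctx);
        calls++;
    }
    return calls;
}
```

The other common workaround is to return a generic function pointer type and cast it back at the call site, which keeps the loop shorter at the price of a cast.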

Other methods are of course possible, and one can expose as much
of the "context" data structure as one likes -- although in C, it
is an all-or-nothing deal: either the entire contents are visible, or it
is an opaque type known either by name ("struct foo", as an incomplete
type) or dealt with via generic-data-pointer ("void *", with
conversions to and from the internal "struct whatever", as in these
examples). If you want to expose only part of the struct, expose
the whole thing, and tell people not to use the parts they are
supposed to keep their fingers off of. :) (Or, use a language
with such features, perhaps even C++. C++ has a lot of syntactic
sugar to make this prettier, using "class" instead of "struct".
Note, however, that C++ is a much larger language, and even the
things that share syntax with C sometimes have different semantics.)

As an aside, note that it is "struct" that defines user-defined
abstract data types in C. I sometimes claim the keyword stands
for "STRange spelling for User-defined abstraCt Type", because of
this type-creating property. The poorly-named "typedef" keyword
does *not* define new types; rather, it defines aliases for existing
types. The sequence "typedef struct foo { ... } alias", which one
commonly finds in C, uses "struct foo {" to define the type --
which is thus named "struct foo" -- and then gives it an alias that
omits the "struct" keyword. If the tag (foo, in this case) is
omitted, only the alias remains; the "true name" of the type can
no longer be uttered. (I prefer to write out the struct keyword:
I find it very helpful in keeping track of which names are type
names. "struct foo" immediately tells me that foo is the type-name.
C's declaration syntax is peculiar enough that those who use typedefs
heavily wind up having to invent a convention, such as the common
"_t" suffix or FunkyCapitalLetters or ALL_CAPS, to mark which ones
are typedefs. Just use the "struct" keyword and avoid this pain,
I say; resort to typedefs only when really necessary, as for
state_func_ptr -- which could use a better name anyway.)
 
Adam Warner

Hi Chris Torek,

Many thanks for the comprehensive reply. I knew it would be extraordinary
because I've come across many of your replies in the archives.

I'm still getting stuck upon syntax so it will take a while to digest your
closure implementation. Thanks for the tip that it is struct that defines
user-defined abstract data types in C in contrast to typedef which defines
aliases for preexisting abstract data types.

Regards,
Adam
 
Michael Mair

Chris Torek wrote:
[snip: Comprehensive article about closure and example code
for a "state machine" using function pointers]
Other methods are of course possible, and one can expose as much
of the "context" data structure as one likes -- although in C, it
is an all-or-nothing deal: either the entire contents are visible, or it
is an opaque type known either by name ("struct foo", as an incomplete
type) or dealt with via generic-data-pointer ("void *", with
conversions to and from the internal "struct whatever", as in these
examples). If you want to expose only part of the struct, expose
the whole thing, and tell people not to use the parts they are
supposed to keep their fingers off of. :) (Or, use a language
with such features, perhaps even C++. C++ has a lot of syntactic
sugar to make this prettier, using "class" instead of "struct".
Note, however, that C++ is a much larger language, and even the
things that share syntax with C sometimes have different semantics.)

Question: Is the following a conforming way to hide parts of a struct?

Create a struct type containing the public information only and
one which contains a struct of this type as first member (or
contains the public information in the same order) and then the
private information and operate only through pointers.
Only the "public" structure is made available to the user.

struct pub_struct { float public1; char public2; };
struct full_struct { struct pub_struct public; int private; };

Or would we need to wrap it into a union?
Or am I completely off the track?

Obvious problems are that a user might have the "intelligent"
idea to pass the address of a struct pub_struct variable he
"created" himself. OTOH, "taking away" the structure from the
user by giving him only a type pointer to struct pub_struct
only scares away the not-too-determined user...


(Only giving out void * and returning and modifying data via
access macros/functions certainly is feasible but effectively
hides the structure.)


Cheers
Michael
 
Adam Warner

Hi Michael Mair,
Question: Is the following a conforming way to hide parts of a struct?

You may be interested in this reference I've just come across:

"Object-Oriented Programming in ANSI C"
by Axel-Tobias Schreiner (October 1993)

The first chapter is devoted to information hiding in abstract data types.
The English translation is freely available as a PDF from the author:
<http://www.cs.rit.edu/~ats/>
<http://www.cs.rit.edu/~ats/books/index.html>
<http://www.cs.rit.edu/~ats/books/ooc.pdf>

Regards,
Adam
 
Michael Wojcik

Question: Is the following a conforming way to hide parts of a struct?

Create a struct type containing the public information only and
one which contains a struct of this type as first member (or
contains the public information in the same order) and then the
private information and operate only through pointers.
Only the "public" structure is made available to the user.

struct pub_struct { float public1; char public2; };
struct full_struct { struct pub_struct public; int private; };

full_struct isn't allowed to put any padding before its first member,
so a pointer to a full_struct can be cast to a pointer to a pub_struct
and safely dereferenced.
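
A rough sketch of that first-member conversion, under a few assumptions of mine: the member names pub and secret replace "public" and "private" (which would clash with C++ keywords if the header were ever shared), and thing_new, thing_secret and thing_free are invented helper names. In a real layout, struct full_struct and the function bodies would sit in the implementation file, and users would see only struct pub_struct and the prototypes.

```c
#include <stdlib.h>

struct pub_struct {
    float public1;
    char public2;
};

/* Private to the implementation in a real layout. */
struct full_struct {
    struct pub_struct pub;   /* must be the first member */
    int secret;
};

/* Hand out a pointer to the public part only; NULL if malloc fails. */
struct pub_struct *thing_new(void)
{
    struct full_struct *fs = malloc(sizeof *fs);

    if (fs == NULL)
        return NULL;
    fs->pub.public1 = 1.5f;
    fs->pub.public2 = 'a';
    fs->secret = 99;
    return &fs->pub;
}

/* Only valid for pointers that came from thing_new(). */
int thing_secret(struct pub_struct *p)
{
    struct full_struct *fs = (struct full_struct *)p;

    return fs->secret;
}

void thing_free(struct pub_struct *p)
{
    free((struct full_struct *)p);
}
```

The cast back is safe only because every pub_struct handed out really is the first member of a full_struct; a pub_struct the user created on their own would break it, which is the problem raised in the question.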

I might prefer, though, to rely on the "common initial members" rule,
and define them thus:

struct pub_data { /* public fields */ };
struct pub_struct { struct pub_data public; };
struct full_struct { struct pub_data public; int private; };

This has an advantage for maintenance: if at a later date you decide
you need more public members, you can create a new struct pub_data2,
put that after "public" in full_struct, and create a new struct
pub_struct2 to "publish" it - all without affecting existing code
that only needs to know about pub_data. (With your method, you'd be
changing the size of pub_struct if you added new public fields, for
example.)

(Technically, I believe the standard says that accessing the common
initial members of one structure type through another structure type
is only guaranteed if they're in a union, but since it's tough for
the implementation to know whether they might be in one or not in
some TU, in practice it's probably fine to just cast pointers rather
than create a union purely for aliasing purposes.)

However, it may be best simply to keep the entire thing opaque and
provide access functions. There's a performance tradeoff, but in
most cases it won't be important.
 
Mark McIntyre

Hello all,

I'm very new to C but I have a number of years of Common Lisp programming
experience. I'm trying to figure out ways of translating higher order
concepts such as closures into C.

I don't think you can. The language doesn't really support these concepts.

The code will not be idiomatic C.

For sure :)

GCC has an extension to ISO C that permits nested functions:

Which are an Abhomination. Abhor them....
 
Mark McIntyre

At this stage my question is elementary: How do I make all hell break
loose in the code below?

#include <stdio.h>

size_t glfn1() {
    int x=1;
    int inc() {
        return printf("%d\n",x=x+1);
    }
    return (size_t) &inc;

Firstly, the & is not necessary: a function name used in an expression like
this is converted to a pointer to the function.

More importantly, though, since inc() is local to glfn1(), it disappears when
you return from glfn1. Hence its address becomes invalid, and calling through
it will probably cause a segfault.
 
