smuggling data in and out of alarm handlers and the like

D

David Mathog

Every so often I find a need to move data in or out of functions which
are not called directly. For instance, alarm handlers or the compare
function used by qsort. In these cases there is no possibility of
passing the data as a function parameter, so it needs to be "smuggled"
in.

Is there a better way of doing this than the three methods described below?

1. use global variables. This is fine for small monolithic programs
but isn't great in larger programs using lots of code in separate
files and/or shared libraries.

2. create a function like this:

SOME_STRUCT *smuggle_pointer(SOME_STRUCT *pointer){
static SOME_STRUCT *hold_pointer;
if(pointer)hold_pointer = pointer;
return(hold_pointer);
}

and load it with data before any possible calls to the
problem functions. The target function contains:

SOME_STRUCT *pointer;
pointer = smuggle_pointer(NULL);

which gives that function access to this data.

3. use setenv/getenv. This is not part of ANSI C, but is
usually available. Only useful for one or two simple
values.


Is there a better way?

Thanks,

David Mathog
 
B

Ben Pfaff

David Mathog said:
Every so often I find a need to move data in or out of functions which
are not called directly. For instance, alarm handlers or the compare
function used by qsort. In these cases there is no possibility of
passing the data as a function parameter, so it needs to be "smuggled"
in.

For qsort and bsearch, I use replacements that take an extra void
* parameter and pass it to the comparison function. The void *
can be used to pass arbitrary data. (There's really no excuse
for these functions failing to provide any way to pass auxiliary
data. They are badly designed in my opinion.)
 
E

Eric Sosman

David said:
Every so often I find a need to move data in or out of functions which
are not called directly. For instance, alarm handlers or the compare
function used by qsort. In these cases there is no possibility of
passing the data as a function parameter, so it needs to be "smuggled"
in.

Is there a better way of doing this than the three methods described below?

1. use global variables. This is fine for small monolithic programs
but isn't great in larger programs using lots of code in separate
files and/or shared libraries.
> [...]

Alternatives 2 and 3 strike me as "global variables in
disguise," just a syntactic variation on the basic global
variable method.

If the extra argument is chosen from a fairly small
fixed set of values that are known at compile time, you
can write a "real" comparator that takes the two item
pointers plus your third argument, and a bunch of two-
argument wrappers that can be passed to qsort:

static int realCompare(const void *p, const void *q,
/* const? */ MagicCookie *r) { ... }

int compare1(const void *p, const void *q) {
/* static? const? */ MagicCookie c = { 1 };
return realCompare (p, q, &c);
}

int compare2(const void *p, const void *q) {
/* static? const? */ MagicCookie c = { 2 };
return realCompare (p, q, &c);
}

In effect, this uses the identity of the comparator as
a surrogate for the extra parameter.

If that's not feasible -- maybe the set of potential
argument values in inconveniently large, or isn't known in
advance -- then a file-scope variable is probably best:

static MagicCookie *cookie;

void setCookieForSort(MagicCookie *cp) {
cookie = cp;
}

int compare(const void *p, const void *q) {
/* use `cookie' as part of the decision */
}

/* elsewhere: */
setCookieForSort (&theCookie);
qsort (..., compare);

A filthy dirty variant would allow the "global" to
be cached inside the comparator:

int compare(const void *p, const void *q) {
static /* const? */ MagicCookie *cookie;
if (p == NULL) {
cookie = /* (MagicCookie*) */ q;
return 0;
}
/* usual comparison, using `cookie' */
}

/* elsewhere: */
compare (NULL, &theCookie);
qsort (..., compare);

This only works if the smuggling destination (1) can accept
a pointer to the smuggled item and (2) can distinguish the
smuggler's visitation from ordinary calls. These are not
problems for the qsort() or bsearch() comparators, but might
be troublesome in some other "callback" situations.

Personally, I'd use wrappers where feasible or a file-
scope variable with a setter where not. The latter has many
of the usual disadvantages of statics, but at least it's
honest about its shortcomings.
 
P

pete

David said:
Every so often I find a need to move data in or out of functions which
are not called directly. For instance, alarm handlers or the compare
function used by qsort. In these cases there is no possibility of
passing the data as a function parameter, so it needs to be "smuggled"
in.

When there's only a small number of values
that your variable can have:

For example,
if you had an array of pointers
to strings of a few columns,
and you wanted to qsort the strings
based on an arbitrary column,

then you could define an array of pointers to
different compar functions,
which you would also have to write,
and then a variable indexing the array of function pointers
would be part of the qsort call.
 
R

Robert Latest

David said:
Every so often I find a need to move data in or out of functions which
are not called directly. For instance, alarm handlers or the compare
function used by qsort.

Such functions are typically called "callback functions". Just to clarify.
In these cases there is no possibility of
passing the data as a function parameter, so it needs to be "smuggled"
in.

Yeah, that's a botch. For the qsort() callback you can at least pass in an
array of pointers to structs that have an additional member which points to
your "smuggled" data, which is clumsy because it eats memory and
performance.

When I started programmming GUIs using GTK+ (which revolves around callback
functions), it at first struck me as odd that each and every callback had
some curious "void *user_data" parameter which seemed useless. Boy, is it
not, as I realized after about 10 minutes. It's odd that the designers of
the C (and other) libraries didn't think of this mecessity.

robert
 
S

SM Ryan

# When I started programmming GUIs using GTK+ (which revolves around callback
# functions), it at first struck me as odd that each and every callback had
# some curious "void *user_data" parameter which seemed useless. Boy, is it
# not, as I realized after about 10 minutes. It's odd that the designers of
# the C (and other) libraries didn't think of this mecessity.

This is a primitive way of doing closures. The full notion of a closure
came after the beginning of C and Unix. Proper implementation of closures
requires garbage collection which many in C are still reluctant to use.
 
P

Paul Hsieh

Every so often I find a need to move data in or out of functions which
are not called directly. For instance, alarm handlers or the compare
function used by qsort. In these cases there is no possibility of
passing the data as a function parameter, so it needs to be "smuggled"
in.

Is there a better way of doing this than the three methods described below?

Yeah, the methods you describe are all basically "global" in one
fashion or another, which prevents re-entrancy. The right answer to
use interfaces that always carry a "user pointer" around with it --
something opaque to the calling mechanism, but known to the originator
and target function.

By this criteria, qsort() is basically broken since the cmp() function
has no simple way to gain access to the contextual state.
Fortunately, its fairly easy and straightforward to write your own
sort. (Probably *more* fortunately, being forced to do so will make
you realize that the sort function should be a macro, not a function
call. This allows your "cmp" to itself be a macro, and thus eliminate
the major cost of qsort(), and you can pass contextual parameters to
macros just as easily as to functions.)

Thus, if its possible, just make sure some extra pointer to your
context is passed, and received in your leaf functions and demand that
your interface/transport pass along those pointers. In the case of a
generic transport, you should make it a void *, otherwise if its
custom you can declare the pointer exactly.

If you are stuck with an interface that didn't see fit to pass along
that extra pointer, then you can try to hide your context in a data
pointer, but of course that doesn't work for something like qsort.
But again, we don't care about qsort, since you should never use it
anyways (its always slow, always algorithmically sub-optimal, and
always inflexible -- qsort is just one of many examples of where its
clear that the C-library was designed by amateurs.)

If you *have* to use global variables, then you are going to want to
shield it from re-entrancy or multiple threads with some sort of
mutex, then just use a global or static depending on your taste (but
its like waxing a beat-up jalopy).
 
C

CBFalconer

Paul said:
.... snip ...

By this criteria, qsort() is basically broken since the cmp()
function has no simple way to gain access to the contextual state.
Fortunately, its fairly easy and straightforward to write your own
sort. (Probably *more* fortunately, being forced to do so will
make you realize that the sort function should be a macro, not a
function call. This allows your "cmp" to itself be a macro, and
thus eliminate the major cost of qsort(), and you can pass
contextual parameters to macros just as easily as to functions.)

Utter bushwah. Qsort and similar operations are normally
precompiled, and will attempt to call a routine passed in via a
parameter. This cannot be a macro. All you will get is a bunch of
linkage errors or a total blowup, depending on the sophistication
of the linker.
 
R

Robert Latest

Paul said:
But again, we don't care about qsort, since you should never use it
anyways (its always slow, always algorithmically sub-optimal, and
always inflexible -- qsort is just one of many examples of where its
clear that the C-library was designed by amateurs.)

I can't find the passage in the C standard that requires the qsort function
to be implemented in some way or another.

robert
 
E

Eric Sosman

Paul said:
Every so often I find a need to move data in or out of functions which
are not called directly. For instance, alarm handlers or the compare
function used by qsort. In these cases there is no possibility of
passing the data as a function parameter, so it needs to be "smuggled"
in.

Is there a better way of doing this than the three methods described below?

Yeah, the methods you describe are all basically "global" in one
fashion or another, which prevents re-entrancy. [...]

If you *have* to use global variables, then you are going to want to
shield it from re-entrancy or multiple threads with some sort of
mutex, then just use a global or static depending on your taste (but
its like waxing a beat-up jalopy).

Note that qsort() itself is not guaranteed to be reentrant
(7.1.4p4), so the use or non-use of static variables doesn't
really change the picture. For a multi-threaded program, this
means you either need to rely on standards that go beyond C's,
or else use some kind of mutual exclusion as Paul suggests. Even
for single-threaded programs (the only kind C really understands),
this means that qsort()'s comparison function must not use a
subsidiary qsort(). Similarly, bsearch()'s comparator mustn't
use bsearch().
 
S

santosh

Robert said:
I can't find the passage in the C standard that requires the qsort
function to be implemented in some way or another.

robert

Nevertheless it's interface cannot be changed, which is what, I think,
Paul Hsieh was talking about.
 
J

James Kuyper

santosh said:
Nevertheless it's interface cannot be changed, which is what, I think,
Paul Hsieh was talking about.

You can't derive the conclusion that qsort() is "... always slow, always
algorithmically sub-optimal, ..." from the interface specification, or
indeed from anything else that the standard says about qsort().
 
K

Keith Thompson

James Kuyper said:
You can't derive the conclusion that qsort() is "... always slow,
always algorithmically sub-optimal, ..." from the interface
specification, or indeed from anything else that the standard says
about qsort().

No you can't (except that using a callback function for each
comparison is bound to impose some overhead). But I don't think Paul
did so. I suspect that he based his conclusion on observations of
actual qsort() implementations.

Whether Paul's conclusion is correct is a different matter.
 
K

Keith Thompson

CBFalconer said:
Utter bushwah. Qsort and similar operations are normally
precompiled, and will attempt to call a routine passed in via a
parameter. This cannot be a macro. All you will get is a bunch of
linkage errors or a total blowup, depending on the sophistication
of the linker.

Clearly the cmp operation used by qsort() can't be a macro, since it
has to be passed to qsort() as an argument of a pointer-to-function
type. I don't think Paul was claiming otherwise.

But if you write *your own* sort routine, as Paul suggests, then the
comparison routine can easily be a macro, or even simple inline code.
Such a sort routine would lack qsort()'s flexibility, in that it would
only be able to sort arrays of a single type.

qsort() pays for its flexibility (being able to sort arrays of any
type) with a loss of efficiency (due to the indirect calls to the
comparison routine). Sometimes this cost is worth paying, sometimes
it isn't. (A language with a template facility could give you the
best of both. Even in C, you *might* be able to define the entire
sorting routine as a macro, but it's not something I'd care to try.)

As for letting the comparison function have access to contextual
state, yes, that's a limitation of qsort(), and one that could be
fixed without losing its flexibility. But adding an extra required
parameter to the comparison routine would likely cause another small
loss of performance.

There are open-source C implementations of qsort(). If you need
something that the standard qsort() doesn't provide, you can always
grab one of them and modify it to your taste. Nothing in the C
standard says that qsort() is the only way to sort things.
 
P

Paul Hsieh

Clearly the cmp operation used by qsort() can't be a macro, since it
has to be passed to qsort() as an argument of a pointer-to-function
type. I don't think Paul was claiming otherwise.

But if you write *your own* sort routine, as Paul suggests, then the
comparison routine can easily be a macro, or even simple inline code.
Such a sort routine would lack qsort()'s flexibility, in that it would
only be able to sort arrays of a single type.

Really?

#define hsortT(type, context, array, length, cmpT) ... \
#define cmpT(type, context, el1, el2) ... \

I *DO* this all the time. The point is that this is the *MOST*
flexible solution. Even if you *wanted* to use function callbacks,
you just fill them into the macros:

#define typeCmp(type, context, el1, el2) cmpPOS (context, el1, el2)

void PaycostofcallOverheadSort_type (
void * context,
type * array,
size_t len,
int (*cmpPOS) (void * context, type * el1, type * el2))
{
hsortT (type, context, array, len, typeCmp);
}

Of course, there's little point in doing this, but you can't claim its
less flexible than qsort.
qsort() pays for its flexibility (being able to sort arrays of any
type) with a loss of efficiency (due to the indirect calls to the
comparison routine). Sometimes this cost is worth paying, sometimes
it isn't.

Its worth paying if you don't care about performance. The only case
where qsort offers any compensating use is when in a single program
you are sorting very many, widely varied types, and you could care
less about its performance. (Maybe a proof of concept of something
running on an embedded system or something like that.)
[...] (A language with a template facility could give you the
best of both. Even in C, you *might* be able to define the entire
sorting routine as a macro, but it's not something I'd care to try.)

Why not? It really is not that hard. I mean, the only challenge is
writing a sort, which, for real computer programmers ... well it was
supposed to be an exercise in college. If you were asleep during
those classes, you can pull stuff from the web.
As for letting the comparison function have access to contextual
state, yes, that's a limitation of qsort(), and one that could be
fixed without losing its flexibility. But adding an extra required
parameter to the comparison routine would likely cause another small
loss of performance.

Thus maximizing the mediocrity of their solution. Its neither fast,
nor flexible.
There are open-source C implementations of qsort().

Lol! Ever watch the movie Office Space? Where they are looking up
the definition of "money laundering" in the dictionary? That's what
looking up open source for qsort() reminds me of.
[...] If you need
something that the standard qsort() doesn't provide, you can always
grab one of them and modify it to your taste. Nothing in the C
standard says that qsort() is the only way to sort things.

Sorting without a guarantee on the lower bound of performance and
necessarily paying an unnecessary overhead cost -- it makes you wonder
why computer science is even taught in universities at all.
 
L

lawrence.jones

Keith Thompson said:
qsort() pays for its flexibility (being able to sort arrays of any
type) with a loss of efficiency (due to the indirect calls to the
comparison routine).

There's no reason an implementation can't inline qsort() and once you do
that, you can also inline the comparison function.

-Larry Jones

My upbringing is filled with inconsistent messages. -- Calvin
 
K

Keith Thompson

There's no reason an implementation can't inline qsort() and once you do
that, you can also inline the comparison function.

[I'm piggybacking this followup off my own message, so the Reference
header isn't going to be quite right. Larry's followup appears on the
server where I read news, but not on the one where I post.]

Yes, I believe you're right.

I would think that inlining an indirect call (one made via a function
pointer parameter) would be more difficult than inlining an ordinary
call. Do any implementations actually do this?

Of course, it's still possible to write a qsort() call for which the
comparison function can't be inlined, for example by passing an
expression, rather than just a function name, for the compar
parameter. But a sufficiently clever compiler could certainly handle
that.
 
L

lawrence.jones

Keith Thompson said:
I would think that inlining an indirect call (one made via a function
pointer parameter) would be more difficult than inlining an ordinary
call. Do any implementations actually do this?

In C, function calls are always made via a pointer. And in the case of
qsort(), the argument corresponding to the pointer parameter is nearly
always a function designator (i.e., the name of a function as opposed to
a pointer variable), so it shouldn't be any more difficult to inline
than any other function call. But I don't know of any implementations
that actually inline qsort(), presumably because its performance isn't
critical often enough to be worth while.

-Larry Jones

Mr. Subtlety drives home another point. -- Calvin
 
K

Keith Thompson

In C, function calls are always made via a pointer. And in the case of
qsort(), the argument corresponding to the pointer parameter is nearly
always a function designator (i.e., the name of a function as opposed to
a pointer variable), so it shouldn't be any more difficult to inline
than any other function call. But I don't know of any implementations
that actually inline qsort(), presumably because its performance isn't
critical often enough to be worth while.

[ Again, I'm piggybacking this onto my own article, since Larry's
article isn't on this server. It looks like Larry's articles aren't
showing up on aioe.org. Perhaps homeip.net, like rr.com, is under a
UDP? ]

Yes, but if it's simpler in assembly or machine code to call a
function directly than to call it indirectly via a pointer object or
value, then presumably the fact that ``func(42)'' involves a
conversion of ``func'' to a pointer goes away early in the compilation
process. C function calls are always done via pointers, but the
compiler doesn't have to treat them that way, as long as the same
effect is achieved.

qsort()'s comparison function argument is typically a simple function
designator, but if the qsort() call is inlined, then it's probably
going to look like the designator is converted to a pointer, then
copied to a local object (the parameter object), then used in an
indirect call. Of course any optimizer worth its salt will collapse
that into a simple call, which can then be inlined. So yes, I agree.

Going off on a bit of a tangent, paul Hsieh said elsewhere in this
thread that the overhead imposed by qsort() is acceptable only if you
don't care about performance. I disagree. In my opinion, the
overhead is acceptable if it doesn't significantly affect the
performance of the particular program you're using it in. The saving
in programming effort (avoiding the time needed either to write your
own sorting routine or find one elsewhere), and in risk (getting the
sorting routine wrong), often (but by no means always) can outweigh
the performance penalty.

And if you think that a sorting routine is simple enough that there's
no significant risk of getting it wrong, consider this (Jon Bentley,
"Programming Pearls"):

While the first binary search was published in 1946, the first
binary search that works correctly for all values of n did not
appear until 1962.

The most reliable code is the code that isn't there.
 
P

Paul Hsieh

[...]
And if you think that a sorting routine is simple enough that there's
no significant risk of getting it wrong, consider this (Jon Bentley,
"Programming Pearls"):

While the first binary search was published in 1946, the first
binary search that works correctly for all values of n did not
appear until 1962.

The most reliable code is the code that isn't there.

So you think compiler vendors implement qsort() without any code?
qsort() also mashes things through void * pointers, which means you
*LOSE* type safety -- so this introduces other risk to the code.

All this is besides the point that you just equated sorting with
binary search.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,013
Latest member
KatriceSwa

Latest Threads

Top