Dave Vandervies wrote:
void map(void *dst, size_t dst_size, const void *src,
         size_t src_size, void *ext_var,
         void (*trans_func)(void *dst, const void *src, void *ext_var),
         size_t arr_len)
{
    for (size_t i=0; i<arr_len; i++)
    {
        trans_func((unsigned char *)dst+(i*dst_size),
                   (const unsigned char *)src+(i*src_size), ext_var);
    }
}
It's probably worth getting in the habit of declaring local non-void
pointers for void-pointer arguments that you plan to use instead of
casting them where you use them, since pointer casts often mean there's
something sketchy going on. (Though in this case it's worth noting that
the casts are only doing what the castless conversion would do anyways.)
This lets you keep the "Pointer cast -> suspicious" neurons active
while you're using this technique to implement type-agnostic functions.
If you ever have the misfortune to find yourself working somewhere
where productivity is measured in lines of code, it also gives you a way
to pad your numbers while actually *reducing* the cognitive burden on
both yourself and maintenance programmers (since they get to keep their
"pointer cast -> suspicious" neurons active too).
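For illustration, here is roughly what that habit looks like when applied to the map() above. This is only a sketch: the names map_local and square_int are made up for the example, and the callback is a stand-in for whatever you're really mapping.

```c
#include <stddef.h>

/* Hypothetical variant of map(): same arguments as the original, but the
   void * arguments are converted once, by plain initialization, to local
   byte pointers, so no casts appear in the loop. */
void map_local(void *dst, size_t dst_size, const void *src,
               size_t src_size, void *ext_var,
               void (*trans_func)(void *dst, const void *src, void *ext_var),
               size_t arr_len)
{
    unsigned char *dst_bytes = dst;        /* void * -> unsigned char *, no cast */
    const unsigned char *src_bytes = src;  /* same, keeping the const */

    for (size_t i = 0; i < arr_len; i++)
    {
        trans_func(dst_bytes + i*dst_size, src_bytes + i*src_size, ext_var);
    }
}

/* Hypothetical callback for illustration: squares one int. */
void square_int(void *dst, const void *src, void *ext_var)
{
    int *out = dst;        /* the callback uses the same habit */
    const int *in = src;
    (void)ext_var;         /* unused in this example */
    *out = *in * *in;
}
```

With no casts left in the function body, any cast that does show up later stands out as something to look at twice.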
I struggled a bit with this: Why cast to a (char *)? Why not a char? At
first I was doing:
trans_func( (void *)((char)dst+(i*dst_size)), ... with a cast to
increment the pointer, and then casting back to a (void *).
The former needed a char * for some reason, and the latter seems to be
unnecessary.
Remember that in C, "char" is a synonym for "byte" (as well as being
"holds a character", except when it's too small for that).
When you put "test_src" (or "test_dst") in the list of arguments to map(),
the compiler does a few things for you:
-Converts the array (int[10]) to a pointer to its first element (int *)
 (This happens any time you use an array name in a value context)
-Converts the int * (type of the actual value given as an argument) to a
 void * (type of the function argument)
   void * has a few special properties that make it a generic pointer:
   -Any data pointer type can be converted to a void * (and compilers
    shouldn't complain)
   -A void * can be converted to any data pointer type (and compilers
    shouldn't complain)
   -For any data type T, a void * that was obtained by converting a T *
    into a void * can be converted back to a T * that will be equivalent
    to the original one
   map() doing its magic ends up depending on all of these, but actually
   passing it to the function is where we're using the first one.
-Puts that void * wherever map() expects to find it
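The conversions in that list can be seen in a small sketch. The function names sum_ints and round_trip_ok are invented for the example, and sum_ints simply assumes that the caller really did pass a pointer to ints.

```c
#include <stddef.h>

/* Hypothetical function taking a generic pointer; it assumes the caller
   passed a pointer to the first of n ints. */
long sum_ints(const void *data, size_t n)
{
    const int *p = data;    /* void * -> int *, no cast needed */
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += p[i];
    return total;
}

/* Returns 1 if an int * survives the trip through void * and back. */
int round_trip_ok(int *original)
{
    void *generic = original;   /* int * -> void *, no cast needed */
    int *back = generic;        /* void * -> int *, no cast needed */
    return back == original;
}
```

Calling sum_ints(test_src, 10) with an int[10] goes through both steps: the array name becomes an int *, and the int * becomes the const void * that the function expects.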
So the pointer that map() gets looks like this:
+-void * pointing here
v
---------------------------------------- <-- the array
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ <-- bytes
^   ^   ^   ^   ^   ^   ^   ^   ^   ^    <-- ints
Since void * is a generic pointer type, the compiler doesn't know where
to point the result when you try to do pointer arithmetic on it[1].
Pointer arithmetic moves the pointer by the number of *things it's
pointing at* you add or subtract, but the internal representation
typically works in terms of which *byte* the pointer refers to[2].
But map() knows (because we told it) how many bytes each data object
takes up, and we know that "char" is exactly one byte in size, so we can
convert our void * to a char * (either at the beginning of the function
or every time we use it - this particular conversion is a no-op[3], so
any self-respecting optimizer will generate the same code either way),
and do the pointer arithmetic on *that* and get a byte pointer pointing at
the beginning of the data object we want the callback function to work on:
+-byte pointer pointing here
|           +-add i*size to get byte pointer pointing here
v           v
---------------------------------------- <-- the array
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ <-- bytes
^   ^   ^   ^   ^   ^   ^   ^   ^   ^    <-- ints
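The arithmetic in that picture, pulled out into a helper of its own, might look like the sketch below. The name elem_at is made up for the example; map() does the same thing inline.

```c
#include <stddef.h>

/* Hypothetical helper: given a generic pointer to the start of an array
   whose elements are each `size` bytes, return a generic pointer to
   element i. */
void *elem_at(void *base, size_t i, size_t size)
{
    unsigned char *bytes = base;   /* byte pointer to the start of the array */
    return bytes + i*size;         /* step i*size bytes; converts back to void * */
}
```

For an int array arr, elem_at(arr, 3, sizeof arr[0]) points at the same place as &arr[3]; the difference is that elem_at never needed to know it was dealing with ints.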
Once we have that pointer, we can pass it to the callback; we have a
callback that expects a void * (generic pointer), but since the compiler
knows that the callback expects a void * it will take care of that
conversion (since that's why generic pointers exist).
But note that with all of this manipulation, we're only working with
pointers and offsets, not with "actual" values. The only reason "char"
shows up in the code at all is because we know that it has a useful size.
We can convert a pointer to an integer type (including char), but that
conversion isn't guaranteed to be reversible (and, for char, it's highly
unlikely that it will be). So converting the pointer to a char probably
threw away most of the information that the pointer contained, and then
converting it back to a pointer gave you a pointer to some random chunk
of memory that (invoking some knowledge of what kind of implementation
you're likely to be working with) the program probably wasn't allowed
to access, so when the callback function tried to use *that* pointer it
probably crashed.
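A sketch of the difference, with some assumptions stated up front: it uses uintptr_t from <stdint.h> (an optional type, though nearly every hosted implementation has it) as an integer type that is wide enough to hold a pointer, and it uses unsigned char rather than plain char so the narrowing is well defined. The function names are invented for the example.

```c
#include <stdint.h>

/* Round trip through an integer type wide enough for a pointer: where
   uintptr_t exists, the result compares equal to the original. */
void *via_uintptr(void *p)
{
    return (void *)(uintptr_t)p;
}

/* Squeeze the same value through a byte-sized integer. The result can
   hold only as much as one unsigned char can; whatever else the pointer
   carried is gone. We return the number rather than a pointer, since a
   pointer rebuilt from it would not be safe to use. */
uintptr_t via_char(void *p)
{
    return (uintptr_t)(unsigned char)(uintptr_t)p;
}
```

The mapping between pointers and integers is implementation-defined, so the particular number via_char() returns doesn't mean anything portable; the point is only that it can't be bigger than a byte.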
[Everything below here is starting to drift off-topic for CLC.
comp.programming might be a good next stop to get more information.]
[The following was written in a footnote]
Sometime when you're feeling
masochistic, try implementing fully general coroutines without
What is a 'coroutine' please? Also an example and a use might be
helpful.
Coroutines are a control structure that allows two (or more) independent
"threads" of program execution to pass control between them. (This is
NOT what most programmers are referring to when they talk about threads,
but I can't think of a better term for it.)
Essentially, a coroutine can call another coroutine as if it were a
function call, but when a coroutine is called it will resume where it
left off (as if it called a function that's returning) instead of at
the beginning (as if it were a function being called).
Google will happily give you lots of information, some of which is
probably reliable.
There's a pretty good description and a not-pathologically-limited
implementation at
http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html .
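As a very limited taste of the "resume where it left off" idea in plain C, here is a sketch in the spirit of the static-state-plus-switch trick described on that page. It is a reduced illustration, not code taken from there; the name count_to_three is made up, and because the state lives in static variables there can only be one instance of it running at a time.

```c
/* Hypothetical generator-style "coroutine": each call resumes just after
   the previous return instead of starting over. Yields 0, 1, 2, then -1
   to say it has finished, after which it starts again from the top. */
int count_to_three(void)
{
    static int state = 0;   /* where to resume on the next call */
    static int i;           /* must be static to survive between calls */

    switch (state)
    {
    case 0:                 /* first call: start at the beginning */
        for (i = 0; i < 3; i++)
        {
            state = 1;      /* remember the resume point... */
            return i;       /* ...and hand a value back to the caller */
    case 1:;                /* later calls jump back in here, inside the loop */
        }
    }
    state = 0;              /* finished: reset for the next run */
    return -1;
}
```

Note that this only covers one side of the conversation: the caller is still an ordinary function, and nothing here gives each side its own call stack.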
A 'call-stack frame' must mean a list of function pointers that need to
be called FIFO?
No, the call stack is the list of functions that *have been* called and
haven't returned yet; each frame on it holds things like that call's
local variables and where it needs to return to.
Most languages have a single LIFO call stack (hence the name "stack"),
and when a function returns it pops that function's call frame off the
stack and returns to its caller.
Coroutines need a separate call stack for each coroutine (since it needs
to be kept when control passes to another coroutine), and if you allow
the program to access them and don't require them to be strictly LIFO,
you end up getting closer to:
What is a 'continuation'?
It describes what happens next. In a lot of languages it's just
"the next operation gets run", but if you're working with a language
that lets you capture a continuation at an arbitrary point and return
to a captured continuation, you can use it to implement all sorts of
interesting things.
(It turns out that once you have this capability, you can use it to
implement any other flow control construct you want. I went and poked
my sigmonster to get some commentary on it in my .sig quote on this
post, f'rexample.)
One of the easier uses to describe is for early-exit: Capture a
continuation just before you start, and if you get the answer just feed
it to the continuation. At the continuation-capture point, you need some
way to tell whether it's just been captured (and you want to start the
computation) or it's been used for early-exit (and you want to carry on
with whatever it was you wanted the result for), but that can be easier
than backing out of, say, a deeply nested recursion one call at a time.
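C doesn't have real continuations, but setjmp()/longjmp() from <setjmp.h> give you roughly this one early-exit use, so here is a sketch along those lines. It is only an approximation: the jump is valid only while the function that called setjmp() is still running, and it can only go up the stack. The tree type and the names search and find are invented for the example.

```c
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical binary tree node, for illustration only. */
struct node { int value; const struct node *left, *right; };

static jmp_buf found_exit;              /* the captured "continuation" */
static const struct node *found_node;   /* the answer fed to it */

/* Recursive search that never returns a result the normal way: when it
   finds the target it jumps straight back to the capture point. */
static void search(const struct node *n, int target)
{
    if (n == NULL)
        return;
    if (n->value == target)
    {
        found_node = n;
        longjmp(found_exit, 1);         /* early exit out of every level at once */
    }
    search(n->left, target);
    search(n->right, target);
}

const struct node *find(const struct node *root, int target)
{
    found_node = NULL;
    if (setjmp(found_exit) == 0)
    {
        /* Just captured: start the computation. */
        search(root, target);
        return NULL;                    /* search came back normally: not found */
    }
    /* Got here by longjmp: carry on with the result. */
    return found_node;
}
```

The return value of setjmp() is the "way to tell" mentioned above: 0 when the exit point has just been captured, nonzero when it's being used for the early exit.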
dave
[1] Actually, that's not quite correct, but if I didn't wave my hands
here I'd get rather too far away from the topic at hand.
[2] That's not required, but the compiler needs to be able to fake it
(because of, among other things, precisely the sorts of thing we're
talking about here). It's perfectly valid for most data pointers
to be word pointers and for byte pointers to contain a word pointer
and a byte offset in that word; and there have been real machines
that did this.
[3] char * and void * are required to have the same representation and
alignment requirements. (The standard makes similar guarantees for
only a few other groups of pointer types: pointers to qualified and
unqualified versions of compatible types, all pointers to struct
types, and all pointers to union types.)