(void**)&ppval

Oliver Block

Hello,

sometimes you can read that a function requiring an argument of type
void ** is passed something like the following:

mystruct **ppval;

myfunc((void **)&ppval, ...);

I would expect that one would instead use

myfunc((void **)ppval, ...);

Why would one use the first and what is the difference to using the
second call?

Regards,

Oliver Block
 
Eric Sosman

Oliver Block wrote On 09/04/07 11:01,:
Hello,

sometimes you can read that a function requiring an argument of type
void ** is passed something like the following:

mystruct **ppval;

myfunc((void **)&ppval, ...);

I would expect that one would instead use

myfunc((void **)ppval, ...);

Why would one use the first and what is the difference to using the
second call?

Both are probably wrong.

One possibility is that the author of myfunc() assumed
that all data pointers have the same representation. He's
trying to use the void** argument to find his way to some
other pointer in memory, and then trying to manipulate that
other pointer as if it were a void*, regardless of what it
actually is. If this is the case, myfunc() is a bug looking
for a place to sting. It will find one as soon as it's run
on a machine where different "flavors" of pointers exist.

Another possibility is that myfunc() actually wants the
target to be an actual void*, and treats it as such in all
honesty. Since the thing ppval points to is not a void*,
then both calls above are wrong. (The difference between
this case and the first is really just documentation: if the
function is trying to manipulate a "generic pointer" the
author made a mistake, but if the function says it wants to
manipulate a void* and you give it something else then the
mistake is yours.)

Either way, it's time to take a close look at myfunc()
and figure out what it's trying to do.
 
Chris Torek

sometimes you can read that a function requiring an argument of type
void ** is passed something like the following:

mystruct **ppval;

myfunc((void **)&ppval, ...);

This call is almost certainly wrong.

I would expect that one would instead use

myfunc((void **)ppval, ...);

This call is also almost certainly wrong, but in a different way.

Why would one use the first and what is the difference to using the
second call?

The first call takes the address of "ppval" -- this is a value of
type "mystruct ***", assuming of course that "mystruct" is a
typedef-alias for some other valid type, and this value is a pointer
to the object named "ppval" -- and converts it to the type "void **",
then passes the resulting value. (This assumes that the conversion
produced a valid value of type "void **", which need not be the
case: if the conversion is not valid, the program may halt at this
point, e.g., with something like "segmentation fault (core dumped)"
or "runtime error: invalid address, traceback follows" or whatever.)
Assuming that this actually does call myfunc() with some value,
the function myfunc() would have to convert the value back to the
correct type -- "mystruct ***" -- in order to use it.

The second call takes the value of "ppval" -- this is a value of
type "mystruct **", assuming of course that "mystruct" is a
typedef-alias for some other valid type -- and converts it to the
type "void **", then passes the resulting value. As before, the conversion
from the original value to the new one may produce an invalid value.
If, by whatever chance or fortune (good or bad), the conversion
produces a valid value, the function myfunc() would have to convert
the value back to "mystruct **" to use it.
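
To make the types in the two calls concrete, here is a minimal sketch that uses C11 _Generic to report how many levels of indirection each argument expression has before the cast hides it. The mystruct definition, the LEVELS macro, and the function names are hypothetical stand-ins, not part of the code being discussed.

```c
#include <assert.h>

typedef struct { int member; } mystruct;  /* hypothetical stand-in type */

/* Map the static type of an expression to its pointer depth (C11 _Generic).
   void ** is reported as -1 to mark that the original type is gone. */
#define LEVELS(e) _Generic((e), \
    mystruct *:   1,            \
    mystruct **:  2,            \
    mystruct ***: 3,            \
    void **:     -1)

/* Depth of the argument in the first call, before the cast. */
static int depth_of_address(void)
{
    mystruct **ppval = 0;
    return LEVELS(&ppval);          /* &ppval has type mystruct *** */
}

/* Depth of the argument in the second call, before the cast. */
static int depth_of_value(void)
{
    mystruct **ppval = 0;
    return LEVELS(ppval);           /* ppval has type mystruct ** */
}

/* After the cast, both expressions have the same type, void **. */
static int depth_after_cast(void)
{
    mystruct **ppval = 0;
    return LEVELS((void **)ppval);
}
```

Once the cast has been applied, the compiler can no longer tell the two calls apart, which is why neither one draws a diagnostic.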

Now, with all that out of the way, how does code like the above
arise in the first place?

Let me back up a bit and talk about "void *". In ANSI C -- i.e.,
the language that has existed since the end of 1989 -- "void *" is
a sort of "generic data pointer type". For instance, the value
returned from malloc() has type "void *", so that it can point to
any kind of data. Furthermore, for any valid data type "T",
conversion from "void *" to "T *" and vice versa happens "freely",
without a cast:

void *some_sort_of_allocator(void);

T *p;
void *vp;

vp = some_sort_of_allocator();
...
p = vp; /* no cast required */
...
vp = p; /* no cast required */

This means that malloc() and free(), which return and take values
of type "void *", can be used without casts:

p = malloc(sizeof *p);
...
free(p);

The way this works, in theory if not always in practice, is by
making "void *" be a super-cali-fragialistic-expialidocious, huge
fat MEGA-pointer, capable of holding every other kind of data
pointer, almost as if it were a C "union" of all the various
data-pointer types (but slightly different, in that some sort of
conversion is involved). In practice, what it really means is that
"void *" can point to *any* byte[%] of data within the C run-time
system's address space.

-----
[%] Keep in mind that "byte" means "C byte", i.e., "unsigned char".
For this reason, ordinary "unsigned char *" is actually just as
good as "void *", except that "void *" is freely convertible --
without casts -- while the various forms of "char *" require casts
to convert them.
-----

On some machines -- rare today -- the machine's "native" pointer
type points to "machine words" that are bigger than a single C
byte. For instance, a machine might have four billion "words" of
RAM (4 giga-"words", rather than 4 giga-"bytes") where each "word"
is 32 bits wide. (This machine thus has 16 gigabytes of RAM,
addressable using only a 32-bit address bus. Addresses go from 0
to 4294967295 as usual, but each address gives you a full 32-bit
"machine word", hence the 16 gigabytes of RAM with only 4 gigabytes
of machine-word-addresses.)

To implement C on such a machine, the compiler-writer must choose
one of two approaches. He can make "char" 32 bits wide, so that
there are only four giga-"C bytes". Then "char *" and "int *" are
both 32-bit data types. Or, he can make "char" 8 bits wide as
usual, but make "char *" use *two* machine words to hold an address:
one word gives the 32-bit machine-level address of the 32-bit word
holding the 8-bit byte, and the other word contains a value between
0 and 3 inclusive, telling the runtime system which 8-bit field to
extract out of (or set in) that 32-bit word.
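
The second approach can be simulated on an ordinary machine. The sketch below is only an illustration under stated assumptions: the struct fat_ptr layout, the tiny memory array, and the choice that field 0 is the low-order byte of a word are all invented here, and a real compiler for such a machine could do any of this differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical "fat" byte pointer for a word-addressed machine:
   a word address plus the index (0..3) of an 8-bit field in that word. */
struct fat_ptr {
    uint32_t word;
    unsigned field;
};

static uint32_t memory[4];   /* tiny simulated word-addressed memory */

/* Read the selected 8-bit field. Assumption: field 0 is the low-order byte. */
static unsigned load_byte(struct fat_ptr p)
{
    return (unsigned)((memory[p.word] >> (8u * p.field)) & 0xFFu);
}

/* Overwrite the selected 8-bit field, leaving the other three untouched. */
static void store_byte(struct fat_ptr p, unsigned value)
{
    unsigned shift = 8u * p.field;
    uint32_t mask = (uint32_t)0xFFu << shift;
    memory[p.word] = (memory[p.word] & ~mask)
                   | (((uint32_t)value << shift) & mask);
}

/* The equivalent of cp++: step to the next field, carrying into the word. */
static struct fat_ptr next_byte(struct fat_ptr p)
{
    if (++p.field == 4) {
        p.field = 0;
        p.word++;
    }
    return p;
}
```

An "int *" on the same simulated machine would need only the word member, which is the size difference between "skinny" and "fat" pointers.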

In the past, this "use some extra bits to tell which `C byte' to
manipulate within the machine word" method was more common. Some
pointers (like "int *") were ordinary, "skinny", machine pointers;
others ("char *" and a few more) were "fat" pointers containing
both a machine-level "skinny" pointer, and some more information.

Since "void *" also must point to any "C byte", "void *" is also
"fat" on this kind of system.

(Another technique used in the past was to have only one *size* of
pointer, but multiple different pointer *formats*. For instance,
on some Crays, the machine-level word and "word pointer" were both
64 bits wide, but the maximum amount of RAM in the machine was not
the 18,446,744,073,709,551,616 64-bit words that would use up all
64 bits, but rather something much smaller. This leaves a bunch
of "unused" address bits, and the byte-offset can be "smuggled" in
those bits. Some Cray compilers smuggled byte-offsets in high
order bits, so that the alignment of a "void *" or "char *" was
something you could inspect by shifting the value 48 bits right.
Or, on the Data General Eclipse, the machine actually had two
different kinds of pointers in hardware, and one used a bit-shift
instruction to convert from "word pointer" to "byte pointer" and
vice versa. On the Eclipse, it was physically impossible to have
an unaligned word pointer -- the low-order bits that would make
one "unaligned" got shifted out, and no longer existed. Hence:

short s;
short *sp1, *sp2;
char *cp;

sp1 = &s;
cp = (char *)sp1;
cp++;
sp2 = (short *)cp;
if (sp1 == sp2)
    puts("this always prints, on the Eclipse");

because "cp = (char *)sp1" and then "sp2 = (short *)cp" compiles
to machine-level code that does, in effect, the following:

sp1 = &s;
cp = (char *)((unsigned int)sp1 << 1);
/* obviously cp has a 0 in its low order bit now */

cp++; /* now cp has a 1 in its low order bit */

sp2 = (short *)((unsigned int)cp >> 1);
/* the ">> 1" tosses out the bit that got set in "cp++" */

Porting code to the Eclipse was ... "educational". :) )
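
The shift-based conversion can be tried out on any machine by modelling the two pointer formats as plain unsigned integers. This is a rough sketch, not Eclipse code: the function names are hypothetical, and the only assumption carried over from the description above is the one-bit shift between the formats.

```c
#include <assert.h>

/* Simulated pointer formats, using plain unsigned integers as addresses.
   Assumption: a byte pointer is the word pointer shifted left by one. */
static unsigned word_to_byte(unsigned word_ptr) { return word_ptr << 1; }
static unsigned byte_to_word(unsigned byte_ptr) { return byte_ptr >> 1; }

/* Mirrors the sp1 -> cp -> cp++ -> sp2 sequence; returns 1 if sp2 == sp1. */
static int odd_byte_maps_back(unsigned sp1)
{
    unsigned cp = word_to_byte(sp1);   /* low-order bit is 0 */
    cp++;                              /* low-order bit is now 1 */
    unsigned sp2 = byte_to_word(cp);   /* the shift discards that bit */
    return sp2 == sp1;
}
```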

Now, given that ANSI C has "void *" -- which may be "fat", just
like "char *", although this is an implementation detail -- people
started doing silly (or maybe not so silly) things like this:

int some_function_that_allocates(void **p, size_t some_size) {
    if (some_condition)
        return ERROR_CODE_1;
    *p = malloc(some_size);
    ...
    if (some_other_condition) {
        free(*p);
        return ERROR_CODE_2;
    }
    ...
    return OK;
}

In order to *use* this function *correctly*, the caller must do
something like this:

void *tmp;
T *ptr;
int error;

error = some_function_that_allocates(&tmp, THE_SIZE);
if (error) ... handle error ...
ptr = tmp;

On the Eclipse or Cray or whatever, the line:

ptr = tmp;

may compile to actual machine code that does something like shifting
or "narrowing", to alter the "fat" pointer to a "skinny" one or
whatever.
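
For reference, here is a compilable version of the correct calling pattern. It is a minimal sketch: alloc_out() and make_int() are hypothetical names, and the allocator is reduced to a bare malloc() call with an error code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical allocator with a void ** out-parameter.
   Returns 0 on success, -1 on failure. */
static int alloc_out(void **out, size_t size)
{
    void *mem = malloc(size);
    if (mem == NULL)
        return -1;
    *out = mem;
    return 0;
}

/* Portable caller: the function writes into a real void * object, and
   the assignment to ptr performs whatever conversion the machine needs. */
static int *make_int(int value)
{
    void *tmp = NULL;
    int *ptr;

    if (alloc_out(&tmp, sizeof *ptr) != 0)
        return NULL;
    ptr = tmp;               /* no cast required */
    *ptr = value;
    return ptr;
}
```

The extra variable costs one line, and the address passed to the function really is the address of a "void *".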

On your garden-variety x86 CPU, however, all of the machine-level
pointers are the same "under the hood". So, a programmer who "knows"
this about the machine may choose to do the following:

error = some_function_that_allocates(&ptr, THE_SIZE);

thinking:

OK, some_function_that_allocates() writes a "void *", but
"void *" and "T *" have the same shape and layout and all,
so what the heck, I can just have it set my variable "ptr"
even though "ptr" is a "T *" instead of a "void *".

The call requires a diagnostic, which they silence by adding a
cast:

error = some_function_that_allocates((void **)&ptr, THE_SIZE);/*ERROR*/

This actually works -- on the x86. It fails on the Eclipse.

People also do this with free(), writing:

void silly_free_routine(void **p) {
    free(*p);
    *p = NULL;
}

and then call it with:

void *tmp;
T *ptr;
...
error = some_function_that_allocates(&tmp, THE_SIZE);
...
ptr = tmp;
...
tmp = ptr;
silly_free_routine(&tmp);

This works fine -- we are passing to silly_free_routine() the
address of the temporary variable "tmp", so that it gets a
"void **", and silly_free_routine() frees what "tmp" points to and
then sets tmp itself to NULL. But it does not set "ptr" to NULL,
so this was a pretty silly thing to do.

It looks less silly -- but to the experienced Eclipse or other
"weird" machine programmer, quite wrong -- if you use the call:

silly_free_routine((void **)&ptr);

Here the goal is to free the thing "ptr" points to, then set "ptr"
to NULL. But as we now know from the above, simply casting &ptr
to "void **" is wrong.

You can, of course, write:

void free_a_T(T **p) {
    free(*p);
    *p = NULL;
}
...
void do_something(void) {
    T *ptr = malloc(...);
    ...
    free_a_T(&ptr);
}

but this is not quite as useful as it might seem, because the
do_something() routine might also read, e.g.:

void do_something(void) {
    T *ptr = malloc(...);
    T *save;
    ...
    save = ptr;
    ...
    free_a_T(&ptr);
    ...
    if (save)
        use(save);
    ...
}

In other words, just because we NULL-ed out "ptr" does not mean
we have NULL-ed out every copy we might ever have made. And in
any case, I myself think it is just as easy to do:

void do_something(void) {
    T *ptr = malloc(...);
    ...
    free(ptr);
    ptr = NULL; /* tell later code that our object is gone */

and then you can remember things like this:

    save = NULL; /* kill off the saved copy too */
    ...
}
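
If the free-then-NULL pair is wanted in one place, one possible way to get it without any "void **" is a macro, since a macro operates on the caller's own variable with its own type. This is only a sketch: FREE_AND_NULL and frees_and_nulls() are hypothetical names, and the macro evaluates its argument twice, so it should only be given a plain variable.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: free the object and null the caller's own variable.
   Being a macro, it works on any object pointer type without a void ** cast.
   Note that the argument is evaluated twice. */
#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)

/* Returns 1 if the pointer variable is null after the macro runs. */
static int frees_and_nulls(void)
{
    double *ptr = malloc(sizeof *ptr);
    if (ptr != NULL)
        *ptr = 1.5;
    FREE_AND_NULL(ptr);      /* free(NULL) is harmless if malloc failed */
    return ptr == NULL;
}
```

As with free_a_T(), this nulls only the one variable it is given; any saved copies still have to be dealt with by hand.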
 
Keith Thompson

Eric Sosman said:
Oliver Block wrote On 09/04/07 11:01,:

Both are probably wrong.

One possibility is that the author of myfunc() assumed
that all data pointers have the same representation. He's
trying to use the void** argument to find his way to some
other pointer in memory, and then trying to manipulate that
other pointer as if it were a void*, regardless of what it
actually is. If this is the case, myfunc() is a bug looking
for a place to sting. It will find one as soon as it's run
on a machine where different "flavors" of pointers exist.
[...]

Right. 'void*' can be used as a generic pointer type. Any pointer
value (well, any pointer-to-object value, not a function pointer) can
be converted to void* and back again with no loss of information,
and the conversion can be done implicitly.

It's tempting to use 'void**' as a generic pointer-to-pointer type,
but there is no generic pointer-to-pointer type.
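
One way to stay within what 'void*' does guarantee is to have the function return the pointer and report status some other way, so that no pointer-to-pointer is needed at all. The following is a sketch under that assumption; alloc_zeroed() and first_element() are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator that hands back void * and reports status via
   an int out-parameter instead of taking a void ** out-parameter. */
static void *alloc_zeroed(size_t size, int *error)
{
    void *mem = malloc(size);
    *error = (mem == NULL);
    if (mem != NULL)
        memset(mem, 0, size);
    return mem;
}

/* The caller assigns the result to any object pointer type, no cast. */
static long first_element(void)
{
    int error;
    long *values = alloc_zeroed(4 * sizeof *values, &error);
    if (error)
        return -1;
    values[0] = 7;
    long result = values[0];
    free(values);
    return result;
}
```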
 
Ark Khasin

Chris Torek wrote:
The way this works, in theory if not always in practice, is by
making "void *" be a super-cali-fragialistic-expialidocious, huge
fat MEGA-pointer, capable of holding every other kind of data
pointer, almost as if it were a C "union" of all the various
data-pointer types (but slightly different, in that some sort of
conversion is involved). In practice, what it really means is that
"void *" can point to *any* byte[%] of data within the C run-time
system's address space.

On some machines -- rare today -- the machine's "native" pointer
type points to "machine words" that are bigger than a single C
byte. For instance, a machine might have four billion "words" of
RAM (4 giga-"words", rather than 4 giga-"bytes") where each "word"
is 32 bits wide. (This machine thus has 16 gigabytes of RAM,
addressable using only a 32-bit address bus. Addresses go from 0
to 4294967295 as usual, but each address gives you a full 32-bit
"machine word", hence the 16 gigabytes of RAM with only 4 gigabytes
of machine-word-addresses.)

In the past, this "use some extra bits to tell which `C byte' to
manipulate within the machine word" method was more common. Some
pointers (like "int *") were ordinary, "skinny", machine pointers;
others ("char *" and a few more) were "fat" pointers containing
both a machine-level "skinny" pointer, and some more information.

Since "void *" also must point to any "C byte", "void *" is also
"fat" on this kind of system.
<snip>
Just wondering:
Is void * pointer the fattest there is or can const void * be even
fatter? (Thinking of a machine with 2K of RAM and 1023.5G of ROM :)
Thanks,
-- Ark
 
Justin Spahr-Summers

Just wondering:
Is void * pointer the fattest there is or can const void * be even
fatter? (Thinking of a machine with 2K of RAM and 1023.5G of ROM :)

"A pointer to void shall have the same representation and alignment
requirements as a pointer to a character type. Similarly, pointers to
qualified or unqualified versions of compatible types shall have the
same representation and alignment requirements." (C99)

So, no, const void * cannot be fatter.
 
Barry Schwarz

Just wondering:
Is void * pointer the fattest there is or can const void * be even
fatter? (Thinking of a machine with 2K of RAM and 1023.5G of ROM :)

The standard only guarantees that an object pointer converted to a
void* and back comes back equal. It says nothing about relative
sizes.

The standard does guarantee that qualified types have the same size
and representation as unqualified types.

Using your example, if there were a string literal in ROM, its address
(which perhaps should be const char*) could be assigned to a char* so
that would have to be fat enough. Since char* and void* have the same
size and representation, void* must be fat enough also.
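
These guarantees can be checked on a given implementation at compile time. The sketch below uses C11 _Static_assert; the function name is hypothetical. It only confirms the size consequences of the rule on the compiler at hand, it does not prove anything about other implementations.

```c
#include <assert.h>

/* Compile-time checks (C11) of the size guarantees discussed above. */
_Static_assert(sizeof(void *) == sizeof(char *),
               "void * and char * have the same representation");
_Static_assert(sizeof(const void *) == sizeof(void *),
               "qualifying the pointed-to type does not change the pointer");

/* The guarantee in use: a void * round-trips through const void *. */
static int survives_const_round_trip(void)
{
    int object = 0;
    void *plain = &object;
    const void *qualified = plain;     /* adding const needs no cast */
    return (void *)qualified == plain; /* removing it does */
}
```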


Remove del for email
 
