Question about void pointers

CBFalconer · Sep 18, 2008

.... big snip ...

Why not? I'll simply keep in mind that it works only under a
certain compiler.

Because, if you do it properly, you will never have to rewrite it
again. Laziness pays. Whatever 'it' is.

CBFalconer · Sep 18, 2008

[email protected] said:
[email protected] said:

[email protected] said:

(e-mail address removed) wrote:

[about incrementing void pointers as opposed to (char *) cast]

Because sometimes you don't need portability and because
sometimes there's a better way?

Yes, that better way being the portable way, in this case.

Well, it's a matter of whether you prefer void * or char *.

No you idiot, it's a matter of whether you prefer portability
over non-portability without any other gains or losses.

The point being that you can't increment a void* pointer in a
conformant C system. If you (sOs) look at the code generated you
will see that a cast to char* generates no code, and neither does the
cast back to void*. However, it clues the compiler in to what is going on.

CBFalconer · Sep 18, 2008

.... snip ...

Shouldn't it be unsigned char *? Anyway, my argument was that if
your compiler can perform arithmetic with the void *, then go
void *, as you did above.

Because a compiler can't 'perform arithmetic with the void*'. It
says so, right in the C standard. Trying to do so results in
undefined behaviour.

CBFalconer · Sep 18, 2008

Nick said:
.... snip ...

but in general they aren't. Think DOS. Think IBM mainframe.
Are you trying to be dense?

No, he is just re-demonstrating the condition. I have seen many
explanations to him of this elementary fact.

s0suk3 · Sep 18, 2008

unsigned char is used for both kinds of "bytes".

Sure, because it's the standard way. But I think you can now see why I
find void * more natural for the above kinds of "bytes."

Sebastian

Keith Thompson · Sep 18, 2008

Sure, because it's the standard way. But I think you can now see why I
find void * more natural for the above kinds of "bytes."

No, I really can't.

If I had the opportunity to go back in time and change C, I'd probably
separate the concepts of "byte" (the fundamental unit of storage) and
"character". I'd probably make "byte" a keyword and a type name,
referring to a numeric type similar to what we know as unsigned char,
and I'd allow sizeof(char) to be something other than 1. Given such a
change, it would make sense to access raw memory using type byte*, and
you could sensibly perform pointer arithmetic on such a type. I'm not
certain whether void*, a type that points to raw memory but that
can't be dereferenced or incremented, would add sufficient value to be
worth having in the language; perhaps memcpy and friends would just
take arguments of type byte*.

But it's *way* too late to make such a change in any language calling
itself "C".

If you want a raw pointer that can point to any arbitrary object,
use void*. If you want to access the object's representation as bytes
or perform pointer arithmetic, use unsigned char*. If you want to
point to a specific type, or to an array of a specific type, use a
pointer to that type.

You cannot legally perform pointer arithmetic on void* in standard C,
and if you attempt to do so your code will be *gratuitously*
non-portable.

Chris Dollin · Sep 18, 2008

Richard said:
That's nice. How does your debugger do it?

Probably as numbers. How would I know? It's been so long*.

When you look into memory and you see your pointer stored in an
array how does it look? Hex number by any chance? What a surprise.
If you cast it to a char * and subtract p
from ++p do you get 4 on a 32 bit machine? I did.....

I've never looked [at this using a debugger]. I've never needed to.
I understand C's computational model for pointers, and I understand
code generation, and I understand how a /particular/ implementation
can represent pointers in the same way it represents numbers, and I
don't confuse this with a /necessary/ property of implementations,
nor do I confuse scalars and vectors -- a segmented architecture's
pointers are multi-component, like vectors, and just because you
can read off a bit-pattern and apply the traditional (but inappropriate)
binary decoding and get a number doesn't mean that bit-pattern
/is/ a number.

Chris Dollin · Sep 18, 2008

Richard said:
The correct statement is

"It is incorrect to increment a non initialised variable since the
behaviour is undefined - however the value may well increment as
expected".

"Or not, at the whim of the implementor. DO NOT RELY ON THIS (if
you expect your code to be even vaguely portable)."

Nick Keighley · Sep 18, 2008

Because sometimes you don't need portability and because sometimes
there's a better way?

at the risk of repeating myself. I REALLY do not understand
how anyone can think like this. You wish to do something in C.
You have two options *that produce identical results* and are similar
in programming complexity. One is portable and one is not.

Why not use the portable version?
What is the <expletive> point of using the non-portable version?
What can you possibly gain?

I'll probably give up after this. The trouble with text based
formats is you can't hear how hard I'm hitting my keyboard by
now.

Why is it rubbish?

doing arithmetic on a void pointer seems really strange to me.

If you have a void pointer, it will typically point
to any kind of object: int, double, long long, structures, anything.
yup

Does it seem natural to you to treat such an object as if it were a
*character*?

no. It's not a character!

Why not? I'll simply keep in mind that it works only under a certain
compiler.

<nick's head explodes>

Richard Bos · Sep 18, 2008

Richard said:
That's nice. How does your debugger do it? When you look into memory and
you see your pointer stored in an array how does it look? Hex number by
any chance? What a surprise.

My dear dunce, if you explicitly ask your debugger to hexdump
everything, _your DNA_ is a hex number. No surprise there, then.
OTOH, if I ask my debugger to do a stringdump of memory, all floating
point numbers will display as strings. Now, is a float therefore a
string?

Richard

James Kuyper · Sep 18, 2008

Sure, because it's the standard way. But I think you can now see why I
find void * more natural for the above kinds of "bytes."

No, I do not see why you consider a type whose definition is that it
points at no specific type, to be the natural way of treating an object
as being composed of a series of bytes.

Fundamentally, the meaning of p+3, when p is a non-null pointer to an
object type, is that *(p+3) refers to an object of the pointed-at type
which is three objects of that type forward from the object referred to
by *p. That meaning disappears when *p itself is meaningless, which is
why the standard forbids pointer arithmetic on void* pointers.

The fact that some compilers provide void* arithmetic as an unnatural
extension to C doesn't make it natural. It just means that the users and
implementors of that feature don't properly understand what void* and
pointer arithmetic mean.

James Kuyper · Sep 18, 2008

CBFalconer said:
Because a compiler can't 'perform arithmetic with the void*'. It
says so, right in the C standard. Trying to do so results in
undefined behaviour.

So? How does that stop a compiler from doing so? Undefined behavior
includes, among its infinite possibilities, permission to treat a void*
pointer as if it had been converted to char*.

The question is not whether a compiler can do so; many compilers do in
fact do so. If they issue a diagnostic message, they can even be fully
conforming compilers.

The real question is, given a fully conforming compiler which allows
such code as an extension, should a programmer take advantage of that
extension? I see no advantage to doing so, and lots of disadvantages;
Sebastian sees it the other way around, for reasons that make no sense
to me. But that's the real debate.

Keith Thompson · Sep 18, 2008

My dear dunce, if you explicitly ask your debugger to hexdump
everything, _your DNA_ is a hex number. No surprise there, then.
OTOH, if I ask my debugger to do a stringdump of memory, all floating
point numbers will display as strings. Now, is a float therefore a
string?

To be fair, a typical debugger probably displays pointers in hex or
something similar by default -- though on a segmented memory system
it's likely to use something like "1234:5678".

For example, gdb displays pointers in hex -- though not in a way that
supports Richard NoLastName's point. Here's a gdb session I just ran:

% gdb c
GNU gdb 6.8-debian
[...]
(gdb) break 6
Breakpoint 1 at 0x8048392: file c.c, line 6.
(gdb) run
Starting program: /home/kst/c

Breakpoint 1, main () at c.c:6
6 printf("x = %d, ptr = %p\n", x, (void*)ptr);
(gdb) list
1 #include <stdio.h>
2 int main(void)
3 {
4 int x = 42;
5 int *ptr = &x;
6 printf("x = %d, ptr = %p\n", x, (void*)ptr);
7 return 0;
8 }
(gdb) print x
$1 = 42
(gdb) print ptr
$2 = (int *) 0xbfa907a0
(gdb) continue
Continuing.
x = 42, ptr = 0xbfa907a0

Program exited normally.
(gdb) quit

Take a look at how it displayed the value of ptr, a variable of type
int*: "(int *) 0xbfa907a0". The authors of gdb know that pointers
aren't really integers, so gdb doesn't display them as integers; it
displays them using a valid C expression.

raphfrk · Sep 18, 2008

2) The function is a framework operating on "abstracted data
types." It doesn't know what kind of data it's handling,
but it passes the data pointers along to type-specific
functions that do know. qsort() and bsearch() are examples
of this style.

Yeah, that is what I was thinking of ... qsort(). Seems that it is a
standard function. Is it normally implemented efficiently?

Anyway, here is my attempt at insertsort.

size = sizeof <data type>

I guess swap doesn't strictly need to be passed.

void insertsort( void *start, void *end, int size,
                 int (*compare)(void *, void *),
                 void (*swap)(void *, void *) )
{
    void *curpos = start;
    void *swappos;

    while( (char *)curpos <= (char *)end )
    {
        swappos = curpos;
        while( (char *)swappos > (char *)start &&
               (*compare)( swappos, ((char *)swappos - size) ) < 0 )
        {
            (*swap)( swappos, (void *)((char *)swappos - size) );
            swappos = (char *)swappos - size;
        }
        curpos = (char *)curpos + size;
    }
}

Bartc · Sep 19, 2008

For fun, I also have this machine, which I telnetted to in order to run a
little program. That may interest you:

@type pvoid.c
#include <stdio.h>

int main()
{
    int x;
    char y;
    char t[10];
    int i;
    printf("&x = %p\n&x = %o\n&y = %p\n", (void*)&x, (unsigned)&x,
           (void*)&y);
    for (i=0; i<10; ++i) {
        printf("&t[%d] = %p\n", i, &t[i]);
    }
    return 0;
}
@run pvoid
&x = 331100050105
&x = 50105
&y = 1100050106
&t[0] = 331100050107
&t[1] = 221100050107
&t[2] = 111100050107
&t[3] = 1100050107
&t[4] = 331100050110
&t[5] = 221100050110
&t[6] = 111100050110
&t[7] = 1100050110
&t[8] = 331100050111
&t[9] = 221100050111

Pointers are printed as numbers here, but they probably don't behave as
you'd expect. BTW, a debugger would have printed the first and third as
331100,,50105 and 1100,,50106. To understand the void* ones, you have to
know that those are byte pointers, pointing to bytes made of 9 bits (11
octal is 9 decimal, octal being the base commonly used on these 36-bit
machines) inside 36-bit words. Those starting with 33 point to the
least significant 9-bit byte of the word (starting at bit 33 octal, 27
in decimal).

Admittedly this is quite an old machine, but at one time it was the most
common architecture on the Arpanet. The one I telnetted to was an emulated
one, but there are still some hardware ones on the Internet. There is also
a gcc 4.3 port for it, and I suspect the company which is paying to make
that port still makes hardware implementations even if it doesn't sell
them outside systems.

You're not saying what machine this is?

I'm guessing pdp-10. But I'm not sure if those funny pointers are done in
software or if it's using the special bitfield instructions; these use
pointers with a standard 18-bit address in low half and extra bit
position/width in top half.

The world is more diverse than you think?

That machine is also obsolete. Byte-addressable machines were a breath of
fresh air. I just hope word-addressed machines don't make a big comeback.

You can consider any bit pattern as a number, but when it is an address
that is not always the best thing to do.

In this case treating addresses as /integers/ would not have worked (for
incrementing and so on), but no harm in considering the address as a kind of
number, similar to a float which also uses special fields within the 36-bit
word.

BTW what would address to int conversion have done in this case, just done
nothing, or linearised the address?

Chris Torek · Sep 20, 2008

That's an excellent point: what if you _don't_ know what it's really
pointing to? For example, functions such as memcpy(), memmove(),
memset(), etc., are used extensively to deal with objects of any type.
So these functions take void *'s, and they might be pointing to an int,
to a char, to a structure, or anything you can think of. I'm sure the
implementor of such a function would find it enjoyable to be able to
perform void pointer arithmetic!

As someone who has actually written a number of these implementation
functions, I can answer to the last statement here (about "find[ing]
it enjoyable"): no, it really makes essentially no difference.

Here is a classic example of a linear search function (which I am
typing in "on the fly", otherwise I would show a binary search
modeled after bsearch()). Note where "void *" appears and where
it does not.

/*
 * Linear search: search "nmemb" items of size "size", starting
 * at "base", for the given "key". Return a pointer to the
 * item found, or NULL if the item is not found.
 *
 * Comparisons between keys and items are done by calling the
 * compar() function, which should return 0 if they match,
 * nonzero otherwise.
 */
void *lsearch(const void *key, const void *base0,
              size_t nmemb, size_t size,
              int (*compar)(const void *, const void *)) {
    const unsigned char *base = base0;
    size_t i;

    for (i = 0; i < nmemb; i++, base += size)
        if (compar(key, base) == 0)
            return (void *)base; /* cast away const */
    return NULL;
}

In GNUC, where sizeof(void)==1, I can replace "const void *base0"
with "const void *base" and remove the first line of the function.
Otherwise, the entire function remains exactly the same. (The cast
on the return value is still required, as lsearch() mimics bsearch()
and de-consts its return value.) The code generated by any reasonably
good compiler also remains the same.

(Since linear search is slow, you probably want to call bsearch()
anyway. The bsearch() code is a bit more complex, so the "extra"
line, "const unsigned char *base = base0", adds even less to the
source, percentage-wise. As I said above, I would have shown a
bsearch(), but for two things: I might get it wrong, and the
name bsearch() is reserved to the implementation.)

lawrence.jones · Sep 26, 2008

Peter Nilsson said:
That's actually a very good question! If you ever hear
a good answer, please share it.

So you can print pointers (e.g., for debugging) in the native format for
the platform without having to know in advance what that is. (Yes,
there's no guarantee that %p actually does that, but only a very poor
quality implementation would do anything else.)

Tim Rentsch · Oct 9, 2008

Keith Thompson said:
There is no such prohibition in the normative text of the standard,
but it's suggested in a footnote.

C99 6.2.5p27:

A pointer to void shall have the same representation and alignment
requirements as a pointer to a character type.

And a footnote:

The same representation and alignment requirements are meant to
imply interchangeability as arguments to functions, return values
from functions, and members of unions.

This is an unfortunate flaw in the standard.

I think you're misunderstanding the idea behind the footnote.
The "interchangeability" of char*/void* is meant to talk about
values of those types used in particular contexts, not about
derived types with those types in them. For example,

    char *
    xyzzy(){
        void *r = "answer";
        return * (char**) &r;
    }

is the sort of case the footnote means to address. This
interchangeability does not extend to derived types, such as
trying to convert &xyzzy to (void *(*)()). There's a similar
relationship between int and unsigned int; as values, int and
unsigned int are (often) interchangeable, but pointers to int and
pointers to unsigned int don't bear any special relationship to
each other -- they don't even have to be the same size, for
example. The analogy with function types that have int or
unsigned int in them should be clear.

Keith Thompson · Oct 9, 2008

Tim Rentsch said:
I think you're misunderstanding the idea behind the footnote.
The "interchangeability" of char*/void* is meant to talk about
values of those types used in particular contexts, not about
derived types with those types in them.

[snip]

Consider this:

printf("%p\n", "string literal");

I see nothing in the normative text of the standard that requires
char* and void* to be passed as function arguments in the same manner;
for example, void* arguments might be passed in one set of registers
and char* arguments in another. I can think of no good reason to do
so, but the standard doesn't forbid it. The footnote, however,
implies that the above must print (in some implementation-defined
manner) the address of the first character of the literal.

Richard Tobin · Oct 9, 2008

Keith Thompson said:
I see nothing in the normative text of the standard that requires
char* and void* to be passed as function arguments in the same manner; [...]
The footnote, however,
implies that the above must print (in some implementation-defined
manner) the address of the first character of the literal.

From which we conclude that there is a bug in the standard: it doesn't
normatively convey what the authors intended, and it should be fixed.

-- Richard
