Question about void pointers

CBFalconer

.... big snip ...


Why not? I'll simply keep in mind that it works only under a
certain compiler.

Because, if you do it properly, you will never have to rewrite it
again. Laziness pays. Whatever 'it' is.
 
CBFalconer

(e-mail address removed) wrote:

[about incrementing void pointers as opposed to (char *) cast]

Because sometimes you don't need portability and because
sometimes there's a better way?

Yes, that better way being the portable way, in this case.

Well, it's a matter of whether you prefer void * or char *.

No you idiot, it's a matter of whether you prefer portability
over non-portability without any other gains or losses.

The point being that you can't increment a void* pointer in a
conformant C system. If you (sOs) look at the code generated you
will see that a cast to char* generates none, as does the cast back
to void*. However it clues the compiler in to what is going on.
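A minimal sketch of the portable idiom described above, assuming a hypothetical helper named advance_bytes (it appears nowhere in the thread). The casts exist only at the source level; on typical implementations they cost no instructions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: step a generic pointer forward by nbytes bytes.
 * Arithmetic happens on char*, where the step size is defined as one
 * byte; the result converts back to void* implicitly. */
void *advance_bytes(void *p, size_t nbytes)
{
    assert(p != NULL);
    return (char *)p + nbytes;
}
```

Calling advance_bytes(a, sizeof a[0]) on an int array a should yield a pointer to a[1], with no reliance on a compiler extension.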
 
CBFalconer

.... snip ...

Shouldn't it be unsigned char *? Anyway, my argument was that if
your compiler can perform arithmetic with the void *, then go
void *, as you did above.

Because a compiler can't 'perform arithmetic with the void*'. It
says so, right in the C standard. Trying to do so results in
undefined behaviour.
 
CBFalconer

Nick said:
.... snip ...


but in general they aren't. Think DOS. Think IBM mainframe.
Are you trying to be dense?

No, he is just re-demonstrating the condition. I have seen many
explanations to him of this elementary fact.
 
s0suk3

unsigned char is used for both kinds of "bytes".

Sure, because it's the standard way. But I think you can now see why I
find void * more natural for the above kinds of "bytes."

Sebastian
 
Keith Thompson

Sure, because it's the standard way. But I think you can now see why I
find void * more natural for the above kinds of "bytes."

No, I really can't.

If I had the opportunity to go back in time and change C, I'd probably
separate the concepts of "byte" (the fundamental unit of storage) and
"character". I'd probably make "byte" a keyword and a type name,
referring to a numeric type similar to what we know as unsigned char,
and I'd allow sizeof(char) to be something other than 1. Given such a
change, it would make sense to access raw memory using type byte*, and
you could sensibly perform pointer arithmetic on such a type. I'm not
certain whether void*, a type that points to raw memory but that can't
be dereferenced or incremented, would add sufficient value to be
worth having in the language; perhaps memcpy and friends would just
take arguments of type byte*.

But it's *way* too late to make such a change in any language calling
itself "C".

If you want a raw pointer that can point to any arbitrary object,
use void*. If you want to access the object's representation as bytes
or perform pointer arithmetic, use unsigned char*. If you want to
point to a specific type, or to an array of a specific type, use a
pointer to that type.

You cannot legally perform pointer arithmetic on void* in standard C,
and if you attempt to do so your code will be *gratuitously*
non-portable.
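The division of labour above might look like this in practice. This is a sketch only; bytes_equal is a made-up name, and the function simply takes generic void* parameters and converts them to unsigned char* before touching the representation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: compare the object representations of two
 * objects of n bytes each.  The parameters are void* because the
 * objects may be of any type; the work is done through
 * unsigned char*, which may be indexed and incremented. */
int bytes_equal(const void *a, const void *b, size_t n)
{
    const unsigned char *pa = a;   /* no cast needed from void* in C */
    const unsigned char *pb = b;
    size_t i;

    assert(a != NULL && b != NULL);
    for (i = 0; i < n; i++)
        if (pa[i] != pb[i])
            return 0;
    return 1;
}
```

In real code memcmp() from <string.h> already does this job; the loop is written out here only to show where each pointer type belongs.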
 
Chris Dollin

Richard said:
That's nice. How does your debugger do it?

Probably as numbers. How would I know? It's been so long*.
When you look into memory and you see your pointer stored in an
array how does it look? Hex number by any chance? What a surprise.
If you cast it to a char * and subtract p
from ++p do you get 4 on a 32 bit machine? I did.....

I've never looked [at this using a debugger]. I've never needed to.
I understand C's computational model for pointers, and I understand
code generation, and I understand how a /particular/ implementation
can represent pointers in the same way it represents numbers, and I
don't confuse this with a /necessary/ property of implementations,
nor do I confuse scalars and vectors -- a segmented architecture's
pointers are multi-component, like vectors, and just because you
can read off a bit-pattern and apply the traditional (but inappropriate)
binary decoding and get a number doesn't mean that bit-pattern
/is/ a number.
 
Chris Dollin

Richard said:
The correct statement is

"It is incorrect to increment a non initialised variable since the
behaviour is undefined - however the value may well increment as
expected".

"Or not, at the whim of the implementor. DO NOT RELY ON THIS (if
you expect your code to be even vaguely portable)."
 
Nick Keighley

Because sometimes you don't need portability and because sometimes
there's a better way?

at the risk of repeating myself. I REALLY do not understand
how anyone can think like this. You wish to do something in C.
You have two options *that produce identical results* and are similar
in programming complexity. One is portable and one is not.

Why not use the portable version?
What is the <expletive> point of using the non-portable version?
What can you possibly gain?

I'll probably give up after this. The trouble with text based
formats is you can't hear how hard I'm hitting my keyboard by
now.

Why is it rubbish?

Doing arithmetic on a void pointer seems really strange to me.
If you have a void pointer, it will typically point
to any kind of object: int, double, long long, structures, anything.
yup


Does it seem natural to you to treat such an object as if it were a
*character*?

no. It's not a character!

Why not? I'll simply keep in mind that it works only under a certain
compiler.

<nick's head explodes>
 
Richard Bos

Richard said:
That's nice. How does your debugger do it? When you look into memory and
you see your pointer stored in an array how does it look? Hex number by
any chance? What a surprise.

My dear dunce, if you explicitly ask your debugger to hexdump
everything, _your DNA_ is a hex number. No surprise there, then.
OTOH, if I ask my debugger to do a stringdump of memory, all floating
point numbers will display as strings. Now, is a float therefore a
string?

Richard
 
James Kuyper

Sure, because it's the standard way. But I think you can now see why I
find void * more natural for the above kinds of "bytes."

No, I do not see why you consider a type whose definition is that it
points at no specific type, to be the natural way of treating an object
as being composed of a series of bytes.

Fundamentally, the meaning of p+3, when p is a non-null pointer to an
object type, is that *(p+3) refers to an object of the pointed-at type
which is three objects of that type forward from the object referred to
by *p. That meaning disappears when *p itself is meaningless, which is
why the standard forbids pointer arithmetic on void* pointers.

The fact that some compilers provide void* arithmetic as an unnatural
extension to C doesn't make it natural. It just means that the users and
implementors of that feature don't properly understand what void* and
pointer arithmetic mean.
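To make the scaling concrete, here is a small sketch; byte_distance is a hypothetical helper, not something from the thread. It measures in bytes how far apart two pointers into the same array are, which is also what the earlier "cast it to a char * and subtract" experiment observes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: distance in bytes between two pointers that
 * point into (or one past the end of) the same array.  Subtraction
 * is done on char*, where the unit is one byte. */
ptrdiff_t byte_distance(const void *from, const void *to)
{
    assert(from != NULL && to != NULL);
    return (const char *)to - (const char *)from;
}
```

For an int array a, (a + 3) - a is 3 in units of int, while byte_distance(a, a + 3) is 3 * sizeof(int). With void* there is no element size to scale by, so the standard defines no such arithmetic.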
 
James Kuyper

CBFalconer said:
Because a compiler can't 'perform arithmetic with the void*'. It
says so, right in the C standard. Trying to do so results in
undefined behaviour.

So? How does that stop a compiler from doing so? Undefined behavior
includes, among its infinite possibilities, permission to treat a void*
pointer as if it had been converted to char*.

The question is not whether a compiler can do so; many compilers do in
fact do so. If they issue a diagnostic message, they can even be fully
conforming compilers.

The real question is, given a fully conforming compiler which allows
such code as an extension, should a programmer take advantage of that
extension? I see no advantage to doing so, and lots of disadvantages;
Sebastian sees it the other way around, for reasons that make no sense
to me. But that's the real debate.
 
Keith Thompson

My dear dunce, if you explicitly ask your debugger to hexdump
everything, _your DNA_ is a hex number. No surprise there, then.
OTOH, if I ask my debugger to do a stringdump of memory, all floating
point numbers will display as strings. Now, is a float therefore a
string?

To be fair, a typical debugger probably displays pointers in hex or
something similar by default -- though on a segmented memory system
it's likely to use something like "1234:5678".

For example, gdb displays pointers in hex -- though not in a way that
supports Richard NoLastName's point. Here's a gdb session I just ran:

% gdb c
GNU gdb 6.8-debian
[...]
(gdb) break 6
Breakpoint 1 at 0x8048392: file c.c, line 6.
(gdb) run
Starting program: /home/kst/c

Breakpoint 1, main () at c.c:6
6 printf("x = %d, ptr = %p\n", x, (void*)ptr);
(gdb) list
1 #include <stdio.h>
2 int main(void)
3 {
4 int x = 42;
5 int *ptr = &x;
6 printf("x = %d, ptr = %p\n", x, (void*)ptr);
7 return 0;
8 }
(gdb) print x
$1 = 42
(gdb) print ptr
$2 = (int *) 0xbfa907a0
(gdb) continue
Continuing.
x = 42, ptr = 0xbfa907a0

Program exited normally.
(gdb) quit

Take a look at how it displayed the value of ptr, a variable of type
int*: "(int *) 0xbfa907a0". The authors of gdb know that pointers
aren't really integers, so gdb doesn't display them as integers; it
displays them using a valid C expression.
 
raphfrk

     2) The function is a framework operating on "abstracted data
        types."  It doesn't know what kind of data it's handling,
        but it passes the data pointers along to type-specific
        functions that do know.  qsort() and bsearch() are examples
        of this style.

Yeah, that is what I was thinking of ... qsort(). Seems that it is a
standard function. Is it normally implemented efficiently?

Anyway, here is my attempt at insertsort.

size = sizeof <data type>

I guess swap doesn't strictly need to be passed.

void insertsort( void *start, void *end, int size,
                 int (*compare)(void *, void *),
                 void (*swap)(void *, void *) )
{
    void *curpos = start;   /* element currently being inserted */
    void *swappos;          /* walks back toward start */

    /* note: end points at the last element, not one past it */
    while( (char *)curpos <= (char *)end )
    {
        swappos = curpos;
        while( (char *)swappos > (char *)start &&
               (*compare)( swappos, ((char *)swappos - size) ) < 0 )
        {
            (*swap)( swappos, (void *)((char *)swappos - size) );
            swappos = (char *)swappos - size;
        }
        curpos = (char *)curpos + size;
    }
}
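On the remark that swap doesn't strictly need to be passed: one possible variant, sketched here under my own assumptions, swaps elements byte by byte through unsigned char* and takes qsort()-style arguments (base, element count, element size). The names insertsort_generic, swap_bytes and compare_ints are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Swap two non-overlapping objects of `size` bytes, one byte at a time. */
void swap_bytes(void *a, void *b, size_t size)
{
    unsigned char *pa = a;
    unsigned char *pb = b;
    size_t i;

    for (i = 0; i < size; i++) {
        unsigned char tmp = pa[i];
        pa[i] = pb[i];
        pb[i] = tmp;
    }
}

/* Insertion sort over nmemb elements of `size` bytes starting at base.
 * All pointer arithmetic is done on unsigned char*, never on void*. */
void insertsort_generic(void *base, size_t nmemb, size_t size,
                        int (*compare)(const void *, const void *))
{
    unsigned char *first = base;
    size_t i;

    assert(base != NULL || nmemb == 0);
    for (i = 1; i < nmemb; i++) {
        unsigned char *cur = first + i * size;
        /* move the element left until its left neighbour is not greater */
        while (cur > first && compare(cur, cur - size) < 0) {
            swap_bytes(cur, cur - size, size);
            cur -= size;
        }
    }
}

/* Example comparison function for int elements. */
int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}
```

A call such as insertsort_generic(v, n, sizeof v[0], compare_ints) sorts an int array v of n elements in place. As for the efficiency question, the standard does not say how qsort() must be implemented, so that depends on the library.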
 
Bartc

For fun, I also have a machine that I telnetted to in order to run a
little program. That may interest you:

@type pvoid.c
#include <stdio.h>

int main()
{
    int x;
    char y;
    char t[10];
    int i;
    printf("&x = %p\n&x = %o\n&y = %p\n", (void*)&x, (unsigned)&x,
           (void*)&y);
    for (i=0; i<10; ++i) {
        printf("&t[%d] = %p\n", i, (void*)&t[i]);
    }
    return 0;
}
@run pvoid
&x = 331100050105
&x = 50105
&y = 1100050106
&t[0] = 331100050107
&t[1] = 221100050107
&t[2] = 111100050107
&t[3] = 1100050107
&t[4] = 331100050110
&t[5] = 221100050110
&t[6] = 111100050110
&t[7] = 1100050110
&t[8] = 331100050111
&t[9] = 221100050111

Pointers are printed as numbers here, but they probably don't behave like
you'd expect. BTW, a debugger would have printed the first and third as
331100,,50105 and 1100,,50106. To understand the void* one, you have to
know that those are byte pointers, pointing to bytes made of 9 bits (11
octal is 9 decimal, octal being the base commonly used on these 36-bit
machines) inside 36-bit words. Those starting with 33 point to the
least significant 9-bit byte of the word (starting at bit 33 octal -- 27
in decimal).

Admittedly this is quite an old machine, but at one time it was the most
common architecture on the Arpanet. The one I telnetted to was an
emulated one, but there are still some hardware ones on the Internet.
There is also a gcc 4.3 port for it, and I suspect the company which is
paying to make that port still makes hardware implementations even if it
doesn't sell them outside systems.


You're not saying what machine this is?

I'm guessing pdp-10. But I'm not sure if those funny pointers are done in
software or if it's using the special bitfield instructions; these use
pointers with a standard 18-bit address in low half and extra bit
position/width in top half.
The world is more diverse than you think?

That machine is also obsolete. Byte-addressable machines were a breath of
fresh air. I just hope word-addressed machines don't make a big comeback.
You can consider any bit pattern as a number, but when it is an address
that is not always the best thing to do.

In this case treating addresses as /integers/ would not have worked (for
incrementing and so on), but no harm in considering the address as a kind of
number, similar to a float which also uses special fields within the 36-bit
word.

BTW what would address to int conversion have done in this case, just done
nothing, or linearised the address?
 
Chris Torek

That's an excellent point: what if you _don't_ know what it's really
pointing to? For example, functions such as memcpy(), memmove(),
memset(), etc., are used extensively to deal with objects of any type.
So these functions take void *'s, and they might be pointing to an int,
to a char, to a structure, or anything you can think of. I'm sure the
implementor of such a function would find it enjoyable to be able to
perform void pointer arithmetic!

As someone who has actually written a number of these implementation
functions, I can answer to the last statement here (about "find[ing]
it enjoyable"): no, it really makes essentially no difference.

Here is a classic example of a linear search function (which I am
typing in "on the fly", otherwise I would show a binary search
modeled after bsearch()). Note where "void *" appears and where
it does not.

/*
 * Linear search: search "nmemb" items of size "size", starting
 * at "base", for the given "key". Return a pointer to the
 * item found, or NULL if the item is not found.
 *
 * Comparisons between keys and items are done by calling the
 * compar() function, which should return 0 if they match,
 * nonzero otherwise.
 */
void *lsearch(const void *key, const void *base0,
              size_t nmemb, size_t size,
              int (*compar)(const void *, const void *)) {
    const unsigned char *base = base0;
    size_t i;

    for (i = 0; i < nmemb; i++, base += size)
        if (compar(key, base) == 0)
            return (void *)base; /* cast away const */
    return NULL;
}

In GNUC, where sizeof(void)==1, I can replace "const void *base0"
with "const void *base" and remove the first line of the function.
Otherwise, the entire function remains exactly the same. (The cast
on the return value is still required, as lsearch() mimics bsearch()
and de-consts its return value.) The code generated by any reasonably
good compiler also remains the same.

(Since linear search is slow, you probably want to call bsearch()
anyway. The bsearch() code is a bit more complex, so the "extra"
line, "const unsigned char *base = base0", adds even less to the
source, percentage-wise. As I said above, I would have shown a
bsearch(), but for two things: I might get it wrong, and the
name bsearch() is reserved to the implementation.)
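For completeness, calling the standard bsearch() on a sorted int array could look like the sketch below. The wrapper find_int and the comparison function cmp_int_keys are hypothetical names introduced here; only bsearch() itself comes from <stdlib.h>.

```c
#include <assert.h>
#include <stdlib.h>

/* Comparison function in the form bsearch() expects: the first
 * argument is the key, the second an array element. */
int cmp_int_keys(const void *key, const void *elem)
{
    int k = *(const int *)key;
    int e = *(const int *)elem;
    return (k > e) - (k < e);
}

/* Hypothetical wrapper: look up key in an ascending int array of n
 * elements.  Returns a pointer to a matching element, or NULL. */
const int *find_int(const int *sorted, size_t n, int key)
{
    assert(sorted != NULL);
    return bsearch(&key, sorted, n, sizeof sorted[0], cmp_int_keys);
}
```

The caller never sees a void* here; the generic pointers stay inside the library call and the comparison function, which is where the conversions to a concrete type happen.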
 
lawrence.jones

Peter Nilsson said:
That's actually a very good question! If you ever hear
a good answer, please share it.

So you can print pointers (e.g., for debugging) in the native format for
the platform without having to know in advance what that is. (Yes,
there's no guarantee that %p actually does that, but only a very poor
quality implementation would do anything else.)
 
Tim Rentsch

Keith Thompson said:
There is no such prohibition in the normative text of the standard,
but it's suggested in a footnote.

C99 6.2.5p27:

A pointer to void shall have the same representation and alignment
requirements as a pointer to a character type.

And a footnote:

The same representation and alignment requirements are meant to
imply interchangeability as arguments to functions, return values
from functions, and members of unions.

This is an unfortunate flaw in the standard.

I think you're misunderstanding the idea behind the footnote.
The "interchangeability" of char*/void* is meant to talk about
values of those types used in particular contexts, not about
derived types with those types in them. For example,

char *
xyzzy(){
    void *r = "answer";
    return * (char**) &r;
}

is the sort of case the footnote means to address. This
interchangeability does not extend to derived types, such as
trying to convert &xyzzy to (void *(*)()). There's a similar
relationship between int and unsigned int; as values, int and
unsigned int are (often) interchangeable, but pointers to int and
pointers to unsigned int don't bear any special relationship to
each other -- they don't even have to be the same size, for
example. The analogy with function types that have int or
unsigned int in them should be clear.
 
Keith Thompson

Tim Rentsch said:
I think you're misunderstanding the idea behind the footnote.
The "interchangeability" of char*/void* is meant to talk about
values of those types used in particular contexts, not about
derived types with those types in them.
[snip]

Consider this:

printf("%p\n", "string literal");

I see nothing in the normative text of the standard that requires
char* and void* to be passed as function arguments in the same manner;
for example, void* arguments might be passed in one set of registers
and char* arguments in another. I can think of no good reason to do
so, but the standard doesn't forbid it. The footnote, however,
implies that the above must print (in some implementation-defined
manner) the address of the first character of the literal.
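Whatever the footnote is taken to guarantee, the usual way to sidestep the question is to convert explicitly, since %p is specified to take a void*. A small sketch, with format_pointer as a made-up helper name:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write the implementation-defined text form of
 * a character pointer into buf.  The explicit cast means the argument
 * really has type void*, so nothing depends on char* and void* being
 * passed to a variadic function in the same way. */
int format_pointer(char *buf, size_t buflen, char *s)
{
    assert(buf != NULL && buflen > 0);
    return snprintf(buf, buflen, "%p", (void *)s);
}
```

The text produced is implementation-defined, so the sketch makes no claim about what it looks like; it only avoids relying on the interchangeability under discussion.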
 
Richard Tobin

Keith Thompson said:
I see nothing in the normative text of the standard that requires
char* and void* to be passed as function arguments in the same manner; [...]
The footnote, however,
implies that the above must print (in some implementation-defined
manner) the address of the first character of the literal.

From which we conclude that there is a bug in the standard: it doesn't
normatively convey what the authors intended, and it should be fixed.

-- Richard
 
