arithmetic on a void * pointer

Seebs

Sez you. If what you said was correct, then we should never be able to
cast a pointer to another pointer type, since that object it's pointing
to doesn't exist.

You don't seem to have understood the difference between the pointer
and the thing pointed to.

When you convert a pointer to a new type, you may well create a new
value which is different from the value you converted.

Uh oh. Now I'm really worried, if having different, non-transferrable
data pointer representations is permitted by the C standard. I sure
hope you're wrong.

No, he's not. They've been around forever and have been essential for some
high-performance systems.

So is it not true that void * must be able to be a vessel for a pointer
to any data object?

Of course it must.

But that means a conversion. And in some cases, the conversion actually changes
the bit contents or even SIZE of the pointer.

-s
 
BGB / cr88192

Mark Adler said:
Helpful C swamis,

In the good old days, any self-respecting C compiler knew what you meant
when you did arithmetic on a void * pointer, i.e. that the units were
bytes. Then someone decided that we would all be more productive somehow
if the compilers couldn't figure out obvious things like that on their
own. So now you can't do arithmetic on void * pointers.

Ok, fine. So I adjusted, I would do this:

void *buf;
int n;

(char *)buf += n;

But now I'm getting nastygrams like "warning: target of assignment not
really an lvalue; this will be a hard error in the future". Once again
someone, probably the same person, decided that compilers shouldn't have
to burden this great responsibility of understanding obvious things like
that. (And why a cast of a pointer to another pointer type is no longer a
pointer, I have no idea.)

So now what am I supposed to do? It starts to get just downright silly.
E.g.

void *buf;
int n;
char *cmon;

cmon = (char *)buf;
cmon += n;
buf = (void *)cmon;

Is there a better way to work around this, at least until the *next*
dumbing down of the language?

even heroes step in things sometimes it seems...

well, I usually define a 'byte' type (as 'unsigned char'), and use pointers
to this for when I need bytes, and then cast to/from 'void *'...
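
A minimal sketch of that approach, for illustration only. The helper name advance_bytes is hypothetical and not from the thread; the point is that the arithmetic happens on a byte pointer, and the conversions to and from void * need no cast in C:

```c
#include <stddef.h>

/* 'byte' is the alias suggested above; advance_bytes is a hypothetical name. */
typedef unsigned char byte;

/* Return a pointer n bytes past buf without doing arithmetic on void *. */
static void *advance_bytes(void *buf, size_t n)
{
    byte *p = buf;  /* implicit conversion from void * */
    p += n;         /* arithmetic in units of sizeof(byte), which is 1 */
    return p;       /* implicit conversion back to void * */
}
```

A caller would then write buf = advance_bytes(buf, n); in place of the rejected (char *)buf += n.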


elsewhere in the thread:
yeah, I guess I will note that I like to use '()' for no-args functions even
being aware that it is not strictly correct.

I remembered recently someone wanting to disallow the ability to type:
int foo()
{
...
}

which was a position I personally somewhat disagreed with.
in my compiler, I dropped old-style declarations (mostly as I didn't feel
much need to implement them for my uses), but just made '()' and '(void)' do
the same thing, which to me seemed a perfectly reasonable option. but, this
opened up a big argument...

personally, I did not see it as a "change" of the semantics, rather a
"narrowing".

not like I have not seen other projects using the same notation though.


so, I guess it is more about how strict everyone wants to be over these
matters...


to some, it is "all or nothing" even over the tiny details, and to others it
is a bit more fluid.
as I see it, "what works works..." (though there are limits here, given
matters such as "what is in good taste" and "what is the right thing to do",
which is not generally one-and-the-same as picking over the fine details,
even if there is a lot of overlap at times...).
 
Keith Thompson

Seebs said:
I've never seen anything but gcc do that. There was no time ever when
it was part of the spec, and I think gcc was wrong.
[...]

Intel's icc presumably does, but only because it's designed to be
compatible with gcc (the goal is to be able to use it to build the
Linux kernel).
 
Beej Jorgensen

My current copy of gcc (4.2.1), with all warnings enabled (-Wall),
is perfectly happy with this (no warnings):

void *buf;
int n;

buf += n;

gcc -pedantic:

foo.c:9: warning: pointer of type ‘void *’ used in arithmetic

Perhaps it's only warning idiots like me? ;-)

-Beej
 
Mark Adler

gcc -pedantic:

foo.c:9: warning: pointer of type ‘void *’ used in arithmetic

Perhaps it's only warning idiots like me? ;-)

Apparently -Wall really only means -Wsome.

Mark
 
Mark Adler

But that means a conversion. And in some cases, the conversion
actually changes
the bit contents or even SIZE of the pointer.

Ok, I'm a little slow, but I get it now. My mental model of a data
pointer always being simply an address pointing to a byte, with no
conversion due to the type pointed to, is not universal. I now agree
that (char *)b += n should not be allowed. (Sigh. Of course, I would
prefer that the world match all of my simple mental models. It's so
annoying when it doesn't.)

"void *" is for a typeless address, and since it has no type, there is no
meaningful size to increment it by when adding one...

Now I can play the game! (Gets up on soap box ...) I assert that a
function that takes a void * pointer and a size_t length to identify a
thing makes no sense whatsoever in the C standard, since length has no
meaning with respect to the void * pointer. Yet there are many such
functions that are also part of the C standard, such as memcpy(),
write(), etc. They should all really be char *, since only then can
the length have meaning. Those functions are in a sense implying char
* through the meaning of the length, but presenting void * as the
argument instead, and are therefore lying. (I realize of course that
they do this for the convenience of not having to cast to (char *)
every time you use one of those functions.)

I would argue that alternatively we could eliminate this mixed message
of void both implicitly having a size and not having a size by changing
the C standard to give void a size, i.e. one. sizeof(void) == 1.
Nothing would break -- this would just add to the things that can be
expressed, in a way that is consistent with how library functions use
void * arguments. (The void type would still not be allowed by itself,
and so function(void) would still mean no arguments.) In fact, in the
evil gcc, n = sizeof(void); sets n to 1, and it seems to work fine.

Mark
 
Keith Thompson

Mark Adler said:
Ok, I'm a little slow, but I get it now. My mental model of a data
pointer always being simply an address pointing to a byte, with no
conversion due to the type pointed to, is not universal. I now agree
that (char *)b += n should not be allowed. (Sigh. Of course, I would
prefer that the world match all of my simple mental models. It's so
annoying when it doesn't.)



Now I can play the game! (Gets up on soap box ...) I assert that a
function that takes a void * pointer and a size_t length to identify a
thing makes no sense whatsoever in the C standard, since length has no
meaning with respect to the void * pointer. Yet there are many such
functions that are also part of the C standard, such as memcpy(),
write(), etc.

Well, write() isn't part of the C standard, but that's not really
relevant to your point.

The point is that in most such functions, the size is specified to be
in bytes. Which means that the implementation, assuming it's written
in C, has to convert the void* pointer to, say, unsigned char* to
operate on it.
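
A rough sketch of what that conversion looks like inside such a function. This is not how any particular library implements memcpy(); the name copy_bytes is hypothetical, and the regions are assumed not to overlap:

```c
#include <stddef.h>

/* Hypothetical stand-in for a memcpy-like routine; assumes non-overlapping
   regions. The void * parameters accept any object pointer without a cast. */
void *copy_bytes(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;        /* view the destination as bytes */
    const unsigned char *s = src;   /* view the source as bytes */

    while (n--)
        *d++ = *s++;                /* n counts bytes, as the interface says */
    return dest;
}
```

The caller still passes &obj and sizeof obj directly; only the implementation ever sees unsigned char *.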
They should all really be char *, since only then can
the length have meaning. Those functions are in a sense implying char
* through the meaning of the length, but presenting void * as the
argument instead, and are therefore lying. (I realize of course that
they do this for the convenience of not having to cast to (char *)
every time you use one of those functions.)

I think "lying" overstates it. I don't dispute that the definition is
a trifle conceptually inconsistent, but it's not *that* bad.
I would argue that alternatively we could eliminate this mixed message
of void both implicitly having a size and not having a size by
changing the C standard to give void a size, i.e. one. sizeof(void)
== 1. Nothing would break -- this would just add to the things that
can be expressed, in a way that is consistent with how library
functions use void * arguments. (The void type would still not be
allowed by itself, and so function(void) would still mean no
arguments.) In fact, in the evil gcc, n = sizeof(void); sets n to 1,
and it seems to work fine.

I've seen errors that gcc failed to catch because of this extension
(sorry, I don't remember the details).

It's too late to redesign the language, but ...

I think part of the problem is that C uses char (and its signed and
unsigned variants) for three distinct things: character data, small
integers, and the smallest addressable unit of memory. Perhaps it
would have been better to have a distinct "byte" type, not compatible
with any of the character types. Then a lot of the mem*() functions
could sensibly take arguments of type byte*. Presumably you could
implicitly convert to and from byte* just as with void*.

On the other hand, the fact that you can dereference byte* and perform
pointer arithmetic on it might be a problem as well. Sometimes you
might *want* a type that's just a raw pointer, but that doesn't let
you manipulate what it points to.

So would the language have been better with both byte* and void*, or
would that have introduced too much complexity? I can imagine the
flame wars about which type should be used in which circumstances.

In any case, I don't recall anyone claiming that C is a paragon of
perfection.
 
Ian Collins

Mark said:
Now I can play the game! (Gets up on soap box ...) I assert that a
function that takes a void * pointer and a size_t length to identify a
thing makes no sense whatsoever in the C standard, since length has no
meaning with respect to the void * pointer.

Nonsense, the size is the size of the data pointed to. What the data is
is irrelevant.
Yet there are many such
functions that are also part of the C standard, such as memcpy(),
write(), etc. They should all really be char *, since only then can the
length have meaning.

How so? the length is a length in bytes. I really think you either
don't understand the concept of void*, or you are being obtuse.
Those functions are in a sense implying char *
through the meaning of the length, but presenting void * as the argument
instead, and are therefore lying. (I realize of course that they do
this for the convenience of not having to cast to (char *) every time you
use one of those functions.)

Eh?

consider

int n;
write( &n, sizeof(n) );
I would argue that alternatively we could eliminate this mixed message
of void both implicitly having a size and not having a size by changing
the C standard to give void a size, i.e. one. sizeof(void) == 1.

Nowhere does void implicitly have a size.
Nothing would break -- this would just add to the things that can be
expressed, in a way that is consistent with how library functions use
void * arguments. (The void type would still not be allowed by itself,
and so function(void) would still mean no arguments.) In fact, in the
evil gcc, n = sizeof(void); sets n to 1, and it seems to work fine.

So nonsense like:

int n;
int *p = n;

....

void* v = p;
++v;

would become legal?
 
Mark Adler

Well, write() isn't part of the C standard, but that's not really
relevant to your point.

Ok, fwrite(). fread(). mem*().
So would the language have been better with both byte* and void*, or
would that have introduced too much complexity?

I'd like that even better than sizeof(void) == 1, so long that the type
conversions to and from byte * are "free", as they are with void *, as
you suggested.

Mark
 
Mark Adler

Nonsense, the size is the size of the data pointed to.

Is it now. The size in what units? void has no size. So how do we
know the size of the units a void * points to? Is that size the number
of int's? Of struct foo's?
the length is a length in bytes.

Ah, always in bytes you say. Excellent. Now we know the implicit size
of void.
int n;
write( &n, sizeof(n) ); ....
So nonsense like:

int n;
int *p = n;

(I think you mean &n there.)
...

void* v = p;
++v;

would become legal?

Why not? How is that different from what write(&n, sizeof(n)) has to
do with the &n in your example? Inside write(), it must consider &n to
point to a sequence of bytes, not to an integer and not to a mystery.
Therefore we have fallen from the purer faith by conceding that void *v
points to a sequence of bytes. Having already sinned, ++v must mean
point to the next byte. write() could be something like:

extern void writeonebyte(void *);

void write(void *p, size_t n)
{
    while (n) {
        writeonebyte(p);
        p++;
        n--;
    }
}

That's all entirely consistent with the C standard library
interpretation of void * as a pointer to a sequence of bytes.
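
For comparison, the same loop can be written in standard C as it stands by walking an unsigned char * copy of the pointer. This is only a sketch: write_bytes, write_one_byte, and the sink buffer are hypothetical stand-ins invented for the example, not part of any library:

```c
#include <stddef.h>

/* Hypothetical sink standing in for writeonebyte(): it just records
   each byte in a small fixed buffer so the example is self-contained. */
static unsigned char sink[64];
static size_t sink_len;

static void write_one_byte(const unsigned char *p)
{
    if (sink_len < sizeof sink)
        sink[sink_len++] = *p;
}

/* Same shape as the loop above, but the increment is applied to an
   unsigned char * rather than to the void * parameter itself. */
static void write_bytes(const void *p, size_t n)
{
    const unsigned char *bp = p;

    while (n) {
        write_one_byte(bp);
        bp++;
        n--;
    }
}
```

The interface still takes void *, so callers can pass &n and sizeof(n) without a cast.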

Anyway, I like Keith's proposal better, which is to have a "byte" type
which would combine the type behavior of char (or better yet unsigned
char) with the casting behavior of void. Then functions that take
arbitrary chunks of memory would use byte * instead of void *, and
everything would make sense. void * would remain immune to arithmetic,
and would be used when you mean an obscured object as opposed to a
sequence of bytes. When you mean sequences of bytes, like for
memcpy(), you would then use byte *. But you wouldn't need to cast the
pointers to byte *, so you could in fact say memcpy(dest, &n,
sizeof(n)) as opposed to memcpy(dest, (byte *)(&n), sizeof(n)).

I guess part of the problem is that void is a bit overloaded in C,
meaning a bunch of rather different things depending on the context: no
arguments (int main(void)), doesn't return anything (void func(int)),
deliberately dispose of the return value ((void)fclose(f)), obscured
data object (void *foo), and sequence of bytes (memcpy(..., void *,
size_t)). The last two are in a spiritual conflict.

Anyway, this has been enjoyable and instructive for me, so thank you
all for your time. Some people seem to get a little ruffled in these
religious discussions, so I hope I didn't offend anybody. So as to not
waste any more of your time, I'll leave you alone and go back to
working on the next release of zlib, which now has two occurrences of
buf = (char *)buf + n.

Mark
 
Phil Carmody

Mark Adler said:
I wasn't talking about the standard. I was talking about the compilers.


You're correct. I meant dumbing down of the compilers to agree with
the religious extremism of the standard.

You mean a change of state from being wrong to being correct?
If that's 'dumbing down' to you, then your perspective is completely
distorted.
Thank you. That's what I'll do.

If you're dealing with bytes, then you should almost certainly
be using an unsigned char* rather than a void* anyway.
Of course, at the loss of the
expressiveness and readability of the wonderful += operator of C.

Not lost if you use unsigned char*.
Is void * a pointer? Yes.

Is double* a pointer? Yes.
If I add n to the register that pointer is in, does it go up by n? Yes.

If I add n to the register that double* pointer is in, does it go up by n?
Yes - but it may no longer be a valid double* pointer, and almost certainly
doesn't point to n doubles past where it previously did.

What happens when you diddle with register values is _irrelevant_.
Isn't that what you'd expect? Yes.

Are your expectations completely pulled out of nowhere? Yes.
Anyway, I get the concept that void * deliberately makes the size of
the type ambiguous, so in principle arithmetic on it should be
undefined. On the other hand, the standard could make life easier and
still entirely self-consistent if it simply said that arithmetic on
void * pointers treats the size as 1. Why make a useful expression
illegal when instead it can mean what everyone expects it to mean?

I don't expect it to mean that.
Why is it not an l-value? I can see no ambiguity whatsoever about
what it means as an l-value.

Assigning to the result of a cast is not just 'ambiguous', it's
downright meaningless. You're advocating nonsense like:

int i=42;
(double)i+=1.;
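
To see why, it can help to spell out the steps such an assignment would have to imply. In this sketch (the helper name add_via_double is hypothetical), the cast produces a temporary value of the new type, so the result has to be converted back and stored into the original object explicitly:

```c
/* What "(double)i += 1." would have to mean if written out by hand:
   convert to double, add, convert back to int. The caller then stores
   the returned value into the original int object. */
static int add_via_double(int i, double delta)
{
    double tmp = (double)i;  /* the cast yields a value, not the object i */
    tmp += delta;            /* the addition happens on the temporary */
    return (int)tmp;         /* converting back is a second conversion */
}
```

Written that way, i = add_via_double(i, 1.); makes both conversions visible, which is also what buf = (char *)buf + n does for pointers.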

Phil
 
Phil Carmody

Mark Adler said:
Is it now. The size in what units? void has no size. So how do we
know the size of the units a void * points to? Is that size the
number of int's? Of struct foo's?

The size in bytes. As the standard tells us.
Ah, always in bytes you say. Excellent. Now we know the implicit
size of void.

No we don't.

You're hallucinating.

Phil
 
Phil Carmody

Mark Adler said:
Apparently -Wall really only means -Wsome.

No, all warnings in a clearly defined set.

However, that's not your mistake - your mistake is that you've forgotten
to choose which language to use, and so GCC has defaulted to something
which isn't strictly C.

Phil
 
Dennis (Icarus)

I guess part of the problem is that void is a bit overloaded in C,

Not near as overloaded as static in C++ ;-)
meaning a bunch of rather different things depending on the context:
Anyway, this has been enjoyable and instructive for me, so thank you all
for your time. Some people seem to get a little ruffled in these
religious discussions, so I hope I didn't offend anybody. So as to not
waste any more of your time, I'll leave you alone and go back to working
on the next release of zlib, which now has two occurrences of buf = (char
*)buf + n.

No offense taken at this end. Good luck with zlib.

Dennis
 
spinoza1111

I wasn't talking about the standard.  I was talking about the compilers.


You're correct.  I meant dumbing down of the compilers to agree with
the religious extremism of the standard.


Thank you.  That's what I'll do.  Of course, at the loss of the
expressiveness and readability of the wonderful += operator of C.



Is void * a pointer?  Yes.
If I add n to the register that pointer is in, does it go up by n?  Yes.
Isn't that what you'd expect?  Yes.

Anyway, I get the concept that void * deliberately makes the size of
the type ambiguous, so in principle arithmetic on it should be
undefined.  On the other hand, the standard could make life easier and
still entirely self-consistent if it simply said that arithmetic on
void * pointers treats the size as 1.  Why make a useful expression
illegal when instead it can mean what everyone expects it to mean?



Why is it not an l-value?  I can see no ambiguity whatsoever about what
it means as an l-value.

Mark

Watch out fella, you're Daniel in the lions' den. Keep up the good
questions.
 
spinoza1111

And compilers generally follow the standard. You may check to see if there's
a mode on the compiler you're using that allows it as an extension.
Do you have an option like "disable language extensions" set?





The C language is defined by the standard. Do you want to program in C or
not?

No, from the standpoint of science (linguistics), no standard can
"define" C. This is because independent of the standard are all sorts
of C compilers with a fuzzy relation to any one formal definition of
C. Somewhat like world received English, C is what intelligent users
"say" it is. In addition, linguists also add that the language should
be the canonical user's first language, but this is impossible with C,
of course.

Now, the problem is measuring "intelligence", especially here. It
cannot be measured by "correct use of C" since "the correct use of C"
is basically "what I say it is" or "what is in the Holy Standard
[which was written to protect vendor profits and leaves too much
undefined]".

Here, intelligence is only measurable by literacy in writing English
and basic honesty (not making crude lies such as Heathfield's "Nilges
has never posted in comp.risks"). Of the proponents of "standard C"
only Ben Bacarisse passes this test. Other posters spout what I regard
as incoherent nonsense, such as the claim that one can be clear and
false at the same time, and, they defend their complete nonsense to
the last round and the last man.

The original poster here is in fact asking good questions in a
literate way. He's toast since basic literacy and honesty are so
seldom found in this newsgroup.


Yes it is, and dereferencing it will cause problems.

But if it is a pointer in truth, why would dereferencing it cause
problems, oh Divine Master? Yea this is hard for us mere mortals to
understand! For what is a pointer but that which dereferenceth?
That's assembly though - not C.

Oh divine Master, thou speakest horseshit indeed: for hath not C a
register declaration? And are we not supposed to think ahead a little
bit to what happeneth when our code runneth? Yea, verily they speak
with forked tongue who say, "behold: C is close to the Machine, and if
thou codest in C, thou shalt be one with the Machine, warm in winter
owing to the heat of its circuitry, cool in summer's heat owing to its
air conditioning", but then slay those, as Seebach has slain those,
who darest to speak of the mysteries of runtime such as the Stack, or
here the Register.
If I was programming assembly, sure, but not if I was programming in C.





Morris Keesan explains this nicely.

Thou directest him away for thou art full of shit.
 
spinoza1111

Until recently, gcc -Wall didn't complain about that.


So what about the cases where assignment *is* reasonable?  Like casting
a pointer?



I don't see why we should have to lose the += operator in this case.  
(struct foo *)buf += n looks to me like it should make perfect sense to
the compiler.

You poor bastard. Hast thou not heard? Hast thou not been told? Hast
thou not seen? Hast thou not gotten the memo? The motto of the regs
here is a song by David Byrne: Stop Making Sense.
 
spinoza1111

That depends how big a "void" is, doesn't it?


Not really.

Mark Adler: note what the guy says: he uses a negative as his major
operator. To say "it is not the case that" or "that is undefined" is
what the regs do here. That way, they don't take the risk of saying
something that can be easily falsified, since they need to avoid
humiliation at all costs.

And why do they need to avoid humiliation at all costs?

This is because most of the regs here (Seebach, Heathfield, Thompson
and Rosenau being the four horsemen) make it their business to stab
newbies in the back and shame people for not talking their talk.
 
spinoza1111

It sounds like gcc has been improved.



You find it reasonable; I don't.

A cast specifies a conversion, which takes a value as its operand and
yields a converted value of the specified type.

An lvalue designates an object.  For the result of a cast to designate
an object, there has to be some object *of the specified type* for it
to designate.  Since the operand of the cast was of a different type,
there is no such object.

Conceivably the language could specify that a cast yields an lvalue if
the source and target types both have the same representation (this
happens to be guaranteed for char* and void*).  But in my opinion that
would be ugly and inconsistent.  Would you want this:

    long l;
    (int)l = 42;

to compile if and only if int and long happen to be the same size?
Would you want this:

    int *ptr;
    (void*)ptr = NULL;

to work as "expected" on most systems, but fail to compile on systems
where int* and void* have different representations?

And would you want such a special-case rule just so you can use "+="?



But it doesn't.

--
Keith Thompson (The_Other_Keith) (e-mail address removed)  <http://www.ghoti.net/~kst>
Nokia
"We must do something.  This is something.  Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"

This reply is sickeningly dishonest. Thompson wants the user of a
weakly typed language to use it as if it were strongly typed when
there are all sorts of languages that enforce type safety.

Kiki, get this through your head. Companies force programmers to use C
precisely because at the critical moment, C can be butt fucked into
doing things pathologically, and therefore "saving time" and
preserving short-term profits.

Hard magic constants, pre-allocated fixed arrays, void pointers and
all sorts of other crap are the norm in production C code.
 
