arithmetic on a void * pointer

Kaz Kylheku

Apparently -Wall really only means -Wsome.

Hypothesis: maybe, in the early history of GCC, that's in
fact what it meant. Then maybe new warnings crept into the code
and developers realized that they don't want the -Wall option to
actually turn on some of these new ones, because among them there are
false positives: warnings about situations that don't necessarily need
to be fixed. Nobody bothered to rename -Wall to -Wsome, or
-Whouse-special-combo.
 
spinoza1111

Keith Thompson said:
[...]
I would argue that alternatively we could eliminate this mixed message
of void both implicitly having a size and not having a size by changing
the C standard to give void a size, i.e. one.  sizeof(void) == 1.
Nothing would break -- this would just add to the things that can be
expressed, in a way that is consistent with how library functions use
void * arguments.  (The void type would still not be allowed by itself,
and so function(void) would still mean no arguments.)  In fact, in the
evil gcc, n = sizeof(void); sets n to 1, and it seems to work fine.
It works, but it's dumbing things down -- it's encouraging sloppy
thinking about pointer types, and that ain't good.
it makes sense to the CPU, and in the end, it is this which is what
matters.
What matters more is what makes sense to the programmer and to the
reader of the code.  What matters is constructing a consistent model
for computation, and letting the compiler map that model onto some
particular hardware.
If we only cared about what makes sense to the CPU, we wouldn't have
more than a handful of types, all defined by the system, and portable
code would be a pipe dream.
granted, but these are still secondary, since if the CPU could not
understand the code or the data, the program would not run.
no matter how elegant the conceptual model, it would be pointless to have SW
which doesn't run...

Sure, but the particular abstractions we're talking about, such as
having void* be a raw pointer type that doesn't let you access what it
points to without first converting the pointer, don't prevent the
software from running.  Implementers and programmers are entirely
capable of writing working software using the abstractions defined by
the C language.


the CPU is inescapable, as noted by how people may still end up needing to
fall back to using assembler in many situations, and just the same, it is
sometimes / often needed to make use of these low-level details of how data
is represented on particular HW to have much of any real hope of making the
app work effectively...

I don't think that's as common as you imply.  I don't remember the
last time I needed to resort to assembler.  (For a lot of my own
programming, I don't even resort to C, but that's another story.)
universal portability for non-trivial apps IS a "pipe dream", apart from the
fact that luckily most HW tends to represent most things in similar enough
ways that one can gloss over the details in the majority of cases (and fall
back to good old "#ifdef" for most of the rest...).
and so, abstraction is a tower built on top of the HW, and not the other way
around.

The abstraction level provided by standard C seems to be just about
right for a lot of purposes, and I don't find that it gets in the way
of writing efficient code in most cases.  And in cases where it does,
you can often write non-portable code, making additional
implementation-specific assumptions, without leaving the language.

I really wouldn't want to make C any lower level than it already is.
none of this really changes the end case, that in the end it is the CPU
which matters...
if what were important were appeasing the mind of the reader of the code,
then people would be authors, not programmers, and the end result would be a
novella, rather than a codebase...

For the software I work on, I spend a lot more time maintaining it
than running it.  It's not a work of literature, but clarity and
legibility are vitally important if I'm going to finish the job
quickly enough for performance on the CPU to matter.

Programmerese and boilerplate. Since 1970, EVERY programmer has made
this claim. It starts with the disclaimer that OF COURSE it isn't
"literature" or any girlie stuff (why not) and rolls on to the claim
that the speaker is terribly interested in writing clear and legible
code. In making the claim, the writer or speaker often uses English in
a way that shows he hasn't fully mastered that language, the mastery
of which was thought by Dijkstra to be a prerequisite for mastering
programming.

Here, for example, Kiki says he writes "legible" code. But "legible"
means readable hand-writing! The last time it mattered that one writes
"legibly" was when we prepared Cobol on green and white coding sheets
for the lovely ladies of the keypunch room.

This is a subtle error, like claiming that Schildt is "clear but
false", but it's noticed by the careful reader.

And, just as we start with the necessary disclaimer that one is not
being so presumptuous and "disruptive" as to write Great Literature,
one ends with the absolute value of time to market as linked under the
rose with the time value of money, which rules our lives so savagely.

This sort of language paces out a space within a prison cell
constructed by economic relations which in the USA could have been
questioned long ago and today are in the process of collapse. The
result is the twisted lower middle class resentment that believes that
it, the lower middle class subject, has been virtuous for naught,
grinding away writing clear and legible code whilst elsewhere welfare
bums and dancing trolls chortle at him amongst the burning tyres.
 
Nobody

I don't see why we should have to lose the += operator in this case.

This isn't portable (see below), but it's not an error:

*(struct foo **)&buf += n;
(struct foo *)buf += n looks to me like it should make perfect sense to
the compiler.

So how should the compiler interpret:

int x;
(float)x = 7.0;

If you can see the problem with that, consider what happens when a void*
uses a different representation to a "struct foo *" (this is why the above
alternative isn't portable).

Assignment only makes sense for lvalues, casts only make sense for
expressions.
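
A minimal sketch of a portable alternative to the cast-as-lvalue form: convert the void * to a value of the target pointer type, do the arithmetic on that value, and store the result back. The struct layout and the helper name advance_foo are made up for illustration; only "struct foo" comes from the discussion above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical element type standing in for the thread's "struct foo". */
struct foo {
    int id;
    double value;
};

/*
 * Advance a generic pointer by n elements of struct foo.
 * No cast is used as an lvalue: the void * is converted to a
 * struct foo * value, the addition happens on that value, and the
 * result is converted back to void * on return.
 */
void *advance_foo(void *buf, size_t n)
{
    assert(buf != NULL);
    struct foo *p = buf;  /* implicit conversion from void * */
    return p + n;         /* steps in units of sizeof(struct foo) */
}
```

With this, the intent of "(struct foo *)buf += n" would be spelled buf = advance_foo(buf, n); and it does not depend on void * and struct foo * sharing a representation.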
 
Nick

Mark Adler said:
For example ...?

I'd strongly suspect on the C90 vintage Crays for a start. On these
machines everything /except/ char was 8 bytes long (so to answer
someone's question in another thread from a couple of days ago "how do I
define a 64 bit integer" the answer was simple - 'short'). Pointers to
char were different, as they had to point within an 8 byte chunk. Since
a void * has to be able to hold a char *, I'd expect a void * to look
like a char * and /not/ like an int.

Of course, the system could have somehow encoded in the void * whether
it was "really" a char * or any other sort. In that case void * wouldn't
look like anything else, and it could be argued that adding 1 to a void
* on such a system should move it forward 8 bytes for an int and 1 for
char - which just goes to show how arithmetic on void * doesn't make
sense.

I think the OP is confused. If you want a generic memory pointer, use a
char * (better, an unsigned char *). Before void came along we used
those for both generic memory pointers and for pointers to anything that
we will revert to a proper type later. Adding void * replaced char *
for the second of these, but not for the first - if you do that you have
no problems.
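
As a rough sketch of that advice, byte-wise stepping through memory can go through unsigned char * while the interface still takes and returns void *. The helper name byte_offset is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Step a generic pointer forward by a byte count.
 * The arithmetic is done on unsigned char *, where the element size
 * is 1 by definition, so no extension for void * arithmetic is needed.
 */
void *byte_offset(void *base, size_t nbytes)
{
    assert(base != NULL);
    unsigned char *bytes = base;  /* generic memory pointer */
    return bytes + nbytes;        /* converted back to void * on return */
}
```

For an array int a[4], byte_offset(a, sizeof a[0]) yields a pointer to a[1].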
 
Ben Bacarisse

Nick said:
I'd strongly suspect on the C90 vintage Crays for a start.

Yes, any word-addressed machine has this feature. C's parentage is
from languages that are word-oriented (B and BCPL) because such
machines were very common at one time.

I doubt they will become common again any time soon. The engineering
motivation was probably to increase the memory capacity with fewer
expensive address lines and that is not a strong motivator anymore.
Even if there is a reason to favour that design again, they will not
be popular because so much C code will break on them. C's rules are
designed to cope with word addressing but not all programs stick to
the rules.

<snip>
 
Kenny McCormack

And you are the prototype for it, and the prototypical instance.

Sure. The strategy being to say it's all a joke AFTER you've destroyed
people's reputations. They is just good old boys. They's just having
fun.

Indeed. So well put. They's just good ole boys...
 
Kenny McCormack

spinoza1111 said:
Programmerese and boilerplate. Since 1970, EVERY programmer has made
this claim. It starts with the disclaimer that OF COURSE it isn't
"literature" or any girlie stuff (why not) and rolls on to the claim
that the speaker is terribly interested in writing clear and legible
code. In making the claim, the writer or speaker often uses English in
a way that shows he hasn't fully mastered that language, the mastery
of which was thought by Dijkstra to be a prerequisite for mastering
programming.

Excellent. Of course no one else here has any clue as to what you are
talking about (pearls before swine to the max!), but you have got it so
dead to rights.

Especially the bit about "girlie" stuff. As I've noted before, the ethic
of this group is so tightly bound to the prototype of an uneducated (I
don't need no stinkin' college degree!) but manly (no girlie stuff for
me!) programming stud. And, as you've noted elsewhere, they share the
idea that they don't wanna go into management, because going into
management means the end of their eternal youth.

....
This sort of language paces out a space within a prison cell
constructed by economic relations which in the USA could have been
questioned long ago and today are in the process of collapse. The
result is the twisted lower middle class resentment that believes that
it, the lower middle class subject, has been virtuous for naught,
grinding away writing clear and legible code whilst elsewhere welfare
bums and dancing trolls chortle at him amongst the burning tyres.

Yup. I'll bet a lot of the group members here wish they could just get
out of programming and make a fortune doing some scam. The scams are
all around us.
 
Ben Bacarisse

Francis Glassborow said:
I understood the dsps use 32 bit words and sometimes C for these
actually uses a 32-bit char. But perhaps I misunderstood.

So I understand, but that is then a byte-addressed machine. There
would be no need to have different representation for, say, char * and
int *.
 
Ben Bacarisse

Francis Glassborow said:
Yes, where they take the option of using 32-bit chars but what if the
implementation chooses not to? The point I am making is that there is
hardware around today that is word orientated.

Agreed. I don't know, but I suspect none of them do that simply
because DSPs are always brought up as examples where sizeof(int) == 1
rather than examples of differing pointer representations.

You are right that this may well be the area where the issue of
pointer representations comes up again in the future.
 
Keith Thompson

Ben Bacarisse said:
So I understand, but that is then a byte-addressed machine. There
would be no need to have different representation for, say, char * and
int *.

Right. The Cray T90 I worked on ran a flavor of Unix, so it really
had to have 8-bit bytes. (Earlier Crays ran a non-Unix OS; I don't
know how their C implementations behaved.)

DSPs, on the other hand, generally have no need to handle
byte-oriented data or deal with 8-bit text files.

There would have been some advantages to having CHAR_BIT==64 on the
T90, but this was outweighed by the need for Unix compatibility and
communication with other systems. Most programs that ran on it
wouldn't have done much character processing anyway.
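
For anyone who wants to check what their own implementation does, here is a small sketch that reports the byte width and the storage size of int in bits. The function names are made up; CHAR_BIT comes from <limits.h>, and sizeof counts in units of char.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Width of a byte (a char) in bits on this implementation. */
int bits_per_byte(void)
{
    assert(CHAR_BIT >= 8);  /* the standard guarantees at least 8 */
    return CHAR_BIT;
}

/*
 * Storage bits occupied by an int. On an implementation with wide
 * chars, sizeof(int) can be 1 while this value is still 32 or more.
 */
size_t storage_bits_of_int(void)
{
    return sizeof(int) * CHAR_BIT;
}
```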
 
Nick

Keith Thompson said:
Right. The Cray T90 I worked on ran a flavor of Unix, so it really
had to have 8-bit bytes. (Earlier Crays ran a non-Unix OS; I don't
know how their C implementations behaved.)

Entertainingly, and well pre-standard (in fact it was moving to a more
standard version that did it) one compiler change managed to reverse
the order of bit addressing inside words.
 
Eric Sosman

Yes you did. Now please stop.
Don't repeat yourself, you are not important. Please disappear.
Get used to it.

He will not disappear as long as people keep paying
attention to him. Hint.
 
Moi

He will not disappear as long as people keep paying
attention to him. Hint.

I fully realize that.
I would only wish that RH && Seebs would do the same.

I rest my case.

AvK
 
Keith Thompson

Moi said:
I fully realize that.
I would only wish that RH && Seebs would do the same.

I rest my case.

What case is that? That since others feed the troll, it's ok for
you to do so?
 
Alan Curry

So how should the compiler interpret:

int x;
(float)x = 7.0;

You can find out what gcc would do with that by reading the "Extensions to
the C Language Family" section of the manual. Here's the section on lvalues
extensions:

http://gcc.gnu.org/onlinedocs/gcc-3.4.6/gcc/Lvalues.html

You have to look in an old version like 3.4.6 because the extension is gone
in newer versions. Even in 3.x if you use it you get a warning (mandatory
warning, no -W options necessary).

Not exactly useful when pointers aren't involved, since it just converts the
value back and forth and ends up being the same as x=7.0; but it's not like
this is a feature just recently dreamed up by Mark Adler so he could have
something to complain about. It has been thought through. More than that, it
was documented and supported for a while before being deprecated and removed.

Once you decide to allow casts as lvalues, it's pretty obvious what
i=*((int*)p)++ should mean, and it really couldn't be written any more
elegantly. This is far more useful than arithmetic on a void * treating
sizeof(void) as 1. The deprecation fairy hit the wrong one.
 
Andrew Poelstra

Once you decide to allow casts as lvalues, it's pretty obvious what
i=*((int*)p)++ should mean, and it really couldn't be written any more
elegantly. This is far more useful than arithmetic on a void * treating
sizeof(void) as 1. The deprecation fairy hit the wrong one.

I think you could write:

i = *((int *)p);
p += (sizeof *p) / sizeof(int);

This won't work if sizeof *p is not a multiple of sizeof(int), but in
that case I can't imagine what the original code would do. Probably
segfault on most machines I use.
 
Ben Bacarisse

Andrew Poelstra said:
I think you could write:

i = *((int *)p);
p += (sizeof *p) / sizeof(int);

This won't work if sizeof *p is not a multiple of sizeof(int), but in
that case I can't imagine what the original code would do. Probably
segfault on most machines I use.

I don't agree with the second statement. If p is void * (as I think
is likely given the context) then arithmetic on it is not permitted at
all. The quoted example can be written:

i = *(int *)p;
p = (int *)p + 1;

The slightly different i = *++(int *)p; can be written without
repeating the cast:

i = *(p = (int *)p + 1);
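
A sketch of both orderings as small helpers that walk a void * cursor over a run of ints, without casts as lvalues and without arithmetic on void *. The helper names are hypothetical. One assumption to flag: with p declared as void *, the value of an assignment to p also has type void *, so the pre-increment helper converts to int * again before dereferencing.

```c
#include <assert.h>
#include <stddef.h>

/* Read the int at the cursor, then move the cursor past it
 * (the effect intended by i = *((int *)p)++). */
int read_int_post(void **cursor)
{
    assert(cursor != NULL && *cursor != NULL);
    void *p = *cursor;
    int i = *(int *)p;   /* read through a converted pointer value */
    p = (int *)p + 1;    /* arithmetic on int *, stored back as void * */
    *cursor = p;
    return i;
}

/* Move the cursor to the next int first, then read it
 * (the effect intended by i = *++(int *)p). */
int read_int_pre(void **cursor)
{
    assert(cursor != NULL && *cursor != NULL);
    void *p = *cursor;
    p = (int *)p + 1;
    *cursor = p;
    return *(int *)p;    /* p is void *, so convert before dereferencing */
}
```

Both helpers assume the cursor points into an array of int and that the element being read lies inside that array.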
 
Nick

Joe Wright said:
Surely 'byte' addressing. It is the hardware which determines
endianness, not the compiler.

It's 20 years ago, and my memory is fallible, but it certainly involved
bits. I'm trying to work out what this meant (as it's clearly silly to
think that the MSB & 1 could ever be true) but I do remember having
to change the direction of bit-shift operators to fix things.
 
spinoza1111

What case is that?  That since others feed the troll, it's ok for

I am not a troll. A troll posts in bad faith to "get a rise" out of
people. It is not a person with a different point of view, who used C
heavily enough to be asked to assist Nash. You use the word because
you're a Nordic racist: it referred to the peoples of north and
western Europe whose culture was destroyed by invaders.
 
