Neatest way to get the end pointer?

Tomás Ó hÉilidhe · Feb 7, 2008

user923005:

That void pointers have no stride, and yet you can do this:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
void *p[5][5];
p[0][0] = NULL;
return 0;
}

"Have no stride"? As in "have no gait"? As in "don't have a particular way
of walking"?

Is this your way of saying you can't do pointer arithmetic on them?

Tomás Ó hÉilidhe · Feb 7, 2008

user923005:

void *p[5][5];
p[0][0] = NULL;

This is equal to:

*( *(p + 0) + 0 )

1) p is a void(*)[5][5], which decays to a void (**)[5]
2) p + 0 is a void (**)[5]
3) *(p + 0) is a void (*)[5], which decays to a void **
4) *(p + 0) + 0 is a void **
5) *( *(p + 0) + 0 ) is a void *

At no point is pointer arithmetic performed on a void*.

Tomás Ó hÉilidhe · Feb 7, 2008

Tomás Ó hÉilidhe:

This is equal to:

*( *(p + 0) + 0 )

1) p is a void(*)[5][5], which decays to a void (**)[5]
2) p + 0 is a void (**)[5]
3) *(p + 0) is a void (*)[5], which decays to a void **
4) *(p + 0) + 0 is a void **
5) *( *(p + 0) + 0 ) is a void *

Wups a daisy I got that wrong:

1) p is a void *[5][5], which decays to a void *(*)[5]
2) p + 0 is a void *(*)[5]
3) *(p + 0) is a void *[5], which decays to a void **
4) *(p + 0) + 0 is a void **
5) *( *(p + 0) + 0 ) is a void *

Therefore we could have:

void *p[5][5];

void *(*a)[5] = p + 0;

void **b = *(p + 0);

void **c = *(p + 0) + 0;

void *d = *(*(p + 0) + 0);

There's no arithmetic performed on a void*.

Malcolm McLean · Feb 7, 2008

Walter Roberson said:
Sounds to me like you have not considered Kolmogorov complexity in
this matter. Some functions are inherently complex, and cannot be
written smally except each in some language tailor-made to express
that one function smally.

Kolmogorov complexity is the complexity of A given B as the input. In this
case B is the operating system / platform / whatever. B is deliberately
designed to reduce the Kolmogorov complexity of as many functions as
possible. So it's not a good measure. Sometimes printf() is one call to a
library outside of the program, sometimes an expansion of the format string
followed by a call for each character, sometimes it decodes glyphs and
accesses the screen raster.

Invoking a routine is a form of manipulating memory.

If you want to have textual case distinguish object size, then
you need to be consistent and have large functions distinguished by
uppercase, as reminders that they are space-expensive to invoke.

They take memory space to link and memory space to invoke. You're running
the two things together here.
We could have a leading / trailing _M system.
_M_Mprintf_M
indicates that it is relatively hefty to link in, but only moderately
memory-greedy to invoke.

_M_M_M_M_malloc()

however is extremely memory heavy to link in, since it uses a huge
allocation pool, but the overhead on invocation is fairly tiny.

printf() is a good traditional example: an integer-only program
that invokes printf() must be linked with the floating point library
(if there is a separate floating point library) because printf()
needs floating point linked in case the user specified a floating
point formatting element. It was not uncommon in the earlier days
for an integer-only program to be a fairly small number of Kb but
to kick up by several hundred Kb because printf() was referenced.

That is a danger for embedded programs.

pete · Feb 7, 2008

Tomás Ó hÉilidhe wrote:

The uppercase letters in a macro's name are the wasp's stripes.
They're there to make you think
"Oh Christ I better not pass an argument whose
evaluation has side effects". Thankfully, this convention is well
propagated throughout the "C community".

Using all uppercase letters for types is useless...
if anything it just introduces a
"boy who called wolf" situation for when
you come across a situation where you actually need to be warned.

That's what I think too.

Army1987 · Feb 7, 2008

Malcolm said:
It makes it harder to read, in my opinion.
"null" should be a keyword and ptr = 0 should be illegal. However that
raises issues of old code that uses memset or calloc to initialise pointers
to null.

What the hell? memset sets the representation of pointers to all bits
zero, and it could be the valid representation of a null pointer even if
ptr = 0 was illegal. (Now, the vice versa applies: ptr = 0 is legal, but
the bytes at (unsigned char *)&ptr needn't be all zeroes.)
It raises the issues of old and less old code which use 0 as a null
pointer constant.

"NULL" shouts, and gives the null pointer an emphasis it normally should not
have.

???

Old Wolf · Feb 8, 2008

*(&my_array+1)

I'd like to hear what people think in regard to there being UB when I
dereference at the end. Strictly speaking, there probably is, but
realistically speaking, I think it's another thing to be added to the list
of "UB" that we ignore.

Who is "we" ? I don't ignore any UB. I'm not keen on
my application suddenly going haywire.

Like for instance, here's a UB that I ignore:

union { int s; unsigned us; } x;
x.s = -27;
printf("%u",x.us); /* Let's see what the bit pattern is */

There is no UB here. Memory is explicitly permitted to be
aliased by the corresponding 'signed' or 'unsigned' version
of the effective type, and there are no trap representations
for unsigned.

Old Wolf · Feb 8, 2008

user923005 said:

That void pointers have no stride, and yet you can do this:

void *p[5][5];
p[0][0] = NULL;
return 0;

Sorry, I don't see what's confusing about that, other than the
potentially misleading statement that "void pointers have no stride".

p is a two-dimensional array of elements, where the size of each
element (the stride, if you like) is typically something like 4 or 8
bytes.

By 'stride', I think he means sizeof(T), where T is a
complete type of course. However, this has absolutely
nothing to do with declaring an array of T* . Probably
the OP is confused about the distinction between
arrays and pointers.

Keith Thompson · Feb 8, 2008

Old Wolf said:
There is no UB here. Memory is explicitly permitted to be
aliased by the corresponding 'signed' or 'unsigned' version
of the effective type, and there are no trap representations
for unsigned.

It's *almost* explicitly permitted; the most explicit statement is in
a footnote.

C99 6.2.5p9:

The range of nonnegative values of a signed integer type is a
subrange of the corresponding unsigned integer type, and the
representation of the same value in each type is the same.

with a footnote:

The same representation and alignment requirements are meant to
imply interchangeability as arguments to functions, return values
from functions, and members of unions.

It's not as unambiguous a statement as I'd like.

ymuntyan · Feb 8, 2008

It's *almost* explicitly permitted; the most explicit statement is in
a footnote.

C99 6.2.5p9:

The range of nonnegative values of a signed integer type is a
subrange of the corresponding unsigned integer type, and the
representation of the same value in each type is the same.

with a footnote:

The same representation and alignment requirements are meant to
imply interchangeability as arguments to functions, return values
from functions, and members of unions.

Are padding bits (and hence trap representation) really prohibited
in unsigned int? If not, then the following would break type punning
here:

int is [4 "same-as-sign" bits] [sign bit] [27 value bits]
unsigned is [4 padding bits] [ 28 value bits]

where four "same-as-sign" bits in an int object are set
or not set together with the sign bit; and in unsigned
the four padding bits are never set. Then if you store
a negative number in the int member of the union, and
access it as unsigned, you get the four padding bits set,
a trap representation.

Yevgen

Herbert Rosenau · Feb 8, 2008

Malcolm said:

It makes it harder to read, in my opinion.
"null" should be a keyword and ptr = 0 should be illegal. However that
raises issues of old code that uses memset or calloc to initialise pointers
to null.

What the hell? memset sets the representation of pointers to all bits
zero, and it could be the valid representation of a null pointer even if
ptr = 0 was illegal. (Now, the vice versa applies: ptr = 0 is legal, but
the bytes at (unsigned char *)&ptr needn't be all zeroes.)
It raises the issues of old and less old code which use 0 as a null
pointer constant.

<type> *p = NULL; /* legal */
<type> *p = 0; /* legal */
<type> *p = '\0'; /* legal */

memset(p, 0, n); /* illegal! when p is misused to overwrite values not
of type char */

There is absolutely no guarantee that all bits 0 is a legal value for
every type other than char, pointers included. All bits 0 does NOT mean
that it is a null pointer. An address (pointer) is made up of 2 parts:
- address bits
- padding bits

So memset on anything except char arrays always has a good chance of
producing illegal padding bits.

???

--
Tschau/Bye
Herbert

Visit http://www.ecomstation.de the home of German eComStation
eComStation 1.2R in German is here!

Peter Nilsson · Feb 9, 2008

Keith Thompson said:
Peter Nilsson said:

[email protected] said:

...
int *p = my_array;
int const *const pend = my_array + sizeof my_array / sizeof *my_array;
do *p++ = 42;
while (pend != p);

At no stage is pend dereferenced; and the loop exits
on p == pend, so the address is not dereferenced by p
either.

Ah, I was a bit confused I guess.
However, OP _does_ dereference it. Not in that snippet,
but in 4)

4) *(&my_array+ 1) decays to the address of the first
element in the nonexistent array after the current
one, which is also the "pend" address for the array
that actually exists.

Again, it is not dereferenced...

The type of my_array is int[]
The type of &my_array is int (*)[]
The type of &my_array + 1 is int (*)[]
The type of *(&my_array + 1) is int []

The last expression will decay to an int * when used in
the assignment to pend. At no stage is that pointer
dereferenced.

I'm not convinced (which is not to say that you're wrong).

The declaration was

int my_array[X];

so (&my_array + 1) is of type int (*)[X] (let's assume X
is constant; VLAs make my head hurt).

Then *(&my_array + 1) is of type int[X].

Yes. That is the only consequence of the unary * indirection
under 6.5.3.2p4. [The second sentence is not applicable
since neither condition applies. The third is not applicable
since &my_array + 1 is a valid pointer.]

This is an expression of array type, and it's the value
of an array object that does not exist.

What value?

It then immediately decays to a value of type int*,
pointing just past the end of my_array.

Yes. That's given by 6.3.2.1p3.

But I think that
the intermediate expression *before* the array-to-pointer
conversion invokes UB.
How?

(Though I'd be surprised if any implementation did
anything other than what was intended; after all, there's
no need to load the value of the nonexistent array object.)

Where does the standard allow the 'loading' of this
nonexistent array? 6.3.2.1p2 specifically excludes an array
lvalue from becoming a value.

Harald van Dĳk · Feb 9, 2008

Keith Thompson said:

I'm not convinced (which is not to say that you're wrong).

The declaration was

int my_array[X];

so (&my_array + 1) is of type int (*)[X] (let's assume X is constant;
VLAs make my head hurt).

Then *(&my_array + 1) is of type int[X].

Yes. That is the only consequence of the unary * indirection under
6.5.3.2p4. [The second sentence is not applicable since neither
condition applies. The third is not applicable since &my_array + 1 is
a valid pointer.]

The third may be applicable. The footnote attached to it clarifies that
"invalid value" should be read as "value that is invalid for
dereferencing", and that otherwise perfectly valid pointer values can be
invalid for dereferencing. I don't know where it's stated which pointer
values are valid and which aren't, though.

Old Wolf · Feb 9, 2008

It's *almost* explicitly permitted; the most explicit statement is in
a footnote.

C99 6.2.5p9:

The range of nonnegative values of a signed integer type is a
subrange of the corresponding unsigned integer type, and the
representation of the same value in each type is the same.

with a footnote:

The same representation and alignment requirements are meant to
imply interchangeability as arguments to functions, return values
from functions, and members of unions.

It's not as unambiguous a statement as I'd like.

On re-reading I think perhaps the behaviour could
be undefined; it's only unsigned char that has no
trap representations; and the sign bit (for example)
in the representation of -27 could be a padding bit
which causes a trap rep in the unsigned type. The
footnote you quote seems only to apply to the case
of non-negative values.

Harald van Dĳk · Feb 9, 2008

Old Wolf said:

There is no UB here. Memory is explicitly permitted to be aliased by
the corresponding 'signed' or 'unsigned' version of the effective type,
and there are no trap representations for unsigned.

It's *almost* explicitly permitted; [...]

The aliasing rules (6.5p7) explicitly allow access of signed integers as
unsigned, and vice versa. There may be other reasons why a specific
attempt to access a signed int as unsigned int fails, such as the
possibility of trap representations pointed out in later messages, but
Old Wolf was correct in claiming that the aliasing itself is permitted.

Ben Bacarisse · Feb 9, 2008

Harald van Dĳk said:
Keith Thompson said:

I'm not convinced (which is not to say that you're wrong).

The declaration was

int my_array[X];

so (&my_array + 1) is of type int (*)[X] (let's assume X is constant;
VLAs make my head hurt).

Then *(&my_array + 1) is of type int[X].

Yes. That is the only consequence of the unary * indirection under
6.5.3.2p4. [The second sentence is not applicable since neither
condition applies. The third is not applicable since &my_array + 1 is
a valid pointer.]

The third may be applicable. The footnote attached to it clarifies that
"invalid value" should be read as "value that is invalid for
dereferencing", and that otherwise perfectly valid pointer values can be
invalid for dereferencing. I don't know where it's stated which pointer
values are valid and which aren't, though.

I thought this was a case where the last sentence of 6.5.6 p8 applies:
"If the result points one past the last element of the array object,
it shall not be used as the operand of a unary * operator that is
evaluated.". The "is evaluated" seems, to me, to be reserved for the &*E
case where the validity of E is not significant. Surely the * "is
evaluated" in this case even though a decay to a pointer is the
immediate result?

pete · Feb 9, 2008

Malcolm said:
Yes. Macros like clamp(), lerp(), uniform(),
and so on are sufficiently
function-like that it makes sense to use the same
typography as functions.
You can make a legitimate case that caller should be warned about the
side-effect of passing an increment or += to the macro.
Nothing is black and
white. However Nintendo's N64 graphics pipeline macros,
which had to take an
argument of the form ptr++
because they sometimes expanded to two or more
pipeline commands, were made to look like function calls.

This is also the convention set by the standard library.

Structures, however, should shout, because they are big.
Hence FILE or JPEG
might refer to very significant memory objects.

So lower case for portable macros,
mixed case for macros that depend on
something other than the standard library, as with functions, caps for
structure typedefs.

I prefer Indian Hill style naming conventions.

http://www.psgd.org/paul/docs/cstyle/cstyle11.htm

Tomás Ó hÉilidhe · Feb 9, 2008

Old Wolf:

There is no UB here. Memory is explicitly permitted to be
aliased by the corresponding 'signed' or 'unsigned' version
of the effective type, and there are no trap representations
for unsigned.

I was referring to the rule whereby you can't write to union member A
and then go on to read from union member B.

Chris Thomasson · Feb 9, 2008

Tomás Ó hÉilidhe said:
Old Wolf:

I was referring to the rule whereby you can't write to union member A
and then go on to read from union member B.

I have always wondered why that rule exists. For instance:
____________________________________________________________
#include <stdio.h>

typedef char static_assert[
sizeof(long int) == sizeof(int) ? 1 : -1
];

typedef union foo_u {
long int a;
int b;
} foo;

int main(void) {
foo f = { 0 };
f.a = 1;
printf("%d\n", f.b);
return 0;
}

____________________________________________________________

I find it a bit hard to come up with a scenario in which the output
would not be: 1

Of course I have to be wrong here because according to the standard, the
output "could" be: 666.

Right?

Malcolm McLean · Feb 9, 2008

Chris Thomasson said:
#include <stdio.h>

typedef char static_assert[
sizeof(long int) == sizeof(int) ? 1 : -1
];

Is this dodge legal?
