Does the standard say anything about pointer->int or int->ptr conversions?

Peter

It's not that uncommon to see code where an integral type is cast to a pointer or vice versa. On this page:

https://computing.llnl.gov/tutorials/pthreads/

you can see a few examples of casting void* to long, long to void* etc. Does the standard address such conversions in any way other than calling them "undefined behaviour"? If not, then why do people use them? Are they well defined for a particular compiler/hardware?
 
Ian Collins

Peter wrote:

{please clean up the mess google will inevitably make of your replies!}
It's not that uncommon to see code where integral type is cast to a
pointer or vice versa. On this page:

https://computing.llnl.gov/tutorials/pthreads/

you can see a few examples of casting void* to long, long to void*
etc. Does standard address such conversions in any way other than
calling them "undefined behaviour"? If not then why do people use
them? Are they well defined for a particular compiler/hardware?

Old (C) habits die hard!

On most (if not all?) current 64 bit systems, a pointer is a quart to an
int's pint pot. In other words, it won't fit. A long is more likely to
be the same size as a pointer, but that's not guaranteed.
 
Öö Tiib

It's not that uncommon to see code where integral type is cast to a
pointer or vice versa. On this page:

https://computing.llnl.gov/tutorials/pthreads/

you can see a few examples of casting void* to long, long to void* etc.
Does standard address such conversions in any way other than calling
them "undefined behaviour"? If not then why do people use them? Are
they well defined for a particular compiler/hardware?

There are optional typedefs 'intptr_t' and 'uintptr_t' in <cstdint> of
C++ for that purpose. Each is an integer type capable of holding a value
converted from a void pointer, such that converting it back to 'void*'
yields a value that compares equal to the original pointer.

So if 'intptr_t' happens to be present and happens to be an alias of
'long' then casting 'void*' to 'long' and back to 'void*' is well-defined.
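
As a rough illustration of the round trip described above, here is a minimal C sketch. It assumes the implementation provides the optional uintptr_t (detected through UINTPTR_MAX), and the helper name survives_round_trip is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifndef UINTPTR_MAX
#error "uintptr_t is optional and this implementation does not provide it"
#endif

/* Hypothetical helper: returns 1 if p survives void* -> uintptr_t -> void*. */
static int survives_round_trip(void *p)
{
    uintptr_t as_integer = (uintptr_t)p;   /* pointer to integer */
    void *back = (void *)as_integer;       /* integer back to pointer */
    return back == p;
}
```

Calling survives_round_trip(&some_object) should return 1 wherever uintptr_t exists. The integer value in between is implementation-defined, so only the comparison after converting back is meaningful.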
 
Ian Collins

Scott said:
but uintptr_t is guaranteed to be the same size as a pointer.

Correct, but there is still a pile of old crufty code that predates C99
waiting for the chance to fail in unexpected but dramatic ways!
 
Jorgen Grahn

It's not that uncommon to see code where integral type is cast to a
pointer or vice versa. On this page:

https://computing.llnl.gov/tutorials/pthreads/

you can see a few examples of casting void* to long, long to void*
etc.

Side note: if the examples are in C (which they appear to be) most of
the casts are unnecessary. The author even casts the return value
from malloc() for some reason. I'd be a bit careful about relying on
that tutorial.

(Adding such casts just so C code can compile as C++ may be
justifiable in some situations, but then you'd better say that's what
you're doing.)

/Jorgen
 
James Kanze

Peter wrote:
Old (C) habits die hard!

Old (C) interfaces have a long life too. I've not looked at the
example he cited, but since it involves pthreads, I rather
suspect the casting is to pass an int as argument to the
callback (which formally takes a void*). It's one case where
I'll cast from int to void* and back. The alternative is to
pass the address of an int, and ensure that it will still be
accessible when the new thread gets around to reading it.

On most (if not all?) current 64 bit systems, a pointer is a quart to an
int's pint pot. In other words, it won't fit. A long is more likely to
be the same size as a pointer, but that's not guaranteed.

Note that in this case, you're starting with an int (and often,
with an int known to have a very small value). If void* is
larger than an int, that's no problem. (If int is larger than
a void*, it could be, if you needed all of the range of the
int.)
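
A minimal sketch of that idiom, assuming intptr_t is available. The callback worker has the signature pthread_create() expects, but it is called directly here so the sketch does not depend on a thread library. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical callback with the void *(*)(void *) signature of a
   pthread start routine. */
static void *worker(void *arg)
{
    /* Recover the small integer that was carried in the pointer value. */
    int id = (int)(intptr_t)arg;

    /* Hand a result back the same way, purely to make the effect visible. */
    return (void *)(intptr_t)(id * 2);
}

/* Hypothetical helper: packs an int into the void* argument and unpacks
   the void* result. The conversions are implementation-defined. */
static int run_worker_with_id(int id)
{
    void *result = worker((void *)(intptr_t)id);
    return (int)(intptr_t)result;
}
```

Going through intptr_t rather than casting int to void* directly avoids the size-mismatch warnings many compilers emit. Nothing is ever dereferenced, so there is no lifetime to manage for the argument.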
 
Ian Collins

Scott said:
Because you don't have to, doesn't mean you shouldn't. An explicit
cast of the return value from malloc(3c) in a C program acts as additional
documentation.

Casting the return of malloc in C is generally frowned upon because it
is both unnecessary and has the potential to mask errors. Like any form
of superfluous "documentation" it also has to be maintained.
 
Ian Collins

Scott said:
It can also _find_ errors (i.e. the type of the lvalue has
changed - implicit cast wouldn't flag that, but explicit cast
will).

If you stick to the canonical form in "C" of

T* p = malloc(sizeof *p);

the expression is immune to changes in the type T. In addition, if T
changes, you don't have to hunt down the casts and change them.

I'd be interested in any case where it
masks error - that hasn't been my experience.

It was more of an issue before C99, when function declarations weren't
required. If <stdlib.h> had been forgotten, malloc would be implicitly
declared as returning int, and the cast would silence the diagnostic
that would otherwise have flagged the mistake.
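
For what it's worth, a small sketch of the canonical form in use. The struct point type and the point_new function are invented for illustration only.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record type; adding or changing members needs no edits
   to the allocation below. */
struct point {
    double x;
    double y;
};

/* Allocates and initialises one point; returns NULL if malloc fails. */
static struct point *point_new(double x, double y)
{
    /* No cast, and the size is taken from the pointee of p itself. */
    struct point *p = malloc(sizeof *p);
    if (p != NULL) {
        p->x = x;
        p->y = y;
    }
    return p;
}
```

If the type of p is later changed, the malloc line stays correct without being touched, which is the point being made about sizeof *p.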
 
Jorgen Grahn

Because you don't have to, doesn't mean you shouldn't. An explicit
cast of the return value from malloc(3c) in a C program acts as additional
documentation. Kinda like using {} for a single statement, style more
than substance (albeit _useful_ style, particularly for old card
programmers who didn't like repunching cards when patching source decks).

Although I don't like or use that style myself, my main complaint was
the part you snipped: that it was introduced without comment in a
tutorial.

/Jorgen
 
James Kanze

Yeah, it's a way to get two interfaces at the price of one, plus some
ugly casting:
  1. passing an int to the thread by value: simple, if the information
     you want to give the thread fits in an int
  2. passing a whole struct to the thread by pointer, and having to
     manage ownership and lifetime of it.
I suspect the designer of pthread_create() expected and wanted us
all to do (2) ...

I suspect that the designers of pthread_create() wanted to
create the most generic interface possible, given the constraint
that it had to be in C. Given this, I don't think that they'd
be surprised by either of the uses. For better or for worse,
both are widespread in C. (Which is one of the reasons I avoid
the language.)
 
Ike Naar

Ian Collins said:
Scott Lurndal wrote:

If you stick to the canonical form in "C" of

T* p = malloc(sizeof *p);

This isn't always possible, however. Particularly with variable
length structs (i.e. with a T vec[]; terminal member).

C++ does not have variable length structs. Given

/* begin a.cpp */
struct Var
{
int length;
char data[]; /* line 4 */
};
/* end a.cpp */

g++ 4.5.3 fails with the following diagnostic:
a.cpp:4:13: warning: ISO C++ forbids zero-size array 'data'

Returning to C,

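/* begin a.c */
#include <stdio.h>

struct Var
{
    int length;
    char data[];    /* C99 flexible array member */
} *p;

int main(void)
{
    /* sizeof does not evaluate *p, so the null pointer is never dereferenced */
    printf("%zu %zu\n", sizeof(struct Var), sizeof *p);
    return 0;
}
/* end a.c */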

there is no difference between the output of "sizeof(struct Var)"
and the canonical form "sizeof *p";
"4 4" is printed on a system with 4-byte int.

So I'm not sure what your objection is to using the canonical form.
 
Ian Collins

Scott said:
Ian Collins said:
Scott Lurndal wrote:

If you stick to the canonical form in "C" of

T* p = malloc(sizeof *p);

This isn't always possible, however. Particularly with variable
length structs (i.e. with a T vec[]; terminal member).

The canonical form is ideally suited to this case given you have to work
out how much to allocate for the data part and the rest of the struct.
Given:

typedef struct Var
{
    int a;
    int b;
    char var[];    /* flexible array member */
} Var;

and you want 8 bytes on the end:

Var* p = malloc(8 + sizeof *p);
 
Ike Naar

Ian Collins said:
Scott Lurndal wrote:
It can also _find_ errors (i.e. the type of the lvalue has
changed - implicit cast wouldn't flag that, but explicit cast
will).

If you stick to the canonical form in "C" of

T* p = malloc(sizeof *p);

This isn't always possible, however. Particularly with variable
length structs (i.e. with a T vec[]; terminal member).

C++ does not have variable length structs. Given

/* begin a.cpp */
struct Var
{
int length;
char data[]; /* line 4 */
};
/* end a.cpp */

g++ 4.5.3 fails with the following diagnostic:
a.cpp:4:13: warning: ISO C++ forbids zero-size array 'data'

Returning to C,

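/* begin a.c */
#include <stdio.h>

struct Var
{
    int length;
    char data[];    /* C99 flexible array member */
} *p;

int main(void)
{
    /* sizeof does not evaluate *p, so the null pointer is never dereferenced */
    printf("%zu %zu\n", sizeof(struct Var), sizeof *p);
    return 0;
}
/* end a.c */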

there is no difference between the output of "sizeof(struct Var)"
and the canonical form "sizeof *p";
"4 4" is printed on a system with 4-byte int.

So I'm not sure what your objection is to using the canonical form.

where, in the case of a variable struct Var with last member "T vec[];"
the canonical form would be

struct Var *p = malloc(sizeof *p + N * sizeof p->vec[0]);
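
To make that concrete, a minimal C99 sketch using the struct Var from the example above. The function var_new and its choice to store a copied string are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct Var {
    int length;
    char data[];    /* C99 flexible array member, not valid ISO C++ */
};

/* Hypothetical constructor: allocates the header plus room for a copy
   of text, including its terminating '\0'. Returns NULL on failure. */
static struct Var *var_new(const char *text)
{
    size_t n = strlen(text) + 1;

    /* Canonical form: both sizes are derived from p, never from a type name. */
    struct Var *p = malloc(sizeof *p + n * sizeof p->data[0]);
    if (p != NULL) {
        p->length = (int)n;
        memcpy(p->data, text, n);
    }
    return p;
}
```

A single free(p) releases both the header and the trailing array, since they come from one allocation.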
 
Jorgen Grahn

Ian Collins said:
Scott Lurndal wrote:
It can also _find_ errors (i.e. the type of the lvalue has
changed - implicit cast wouldn't flag that, but explicit cast
will).

If you stick to the canonical form in "C" of

T* p = malloc(sizeof *p);

This isn't always possible, however. Particularly with variable
length structs (i.e. with a T vec[]; terminal member).

C++ does not have variable length structs. Given [...]
So I'm not sure what your objection is to using the canonical form.

Actually, we were talking about C -- I took the thread slightly
offtopic when I warned about the IMO unusual C code in the pthreads
examples, quoted in the original posting. At some point someone
should probably have set Followup-To.

/Jorgen
 
Ike Naar

Scott Lurndal wrote:

It can also _find_ errors (i.e. the type of the lvalue has
changed - implicit cast wouldn't flag that, but explicit cast
will).

If you stick to the canonical form in "C" of

T* p = malloc(sizeof *p);

This isn't always possible, however. Particularly with variable
length structs (i.e. with a T vec[]; terminal member).

C++ does not have variable length structs. Given [...]
So I'm not sure what your objection is to using the canonical form.

Actually, we were talking about C -- I took the thread slightly
offtopic when I warned about the IMO unusual C code in the pthreads
examples, quoted in the original posting. At some point someone
should probably have set Followup-To.

In most of the part that you snipped I was talking about C as well.
 
Jorgen Grahn

Scott Lurndal wrote:

It can also _find_ errors (i.e. the type of the lvalue has
changed - implicit cast wouldn't flag that, but explicit cast
will).

If you stick to the canonical form in "C" of

T* p = malloc(sizeof *p);

This isn't always possible, however. Particularly with variable
length structs (i.e. with a T vec[]; terminal member).

C++ does not have variable length structs. Given [...]
So I'm not sure what your objection is to using the canonical form.

Actually, we were talking about C -- I took the thread slightly
offtopic when I warned about the IMO unusual C code in the pthreads
examples, quoted in the original posting. At some point someone
should probably have set Followup-To.

In most of the part that you snipped I was talking about C as well.

Sure, but I wasn't commenting on that part. You seemed to complain
about S.L. talking about C, and I wanted to explain that it was my
fault.

I should have snipped the " So I'm not sure what your objection is to
using the canonical form." too though -- it referred to something
else. Sorry!

/Jorgen
 
darylew

C++ does not have variable length structs. Given

/* begin a.cpp */
struct Var
{
int length;
char data[]; /* line 4 */
};
/* end a.cpp */

g++ 4.5.3 fails with the following diagnostic:
a.cpp:4:13: warning: ISO C++ forbids zero-size array 'data'

Just add an element:

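struct Var
{
    int length;
    char data[ 1 ];
};
//...
int data_length = Whatever;
// C++ needs an explicit cast here: void* does not convert implicitly to Var*
Var *v = static_cast<Var *>( std::malloc( sizeof(Var) + sizeof(char) * (data_length - 1) ) );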

When I first read about this trick, a single element was used as an anchor. I think "malloc" has to return a memory block whose location is suitable for all legal alignments. And an array, if properly aligned, has its first element aligned and therefore all later elements, which must be adjacent.

Daryle W.
 
Bo Persson

C++ does not have variable length structs. Given

/* begin a.cpp */
struct Var
{
int length;
char data[]; /* line 4 */
};
/* end a.cpp */

g++ 4.5.3 fails with the following diagnostic:
a.cpp:4:13: warning: ISO C++ forbids zero-size array 'data'

Just add an element:

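struct Var
{
    int length;
    char data[ 1 ];
};
//...
int data_length = Whatever;
// C++ needs an explicit cast here: void* does not convert implicitly to Var*
Var *v = static_cast<Var *>( std::malloc( sizeof(Var) + sizeof(char) * (data_length - 1) ) );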

When I first read about this trick, a single element was used as an anchor. I think "malloc" has to return a memory block whose location is suitable for all legal alignments. And an array, if properly aligned, has its first element aligned and therefore all later elements, which must be adjacent.

However, in C++ it would still be undefined to access anything beyond
data[0].


Bo Persson
 
James Kanze

C++ does not have variable length structs. Given

/* begin a.cpp */
struct Var
{
int length;
char data[]; /* line 4 */
};
/* end a.cpp */

g++ 4.5.3 fails with the following diagnostic:
a.cpp:4:13: warning: ISO C++ forbids zero-size array 'data'

Just add an element:

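struct Var
{
    int length;
    char data[ 1 ];
};
//...
int data_length = Whatever;
// C++ needs an explicit cast here: void* does not convert implicitly to Var*
Var *v = static_cast<Var *>( std::malloc( sizeof(Var) + sizeof(char) * (data_length - 1) ) );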

When I first read about this trick, a single element was
used as an anchor. I think "malloc" has to return a memory
block whose location is suitable for all legal alignments.
And an array, if properly aligned, has its first element
aligned and therefore all later elements, which must be
adjacent.
However, in C++ it would still be undefined to access anything beyond
data[0].

As in C.
 
