casting (void *) to (class *)

James Kanze

Which is why C++0x, following the lead of C99, offers
uintptr_t, which is large enough to support round trip
conversions of void*'s. It's found in <stdint.h> or, if you're
sufficiently compulsive, <cstdint>.
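
For illustration, a minimal sketch of the round trip that uintptr_t is meant to support. The helper names (to_integer, from_integer, check_round_trip) are made up for this example, and the code assumes an implementation that actually provides std::uintptr_t, which the standard leaves optional.

```cpp
#include <cassert>
#include <cstdint>   // std::uintptr_t; optional in the standard

// Pointer to integer. The mapping itself is implementation-defined.
inline std::uintptr_t to_integer(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p);
}

// Integer back to pointer.
inline const void* from_integer(std::uintptr_t n)
{
    return reinterpret_cast<const void*>(n);
}

// The one property the round trip promises: the pointer that comes back
// compares equal to the one that went in.
inline void check_round_trip(const void* p)
{
    assert(from_integer(to_integer(p)) == p);
}
```

The guarantee is only about getting the pointer back. Nothing here promises that two pointers which compare equal convert to the same integer.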

Is it guaranteed to exist? Practically, of course, now that
there is long long (required to be 64 bits, at least), I doubt
that there will be any implementations where it doesn't. But in
the past, I've worked on machines where pointers were 48 bits,
and the largest integral type only 32.

And of course, that's only the tip of the iceberg with regards
to the problems in the original code.

Note that I also use something similar (actually a lot
simpler---just a cast to size_t), in specific cases. But my
original question to Alf was for a portable solution; I know how
to calculate a hash code for a pointer on every platform I've
ever worked on, but the solution does vary depending on the
platform. (For example, it involves "normalizing" the pointer
on an 8086, something that has no meaning on a Sparc.)
 
Giovanni Deretta

Actually, this is one place where the standard did evolve
significantly from Boost: the thread and its identifier have
been separated, there is an explicit function for detach
(although the implicit detach in the constructor is still
there, rather than having it an error condition to destruct a
joinable thread),

AFAIK the latest draft requires explicit join or detach before
destruction, otherwise std::terminate is called. Thus the destructor
never blocks.
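
A minimal sketch of what that rule looks like in code, assuming a C++11 (or later) std::thread. The function name run_and_join is hypothetical, and some toolchains need a threading flag such as -pthread to build it.

```cpp
#include <cassert>
#include <thread>

// Start a thread, then join it before the std::thread object goes out of
// scope. Destroying a std::thread that is still joinable calls
// std::terminate, so one of join() or detach() has to come first.
inline int run_and_join()
{
    int result = 0;
    std::thread worker([&result] { result = 42; });
    assert(worker.joinable());    // still owns a thread of execution
    worker.join();                // alternatively: worker.detach();
    assert(!worker.joinable());   // now safe to destroy
    return result;
}
```

Since the blocking happens in the explicit join() call, the destructor itself has nothing left to wait for.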
 
James Kanze

* James Kanze:
I think that problem is academic.

Not really.
It is the problem of a platform not supporting Boost. :)

So Boost isn't all that portable. I've known that for a long
time.
If one could find a C++ compiler for 16-bit Windows or MS-DOS
and compile in anything but "large model" there would be a
problem. Actually I still have the CD for such a compiler,
Visual C++ 1.5 :) But it's not standard C++.

Actually, the problem is large model, in real mode. At least
until 2007, Intel was manufacturing the 80186, which only worked
in real mode. Given the date, it wouldn't surprise me if they
had an EDG-based C++ compiler for it---in other words, a C++
compiler more modern than what you're probably using under
Windows.
We can *imagine* some embedded system using e.g. an 80286 or
something, and segmented addressing.
But it's a myth that standard C++ is applicable to such systems.

Why not? The goal of the standards committee is that it should
be. And I've certainly used C++ (pre-standard, of course) on an
8086.
The standard guarantees roundtrip conversion for integer of
"sufficient size".

Only if such an integral type exists. C99 introduced long long
and intptr_t, so its not existing on a system supporting these
aspects of C++0x is academic. But the code you posted used
size_t, which is NOT guaranteed to be of sufficient size.
So when the integer is of "sufficient size" conversion to
integer can't yield the same value for different pointer
values.

For an integer of "sufficient size", again. The code you posted
used size_t. I've worked on machines (in C++) where that wasn't
"sufficient size".
It can only be far from universal if there are a number of
systems with C++ compilers where those integer types,
ptrdiff_t and size_t, are not of the "sufficient size"
required by the standard for roundtrip conversion.

There are two serious problems with the code, depending on the
implementation of pointers on the system. The first is that
size_t may *not* have sufficient size; on a segmented
architecture, it may only be large enough for the offset, and on
at least one system I've used, every request for dynamic memory
returned a new segment, with an offset of 0. The second is that
it is in fact quite common (IBM 360 and successors---including IBM
mainframes today, in some modes; Intel 80x86, etc.) for pointers
to allow more than one bit pattern to point to the same memory.
When you compare pointers for equality, the compiler has to take
this into account, somehow "normalizing" the pointer value if
there isn't a special instruction for the comparison. It
doesn't do this when converting to an integral type, so you end
up with several different hash codes for the same value.

If you're not concerned with such systems, if everything is
Windows for you, fine. Not all code has to be portable. But
don't pretend that it is.
Me neither.
If anything, perhaps it is designed to encourage discussion
about why the heck they're doing that. :)
But there is a difference, namely that for a sign-and-value
representation of signed integers, it maps binary 00...0 and
10...0 to the same size_t value, 0.

There is a difference, yes. On the machine I'm working on now,
addresses are considered "unsigned" (more or less, but the
machine does use linear addressing, no segments or anything).
And there are addresses which can't correctly be represented in
a ptrdiff_t, so technically, the code shouldn't compile. (But
the compilers I have don't enforce this.)
Sort of. A maximum alignment of 8 is pretty universal. And
it's only a "value adding service" so to speak, for the
hashing can't guarantee lack of collisions in the final
reduction to hash table size; it can only make it less likely.

Given the other restrictions (all pointers which compare equal
have a unique bit pattern), they might as well do it right,
treat the pointer as an array of unsigned char, and use
something like a Mersenne prime function or FNV hashing.
(Consider that if the objects in question are big enough, the
dynamic allocator might try to return them page aligned, and if
they are small, the dynamic allocator might use a special pool
of them, resulting in all of the values being very close to one
another.)
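
A minimal sketch of the "array of unsigned char" approach, using 32-bit FNV-1a. The function names are made up for this example. It is only meaningful under the restriction just mentioned, namely that pointers which compare equal share one bit pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// 32-bit FNV-1a over an arbitrary byte range.
inline std::uint32_t fnv1a_32(const unsigned char* data, std::size_t size)
{
    assert(data != nullptr || size == 0);
    std::uint32_t h = 2166136261u;        // FNV offset basis (32 bit)
    for (std::size_t i = 0; i != size; ++i) {
        h ^= data[i];
        h *= 16777619u;                   // FNV prime (32 bit)
    }
    return h;
}

// Hash the object representation of a pointer.
// ASSUMPTION: equal pointers have identical bytes on this platform.
inline std::uint32_t hash_pointer_bytes(const void* p)
{
    unsigned char bytes[sizeof p];
    std::memcpy(bytes, &p, sizeof p);     // copy out the raw bytes
    return fnv1a_32(bytes, sizeof bytes);
}
```

Because every byte is mixed in, page-aligned or tightly pooled addresses still spread over the hash range, which simply dropping the low alignment bits does not achieve.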
Well, they're relying on the guaranteed roundtrip conversion
for "sufficient size" integers, which means guaranteed unique
values.
As I see it. :)

Where do you see a round trip? Or "sufficient size"?

The concrete case I know of (which was current up until at least
two years ago on some Intel embedded processors) is the basic
8086 architecture, in real mode. Typically, pointers were 32
bits, but size_t was only 16 bits. And the Intel real time
kernel only allocated segments, which meant that the results of
converting a dynamically allocated pointer to a size_t (assuming
the compiler accepted it) was always 0. Given an unsigned long,
of course, the pointer fitted. When comparing pointers, the
compilers normalized (segment * 16 + offset), but they didn't do
this when converting to an unsigned long; they just copied the
bits. Which of course causes no problems for round trip, since
you get back the pointer you started with, but does cause
problems because the hash code can be different, even though the
pointers compare equal, and represent the same address.

Of course, if you're only targeting Windows, or even if you're
only targeting desktop computers (Windows, Linux and Mac), it's
not something you should worry about. But that's not what I
understand by "portable" (and I certainly wouldn't consider it
acceptable for something claiming to be a general purpose
library, like Boost).
 
Alf P. Steinbach

* James Kanze:
Where do you see a round trip?

In the standard, not the code. The standard's roundtrip conversion guarantee
guarantees uniqueness when the integer is of sufficient size.

Or "sufficient size"?

The standard again (direct quote). Whether the Boost code uses integer types of
"sufficient size" for all compilers supported, is a different matter. But one
would tend to think that they do perform unit-testing?

The concrete case I know of (which was current up until at least
two years ago on some Intel embedded processors) is the basic
8086 architecture, in real mode. Typically, pointers were 32
bits, but size_t was only 16 bits. And the Intel real time
kernel only allocated segments, which meant that the results of
converting a dynamically allocated pointer to a size_t (assuming
the compiler accepted it) was always 0. Given an unsigned long,
of course, the pointer fitted. When comparing pointers, the
compilers normalized (segment * 16 + offset), but they didn't do
this when converting to an unsigned long; they just copied the
bits. Which of course causes no problems for round trip, since
you get back the pointer you started with, but does cause
problems because the hash code can be different, even though the
pointers compare equal, and represent the same address.

Hm, I can't imagine that it didn't offer 32-bit integers.

And if it did (or does, if it still exists) then all that's needed for the Boost
code is a platform-dependent typedef for the integer type they cast to, plus a
possible "gather the significant bits" conversion "down" to size_t, if they
choose to still use size_t as the hash function result type.

The rest is then taken care of by the standard's guarantees.
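
A minimal sketch of that direction, not the actual Boost code. The names pointer_bits_t, fold_bits, hash_pointer and fold_example are hypothetical, and the typedef assumes that std::uintptr_t exists and is the integer of "sufficient size" on the platform at hand.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

// ASSUMPTION: the platform-dependent part. A port would change this typedef.
typedef std::uintptr_t pointer_bits_t;

// "Gather the significant bits": XOR Narrow-sized chunks of a wider integer
// together, so the high bits (a segment, say) still influence the result.
template <typename Narrow, typename Wide>
Narrow fold_bits(Wide bits)
{
    const unsigned narrow_width = sizeof(Narrow) * CHAR_BIT;
    const unsigned wide_width = sizeof(Wide) * CHAR_BIT;
    Narrow folded = 0;
    for (unsigned shift = 0; shift < wide_width; shift += narrow_width) {
        folded = static_cast<Narrow>(folded ^ static_cast<Narrow>(bits >> shift));
    }
    return folded;
}

inline std::size_t hash_pointer(const void* p)
{
    return fold_bits<std::size_t>(reinterpret_cast<pointer_bits_t>(p));
}

// The 8086 case: segment 0x1234, offset 0, stored in 32 bits. Plain
// truncation to 16 bits would give 0; folding keeps the segment.
inline void fold_example()
{
    const std::uint16_t folded = fold_bits<std::uint16_t>(std::uint32_t(0x12340000u));
    assert(folded == 0x1234u);
}
```

This only addresses the case where every allocation ends up with the same truncated value. It does nothing for pointers that compare equal but have different bit patterns.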

Perhaps one should CC Dave Harris and Alberto Barbati.

However, I leave that to you. :)

Of course, if you're only targeting Windows, or even if you're
only targeting desktop computers (Windows, Linux and Mac), it's
not something you should worry about. But that's not what I
understand by "portable" (and I certainly wouldn't consider it
acceptable for something claiming to be a general purpose
library, like Boost).

Ah, portability. There is the formal portability, where one writes standard C++
code and hopes for the boost, eh, beest. And then there is in-practice
portability, where one realizes that with the possible exception of Comeau no
current C++ compiler is sufficiently standard-conforming to make relying on only
formal portability realistic, so one adds compiler and platform-dependent fixes.
Chasing after the formal portability that would turn out to also be in-practice
portability is, IMHO, futile as a real goal, for it can be (extremely) much more
work to try to shoehorn code into the straightjacket of formally portable;
instead it's only a strong guideline, something that can help greatly with the
in-practice portability *if* one understands that going too far can have the
opposite effect, not helping but rather just generating extra, needless work.

I gather that if your portability concern should turn out to be a real issue,
i.e. there is some commonly used C++ compiler where the Boost code fails, then
the Boost solution will be in the direction I outlined above, purely practical.

And so it doesn't really matter if the standard supports hashing of pointers in
the same way for all compilers (formal portability that is also practical
portability with conforming compilers), although it's nice that C++0x adds that.


Cheers,

- Alf
 
James Kanze

* James Kanze:
In the standard, not the code. The standard's roundtrip
conversion guarantee guarantees uniqueness when the integer is
of sufficient size.

So the standard guarantees something that isn't in the code. I
don't see how that affects the code. (There's also a slight
ambiguity in the standard---it says that after the round trip,
the pointer "will have its original value". I presume that this
means that it will compare equal to the original value, and
refer to the same place in memory. But I suppose that it could
be interpreted to mean that it would have the same bit pattern.
Either way, it's irrelevant here.)

The guarantee needed here is that two pointers which compare
equal will result in the same value when converted to an integer
(of sufficient size). Which has nothing to do with round trip.

Consider a very real case: on the Intel segmented architecture,
pointers consist of two parts: a segment and an offset. In real
mode (used at least until very recently in embedded processors),
the actual address is 16*segment+offset (all arithmetic
unsigned), so something like 0x1000:0x10 and 0x1001:0x0 point to
the same memory. And compare equal. When converted to a 32 bit
integer, however, one results in 0x10000010, and the other in
0x10010000. Two values which will not result in the same hash
code (although converting back to the initial pointer type will
result in the same value).
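
The arithmetic above can be modelled in a few lines. This is a simulation on ordinary integers, not real far pointers: FarPointer and the function names are invented for the example, and wrap-around at 1 MB is ignored.

```cpp
#include <cassert>
#include <cstdint>

// Model of a real-mode segment:offset pointer (illustrative only).
struct FarPointer {
    std::uint16_t segment;
    std::uint16_t offset;
};

// Address the hardware actually uses: 16 * segment + offset.
inline std::uint32_t linear_address(FarPointer p)
{
    return (static_cast<std::uint32_t>(p.segment) << 4) + p.offset;
}

// Bit-for-bit conversion to a 32-bit integer: segment in the high half.
inline std::uint32_t raw_bits(FarPointer p)
{
    return (static_cast<std::uint32_t>(p.segment) << 16) | p.offset;
}

// Pointer comparison as described: compare the normalized addresses.
inline bool same_address(FarPointer a, FarPointer b)
{
    return linear_address(a) == linear_address(b);
}

// The two pointers from the text: 0x1000:0x10 and 0x1001:0x0.
inline void far_pointer_example()
{
    const FarPointer a = { 0x1000, 0x0010 };
    const FarPointer b = { 0x1001, 0x0000 };
    assert(same_address(a, b));               // they compare equal
    assert(raw_bits(a) == 0x10000010u);       // but convert to
    assert(raw_bits(b) == 0x10010000u);       // different integers
    assert(linear_address(a) == 0x10010u);    // normalized value agrees
    assert(linear_address(b) == 0x10010u);
}
```

A hash built on linear_address would be consistent with equality on this model. A hash built on raw_bits would not.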

Also, machines with "large" pointers may not use all of the bits
in an address, masking out the irrelevant bits when comparing
pointers, but not when converting to integers. The classic
example of this is the IBM 360, which only used 24 bits of the
32 bits in a pointer for addressing---I believe that even the
latest IBM mainframes have a mode supporting this, to avoid
breaking old code. Comparing pointers masked off these bits,
converting them to an integer wouldn't (since it was common
practice back then to use them for additional information, to
save memory). And while I doubt you'll find such things in
modern 32 bit machines, it wouldn't surprise me if they still
existed in machines with larger word sizes (e.g. Unisys MP
series, etc.).
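
The same effect, modelled for a 24-bit address held in a 32-bit word. Again this is a simulation on plain integers, with made-up names. The mask value follows the 24-of-32-bits description above.

```cpp
#include <cassert>
#include <cstdint>

// Only the low 24 bits of the 32-bit pointer word take part in addressing.
const std::uint32_t address_mask = 0x00FFFFFFu;

// Comparison ignores the unused high byte.
inline bool same_address(std::uint32_t a, std::uint32_t b)
{
    return (a & address_mask) == (b & address_mask);
}

// Hash of the unmodified word: sensitive to the unused bits.
inline std::uint32_t naive_hash(std::uint32_t word) { return word; }

// Hash of the masked word: consistent with same_address.
inline std::uint32_t masked_hash(std::uint32_t word) { return word & address_mask; }

inline void masked_pointer_example()
{
    const std::uint32_t plain  = 0x00123456u;
    const std::uint32_t tagged = 0x80123456u;   // high byte carries extra information
    assert(same_address(plain, tagged));
    assert(naive_hash(plain) != naive_hash(tagged));
    assert(masked_hash(plain) == masked_hash(tagged));
}
```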
The standard again (direct quote). Whether the Boost code uses
integer types of "sufficient size" for all compilers
supported, is a different matter. But one would tend to think
that they do perform unit-testing?

On the few systems they support. Hopefully. (I don't know if
the issue has improved, but Boost didn't compile under Solaris,
on a Sparc, for a long time.)
Hm, I can't imagine that it didn't offer 32-bit integers.

They did.
And if it did (or does, if it still exists) then all that's
needed for the Boost code is a platform-dependent typedef for
the integer type they cast to, plus a possible "gather the
significant bits" conversion "down" to size_t, if they choose
to still use size_t as the hash function result type.
The rest is then taken care of by the standard's guarantees.

Not at all. Where do you find a guarantee (direct or indirect)
that if two pointers compare equal, the resulting integers will
compare equal after conversions? It just isn't the case,
neither on Intel with far pointers, nor on IBM mainframes
running in compatibility mode.
Perhaps one should CC Dave Harris and Alberto Barbati.
However, I leave that to you. :)
Ah, portability. There is the formal portability, where one
writes standard C++ code and hopes for the boost, eh, beest.
And then there is in-practice portability, where one realizes
that with the possible exception of Comeau no current C++
compiler is sufficiently standard-conforming to make relying
on only formal portability realistic, so one adds compiler and
platform-dependent fixes.

Certainly. Practically all of my code assumes at least 32 bits
for int, for example. Although I know of (and have used)
implementations where this is not the case. In practice, it's
very, very rare to need to be portable to literally everything.
The standard libraries are an exception, of course. (One of the
justifications for putting something in the standard library is
that it cannot be implemented portably in the language: things
like offsetof, or std::less of a pointer---now that the standard
has hash tables, it probably should provide a "standard"
function to obtain a hash value of a pointer. For any given
implementation, it's certainly possible; for many, it's as
trivial as casting to std::size_t; but there's no way a user can
write it and be sure. But I see they did, §20.6.17 in the
draft.)
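
A short sketch of the library facilities in question, as they look in a C++11 implementation: std::hash is specialized for pointer types, and std::less yields a total order over pointers even when the built-in < does not. The function name pointer_hash_example is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>      // std::hash, std::less
#include <unordered_set>

inline void pointer_hash_example()
{
    int a = 0;
    int b = 0;

    // Equal pointers give equal hash values. How that is achieved is the
    // implementation's business.
    std::hash<const int*> hasher;
    assert(hasher(&a) == hasher(&a));

    // std::less orders pointers to unrelated objects: exactly one of the
    // two comparisons holds for distinct pointers.
    std::less<const int*> before;
    assert(before(&a, &b) != before(&b, &a));

    // Pointers as hash table keys, using the default std::hash.
    std::unordered_set<const int*> seen;
    seen.insert(&a);
    seen.insert(&b);
    seen.insert(&a);       // duplicate, ignored
    assert(seen.size() == 2);
}
```

User code written this way never has to convert a pointer to an integer itself.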
Chasing after the formal portability that would turn out to
also be in-practice portability is, IMHO, futile as a real
goal, for it can be (extremely) much more work to try to
shoehorn code into the straightjacket of formally portable;
instead it's only a strong guideline, something that can help
greatly with the in-practice portability *if* one understands
that going too far can have the opposite effect, not helping
but rather just generating extra, needless work.

Agreed. Different code has different portability requirements;
a lot of the code I write professionally makes extensive use of
pthread_mutex_t, for example, which isn't available on one of
the implementations I use at home.

If I raised the question of portability with regards to hashing
pointers, it's because it has been a real problem for me, in
practice. I've worked on implementations where simply
converting the pointer to an integer (of the appropriate size)
didn't work. And I'm aware of others where it won't work.
I gather that if your portability concern should turn out to
be a real issue, i.e. there is some commonly used C++ compiler
where the Boost code fails, then the Boost solution will be in
the direction I outlined above, purely practical.

I've used compilers where the Boost code fails. In practice, it
will work under Windows and Linux on a PC; it will probably work
on most, if not all, desktop computers. There are probably
mainframes, and almost certainly embedded processors, on which
it will fail.
And so it doesn't really matter if the standard supports
hashing of pointers in the same way for all compilers (formal
portability that is also practical portability with conforming
compilers), although it's nice that C++0x adds that.

Agreed. Let the standard library implementors worry about it,
not us.
 
Maxim Yegorushkin

[]
On the few systems they support.  Hopefully.  (I don't know if
the issue has improved, but Boost didn't compile under Solaris,
on a Sparc, for a long time.)

Just for the record, at work we've been using boost-1.36 on Solaris 10
x86-64 and SPARC. Boost is compiled with Sun C++ 5.9.
 
Alf P. Steinbach

* James Kanze:
So the standard guarantees something that isn't in the code. I
don't see how that affects the code.

I'm sorry, that's meaningless to me.


[snip]
The guarantee needed here is that two pointers which compare
equal will result in the same value when converted to an integer
(of sufficient size). Which has nothing to do with round trip.

Agreed.

The roundtrip discussion was in direct response to your statement that

"I also know of one system where it is almost useless. Where
for any dynamically allocated complete object, x would
always have the same value."

The standard guarantees that this isn't the case for a standard-conforming compiler
when using an integer type of sufficient size.

It seems you lost the context of this argument?

Consider a very real case: on the Intel segmented architecture,
pointers consist of two parts: a segment and an offset. In real
mode (used at least until very recently in embedded processors),
the actual address is 16*segment+offset (all arithmetic
unsigned), so something like 0x1000:0x10 and 0x1001:0x0 point to
the same memory. And compare equal. When converted to a 32 bit
integer, however, one results in 0x10000010, and the other in
0x10010000. Two values which will not result in the same hash
code (although converting back to the initial pointer type will
result in the same value).

Also, machines with "large" pointers may not use all of the bits
in an address, masking out the irrelevant bits when comparing
pointers, but not when converting to integers. The classic
example of this is the IBM 360, which only used 24 bits of the
32 bits in a pointer for addressing---I believe that even the
latest IBM mainframes have a mode supporting this, to avoid
breaking old code. Comparing pointers masked off these bits,
converting them to an integer wouldn't (since it was common
practice back then to use them for additional information, to
save memory). And while I doubt you'll find such things in
modern 32 bit machines, it wouldn't surprise me if they still
existed in machines with larger word sizes (e.g. Unisys MP
series, etc.).

I think that problem is academic.

It is the problem of a platform not supporting Boost. :)

If one could find a C++ compiler for 16-bit Windows or MS-DOS and compile in
anything but "large model" there would be a problem. Actually I still have the
CD for such a compiler, Visual C++ 1.5 :) But it's not standard C++.

We can *imagine* some embedded system using e.g. an 80286 or something, and
segmented addressing.

But it's a myth that standard C++ is applicable to such systems.


[snip]
Not at all. Where do you find a guarantee (direct or indirect)
that if two pointers compare equal, the resulting integers will
compare equal after conversions?

That's a different issue.

Your original statement was that "x would always have the same value".

I agree that getting different integer values (from the logically same pointer
value) is a possible problem with selector+offset pointers.

But I can't imagine the Boost library ever supporting my old Visual C++ 1.5
compiler...


Cheers,

- Alf
 
James Kanze

* James Kanze:
I'm sorry, that's meaningless to me.

What you are saying is meaningless with regards to the code.
There's no round trip in the code.
[snip]
The guarantee needed here is that two pointers which compare
equal will result in the same value when converted to an integer
(of sufficient size). Which has nothing to do with round trip.

The roundtrip discussion was in direct response to your
statement that
"I also know of one system where it is almost useless. Where
for any dynamically allocated complete object, x would
always have the same value."
The standard guarantees that this isn't the case for a
standard-conforming compiler when using an integer type of
sufficient size.
It seems you lost the context of this argument?

You seem to have lost it. You posted code from Boost which was
supposedly a portable implementation of a hash value for
pointers. There was no "integer type of sufficient size" in it,
and no "round trip" in it. So I fail to see what you're trying
to get at, except to raise side issues that aren't relevant.

Whether you like the fact or not, there has been at least one
system where the Boost implementation of hash codes for pointers
would always result in the same value for any pointer resulting
from dynamic allocation. A fully compliant implementation, at
least in this respect.
I think that problem is academic.

Maybe for you. Probably for a lot of people. Not for me---I've
worked on embedded systems, for example, and I'm familiar with
mainframes. My world isn't just Windows and Linux.
It is the problem of a platform not supporting Boost. :)

Which is probably true for a majority of the platforms in use
today.
If one could find a C++ compiler for 16-bit Windows or MS-DOS
and compile in anything but "large model" there would be a
problem. Actually I still have the CD for such a compiler,
Visual C++ 1.5 :) But it's not standard C++.
We can *imagine* some embedded system using e.g. an 80286 or
something, and segmented addressing.

The 80186 was a very popular processor on embedded systems.
But it's a myth that standard C++ is applicable to such systems.

Not to me, it isn't. Nor to the standards committee. If your
world is only Windows, fine. But I tend to work on a wide
variety of platforms.
[snip]
Not at all. Where do you find a guarantee (direct or indirect)
that if two pointers compare equal, the resulting integers will
compare equal after conversions?
That's a different issue.
Your original statement was that "x would always have the same
value".

I made two statements, corresponding to two different real
implementations I'm familiar with. Whether you like it or not,
the fact remains that x had type size_t, which on most 8086
implementations was 16 bits. And in one case, converting a
pointer returned from malloc to a size_t always resulted in 0.
I agree that getting different integer values (from the
logically same pointer value) is a possible problem with
selector+offset pointers.

Not only possible, but real.
But I can't imagine the Boost library ever supporting my old
Visual C++ 1.5 compiler...

Or many other embedded systems. Or a lot of other systems,
outside of Windows and Linux.

Parts of Boost are excellent, and quite portable. Other parts
are useless, or broken, or whatever. And the build system is
unusable---to use Boost effectively, other than under Windows or
Linux on a PC, you have to strip out the parts you want, then
write your own makefiles to build them (unless, of course,
they're just headers).
 
Alf P. Steinbach

* James Kanze:
What you are saying is meaningless with regards to the code.
There's no round trip in the code.

<teaspoon mode>:

* You maintained that:
"I also know of one system where it is almost useless. Where
for any dynamically allocated complete object, x would
always have the same value."

* I pointed out that this is not a problem when the integer x is
of "sufficient size".

* The reason it's not a problem is the standard's guaranteed
roundtrip conversion, which in turn guarantees unique values.

* There does not need to be a roundtrip conversion in the code
for the standard's guarantee about unique values to apply.

* The fact that you don't see a roundtrip conversion in the
code is therefore irrelevant, and meaningless as an argument.

</teaspoon mode>


[snip]
You seem to have lost it. You posted code from Boost which was
supposedly a portable implementation of a hash value for
pointers. There was no "integer type of sufficient size" in it,
and no "round trip" in it. So I fail to see what you're trying
to get at, except to raise side issues that aren't relevant.

On the contrary, you're the one veering away here; see above. But I think it's
just a case of the context getting lost.

Whether you like the fact or not, there has been at least one
system where the Boost implementation of hash codes for pointers
would always result in the same value for any pointer resulting
from dynamic allocation. A fully compliant implementation, at
least in this respect.

As shown, that's easy to fix. :)

The opposite problem, that of logically equal pointers yielding different
integer values, is a bit harder.

But for that to be an *actual* problem one would have to find an extant C++
compiler where it is a problem.

Maybe for you. Probably for a lot of people. Not for me---I've
worked on embedded systems, for example, and I'm familiar with
mainframes. My world isn't just Windows and Linux.


Which is probably true for a majority of the platforms in use
today.



The 80186 was a very popular processor on embedded systems.


Not to me, it isn't. Nor to the standards committee. If your
world is only Windows, fine. But I tend to work on a wide
variety of platforms.

Well, the challenge is then as I wrote above, to find an extant C++ compiler
where it is a problem.

If such a compiler exists (and is actually used), then perhaps the Boost guys
should be notified that it might be an idea to add support for it. :)


[snip]
Parts of Boost are excellent, and quite portable. Other parts
are useless, or broken, or whatever. And the build system is
unusable---to use Boost effectively, other than under Windows or
Linux on a PC, you have to strip out the parts you want, then
write your own makefiles to build them (unless, of course,
they're just headers).

I agree that the build system is bad. Also the filesystem API is IMHO sort of
practically unusable, but at least it provides wrappable functionality. Of
course ideally we should join the Boost effort and fix things, and participate
actively in Wikipedia maintenance to fix things there, and so on, but at least
I don't currently have the energy to do that in addition to other things.


Cheers,

- Alf
 
James Kanze

* James Kanze:
<teaspoon mode>:
* You maintained that:
"I also know of one system where it is almost useless. Where
for any dynamically allocated complete object, x would
always have the same value."

Where the function you posted (from Boost) is almost useless,
because in that function, x would always have the same value.
* I pointed out that this is not a problem when the integer x is
of "sufficient size".

Which is totally irrelevant to the posted code, since the
integer x did not have sufficient size. (Technically, the code
shouldn't compile, but a lot of compilers accept it.)
* The reason it's not a problem is the standard's guaranteed
roundtrip conversion, which in turn guarantees unique values.

Except that there is no round trip conversion involved. In
fact, the code as posted is not legal under some
implementations, and in the frequent case where compilers don't
enforce this, it is useless.
* There does not need to be a roundtrip conversion in the code
for the standard's guarantee about unique values to apply.
* The fact that you don't see a roundtrip conversion in the
code is therefore irrelevant, and meaningless as an argument.

The argument is the fact that the code doesn't meet the
requirements the standard sets for a round trip conversion, so
any argument concerning roundtrip conversion is irrelevant.

Anyway, you can believe what you want---I've actually
encountered such implementations, in reality. And in my world,
reality trumps theory.
</teaspoon mode>
[snip]
It seems you lost the context of this argument?
You seem to have lost it. You posted code from Boost which
was supposedly a portable implementation of a hash value for
pointers. There was no "integer type of sufficient size" in
it, and no "round trip" in it. So I fail to see what you're
trying to get at, except to raise side issues that aren't
relevant.
On the contrary, you're the one veering away here; see above.
But I think it's just a case of the context getting lost.

The context is the code from Boost. That is what we're
discussing, isn't it?
As shown, that's easy to fix. :)

Yes and no. On at least one implementation I've used, there was
no integral type of sufficient size. With the introduction of
long long, I suspect that such cases will disappear in practice,
at least in my lifetime, but the CD still officially recognizes
the possibility.
The opposite problem, that of logically equal pointers
yielding different integer values, is a bit harder.
But for that to be an *actual* problem one would have to find
an extant C++ compiler where it is a problem.

And for it to be a real problem, there would have to be the
possibility that such a compiler might exist in the future.

[...]
Well, the challenge is then as I wrote above, to find an
extant C++ compiler where it is a problem.

No. The challenge is to consider that some system might exist
where such a compiler would make sense. The committee, and I
agree with them in this regard, wants to make C++ implementable
on as many machines as possible.
If such a compiler exists (and is actually used), then perhaps
the Boost guys should be notified that it might be an idea to
add support for it. :)

The current situation with Boost is that it only has limited
portability anyway. For a large number of reasons, not all bad.
 
