ptr1 == ptr2 but (int)ptr1 != (int)ptr2

  • Thread starter Hallvard B Furuseth

Hallvard B Furuseth

Does anyone know of an architecture where this can break?

T *ptr1, *ptr2;
...
if (ptr1 == ptr2)
    if (CHAR_BIT*sizeof(T*) <= (width of int)) /* otherwise undefined */
        assert((int)ptr1 == (int)ptr2);

(Feel free to replace int with another integer type if that helps to
break something.)

I know an address can have several representations, at least on some DOS
memory models, but I don't know if it normalizes pointers before
converting to integer.
 

Kenneth Brody

Hallvard said:
> Does anyone know of an architecture where this can break?
>
> T *ptr1, *ptr2;
> ...
> if (ptr1 == ptr2)
>     if (CHAR_BIT*sizeof(T*) <= (width of int)) /* otherwise undefined */
>         assert((int)ptr1 == (int)ptr2);
>
> (Feel free to replace int with another integer type if that helps to
> break something.)
>
> I know an address can have several representations, at least on some DOS
> memory models, but I don't know if it normalizes pointers before
> converting to integer.

I would suspect that, on most platforms in which a pointer can fit
into an int, if the pointers compare equal, then the converted-to-int
values will also compare equal.

However, I'm fairly sure the standard says the result is
implementation-defined at best.

Consider a segmented memory architecture, such as "real mode" on
the x86 line of processors. On such platforms "far" pointers are
32 bits (16-bit segment plus 16-bit offset), and I would suspect
that it may be possible for two pointers to compare equal even
if their bit patterns are not identical. (For example, the compiler
may treat FFFF:0000 and F000:FFF0 as "equal", since both name
physical address 0xFFFF0, but 0xFFFF0000 and 0xF000FFF0 as ints
[or perhaps longs] will not compare equal.)

--
+-------------------------+--------------------+-----------------------+
| Kenneth J. Brody | www.hvcomputer.com | #include |
| kenbrody/at\spamcop.net | www.fptech.com | <std_disclaimer.h> |
+-------------------------+--------------------+-----------------------+
Don't e-mail me at: <mailto:[email protected]>
 

Malcolm McLean

Hallvard B Furuseth said:
> Does anyone know of an architecture where this can break?
>
> T *ptr1, *ptr2;
> ...
> if (ptr1 == ptr2)
>     if (CHAR_BIT*sizeof(T*) <= (width of int)) /* otherwise undefined */
>         assert((int)ptr1 == (int)ptr2);
>
> (Feel free to replace int with another integer type if that helps to
> break something.)
>
> I know an address can have several representations, at least on some DOS
> memory models, but I don't know if it normalizes pointers before
> converting to integer.

I'd be surprised if it does. If someone wants to convert a pointer to an
integer, normally they want to get at the bits, so reinterpretation seems
likely. That would mean that if the same address has two interpretations, it
would break.

However, in reality ptr1 == ptr2 would almost certainly break as well. That
is why it is illegal to create a pointer that roams over more than one
object. If the architecture is segmented, probably the only C way of
generating two different pointers to the same physical address is to move
one beyond its range. So at runtime the program can cheaply compare pointers
for equality by comparing bits, and doesn't need a normalisation step.
 

Flash Gordon

Malcolm said:
> I'd be surprised if it does. If someone wants to convert a pointer to an
> integer, normally they want to get at the bits, so reinterpretation seems
> likely. That would mean that if the same address has two
> interpretations, it would break.
>
> However, in reality ptr1 == ptr2 would almost certainly break as well.

No, ptr1 == ptr2 is guaranteed by the standard to always work[1], whether
or not the pointers have any relationship to each other and whether or
not they use different representations for the same value.

> That is why it is illegal to create a pointer that roams over more than
> one object.

That is a separate matter from whether different representations can be
used for the same pointer value. It allows for range-checking
implementations, which do exist and do not require there
to be different representations of the same physical address.

> If the architecture is segmented, probably the only C way of
> generating two different pointers to the same physical address is to
> move one beyond its range.

Wrong. The architecture could have overlapping segments such that:
{
    int arr[4096]; /* arr starts in segment 1 */
    int *ptr1 = arr+1024; /* ptr1 starts in segment 2 */
    int *ptr2 = arr+4096; /* ptr2 starts in segment 3 */
    while (ptr1 != ptr2) ptr2--;
}
Then they meet when ptr1 is using segment 2 but ptr2 is using segment 3.
The compiler has to make it work.

The good old x86 range of processors uses overlapping segments, although
I don't know if any compilers allowed objects (or malloced regions)
larger than a segment.

> So at runtime the program can cheaply compare
> pointers for equality by comparing bits, and doesn't need a normalisation
> step.

Wrong. The standard guarantees that comparing for equality always works.
It is only relational operators excluding equality and inequality that
do not have to work.

[1] If one of the pointers is neither null, nor a pointer to an object,
nor a pointer to 1 past an object, then undefined behaviour occurs on
evaluating the pointer before you get as far as the comparison.
 

Malcolm McLean

Flash Gordon said:
> Wrong. The standard guarantees that comparing for equality always works.
> It is only relational operators excluding equality and inequality that do
> not have to work.

Once your pointer holds an illegal address, any other operations on it
become undefined, including tests for equality. So if we move a pointer
outside its object, the test for equality with another pointer can either
fail or pass; it is undefined. Writing through that pointer may write to the
same memory location, or it might trigger a segfault; again the behaviour is
UB. So by holding objects to the size of a segment, we can implement equality
tests by a simple comparison of bits, and still conform.

You have, however, homed in on a problem with the standard, which is that the
"1 past is valid" rule can make correct implementation of segmented objects
difficult.

> [1] If one of the pointers is neither null, nor a pointer to an object,
> nor a pointer to 1 past an object, then undefined behaviour occurs on
> evaluating the pointer before you get as far as the comparison.

You've put the right answer here. Once you execute UB, all subsequent
operations also become undefined.
 

Hallvard B Furuseth

Malcolm said:
> I'd be surprised if it does. If someone wants to convert a pointer to an
> integer, normally they want to get at the bits, so reinterpretation seems
> likely. That would mean that if the same address has two
> interpretations, it would break.

So, two opposite guesses about what would happen so far... that's why I
wondered if anyone knew of real-world examples.

> However, in reality ptr1 == ptr2 would almost certainly break as
> well.

No, that can only break if ptr1 or ptr2 does not contain a valid pointer
representation, i.e. a trap representation, as C99 calls it. That's
not a different representation of another pointer value; it's more like
an invalid value which could accidentally compare equal to a valid value.

> That is why it is illegal to create a pointer that roams over more
> than one object. If the architecture is segmented, probably the only C
> way of generating two different pointers to the same physical address is
> to move one beyond its range.

I'm pretty sure I've seen counterexamples of this, and that some DOS
memory model was one of them.
 

ais523

> The good old x86 range of processors uses overlapping segments, although
> I don't know if any compilers allowed objects (or malloced regions)
> larger than a segment.

<OT, but not very>

I seem to remember a C89 implementation in which malloc wouldn't
return anything larger than a segment, but by using the right compiler
switches and an implementation-specific header and library, it was
possible to get objects larger than a segment. You could declare

char far* s;

to get a pointer that could point anywhere in memory (the compiler
switch made far into a keyword) but compared like intptr_t would (so
occasionally you got counter-intuitive and non-compliant behaviour
such as greater-than tests on two pointers into the same object
returning the wrong result), or

char huge* s;

to get a pointer which behaved correctly but was more expensive (i.e.
the segmentation was taken into account with extra instructions).
'far' was enough in most cases, and you needed a special
implementation-specific farmalloc to get big objects.

Of course, the special things you had to do to get this to happen
weren't C89, or any other standard that I know of. This was one of the
sorts of things that caused confusion for beginning C programmers on
DOS systems. It's linked to the whole 'memory model' thing, which
nowadays compilers can thankfully take care of themselves. (Hint to
anyone who actually ends up having to learn C on such a system: set it
to 'small' and you'll get correct C89 behaviour without having to
worry about it any further.)

</OT>
 

Kenneth Brody

Flash Gordon wrote:
> [...]
> The good old x86 range of processors uses overlapping segments, although
> I don't know if any compilers allowed objects (or malloced regions)
> larger than a segment.
> [...]

Yes. In addition to "near" pointers (16-bit offset into the default
data segment) and "far" pointers (32-bit segment:offset), there were
also "huge" pointers (32-bit segment:offset) which could point to
memory regions larger than a single 64K segment.


Attempting to make this on-topic...

On such architectures, the compilers would assume (rightfully so, as
far as the standard is concerned) that only the offsets would need to
be compared in non-huge pointers. Because, even though a "far"
pointer was 32 bits, you can only compare pointers within a single
object. Therefore, 1111:0080 and 2222:0080 could compare "equal".
Also, I believe that xxxx:0000 would compare equal to NULL, regardless
of the segment "xxxx" value. This would mean that 1111:0000 and
2222:0000 would both compare equal to each other, and both would
compare equal to NULL, but storing them in long ints would make them
compare unequal to each other. (Remember, the above assumes that
these pointers are "far" and not "huge".)

 

Kenneth Brody

Malcolm said:
> Once your pointer holds an illegal address, any other operations on it
> become undefined. Including tests for equality.
> [...]

If you have two valid pointers to separate "objects", must an
equality test return "false"? I seem to recall that in the
segmented world of real-mode x86 systems, "far" pointers (which
consisted of a 16-bit segment and a 16-bit offset) only compared
the offset part, meaning that, if the two "objects" were at the
same offset within different segments, the pointers would compare
as equal.

Of course, it's been many years since I've done such work, so I
may be remembering incorrectly.

 

Dave Vandervies

Kenneth Brody said:
> If you have two valid pointers to separate "objects", must an
> equality test return "false"?

Yes. Equality and inequality tests, unlike ordering tests, must work on
any two valid pointers of the same type, whether or not they point into
the same object, and must report inequality for pointers to different
objects.

> I seem to recall that in the
> segmented world of real-mode x86 systems, "far" pointers (which
> consisted of a 16-bit segment and a 16-bit offset) only compared
> the offset part, meaning that, if the two "objects" were at the
> same offset within different segments, the pointers would compare
> as equal.
>
> Of course, it's been many years since I've done such work, so I
> may be remembering incorrectly.

An implementation that did that would be non-conforming.


dave
 

Flash Gordon

Kenneth Brody wrote, On 29/05/07 17:31:
> Flash Gordon wrote:
>> [...]
>> The good old x86 range of processors uses overlapping segments, although
>> I don't know if any compilers allowed objects (or malloced regions)
>> larger than a segment.
>> [...]
>
> Yes. In addition to "near" pointers (16-bit offset into the default
> data segment) and "far" pointers (32-bit segment:offset), there were
> also "huge" pointers (32-bit segment:offset) which could point to
> memory regions larger than a single 64K segment.

Ah, but could you use such pointers without using the non-standard
far/huge keywords? Say, with an option on the compiler?

> Attempting to make this on-topic...
>
> On such architectures, the compilers would assume (rightfully so, as
> far as the standard is concerned) that only the offsets would need to
> be compared in non-huge pointers. Because, even though a "far"
> pointer was 32 bits, you can only compare pointers within a single
> object. Therefore, 1111:0080 and 2222:0080 could compare "equal".
> Also, I believe that xxxx:0000 would compare equal to NULL, regardless
> of the segment "xxxx" value. This would mean that 1111:0000 and
> 2222:0000 would both compare equal to each other, and both would
> compare equal to NULL, but storing them in long ints would make them
> compare unequal to each other. (Remember, the above assumes that
> these pointers are "far" and not "huge".)

That would be non-conforming.
 

Keith Thompson

Kenneth Brody said:
> Flash Gordon wrote:
>> [...]
>> The good old x86 range of processors uses overlapping segments, although
>> I don't know if any compilers allowed objects (or malloced regions)
>> larger than a segment.
>> [...]
>
> Yes. In addition to "near" pointers (16-bit offset into the default
> data segment) and "far" pointers (32-bit segment:offset), there were
> also "huge" pointers (32-bit segment:offset) which could point to
> memory regions larger than a single 64K segment.

<OT>
I could be wrong about this, since I've never actually used such a
system, but I *think* that "near" and "far" were kinds of pointers,
and "huge" was one of several memory models. I don't think there was
(is?) such a thing as a "huge" pointer.
</OT>
 

Richard Heathfield

Keith Thompson said:
> Kenneth Brody said:
>> Flash Gordon wrote:
>>> [...]
>>> The good old x86 range of processors uses overlapping segments,
>>> although I don't know if any compilers allowed objects (or malloced
>>> regions) larger than a segment.
>>> [...]
>>
>> Yes. In addition to "near" pointers (16-bit offset into the default
>> data segment) and "far" pointers (32-bit segment:offset), there were
>> also "huge" pointers (32-bit segment:offset) which could point to
>> memory regions larger than a single 64K segment.
>
> <OT>
> I could be wrong about this, since I've never actually used such a
> system, but I *think* that "near" and "far" were kinds of pointers,
> and "huge" was one of several memory models. I don't think there was
> (is?) such a thing as a "huge" pointer.
> </OT>

You're right. Early PCs had six memory models: tiny, small, medium,
large, compact, and huge. (Come to think of it, they're probably still
in there somewhere!)
 

Coos Haak

On Tue, 29 May 2007 16:11:20 -0700, Keith Thompson wrote:
> Kenneth Brody said:
>> Flash Gordon wrote:
>>> [...]
>>> The good old x86 range of processors uses overlapping segments, although
>>> I don't know if any compilers allowed objects (or malloced regions)
>>> larger than a segment.
>>> [...]
>>
>> Yes. In addition to "near" pointers (16-bit offset into the default
>> data segment) and "far" pointers (32-bit segment:offset), there were
>> also "huge" pointers (32-bit segment:offset) which could point to
>> memory regions larger than a single 64K segment.
>
> <OT>
> I could be wrong about this, since I've never actually used such a
> system, but I *think* that "near" and "far" were kinds of pointers,
> and "huge" was one of several memory models. I don't think there was
> (is?) such a thing as a "huge" pointer.
> </OT>

<OT>
TC had huge pointers. From the help:
"The huge modifier is similar to the far
modifier except for two additional features.
Its segment is normalized during pointer
arithmetic so that pointer comparisons are
accurate. And, huge pointers can be
incremented without suffering from segment
wraparound."
</OT>
So I think that's conforming, although the range was 20 bits.
 

Kenneth Brody

Flash said:
> Kenneth Brody wrote, On 29/05/07 17:31:
>> Flash Gordon wrote:
>>> [...]
>>> The good old x86 range of processors uses overlapping segments, although
>>> I don't know if any compilers allowed objects (or malloced regions)
>>> larger than a segment.
>>> [...]
>>
>> Yes. In addition to "near" pointers (16-bit offset into the default
>> data segment) and "far" pointers (32-bit segment:offset), there were
>> also "huge" pointers (32-bit segment:offset) which could point to
>> memory regions larger than a single 64K segment.
>
> Ah, but could you use such pointers without using the non-standard
> far/huge keywords? Say with an option on the compiler?

I believe so. You told the compiler the memory model you wanted to
use (there were at least "small", "medium", "compact" and "large")
which determined the type of data and code pointers. (The "medium"
and "compact" models had one type as 16 bits and the other as 32.
I don't recall which was which.) There may have been a flag to set
"huge" as the default.

 

Malcolm McLean

Dave Vandervies said:
> Yes. Equality and inequality tests, unlike ordering tests, must work on
> any two valid pointers of the same type, whether or not they point into
> the same object, and must report inequality for pointers to different
> objects.
>
> An implementation that did that would be non-conforming.

If two pointers have the same bit pattern then, unless you have some
seriously weird architecture, they must be equal. However, on some
architectures there are several representations of the same physical
address.

However, if we increment a pointer to one object so that it points into
another then, except in the special case of one past, that is an error, and
all subsequent operations, including pointer comparison, become undefined.
Therefore, if you restrict objects to one "segment", equality tests can be
implemented with a simple comparison of bits. There is no need to resolve to
a physical address. Comparing pointers into different objects will always
yield false. Pointers into the same object have the same base, so no two
representations ever address the same memory. With illegal pointers the
behaviour is undefined, so the compiler can return true, false, or segfault
at whim.
 

Mark McIntyre

> Early PCs had six memory models: tiny, small, medium,
> large, compact, and huge.

Strictly speaking, these were 'features' of early PC compilers,
allowing you to choose between different stack and heap sizes and
addressable space layout. They weren't features of the hardware per
se, whereas far and near pointers were.
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
 

Bart van Ingen Schenau

Malcolm said:
> If two pointers have the same bit pattern then, unless you have some
> seriously weird architecture, they must be equal. However, on some
> architectures there are several representations of the same physical
> address.
> However, if we increment a pointer to one object so that it points into
> another then, except in the special case of one past, that is an
> error, and all subsequent operations, including pointer comparison,
> become undefined. Therefore, if you restrict objects to one "segment",
> equality tests can be implemented with a simple comparison of bits.
> There is no need to resolve to a physical address. Pointers in
> different objects will always return false. Pointers in the same
> object have the same base, so no two representations ever address the
> same memory. With illegal pointers the behaviour is undefined so the
> compiler can return true, false, or segfault at whim.

Bitwise comparison is not good enough, if it is possible to write a
program where two pointers to the same object have different
representations without invoking UB in creating both representations.
If a valid program can create different representations for a pointer to
the same object, then you must resolve pointers to the physical address
before you can compare them.

Otherwise, I agree with your analysis.

Bart v Ingen Schenau
 

Malcolm McLean

Coos Haak said:
> On Tue, 29 May 2007 16:11:20 -0700, Keith Thompson wrote:
>> Kenneth Brody said:
>>> Flash Gordon wrote:
>>>> [...]
>>>> The good old x86 range of processors uses overlapping segments,
>>>> although I don't know if any compilers allowed objects (or malloced
>>>> regions) larger than a segment.
>>>> [...]
>>>
>>> Yes. In addition to "near" pointers (16-bit offset into the default
>>> data segment) and "far" pointers (32-bit segment:offset), there were
>>> also "huge" pointers (32-bit segment:offset) which could point to
>>> memory regions larger than a single 64K segment.
>>
>> <OT>
>> I could be wrong about this, since I've never actually used such a
>> system, but I *think* that "near" and "far" were kinds of pointers,
>> and "huge" was one of several memory models. I don't think there was
>> (is?) such a thing as a "huge" pointer.
>> </OT>
>
> <OT>
> TC had huge pointers. From the help:
> "The huge modifier is similar to the far
> modifier except for two additional features.
> Its segment is normalized during pointer
> arithmetic so that pointer comparisons are
> accurate. And, huge pointers can be
> incremented without suffering from segment
> wraparound."
> </OT>
> So I think that's conforming, although the range was 20 bits.

Huge pointers were saying "forget about performance, I just want standard
C". Unfortunately the processors were rather slow, so normally you couldn't.
Then again, we were all younger in those days, and thought hacking was
cleverer than writing clean and portable software.
 
