Questions about pointer comparisons

raphfrk

Assuming this program:

#include <stdio.h>

int main( int argc, char *argv[] )
{

int a[100];

int b[100];

int *pa;

int *b_top;

pa = a + 20;

b_top = b + 99;

if( (pa >= b) && (pa<=b_top) )
{
printf( "Pa is pointing to an element in array b\n" );
}
else
{
printf( "Pa is not pointing to an element in array b\n" );
}

}

Will the else branch always execute in theory (it probably will in
practice)?

Does the standard define what comparison between pointers which point
to different arrays do?

In principle, there could be a rule that for all possible pointers,
there must be an ordering.

What if the pointers were void pointers in the comparisons?

i.e. would

((void *)Pa) < ((void *)b_top)

give the same answer as

Pa < b_top

for all Pa and b_top
 
Jens Thoms Toerring

Assuming this program:
#include <stdio.h>
int main( int argc, char *argv[] )
{
int a[100];
int b[100];
int *pa;
int *b_top;
pa = a + 20;
b_top = b + 99;
if( (pa >= b) && (pa<=b_top) )
{
printf( "Pa is pointing to an element in array b\n" );
}
else
{
printf( "Pa is not pointing to an element in array b\n" );
}
}
Will the else branch always execute in theory (it probably will in
practice)?
Does the standard define what comparison between pointers which point
to different arrays do?

Yes, the standard is quite clear about this in the section about
"Relational operators":

If the objects pointed to are not members of the same aggregate
or union object, the result is undefined, with the following
exception. If P points to the last member of an array object
and Q points to a member of the same array object, the pointer
expression P+1 compares higher than Q , even though P+1 does
not point to a member of the array object.

I.e. by comparing pointers to different arrays you invoke
undefined behaviour and the result is meaningless.
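
To make the quoted rule concrete, here is a minimal sketch of comparisons the standard does define: pointers into one array, plus the one-past-the-end pointer. The function names sum_ints and one_past_end_compares_higher are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Sums arr[0..n-1] by walking a pointer up to the one-past-the-end
 * position. The test p < end is defined because both pointers refer
 * to the same array (end is one past its last element). */
long sum_ints(const int *arr, size_t n)
{
    const int *end = arr + n;   /* legal to form and compare, not to dereference */
    long total = 0;
    for (const int *p = arr; p < end; ++p)
        total += *p;
    return total;
}

/* Checks the exception from the quoted text: if P points to the last
 * element, P + 1 compares higher than every element of the same array.
 * Returns 1 if that holds for all elements, 0 otherwise. */
int one_past_end_compares_higher(const int *arr, size_t n)
{
    assert(n > 0);
    const int *last = arr + (n - 1);
    const int *past = last + 1;
    for (const int *q = arr; q <= last; ++q)
        if (!(past > q))
            return 0;
    return 1;
}
```

Neither function ever compares a pointer into one array with a pointer into another, which is the case the standard leaves undefined.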
In principle, there could be a rule that for all possible pointers,
there must be an ordering.

That would require that every machine a C compiler can be written
for has a flat memory model. And what do you hope to gain by
comparing pointers to different objects? In your example program
the compiler could put 'a' before 'b' or the other way round.
And it could, in principle, also put some extra padding in
between. So what would be the benefit of being able to compare
pointers to somewhere within 'a' and 'b' (except maybe figuring
out something about how your compiler lays out the arrays in
memory for the set of options you invoked it with, which is
something you can't rely on)?
What if the pointers were void pointers in the comparisons?
i.e. would
((void *)Pa) < ((void *)b_top)
give the same answer as
Pa < b_top
for all Pa and b_top

Casting to void pointers doesn't change anything about the
basic fact that it invokes undefined behaviour. I would say
it just makes things a bit worse since a comparison like
a < b implies a - b < 0, and arithmetic on void pointers is
also undefined.
Regards, Jens
 
Nick Keighley

Assuming this program:

I've reduced the vertical space (I was tempted to fix the indentation
as well...)
#include <stdio.h>

int main( int argc, char *argv[] )
{
int a[100];
int b[100];
int *pa;
int *b_top;

pa = a + 20;
b_top = b + 99;
if( (pa >= b) && (pa<=b_top) )
 printf( "Pa is pointing to an element in array b\n" );
else
 printf( "Pa is not pointing to an element in array b\n" );
}

Will the else branch always execute in theory[?]
no

(it probably will in practice)?

probably, on a typical modern desktop
Does the standard define what comparison between pointers which point
to different arrays do?

no (it says the behaviour is undefined)
In principle, there could be a rule that for all possible pointers,
there must be an ordering.

there could be such a rule, but there isn't
What if the pointers were void pointers in the comparisons?

the behaviour still wouldn't be defined
i.e. would

((void *)Pa) < ((void *)b_top)

give the same answer as

Pa < b_top

for all Pa and b_top

you don't know, the behaviour is undefined in either case and may not
necessarily give the same behaviour on different implementations.
 
Pillsy

On Dec 1, 7:35 am, Jens Thoms Toerring wrote:
[...]
And what do you hope to gain by comparing pointers to different objects?
[...]
Perhaps you want to use that ordering as the basis for a set
represented as a binary tree? AIUI, that's how things work in the C++
STL, and it seems reasonable to want to use the same implementation
strategy in C.

Cheers,
Pillsy
 
Nick Keighley

On Dec 1, 7:35 am, Jens Thoms Toerring wrote:

Perhaps you want to use that ordering as the basis for a set
represented as a binary tree? AIUI, that's how things work in the C++
STL, and it seems reasonable to want to use the same implementation
strategy in C.

I don't think C++ allows you to compare pointers to different objects
either. A particular implementation may do this but they are gambling
that they never port to somewhere this assumption is broken. I suppose
on most implementations you can rely on them being in *some* order and
that distinct objects can be compared and won't do anything bizarre
like overlapping or interleaving. Some of the mainframe people may be
slightly scared by the idea of trying to get this to work...
 
Pillsy

On Dec 1, 11:03 am, Nick Keighley wrote:
[...]
I don't think C++ allows you to compare pointers to different objects
either.

I don't want to wander too deep into the dubiously topical weeds of
how C++ works, but (again AIUI) the STL provides a default ordering on
pointers through std::less; that doesn't mean you can use the
"<" operator.
A particular implementation may do this but they are gambling
that they never port to somewhere this assumption is broken.

Well, yeah, but that's the point of making it part of the standard
library, isn't it?
I suppose on most implementations you can rely on them being in
*some* order and that distinct objects can be compared and won't
do anything bizarre like overlapping or interleaving.

I don't see how even overlapping or interleaving prevents you from
defining an ordering that's sufficient for the purpose at hand
(stuffing things in a binary tree). Then again, I'm not a mainframe
programmer, and it's more than possible that I'm missing something.

Cheers,
Pillsy
 
Seebs

Will the else branch always execute in theory (it probably will in
practice)?

I wouldn't say it "always" will in practice -- I'd guess it will fairly
often, on many compilers.
Does the standard define what comparison between pointers which point
to different arrays do?

No, but it specifies what it does -- it invokes undefined behavior.
In principle, there could be a rule that for all possible pointers,
there must be an ordering.

But this rule would have been extremely hard to implement on some fairly
widespread targets, so it isn't there.
What if the pointers were void pointers in the comparisons?

No effect.
((void *)Pa) < ((void *)b_top)

give the same answer as

Pa < b_top

for all Pa and b_top

Interesting question. I'd guess that the answer is likely that they'd do
the same thing (which might be failing dismally) on essentially every target,
but it's still undefined behavior unless they're both pointers into the same
object.
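
For the case that is defined (both pointers into the same array), a small sketch can check that the cast makes no difference. The helper name void_cast_agrees is hypothetical, and the function must only be called with two pointers into the same array (or one past its end); anything else is the undefined case discussed above.

```c
#include <assert.h>
#include <stddef.h>

/* Compares two pointers into the SAME int array twice: once as int *
 * and once after conversion to void *. Returns 1 if both comparisons
 * give the same result, 0 otherwise.
 * Precondition: p and q point into (or one past the end of) one array. */
int void_cast_agrees(const int *p, const int *q)
{
    assert(p != NULL && q != NULL);
    int typed   = p < q;
    int untyped = (const void *)p < (const void *)q;
    return typed == untyped;
}
```

For example, with int a[10], the call void_cast_agrees(&a[2], &a[7]) stays within the rules because both arguments point into a.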

-s
 
Nick

Assuming this program:
#include <stdio.h>
int main( int argc, char *argv[] )
{
int a[100];
int b[100];
int *pa;
int *b_top;
pa = a + 20;
b_top = b + 99;
if( (pa >= b) && (pa<=b_top) )
{
printf( "Pa is pointing to an element in array b\n" );
}
else
{
printf( "Pa is not pointing to an element in array b\n" );
}
}
Will the else branch always execute in theory (it probably will in
practice)?
Does the standard define what comparison between pointers which point
to different arrays do?

Yes, the standard is quite clear about this in the section about
"Relational operators":

If the objects pointed to are not members of the same aggregate
or union object, the result is undefined, with the following
exception. If P points to the last member of an array object
and Q points to a member of the same array object, the pointer
expression P+1 compares higher than Q , even though P+1 does
not point to a member of the array object.

I.e. by comparing pointers to different arrays you invoke
undefined behaviour and the result is meaningless.

Which is something of a shame - although finding out if the array 'a' is
lower in memory than the array 'b' could be meaningless on machines
without a flat memory model, finding out if a particular pointer is pointing
to a place within a particular object could, conceivably, be useful in
some rather strange circumstances - the OP's program is asking a not
silly question (I have a pointer to an integer, and an array of
integers, is my pointer pointing to a place inside the array).

I can see why - it's hard to see how you can legitimise it while keeping
if(a < b) outlawed - but I could also see a use for it.
 
Keith Thompson

Nick said:
Jens Thoms Toerring writes: [...]
I.e. by comparing pointers to different arrays you invoke
undefined behaviour and the result is meaningless.

Which is something of a shame - although finding out if the array 'a' is
lower in memory than the array 'b' could be meaningless on machines
without a flat memory model, finding out if a particular pointer is pointing
to a place within a particular object could, conceivably, be useful in
some rather strange circumstances - the OP's program is asking a not
silly question (I have a pointer to an integer, and an array of
integers, is my pointer pointing to a place inside the array).

I can see why - it's hard to see how you can legitimise it while keeping
if(a < b) outlawed - but I could also see a use for it.

Agreed. (You can achieve the same thing in 100% portable C by
comparing the address for equality to the address of each byte
within the array, but that's not feasible for large arrays -- and
the simpler method, though not guaranteed to work, will *probably*
work on most implementations.)
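
The portable method described above can be sketched as follows. The name points_into and its signature are assumptions made for this example; the only pointer comparison it uses is ==, which is defined for any two valid pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if p equals the address of some byte of the object that
 * starts at obj and is size bytes long, 0 otherwise.
 * Cost is O(size), which is why this is impractical for large arrays. */
int points_into(const void *p, const void *obj, size_t size)
{
    assert(obj != NULL);
    const unsigned char *base   = obj;
    const unsigned char *target = p;
    for (size_t i = 0; i < size; ++i)
        if (base + i == target)   /* equality only, never < or > */
            return 1;
    return 0;
}
```

With the arrays from the original program, points_into(pa, b, sizeof b) answers the question the OP asked without relying on undefined behaviour.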

As for why the standard doesn't guarantee anything, consider a
system with a segmented addressing scheme, where an address consists
of something that identifies a segment plus an offset within the
segment. Each object must be contained within a single segment.
"<" and ">" comparisons on pointers can then compare just the offset
portion of the addresses, which may be significantly more efficient.

The standard could have required "<" and ">" to work sensibly
on pointers to distinct objects, but that would require such
implementations to use more expensive operations for such
comparisons. It would also require some consistent ordering to be
imposed on the segments, which might require still more work.
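
As a rough model of that trade-off, the sketch below simulates a segmented address in plain C. The struct far_ptr layout and both function names are invented for illustration and do not describe any particular compiler's pointer representation.

```c
#include <stdint.h>

/* Hypothetical segmented address: a segment identifier plus an offset. */
struct far_ptr {
    uint16_t segment;
    uint16_t offset;
};

/* Cheap comparison: looks at the offset only. This is correct when both
 * addresses lie in the same segment (the same object) and meaningless
 * when they do not. */
int far_less_fast(struct far_ptr a, struct far_ptr b)
{
    return a.offset < b.offset;
}

/* More expensive comparison: imposes a total order by comparing the
 * segment first and the offset second. It assumes segment numbers can
 * be ordered consistently, which a real system might not offer for free. */
int far_less_total(struct far_ptr a, struct far_ptr b)
{
    if (a.segment != b.segment)
        return a.segment < b.segment;
    return a.offset < b.offset;
}
```

For two addresses in the same segment the two functions agree. For addresses in different segments, far_less_fast can report an order that has nothing to do with where the objects actually live.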
 
Peter Nilsson

Nick Keighley said:
....
I don't think C++ allows you compare pointers to different
objects either. ...

The original context was relational operators, but I think it's
worth mentioning that pointers to different objects are often
compared with each other via equality operators.
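
A minimal sketch of such an equality comparison, assuming a made-up helper called find_object: the table entries may point to completely unrelated objects, and == is still well defined for each of them.

```c
#include <stddef.h>

/* Returns the index of the first entry in table[0..n-1] that compares
 * equal to target, or -1 if there is none. Only == is used, so the
 * pointed-to objects do not have to belong to the same array. */
int find_object(const int *const table[], size_t n, const int *target)
{
    for (size_t i = 0; i < n; ++i)
        if (table[i] == target)
            return (int)i;
    return -1;
}
```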

That a conforming implementation must support this makes me
wonder why the application of relational operators on pointers
to 'different' objects is undefined, rather than unspecified.

The only reason I can think of is to cater for some sort of
interpreted implementation where arrays are implemented as
linked lists. Is there perhaps a more pragmatic reason?
 
Antoninus Twink

the OP's program is asking a not silly question (I have a pointer to an
integer, and an array of integers, is my pointer pointing to a place
inside the array).

I suggest you ignore the idiotic paranoia of the regulars. This
construct will work perfectly on every single desktop platform you're
ever likely to encounter, either now or in the future.

If you're programming in the real world, be a pragmatist. If you want to
debate angels on pinheads with the clc "regulars", then by all means be
a pedantic literalist.
 
Seebs

That a conforming implementation must support this makes me
wonder why the application of relational operators on pointers
to 'different' objects is undefined, rather than unspecified.
The only reason I can think of is to cater for some sort of
interpreted implementation where arrays are implemented as
linked lists. Is there perhaps a more pragmatic reason?

Making !=/== do extra work that's needed to keep things from blowing up
has lower impact than making all the pointer relational operators do that.

When you're working within an object, you're unlikely to compare pointers
for equality, so the extra cost is trivial; when you're comparing across
objects, though, you might well need equality to work.

-s
 
Nick

Antoninus Twink said:
I suggest you ignore the idiotic paranoia of the regulars. This
construct will work perfectly on every single desktop platform you're
ever likely to encounter, either now or in the future.

Please don't patronise me. I'm quite capable of working out who gives
sensible advice, and who leaps straight for the "you can't do"
autoresponder. I'm also quite capable of spotting cronies or
sockpuppets of people who never post anything about programming at all.
I'd already mentally put you on the "candidate for plonking" list, along
with your mates. You might be helping to fast-track your application.

For the record, in my time I've programmed in C on a lot of machines
that don't fit this description. I've programmed on non-ascii machines,
I've programmed on machines where, except for chars, everything was 64
bits. I've programmed on machines where the introduction of a standard
compiler blew my software up because suddenly the bits were ordered from
the opposite end.

I also write software now that assumes ascii (and comments on it), and
uses a pile of system specific stuff - nicely partitioned off so it can
be changed if and when the system changes. Indeed, the code is running
on its third generation of processor and operating system.
If you're programming in the real world, be a pragmatist. If you want to
debate angels on pinheads with the clc "regulars", then by all means be
a pedantic literalist.

But note that when you do go outside the realms of what is standardised,
it's worth knowing that you are doing it.
 
Keith Thompson

Nick said:
Please don't patronise me. I'm quite capable of working out who gives
sensible advice, and who leaps straight for the "you can't do"
autoresponder.
[...]

Nick, by all means feel free to waste your own time reading what the
trolls post, or even responding to them by e-mail, but please don't
waste everyone else's time by posting your responses to the newsgroup.
Most of us have already killfiled them.

(I see AT uses a fake e-mail address, so I suppose replying by e-mail
isn't an option. Oh well.)
 
Flash Gordon

Pillsy said:
On Dec 1, 11:03 am, Nick Keighley wrote:
[...]
I don't think C++ allows you to compare pointers to different objects
either.

I don't want to wander too deep into the dubiously topical weeds of
how C++ works, but (again AIUI) the STL provides a default ordering on
pointers through std::less; that doesn't mean you can use the
"<" operator.

The C equivalent would be adding a library function to do this, of course.
Well, yeah, but that's the point of making it part of the standard
library, isn't it?

It would be, and if you want to propose it I suggest wandering over to
comp.std.c
I don't see how even overlapping or interleaving prevents you from
defining an ordering that's sufficient for the purpose at hand
(stuffing things in a binary tree). Then again, I'm not a mainframe
programmer, and it's more than possible that I'm missing something.

I actually suspect that a major reason for it not being defined
originally was small architectures, such as the Intel 8086. On such
architectures each individual object will fit within a segment, but the
program may be using objects in several segments. It is quick to compare
just the offset, but comparing both segment and offset, especially if
you need to normalise the pointers first, is a rather more major task.

As you say, I'm sure an ordering could be defined for even the most
exotic architecture, even if it does not really have an ordering.

If functions were added I would suggest two functions...
ptrcmp(void*,void*)
ptrbtwn(void*,void*,void*)
After all, you might be able to check if a pointer is in a range more
efficiently than doing two individual pointer comparisons.
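
One possible shape for such functions, sketched here by converting through uintptr_t. This is an assumption, not part of any standard library: uintptr_t is an optional type, the pointer-to-integer mapping is implementation-defined, and the resulting order need not match how objects are laid out in memory. The const qualifiers are also an addition to the signatures suggested above.

```c
#include <stdint.h>

/* Hypothetical ptrcmp: returns a negative value, zero, or a positive
 * value, in the style of strcmp. Pointers are ordered by the integer
 * value they convert to on this implementation. */
int ptrcmp(const void *a, const void *b)
{
    uintptr_t ia = (uintptr_t)a;
    uintptr_t ib = (uintptr_t)b;
    return (ia > ib) - (ia < ib);
}

/* Hypothetical ptrbtwn: nonzero if lo <= p <= hi under the ptrcmp order. */
int ptrbtwn(const void *p, const void *lo, const void *hi)
{
    return ptrcmp(lo, p) <= 0 && ptrcmp(p, hi) <= 0;
}
```

An ordering like this is enough for storing pointers in a binary tree on a given implementation, but a range test built on it does not portably prove that a pointer lies inside a particular array.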
 
Antoninus Twink

I'd already mentally put you on the "candidate for plonking" list, along
with your mates. You might be helping to fast-track your application.

Go right ahead - it will certainly earn you plenty of brownie points
with the "regulars".
For the record, in my time I've programmed in C on a lot of machines
that don't fit this description.

Good for you.
But note that when you do go outside the realms of what is standardised,
it's worth knowing that you are doing it.

I don't disagree. In fact, pragmatism precisely means that you should
decide on the right balance between portability and functionality.

Let's imagine I have two large arrays, say size n, one called A and the
other called B. A and B both store struct foos, all struct foos either
live in A or in B, and I need to be able to tell whether a given struct
foo comes from A or B.

I see three choices here:

1) I can add a field to each struct foo saying whether it belongs to A
or B. This will either require some fiddling and complexity in order to
use a spare bit in one of the struct fields for this purpose, or else
I've increased the memory by at least 2n bytes.

2) the Heathfield solution:
int is_in_A(struct foo *p)
{
    struct foo *q;
    for (q = A; q < A + n; q++)
        if (q == p)
            return 1;
    return 0;
}
This is O(n) work EVERY TIME I want to check whether the struct is in A
or B. But it's standards compliant.

3) the OP's solution:
#define IS_IN_A(p) ((p) >= A && (p) < A + n)
Now it's O(1).

The /only/ circumstances in which a pragmatic programmer should be
prepared to pay the price of 1) or 2) are

* either if they're taking part in an academic exercise (e.g. all of
Heathfield's programming falls into this category),

* or if they have DAMN good reasons to expect that their program will
need to run on some obscure architecture where pointer representations
will screw this up.
 
Seebs

(I see AT uses a fake e-mail address, so I suppose replying by e-mail
isn't an option. Oh well.)

The purpose of replying by email is to communicate with someone. I have
seen nothing to suggest that there is anything to communicate with in
that case, so no real loss.

-s
 
Antoninus Twink

Nick, by all means feel free to waste your own time reading what the
trolls post, or even responding to them by e-mail, but please don't
waste everyone else's time by posting your responses to the newsgroup.
Most of us have already killfiled them.

Keith, if I didn't know you were incapable of humor, I'd chalk this up
as comic genius.

This must be the most patronizing response to a request not to be
patronized that it's possible to imagine.
 
Pillsy

The standard could have required "<" and ">" to work sensibly
on pointers to distinct objects, but that would require such
implementations to use more expensive operations for such
comparisons.  It would also require some consistent ordering to be
imposed on the segments, which might require still more work.

That work wouldn't necessarily have to be done by the relational
operators. "<" and ">" don't provide lexicographic ordering on
strings, either, but strcmp() does. Unlike strcmp(), implementing
these kinds of pointer comparisons evidently requires an extensive
knowledge of non-portable details of an implementation.

Cheers,
Pillsy
 
Eric Sosman

Peter said:
The original context was relational operators, but I think it's
worth mentioning that pointers to different objects are often
compared with each other via equality operators.

That a conforming implementation must support this makes me
wonder why the application of relational operators on pointers
to 'different' objects is undefined, rather than unspecified.

The only reason I can think of is to cater for some sort of
interpreted implementation where arrays are implemented as
linked lists. Is there perhaps a more pragmatic reason?

There have been computers with multiple address spaces[*],
where relations within a space make sense but cross-space
relations don't. Loose analogy: You may know that "1234 Elm
Street" is farther out of town than "96 Elm Street," but what
can you say about its position relative to "1447 River Drive?"

[*] At least one contemporary chip supports multiple spaces,
although I am not aware of any machine that actually uses them
except to distinguish I/O registers from ordinary memory. Also,
the address space ID is hidden inside the MMU where the code
never sees it as part of a virtual address, so virtual addresses
have a natural ordering even across spaces. But if somebody
finds a way to quadruple the teraflops by exploiting multiple
address spaces, it's likely they'll make use of the idea ...
 
