# Decrement a given pointer.

Discussion in 'C Programming' started by Michael Press, Jul 9, 2013.

1. ### Michael Press (Guest)

Given a pointer, p, can I set pm1 = p-1 and use pm1
without worrying that an implementation will object
or do other than what one expects? The idea is to
get offset one arrays, e.g.,

void
f(int *p, int l)
{
    int i;
    int t;
    int *pm1 = p - 1;

    for(i = 1; i <= l; i++)
        t = pm1[i];
}

int a[] = {1, 2, 3};
int na = sizeof a / sizeof *a;

void
doit(void)
{
    int *b = a;

    f(a, na);
    f(b, na);
}

--
Michael Press

Michael Press, Jul 9, 2013

2. ### Keith Thompson (Guest)

Michael Press writes:
> Given a pointer, p, can I set pm1 = p-1 and use pm1
> without worrying that an implementation will object
> or do other than what one expects? The idea is to
> get offset one arrays, e.g.,
>
> void
> f(int *p, int l)
> {
> int i;
> int t;
> int *pm1 = p - 1;
>
> for(i = 1; i <= l; i++)
> t = pm1[i];
> }
>
> int a[] = {1, 2, 3};
> int na = sizeof a / sizeof *a;
>
> void
> doit(void)
> {
> int *b = a;
>
> f(a, na);
> f(b, na);
> }

No. Given a pointer to an array element, you can safely construct a
pointer to any element of the array. You can also safely construct a
pointer just past the end of the array, but you can't dereference it.
Using pointer arithmetic to construct a pointer outside the bounds of
the array has undefined behavior. (A single object is treated as a
one-element array for purposes of pointer arithmetic.)
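
As a rough sketch of what stays inside these rules (not code from the original post): the subtraction can be applied to the index rather than to the pointer, and the one-past-the-end pointer can be used as a loop bound. The function names `sum_one_based` and `sum_with_end_pointer` are hypothetical, chosen only for illustration.

```c
#include <assert.h>

/* Hypothetical helper: sums p[0..l-1] while the caller thinks in
 * 1-based indices.  The subtraction is applied to the index, so the
 * pointer p - 1 is never formed. */
int sum_one_based(const int *p, int l)
{
    int total = 0;
    for (int i = 1; i <= l; i++)
        total += p[i - 1];      /* i - 1 is in 0..l-1, always in bounds */
    return total;
}

/* The same sum using the one-past-the-end pointer, which may be
 * formed and compared against but is never dereferenced here. */
int sum_with_end_pointer(const int *p, int l)
{
    assert(l >= 0);
    const int *end = p + l;     /* assumes p points into an array of at least l elements */
    int total = 0;
    for (const int *q = p; q != end; q++)
        total += *q;
    return total;
}
```

Called with the array from the question, `sum_one_based(a, na)` and `sum_with_end_pointer(a, na)` read the same three elements without ever computing `a - 1`.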

It's fairly likely to work on most systems, but it's not guaranteed.

I seem to recall that the book "Numerical Recipes in C" used this
technique to translate Fortran code into C.

--
Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Keith Thompson, Jul 9, 2013

3. ### Joe Pfeiffer (Guest)

Michael Press writes:

> Given a pointer, p, can I set pm1 = p-1 and use pm1
> without worrying that an implementation will object
> or do other than what one expects? The idea is to
> get offset one arrays, e.g.,

<snip>

Keith has already given what I expect is the right answer to your
question, but I'd go on to ask "why?". Unless there's a *really* good
reason, you should simply use the language as designed.

Having said that, I'll mention that I have occasion to use what amount
to offset 1 arrays on a current project: I'm obtaining altimeter data
from an altimeter that has a parameter that goes from 1 to 9; it seems
less error-prone to me to use a 10 element array and just waste element
0 than to mess with macros or other code to add and subtract 1 from an
index in multiple places. But you'll notice that this approach to it
doesn't depend on tricky code having undefined behavior do the right
thing.
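
A minimal sketch of that approach, assuming an integer reading per parameter; the names `reading`, `store_reading` and `load_reading` are made up for illustration and are not from the actual project:

```c
#include <assert.h>

#define PARAM_MIN 1
#define PARAM_MAX 9

/* Ten slots so that indices 1..9 are all valid; slot 0 is unused. */
static int reading[PARAM_MAX + 1];

/* Stores a value under a 1-based parameter number. */
void store_reading(int param, int value)
{
    assert(param >= PARAM_MIN && param <= PARAM_MAX);
    reading[param] = value;
}

/* Fetches the value stored under a 1-based parameter number. */
int load_reading(int param)
{
    assert(param >= PARAM_MIN && param <= PARAM_MAX);
    return reading[param];
}
```

The cost is one unused `int`; in exchange every index that reaches the array is used as-is, with no offset arithmetic to get wrong.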

Joe Pfeiffer, Jul 9, 2013
4. ### Siri Cruise (Guest)

Keith Thompson wrote:

> It's fairly likely to work on most systems, but it's not guaranteed.

Some CPUs use address registers and check address validity when the address is
computed rather than waiting for address load. If the array is at the beginning
of some memory partition, these kinds of CPUs can get an address fault.
--
Mommy is giving the world some kind of bird.
:-<> Siri Seal of Disavowal #000-001. Disavowed. Denied. Deleted.
NSA CIA Constitution patriot terrorism freedom Snowden Paid Maternity Leave

Siri Cruise, Jul 9, 2013
5. ### Eric Sosman (Guest)

On 7/8/2013 7:12 PM, Michael Press wrote:
> Given a pointer, p, can I set pm1 = p-1 and use pm1
> without worrying that an implementation will object
> or do other than what one expects? The idea is to
> get offset one arrays, e.g.,
> [...]

This is Question 6.17 on the comp.lang.c Frequently
Asked Questions (FAQ) page at <http://www.c-faq.com/>.

--
Eric Sosman

Eric Sosman, Jul 9, 2013
6. ### Ian Collins (Guest)

Siri Cruise wrote:
> Keith Thompson wrote:
>
>> It's fairly likely to work on most systems, but it's not guaranteed.
>
> Some CPUs use address registers and check address validity when the address is
> computed rather than waiting for address load. If the array is at the beginning
> of some memory partition, these kinds of CPUs can get an address fault.

Not in the context of the OP, the address p-1 was never dereferenced.

--
Ian Collins

Ian Collins, Jul 9, 2013
7. ### Eric Sosman (Guest)

On 7/8/2013 9:14 PM, Ian Collins wrote:
> Siri Cruise wrote:
>> Keith Thompson wrote:
>>
>>> It's fairly likely to work on most systems, but it's not guaranteed.
>>
>> Some CPUs use address registers and check address validity when the
>> address is computed rather than waiting for address load. If the array
>> is at the beginning of some memory partition, these kinds of CPUs can
>> get an address fault.
>
> Not in the context of the OP, the address p-1 was never dereferenced.

Even computing it (trying to compute it) yields undefined
behavior. FAQ 6.17.

--
Eric Sosman

Eric Sosman, Jul 9, 2013
8. ### Siri Cruise (Guest)

Ian Collins wrote:

> Siri Cruise wrote:
> > Keith Thompson wrote:
> >
> >> It's fairly likely to work on most systems, but it's not guaranteed.
> >
> > Some CPUs use address registers and check address validity when the
> > address is computed rather than waiting for address load. If the array
> > is at the beginning of some memory partition, these kinds of CPUs can
> > get an address fault.
>
> Not in the context of the OP, the address p-1 was never dereferenced.

It's not guaranteed to work because some CPUs validate addresses on computation
before dereference.
--
Mommy is giving the world some kind of bird.
:-<> Siri Seal of Disavowal #000-001. Disavowed. Denied. Deleted.
NSA CIA Constitution patriot terrorism freedom Snowden Paid Maternity Leave

Siri Cruise, Jul 9, 2013
9. ### Keith Thompson (Guest)

Ian Collins writes:
> Siri Cruise wrote:
>> Keith Thompson wrote:
>>
>>> It's fairly likely to work on most systems, but it's not guaranteed.
>>
>> Some CPUs use address registers and check address validity when the
>> address is computed rather than waiting for address load. If the array
>> is at the beginning of some memory partition, these kinds of CPUs can
>> get an address fault.
>
> Not in the context of the OP, the address p-1 was never dereferenced.

Siri Cruise said that the validity of the address is checked when it's
computed, not when it's dereferenced, so yes, that kind of CPU would get
an address fault.

(Which is consistent with my statement, since most CPUs don't do that.
Still, I certainly don't recommend counting on that.)

--
Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Keith Thompson, Jul 9, 2013
10. ### glen herrmannsfeldt (Guest)

Siri Cruise wrote:

(snip on generating p-1 where p is a pointer to something)

>> Not in the context of the OP, the address p-1 was never
>> dereferenced.

> It's not guaranteed to work because some CPUs validate addresses
> on computation before dereference.

Do you know of any actual such CPUs in current use?

The most popular CPU that uses anything similar wraps the
address on computation, such that it works.

I used protected mode on the 80286 in OS/2, and then on the 486 for
a while before OS/2 2.0 came out. On the 80286 in protected mode,
addresses consist of a segment selector, selecting a segment descriptor,
and a 16 bit offset into the segment. If you subtract, the offset will
wrap, and when you add one again, will wrap back again.
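
The wrap-and-wrap-back can be modelled with plain unsigned 16-bit arithmetic. This is only a sketch of the offset half of a segment:offset address, and `offset_add` is a hypothetical name; it says nothing about what the C standard permits for real pointers.

```c
#include <assert.h>
#include <stdint.h>

/* Models only the 16-bit offset part of a segment:offset address.
 * Converting the result to uint16_t reduces it modulo 2^16, which is
 * well defined, unlike out-of-bounds arithmetic on real pointers. */
uint16_t offset_add(uint16_t offset, int delta)
{
    return (uint16_t)(offset + delta);
}
```

In this model `offset_add(0, -1)` gives 0xFFFF, and adding 1 again brings it back to 0, which is the behaviour described for large mode.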

The CPU will validate a segment selector when loaded into a segment
register, except that segment 0 is the null segment selector.
(A special case in hardware.)

If a system does bounds checking, it is possible that the
bounds check will notice, but even then it is likely done
only at dereference.

But yes, it violates the standard but most likely will work.

-- glen

glen herrmannsfeldt, Jul 9, 2013
11. ### Siri Cruise (Guest)

Keith Thompson wrote:

> Ian Collins writes:
> > Siri Cruise wrote:
> >> Keith Thompson wrote:
> >>
> >>> It's fairly likely to work on most systems, but it's not guaranteed.
> >>
> >> Some CPUs use address registers and check address validity when the
> >> address is computed rather than waiting for address load. If the array
> >> is at the beginning of some memory partition, these kinds of CPUs can
> >> get an address fault.
> >
> > Not in the context of the OP, the address p-1 was never dereferenced.
>
> Siri Cruise said that the validity of the address is checked when it's
> computed, not when it's dereferenced, so yes, that kind of CPU would get
> an address fault.
>
> (Which is consistent with my statement, since most CPUs don't do that.
> Still, I certainly don't recommend counting on that.)

An alternative is something like
int A_[m,n];
#define A(j,k) A_[(j)-1,(k)-1]
--
Mommy is giving the world some kind of bird.
:-<> Siri Seal of Disavowal #000-001. Disavowed. Denied. Deleted.
NSA CIA Constitution patriot terrorism freedom Snowden Paid Maternity Leave

Siri Cruise, Jul 9, 2013
12. ### James Kuyper (Guest)

On 07/08/2013 10:44 PM, Siri Cruise wrote:
....
> An alternative is something like
> int A_[m,n];
> #define A(j,k) A_[(j)-1,(k)-1]

That's equivalent to

int A_[n];
#define A(j,k) A_[(k)-1]

Were you thinking of Fortran?
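
To illustrate the point about the subscript, here is a small sketch of how the comma operator behaves there. The functions `comma_index` and `pick` are hypothetical; the `(void)` cast is only there to keep compilers from warning about the discarded left operand.

```c
#include <assert.h>

/* Returns the value of the comma expression (j - 1, k - 1).
 * The left operand is evaluated and discarded, so the result is k - 1. */
int comma_index(int j, int k)
{
    return ((void)(j - 1), k - 1);
}

/* Indexes a one-dimensional array with a comma expression, as the
 * macro A(j,k) would after expansion: only k selects the element. */
int pick(const int *row, int j, int k)
{
    return row[((void)(j - 1), k - 1)];
}
```

Whatever `j` is, `pick(row, j, 3)` returns `row[2]`, which is why the macro does not give a two-dimensional array.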
--
James Kuyper

James Kuyper, Jul 9, 2013
13. ### Stephen Sprunk (Guest)

On 08-Jul-13 21:20, glen herrmannsfeldt wrote:
> Siri Cruise wrote:
>>> Not in the context of the OP, the address p-1 was never
>>> dereferenced.
>>
>> It's not guaranteed to work because some CPUs validate addresses on
>> computation before dereference.
>
> Do you know of any actual such CPUs in current use?

AS/400 is commonly cited here as an example of such a system.

> The most popular CPU that uses anything similar wraps the address on
> computation, such that it works.
>
> I used protected mode on the 80286 in OS/2, and then on the 486 for a
> while before OS/2 2.0 came out. On the 80286 in protected mode,
> addresses consist of a segment selector, selecting a segment
> descriptor, and a 16 bit offset into the segment. If you subtract,
> the offset will wrap, and when you add one again, will wrap back
> again.

You seem to be assuming that the wrapped pointer will not exceed the
segment limit and generate an exception. That is probably true on x86
systems, where the segment limit is almost always (unsigned)(-1), but
probably not on other segmented systems.

> The CPU will validate a segment selector when loaded into a segment
> register, except that segment 0 is the null segment selector. (A
> special case in hardware.)

True, but the selector would remain valid if the original pointer were
valid. i286 doesn't validate the offset part, which is likely to be
invalid in this case, before a load or store is performed; doing so
would be impossible since it doesn't have dedicated address registers.

S

--
Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking

Stephen Sprunk, Jul 9, 2013
14. ### glen herrmannsfeldt (Guest)

Stephen Sprunk wrote:

(snip, someone wrote)
>>> It's not guaranteed to work because some CPUs validate
>>> addresses on computation before dereference.

>> Do you know of any actual such CPUs in current use?

> AS/400 is commonly cited here as an example of such a system.

Yes, they might do it. Do they have a C compiler?

>> The most popular CPU that uses anything similar wraps the
>> address on computation, such that it works.

>> I used protected mode on the 80286 in OS/2, and then on the 486 for
>> a while before OS/2 2.0 came out. On the 80286 in protected mode,
>> addresses consist of a segment selector, selecting a segment
>> descriptor, and a 16 bit offset into the segment. If you subtract,
>> the offset will wrap, and when you add one again, will wrap back
>> again.

> You seem to be assuming that the wrapped pointer will not exceed the
> segment limit and generate an exception. That is probably true on x86
> systems, where the segment limit is almost always (unsigned)(-1), but
> probably not on other segmented systems.

I am not sure at all what it does in huge mode, I never used that.
In large mode, the offset is in an ordinary register, and will
wrap back again before the dereference.

The segment selector has to exist when the value is loaded into
a segment register, but the offset isn't checked until an actual

In 32 bit protected mode, I believe it is usual to set the limit to,
as you note, (unsigned)(-1), but in 16 bit mode, no. In 32 bit
mode, you have the PMMU to validate addresses, in 16 bit the only
validation is the segment selector limit.

segment register would require the segment be valid.

>> The CPU will validate a segment selector when loaded into a
>> segment register, except that segment 0 is the null segment
>> selector. (A special case in hardware.)

> True, but the selector would remain valid if the original pointer were
> valid. i286 doesn't validate the offset part, which is likely to be
> invalid in this case, before a load or store is performed; doing so
> would be impossible since it doesn't have dedicated address registers.

Well, there are instructions that load both a segment register and
another register with a segment/offset pair. In that case it could
be done, but I don't believe it is done. It would be extra work
that isn't necessary.

I don't know AS/400 addressing enough to know if it is necessary
to validate early.

-- glen

glen herrmannsfeldt, Jul 9, 2013
15. ### glen herrmannsfeldt (Guest)

Gordon Burditt wrote:
>> Do you know of any actual such CPUs in current use?

>> The most popular CPU that uses anything similar wraps the
>> address on computation, such that it works.

> There are other problems that can happen besides faults
> happening when you form the address.

> Consider this loop to traverse an array of structures backwards:

> struct huge *p;
> struct huge bigarray[MAX];

> /* WRONG! */
> for (p = &bigarray[MAX-1]; p >= &bigarray[0]; p--) {
> ... do something with struct huge
> pointed at by p ...;
> }

In huge mode, it should work, but not in large mode.
Huge mode decrements the segment selector when the offset wraps.
(Much extra code to do that, so I try not to use it.)

The segment selector will be invalid, but I believe the arithmetic
is not done in segment registers. (There is no decrement operation
on segment registers.)

> In order for this loop to stop p has to equal &bigarray[-1], and
> this value has to be less than &bigarray[0]. In a situation where
> (a) pointers are compared as unsigned numbers, (b) global data is
> allocated starting around virtual address 0, and (c) there isn't
> much other global data compared to the size of a struct huge,
> &bigarray[-1] overflows to a large positive number. The loop never
> terminates. (Well, when p is set to a large positive number there's
> a good chance no memory is allocated there, so the body of the
> loop will segfault.)

Again, large but not huge. Huge mode has to compare both the segment
and offset of the pointer. Large mode only the offset.

> This problem was actually observed on a Motorola 68000 processor,
> one that generally behaves like you "expect" rather than quirky
> behavior the standard allows.

The 68000 is a 16 bit processor, but able to address more than 64K.
I don't remember quite how they did it.

Some time before the 68020, I used a 68010 system with a custom MMU.

-- glen

glen herrmannsfeldt, Jul 9, 2013
16. ### Siri Cruise (Guest)

glen herrmannsfeldt wrote:

> > AS/400 is commonly cited here as an example of such a system.

>
> Yes, they might do it. Do they have a C compiler?

It doesn't matter whether you think this is the result of stupid design. What
matters is some vendor with enough influence with ANSI got this caveat written
into the C standard. Code that violates it may run on 99% of all machines; but
it is still code not guaranteed for 100%. If you're happy with that and so are
your customers, go for it. I write code that only runs on Unix or even just
MacOSX. My customers pay for that, so I'm fine with being nonstandard.

I added my comment simply to explain why such an odd rule exists. I once worked
on CDC computers with address registers so I happen to be aware of these issues.
I have no comment on whether this is a good idea.

I avoid the issue by letting array indices go out of bounds instead of pointers,
such as
#define A(j,k) A_[(j)-1][(k)-1]
or
for (int j=n-1; j>=0; j--) f(B[j]);
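
A sketch of the first form with a real two-dimensional array behind it. The sizes `M` and `N` and the function `fill` are assumptions made up for the example:

```c
#include <assert.h>

#define M 2   /* rows, hypothetical size */
#define N 3   /* columns, hypothetical size */

static int A_[M][N];

/* 1-based view of A_: the subtraction happens on the indices, so no
 * out-of-bounds pointer is formed while 1 <= j <= M and 1 <= k <= N. */
#define A(j, k) A_[(j) - 1][(k) - 1]

/* Fills A(j,k) with 10*j + k using Fortran-style 1-based loops. */
void fill(void)
{
    for (int j = 1; j <= M; j++)
        for (int k = 1; k <= N; k++)
            A(j, k) = 10 * j + k;
}
```

After `fill()`, `A(1, 1)` is the same object as `A_[0][0]` and `A(2, 3)` is `A_[1][2]`.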
--
Mommy is giving the world some kind of bird.
:-<> Siri Seal of Disavowal #000-001. Disavowed. Denied. Deleted.
NSA CIA Constitution patriot terrorism freedom Snowden Paid Maternity Leave

Siri Cruise, Jul 10, 2013
17. ### Stephen Sprunk (Guest)

On 09-Jul-13 14:50, glen herrmannsfeldt wrote:
> Gordon Burditt wrote:
>> This problem was actually observed on a Motorola 68000 processor,
>> one that generally behaves like you "expect" rather than quirky
>> behavior the standard allows.

>
> The 68000 is a 16 bit processor, but able to address more than 64K. I
> don't remember quite how they did it.

The m68k is a 32-bit processor, at least in the sense it presented
32-bit registers and a 32-bit address space to the programmer. The
first implementation used pairs of 16-bit registers with carry, but that
was invisible to the programmer; code continued to work as-is when
ported to later implementations that had true 32-bit registers.

bits), so it was possible for the CPU to fault when loading or
manipulating an invalid pointer even without dereferencing it.

S

--
Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking

Stephen Sprunk, Jul 10, 2013
18. ### Michael Press (Guest)

Joe Pfeiffer wrote:

> Michael Press writes:
>
> > Given a pointer, p, can I set pm1 = p-1 and use pm1
> > without worrying that an implementation will object
> > or do other than what one expects? The idea is to
> > get offset one arrays, e.g.,

>
> <snip>
>
> Keith has already given what I expect is the right answer to your
> question, but I'd go on to ask "why?". Unless there's a *really* good
> reason, you should simply use the language as designed.

Indexing into a heap. The top of the heap is heap[1].
The two subsidiary nodes to heap[k] are
heap[2 * k] and heap[2 * k + 1].

> Having said that, I'll mention that I have occasion to use what amount
> to offset 1 arrays on a current project: I'm obtaining altimeter data
> from an altimeter that has a parameter that goes from 1 to 9; it seems
> less error-prone to me to use a 10 element array and just waste element
> 0 than to mess with macros or other code to add and subtract 1 from an
> index in multiple places.

I often have arrays that naturally start at 1 and do the same
as you: waste the array entry at index 0. Sometimes I do not
have the choice.

> But you'll notice that this approach to it
> doesn't depend on tricky code having undefined behavior do the right
> thing.

--
Michael Press

Michael Press, Jul 10, 2013
19. ### Michael Press (Guest)

Gordon Burditt wrote:

> > Do you know of any actual such CPUs in current use?
> >
> > The most popular CPU that uses anything similar wraps the
> > address on computation, such that it works.

>
> There are other problems that can happen besides faults
> happening when you form the address.
>
> Consider this loop to traverse an array of structures backwards:
>
> struct huge *p;
> struct huge bigarray[MAX];
>
> /* WRONG! */
> for (p = &bigarray[MAX-1]; p >= &bigarray[0]; p--) {
> ... do something with struct huge
> pointed at by p ...;
> }

So write

for (int k = MAX; k-- > 0; ) { ... do something with bigarray[k] ...; }
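
A self-contained sketch of the same countdown idiom, using `size_t` instead of `int` to show that it also works with an unsigned index. The function `reverse_copy` is a hypothetical example, not code from the thread:

```c
#include <assert.h>
#include <stddef.h>

/* Copies src[0..n-1] into dst in reverse order, visiting src from the
 * last element down to the first.  The test k-- > 0 compares before
 * decrementing, so k takes the values n-1, ..., 0 inside the body and
 * no pointer below src is ever formed. */
void reverse_copy(int *dst, const int *src, size_t n)
{
    assert(n == 0 || (dst != NULL && src != NULL));
    size_t out = 0;
    for (size_t k = n; k-- > 0; )
        dst[out++] = src[k];
}
```

When `k` reaches 0 the comparison fails first, so the loop exits without the body ever seeing a wrapped index.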

--
Michael Press

Michael Press, Jul 10, 2013
20. ### Joe Pfeiffer (Guest)

Michael Press writes:

> Joe Pfeiffer wrote:
>
>> Michael Press writes:
>>
>> > Given a pointer, p, can I set pm1 = p-1 and use pm1
>> > without worrying that an implementation will object
>> > or do other than what one expects? The idea is to
>> > get offset one arrays, e.g.,

>>
>> <snip>
>>
>> Keith has already given what I expect is the right answer to your
>> question, but I'd go on to ask "why?". Unless there's a *really* good
>> reason, you should simply use the language as designed.

>
> Indexing into a heap. The top of the heap is heap[1].
> The two subsidiary nodes to heap[k] are
> heap[2 * k] and heap[2 * k + 1].

Ah, should have thought of that one. While it's not quite as elegant as
the standard scheme, using heap[2*k + 1] and heap[2*k + 2] works just
fine with 0-offset arrays and doesn't involve weird messing with
pointers.
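
For instance, a sift-down for a max-heap rooted at index 0 might look roughly like this (a sketch only; `sift_down0` is a made-up name and a max-heap is assumed):

```c
#include <assert.h>
#include <stddef.h>

/* Restores the max-heap property below node k in heap[0..n-1].
 * With the root at index 0, the children of k are 2k+1 and 2k+2. */
void sift_down0(int *heap, size_t n, size_t k)
{
    assert(k < n);
    for (;;) {
        size_t left = 2 * k + 1;
        size_t right = 2 * k + 2;
        size_t largest = k;

        if (left < n && heap[left] > heap[largest])
            largest = left;
        if (right < n && heap[right] > heap[largest])
            largest = right;
        if (largest == k)
            return;             /* node k already dominates its children */

        int tmp = heap[k];
        heap[k] = heap[largest];
        heap[largest] = tmp;
        k = largest;            /* continue from the child we swapped with */
    }
}
```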

>> Having said that, I'll mention that I have occasion to use what amount
>> to offset 1 arrays on a current project: I'm obtaining altimeter data
>> from an altimeter that has a parameter that goes from 1 to 9; it seems
>> less error-prone to me to use a 10 element array and just waste element
>> 0 than to mess with macros or other code to add and subtract 1 from an
>> index in multiple places.

>
> I often have arrays that naturally start at 1 and do the same
> as you: waste the array entry at index 0. Sometimes I do not
> have the choice.

And, of course, simply rooting your heap at heap[1] and wasting heap[0]
works just fine here as well.
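
A sketch of that variant, again assuming a max-heap and using the hypothetical name `sift_down1`; the caller is assumed to allocate n + 1 elements so that `heap[1]` through `heap[n]` hold the data:

```c
#include <assert.h>
#include <stddef.h>

/* Restores the max-heap property below node k in heap[1..n].
 * The array must have n + 1 elements; heap[0] is allocated but unused,
 * so the children of k are the classic 2k and 2k+1. */
void sift_down1(int *heap, size_t n, size_t k)
{
    assert(k >= 1 && k <= n);
    for (;;) {
        size_t left = 2 * k;
        size_t right = 2 * k + 1;
        size_t largest = k;

        if (left <= n && heap[left] > heap[largest])
            largest = left;
        if (right <= n && heap[right] > heap[largest])
            largest = right;
        if (largest == k)
            return;             /* node k already dominates its children */

        int tmp = heap[k];
        heap[k] = heap[largest];
        heap[largest] = tmp;
        k = largest;            /* continue from the child we swapped with */
    }
}
```

The index arithmetic matches the textbook 1-based scheme, and every pointer involved stays inside the allocated array.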

>> But you'll notice that this approach to it
>> doesn't depend on tricky code having undefined behavior do the right
>> thing.

>
> That is why I asked.

Joe Pfeiffer, Jul 10, 2013