pointer past end of buffer

John Goche

A lot of C++ code allocates a buffer and initializes
start and end pointers as follows:

+-------------------------------+
|                               |
+-------------------------------+
^                                ^
|                                |
pStart                           pEnd

setting pEnd = pStart + bufLen

But what if the buffer is allocated at the very end of memory
and just fits? Then pEnd == MEM_MAX + 1 == 0, and so
library users could tamper with code by creating a buffer
of suitable size. Can this happen in practice?

JG
 
Alf P. Steinbach

* John Goche:
A lot of C++ code allocates a buffer and initializes
start and end pointers as follows:

+-------------------------------+
|                               |
+-------------------------------+
^                                ^
|                                |
pStart                           pEnd

setting pEnd = pStart + bufLen

But what if the buffer is allocated at the very end of memory
and just fits. Then pEnd == MEM_MAX + 1 == 0 and so
library users could tamper with code by creating a buffer
of suitable size. Can this happen in practice?

The wrapping can not be a /problem/ with a conforming compiler.

And in practice such wrapping will not (be allowed to) happen.

But theoretically a compiler could allow that and make you unaware that
it happens unless you do low-level machine-specific things to inspect
the bit patterns of pointers.
 
Phlip

John said:
A lot of C++ code allocates a buffer and initializes
start and end pointers as follows:

+-------------------------------+
|                               |
+-------------------------------+
^                                ^
|                                |
pStart                           pEnd

setting pEnd = pStart + bufLen

But what if the buffer is allocated at the very end of memory
and just fits. Then pEnd == MEM_MAX + 1 == 0 and so
library users could tamper with code by creating a buffer
of suitable size. Can this happen in practice?

The C++ Standard reputedly declares that pointing and indexing
one-off-the-end of an array is well-defined. (Copying out the value of that
bogus element is undefined, except if the element is a char, where it's
simply garbage.)

That means a C++ implementation may not, for example, place any array right
at the end of memory, such that its one-off-the-end location occupies an
overflowed pointer value, or a storage location protected by hardware.

This rule permits all the idioms you have noted, including all of STL's
"asymmetric extents". The "start" of anything must be a valid element, and
the "end" must use -- to get to a valid element.

After you become familiar with this effect, it becomes vaguely elegant. But
also extremely useful!
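
As a minimal sketch of the half-open range idiom described above, assuming a plain int array: the helper names sum_range and last_element are made up for illustration and do not come from any library. The "end" pointer is only compared against, and is stepped back with -- before anything is read through it.

```cpp
#include <cassert>
#include <cstddef>

// Sum the half-open range [first, last).
// `last` is only compared against, never dereferenced.
int sum_range(const int* first, const int* last)
{
    int total = 0;
    for (const int* p = first; p != last; ++p)
        total += *p;
    return total;
}

// Return the final element of a non-empty range by stepping back from `last`.
int last_element(const int* first, const int* last)
{
    assert(first != last);  // an empty range has no last element
    --last;                 // now points at a valid element
    return *last;
}
```

With int a[4] = {1, 2, 3, 4}, the call sum_range(a, a + 4) gives 10 and last_element(a, a + 4) gives 4; a + 4 itself is never dereferenced.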
 
John Goche

Alf said:
* John Goche:

The wrapping can not be a /problem/ with a conforming compiler.

Is there something in the C++ standard that states this?

Thanks,

JG
 
Frederick Gotham

JG Posted:
A lot of C++ code allocates a buffer and initializes
start and end pointers as follows:

+-------------------------------+
|                               |
+-------------------------------+
^                                ^
|                                |
pStart                           pEnd

setting pEnd = pStart + bufLen


Indeed.

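#include <stdlib.h>   /* malloc, size_t */

size_t const buf_size = 512;

char unsigned *const p = (char unsigned*)malloc(buf_size);
/* One past the last byte: may be formed and compared, never dereferenced. */
char unsigned const *const pover = p + buf_size;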

But what if the buffer is allocated at the very end of memory
and just fits. Then pEnd == MEM_MAX + 1 == 0


That's a possible way of doing it, yes.

and so library users could tamper with code by creating a buffer of
suitable size. Can this happen in practice?


I don't understand what you're saying. . . how could they tamper with code?

Phlip:
The C++ Standard reputedly declares that pointing and indexing
one-off-the-end of an array is well-defined. (Copying out the value of
that bogus element is undefined, except if the element is a char, where
it's simply garbage.)


That's incorrect; the behaviour of the following is undefined:

int main()
{
    char buf[12];

    buf[12];   // same as *(buf + 12): dereferences the one-past-the-end pointer
}

That means a C++ implementation may not, for example, place any array
right at the end of memory, such that its one-off-the-end location
occupies an overflowed pointer value, or a storage location protected by
hardware.


The C++ Standard imposes no such restriction.

The whole "pointer to one past last" concept has been discussed in depth
many times. Things to note are:

(1) The null pointer value need not be represented by all bits zero.
(2) Pointer arithmetic need not be calculated internally in the same
fashion that unsigned arithmetic is (i.e. wrap-around overflow).
(3) The "pointer to one past last" may compare equal to null.

This leaves the door wide open for implementors, just so long as the code
behaves as it should.
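
A rough sketch of code that "behaves as it should" under those three notes, assuming a malloc'd byte buffer: the function name fill_and_count is invented for this example. It uses the one-past-the-end pointer only as a loop sentinel and for pointer subtraction, so it makes no assumption about null's bit pattern or about how the implementation does pointer arithmetic.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Allocate n bytes, set each one to `value`, and return how many were written.
// Returns 0 if the allocation fails.
std::size_t fill_and_count(std::size_t n, unsigned char value)
{
    unsigned char* const p = static_cast<unsigned char*>(std::malloc(n));
    if (!p)
        return 0;

    unsigned char* const pover = p + n;   // one past the end: fine to form and compare

    std::size_t written = 0;
    for (unsigned char* q = p; q != pover; ++q)  // compares pointer values only
    {
        *q = value;
        ++written;
    }

    // Subtracting two pointers into (or one past) the same buffer is well-defined.
    assert(static_cast<std::size_t>(pover - p) == n);

    std::free(p);
    return written;
}
```

The loop never tests pover against 0 or against a numeric address, which is the part the standard leaves to the implementation.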
 
John Goche

Jim said:

So I understand that for a buffer of length buflen > 0 we can
assume that p < q so long as q is set to a value q <= p + buflen.
In the case where we set q > p + buflen then it is not guaranteed
that p < q holds due to possible pointer overflow. Is this correct?

Thanks,

JG
 
Ron Natalie

Phlip said:
That means a C++ implementation may not, for example, place any array right
at the end of memory, such that its one-off-the-end location occupies an
overflowed pointer value, or a storage location protected by hardware.

It can be a protected location if the protection is limited to accessing
the memory at that location. If you get a trap for just having that
address in a pointer (very uncommon), well, then it's not allowed.
 
Andrew Koenig

So I understand that for a buffer of length buflen > 0 we can

buflen >= 0 (which could happen if the buffer is dynamically allocated)

assume that p < q so long as q is set to a value q <= p + buflen.

we can assume that p <= q (because buflen might be 0) so long as
p <= q <= p + buflen (i.e. you can't have q < p and still expect p < q :) )

In the case where we set q > p + buflen then it is not guaranteed
that p < q holds due to possible pointer overflow. Is this correct?

Correct. In that case you're not even assured that you can evaluate p<q.
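
One practical consequence, sketched here under the assumption that p and end both point into (or one past) the same buffer with p <= end: a length check can be written so that p + n is never formed when n might be too large. The function name fits is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// True if n more bytes fit in the half-open range [p, end).
// Precondition (assumed, not checked): p and end refer to the same buffer
// and p <= end.
bool fits(const char* p, const char* end, std::size_t n)
{
    // end - p is always valid under the precondition; p + n might not be,
    // so compare lengths instead of comparing p + n against end.
    return n <= static_cast<std::size_t>(end - p);
}
```

For char buf[16], fits(buf, buf + 16, 16) is true and fits(buf, buf + 16, 17) is false, and the second call never computes buf + 17.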
 
Andrew Koenig

The C++ Standard reputedly declares that pointing and indexing
one-off-the-end of an array is well-defined. (Copying out the value of
that bogus element is undefined, except if the element is a char, where
it's simply garbage.)

Unsigned char, I think.

This rule permits all the idioms you have noted, including all of STL's
"asymmetric extents". The "start" of anything must be a valid element, and
the "end" must use -- to get to a valid element.

The "start" of anything must be a valid element as long as it's not equal to
the "end", which is how you would indicate an empty sequence.

In other words, if c is an empty container, c.begin() == c.end() will be
true, but you are not assured of being able to evaluate *c.begin().
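
A small sketch of that point, assuming a std::vector<int>; the helper first_or is made up for illustration. It compares begin() with end() before dereferencing, so an empty container is never read from.

```cpp
#include <cassert>
#include <vector>

// Return the first element of c, or `fallback` when c is empty.
int first_or(const std::vector<int>& c, int fallback)
{
    if (c.begin() == c.end())   // empty sequence: begin() must not be dereferenced
        return fallback;
    return *c.begin();
}
```

For an empty vector the function returns the fallback; for std::vector<int> v(3, 7) it returns 7.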
 
Old Wolf

Andrew said:
Correct. In that case you're not even assured that you can evaluate p<q.

5.9#2 says that p<q is unspecified in this case (i.e. you can
evaluate it but it could evaluate to either true or false).

In the C language, the behaviour is undefined.
 
Old Wolf

Old said:
5.9#2 says that p<q is unspecified in this case (ie. you can
evaluate it but it could evaluate to either true or false).

Of course it is also undefined if q no longer points to a valid
object, which I believe we are currently debating in c.l.c ;)
 
Ron Natalie

Andrew said:
Unsigned char, I think.

It's still undefined.

You're misapplying the rule that says you can use a char pointer to
access all the ALLOCATED bytes comprising any object. Once you
get outside the bounds of an object, you're in undefined land.
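
To illustrate the rule being referred to, here is a minimal sketch that reads only the bytes belonging to an object through an unsigned char pointer. The function name zero_bytes is invented for the example, and it assumes unsigned int has no padding bits, which holds on common platforms but is not guaranteed.

```cpp
#include <cassert>
#include <cstddef>

// Count the zero bytes in the object representation of `value`.
// Only the sizeof(value) bytes that belong to the object are read.
std::size_t zero_bytes(const unsigned int& value)
{
    const unsigned char* first = reinterpret_cast<const unsigned char*>(&value);
    const unsigned char* last = first + sizeof value;  // one past the object's bytes

    std::size_t count = 0;
    for (const unsigned char* p = first; p != last; ++p)
        if (*p == 0)
            ++count;
    return count;   // `last` itself was never dereferenced
}
```

Reading *last here would step outside the object, which is the undefined case described above.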
 
