Do vector iterators remain valid after resize() if base pointer ofelements remains unchanged?

zr · Mar 24, 2009

I hope the following example illustrates the question:

#include <vector>

int _tmain(int argc, _TCHAR* argv[])
{
std::vector<int> sorted_image_id;
//..
sorted_image_id:

ointer base = &sorted_image_id.front();
sorted_image_id.resize(sorted_image_id().size()*4);

if (&sorted_image_id.front() == base)
{
//Is it safe to assume all iterators are still valid?
} else
{
//Need to reassign iterators
}
//...

return 0;
}

zr · Mar 24, 2009

I hope the following example illustrates the question:

#include <vector>

int _tmain(int argc, _TCHAR* argv[])
{
std::vector<int> sorted_image_id;
//..
sorted_image_id:ointer base = &sorted_image_id.front();
sorted_image_id.resize(sorted_image_id().size()*4);

if (&sorted_image_id.front() == base)
{
//Is it safe to assume all iterators are still valid?
} else
{
//Need to reassign iterators
}
//...

return 0;

}

Here is a corrected version that passes compilation:

int _tmain(int argc, _TCHAR* argv[])
{
float resize_factor = 1.3;
std::vector<int> sorted_image_id;
//..
int* base = &sorted_image_id.front();
sorted_image_id.resize(sorted_image_id.size()*resize_factor);

// optimization: reuse iterators after resize
if (&sorted_image_id.front() == base)
{
//Is it safe to assume all iterators are still valid?
} else
{
//Need to reassign iterators
}
//...

return 0;
}

Victor Bazarov · Mar 24, 2009

zr said:
I hope the following example illustrates the question:

#include <vector>

int _tmain(int argc, _TCHAR* argv[])
{
std::vector<int> sorted_image_id;
//..
sorted_image_id:ointer base = &sorted_image_id.front();
sorted_image_id.resize(sorted_image_id().size()*4);

if (&sorted_image_id.front() == base)
{
//Is it safe to assume all iterators are still valid?
} else
{
//Need to reassign iterators
}
//...

return 0;

}

Click to expand...

Here is a corrected version that passes compilation:

"Passes compilation"? Not on my compiler. '_TCHAR' is undefined.

int _tmain(int argc, _TCHAR* argv[])
{
float resize_factor = 1.3;
std::vector<int> sorted_image_id;
//..
int* base = &sorted_image_id.front();
sorted_image_id.resize(sorted_image_id.size()*resize_factor);

// optimization: reuse iterators after resize
if (&sorted_image_id.front() == base)
{
//Is it safe to assume all iterators are still valid?
} else
{
//Need to reassign iterators
}
//...

return 0;
}

'std::vector' is very sensitive WRT its iterators. If reallocation
occurs, all iterators are invalidated. So, if your pointer comparison
is designed to see whether the reallocation has occurred (which it would
seem to do), then you're probably OK. I don't know much about the
systems on which loading an invalid pointer into a register can trip the
hardware signal, so I can't vouch for the cases when 'base' is compared
with the address of the first element after becoming invalid on such
systems. FAIK, it could be a trap. The only value guaranteed to work
(when compared with any other pointer) is a null pointer value.

Of course, hardware interrupt upon loading of a pointer value that just
became "invalid" is highly unlikely, so you're *most likely* fine.

V

Balog Pal · Mar 24, 2009

int* base = &sorted_image_id.front();
sorted_image_id.resize(sorted_image_id.size()*resize_factor);

// optimization: reuse iterators after resize
if (&sorted_image_id.front() == base)

if reallocation happened base has an invalid pointer, so using it this way
(lvalue to rvalue conversion) is undefined behavior.

You may try to play with capacity() before resize()

Fred Zwarts · Mar 24, 2009

I hope the following example illustrates the question:

#include <vector>

int _tmain(int argc, _TCHAR* argv[])
{
float resize_factor = 1.3;
std::vector<int> sorted_image_id;
//..
int* base = &sorted_image_id.front();
sorted_image_id.resize(sorted_image_id.size()*resize_factor);

// optimization: reuse iterators after resize
if (&sorted_image_id.front() == base)
{
//Is it safe to assume all iterators are still valid?
} else
{
//Need to reassign iterators
}
//...

return 0;
}

I am not sure.
Could sorted_image_id.end() depend on the size of the vector even if no reallocation occurs?
If so, an iterator with this value before the resize may have a value after the
resize with at least a questionable value.

rep_movsd · Mar 24, 2009

A vector always stores elements in sequence, provides o(1) random
access to its elements and amortized o(1) for back insertions and
deletions.

I am not sure if this is in the standard, but it would be stupid to
implement it any other way other than as an array that resizes.
There are reams of code that depend on this...

So if the &*v.begin() is unchanged there should be no sane reason why
iterators will be invalid since they are effectively pointers...

The OP is clearly not programming an esoteric architecture that faults
on invalid pointer assignment ( as opposed to dereference ) as clearly
evidenced by the use of "_TCHAR*" - Its garden variety MS Visual C++
on Windows on x86. So I think musing about that is not fruitful.

Also since vector iterators are effectively pointers, there is nothing
special about end() , its just a pointer to data one element past the
last one.

The most sensible thing to do is to not use iterators but indexes...

There is really no significant performance difference between
accessing an element by an iterator or simply using the [] operator
with an index. [] gets inlined, its the compilers job....

Bo Persson · Mar 24, 2009

rep_movsd said:
A vector always stores elements in sequence, provides o(1) random
access to its elements and amortized o(1) for back insertions and
deletions.

I am not sure if this is in the standard, but it would be stupid to
implement it any other way other than as an array that resizes.
There are reams of code that depend on this...

So if the &*v.begin() is unchanged there should be no sane reason
why iterators will be invalid since they are effectively pointers...

The OP is clearly not programming an esoteric architecture that
faults on invalid pointer assignment ( as opposed to dereference )
as clearly evidenced by the use of "_TCHAR*" - Its garden variety
MS Visual C++ on Windows on x86. So I think musing about that is
not fruitful.

There is nothing wrong in pointing out the undefined behaviour, should
the OP release his "working" code for others to reuse. Remenber that
the x86 does have checking in segmented mode, we are just not using
that most of the time.

Bo Persson

zr · Mar 25, 2009

There is nothing wrong in pointing out the undefined behaviour, should
the OP release his "working" code for others to reuse. Remenber that
the x86 does have checking in segmented mode, we are just not using
that most of the time.

Bo Persson

Now I'm just curious
On x86, two pointers are compared using "compare integer"
instructions, so if one the operands is actually a dangling pointer,
no exception would be raised. Is there any platform where this is not
the case?

Fred Zwarts · Mar 25, 2009

rep_movsd said:
A vector always stores elements in sequence, provides o(1) random
access to its elements and amortized o(1) for back insertions and
deletions.

I am not sure if this is in the standard, but it would be stupid to
implement it any other way other than as an array that resizes.
There are reams of code that depend on this...

So if the &*v.begin() is unchanged there should be no sane reason why
iterators will be invalid since they are effectively pointers...

The OP is clearly not programming an esoteric architecture that faults
on invalid pointer assignment ( as opposed to dereference ) as clearly
evidenced by the use of "_TCHAR*" - Its garden variety MS Visual C++
on Windows on x86. So I think musing about that is not fruitful.

Also since vector iterators are effectively pointers, there is nothing
special about end() , its just a pointer to data one element past the
last one.

But if end() was saved in an iterator, after the resize this iterator no longer equals end().
If that iterator was used as a shortcut for end(), this iterator can no longer be used for this purpose after the resize.
The iterator may still point to something, but no longer to the end of the vector.
It depends on the meaning of "valid" if that iterator is still valid.

James Kanze · Mar 25, 2009

[...]

Now I'm just curious On x86, two pointers are compared using
"compare integer" instructions, so if one the operands is
actually a dangling pointer, no exception would be raised. Is
there any platform where this is not the case?

On an x86, pointers are 48 bits, and the compare integer
instruction only works on 32 bits (at least on the 32 bit
x86's). So something more is needed. The usual solutions are
either to ignore the segment (thus seriously restricting the
programmers), or to load the addresses with something like the
les instruction, then move the segment into a general register
in order to compare it. If the address is invalid, I think that
the les instruction will trap.

The reason this bit of undefined behavior was introduced into
the standard was precisely to support the x86.

James Kanze · Mar 25, 2009

messagenews:1e21430a-6060-4167-a61a-df2248f8a83b@o36g2000yqh.googlegroups..com...

I hope the following example illustrates the question:
#include <vector>
int _tmain(int argc, _TCHAR* argv[])
{
float resize_factor = 1.3;
std::vector<int> sorted_image_id;
//..
int* base = &sorted_image_id.front();
sorted_image_id.resize(sorted_image_id.size()*resize_factor);
// optimization: reuse iterators after resize
if (&sorted_image_id.front() == base)
{
//Is it safe to assume all iterators are still valid?
} else
{
//Need to reassign iterators
}
//...
return 0;
}

Click to expand...

I am not sure.
Could sorted_image_id.end() depend on the size of the vector
even if no reallocation occurs?

Or course it could.

If so, an iterator with this value before the resize may have
a value after the resize with at least a questionable value.

I think it's evident that he can only be talking about valid,
dereferenceable iterators. His code should work, but IMHO, it
doesn't look very idiomatic, and would raise questions in a code
review. I think something like:

bool iteratorsValid = array.capacity() >= newSize ;
array.resize( newSize ) ;
if ( iteratorsValid ) // ...

would be more typical and easier to understand.

zr · Mar 25, 2009

[...]

Now I'm just curious On x86, two pointers are compared using
"compare integer" instructions, so if one the operands is
actually a dangling pointer, no exception would be raised. Is
there any platform where this is not the case?

Click to expand...

On an x86, pointers are 48 bits, and the compare integer
instruction only works on 32 bits (at least on the 32 bit
x86's). So something more is needed. The usual solutions are
either to ignore the segment (thus seriously restricting the
programmers), or to load the addresses with something like the
les instruction, then move the segment into a general register
in order to compare it. If the address is invalid, I think that
the les instruction will trap.

The reason this bit of undefined behavior was introduced into
the standard was precisely to support the x86.

--
James Kanze (GABI Software) email:[email protected]
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

But sizeof(void*) will return 4 bytes, which is 32 bits, so how
exactly does the language implementor support far pointers?

rep_movsd · Mar 26, 2009

On an x86, pointers are 48 bits, and the compare integer
instruction only works on 32 bits (at least on the 32 bit
x86's). So something more is needed. The usual solutions are
either to ignore the segment (thus seriously restricting the
programmers), or to load the addresses with something like the
les instruction, then move the segment into a general register
in order to compare it. If the address is invalid, I think that
the les instruction will trap.

The reason this bit of undefined behavior was introduced into
the standard was precisely to support the x86.

Not true.....
on 32 bit compilers sizeof(void*) == sizeof(int)
I've never heard of 48 bit pointers.
The memory model for 32 bit protected mode from the application and
compilers point of view is flat.

On 64 bit compilers this int/void* guarantee is broken...

Loading an invalid address into a register will not trap!

James Kanze · Mar 26, 2009

But sizeof(void*) will return 4 bytes, which is 32 bits, so
how exactly does the language implementor support far
pointers?

That depends on the implementation. I've used C++ compilers for
the 80x86 where sizeof(void*) was 6. (And sizeof(long) was 4,
and there was no long long, so there was no integral type large
enough to handle a pointer.) And of course, sizeof(void*) could
be 4 on the earlier 16 bit processors.

James Kanze · Mar 26, 2009

Not true.....

Obviously, you aren't familiar with the Intel architecture.

on 32 bit compilers sizeof(void*) == sizeof(int)
I've never heard of 48 bit pointers.

And I've used them. On an Intel 80386.

The memory model for 32 bit protected mode from the
application and compilers point of view is flat.

Only if the OS artificially chooses to restrict the users to
that.

On 64 bit compilers this int/void* guarantee is broken...

Loading an invalid address into a register will not trap!

It does on an Intel 80x86. I've actually seen it happen.

zr · Mar 26, 2009

That depends on the implementation. I've used C++ compilers for
the 80x86 where sizeof(void*) was 6. (And sizeof(long) was 4,
and there was no long long, so there was no integral type large
enough to handle a pointer.) And of course, sizeof(void*) could
be 4 on the earlier 16 bit processors.

--
James Kanze (GABI Software) email:[email protected]
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Very interesting, thanks.

rep_movsd · Mar 27, 2009

Obviously, you aren't familiar with the Intel architecture.

And I've used them. On an Intel 80386.

All Right, I heard of them now...

Only if the OS artificially chooses to restrict the users to
that.

Well, considering that a 32 bit CPU can never address more than that
anyway ( forget the 36 bit PAE shenanigans ), I don't get how 48 bit
addresses would be beneficial. Could you elaborate ( since you
actually used them ) ?

It does on an Intel 80x86. I've actually seen it happen.

Which instruction(s) specifically?

Balog Pal · Mar 28, 2009

rep_movsd said:
All Right, I heard of them now...

Guess Intel's architecture manuals and data sheets are available for DL to
anyone actually interested to learn how stuff works.

Well, considering that a 32 bit CPU can never address more than that

anyway ( forget the 36 bit PAE shenanigans ),

Says who? Normally the X-bit means the width of the processor's data bus.
Some other times the general-purpose registers. It has absolutely nothing to
do with the address bus width (-> accessible phisical memory) or the
internal address space.

I don't get how 48 bit addresses would be beneficial. Could you elaborate
( since you

actually used them ) ?

For the discussed architecture the 48 bit refers to the pointer not the
address. The pointer has a 16 bit selector and a 32 bit offset. The
selector has a few control bits, the rest is used to select a *descriptor*
(from the descriptor table), that defines a memory segment in the virtual
address space with its base, length, access rights, etc.

Which instruction(s) specifically?

anything that loads the segment

Windows 3.1 used the segmented 16 bit model, with 32 bit pointers having
selector+16 bit offset. To pass a pointer value to a function sometimes it
used

void foo(void * vp);

void * p; // not inited
foo(p);
--> assy as

LES bx, [_p]
push bx
push es
call _foo

that even made sense if opt for space was asked for, it takes 1 byte less of
opcode...
Certainly if the loaded value was not a valid selector, it causes a trap,
and the famous 'unrecoverable application error'.

James Kanze · Mar 28, 2009

All Right, I heard of them now...

Well, considering that a 32 bit CPU can never address more than that
anyway ( forget the 36 bit PAE shenanigans ),

Why? Modern Intel processors do have a 36 bit physical memory
bus. But that's not the point; different segments can be mapped
in different manners.

I don't get how 48 bit addresses would be beneficial. Could
you elaborate ( since you actually used them ) ?

In one case, there was no disk, so no virtual memory;
segmentation was the protection mechanism. In another, the "OS"
was MS-DOS, which at the time only ran in native, 16 bit mode;
with no support for virtual memory, again, segments were used to
go beyond the 1 MB supported by the OS.

Which instruction(s) specifically?

Any instruction which loads a segment register: LDS, LES, LFS,
LGS, and possibly some others. This goes back 25 or 30 years
ago, so I've forgotten the details. But the argument given in
the C committee for having just reading an invalide pointer be
undefined behavior was based on the Intel (and other machines at
the time which had "address" registers, but I've no experience
with those).

Vector is deleted after resize()	9	May 24, 2007
Problem with static downcast of base type to derived type	12	Jul 2, 2008
Problem with sorting a vector	9	Oct 31, 2006
vector/array advice please	4	Sep 23, 2003
Casting base * to derived *	6	Aug 30, 2006
remove_if and member function pointers	3	Nov 16, 2006
optimizing function	0	Feb 26, 2009
several newbie questions about stl	10	Feb 8, 2009

Do vector iterators remain valid after resize() if base pointer ofelements remains unchanged?

zr

zr

Victor Bazarov

Balog Pal

Fred Zwarts

rep_movsd

Bo Persson

zr

Fred Zwarts

James Kanze

James Kanze

zr

rep_movsd

James Kanze

James Kanze

zr

rep_movsd

Balog Pal

James Kanze

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads