Do vector iterators remain valid after resize() if base pointer ofelements remains unchanged?

Z

zr

I hope the following example illustrates the question:

#include <vector>

int _tmain(int argc, _TCHAR* argv[])
{
std::vector<int> sorted_image_id;
//..
sorted_image_id::pointer base = &sorted_image_id.front();
sorted_image_id.resize(sorted_image_id().size()*4);

if (&sorted_image_id.front() == base)
{
//Is it safe to assume all iterators are still valid?
} else
{
//Need to reassign iterators
}
//...

return 0;
}
 
Z

zr

I hope the following example illustrates the question:

#include <vector>

int _tmain(int argc, _TCHAR* argv[])
{
        std::vector<int> sorted_image_id;
        //..
        sorted_image_id::pointer base = &sorted_image_id.front();
        sorted_image_id.resize(sorted_image_id().size()*4);

        if (&sorted_image_id.front() == base)
        {
                //Is it safe to assume all iterators are still valid?
        } else
        {
                //Need to reassign iterators
        }
        //...

        return 0;

}

Here is a corrected version that passes compilation:

int _tmain(int argc, _TCHAR* argv[])
{
float resize_factor = 1.3;
std::vector<int> sorted_image_id;
//..
int* base = &sorted_image_id.front();
sorted_image_id.resize(sorted_image_id.size()*resize_factor);

// optimization: reuse iterators after resize
if (&sorted_image_id.front() == base)
{
//Is it safe to assume all iterators are still valid?
} else
{
//Need to reassign iterators
}
//...

return 0;
}
 
V

Victor Bazarov

zr said:
I hope the following example illustrates the question:

#include <vector>

int _tmain(int argc, _TCHAR* argv[])
{
std::vector<int> sorted_image_id;
//..
sorted_image_id::pointer base = &sorted_image_id.front();
sorted_image_id.resize(sorted_image_id().size()*4);

if (&sorted_image_id.front() == base)
{
//Is it safe to assume all iterators are still valid?
} else
{
//Need to reassign iterators
}
//...

return 0;

}

Here is a corrected version that passes compilation:

"Passes compilation"? Not on my compiler. '_TCHAR' is undefined.
int _tmain(int argc, _TCHAR* argv[])
{
float resize_factor = 1.3;
std::vector<int> sorted_image_id;
//..
int* base = &sorted_image_id.front();
sorted_image_id.resize(sorted_image_id.size()*resize_factor);

// optimization: reuse iterators after resize
if (&sorted_image_id.front() == base)
{
//Is it safe to assume all iterators are still valid?
} else
{
//Need to reassign iterators
}
//...

return 0;
}

'std::vector' is very sensitive WRT its iterators. If reallocation
occurs, all iterators are invalidated. So, if your pointer comparison
is designed to see whether the reallocation has occurred (which it would
seem to do), then you're probably OK. I don't know much about the
systems on which loading an invalid pointer into a register can trip the
hardware signal, so I can't vouch for the cases when 'base' is compared
with the address of the first element after becoming invalid on such
systems. FAIK, it could be a trap. The only value guaranteed to work
(when compared with any other pointer) is a null pointer value.

Of course, hardware interrupt upon loading of a pointer value that just
became "invalid" is highly unlikely, so you're *most likely* fine.

V
 
B

Balog Pal

int* base = &sorted_image_id.front();
sorted_image_id.resize(sorted_image_id.size()*resize_factor);

// optimization: reuse iterators after resize
if (&sorted_image_id.front() == base)

if reallocation happened base has an invalid pointer, so using it this way
(lvalue to rvalue conversion) is undefined behavior.

You may try to play with capacity() before resize()
 
F

Fred Zwarts

I hope the following example illustrates the question:

#include <vector>


int _tmain(int argc, _TCHAR* argv[])
{
float resize_factor = 1.3;
std::vector<int> sorted_image_id;
//..
int* base = &sorted_image_id.front();
sorted_image_id.resize(sorted_image_id.size()*resize_factor);

// optimization: reuse iterators after resize
if (&sorted_image_id.front() == base)
{
//Is it safe to assume all iterators are still valid?
} else
{
//Need to reassign iterators
}
//...

return 0;
}

I am not sure.
Could sorted_image_id.end() depend on the size of the vector even if no reallocation occurs?
If so, an iterator with this value before the resize may have a value after the
resize with at least a questionable value.
 
R

rep_movsd

A vector always stores elements in sequence, provides o(1) random
access to its elements and amortized o(1) for back insertions and
deletions.

I am not sure if this is in the standard, but it would be stupid to
implement it any other way other than as an array that resizes.
There are reams of code that depend on this...

So if the &*v.begin() is unchanged there should be no sane reason why
iterators will be invalid since they are effectively pointers...

The OP is clearly not programming an esoteric architecture that faults
on invalid pointer assignment ( as opposed to dereference ) as clearly
evidenced by the use of "_TCHAR*" - Its garden variety MS Visual C++
on Windows on x86. So I think musing about that is not fruitful.

Also since vector iterators are effectively pointers, there is nothing
special about end() , its just a pointer to data one element past the
last one.

The most sensible thing to do is to not use iterators but indexes...

There is really no significant performance difference between
accessing an element by an iterator or simply using the [] operator
with an index. [] gets inlined, its the compilers job....
 
B

Bo Persson

rep_movsd said:
A vector always stores elements in sequence, provides o(1) random
access to its elements and amortized o(1) for back insertions and
deletions.

I am not sure if this is in the standard, but it would be stupid to
implement it any other way other than as an array that resizes.
There are reams of code that depend on this...

So if the &*v.begin() is unchanged there should be no sane reason
why iterators will be invalid since they are effectively pointers...

The OP is clearly not programming an esoteric architecture that
faults on invalid pointer assignment ( as opposed to dereference )
as clearly evidenced by the use of "_TCHAR*" - Its garden variety
MS Visual C++ on Windows on x86. So I think musing about that is
not fruitful.

There is nothing wrong in pointing out the undefined behaviour, should
the OP release his "working" code for others to reuse. Remenber that
the x86 does have checking in segmented mode, we are just not using
that most of the time.


Bo Persson
 
Z

zr

There is nothing wrong in pointing out the undefined behaviour, should
the OP release his "working" code for others to reuse. Remenber that
the x86 does have checking in segmented mode, we are just not using
that most of the time.

Bo Persson

Now I'm just curious
On x86, two pointers are compared using "compare integer"
instructions, so if one the operands is actually a dangling pointer,
no exception would be raised. Is there any platform where this is not
the case?
 
F

Fred Zwarts

rep_movsd said:
A vector always stores elements in sequence, provides o(1) random
access to its elements and amortized o(1) for back insertions and
deletions.

I am not sure if this is in the standard, but it would be stupid to
implement it any other way other than as an array that resizes.
There are reams of code that depend on this...

So if the &*v.begin() is unchanged there should be no sane reason why
iterators will be invalid since they are effectively pointers...

The OP is clearly not programming an esoteric architecture that faults
on invalid pointer assignment ( as opposed to dereference ) as clearly
evidenced by the use of "_TCHAR*" - Its garden variety MS Visual C++
on Windows on x86. So I think musing about that is not fruitful.

Also since vector iterators are effectively pointers, there is nothing
special about end() , its just a pointer to data one element past the
last one.

But if end() was saved in an iterator, after the resize this iterator no longer equals end().
If that iterator was used as a shortcut for end(), this iterator can no longer be used for this purpose after the resize.
The iterator may still point to something, but no longer to the end of the vector.
It depends on the meaning of "valid" if that iterator is still valid.
 
J

James Kanze

[...]
Now I'm just curious On x86, two pointers are compared using
"compare integer" instructions, so if one the operands is
actually a dangling pointer, no exception would be raised. Is
there any platform where this is not the case?

On an x86, pointers are 48 bits, and the compare integer
instruction only works on 32 bits (at least on the 32 bit
x86's). So something more is needed. The usual solutions are
either to ignore the segment (thus seriously restricting the
programmers), or to load the addresses with something like the
les instruction, then move the segment into a general register
in order to compare it. If the address is invalid, I think that
the les instruction will trap.

The reason this bit of undefined behavior was introduced into
the standard was precisely to support the x86.
 
J

James Kanze

messagenews:1e21430a-6060-4167-a61a-df2248f8a83b@o36g2000yqh.googlegroups..com...
I hope the following example illustrates the question:
#include <vector>
int _tmain(int argc, _TCHAR* argv[])
{
float resize_factor = 1.3;
std::vector<int> sorted_image_id;
//..
int* base = &sorted_image_id.front();
sorted_image_id.resize(sorted_image_id.size()*resize_factor);
// optimization: reuse iterators after resize
if (&sorted_image_id.front() == base)
{
//Is it safe to assume all iterators are still valid?
} else
{
//Need to reassign iterators
}
//...
return 0;
}
I am not sure.
Could sorted_image_id.end() depend on the size of the vector
even if no reallocation occurs?

Or course it could.
If so, an iterator with this value before the resize may have
a value after the resize with at least a questionable value.

I think it's evident that he can only be talking about valid,
dereferenceable iterators. His code should work, but IMHO, it
doesn't look very idiomatic, and would raise questions in a code
review. I think something like:

bool iteratorsValid = array.capacity() >= newSize ;
array.resize( newSize ) ;
if ( iteratorsValid ) // ...

would be more typical and easier to understand.
 
Z

zr

    [...]
Now I'm just curious On x86, two pointers are compared using
"compare integer" instructions, so if one the operands is
actually a dangling pointer, no exception would be raised. Is
there any platform where this is not the case?

On an x86, pointers are 48 bits, and the compare integer
instruction only works on 32 bits (at least on the 32 bit
x86's).  So something more is needed.  The usual solutions are
either to ignore the segment (thus seriously restricting the
programmers), or to load the addresses with something like the
les instruction, then move the segment into a general register
in order to compare it.  If the address is invalid, I think that
the les instruction will trap.

The reason this bit of undefined behavior was introduced into
the standard was precisely to support the x86.

--
James Kanze (GABI Software)             email:[email protected]
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

But sizeof(void*) will return 4 bytes, which is 32 bits, so how
exactly does the language implementor support far pointers?
 
R

rep_movsd

On an x86, pointers are 48 bits, and the compare integer
instruction only works on 32 bits (at least on the 32 bit
x86's).  So something more is needed.  The usual solutions are
either to ignore the segment (thus seriously restricting the
programmers), or to load the addresses with something like the
les instruction, then move the segment into a general register
in order to compare it.  If the address is invalid, I think that
the les instruction will trap.

The reason this bit of undefined behavior was introduced into
the standard was precisely to support the x86.

Not true.....
on 32 bit compilers sizeof(void*) == sizeof(int)
I've never heard of 48 bit pointers.
The memory model for 32 bit protected mode from the application and
compilers point of view is flat.

On 64 bit compilers this int/void* guarantee is broken...

Loading an invalid address into a register will not trap!
 
J

James Kanze

But sizeof(void*) will return 4 bytes, which is 32 bits, so
how exactly does the language implementor support far
pointers?

That depends on the implementation. I've used C++ compilers for
the 80x86 where sizeof(void*) was 6. (And sizeof(long) was 4,
and there was no long long, so there was no integral type large
enough to handle a pointer.) And of course, sizeof(void*) could
be 4 on the earlier 16 bit processors.
 
J

James Kanze

Not true.....

Obviously, you aren't familiar with the Intel architecture.
on 32 bit compilers sizeof(void*) == sizeof(int)
I've never heard of 48 bit pointers.

And I've used them. On an Intel 80386.
The memory model for 32 bit protected mode from the
application and compilers point of view is flat.

Only if the OS artificially chooses to restrict the users to
that.
On 64 bit compilers this int/void* guarantee is broken...
Loading an invalid address into a register will not trap!

It does on an Intel 80x86. I've actually seen it happen.
 
Z

zr

That depends on the implementation.  I've used C++ compilers for
the 80x86 where sizeof(void*) was 6.  (And sizeof(long) was 4,
and there was no long long, so there was no integral type large
enough to handle a pointer.)  And of course, sizeof(void*) could
be 4 on the earlier 16 bit processors.

--
James Kanze (GABI Software)             email:[email protected]
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Very interesting, thanks.
 
R

rep_movsd

Obviously, you aren't familiar with the Intel architecture.
And I've used them.  On an Intel 80386.

All Right, I heard of them now...
Only if the OS artificially chooses to restrict the users to
that.

Well, considering that a 32 bit CPU can never address more than that
anyway ( forget the 36 bit PAE shenanigans ), I don't get how 48 bit
addresses would be beneficial. Could you elaborate ( since you
actually used them ) ?
It does on an Intel 80x86.  I've actually seen it happen.

Which instruction(s) specifically?
 
B

Balog Pal

rep_movsd said:
All Right, I heard of them now...

Guess Intel's architecture manuals and data sheets are available for DL to
anyone actually interested to learn how stuff works.
Well, considering that a 32 bit CPU can never address more than that
anyway ( forget the 36 bit PAE shenanigans ),

Says who? Normally the X-bit means the width of the processor's data bus.
Some other times the general-purpose registers. It has absolutely nothing to
do with the address bus width (-> accessible phisical memory) or the
internal address space.
I don't get how 48 bit addresses would be beneficial. Could you elaborate
( since you
actually used them ) ?

For the discussed architecture the 48 bit refers to the pointer not the
address. The pointer has a 16 bit selector and a 32 bit offset. The
selector has a few control bits, the rest is used to select a *descriptor*
(from the descriptor table), that defines a memory segment in the virtual
address space with its base, length, access rights, etc.
Which instruction(s) specifically?

anything that loads the segment

Windows 3.1 used the segmented 16 bit model, with 32 bit pointers having
selector+16 bit offset. To pass a pointer value to a function sometimes it
used

void foo(void * vp);

void * p; // not inited
foo(p);
--> assy as

LES bx, [_p]
push bx
push es
call _foo

that even made sense if opt for space was asked for, it takes 1 byte less of
opcode...
Certainly if the loaded value was not a valid selector, it causes a trap,
and the famous 'unrecoverable application error'.
 
J

James Kanze

All Right, I heard of them now...
Well, considering that a 32 bit CPU can never address more than that
anyway ( forget the 36 bit PAE shenanigans ),

Why? Modern Intel processors do have a 36 bit physical memory
bus. But that's not the point; different segments can be mapped
in different manners.
I don't get how 48 bit addresses would be beneficial. Could
you elaborate ( since you actually used them ) ?

In one case, there was no disk, so no virtual memory;
segmentation was the protection mechanism. In another, the "OS"
was MS-DOS, which at the time only ran in native, 16 bit mode;
with no support for virtual memory, again, segments were used to
go beyond the 1 MB supported by the OS.
Which instruction(s) specifically?

Any instruction which loads a segment register: LDS, LES, LFS,
LGS, and possibly some others. This goes back 25 or 30 years
ago, so I've forgotten the details. But the argument given in
the C committee for having just reading an invalide pointer be
undefined behavior was based on the Intel (and other machines at
the time which had "address" registers, but I've no experience
with those).
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,763
Messages
2,569,562
Members
45,039
Latest member
CasimiraVa

Latest Threads

Top