Test if const_iterator may be dereferenced - with no direct access to original vector.

Discussion in 'C++' started by mathog, May 1, 2013.

  1. mathog

    mathog Guest

    What does one do in this situation:

    ....
    Glib::ustring::const_iterator icc;
    ....
    icc = _spans[lastspan].input_stream_first_character;
    ....
    // need a test here to see if the next line is safe
    if(*icc){

    In the textbook examples one has both the vector and the iterator, so
    the test can be rolled together on one line like:

    if(icc != avector.end() && *icc)

    In this case it isn't entirely clear which vector that iterator is
    referencing. (Because _spans hangs onto the iterator but does not
    store the vector, at least not publicly.) It might (might!) be possible
    to hunt the vector down, by chasing backwards through half a dozen
    objects to find it, but why should that be necessary? Is there not in
    C++ something like:

    if(icc->dereferencable()){

    ?

    This came up in a situation where an empty text span was embedded
    between others with characters. So on the 3rd span (or whatever it was)
    the value of icc was set to a non-dereferenceable value from the get-go,
    that value having been stored there long ago and far away in the code.

    Of course without the missing test the program segfaulted when it tried
    to dereference the const_iterator for this empty span.

    I suppose that the desired result could be accomplished with try/catch,
    but wonder if C++ iterators do not in general have some method for doing
    this.

    Thank you,

    David Mathog
    mathog, May 1, 2013
    #1

  2. Re: Test if const_iterator may be dereferenced - with no direct access to original vector.

    On Wed, 01 May 2013 13:57:36 -0700, mathog wrote:

    > What does one do in this situation:
    >
    > ...
    > Glib::ustring::const_iterator icc;
    > ...
    > icc = _spans[lastspan].input_stream_first_character;
    > ...
    > // need a test here to see if the next line is safe
    > if(*icc){
    >
    > In the textbook examples one has both the vector and the iterator, so
    > the test can be rolled together on one line like:
    >
    > if(icc != avector.end() && *icc)
    >
    > In this case it isn't entirely clear which vector that iterator is
    > referencing. (Because _spans hangs onto the iterator but does not store
    > the vector, at least not publicly.)


    Within the concept of C++ iterators, you always need *two* iterators: one
    to indicate the current position and another to indicate the end of the
    range. And although it is common for the end of a range to coincide with
    the end of a container, this is by no means part of the concept of
    iterators.
    For that reason, the common solution would be:

    ...
    Glib::ustring::const_iterator icc, end;
    ...
    icc = _spans[lastspan].input_stream_first_character;
    end = _spans[lastspan].input_stream_end;
    ...
    // comparing against end first makes the dereference on the next line safe
    if (icc != end && *icc) /* do something */

    The important change here is that a span knows where it ends. For the
    calling code, it does not matter if that end coincides with the end of a
    vector, or if that end happens to be the start of the next span.
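
    A minimal sketch of that idea, under some assumptions: std::string stands
    in for Glib::ustring, the names Span and first_char_is_nonzero are made up
    for illustration, and input_stream_end is the assumed extra member from the
    snippet above, not something the real _spans type is known to have.

```cpp
#include <cassert>
#include <string>

// Hypothetical span record: it stores both ends of its character range,
// so callers never need access to the owning string.
struct Span {
    std::string::const_iterator input_stream_first_character;
    std::string::const_iterator input_stream_end;  // assumed member
};

// True only when the span is non-empty and its first character is non-zero.
// The end comparison runs first, so *icc is never evaluated for an empty span.
bool first_char_is_nonzero(const Span& s) {
    std::string::const_iterator icc = s.input_stream_first_character;
    return icc != s.input_stream_end && *icc;
}
```

    An empty span is then simply one whose two iterators compare equal, and
    the test rejects it without dereferencing anything.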

    > It might (might!) be possible to
    > hunt the vector down, by chasing backwards through half a dozen objects
    > to find it, but why should that be necessary? Is there not in C++
    > something like:
    >
    > if(icc->dereferencable()){
    >
    > ?


    There are several problems with requiring such a function.
    First of all, the function can't tell if the iterator is still within the
    range it is meant to iterate over, because ranges are not required to end
    on a non-dereferenceable iterator.
    Secondly, iterators are meant to be lightweight objects. Not much more
    than a pointer or a wrapper around one with knowledge how to access the
    next element. As such, determining dereferenceability becomes as hard as
    determining dereferenceability for a plain pointer, which means
    practically impossible.

    >
    > This came up in a situation where an empty text span was embedded
    > between others with characters. So on the 3rd span (or whatever it was)
    > the value of icc was set to a non-dereferenceable value from the get-go,
    > that value having been stored there long ago and far away in the code.
    >
    > Of course without the missing test the program segfaulted when it tried
    > to dereference the const_iterator for this empty span.
    >
    > I suppose that the desired result could be accomplished with try/catch,
    > but wonder if C++ iterators do not in general have some method for doing
    > this.


    As the segfault was the result of undefined behaviour, try/catch would
    not have reliably helped you.
    The general method for checking if an iterator is still within range is
    to test if it has not reached the end iterator for that range yet.
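
    To illustrate the try/catch point, here is a rough sketch with std::string
    (the helper names read_at and read_iter are invented for this example).
    Indexed access through at() is specified to throw std::out_of_range, so a
    handler works there. Dereferencing an end iterator throws nothing, so the
    iterator version has to compare against the end of the range instead.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Checked indexed access: at() throws std::out_of_range when pos >= size(),
// so this try/catch is reliable.
bool read_at(const std::string& s, std::size_t pos, char& out) {
    try {
        out = s.at(pos);
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}

// Iterator access: there is no exception to catch, so test against the
// end of the range before dereferencing.
bool read_iter(std::string::const_iterator it,
               std::string::const_iterator end, char& out) {
    if (it == end) return false;
    out = *it;
    return true;
}
```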

    >
    > Thank you,
    >
    > David Mathog


    Bart van Ingen Schenau
    Bart van Ingen Schenau, May 2, 2013
    #2

  3. On 02.05.13 09.56, Andy Champ wrote:
    > If I then append to the string, so the buffer now contains
    >
    > "ABCdefghijklmnop"
    >
    > Without any change whatsoever to the iterator it has now become valid -
    > it points at d.


    No, this is undefined behavior. Changing a vector or string invalidates
    all existing iterators of this instance. You must consider that the
    append operation could require a reallocation.
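
    One way around this, sketched here under the assumption that the position
    can be kept as an offset (the helper name char_after_append is made up):
    store the index, modify the string, and only then build a fresh iterator
    from begin().

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Remembers a position as an offset rather than an iterator, so the
// position survives an append that may reallocate the buffer.
// Returns '\0' when the offset is still past the last character.
char char_after_append(std::string& s, std::size_t offset,
                       const std::string& tail) {
    s += tail;  // iterators obtained before this line must not be reused
    if (offset >= s.size()) return '\0';
    std::string::const_iterator it =
        s.cbegin() + static_cast<std::string::difference_type>(offset);
    return *it;
}
```

    With "ABC" and offset 3, appending "defghijklmnop" yields 'd', which is
    the result the quoted post expected from the stale iterator, but obtained
    without undefined behavior.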

    However, your answer that you can't check whether an iterator is valid
    and dereferenceable is right. I think this was done mainly for
    performance reasons. In C++, iterators are intended to be very cheap to
    copy. In many cases they are only one machine word in size.

    In other languages, like Java, iterators are heap objects. It doesn't
    matter whether they are a few bytes larger or not.


    Marcel
    Marcel Müller, May 2, 2013
    #3
  4. James Kanze

    James Kanze Guest

    Re: Test if const_iterator may be dereferenced - with no direct access to original vector.

    On Thursday, 2 May 2013 08:56:10 UTC+1, Andy Champ wrote:
    > On 01/05/2013 21:57, mathog wrote:


    > > Is there not in C++ something like:


    > > if(icc->dereferencable()){


    > No, there isn't, and for good reasons.


    The good reason is probably because there's no way of
    implementing it, given that you need a second iterator to know
    whether you're at the end or not. Every iterator I wrote before
    STL came along supported something like this (usually
    icc.isValid()).
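
    For readers who have not seen that style, a rough sketch of what such an
    iterator might look like. The names Cursor, current, next and count_chars
    are invented here, and this is only a guess at the general shape, not a
    reconstruction of any particular pre-STL library.

```cpp
#include <cassert>
#include <string>

// Hypothetical self-checking cursor: it carries its own end position,
// so it can answer isValid() without any help from the caller.
class Cursor {
public:
    Cursor(std::string::const_iterator first, std::string::const_iterator last)
        : cur_(first), end_(last) {}

    bool isValid() const { return cur_ != end_; }  // safe to call current()?
    char current() const { assert(isValid()); return *cur_; }
    void next() { assert(isValid()); ++cur_; }

private:
    std::string::const_iterator cur_;
    std::string::const_iterator end_;
};

// Counts characters by advancing the cursor until it reports invalid.
int count_chars(Cursor c) {
    int n = 0;
    for (; c.isValid(); c.next()) ++n;
    return n;
}
```

    The price is that the object is twice the size of a bare iterator, and it
    still cannot notice that the underlying string was modified or destroyed.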

    > Inside a std::string there's usually a buffer containing the string. (I
    > don't think there _has_ to be, but that's another matter (1) ). That
    > string is a load of characters, usually bytes.


    > Imagine I have an internal vector, which for efficiency the string code
    > has initially allocated as 16 bytes even though it only contains "ABC".
    > I'll use ? as a marker for "undefined". The bytes are then


    > "ABC?????????????"


    > An iterator to C can be de-referenced, but if you increment it you get
    > one that cannot be de-referenced. There's nothing about that ? that
    > marks it as something that can't be accessed. Without
    > accessing the original collection there's nothing the iterator
    > can use either - and for reasons I don't know the original STL
    > design doesn't contain references from iterators into the
    > collection (2). And if you _do_ de-reference it you'll just
    > get whatever character happens to be in the first question
    > mark.


    With most modern implementations, you'll get an assertion
    failure. (At least, this is the case with VC++ and g++.)

    > If I then append to the string, so the buffer now contains


    > "ABCdefghijklmnop"


    > Without any change whatsoever to the iterator it has now become valid -
    > it points at d.


    What happens in this case is undefined behavior. I suspect,
    however, that most implementations would miss that error
    (supposing that capacity() had been larger than the new string).

    > I can then set it to end(). Typically this will be an address one more
    > than p. Again, without reference to the collection you can't tell if
    > it's valid. And if you do de-reference it - well, you might get the byte
    > that follows p. Or that page in the processor's memory space might not
    > have been allocated, and you get an exception. So once more you are in
    > the realms of undefined behaviour.


    > (1) I just checked. In C++98 there's no requirement for there to be an
    > internal buffer, but for C++11 there is!


    > (2) But I can guess. Suppose the collection was on the heap, and was
    > deleted? Suppose the iterator was re-pointed into a different
    > collection? And if the collection was deleted, and another one of the
    > same type created in the same heap location, what then?


    In most implementations, iterators register with the container,
    so that they can be marked as invalid in such cases.

    --
    James
    James Kanze, May 2, 2013
    #4