Checking the available range while iterating through a string

Discussion in 'C++' started by Ángel José Riesgo, Feb 16, 2011.

  1. Hi,

    I'm writing some code that parses a string and tries to find some
    tokens and extract some data from the string. The problem is simple
    and the code I've just written works fine. However, I need some ugly
    casts to get rid of a signed/unsigned mismatch warning, and I was
    wondering if there may be a more elegant way of doing this. A dumbed
    down version of my code follows:

    #include <string>

    const std::string kToken = "TOKEN";

    void FindToken(std::string::const_iterator& position,
                   std::string::const_iterator end)
    {
        std::string::size_type tokenLength = kToken.length();
        // Signed/unsigned conversion warning on the next line.
        if(tokenLength <= (end - position))
        {
            std::string expectedToken(position, position + tokenLength);
            if(expectedToken == kToken)
                // The token has been found and the iterator is advanced
                // before returning.
                position += tokenLength;
        }
    }

    Basically, I've moved the bit of code I'm interested in to the above
    function FindToken, which tries to find a certain token ("TOKEN"), and
    advances the "position" iterator by the token's length if it is found.
    Otherwise, the function returns quietly leaving the iterator
    unchanged. In the actual code, I can assume that the two iterators
    come from the same string object and that position <= end.

    Now the problem with the above code (building it with MSVC 10) is that
    I get a warning because of the conversion between the signed type
    returned by the (end - position) iterator subtraction and the
    std::string::size_type unsigned integer type. I can
    static_cast<std::string::size_type> the warning away, of course, but
    it's a bit ugly, so I was wondering if anyone knows of a way of doing
    this sort of thing without either warnings or casts.
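
    For reference, the cast-based workaround described above might look like the following. This is only a sketch: the name FindTokenCast is made up for illustration, and it relies on the stated assumption that both iterators come from the same string and that position <= end, so the subtraction is never negative.

```cpp
#include <string>

const std::string kToken = "TOKEN";

// Hypothetical variant of FindToken that silences the warning with a cast.
// Assumes position <= end and that both iterators refer to the same string.
void FindTokenCast(std::string::const_iterator& position,
                   std::string::const_iterator end)
{
    const std::string::size_type tokenLength = kToken.length();
    // end - position is non-negative under the assumption above, so
    // converting it to the unsigned size_type does not change its value.
    if (tokenLength <= static_cast<std::string::size_type>(end - position))
    {
        const std::string expectedToken(position, position + tokenLength);
        if (expectedToken == kToken)
            position += tokenLength; // Token found: advance past it.
    }
}
```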

    Thanks in advance,

    Ángel José Riesgo
    Ángel José Riesgo, Feb 16, 2011
    #1

  2. Ángel José Riesgo

    itaj sherman Guest

    On Feb 16, 9:41 pm, Ángel José Riesgo <> wrote:
    > Hi,
    >
    > I'm writing some code that parses a string and tries to find some
    > tokens and extract some data from the string. The problem is simple
    > and the code I've just written works fine. However, I need some ugly
    > casts to get rid of a signed/unsigned mismatch warning, and I was
    > wondering if there may be a more elegant way of doing this. A dumbed
    > down version of my code follows:
    >
    > #include <string>
    >
    > const std::string kToken = "TOKEN";
    >
    > void FindToken(std::string::const_iterator& position,
    >                std::string::const_iterator end)
    > {
    >     std::string::size_type tokenLength = kToken.length();
    >     // Signed/unsigned conversion warning on the next line.
    >     if(tokenLength <= (end - position))
    >     {
    >         std::string expectedToken(position, position + tokenLength);
    >         if(expectedToken == kToken)
    >             // The token has been found and the iterator is advanced
    >             // before returning.
    >             position += tokenLength;
    >     }
    > }
    >


    I think you should work either with indexes or with iterators. You're
    trying to mix both.

    With indexes use:
    string::length()
    string::at()
    string::size_type

    with indexes use:
    string::begin()
    string::end()
    string::iterator
    string::iterator::difference_type

    Most of the code uses iterators. You just need to fix 1 line:

    //std::string::size_type tokenLength = kToken.length();
    std::string::const_iterator::difference_type const tokenLength
    = ( kToken.end() - kToken.begin() );

    Then your concepts will match, and you won't be needing any type
    conversions.
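
    Applied to the whole function, the one-line change might look like the sketch below. The name FindTokenDiff is invented for illustration, and the length is spelled std::string::difference_type, which names the signed type that string iterator subtraction yields.

```cpp
#include <string>

const std::string kToken = "TOKEN";

// Hypothetical variant of FindToken that keeps every length signed.
// Assumes position <= end and that both iterators refer to the same string.
void FindTokenDiff(std::string::const_iterator& position,
                   std::string::const_iterator end)
{
    // The token length is obtained the same way as the available range:
    // by subtracting two iterators, so both sides have the same type.
    const std::string::difference_type tokenLength =
        kToken.end() - kToken.begin();
    if (tokenLength <= end - position)
    {
        const std::string expectedToken(position, position + tokenLength);
        if (expectedToken == kToken)
            position += tokenLength; // Token found: advance past it.
    }
}
```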

    itaj
    itaj sherman, Feb 16, 2011
    #2

  3. Ángel José Riesgo

    itaj sherman Guest

    On Feb 17, 12:42 am, itaj sherman <> wrote:
    > On Feb 16, 9:41 pm, Ángel José Riesgo <> wrote:
    >


    >
    > With indexes use:
    > string::length()
    > string::at()
    > string::size_type
    >
    > with indexes use:


    Yeah, obviously I meant "with iterators use:"

    > string::begin()
    > string::end()
    > string::iterator
    > string::iterator::difference_type
    >


    itaj
    itaj sherman, Feb 16, 2011
    #3
  4. Ángel José Riesgo

    itaj sherman Guest

    On Feb 17, 12:42 am, itaj sherman <> wrote:
    > On Feb 16, 9:41 pm, Ángel José Riesgo <> wrote:
    >
    >


    Sorry, I type faster than I think:

    >
    > string::iterator::difference_type
    >


    that's std::string::difference_type

    >
    > //std::string::size_type tokenLength = kToken.length();
    > std::string::const_iterator::difference_type const tokenLength
    > = ( kToken.end() - kToken.begin() );
    >


    that's
    std::string::difference_type const tokenLength
    = ( kToken.end() - kToken.begin() );


    Oh, and I don't mean you should only ever use either iterators or
    indexes. I mean don't mix them in wrong ways.

    itaj
    itaj sherman, Feb 16, 2011
    #4
  5. On Feb 16, 10:17 pm, Leigh Johnston <> wrote:
    > On 16/02/2011 20:11, Paavo Helde wrote:
    >
    > > Ángel José Riesgo<>  wrote in
    > >news::

    >
    > >> Hi,

    >
    > >> I'm writing some code that parses a string and tries to find some
    > >> tokens and extract some data from the string. The problem is simple
    > >> and the code I've just written works fine. However, I need some ugly
    > >> casts to get rid of a signed/unsigned mismatch warning, and I was
    > >> wondering if there may be a more elegant way of doing this. A dumbed
    > >> down version of my code follows:

    >
    > >> #include<string>

    >
    > >> const std::string kToken = "TOKEN";

    >
    > >> void FindToken(std::string::const_iterator& position,
    > >>                std::string::const_iterator end)
    > >> {
    > >>     std::string::size_type tokenLength = kToken.length();
    > >>     // Signed/unsigned conversion warning on the next line.
    > >>     if(tokenLength <= (end - position))
    > >>     {
    > >>         std::string expectedToken(position, position + tokenLength);
    > >>         if(expectedToken == kToken)
    > >>             // The token has been found and the iterator is advanced
    > >>             // before returning.
    > >>             position += tokenLength;
    > >>     }
    > >> }

    >
    > >> Basically, I've moved the bit of code I'm interested in to the above
    > >> function FindToken, which tries to find a certain token ("TOKEN"), and
    > >> advances the "position" iterator by the token's length if it is found.
    > >> Otherwise, the function returns quietly leaving the iterator
    > >> unchanged. In the actual code, I can assume that the two iterators
    > >> come from the same string object and that position<= end.

    >
    > >> Now the problem with the above code (building it with MSVC 10) is that
    > >> I get a warning because of the conversion between the signed type
    > >> returned by the (end - position) iterator subtraction and the
    > >> std::string::size_type unsigned integer type. I can
    > >> static_cast<std::string::size_type>  the warning away, of course, but
    > >> it's a bit ugly, so I was wondering if anyone knows of a way of doing
    > >> this sort of thing without either warnings or casts.

    >
    > >> Thanks in advance,

    >
    > > The warning can be avoided by writing:

    >
    > >    if(position+tokenLength<= end)

    >
    > This looks dubious: position + tokenLength could be an invalid iterator
    > if it is past std::string::end().  The static_cast version is fine as
    > there is nothing wrong with using the C++ style casts.
    >


    Thanks for your suggestions. I forgot to mention that I had actually
    tried the if(position+tokenLength<= end) approach. That compiles
    without any warnings, but then I was bitten by a run-time assertion
    coming from the checked iterators because, as Leigh explains, the
    addition operation was taking the value past the end iterator when the
    available range was too short.

    The C++ static_casts are not too bad, but I always feel that casts are
    somehow telling me that I'm being sloppy with the types. In another
    answer in this thread, itaj sherman has recommended comparing the
    iterator subtraction with another iterator subtraction (kToken.end() -
    kToken.begin()) so that the types match. I like the consistency of
    that approach, so I'm going to try that.

    >
    > > However, I would redesign this function so I could use
    > > std::string::compare() and avoid this explicit check and also a needless
    > > construction of the temporary string expectedToken.

    >
    > I agree.
    >
    > /Leigh


    I will look into that. Anyway, I'm not too worried about the temporary
    string. As long as the code is readable and robust (this is not a
    performance-intensive thing), it should be fine.
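
    For readers curious about the std::string::compare() redesign suggested in the quoted text, one possible shape is sketched below. It is not code from the thread: the name FindTokenAt and the switch from iterators to an index are assumptions made for illustration. The three-argument compare() clips the substring at the end of the text, so no explicit length check and no temporary string are needed.

```cpp
#include <string>

const std::string kToken = "TOKEN";

// Hypothetical index-based redesign. pos must satisfy pos <= text.size();
// otherwise compare() throws std::out_of_range.
void FindTokenAt(const std::string& text, std::string::size_type& pos)
{
    // compare(pos, len, str) compares at most len characters starting at
    // pos. If fewer than len characters remain, the shorter substring
    // simply compares unequal to the token.
    if (text.compare(pos, kToken.size(), kToken) == 0)
        pos += kToken.size(); // Token found: advance past it.
}
```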

    Ángel José Riesgo
    Ángel José Riesgo, Feb 17, 2011
    #5
  6. On Feb 16, 11:59 pm, itaj sherman <> wrote:
    > On Feb 17, 12:42 am, itaj sherman <> wrote:
    >
    > > On Feb 16, 9:41 pm, Ángel José Riesgo <> wrote:

    >
    > Sorry, I type faster than I think:
    >
    >
    >
    > > string::iterator::difference_type

    >
    > that's std::string::difference_type
    >
    >
    >
    > > //std::string::size_type tokenLength = kToken.length();
    > > std::string::const_iterator::difference_type const tokenLength
    > >   = ( kToken.end() - kToken.begin() );

    >
    > that's
    > std::string::difference_type const tokenLength
    >   = ( kToken.end() - kToken.begin() );
    >
    > Oh, and I don't mean always ever use either iteroator or indexes. I
    > mean don't mix them in wrong ways.
    >
    > itaj


    Thanks. That's the sort of consistency I was looking for. Now that you
    mention it, it seems quite straightforward: comparing a subtraction of
    iterators with, well, a subtraction of iterators. It makes perfect
    sense.

    Ángel José Riesgo
    Ángel José Riesgo, Feb 17, 2011
    #6
  7. Ángel José Riesgo

    James Kanze Guest

    On Feb 16, 7:41 pm, Ángel José Riesgo <> wrote:

    > I'm writing some code that parses a string and tries to find some
    > tokens and extract some data from the string. The problem is simple
    > and the code I've just written works fine. However, I need some ugly
    > casts to get rid of a signed/unsigned mismatch warning, and I was
    > wondering if there may be a more elegant way of doing this. A dumbed
    > down version of my code follows:


    > #include <string>


    > const std::string kToken = "TOKEN";


    > void FindToken(std::string::const_iterator& position,
    >                std::string::const_iterator end)
    > {
    >     std::string::size_type tokenLength = kToken.length();
    >     // Signed/unsigned conversion warning on the next line.
    >     if(tokenLength <= (end - position))
    >     {
    >         std::string expectedToken(position, position + tokenLength);
    >         if(expectedToken == kToken)
    >             // The token has been found and the iterator is advanced
    >             // before returning.
    >             position += tokenLength;
    >     }
    > }


    > Basically, I've moved the bit of code I'm interested in to the above
    > function FindToken, which tries to find a certain token ("TOKEN"), and
    > advances the "position" iterator by the token's length if it is found.
    > Otherwise, the function returns quietly leaving the iterator
    > unchanged. In the actual code, I can assume that the two iterators
    > come from the same string object and that position <= end.


    > Now the problem with the above code (building it with MSVC 10) is that
    > I get a warning because of the conversion between the signed type
    > returned by the (end - position) iterator subtraction and the
    > std::string::size_type unsigned integer type. I can
    > static_cast<std::string::size_type> the warning away, of course, but
    > it's a bit ugly, so I was wondering if anyone knows of a way of doing
    > this sort of thing without either warnings or casts.


    Using int instead of std::string::size_type should get rid of
    the error. But you're doing a lot of extra work; my version
    would be just:

    if ( static_cast<size_t>(end - position) >= kToken.size()
            && std::equal(kToken.begin(), kToken.end(), position) )
        position += kToken.size();

    (Here, you need the static_cast, because of a design flaw in the
    standard library; kToken.size() should return int.)
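
    Dropped into a complete function, the snippet above might read as follows. The wrapper name FindTokenEqual and the choice of headers are assumptions added for illustration; the condition itself is the one from the post.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

const std::string kToken = "TOKEN";

// Hypothetical wrapper around the std::equal-based check.
// Assumes position <= end and that both iterators refer to the same string.
void FindTokenEqual(std::string::const_iterator& position,
                    std::string::const_iterator end)
{
    // The length check runs first, so std::equal never reads past end.
    if (static_cast<std::size_t>(end - position) >= kToken.size()
        && std::equal(kToken.begin(), kToken.end(), position))
    {
        position += kToken.size(); // Token found: advance past it.
    }
}
```

    Compared with the original, this avoids building the temporary expectedToken string.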

    --
    James Kanze
    James Kanze, Feb 17, 2011
    #7
  8. Ángel José Riesgo

    itaj sherman Guest

    On Feb 17, 6:18 pm, James Kanze <> wrote:
    > On Feb 16, 7:41 pm, Ángel José Riesgo <> wrote:
    >


    >
    > (Here, you need the static_cast, because of a design flaw in the
    > standard library; kToken.size() should return int.)
    >


    kToken.size() should return int specifically?
    What if int is smaller than std::string::difference_type?

    Does it mean there shouldn't be a container::size_type, and it should
    all be container::difference_type?
    Is there any good reason why there is such a distinction in the
    standard containers?

    itaj
    itaj sherman, Feb 17, 2011
    #8
  9. Ángel José Riesgo

    itaj sherman Guest

    On Feb 17, 6:47 pm, Leigh Johnston <> wrote:
    > On 17/02/2011 16:44, itaj sherman wrote:
    >
    > Kanze is trolling; ignore him.
    >
    > /Leigh


    When I read comp.lang.c++.*, Kanze is one of the few people whose
    posts I specifically look for. I usually find his answers and
    opinions ingeniously accurate and effective.

    I hope he elaborates on what I asked about his post here.

    I don't see how your posts in this thread do any of that.
    I've noticed your "Troll meets Monolith" thread, and I think they fit
    better in that thread.
    If Kanze is a troll as you say I'm sure he would join your thread and
    you can converse about it there.

    itaj
    itaj sherman, Feb 17, 2011
    #9
  10. Ángel José Riesgo

    itaj sherman Guest

    On Feb 17, 7:07 pm, Leigh Johnston <> wrote:
    > On 17/02/2011 16:59, itaj sherman wrote:
    >
    >
    > > I don't see how your posts in this thread do any of that.

    >
    > In this thread I pointed out an error that somebody made whilst trying
    > to avoid a cast which is kind of ironic.  There is nothing wrong with
    > using the C++ style casts.
    >


    I was referring to your replies under Kanze's answer, not to your
    first one.

    itaj
    itaj sherman, Feb 17, 2011
    #10
  11. Ángel José Riesgo

    Ian Collins Guest

    On 02/18/11 05:59 AM, itaj sherman wrote:
    > On Feb 17, 6:47 pm, Leigh Johnston<> wrote:
    >> On 17/02/2011 16:44, itaj sherman wrote:
    >>
    >> Kanze is trolling; ignore him.
    >>
    >> /Leigh

    >
    > When I read comp.lang.c++.*, Kanze is one of the few people that I go
    > around looking specifically for their posts. I usually find his
    > answers and opinions ingeniously accurate and effective.


    They invariably are, unlike some others around here.

    --
    Ian Collins
    Ian Collins, Feb 17, 2011
    #11
  12. Ángel José Riesgo

    itaj sherman Guest

    All containers in the standard have:
    container::difference_type, with signed integer semantics
    and also
    container::size_type, with unsigned semantics

    Because this is how it is in the standard containers, I always took it
    for granted that this is the right way to go. And that this
    distinction should be considered important.
    Apparently, many people think that this distinction is bad.

    There's a long discussion/argument about it in this thread:

    http://groups.google.com/group/comp.lang.c++/browse_frm/thread/ddf9b5acb66b7099#

    I guess I'll have to read more about it. Maybe the distinction is
    pointless.

    That would mean that, in the first place, container::size() should
    have returned container::difference_type.
    And also that container::size() would be in both the iterator and
    index tool kits. Or maybe even that the iterator/index distinction
    shouldn't be as strict as I'm used to.

    kToken.size() would be exactly equivalent to ( kToken.end() -
    kToken.begin() ).

    In practice, many of these functions would receive only iterators
    (begin,end) as separate parameters, or a single range object. So the
    container::size() cannot be used anyway.
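
    The distinction discussed above can be checked directly for std::string. The sketch below assumes a C++11 compiler (for static_assert and <type_traits>), and the helper name SizeMatchesIteratorDistance is made up for illustration.

```cpp
#include <string>
#include <type_traits>

// With the default allocator, size_type is unsigned and difference_type
// is signed, which is why comparing the two triggers the warning.
static_assert(std::is_unsigned<std::string::size_type>::value,
              "size_type is expected to be unsigned");
static_assert(std::is_signed<std::string::difference_type>::value,
              "difference_type is expected to be signed");

// Hypothetical helper: returns true when size() and end() - begin()
// describe the same length, once the signedness is reconciled.
bool SizeMatchesIteratorDistance(const std::string& s)
{
    const std::string::difference_type distance = s.end() - s.begin();
    return distance >= 0
        && static_cast<std::string::size_type>(distance) == s.size();
}
```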

    itaj
    itaj sherman, Feb 17, 2011
    #12
  13. Ángel José Riesgo

    James Kanze Guest

    On Feb 17, 4:44 pm, itaj sherman <> wrote:
    > On Feb 17, 6:18 pm, James Kanze <> wrote:


    > > Using int instead of std::string::size_type should get rid of
    > > the error. But you're doing a lot of extra work; my version
    > > would be just:


    > > if ( static_cast<size_t>(end - position) >= kToken.size()
    > >         && std::equal(kToken.begin(), kToken.end(), position) )
    > >     position += kToken.size();


    > > (Here, you need the static_cast, because of a design flaw in the
    > > standard library; kToken.size() should return int.)


    > kToken.size() should return int specifically?


    That would have been a better design.

    > What if int is smaller than std::string::difference_type?


    Maybe. Then difference_type might be more appropriate.

    > Does it mean there shouldn't be a container::size_type, and it should
    > all be container::difference_type?


    The two should probably be identical. Otherwise, you end up
    with problems like yours.

    > Is there any good reason why there is such a distinction in the
    > standard containers?


    I believe that the STL was originally developed on a 16-bit PC,
    and the author wanted that extra bit. Although even on a 16-bit
    PC, the value is arguable, and the problems are legion.

    --
    James Kanze
    James Kanze, Feb 17, 2011
    #13
  14. On Feb 17, 5:18 pm, James Kanze <> wrote:
    > On Feb 16, 7:41 pm, Ángel José Riesgo <> wrote:
    >
    > > I'm writing some code that parses a string and tries to find some
    > > tokens and extract some data from the string. The problem is simple
    > > and the code I've just written works fine. However, I need some ugly
    > > casts to get rid of a signed/unsigned mismatch warning, and I was
    > > wondering if there may be a more elegant way of doing this. A dumbed
    > > down version of my code follows:
    > > #include <string>
    > > const std::string kToken = "TOKEN";
    > > void FindToken(std::string::const_iterator& position,
    > >                std::string::const_iterator end)
    > > {
    > >     std::string::size_type tokenLength = kToken.length();
    > >     // Signed/unsigned conversion warning on the next line.
    > >     if(tokenLength <= (end - position))
    > >     {
    > >         std::string expectedToken(position, position + tokenLength);
    > >         if(expectedToken == kToken)
    > >             // The token has been found and the iterator is advanced
    > >             // before returning.
    > >             position += tokenLength;
    > >     }
    > > }
    > > Basically, I've moved the bit of code I'm interested in to the above
    > > function FindToken, which tries to find a certain token ("TOKEN"), and
    > > advances the "position" iterator by the token's length if it is found.
    > > Otherwise, the function returns quietly leaving the iterator
    > > unchanged. In the actual code, I can assume that the two iterators
    > > come from the same string object and that position <= end.
    > > Now the problem with the above code (building it with MSVC 10) is that
    > > I get a warning because of the conversion between the signed type
    > > returned by the (end - position) iterator subtraction and the
    > > std::string::size_type unsigned integer type. I can
    > > static_cast<std::string::size_type> the warning away, of course, but
    > > it's a bit ugly, so I was wondering if anyone knows of a way of doing
    > > this sort of thing without either warnings or casts.

    >
    > Using int instead of std::string::size_type should get rid of
    > the error.  But you're doing a lot of extra work; my version
    > would be just:
    >
    >     if ( static_cast<size_t>(end - position) >= kToken.size()
    >             && std::equal(kToken.begin(), kToken.end(), position) )
    >         position += kToken.size();
    >
    > (Here, you need the static_cast, because of a design flaw in the
    > standard library; kToken.size() should return int.)
    >
    > --
    > James Kanze


    Thanks for the suggestion. I forgot about the std::equal algorithm,
    and it actually fits very nicely within my code, as it's mostly based
    on iterators and comparisons.

    Ángel José Riesgo
    Ángel José Riesgo, Feb 18, 2011
    #14