Checking the available range while iterating through a string

  • Thread starter Ángel José Riesgo

Ángel José Riesgo

Hi,

I'm writing some code that parses a string and tries to find some
tokens and extract some data from the string. The problem is simple
and the code I've just written works fine. However, I need some ugly
casts to get rid of a signed/unsigned mismatch warning, and I was
wondering if there may be a more elegant way of doing this. A dumbed
down version of my code follows:

#include <string>

const std::string kToken = "TOKEN";

void FindToken(std::string::const_iterator& position,
               std::string::const_iterator end)
{
    std::string::size_type tokenLength = kToken.length();
    if(tokenLength <= (end - position)) // <- Signed - unsigned conversion warning here.
    {
        std::string expectedToken(position, position + tokenLength);
        if(expectedToken == kToken)
            position += tokenLength; // The token has been found and the iterator is advanced before returning.
    }
}

Basically, I've moved the bit of code I'm interested in to the above
function FindToken, which tries to find a certain token ("TOKEN"), and
advances the "position" iterator by the token's length if it is found.
Otherwise, the function returns quietly leaving the iterator
unchanged. In the actual code, I can assume that the two iterators
come from the same string object and that position <= end.

Now the problem with the above code (building it with MSVC 10) is that
I get a warning because of the conversion between the signed type
returned by the (end - position) iterator subtraction and the
std::string::size_type unsigned integer type. I can
static_cast<std::string::size_type> the warning away, of course, but
it's a bit ugly, so I was wondering if anyone knows of a way of doing
this sort of thing without either warnings or casts.

Thanks in advance,

Ángel José Riesgo
 
itaj sherman


I think you should work either with indexes or with iterators. You're
trying to mix both.

With indexes use:
string::length()
string::at()
string::size_type

With iterators use:
string::begin()
string::end()
string::iterator
string::iterator::difference_type

Most of the code uses iterators. You just need to fix 1 line:

//std::string::size_type tokenLength = kToken.length();
std::string::const_iterator::difference_type const tokenLength
= ( kToken.end() - kToken.begin() );

Then your concepts will match, and you won't be needing any type
conversions.
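
A minimal sketch of the whole function with that one-line change applied, assuming (as stated in the original post) that both iterators come from the same string and that position <= end. The local name `available` and the assert are additions for illustration only:

```cpp
#include <cassert>
#include <string>

const std::string kToken = "TOKEN";

// Advances `position` past kToken if [position, end) starts with it;
// otherwise leaves `position` unchanged.
void FindToken(std::string::const_iterator& position,
               std::string::const_iterator end)
{
    // Precondition taken from the original post (added here as a check).
    assert(position <= end);

    // Both quantities are iterator differences, so both are signed and
    // the comparison below needs neither a cast nor a conversion.
    const std::string::difference_type tokenLength = kToken.end() - kToken.begin();
    const std::string::difference_type available = end - position;

    if (tokenLength <= available)
    {
        // Safe: position + tokenLength cannot pass `end` here.
        const std::string expectedToken(position, position + tokenLength);
        if (expectedToken == kToken)
            position += tokenLength;
    }
}
```

The length check still happens before position + tokenLength is formed, so the iterator never goes past the end of the range.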

itaj
 
itaj sherman

Sorry, I type faster than I think. Where I wrote

string::iterator::difference_type

that should be std::string::difference_type, and where I wrote

//std::string::size_type tokenLength = kToken.length();
std::string::const_iterator::difference_type const tokenLength
= ( kToken.end() - kToken.begin() );

that should be

std::string::difference_type const tokenLength
= ( kToken.end() - kToken.begin() );

Oh, and I don't mean that you should only ever use either iterators or
indexes. I mean don't mix them in wrong ways.

itaj
 
Ángel José Riesgo

This looks dubious: position + tokenLength could be an invalid iterator
if it is past std::string::end().  The static_cast version is fine as
there is nothing wrong with using the C++ style casts.

Thanks for your suggestions. I forgot to mention that I had actually
tried the if(position+tokenLength<= end) approach. That compiles
without any warnings, but then I was bitten by a run-time assertion
coming from the checked iterators because, as Leigh explains, the
addition operation was taking the value past the end iterator when the
available range was too short.
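
The failure described above comes from forming position + tokenLength before knowing that it stays inside the range. A small sketch of a check that only ever subtracts iterators, so nothing beyond `end` is created; the helper name `HasAtLeast` is hypothetical and not part of the original code:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true if at least `count` characters remain in
// [position, end). Only the difference end - position is computed, so no
// iterator past `end` is ever formed.
bool HasAtLeast(std::string::const_iterator position,
                std::string::const_iterator end,
                std::string::difference_type count)
{
    // Precondition taken from the original post (added here as a check).
    assert(position <= end);
    return count <= end - position;
}
```

Once this returns true for count == tokenLength, position + tokenLength is a valid iterator into the same string and can be used freely.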

The C++ static_casts are not too bad, but I always feel that casts are
somehow telling me that I'm being sloppy with the types. In another
answer in this thread, itaj sherman has recommended comparing the
iterator subtraction with another iterator subtraction (kToken.end() -
kToken.begin()) so that the types match. I like the consistency of
that approach, so I'm going to try that.
I agree.

/Leigh

I will look into that. Anyway, I'm not too worried about the temporary
string. As long as the code is readable and robust (this is not a
performance-intensive thing), it should be fine.

Ángel José Riesgo
 
Ángel José Riesgo

Thanks. That's the sort of consistency I was looking for. Now that you
mention it, it seems quite straightforward: comparing a subtraction of
iterators with, well, a subtraction of iterators. It makes perfect
sense.

Ángel José Riesgo
 
James Kanze


Using int instead of std::string::size_type should get rid of
the error. But you're doing a lot of extra work; my version
would be just:

    if ( static_cast<size_t>(end - position) >= kToken.size()
            && std::equal(kToken.begin(), kToken.end(), position) )
        position += kToken.size();

(Here, you need the static_cast, because of a design flaw in the
standard library; kToken.size() should return int.)
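
For reference, a sketch of that fragment placed in a complete function, under the original post's assumption that position <= end. The function name `FindTokenWithEqual`, the assert, and the second cast on the `+=` line are additions for illustration; the second cast only keeps the sketch free of sign-conversion warnings at stricter warning levels:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

const std::string kToken = "TOKEN";

// Advances `position` past kToken if [position, end) starts with it;
// otherwise leaves `position` unchanged. No temporary string is built.
void FindTokenWithEqual(std::string::const_iterator& position,
                        std::string::const_iterator end)
{
    // Precondition taken from the original post; it also makes the cast
    // of end - position to an unsigned type safe.
    assert(position <= end);

    // The length test comes first, so std::equal never reads past `end`.
    if (static_cast<std::size_t>(end - position) >= kToken.size()
        && std::equal(kToken.begin(), kToken.end(), position))
    {
        position += static_cast<std::string::difference_type>(kToken.size());
    }
}
```

Because && short-circuits, std::equal only runs when enough characters remain.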
 
itaj sherman

(Here, you need the static_cast, because of a design flaw in the
standard library; kToken.size() should return int.)

kToken.size() should return int specifically?
What if int is smaller than std::string::difference_type?

Does it mean there shouldn't be a container::size_type, and it should
all be container::difference_type?
Is there any good reason why there is such a distinction in the
standard containers?

itaj
 
itaj sherman

On 17/02/2011 16:44, itaj sherman wrote:

Kanze is trolling; ignore him.

/Leigh

When I read comp.lang.c++.*, Kanze is one of the few people whose
posts I specifically look for. I usually find his answers and opinions
ingeniously accurate and effective.

I hope he elaborates on what I asked about his post here.

I don't see how your posts in this thread do any of that.
I've noticed your "Troll meets Monolith" thread, and I think they fit
better in that thread.
If Kanze is a troll as you say I'm sure he would join your thread and
you can converse about it there.

itaj
 
itaj sherman

In this thread I pointed out an error that somebody made whilst trying
to avoid a cast which is kind of ironic.  There is nothing wrong with
using the C++ style casts.

I was referring to your replies under Kanze's answer, not to your
first one.

itaj
 
Ian Collins

When I read comp.lang.c++.*, Kanze is one of the few people whose
posts I specifically look for. I usually find his answers and opinions
ingeniously accurate and effective.

They invariably are, unlike some others around here.
 
itaj sherman

All containers in the standard have:
container::difference_type, with signed integer semantics
and also
container::size_type, with unsigned semantics

Because this is how it is in the standard containers, I always took it
for granted that this is the right way to go. And that this
distinction should be considered important.
Apparently, many people think that this distinction is bad.

There's a long discussion/argument about it in this thread:

http://groups.google.com/group/comp.lang.c++/browse_frm/thread/ddf9b5acb66b7099#

I guess I'll have to read more about it. Maybe the distinction is
pointless.

That would mean that, in the first place, container::size() should
have returned container::difference_type.
And also that container::size() would be in both the iterator and
index tool kits. Or maybe even that the iterator/index distinction
shouldn't be as strict as I'm used to.

kToken.size() would be exactly equivalent to ( kToken.end() -
kToken.begin() ).

In practice, many of these functions would receive only iterators
(begin,end) as separate parameters, or a single range object. So the
container::size() cannot be used anyway.
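
A short sketch of the distinction being discussed, assuming nothing beyond the standard library. The helper names `SignedLength` and `UnsignedDifference` are hypothetical and exist only to make the two behaviours visible:

```cpp
#include <cassert>
#include <limits>
#include <string>
#include <type_traits>

// The container requirements make size_type unsigned and difference_type signed.
static_assert(std::is_unsigned<std::string::size_type>::value,
              "size_type is unsigned");
static_assert(std::is_signed<std::string::difference_type>::value,
              "difference_type is signed");

// Hypothetical helper: the string's length as a signed difference_type,
// i.e. the same value as s.end() - s.begin().
std::string::difference_type SignedLength(const std::string& s)
{
    // Guard the narrowing: the size must fit in the signed type.
    assert(s.size() <= static_cast<std::string::size_type>(
        std::numeric_limits<std::string::difference_type>::max()));
    return static_cast<std::string::difference_type>(s.size());
}

// Hypothetical helper: unsigned subtraction wraps around instead of
// producing a negative value, which is where mixed comparisons go wrong.
std::string::size_type UnsignedDifference(std::string::size_type a,
                                          std::string::size_type b)
{
    return a - b;
}
```

With these, SignedLength(kToken) compares directly against end - position, while UnsignedDifference(3, 5) yields a very large positive number rather than -2. Since C++20 the standard library also offers std::ssize, which returns a container's size as a signed type.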

itaj
 
James Kanze

kToken.size() should return int specifically?

That would have been a better design.

What if int is smaller than std::string::difference_type?

Maybe. Then difference_type might be more appropriate.

Does it mean there shouldn't be a container::size_type, and it should
all be container::difference_type?

The two should probably be identical. Otherwise, you end up
with problems like yours.

Is there any good reason why there is such a distinction in the
standard containers?

I believe that the STL was originally developed on a 16-bit PC,
and the author wanted that extra bit. Although even on a 16-bit
PC, the value is arguable, and the problems are legion.
 
Ángel José Riesgo

Thanks for the suggestion. I forgot about the std::equal algorithm,
and it actually fits very nicely within my code, as it's mostly based
on iterators and comparisons.

Ángel José Riesgo
 
