Not-equal string searching

DavidW

Hello,

Is the function below the simplest way to produce an iterator to the next
non-space in a string? (Or the upper-bound iterator if none is found).
Searching for a sequence is overkill and inefficient IMO.

#include <string>
#include <algorithm>
#include <functional>

std::string::iterator find_not_space(std::string &s)
{
    char chSpace = ' ';
    return std::search(s.begin(), s.end(), &chSpace, &chSpace + 1,
                       std::not_equal_to<char>());
}
 
Daniel T.

DavidW said:
Is the function below the simplest way to produce an iterator to the next
non-space in a string?
No.

std::string::iterator find_not_space(std::string &s)
{
char chSpace = ' ';
return std::search(s.begin(), s.end(), &chSpace , &chSpace+1,
std::not_equal_to<char>());
}

If you aren't using boost's lambda library then:

std::string::iterator find_not_space(std::string &s)
{
    // bind2nd fixes the second argument of not_equal_to to ' '
    return std::find_if( s.begin(), s.end(),
                         std::bind2nd( std::not_equal_to<char>(), ' ' ) );
}

If you are using boost's lambda library:

std::string::iterator find_not_space(std::string &s)
{
    using namespace boost::lambda;  // for the _1 placeholder (<boost/lambda/lambda.hpp>)
    return std::find_if( s.begin(), s.end(), _1 != ' ' );
}
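
For readers on a newer compiler, the same idea can be sketched with a C++11 lambda instead of a binder or Boost. This is only an illustration and assumes C++11 or later is available; note that std::bind2nd was deprecated in C++11 and removed in C++17, so the binder version above needs an older language mode.

```cpp
#include <algorithm>
#include <string>

// Returns an iterator to the first character that is not ' ',
// or s.end() if the string is empty or consists only of spaces.
std::string::iterator find_not_space(std::string& s)
{
    return std::find_if(s.begin(), s.end(),
                        [](char c) { return c != ' '; });
}
```

As with the versions above, only the literal space character is skipped; tabs and newlines count as "not space".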
 
Kai-Uwe Bux

DavidW said:
Hello,

Is the function below the simplest way to produce an iterator to the next
non-space in a string? (Or the upper-bound iterator if none is found).
Searching for a sequence is overkill and inefficient IMO.

#include <string>
#include <algorithm>
#include <functional>

std::string::iterator find_not_space(std::string &s)
{
char chSpace = ' ';
return std::search(s.begin(), s.end(), &chSpace , &chSpace+1,
std::not_equal_to<char>());
}

a) You could use std::find_if with not_equal_to ' ' as the predicate. There
ought to be a way to fiddle around with binders to get it into a single
line. Or using lambda:

find_if( s.begin(), s.end(), _1 != ' ' );


b) Also notice the member function find_first_not_of of std::string. It
almost does what you want, except that it returns the index of the element
and not an iterator.



Best

Kai-Uwe Bux
 
James Kanze

Is the function below the simplest way to produce an iterator to the next
non-space in a string? (Or the upper-bound iterator if none is found).
Searching for a sequence is overkill and inefficient IMO.
#include <string>
#include <algorithm>
#include <functional>
std::string::iterator find_not_space(std::string &s)
{
char chSpace = ' ';
return std::search(s.begin(), s.end(), &chSpace , &chSpace+1,
std::not_equal_to<char>());
}

I'm not sure I understand. If all you're looking for is the
next non-space, std::find_if should work using the standard
functional objects, e.g.:
std::find_if(
begin, end,
std::bind2nd( std::not_equal_to< char >(), ' ' ) ) ;

Generally speaking, however, this isn't a good idea, since it
doesn't consider things like '\t' as spaces. I usually use
functional object wrappers for ctype<>::is(), with the
appropriate mask.
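
A wrapper of the kind described here might look roughly like the sketch below. The class name is_not and the function find_not_whitespace are hypothetical names chosen for illustration, not taken from the post; the sketch assumes the global locale is acceptable as a default.

```cpp
#include <algorithm>
#include <locale>
#include <string>

// Hypothetical predicate: true when c is NOT classified by `mask`
// according to the ctype<char> facet of the given locale.
class is_not
{
public:
    explicit is_not(std::ctype_base::mask mask,
                    const std::locale& loc = std::locale())
        : loc_(loc),
          facet_(&std::use_facet<std::ctype<char> >(loc_)),
          mask_(mask)
    {
    }

    bool operator()(char c) const { return !facet_->is(mask_, c); }

private:
    std::locale loc_;                 // keeps the facet alive
    const std::ctype<char>* facet_;   // owned by loc_
    std::ctype_base::mask mask_;
};

// Returns an iterator to the first non-whitespace character, or s.end().
std::string::iterator find_not_whitespace(std::string& s)
{
    return std::find_if(s.begin(), s.end(), is_not(std::ctype_base::space));
}
```

With the default "C" locale this treats '\t', '\n' and similar characters as whitespace, which the plain comparison against ' ' does not.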
 
James Kanze

Kai-Uwe Bux wrote:
b) Also notice the member function find_first_not_of of
std::string. It almost does what you want, except that it
returns the index of the element and not an iterator.

To get the iterator, of course, just add it to begin(). I still
prefer std::find_if, since it's what I'd do with any other
container.
 
Daniel T.

DavidW said:
Is the function below the simplest way to produce an iterator to the next
non-space in a string? (Or the upper-bound iterator if none is found).
Searching for a sequence is overkill and inefficient IMO.

#include <string>
#include <algorithm>
#include <functional>

std::string::iterator find_not_space(std::string &s)
{
char chSpace = ' ';
return std::search(s.begin(), s.end(), &chSpace , &chSpace+1,
std::not_equal_to<char>());
}

Someone correct me on this, but isn't the above undefined behavior?
chSpace isn't an array so I don't think &chSpace + 1 is valid even if it
isn't dereferenced.

I think it would have to be something more like:

char chSpace[] = " ";
return search( s.begin(), s.end(), chSpace, chSpace + 1,
not_equal_to<char>() );
 
DavidW

Daniel T. said:
Someone correct me on this, but isn't the above undefined behavior?
chSpace isn't an array so I don't think &chSpace + 1 is valid even if it
isn't dereferenced.

It would be invalid to dereference it even if it were an array (of length 1).
I think it would have to be something more like:

char chSpace[] = " ";

Except that the null terminator is a wasted character. Better would be:

char chSpace[] = {' '};
return search( s.begin(), s.end(), chSpace, chSpace + 1,
not_equal_to<char>() );

I don't know if it's undefined behaviour not to use an array. It seems an
unnecessary restriction if so.
 
Kai-Uwe Bux

James said:
To get the iterator, of course, just add it to begin().

I am not sure whether the index-based string functions mix so well with
iterators. I had a cursory look into the standard, but I was not able to
confirm what happens to

s.begin() + s.find_first_not_of( " " );

if find_first_not_of() does not find anything and returns npos. My gut tells
me I had better be worried that it might be undefined behavior.

James said:
I still prefer std::find_if, since it's what I'd do with any other
container.

Agreed.


Best

Kai-Uwe Bux
 
DavidW

James Kanze said:
I'm not sure I understand. If all you're looking for is the
next non-space, std::find_if should work using the standard
functional objects, e.g.:
std::find_if(
begin, end,
std::bind2nd( std::not_equal_to< char >(), ' ' ) ) ;

Thanks, and to the others who suggested bind2nd. I wanted to use find_if and I
looked for such a predicate, but obviously not hard enough.

I would also like a std::find with a predicate, e.g., std::find(s.begin(),
Generally speaking, however, this isn't a good idea, since it
doesn't consider things like '\t' as spaces. I usually use
functional object wrappers for ctype<>::is(), with the
appropriate mask.

In this case it's specific to the space character.
 
DavidW

Krishanu Debnath said:
DavidW said:
Daniel T. said:
[snip]
return search( s.begin(), s.end(), chSpace, chSpace + 1,
not_equal_to<char>() );

I don't know if it's undefined behaviour not to use an array. It seems an
unnecessary restriction if so.

Yes, it is. It has to be an array object.

Well, I guess this isn't the place to ask why, only how. It would be an odd
machine that has an object in addressable memory but falls over with any use of
its address+1. Perhaps a memory-hungry program might want to store a single
object at 0xFFFFFFFF, on the assumption that the programmer won't cause a
wrapping exception?
 
Alf P. Steinbach

* Krishanu Debnath:
DavidW said:
Daniel T. said:
[snip]
return search( s.begin(), s.end(), chSpace, chSpace + 1,
not_equal_to<char>() );

I don't know if it's undefined behaviour not to use an array. It seems an
unnecessary restriction if so.

Yes, it is. It has to be an array object.

It's unclear what you guys are trying to say.

But it seems that §5.7/4 about additive operations (+, -) applies:

"For the purposes of these operators, a pointer to a nonarray object
behaves the same as a pointer to first element of an array of length one
with the type of the object as its element type."


Cheers, & hth.,

- Alf
 
Andrey Tarasevich

Daniel said:
...
Someone correct me on this, but isn't the above undefined behavior?
chSpace isn't an array so I don't think &chSpace + 1 is valid even if it
isn't dereferenced.
...

It is not undefined behavior. The C++ standard explicitly states that the rules
of pointer arithmetic apply directly to a standalone non-array object, as if
the object were an array with 1 element (see 5.7/4).
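
To make the rule concrete, here is a small sketch that treats a single object as a one-element range. The function name single_object_range_contains is hypothetical and exists only for this illustration.

```cpp
#include <algorithm>

// Treats `obj` as the one-element range [&obj, &obj + 1) and reports
// whether `value` occurs in it.
bool single_object_range_contains(const char& obj, char value)
{
    const char* first = &obj;
    // One past the end: this pointer may be formed and compared,
    // but it must not be dereferenced.
    const char* last = &obj + 1;
    return std::find(first, last, value) != last;
}
```

The algorithm only ever dereferences first; last is used purely as the end marker, which is the same situation as &chSpace + 1 in the original function.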
 
Jeff Schwab

Krishanu said:
DavidW said:
Daniel T. said:
[snip]
return search( s.begin(), s.end(), chSpace, chSpace + 1,
not_equal_to<char>() );

I don't know if it's undefined behaviour not to use an array. It seems an
unnecessary restriction if so.

Yes, it is. It has to be an array object.

Could you please explain why you believe that? I'm pretty sure you're
mistaken.
 
James Kanze

I am not sure whether the index-based string functions mix so well with
iterators. I had a cursory look into the standard, but I was not able to
confirm what happens to
s.begin() + s.find_first_not_of( " " );
if find_first_not_of() does not find anything and returns
npos. My gut tells me, I better be worried that it might be
undefined behavior.

Good point. It would be undefined behavior.
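
If the index-based member is still preferred, the npos case can be handled explicitly before converting to an iterator. A minimal sketch, with first_not_space as a hypothetical name:

```cpp
#include <string>

// Converts the index returned by find_first_not_of into an iterator,
// mapping npos to s.end() instead of adding it to s.begin().
std::string::iterator first_not_space(std::string& s)
{
    const std::string::size_type pos = s.find_first_not_of(' ');
    if (pos == std::string::npos)
        return s.end();
    return s.begin() + static_cast<std::string::difference_type>(pos);
}
```

The explicit check is what avoids the undefined behavior discussed above; without it, an all-space or empty string would add npos to begin().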
 
James Kanze

Someone correct me on this, but isn't the above undefined behavior?
No.

chSpace isn't an array so I don't think &chSpace + 1 is valid
even if it isn't dereferenced.

For purposes of address calculation, a scalar object behaves
like an array with one element.
 
Ralph D. Ungermann

Kai-Uwe Bux said:
I am not sure whether the index-based string functions mix so well with
iterators. I had a cursory look into the standard, but I was not able to
confirm what happens to

s.begin() + s.find_first_not_of( " " );

if find_first_not_of() does not find anything and returns npos. My gut tells
me, I better be worried that it might be undefined behavior.

Yup, but good old strspn( s.c_str(), " " ) works better.

But I feel somewhat uneasy: should I replace a function with a clear
name, but confusing effect by another one with a confusing name, but a
straightforward result?

-- ralph
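
For reference, the strspn approach could be wrapped as in the sketch below; skip_spaces is a hypothetical name. strspn returns the length of the leading run of characters drawn from the given set, so for an all-space string it returns the string's length and the result is s.end().

```cpp
#include <cstring>
#include <string>

// Returns an iterator to the first character that is not ' ',
// or s.end() if there is none.
std::string::iterator skip_spaces(std::string& s)
{
    const std::size_t n = std::strspn(s.c_str(), " ");
    return s.begin() + static_cast<std::string::difference_type>(n);
}
```

Unlike find_first_not_of, there is no special "not found" value to check, which is the straightforward result mentioned above.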
 
Ralph D. Ungermann

Andrey said:
It is not undefined behavior. C++ standard explicitly states that rules
of pointer arithmetic are immediately applicable to a standalone
non-array object, as if this object is an array with 1 element (see 5.7/4)

Ok so far. But if I need an object, that must behave like an array of
size 1, I just define it as such:

char const chSpace[1] = { ' ' };

It saves me from digging in the reference, and from using the address-of
operator.

-- ralph
 
