check if line is whitespace

puzzlecracker

What is the quickest way to check that the following:

const line[127]; only contains whitespace, in which case to ignore it.

something along these lines:

isspacedLine(line);

Thanks
 
Zeppe

puzzlecracker said:
What is the quickest way to check that the following:

const line[127]; only contains whitespace, in which case to ignore it.

something along these lines:

isspacedLine(line);

const line[127];

doesn't mean anything in C++. Apart from that, if line is an array of
char, I'm pretty sure that somebody with "puzzlecracker" as a
nickname will be more than able to solve it ;)

Best wishes,

Zeppe
 
Darío

puzzlecracker said:
What is the quickest way to check that the following:

const line[127]; only contains whitespace, in which case to ignore it.

something along these lines:

isspacedLine(line);

Thanks

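bool isLineSpaced(const char line[127])
{
    // Advance past spaces, stopping at the terminating '\0' or at the
    // end of the buffer, whichever comes first.
    int i = 0;
    while (i < 127 && line[i] == ' ')
        ++i;
    return i == 127 || line[i] == '\0';
}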
 
puzzlecracker

Guys, yeah, I wrote something similar to your suggestions:

if( (line[strlen(line) -1] == '\n') )
    line[strlen(line) -1] = '\0';

//ignore whitespace lines
unsigned int i;
for(i=0; line[i]!='\0' && isspace(line[i]);i++)
    ;
if(i==strlen(line))
    continue;
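
For reference, the same idea can be packaged as two small functions. This is only a sketch, assuming line is a NUL-terminated, writable char buffer; the names chompNewline and isWhitespaceOnly are made up for illustration. It guards against an empty string (where strlen(line) - 1 would wrap around) and converts each char to unsigned char before calling isspace.

```cpp
#include <cctype>
#include <cstddef>
#include <cstring>

// Hypothetical helper: remove one trailing '\n', if present.
void chompNewline(char* line)
{
    std::size_t len = std::strlen(line);
    // Check len first: on an empty string, len - 1 would wrap around.
    if (len > 0 && line[len - 1] == '\n')
        line[len - 1] = '\0';
}

// Hypothetical helper: true if the string is empty or holds only
// characters for which isspace is true in the current C locale.
bool isWhitespaceOnly(const char* line)
{
    for (; *line != '\0'; ++line) {
        // isspace requires a value representable as unsigned char (or EOF).
        if (!std::isspace(static_cast<unsigned char>(*line)))
            return false;
    }
    return true;
}
```

In the reading loop this would become something like: chompNewline(line); if (isWhitespaceOnly(line)) continue;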
 
James Kanze

puzzlecracker said:
What is the quickest way to check that the following:
const line[127]; only contains whitespace, in which case to ignore it.

You mean std::string line, don't you? The above isn't a legal
C++ declaration.

puzzlecracker said:
something along these lines:
isspacedLine(line);

Well, the standard library already has direct support for this,
but its interface isn't the most friendly. But something like
the following should do the trick:

#include <locale>
#include <string>

bool
isOnlySpaces(
    std::string const& line,
    std::locale const& locale = std::locale() )
{
    return std::use_facet< std::ctype< char > >( locale )
            .scan_not( std::ctype_base::space,
                       line.data(), line.data() + line.size() )
        == line.data() + line.size() ;
}

(If you're forced to use arrays of char, instead of string, this
solution still works perfectly well.)
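
To illustrate the array-of-char case, here is a minimal sketch. It assumes the buffer is NUL-terminated, and the name isOnlySpacesCStr is hypothetical. Only the way the end pointer is obtained changes.

```cpp
#include <cstring>
#include <locale>

// Hypothetical helper: true if the NUL-terminated buffer is empty or
// contains only characters the locale classifies as whitespace.
bool isOnlySpacesCStr(const char* line,
                      const std::locale& loc = std::locale())
{
    const char* end = line + std::strlen(line);
    // scan_not returns a pointer to the first character that does NOT
    // match the mask, or `end` if every character matches it.
    return std::use_facet< std::ctype<char> >(loc)
               .scan_not(std::ctype_base::space, line, end) == end;
}
```

With a fixed-size buffer such as char line[127], passing line directly works as long as the contents are terminated within the array.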

More generally, however, I tend to use regular expressions in
such cases. If the line matches "^[[:space:]]*$", ignore it.
With a good implementation of regular expressions (which uses a
DFA if the expression contains no extensions), this can be just
as fast as the above, if not faster. (Just make sure you only
construct the regular expression once, and not every time you
call the function.)
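
As a rough illustration of the "construct it once" advice, here is a sketch using std::regex from C++11. That is a substitution for whichever regular expression class was meant above, and the name isBlankLine is hypothetical. No claim is made that std::regex is DFA-based or matches the speed described.

```cpp
#include <regex>
#include <string>

// Hypothetical helper built on std::regex (C++11).
bool isBlankLine(const std::string& line)
{
    // Function-local static: the pattern is compiled once, on the first
    // call, and reused afterwards.
    static const std::regex blank("[[:space:]]*");
    // regex_match succeeds only if the whole string matches, so the
    // ^ and $ anchors are not needed here.
    return std::regex_match(line, blank);
}
```

A caller reading with std::getline would simply skip any line for which isBlankLine(line) is true.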
 
James Kanze

Darío said:
What is the quickest way to check that the following:
const line[127]; only contains whitespace, in which case to ignore it.
something along these lines:
isspacedLine(line);
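bool isLineSpaced(const char line[127])
{
    // Advance past spaces, stopping at the terminating '\0' or at the
    // end of the buffer, whichever comes first.
    int i = 0;
    while (i < 127 && line[i] == ' ')
        ++i;
    return i == 127 || line[i] == '\0';
}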
That's C, not C++.

Well, it's also C++, albeit not idiomatic or good C++.
The C++ solution would be:
#include <algorithm>
#include <cctype>

and not <cctype>:-). (With said:
#include <functional>
#include <vector>
bool isLineSpaced(const std::vector<char> &line)
{
    return std::find_if(line.begin(), line.end(),
                        std::not1(std::ptr_fun(isspace))) == line.end();
}

Which is fine, except that it has undefined behavior. What you
probably meant was something like:

#include <algorithm>
#include <cctype>
#include <string>

struct NotIsSpace
{
    bool operator()( char ch ) const
    {
        // Convert to unsigned char before calling isspace.
        return ! std::isspace(
                    static_cast< unsigned char >( ch ) ) ;
    }
} ;

bool
isEmptyLine(
    std::string const& line )
{
    return std::find_if( line.begin(), line.end(), NotIsSpace() )
        == line.end() ;
}

(You cannot call the version of isspace in <cctype> with a char
without risking undefined behavior.)

Still, a quick benchmark shows that something like:

myCtype.scan_not( std::ctype_base::space,
                  myData.data(),
                  myData.data() + myData.size() )
    == myData.data() + myData.size() ;

, with myCtype initialized with "std::use_facet< std::ctype<
char > >( std::locale() )", is roughly five times faster (at least
on one system: g++ 4.1 under Linux on an Intel). And it's
certainly more idiotic^H^H^Hmatic with regards to C++.

(FWIW, using a full regular expression was only about three
times slower than your solution. And is a lot more powerful.)
 
Nick Keighley

puzzlecracker said:
What is the quickest way to check that the following:

const line[127]; only contains whitespace, in which case to ignore it.

something along these lines:

isspacedLine(line);

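Is a C solution any good?

#include <cstring>

bool isspacedLine (const char* line)
{
    // strspn returns the length of the leading run made up only of
    // the listed characters; the line is blank if that run reaches
    // the terminating '\0'.
    size_t i = strspn (line, " \t\f\n");
    return line[i] == '\0';
}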
 
Gennaro Prota

James Kanze wrote:
[...]
More generally, however, I tend to use regular expressions in
such cases. If the line matches "^[[:space:]]*$", ignore it.
With a good implementation of regular expressions (which uses a
DFA if the expression contains no extensions), this can be just
as fast as the above, if not faster.

I see that you mention execution speed here and in other posts of this
thread. Since you aren't in the Premature-Optimization "school of
thought", I re-read the original post, and it says "quickest way". I
think that wasn't meant as "the way which executes fastest", though; I
get it as: "how do I avoid spending time implementing this?". And, of
course, the best solution is letting others, like you, implement it.
 
James Kanze

James Kanze wrote:
[...]
More generally, however, I tend to use regular expressions in
such cases. If the line matches "^[[:space:]]*$", ignore it.
With a good implementation of regular expressions (which uses a
DFA if the expression contains no extensions), this can be just
as fast as the above, if not faster.
I see that you mention execution speed here and in other posts
of this thread. Since you aren't in the Premature-Optimization
"school of thought", I re-read the original post, and it says
"quickest way". I think that wasn't meant as "the way which
executes fastest", though; I get it as: "how do I avoid
spending time implementing this?".

I suspect that that's wishful thinking on your part. That's
what it should mean, but most of the time, most programmers do
still use "quickest" to refer to execution time. Since the
issue of execution time was raised, I felt it necessary to
address it. The regular expression solution is by far the
simplest, and its execution time is NOT necessarily too bad.

Of course, the regular expression class I use here is my own,
not that of Boost. The two are significantly different, being
designed from the start with different goals in mind. For most
general use, Boost's regular expression is better than mine, but
in this particular case: my regular expression class supports
the or'ing of multiple regular expressions, with different
return values. So you can write something like:

enum { emptyLine, sectionHeader, attrValuePair } ;
static RegularExpression const re =
      RegularExpression( "[[:space:]]*$", emptyLine )
    | RegularExpression( "\\[.*\\][[:space:]]*$", sectionHeader )
    | RegularExpression( ".*=.*", attrValuePair ) ;
std::string line ;
while ( std::getline( source, line ) ) {
    switch ( re.match( line.begin(), line.end() ).acceptCode ) {
    case emptyLine :
        break ;

    case sectionHeader :
        // ...
        break ;

    case attrValuePair :
        // ...
        break ;

    default :
        // process syntax error...
        break ;
    }
}

Of course, for the empty line, I'd probably use:
"[[:space:]]*(#.*)?$", to allow comments.

And a small warning: the version of RegularExpression doesn't
support the $ at the end to require a complete match, so you'd
have to add special code to handle this. I've recently reworked
the class considerably, however, for various reasons, and my
current version does have an option to require matching the
complete string, instead of just the start. It also supports
dumping the regular expression as a StaticRegularExpression, a
POD with static initialization that you then compile and link
into your program. (Not that the time to initialize the regular
expression would be an issue here, but I have some that are
complicated enough that parsing and initializing the expression
takes several minutes.)
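
The RegularExpression class in the loop above is the poster's own and its interface is not documented here. For readers without it, a rough equivalent can be sketched with std::regex (C++11) by trying each pattern in turn, which is a different technique from or'ing the expressions into one automaton. The names LineKind, classify and syntaxError are hypothetical, and the empty-line pattern is the comment-friendly variant mentioned above.

```cpp
#include <regex>
#include <string>

enum LineKind { emptyLine, sectionHeader, attrValuePair, syntaxError };

// Hypothetical helper: classify one configuration-file line.
LineKind classify(const std::string& line)
{
    // Each pattern is compiled once, on the first call.
    static const std::regex empty("[[:space:]]*(#.*)?");
    static const std::regex header("\\[.*\\][[:space:]]*");
    static const std::regex attr(".*=.*");

    // regex_match requires the whole line to match, so no $ is needed.
    // Order matters: the first pattern that matches wins.
    if (std::regex_match(line, empty))  return emptyLine;
    if (std::regex_match(line, header)) return sectionHeader;
    if (std::regex_match(line, attr))   return attrValuePair;
    return syntaxError;
}
```

The switch in the loop would then dispatch on classify(line) instead of on acceptCode.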
 
Gennaro Prota

James said:
I suspect that that's wishful thinking on your part.

I certainly couldn't wish that people made such requests. It was the
way I got it, given the OP's precedents; a suspicion, if you wish, like
your erroneous suspicion that I was wishing that.
 
Jorgen Grahn

puzzlecracker said:
What is the quickest way to check that the following:

const line[127]; only contains whitespace, in which case to ignore it.

something along these lines:

isspacedLine(line);

Reformulate your problem to use std::string, and then:

/**
 * True iff s is empty or only contains space and/or TABs.
 */
bool util::isblank(const std::string& s)
{
    return s.find_first_not_of(" \t")==std::string::npos;
}

/Jorgen
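
A sketch of how such a check might sit in a reading loop, assuming input comes from a std::istream. The names isBlank and countNonBlankLines are hypothetical, and the util namespace is dropped to keep the example self-contained.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Same test as above: empty, or only spaces and TABs.
bool isBlank(const std::string& s)
{
    return s.find_first_not_of(" \t") == std::string::npos;
}

// Hypothetical helper: count the lines that are not blank.
std::size_t countNonBlankLines(std::istream& in)
{
    std::size_t count = 0;
    std::string line;
    while (std::getline(in, line)) {
        if (isBlank(line))
            continue;   // ignore whitespace-only lines
        ++count;        // a real program would process the line here
    }
    return count;
}
```

Note that with the character set " \t", a line ending in '\r' (from a CRLF file) is not treated as blank; add "\r" to the set if that matters.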
 
