check if line is whitespace

Discussion in 'C++' started by puzzlecracker, Sep 3, 2008.

  1. What is the quickest way to check that the following:

    const line[127]; only contains whitespace, in which case to ignore it.

    something along these lines:

    isspacedLine(line);

    Thanks
     
    puzzlecracker, Sep 3, 2008
    #1

  2. puzzlecracker

    Zeppe Guest

    puzzlecracker wrote:
    > What is the quickest way to check that the following:
    >
    > const line[127]; only contains whitespace, in which case to ignore it.
    >
    > something along these lines:
    >
    > isspacedLine(line);
    >


    const line[127];

    doesn't mean anything in C++. Apart from that, if line is an array of
    char, I'm pretty sure that somebody with "puzzlecracker" as a
    nickname will be more than able to solve it ;)

    Best wishes,

    Zeppe
     
    Zeppe, Sep 3, 2008
    #2

  3. puzzlecracker

    Darío Guest

    On Sep 3, 2:21 pm, puzzlecracker <> wrote:
    > What is the quickest way to check that the following:
    >
    > const line[127]; only contains whitespace, in which case to ignore it.
    >
    > something along these lines:
    >
    > isspacedLine(line);
    >
    > Thanks


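    bool isLineSpaced(const char line[127])
    {
        // Advance past leading spaces without running off the buffer.
        int i = 0;
        while (i < 127 && line[i] == ' ')
            ++i;
        // Blank if the scan reached the terminating '\0' or the end of the buffer.
        return i == 127 || line[i] == '\0';
    }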
     
    Darío, Sep 3, 2008
    #3
  4. Guys, yeah, I wrote something similar to your suggestions:

    size_t len = strlen(line);
    if (len > 0 && line[len - 1] == '\n')
        line[--len] = '\0';

    // ignore whitespace-only lines
    size_t i;
    for (i = 0; line[i] != '\0' && isspace(static_cast<unsigned char>(line[i])); i++)
        ;
    if (i == len)
        continue;
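
For reference, the loop above can be wrapped into a self-contained helper. This is only a sketch: the name isBlankLine is made up for illustration, and it assumes line is a NUL-terminated char buffer. Because '\n' is itself whitespace, the helper does not need the newline stripped first.

```cpp
#include <cctype>

// Hypothetical helper (name is an assumption): true if the NUL-terminated
// string holds only whitespace, or nothing at all.
bool isBlankLine(const char* line)
{
    for (; *line != '\0'; ++line) {
        // Cast to unsigned char: passing a negative char to isspace is undefined.
        if (!std::isspace(static_cast<unsigned char>(*line)))
            return false;
    }
    return true;
}
```

Inside a read loop it would be used as `if (isBlankLine(line)) continue;`.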
     
    puzzlecracker, Sep 3, 2008
    #4
  5. puzzlecracker

    James Kanze Guest

    On Sep 3, 7:21 pm, puzzlecracker <> wrote:
    > What is the quickest way to check that the following:


    > const line[127]; only contains whitespace, in which case to ignore it.


    You mean std::string line, don't you? The above isn't a legal
    C++ declaration.

    > something along these lines:


    > isspacedLine(line);


    Well, the standard library already has direct support for this,
    but its interface isn't the most friendly. But something like
    the following should do the trick:

    #include <locale>
    #include <string>

    bool
    isOnlySpaces(
        std::string const& line,
        std::locale const& locale = std::locale() )
    {
        return std::use_facet< std::ctype< char > >( locale )
                   .scan_not( std::ctype_base::space,
                              line.data(), line.data() + line.size() )
               == line.data() + line.size() ;
    }

    (If you're forced to use arrays of char, instead of string, this
    solution still works perfectly well.)
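
A minimal sketch of the char-array case, assuming the buffer is NUL-terminated so that std::strlen can find its end. The overload taking const char* is an illustration, not part of the original post.

```cpp
#include <cstring>
#include <locale>

// Sketch: the same scan_not technique applied to a NUL-terminated char array.
// This const char* overload is an assumption added for illustration.
bool isOnlySpaces(const char* line, const std::locale& loc = std::locale())
{
    const char* end = line + std::strlen(line);
    // scan_not returns the first character that is NOT classified as space,
    // or `end` if every character in [line, end) is a space.
    return std::use_facet<std::ctype<char> >(loc)
               .scan_not(std::ctype_base::space, line, end) == end;
}
```

An empty string counts as "only spaces" here, since the scan ends immediately at `end`.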

    More generally, however, I tend to use regular expressions in
    such cases. If the line matches "^[[:space:]]*$", ignore it.
    With a good implementation of regular expressions (which uses a
    DFA if the expression contains no extensions), this can be just
    as fast as the above, if not faster. (Just make sure you only
    construct the regular expression once, and not every time you
    call the function.)
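
The "construct it once" advice could look roughly like this. The sketch uses std::regex from C++11, which is an assumption: it postdates this thread and is not the regular expression class the post refers to, so nothing here says anything about its speed. The function name is made up.

```cpp
#include <regex>
#include <string>

// Sketch with std::regex (C++11); an assumption, not the class used in the post.
bool isBlankByRegex(const std::string& line)
{
    // Function-local static: the expression is compiled once, on the first
    // call, and reused on every later call.
    static const std::regex blank("^[[:space:]]*$");
    return std::regex_match(line, blank);
}
```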

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Sep 4, 2008
    #5
  6. puzzlecracker

    James Kanze Guest

    On Sep 4, 1:03 am, Sam <> wrote:
    > Darío writes:
    > > On Sep 3, 2:21 pm, puzzlecracker <> wrote:
    > >> What is the quickest way to check that the following:


    > >> const line[127]; only contains whitespace, in which case to ignore it.


    > >> something along these lines:


    > >> isspacedLine(line);


    > > bool isLineSpaced(const char line[127])
    > > {
    > > int i = 0;
    > > for(; i<127 && line[i++] == ' '; );
    > > return i==127;
    > > }


    > That's C, not C++.


    Well, it's also C++, albeit not idiomatic or good C++.

    > The C++ solution would be:


    > #include <algorithm>
    > #include <cctype>


    The C++ solution would use <locale>, and not <cctype>:). (With
    subsequent changes in the code, of course.)

    > #include <functional>
    > #include <vector>


    > bool isLineSpaced(const std::vector<char> &line)
    > {
    > return std::find_if(line.begin(), line.end(),
    > std::not1(std::ptr_fun(isspace))) == line.end();
    > }


    Which is fine, except that it has undefined behavior. What you
    probably meant was something like:

    #include <algorithm>
    #include <cctype>
    #include <string>

    struct NotIsSpace
    {
        bool operator()( char ch ) const
        {
            return ! std::isspace(
                static_cast< unsigned char >( ch ) ) ;
        }
    } ;

    bool
    isEmptyLine(
        std::string const& line )
    {
        return std::find_if( line.begin(), line.end(), NotIsSpace() )
            == line.end() ;
    }

    (You cannot call the version of isspace in <cctype> with a char
    without risking undefined behavior.)
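
The reason is that the <cctype> functions take an int whose value must be representable as unsigned char (or be EOF). Where plain char is signed, a character above 127 arrives as a negative value. A minimal sketch of the same fix with a C++11 lambda and std::all_of follows; both are assumptions about the available compiler, and the function name is made up.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Sketch (C++11): true if every character is whitespace; true for "".
bool isAllSpace(const std::string& line)
{
    return std::all_of(line.begin(), line.end(), [](char ch) {
        // Convert to unsigned char first so the argument is never negative.
        return std::isspace(static_cast<unsigned char>(ch)) != 0;
    });
}
```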

    Still, a quick benchmark shows that something like:

    myCtype.scan_not( std::ctype_base::space,
    myData.data(),
    myData.data() + myData.size() )
    == myData.data() + myData.size() ;

    , with myCtype initialized with "std::use_facet< std::ctype<
    char > >( std::locale() )", is roughly five times faster (at least
    on one system: g++ 4.1 under Linux on an Intel). And it's
    certainly more idiotic^H^H^Hmatic with regards to C++.

    (FWIW, using a full regular expression was only about three
    times slower than your solution. And is a lot more powerful.)

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Sep 4, 2008
    #6
  7. On 3 Sep, 18:21, puzzlecracker <> wrote:

    > What is the quickest way to check that the following:
    >
    > const line[127]; only contains whitespace, in which case to ignore it.
    >
    > something along these lines:
    >
    > isspacedLine(line);


    is a C solution any good?

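    #include <cstring>

    bool isspacedLine (const char* line)
    {
        // strspn gives the length of the leading run of whitespace;
        // the line is blank if that run ends at the terminating '\0'.
        std::size_t i = std::strspn (line, " \t\f\n");
        return line[i] == '\0';
    }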
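
A variant of the strspn approach, sketched under the assumption that '\r' and '\v' should count as whitespace too (for example when lines come from a file with CRLF endings). The function name is hypothetical.

```cpp
#include <cstring>

// Sketch: strspn with all six characters that isspace accepts in the "C" locale.
bool isSpacedLineC(const char* line)
{
    // strspn returns the length of the leading run made up only of the
    // listed characters; the line is blank if that run reaches the '\0'.
    return line[std::strspn(line, " \t\n\v\f\r")] == '\0';
}
```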

    --
    Nick Keighley
     
    Nick Keighley, Sep 4, 2008
    #7
  8. James Kanze wrote:
    [...]
    > More generally, however, I tend to use regular expressions in
    > such cases. If the line matches "^[[:space:]]*$", ignore it.
    > With a good implementation of regular expressions (which uses a
    > DFA if the expression contains no extensions), this can be just
    > as fast as the above, if not faster.


    I see that you mention execution speed here and in other posts of this
    thread. Since you aren't in the Premature-Optimization "school of
    thought", I re-read the original post, and it says "quickest way". I
    think that wasn't meant as "the way which executes fastest", though; I
    get it as: "how do I avoid spending time implementing this?". And, of
    course, the best solution is letting others, like you, implement it.

    --
    Gennaro Prota | name.surname yahoo.com
    Breeze C++ (preview): <https://sourceforge.net/projects/breeze/>
    Do you need expertise in C++? I'm available.
     
    Gennaro Prota, Sep 4, 2008
    #8
  9. puzzlecracker

    James Kanze Guest

    On Sep 4, 5:41 pm, Gennaro Prota <gennaro/> wrote:
    > James Kanze wrote:


    > [...]
    > > More generally, however, I tend to use regular expressions in
    > > such cases. If the line matches "^[[:space:]]*$", ignore it.
    > > With a good implementation of regular expressions (which uses a
    > > DFA if the expression contains no extensions), this can be just
    > > as fast as the above, if not faster.


    > I see that you mention execution speed here and in other posts
    > of this thread. Since you aren't in the Premature-Optimization
    > "school of thought", I re-read the original post, and it says
    > "quickest way". I think that wasn't meant as "the way which
    > executes fastest", though; I get it as: "how do I avoid
    > spending time implementing this?".


    I suspect that that's wishful thinking on your part. That's
    what it should mean, but most of the time, most programmers do
    still use "quickest" to refer to execution time. Since the
    issue of execution time was raised, I felt it necessary to
    address it. The regular expression solution is by far the
    simplest, and its execution time is NOT necessarily too bad.

    Of course, the regular expression class I use here is my own,
    not that of Boost. The two are significantly different, being
    designed from the start with different goals in mind. For most
    general use, Boost's regular expression is better than mine, but
    in this particular case: my regular expression class supports
    the or'ing of multiple regular expressions, with different
    return values. So you can write something like:

    enum { emptyLine, sectionHeader, attrValuePair } ;
    static RegularExpression const re =
          RegularExpression( "[[:space:]]*$", emptyLine )
        | RegularExpression( "\\[.*\\][[:space:]]*$", sectionHeader )
        | RegularExpression( ".*=.*", attrValuePair ) ;
    std::string line ;
    while ( std::getline( source, line ) ) {
        switch ( re.match( line.begin(), line.end() ).acceptCode ) {
        case emptyLine :
            break ;

        case sectionHeader :
            // ...
            break ;

        case attrValuePair :
            // ...
            break ;

        default :
            // process syntax error...
            break ;
        }
    }

    Of course, for the empty line, I'd probably use:
    "[[:space:]]*(#.*)?$", to allow comments.

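The blank-or-comment pattern can be tried out with std::regex from C++11. That is an assumption: it is not the RegularExpression class described in this post, and the function name is made up. The sketch expects lines as std::getline delivers them, without a trailing newline.

```cpp
#include <regex>
#include <string>

// Sketch with std::regex (C++11); an assumption, not the RegularExpression
// class from the post. True for blank lines and for lines holding only a
// '#' comment, optionally preceded by whitespace.
bool isBlankOrComment(const std::string& line)
{
    // Compiled once, on the first call.
    static const std::regex re("[[:space:]]*(#.*)?$");
    // regex_match succeeds only if the whole line matches.
    return std::regex_match(line, re);
}
```
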
    And a small warning: the version of RegularExpression doesn't
    support the $ at the end to require a complete match, so you'd
    have to add special code to handle this. I've recently reworked
    the class considerably, however, for various reasons, and my
    current version does have an option to require matching the
    complete string, instead of just the start. It also supports
    dumping the regular expression as a StaticRegularExpression, a
    POD with static initialization that you then compile and link
    into your program. (Not that the time to initialize the regular
    expression would be an issue here, but I have some that are
    complicated enough that parsing and initializing the expression
    takes several minutes.)

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Sep 5, 2008
    #9
  10. James Kanze wrote:
    >> I re-read the original post, and it says
    >> "quickest way". I think that wasn't meant as "the way which
    >> executes fastest", though; I get it as: "how do I avoid
    >> spending time implementing this?".

    >
    > I suspect that that's wishful thinking on your part.


    I certainly couldn't wish that people made such requests. It was the
    way I got it, given the OP's precedents; a suspicion, if you wish, like
    your erroneous suspicion that I was wishing that.

    --
    Gennaro Prota | name.surname yahoo.com
    Breeze C++ (preview): <https://sourceforge.net/projects/breeze/>
    Do you need expertise in C++? I'm available.
     
    Gennaro Prota, Sep 5, 2008
    #10
  11. puzzlecracker

    Jorgen Grahn Guest

    On Wed, 3 Sep 2008 10:21:58 -0700 (PDT), puzzlecracker <> wrote:
    > What is the quickest way to check that the following:
    >
    > const line[127]; only contains whitespace, in which case to ignore it.
    >
    > something along these lines:
    >
    > isspacedLine(line);


    Reformulate your problem to use std::string, and then:

    /**
    * True iff s is empty or only contains space and/or TABs.
    */
    bool util::isblank(const std::string& s)
    {
    return s.find_first_not_of(" \t")==std::string::npos;
    }
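
Note that with only " \t" in the set, a line that still carries a '\r' (say, from a file with CRLF endings) is reported as non-blank. A sketch of a wider variant follows; the free-function name and the extra characters are assumptions, not part of the original post.

```cpp
#include <string>

// Hypothetical variant: the name and the wider character set are assumptions.
bool isBlankWide(const std::string& s)
{
    // find_first_not_of returns npos when every character is in the set,
    // which also covers the empty string.
    return s.find_first_not_of(" \t\n\v\f\r") == std::string::npos;
}
```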

    /Jorgen

    --
    // Jorgen Grahn <grahn@ Ph'nglui mglw'nafh Cthulhu
    \X/ snipabacken.se> R'lyeh wgah'nagl fhtagn!
     
    Jorgen Grahn, Sep 8, 2008
    #11