memory block -> C++ strings

Discussion in 'C++' started by barcaroller, Mar 29, 2008.

  1. barcaroller

    barcaroller Guest

    I have a large block of memory. I need to (1) check if it contains only
    ASCII characters (including newlines and/or carriage-returns) and, if so,
    (2) extract the lines into individual C++ strings.

    Currently I loop the entire block (byte for byte), run isascii(byte) on each
    byte, and then call getline() (either string.getline or iostream.getline
    does the job). This is proving too slow. I'm sure this problem has been
    solved using more efficient methods. Any suggestions?
     
    barcaroller, Mar 29, 2008
    #1

  2. On Sat, 29 Mar 2008 15:00:11 -0400, barcaroller wrote:

    > I have a large block of memory. I need to (1) check if it contains only
    > ASCII characters (including newlines and/or carriage-returns) and, if
    > so, (2) extract the lines into individual C++ strings.
    >
    > Currently I loop the entire block (byte for byte), run isascii(byte) on
    > each byte, and then call getline() (either string.getline or
    > iostream.getline does the job). This is proving too slow. I'm sure
    > this problem has been solved using more efficient methods. Any
    > suggestions?


    As soon as a byte has been verified to be ASCII, copy it directly
    before testing the next one, instead of making a second pass. Or, if
    you're comfortable working with iterators, return an iterator pair for
    each line and postpone copying the data; do you even need std::string?
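
    A minimal sketch of the iterator-pair idea, assuming the block is a
    contiguous char buffer. The names LineRange, split_ascii_lines and
    copy_line are hypothetical, not from the thread. It validates and splits
    in a single pass and copies nothing until a std::string is really needed:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// A line is a half-open range [first, second) pointing into the original block.
typedef std::pair<const char*, const char*> LineRange;

// Validates and splits the block in one pass without copying any bytes.
// Returns false (and leaves `out` empty) if a byte above 0x7F is found.
bool split_ascii_lines(const char* data, std::size_t size,
                       std::vector<LineRange>& out) {
    assert(data != nullptr || size == 0);
    out.clear();
    const char* const end = data + size;
    const char* line_beg = data;
    for (const char* p = data; p != end; ++p) {
        if (static_cast<unsigned char>(*p) > 0x7F) {
            out.clear();  // non-ASCII byte: report no lines at all
            return false;
        }
        if (*p == '\n') {
            out.push_back(LineRange(line_beg, p));  // newline not included
            line_beg = p + 1;
        }
    }
    if (line_beg != end) {  // trailing line without a final newline
        out.push_back(LineRange(line_beg, end));
    }
    return true;
}

// Copies one line into a std::string only when the caller asks for it.
inline std::string copy_line(const LineRange& r) {
    return std::string(r.first, r.second);
}
```

    The ranges stay valid only as long as the original block does, and a
    carriage return before the newline is left in the line, so strip it if
    the data uses CR-LF line endings.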

    --
    OU
     
    Obnoxious User, Mar 29, 2008
    #2

  3. Kai-Uwe Bux

    Kai-Uwe Bux Guest

    barcaroller wrote:

    >
    > I have a large block of memory. I need to (1) check if it contains only
    > ASCII characters (including newlines and/or carriage-returns) and, if so,
    > (2) extract the lines into individual C++ strings.
    >
    > Currently I loop the entire block (byte for byte), run isascii(byte) on
    > each byte, and then call getline() (either string.getline or
    > iostream.getline does the job). This is proving too slow. I'm sure
    > this problem has been solved using more efficient methods. Any
    > suggestions?


    You could use the first pass to store information about where the lines
    start and end. Something like:

    #include <ctype.h>   // isascii() is POSIX, not ISO C++
    #include <deque>
    #include <string>
    #include <utility>

    /*
      Appends the lines in [from,to) to the_text, provided all
      characters in the range are ASCII. If not, no lines will
      be appended.
    */
    template < typename ConstCharIter, typename StringSequence >
    bool append_lines ( ConstCharIter from,
                        ConstCharIter to,
                        StringSequence & the_text ) {
      typedef std::pair< ConstCharIter, ConstCharIter > line;
      std::deque< line > the_lines;
      ConstCharIter line_beg = from;
      ConstCharIter line_end = line_beg;
      // First pass: check every character and record where lines
      // begin and end; nothing is copied yet.
      while ( true ) {
        if ( line_end == to ) {
          // the last line (empty if the range ends with '\n')
          the_lines.push_back( line( line_beg, line_end ) );
          break;
        }
        if ( *line_end == '\n' ) {
          the_lines.push_back( line( line_beg, line_end ) );
          ++line_end;
          line_beg = line_end;
          continue;
        }
        if ( ! isascii( *line_end ) ) {
          return ( false );
        }
        ++ line_end;
      }
      // Second pass: the range is known to be ASCII, so copy the lines.
      for ( typename std::deque< line >::const_iterator line_iter =
              the_lines.begin();
            line_iter != the_lines.end(); ++ line_iter ) {
        // prematurely optimizing away a copy-constructor that might
        // be elided by the implementation anyway:
        // the_text.push_back
        // ( std::string( line_iter->first, line_iter->second ) );
        the_text.push_back( std::string() );
        // swap() must be called on the temporary: a temporary cannot
        // bind to swap's non-const reference parameter.
        std::string( line_iter->first, line_iter->second ).swap
          ( the_text.back() );
      }
      return ( true );
    }

    Note: code not touched by a compiler.


    Also: if non-ASCII characters are expected to occur only with negligible
    probability, you might be able to save time by inserting the lines right
    away and rolling back the transaction if you encounter a non-ASCII
    character.
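
    The insert-then-roll-back variant could be sketched as follows, assuming
    the target is a std::vector<std::string> and the block is a contiguous
    char buffer. The name append_lines_optimistic is hypothetical. Unlike
    the two-pass version, this sketch does not append an empty last line
    when the block ends with a newline:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Appends lines to the_text as they are found. If a non-ASCII byte turns
// up, every line appended by this call is removed again and false is returned.
bool append_lines_optimistic(const char* data, std::size_t size,
                             std::vector<std::string>& the_text) {
    assert(data != nullptr || size == 0);
    const std::size_t original_size = the_text.size();  // rollback point
    const char* const end = data + size;
    const char* line_beg = data;
    for (const char* p = data; p != end; ++p) {
        if (static_cast<unsigned char>(*p) > 0x7F) {
            the_text.resize(original_size);  // roll back the "transaction"
            return false;
        }
        if (*p == '\n') {
            the_text.push_back(std::string(line_beg, p));
            line_beg = p + 1;
        }
    }
    if (line_beg != end) {  // trailing line without a final newline
        the_text.push_back(std::string(line_beg, end));
    }
    return true;
}
```

    Whether this beats the two-pass version depends on how often the
    rollback actually happens and how expensive the discarded copies are,
    so it is worth measuring on real data.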


    Best

    Kai-Uwe Bux
     
    Kai-Uwe Bux, Mar 29, 2008
    #3