Reading columns in a text file

Discussion in 'C++' started by C++ Newbie, May 16, 2008.

  1. C++ Newbie

    C++ Newbie Guest

    Suppose I have a text file with the input:
    1 2 3 4 5 6 7 8 9 10 ! Comment: Integers 1 - 10

    How do I write a C++ program that reads in this line into a 10-element
    vector and ignores the comment?

    Thanks.
    C++ Newbie, May 16, 2008
    #1

  2. Christian Hackl

    Christian Hackl Guest

    C++ Newbie wrote:

    > Suppose I have a text file with the input:
    > 1 2 3 4 5 6 7 8 9 10 ! Comment: Integers 1 - 10
    >
    > How do I write a C++ program that reads in this line into a 10-element
    > vector and ignores the comment?


    First of all, you need to read the line from a file, preferably using
    std::ifstream and std::getline.

    Now that you've got a std::string that contains the line, use
    std::string's member functions to get the substring up to the start of
    the comment. Hint: find() and substr() will probably be useful.
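
    The find()/substr() hint can be sketched as follows. This is a minimal
    sketch, assuming that a '!' always starts a comment (the thread has not
    confirmed this); the name stripComment is hypothetical.

```cpp
#include <string>

// Hypothetical helper: returns the part of `line` before the first '!'.
// Assumption: a '!' anywhere on the line starts a comment.
std::string stripComment(const std::string& line)
{
    // find() returns std::string::npos when there is no '!';
    // substr(0, npos) then yields the whole line unchanged.
    return line.substr(0, line.find('!'));
}
```

    The result keeps any white space in front of the '!', which is harmless
    if the remaining text is parsed with operator>>.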

    In order to tokenize the remaining substring by spaces, use
    boost::tokenizer and just push the tokens into a std::vector.
    boost::lexical_cast can be used for string->int conversion.

    (If you've never used the Boost libraries before, now is the perfect
    moment to start.)

    Stay away from strtok(), atoi() and arrays.


    --
    Christian Hackl
    Christian Hackl, May 16, 2008
    #2

  3. Martin York

    Martin York Guest

    On May 16, 8:00 am, Christian Hackl wrote:

    > Now that you've got a std::string that contains the line, use
    > std::string's member functions to get the substring up to the start of
    > the comment. Hint: find() and substr() will probably be useful.


    That sounds like an awful lot of work when streams will do all that
    for you automatically.


    > In order to tokenize the remaining substring by spaces, use
    > boost::tokenizer and just push the tokens into a std::vector.
    > boost::lexical_cast can be used for string->int conversion.



    Wow. More work, when again the streams do it automatically.


    > (If you've never used the Boost libraries before, now is the perfect
    > moment to start.)



    Yep. Learn how to use boost.
    But first learn how to use the STL and the stream operators.


    --------------------
    // If your data file is just one line long
    std::ifstream file("file");

    // Read 1 integer.
    int x;
    file >> x;
    // Repeat 10 times (probably in a loop)

    -------------------------------------
    // If your data file is line based:
    // read 1 line into a string stream, then use the stream operators.

    std::ifstream file("file");


    // Repeat this for each line
    std::string line;
    std::getline(file, line);

    std::stringstream lineStream(line);

    // Repeat the code above to get the integers; just use lineStream
    // rather than file.
    int x;
    lineStream >> x;


    Now learn how to use the STL to do all the above nearly automatically.
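
    One way to let the standard library do the loop is an
    std::istream_iterator. The following is a minimal sketch, not
    necessarily what was meant above; the name readIntegersFromLine is
    hypothetical, and it assumes that the integers come first on the line.

```cpp
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads one line from `in` and returns the leading
// integers on it. Extraction stops at the first token that is not an
// integer, such as the '!' that starts the comment in the example input.
std::vector<int> readIntegersFromLine(std::istream& in)
{
    std::string line;
    std::getline(in, line);
    std::istringstream lineStream(line);
    // The iterator pair applies operator>> repeatedly until it fails.
    return std::vector<int>(std::istream_iterator<int>(lineStream),
                            std::istream_iterator<int>());
}
```

    Note that this cannot tell a comment from a malformed number: both
    simply end the sequence.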




    > Stay away from strtok(), atoi() and arrays.


    Yep.
    Martin York, May 16, 2008
    #3
  4. James Kanze

    James Kanze Guest

    On May 16, 16:54, Victor Bazarov wrote:
    > C++ Newbie wrote:
    > > Suppose I have a text file with the input:
    > > 1 2 3 4 5 6 7 8 9 10 ! Comment: Integers 1 - 10


    > > How do I write a C++ program that reads in this line into a
    > > 10-element vector and ignores the comment?


    > How do you know there are 10 elements?


    Or more to the point, does he know? And what determines what is
    a comment, and what isn't?

    > Basically, you read integers and stuff them into a vector
    > until you get an error or the end of the line. If you get an
    > error, ignore the rest of the line.


    Maybe. Until we know what the actual specification is, any
    suggestions are just guesswork. If the specification says that
    the '!' character starts a comment, then the simplest solution
    might be to use a filtering streambuf, so that characters from
    the '!' to the line end simply don't show up in the input.
    Although if the syntax is otherwise line oriented, this might be
    overkill, since you can use getline, as you propose later. If,
    on the other hand, the syntax is 10 elements, and anything else
    is a comment, you need some other approach.

    > I strongly suggest two step processing: first, read a line
    > from your file, then, second, process the line you just read
    > to extract the individual integers (until the end of the line
    > or an error which should mean the end of the vector).


    That's generally a good solution if the syntax is line oriented.
    If the syntax says that anything following a '!' is a comment,
    then it is trivial to trim anything after the first '!' from the
    input line. It can be made to work in more complicated cases as
    well.

    --
    James Kanze (GABI Software)
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
    James Kanze, May 16, 2008
    #4
  5. Jim Langston

    Jim Langston Guest

    Victor Bazarov wrote:
    > C++ Newbie wrote:
    >> Suppose I have a text file with the input:
    >> 1 2 3 4 5 6 7 8 9 10 ! Comment: Integers 1 - 10
    >>
    >> How do I write a C++ program that reads in this line into a
    >> 10-element vector and ignores the comment?

    >
    > How do you know there are 10 elements?
    >
    > Basically, you read integers and stuff them into a vector until you
    > get an error or the end of the line. If you get an error, ignore the
    > rest of the line. I strongly suggest two step processing: first,
    > read a line from your file, then, second, process the line you just
    > read to extract the individual integers (until the end of the line or
    > an error which should mean the end of the vector).
    >
    > To read a line use the 'std::getline' function. Then define an
    > istringstream from the line you just read into a string, and loop
    > while it's "good". Read an individual int, and if successful, stuff
    > it into your vector. Once your istringstream is no good, proceed to
    > reading the next line from the file. Do that until the file has no
    > more lines.


    One addition I would make is to peek at the next character to see whether
    it is a ! or not. If it isn't, and you get an error, I would produce a
    diagnostic stating that a number or a ! was expected but something else
    was received.
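
    The peek idea can be sketched like this. It is a minimal sketch,
    assuming that a line holds integers optionally followed by a '!'
    comment; the name parseLine and the choice of an exception for the
    diagnostic are hypothetical.

```cpp
#include <cctype>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: parses the integers on one line. Anything that
// is neither an integer nor a '!' comment is reported as an error.
std::vector<int> parseLine(const std::string& line)
{
    std::istringstream in(line);
    std::vector<int> values;
    for (;;) {
        // Skip white space by hand so that peek() sees the next token.
        while (std::isspace(in.peek())) {
            in.get();
        }
        const int next = in.peek();
        if (next == std::char_traits<char>::eof() || next == '!') {
            break;  // end of line or start of comment: done
        }
        int value;
        if (!(in >> value)) {
            throw std::runtime_error(
                "expected a number or '!' in line: " + line);
        }
        values.push_back(value);
    }
    return values;
}
```

    Peeking before each extraction, rather than after a failed one, avoids
    having to guess how much of a bad token operator>> has already consumed.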

    --
    Jim Langston
    Jim Langston, May 16, 2008
    #5
  6. C++ Newbie

    C++ Newbie Guest

    Martin York wrote:

    > On May 16, 8:00 am, Christian Hackl wrote:
    >
    > > Now that you've got a std::string that contains the line, use
    > > std::string's member functions to get the substring up to the start of
    > > the comment. Hint: find() and substr() will probably be useful.

    >
    > That sounds like an awful lot of work when streams will do all that
    > for you automatically.


    Hi everyone, thanks for the replies. How does this look?

    inputfile.txt
    3 ! Rows
    5 ! Column entries
    1 2 3 4 5
    6 7 8 9 0
    1 2 3 4 5

    fstream myfile;
    myfile.open("inputfile.txt");
    string inputline;
    string comment_starts("!"); // Comments flagged by "!"
    unsigned int offset;
    getline(myfile, inputline);
    offset = inputline.find(comment_starts); // Find location of comment
    inputline = inputline.substr(0,offset); // Trim string
    unsigned int rows = atoi(inputline.c_str());

    [Repeated for columns. Sorry about using atoi; I thought it would be
    OK given that there should be only 1 integer in the first two lines of
    the file.]

    // Read in the 2D data
    int i, j;
    int x[columns][rows];
    for (j = 0; j < rows; j++)
    {for (i = 0; i < columns; i++)
    {myfile >> x[j][i];} // Line #
    }

    How is it that the line # correctly reads in the columns and advances
    to the next row of the array when the inputfile.txt's line hits a
    carriage return? By analogy, if we were writing the contents of x[j][i]
    out to myfile, we would have to explicitly specify a carriage return,
    i.e.:
    // Write out the 2D data
    for (j = 0; j < rows; j++)
    {myfile << "\n";
    for (i = 0; i < columns; i++)
    {myfile << x[j][i];} // Line #
    }

    Why is it a bad idea to use arrays? I need to store the 2D data in a
    2D array for later manipulation.
    C++ Newbie, May 19, 2008
    #6
  7. James Kanze

    James Kanze Guest

    On May 19, 3:09 pm, "C++ Newbie" wrote:
    > Martin York wrote:
    > > On May 16, 8:00 am, Christian Hackl wrote:


    > > > Now that you've got a std::string that contains the line,
    > > > use std::string's member functions to get the substring up
    > > > to the start of the comment. Hint: find() and substr()
    > > > will probably be useful.


    > > That sounds like an awful lot of work when streams will do
    > > all that for you automatically.


    > Hi everyone, thanks for the replies. How does this look?


    > inputfile.txt
    > 3 ! Rows
    > 5 ! Column entries
    > 1 2 3 4 5
    > 6 7 8 9 0
    > 1 2 3 4 5


    Do the comments really mean what they seem to mean? That is, is
    the format of the file fixed so that the first line contains a
    single integer with the number of rows, the second a single
    integer with the number of columns, followed by that many lines,
    each with that many integers? And what determines what is a
    comment? Anything after a '!'?

    Until we know this, it's impossible to say whether your code is
    right or not. If I suppose the above, however (and that empty
    lines or lines containing only a comment are not allowed, which
    IMHO is not a good idea), then your code has a number of problems.

    > fstream myfile;
    > myfile.open("inputfile.txt");
    > string inputline;
    > string comment_starts("!"); // Comments flagged by "!"
    > unsigned int offset;
    > getline(myfile, inputline);
    > offset = inputline.find(comment_starts); // Find location of comment
    > inputline = inputline.substr(0,offset); // Trim string


    Since you have to do this for every line, it really needs to be
    in a separate function:

    std::istream&
    getInputLine( std::istream& source, std::string& dest )
    {
        std::string line ;
        std::getline( source, line ) ;
        if ( source ) {
            dest = std::string(
                line.begin(),
                std::find( line.begin(), line.end(), '!' ) ) ;
        }
        return source ;
    }

    (I'd actually probably have it returning a Fallible, but the
    above corresponds closest to the standard idiom.)

    > unsigned int rows = atoi(inputline.c_str());


    > [Repeated for columns. Sorry about using atoi; I thought it
    > would be OK given that there should be only 1 integer in the
    > first two lines of the file.]


    Except that it doesn't allow for any error handling. What
    happens if the line doesn't contain an integer?

    Again, I'd go with a separate function:

    std::istream&
    getIntegers(
        std::istream& source,
        std::vector< int >& dest,
        int count )
    {
        std::string line ;
        if ( getInputLine( source, line ) ) {
            // To support "blank" lines, insert a loop with the
            // getInputLine, reading until you get either a line
            // with at least one non-blank character or an error.
            // Alternatively, the loop could be in
            // getInputLine().
            std::istringstream s( line ) ;
            std::vector< int > tmp( count ) ;
            for ( int i = 0 ; s && i < count ; ++ i ) {
                s >> tmp[ i ] ;
            }
            // The line is valid only if all count integers were
            // read and nothing but white space follows them.
            char extra ;
            if ( s && ! ( s >> extra ) ) {
                dest = tmp ;
            } else {
                source.setstate( std::ios::failbit ) ;
            }
        }
        return source ;
    }

    If you don't mind partially mangling the vector if there is an
    error, you can skip the intermediate `tmp', resize dest, and
    read directly to it.

    > // Read in the 2D data
    > int i, j;
    > int x[columns][rows];


    This isn't legal C++, and shouldn't compile. For it to be
    legal, both columns and rows must be constants.

    > for (j = 0; j < rows; j++)
    > {for (i = 0; i < columns; i++)
    > {myfile >> x[j][i];} // Line #
    > }


    > How is it that the line # correctly reads in the columns and
    > advances to the next row of the array when the inputfile.txt's
    > line hits a carriage return?


    It doesn't. By default, end of line is just white space, like
    any other white space. Between each read, you skip blank space.
    Your code doesn't care if the structure of the file is correct
    or not.

    > By analogy, if we were writing the contents of x[j][i] out to
    > myfile, we would have to explicitly specify a carriage return,
    > i.e.:


    If that's what you wanted. On output, you have to manually
    insert white space; on input, it is skipped (but some separator
    had better be there, or you won't be able to read the file).

    > // Write out the 2D data
    > for (j = 0; j < rows; j++)
    > {myfile << "\n";
    > for (i = 0; i < columns; i++)
    > {myfile << x[j][i];} // Line #
    > }


    > Why is it a bad idea to use arrays?


    Because they're broken in the language. They're second class
    objects, which don't behave like other objects.

    In your case, also, because they must have compile-time constant
    dimensions.

    > I need to store the 2D data in a 2D array for later
    > manipulation.


    What's wrong with `std::vector< std::vector< int > >'? If
    nothing else, it will make input an order of magnitude simpler.
    Using the above functions:

    std::vector< int > line ;
    if ( ! getIntegers( source, line, 1 )
            || line[ 0 ] < 1 ) {
        // Fatal error...
    }
    std::size_t rows = line[ 0 ] ;
    if ( ! getIntegers( source, line, 1 )
            || line[ 0 ] < 1 ) {
        // Fatal error...
    }
    int columns = line[ 0 ] ;
    std::vector< std::vector< int > >
                        data ;
    while ( source && data.size() != rows ) {
        getIntegers( source, line, columns ) ;
        if ( source ) {
            data.push_back( line ) ;
        }
    }
    if ( data.size() != rows ) {
        // Error, not enough data...
    }
    // Any remaining non-blank character is unexpected.
    char extra ;
    if ( source >> extra ) {
        // Error, unexpected garbage at end of file
    }

    Of course, this all supposes that my assumptions concerning your
    file format are correct. Before writing a single line of code,
    you should specify the file format exactly, and program to that.

    --
    James Kanze (GABI Software)
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
    James Kanze, May 20, 2008
    #7