How to GET multi-word input from a *file* stream as opposed to a *console* stream?

sherifffruitfly

Hi,

I've got a little (exercise) program that reads data from a file and
puts it into struct members. I run into trouble when one of the data
pieces consists of several words (e.g. "john doe", with a space in
it).

For console input, cin.getline(var, howMuchIWant) or cin.get() has done
the trick for me in the past. It doesn't seem to work nearly so well
with a file stream. I wouldn't have thought C++ treated file and
console streams as significantly different, so I assume I'm doing
something wrong. What am I doing wrong?

thanks for letting me in on the joke! :)

cdj

======example data txt file========
4
john
2000
seattle
peter
4000
san francisco
paul
100
greenlake
mary
10000
seattle
======works fine without the "san" part though======

====== code ======
/*
declare struct type
open file
get # of structs required from file
dynamically make the array of structs
display them
*/

//This routine works fine when the name and location are one word long only.
//Need to figure out how to utilize inputFile.get or inputFile.getline to
//read multiword names, locations, and the like.

#include <iostream>
#include <fstream>
#include <cstdlib>

using namespace std;

const int STRSIZE = 60;
const int FILESIZEMAX = 1000;

int testfunction(int arg);

struct donors {
char name[STRSIZE];
double amount;
char location[STRSIZE];
};

int main()
{
char filename[STRSIZE];
ifstream inputFile;

cout << "File name: ";
cin.getline(filename,STRSIZE);

inputFile.open(filename);

if (!inputFile.is_open())
{
cout << "Couldn't open " << filename << endl;
cout << "Terminating execution\n";
exit(EXIT_FAILURE);
}

cout << filename << " successfully opened." << endl;

int numDonors;
inputFile >> numDonors;

cout << "Number of donors in " << filename << ": " << numDonors <<
endl << endl;

if (numDonors==0)
{
cout << "Exiting from \'no donors\' door.\n\n";
exit(EXIT_FAILURE);//Apparently no donor data to read
}

donors * myDonors = new donors[numDonors];
//Now I've allocated structured space for my data
cout << "Donors struct array created with " << numDonors
<< " elements." << endl << endl;

//cout << "Name: ";
//cin.getline(myDonors[0].name,STRSIZE);
//cout << "You entered: " << myDonors[0].name << endl; - works fine for console

for (int i=0; i<numDonors; i++)
{
cout << "Reading record " << i+1 << "... ";

//inputFile.getline(myDonors[i].name,STRSIZE);
//This works for getting multiword input from cin,
//why not for my inputFile object?

inputFile >> myDonors[i].name;
inputFile >> myDonors[i].amount;
inputFile >> myDonors[i].location;
//These work fine for single word items
//Doesn't work for multiword items

cout << "done!" << endl;
}


cout << endl;

for (int i=0; i<numDonors; i++)
{
cout << "Record #" << i+1 << ":" << endl;
cout << "Name: " << myDonors[i].name << endl;
cout << "Amount: " << myDonors[i].amount << endl;
cout << "Location: " << myDonors[i].location << endl << endl;
}

//Close file when done with it.
inputFile.close();

//Free allocated space when done with it.
delete [] myDonors;


return 0;
}
 
red floyd

//inputFile.getline(myDonors[i].name,STRSIZE);
//This works for getting multiword input from cin,
//why not for my inputFile object?


What, specifically, are you seeing that doesn't work?

You might also want to consider using std::string instead of fixed
arrays, and then using the nonmember function std::getline() to read
strings.

You also don't deal with the possibility of corrupted data in the first
line of your file (# of donors) -- what happens if you have a negative
value there?
 
sherifffruitfly

Sorry - here are two outputs, depending on whether the data piece is
"francisco" or "san francisco". To my limited knowledge, the space in
"san francisco" results in the "inputFile >>" assignment being "thrown
off" by one, yielding screwy results for the stream inputs thereafter.

And yah - there's not much by the way of validation/error handling. It
seems like "covering all the bases" in that respect would take a fair
bit of coding - I just want to get the basic piece functioning
correctly first.

====== good output w/"good" data ======
====== ie, no spaces in the data ======
File name: c:\cdj.txt
c:\cdj.txt successfully opened.
Number of donors in c:\cdj.txt: 4

Donors struct array created with 4 elements.

Reading record 1... done!
Reading record 2... done!
Reading record 3... done!
Reading record 4... done!

Record #1:
Name: john
Amount: 2000
Location: seattle

Record #2:
Name: peter
Amount: 4000
Location: francisco

Record #3:
Name: paul
Amount: 100
Location: greenlake

Record #4:
Name: mary
Amount: 10000
Location: seattle
========= end good output =======

======= bad output =======
======= ie, output when the data has spaces ======
File name: c:\cdj.txt
c:\cdj.txt successfully opened.
Number of donors in c:\cdj.txt: 4

Donors struct array created with 4 elements.

Reading record 1... done!
Reading record 2... done!
Reading record 3... done!
Reading record 4... done!

Record #1:
Name: john
Amount: 2000
Location: seattle

Record #2:
Name: peter
Amount: 4000
Location: san

Record #3:
Name: francisco
Amount: -6.27744e+066
Location:

Record #4:
Name:
Amount: -6.27744e+066
Location:

Press any key to continue . . .
====== end bad output =========
 
Alex Buell

Look up token(), it'll help with reading words.
 
red floyd

At the risk of asking the obvious, are you sure you're using getline?
The code you posted uses operator>>, which stops reading at the first
whitespace character.
 
Jim Blumberg

int numDonors;
inputFile >> numDonors;

This leaves the newline character in the stream. Use inputFile.get()
after this call to extract and throw away the newline character.

//inputFile.getline(myDonors[i].name,STRSIZE);
//This works for getting multiword input from cin,
//why not for my inputFile object?

Once inputFile.get() follows the extraction operator above, this
getline should work.
inputFile >> myDonors[i].name;
inputFile >> myDonors[i].amount;

This will also leave the newline character in the stream. Use
inputFile.get() after this call to extract and throw away the newline
character, and then use the getline() function to get the location.
inputFile >> myDonors[i].location;

The extraction operator ( >> ) does not remove the newline character
from the stream, so the next call to getline() reads only an empty
line. To remove the newline character, call inputFile.get() after
using the extraction operator (inputFile >> ).
 
Alex Buell

Where? I can't find it in ISO14882.

Ok well tokenising is a trick I use to parse sentences.

std::string input("hello world!");
std::stringstream ss(input);
std::vector<std::string> tokens;
std::string buffer;

// each extraction yields one whitespace-delimited word
while (ss >> buffer)
    tokens.push_back(buffer);

Now you get a nice little vector<std::string> full of words. In this
case, tokens.at(0) has "hello", and tokens.at(1) has "world!".

Capisce?
 
Richard Herring

Capisco, though I see nothing called token() above ;-).

But if the input is a big file, you may get a nice _big_ vector filling
most of your memory with strings.

That's an unnecessary overhead if the intention is to scan the strings
sequentially and then discard them. This is what istream_iterators are
for.

std::string input("hello world");
std::stringstream ss(input);
for (std::istream_iterator<std::string, char> p(ss), e; p!=e; ++p)
{
    // do stuff with *p
}

The above also uses a rather simplistic definition of "word" - if your
text has punctuation, you probably need something more complex. Time to
start looking at boost::tokenizer, or even the Spirit parser.
 
Alex Buell

Oh yes, you're right, my code is only good for small strings. Thanks for
the above, I might find it useful one day!
 
