Suggestions for reading txt files?

none

I have a .txt file containing strings ordered as a row*column matrix, e.g.:

a basd asdasd asdasd ada asdasd asdda
b basd asdasd asdasd ada asdasd asdda
c basd asdasd asdasd ada asdasd asdda
c basd asdasd asdasd ada asdasd asdda

The number of columns for each row is constant. I am trying to write a function that, given a txt
file like the one above, returns the i,j element, like:


std::string file = "c:\\test.txt";  // the backslash must be escaped, otherwise "\t" is a tab

int rows = numberOfRows(file);
int columns = numberOfColumns(file);

for (int i=0; i<rows; i++) {
    for (int j=0; j<columns; j++) {
        std::string elem = readCell(i,j,file);
    }
}

Any hints on a smart version of readCell are welcome, since my current version is pretty ugly... maybe
there is some built-in functionality for this kind of txt file parsing that I have missed?
 
Ian Collins

If your files are as you describe, you can simply use >> to read each
item into a string. You could store the values in a vector of vector of
string.

Something like:

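#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

int main()
{
    const std::size_t rows = 2;
    const std::size_t columns = 4;

    std::vector< std::vector<std::string> >
        data(rows, std::vector<std::string>(columns));

    // operator>> skips white space, including newlines, so each
    // extraction yields the next cell.
    for (std::size_t i = 0; i < rows; i++) {
        for (std::size_t j = 0; j < columns; j++) {
            std::cin >> data[i][j];
        }
    }
}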
 
James Kanze

You'll have to define the problem a little bit better before we
can answer the question: are the number of rows and columns
known beforehand, or does the program have to deduce them from
the file? What determines a column: white space, fixed length,
or something else? (I assume that '\n' determines a row. Otherwise,
calling them .txt files is perhaps not a good idea.) Are you looking
for true random access: reading a particular element without having
to read all of the preceding ones?
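
If true random access is what is wanted, one possible approach (an assumption, not something proposed in this thread) is to scan the file once, remember where each line starts, and then seek straight to the wanted row. The sketch below assumes a seekable stream and white-space-separated columns; indexLines and this readCell signature are hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// One pass over the input: record the stream position where each line starts.
std::vector<std::istream::pos_type> indexLines(std::istream& in)
{
    std::vector<std::istream::pos_type> offsets;
    std::string line;
    for (;;) {
        std::istream::pos_type pos = in.tellg();
        if (!std::getline(in, line)) break;
        offsets.push_back(pos);
    }
    in.clear();  // drop eof/fail so the stream can be repositioned later
    return offsets;
}

// Seek to the start of the row, read that one line, and pick out the column.
std::string readCell(std::istream& in,
                     const std::vector<std::istream::pos_type>& offsets,
                     std::size_t row, std::size_t col)
{
    in.clear();
    in.seekg(offsets.at(row));  // at() throws std::out_of_range for a bad row
    std::string line;
    std::getline(in, line);
    std::istringstream tmp(line);
    std::string field;
    for (std::size_t j = 0; j <= col; ++j) {
        if (!(tmp >> field)) throw std::out_of_range("no such column");
    }
    return field;
}
```

Only the line offsets are kept in memory, so this also suits files that are too large to load whole. It does not work on pipes or sockets, which cannot seek.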

If the file will fit into memory, and the columns are white
space separated, perhaps something like the following would do
the trick:

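#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Fallible<T> and SomeError are not standard classes and must be supplied
// separately. As used here, Fallible<T> holds a value that may be absent
// (isValid(), value(), validate()).
class FormattedFile
{
    std::vector<std::vector<std::string> > myData;
    static Fallible<std::vector<std::string> > readLine(
        std::istream& source );
public:
    // Takes a non-const reference: reading modifies the stream.
    explicit FormattedFile( std::istream& source );
    int rows() const { return myData.size(); }
    int columns() const { return myData[0].size(); }
    std::string at( int row, int column ) const
    {
        assert( row >= 0 && row < rows()
                && column >= 0 && column < columns() );
        return myData[row][column];
    }
};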

FormattedFile::FormattedFile(
    std::istream& source )
{
    Fallible<std::vector<std::string> >
        line( readLine( source ) );
    if ( ! line.isValid() ) {
        throw SomeError( "File is empty" );
    }
    // Every later row must have as many columns as the first one.
    std::size_t const columns = line.value().size();
    myData.push_back( line.value() );
    line = readLine( source );
    while ( line.isValid() ) {
        if ( line.value().size() != columns ) {
            throw SomeError( "Format error" );
        }
        myData.push_back( line.value() );
        line = readLine( source );
    }
}

Fallible<std::vector<std::string> >
FormattedFile::readLine( std::istream& source )
{
    Fallible<std::vector<std::string> > result;
    std::string line;
    if ( std::getline( source, line ) ) {
        std::istringstream tmp( line );
        std::vector<std::string> entry;
        std::string element;
        while ( tmp >> element ) {
            entry.push_back( element );
        }
        result.validate( entry );
    }
    return result;
}

This is only really valid for smaller files, however; once the
data starts occupying a significant part of the memory, it
ceases to be appropriate.
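
Fallible and SomeError in the code above are not standard classes. For readers who want something that compiles as is, here is a minimal sketch of the same idea with two substitutions that are assumptions, not part of the original: std::optional (C++17) stands in for Fallible and std::runtime_error for SomeError. The names Row, readRow, loadTable and cell are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <optional>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

using Row = std::vector<std::string>;

// Read one line and split it on white space; std::nullopt means end of input.
std::optional<Row> readRow(std::istream& source)
{
    std::string line;
    if (!std::getline(source, line)) {
        return std::nullopt;
    }
    std::istringstream tmp(line);
    Row entry;
    std::string element;
    while (tmp >> element) {
        entry.push_back(element);
    }
    return entry;
}

// Load every row, insisting that each has as many columns as the first.
std::vector<Row> loadTable(std::istream& source)
{
    std::vector<Row> table;
    while (std::optional<Row> row = readRow(source)) {
        if (!table.empty() && row->size() != table.front().size()) {
            throw std::runtime_error("Format error");
        }
        table.push_back(*row);
    }
    if (table.empty()) {
        throw std::runtime_error("File is empty");
    }
    return table;
}

// Bounds-checked access to the (row, column) element.
const std::string& cell(const std::vector<Row>& table,
                        std::size_t row, std::size_t column)
{
    assert(row < table.size() && column < table[row].size());
    return table[row][column];
}
```

As in the original, a blank line in the input counts as a row with zero columns and is therefore reported as a format error.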
 
Jorgen Grahn

[please wrap your lines better]

none said:
I have a .txt file containing strings ordered as a row*column matrix, e.g.:

a basd asdasd asdasd ada asdasd asdda
b basd asdasd asdasd ada asdasd asdda
c basd asdasd asdasd ada asdasd asdda
c basd asdasd asdasd ada asdasd asdda

The number of columns for each row is constant. I am trying to write
a function that, given a txt file like the one above, returns the i,j
element, like:

std::string file = "c:\\test.txt";

int rows = numberOfRows(file);
int columns = numberOfColumns(file);

for (int i=0; i<rows; i++) {
    for (int j=0; j<columns; j++) {
        std::string elem = readCell(i,j,file);
    }
}

Any hints on a smart version of readCell are welcome since my current
version is pretty ugly...

I assume the code above is *not* your current version; it would have
horrible performance, since you would have to read and parse the file
rows*columns+2 times.

none said:
maybe there is some built-in functionality for this kind of txt
file parsing that I have missed?

I'd not bother with istream >> str. If the file is small enough, I'd
read it line by line into a vector<string> and have a simple parser

string column(const string& row, int col);

Or a vector<vector<string> >, if that seems more efficient given how
you'll use it.

The code is really trivial to write, using just pointer arithmetic
and std::isspace().
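
For illustration, a minimal sketch of such a column() function, written with plain pointer arithmetic and std::isspace(). Two choices here are assumptions: columns are numbered from 0, and a column that does not exist yields an empty string.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Return the col-th (0-based) white-space-separated field of row,
// or an empty string if the row has fewer fields than that.
std::string column(const std::string& row, int col)
{
    assert(col >= 0);
    const char* p = row.data();
    const char* const end = p + row.size();
    for (;;) {
        // Skip the white space in front of the next field.
        while (p != end && std::isspace(static_cast<unsigned char>(*p))) ++p;
        if (p == end) return std::string();
        // Walk to the end of this field.
        const char* const start = p;
        while (p != end && !std::isspace(static_cast<unsigned char>(*p))) ++p;
        if (col == 0) return std::string(start, p);
        --col;
    }
}
```

The cast to unsigned char matters: passing a negative char value to std::isspace() is undefined behaviour.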


Although I'd first consider if I *really* need random access to the
cells. It prevents pipelining (as in the Unix shell). If you can
manage with an interface which spits out row by row, you can support
interactive input, gzipped input, input from sockets, infinite input ...
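
A row-by-row interface of that kind might look like the sketch below. It hands each parsed row to a caller-supplied handler and never holds more than one row in memory. The names forEachRow and printColumn are hypothetical, and the lambda requires C++11 or later.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Call handler(rowIndex, fields) once per input line, in order.
// Returns the number of rows seen.
template <typename Handler>
std::size_t forEachRow(std::istream& in, Handler handler)
{
    std::size_t rowIndex = 0;
    std::string line;
    std::vector<std::string> fields;
    while (std::getline(in, line)) {
        fields.clear();
        std::istringstream tmp(line);
        std::string field;
        while (tmp >> field) fields.push_back(field);
        handler(rowIndex, fields);
        ++rowIndex;
    }
    return rowIndex;
}

// Example consumer: copy one column to out as the rows arrive.
// Rows that are too short are silently skipped.
void printColumn(std::istream& in, std::ostream& out, std::size_t col)
{
    forEachRow(in, [&](std::size_t, const std::vector<std::string>& fields) {
        if (col < fields.size()) out << fields[col] << '\n';
    });
}
```

Because it takes any std::istream, the same code can read from std::cin in a shell pipeline, e.g. printColumn(std::cin, std::cout, 2).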

/Jorgen
 
