Good way to read tuple or triple from text file?

fdm

I need to read numbers from a text file specifying vectors in either 2D or
3D space. I need to determine the best format such that the code does not
get too ugly.

I have considered the following formats (in the following 3D is assumed):

1
2
3

3
2
4

4
33
2


or:

1,2,3 3,2,4 4,33,2

But before deciding on the format and how to read it, I would like to hear
whether you guys have any suggestions.
 
Victor Bazarov

Here is what I found useful: implement reading some known format, like
DXF or NASTRAN. Then you won't have to worry about what format is best,
and besides you're going to have plenty of models to test on, and it
will certainly help usability of your application...

V
 
Francesco S. Carta

Hey, how come that this wasn't off-topic today?
Just kidding, don't take it badly ;-)

Bad, bad fdm, that's way non-C++-specific of a question ;-)

Anyway, here is what I was about to post, something along the lines of
Victor's advice:

"Although the file would become a bit larger, I suggest the array-
initialization style, that would be:
-------
{ 1, 2, 3}
{ 4, 5, 6}
-------

Whether to nest it...
-------
{
{ 1, 2, 3}
{ 4, 5, 6}
{ 7, 8, 9}
}
-------

...or completely format it for code reuse...

-------
triangle1 = {
{ 1, 2, 3},
{ 4, 5, 6},
{ 7, 8, 9}
};
triangle2 = {
{ 10, 20, 30},
{ 40, 50, 60},
{ 70, 80, 90}
};
-------

...depends on your needs/tastes/eagerness-to-code-the-parser ;-)

If the file size isn't a big problem, I'd suggest the most explicit
and human-readable format.

The parser itself can be beautiful or ugly regardless of the format.
Give it a try and, if you like, post the code for review; you will get
good help for sure."

But really, if you let one or more known formats into your program, as
suggested by Victor, that would be the optimal solution - instead of
inventing yet-another-data-format.

Implementing such parsers is a good exercise, too.
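
To give an idea of the effort involved, here is a minimal sketch of a
parser for the simplest brace style above (one "{ x, y, z}" group per
line). It assumes whitespace is insignificant and that a single trailing
comma after the closing brace is acceptable. The names Triple, expect and
parse_brace_line are hypothetical, made up for this illustration.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical holder for one vector; not part of the original thread.
struct Triple {
    double x, y, z;
};

// Read the next non-whitespace character and check that it is `wanted`.
static bool expect(std::istream& in, char wanted) {
    char c;
    return (in >> c) && c == wanted;
}

// Parse a line such as "{ 1, 2, 3}" or "{ 1, 2, 3},".
// Returns false and leaves `out` untouched if the line is malformed.
bool parse_brace_line(const std::string& line, Triple& out) {
    std::istringstream in(line);
    Triple t;
    if (!expect(in, '{')) return false;
    if (!(in >> t.x))     return false;
    if (!expect(in, ',')) return false;
    if (!(in >> t.y))     return false;
    if (!expect(in, ',')) return false;
    if (!(in >> t.z))     return false;
    if (!expect(in, '}')) return false;

    char rest;
    if (in >> rest) {                  // something follows the closing brace
        if (rest != ',') return false; // only a comma is tolerated here
        if (in >> rest)  return false; // ...and nothing after that comma
    }
    out = t;
    return true;
}
```

The nested and named variants would need a bit more state (tracking the
outer braces and the "name =" prefix), but the same expect-a-character
idea should carry over.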

Cheers!
 
fdm

Yes I am considering something like:

1 2 3
3 2 44.4
547 33 2.00

where blanks are used as separators. But I am stuck parsing it. My
approach is to read each line and somehow separate each double in the line:



std::ifstream location_file;
location_file.open(location_path.c_str(), std::ios_base::in);
std::vector<std::string> str_vectors;
std::string line;
while ( std::getline(location_file, line) ) {
    str_vectors.push_back(line);
}

// Now I have each line in str_vectors.

int nums = str_vectors.size();
for (int i=0; i<nums; i++) {

    std::string str = str_vectors[i];
    std::string tmp = str;

    // assuming that the number of blanks is 2.
    for (int j=0; j<2; j++) {
        int idx = tmp.find_first_of(" ");
        // ...I feel that I am walking down the wrong path here...

    }

}



But it seems to be a way too complicated approach... any ideas?

BTW: thanks for the QuoteFix suggestion it seems to work :)
 
fdm

One possible solution:

1) The format is:

1.11 2 3
3 2 44.4
547 33 2.00

2) the code is:

// location_path and VectorType are defined elsewhere (not shown).
int main() {
    std::ifstream location_file;
    location_file.open(location_path.c_str(), std::ios_base::in);
    std::vector<std::string> str_vectors;
    std::vector<VectorType> vectors;
    std::string line;
    while ( std::getline(location_file, line) ) {
        std::vector<std::string> double_tokens;
        Tokenize(line, double_tokens, " ");
        int numberOfTokens = double_tokens.size();
        VectorType vec;
        for(int i=0; i<numberOfTokens; i++) {
            double dd = to_double(double_tokens[i]);
            vec[i] = dd;
        }
        vectors.push_back(vec);
    }

    return 0;

}

where (found on the net):

void Tokenize(const std::string& str, std::vector<std::string>& tokens,
              const std::string& delimiters = " ") {
    std::string::size_type lastPos = str.find_first_not_of(delimiters, 0);
    std::string::size_type pos = str.find_first_of(delimiters, lastPos);
    while (std::string::npos != pos || std::string::npos != lastPos) {
        tokens.push_back(str.substr(lastPos, pos - lastPos));
        lastPos = str.find_first_not_of(delimiters, pos);
        pos = str.find_first_of(delimiters, lastPos);
    }
}

double to_double(std::string const& str) {
    std::istringstream ss(str);
    double d;
    ss >> d;
    return d;
}



Not the most beautiful solution, but it gets the job done. I am still
curious to know whether it can be done better.
 
Francesco S. Carta

Uhm... seems a bit overkill for such a simple format, but having such
facilities is useful indeed.

Try setting them aside for a moment and following these two different
approaches/hints...

- Reading doubles directly from the file:

-------
double d;
location_file >> d;
-------

- Calling getline with a space separator:

-------
getline(location_file, line, ' ');
-------

There are pros and cons for both approaches, I don't want to spoil the
fun - take the opportunity to practice as much as you can, since you're
not accustomed to streams and their facilities.
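
For readers who would rather see the first hint spelled out, here is a
minimal sketch. It assumes every vector has exactly three components and
that line breaks carry no meaning (any whitespace separates values). Vec3
and read_triples are hypothetical names, not something from the thread.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>   // only needed to try the function on a string
#include <vector>

// Hypothetical 3D vector type used for illustration.
struct Vec3 {
    double x, y, z;
};

// Reads whitespace-separated numbers and groups them three at a time.
// Returns false (appending nothing to `out`) if extraction stopped before
// the end of the input or if the count is not a multiple of three.
bool read_triples(std::istream& in, std::vector<Vec3>& out) {
    std::vector<double> values;
    double d;
    while (in >> d) {            // operator>> skips blanks, tabs and newlines
        values.push_back(d);
    }
    if (!in.eof()) {
        return false;            // stopped on a token before reaching the end
    }
    if (values.size() % 3 != 0) {
        return false;            // the last vector is incomplete
    }
    for (std::size_t i = 0; i < values.size(); i += 3) {
        Vec3 v = { values[i], values[i + 1], values[i + 2] };
        out.push_back(v);
    }
    return true;
}
```

The function takes any std::istream, so it can be fed the opened
location_file directly or a std::istringstream while experimenting. The
price of this approach is that the line structure is lost, so a 2D/3D mix
cannot be told apart; the getline-based hint keeps that information.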

Using working code that you find hanging around is OK, but heading
towards being able to fully understand such code (and eventually
improve it) is way better.

(I'm not referring to the snippet you posted in particular; I didn't
test it, nor did I analyze its details.)

fdm said:
BTW: thanks for the QuoteFix suggestion it seems to work :)

You're welcome :)
 
LR

Have you considered something like this?
This assumes Point is a class with a ctor that looks like
Point(const double x, const double y, const double z);
and also that your values are delimited by whitespace.

typedef std::vector<Point> PointVector;
PointVector get(const std::string &filename) {
    PointVector result;
    std::ifstream in(filename.c_str());
    // I didn't check to see if it's open
    std::string line;
    while(std::getline(in,line)) {
        std::istringstream ix(line);
        double x,y,z;
        if(!(ix >> x >> y)) {
            std::cout << "Invalid x or y on line:" << line << std::endl;
            continue;
        }
        if(!(ix >> z)) {
            // is this an error? Maybe, but for now,
            z = 0.0;
        }
        result.push_back(Point(x,y,z));
    }
    return result;
}

HTH

LR
 
Francesco S. Carta

Parsing data can get hard and risky.

For those who have not yet run into those pitfalls, practicing on the
following program could be useful.

The program is intentionally left naive and buggy; try running it
exactly as it is, then spot the problems from its output.

Then try uncommenting the first commented line, run it again and check
how the output changes.

Finally, uncomment the second commented line and find a way to keep the
program from hanging on it.

-------
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

using namespace std;

void test_parser(const string& input) {

    cout << "input:\n---" << endl;
    cout << input << "\n---\n" << endl;

    istringstream stream(input);
    vector<double> doubles;
    double d;

    while (!stream.eof()) {
        stream >> d;
        //stream >> ws;
        doubles.push_back(d);
    }

    cout << "output: " << doubles.size();
    cout << " elements\n---\n[ ";
    if(doubles.size()) {
        for (unsigned i = 0, e = doubles.size()-1; i < e; ++i) {
            cout << doubles[i] << ", ";
        }
        cout << doubles.back();
    } else {
        cout << "(empty)";
    }
    cout << " ]\n---\n" << endl;
}

double to_double(std::string const& str) {
    std::istringstream ss(str);
    double d;
    ss >> d;
    return d;
}

int main() {

    cout << "***************************" << endl;
    cout << "test_parser(\"\");" << endl;
    cout << endl;
    test_parser("");

    cout << "***************************" << endl;
    cout << "test_parser(\"\\n\");" << endl;
    cout << endl;
    test_parser("\n");

    cout << "***************************" << endl;
    cout <<
    "test_parser(\"1.11 2 3\\n3 2 44.4\\n547 33 2.00\\n\");" << endl;
    cout << endl;
    test_parser("1.11 2 3\n3 2 44.4\n547 33 2.00\n");

    cout << "***************************" << endl;
    cout << "to_double(\"a b c\") == ";
    cout << to_double("a b c") << endl;

    cout << "***************************" << endl;
    //test_parser("a b c");

    return 0;
}
-------
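
One common way around both symptoms of the loop above (the spurious
extra element and the endless loop on non-numeric input) is to make the
extraction itself the loop condition, so that a value is stored only
after it has actually been read. A minimal sketch, where parse_doubles
and its stopped_early parameter are hypothetical names chosen for this
example:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Collects doubles from `input` until the first extraction that fails.
// If `stopped_early` is non-null, it is set to true when the loop ended
// without the stream having reached the end of the input.
std::vector<double> parse_doubles(const std::string& input,
                                  bool* stopped_early = 0) {
    std::istringstream stream(input);
    std::vector<double> doubles;
    double d;
    while (stream >> d) {        // the body runs only after a successful read
        doubles.push_back(d);
    }
    if (stopped_early) {
        *stopped_early = !stream.eof();
    }
    return doubles;
}
```

With this shape, empty input and input consisting only of a newline both
yield an empty vector, and "a b c" makes the loop end at once instead of
spinning. Whether a failed extraction should be reported, skipped or
treated as fatal is a design decision the sketch leaves to the caller.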
 
Default User

I'd go with CSV (comma separated value) or TSV (tab separated value).
In general, I prefer the latter as I find it easier to read when
examining the files, and you don't have to worry as much about embedded
separators in values. For your simple needs, that's not a problem.
Either of those formats can be read or generated by most spreadsheet
programs, and some others as well.

<http://en.wikipedia.org/wiki/Delimiter-separated_values>
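
A minimal sketch of reading one such record, assuming plain numeric
fields with no quoting and no embedded separators (which is all this
case seems to need). parse_delimited_line is a hypothetical name; passing
'\t' as the separator handles TSV the same way.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits `line` on `sep` and converts each field to a double.
// Returns false, leaving `out` untouched, if the line yields no fields or
// if a field is empty or is not entirely a number. Note that a single
// separator at the very end of the line is silently accepted.
bool parse_delimited_line(const std::string& line,
                          std::vector<double>& out,
                          char sep = ',') {
    std::istringstream fields(line);
    std::string field;
    std::vector<double> values;
    while (std::getline(fields, field, sep)) {
        std::istringstream one(field);
        double d;
        char extra;
        if (!(one >> d)) return false;   // empty or non-numeric field
        if (one >> extra) return false;  // junk after the number
        values.push_back(d);
    }
    if (values.empty()) return false;
    out.swap(values);
    return true;
}
```

Because the number of fields is not fixed, the same function covers both
the 2D and the 3D case; the caller can check out.size() per line.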




Brian
 
