Parsing (a Series of) Variables

Mike Copeland

How do I (or can I?) parse a variable number of numeric values from a
line of text? Below is an example of a data line I need to parse (14
values, but there can be more or less):

1 37 36 4 7 5 6 8 9 10 20 32 23 16

I'm currently using the routine below, which works in a sense, but I
have to ignore blank values in the output vector. In the example data,
I get 20 vector elements, but I have to ignore 6 of them. 8<{{

typedef vector<string> TOKENS1; // parsing structure
TOKENS1 tokArray;
void parseSpace(string line) // parse & store blank-separated tokens
{
string tok1, tok2;
istringstream iss1(line);
tokArray.clear();
while(getline(iss1, tok1, ' '))
{
if(tok1.find(' ') != string::npos)
{
istringstream iss1(tok1);
while(getline(iss1, tok2, ' '))
{
if(!tok2.empty()) tokArray.push_back(tok2);
} // while
} // if
else tokArray.push_back(tok1);
} // while
return;
} // parseSpace

I assume there is a better way to parse such data (it might involve
stringstream). If so, how would I do such a thing in C++? TIA
 
Stefan Ram

How do I (or can I?) parse a variable number of numeric values from a
line of text? Below is an example of a data line I need to parse (14
values, but there can be more or less):
1 37 36 4 7 5 6 8 9 10 20 32 23 16

#include <iostream>
#include <ostream>
#include <sstream>

int main()
{ ::std::stringstream source{ "1 37 36 4 7 5 6 8 9 10 20 32 23 16" };
for( int i; source >> i; )::std::cout << i << '\n'; }

I'm currently using the routine below, which works in a sense, but I
have to ignore blank values in the output vector. In the example data,
I get 20 vector elements, but I have to ignore 6 of them. 8<{{

What does »ignore blank values in the output vector« mean?
(Which output vector? What is a »blank value«? What is the
operation »ignore« supposed to do?)
 
Stefan Ram

(Which output vector? What is a »blank value«? What is the
operation »ignore« supposed to do?)

Sorry, in the meantime, I saw that there is an output vector
in the example code. So that answers my first question.
 
Mike Copeland

#include <iostream>
#include <ostream>
#include <sstream>

int main()
{ ::std::stringstream source{ "1 37 36 4 7 5 6 8 9 10 20 32 23 16" };
for( int i; source >> i; )::std::cout << i << '\n'; }


What does »ignore blank values in the output vector« mean?
(Which output vector? What is a »blank value«? What is the
operation »ignore« supposed to do?)

Ultimately, I want to store the converted values into an array
(perhaps another vector). In your example, I don't understand why the
"for loop" passes over the data values. Also, I'm not interested in
displaying the numbers, but instead want to store them somewhere. The
"cout" doesn't do that for me, and I don't see how to access/store each
value as it passes through the loop. 8<{{
Please advise. TIA
 
Stefan Ram

"cout" doesn't do that for me, and I don't see how to access/store each
value as it passes through the loop. 8<{{

#include <iostream>
#include <ostream>
#include <sstream>
#include <vector>

int main()
{ using number = int;
::std::vector<number> vector;
::std::stringstream source{ "1 37 36 4 7 5 6 8 9 10 20 32 23 16" };
for( number i; source >> i; )vector.push_back( i ); }
 
Stefan Ram

#include <iostream>
#include <ostream>
#include <sstream>
#include <vector>
int main()
{ using number = int;
::std::vector<number> vector;
::std::stringstream source{ "1 37 36 4 7 5 6 8 9 10 20 32 23 16" };
for( number i; source >> i; )vector.push_back( i ); }

Refactor »rename variable«: »vector« -> »target«
(corresponding to »source«), »i« -> »num« (corresponding to
»number«) and removed two include directives (untested):

#include <sstream>
#include <vector>

int main()
{ using number = int;
::std::stringstream source{ "1 37 36 4 7 5 6 8 9 10 20 32 23 16" };
::std::vector<number> target;
for( number num; source >> num; )target.push_back( num ); }
 
Mike Copeland

It is. However, as I've tried to work with solutions I've received
I've learned that the approaches I've taken aren't as good as I'd like.
Consequently, I've had to refine my query...and this has been the
result.
Sorry if you consider it a waste of time. I continue to work at
improving my skills, as I evolve from procedural coding to more up-to-
date technologies and methods. At my age (73), it's not as easy as it
may be for most here.
My apologies...
 
Haochen Xie

Mike said:
Ultimately, I want to store the converted values into an array
(perhaps another vector). In your example, I don't understand why the
"for loop" passes over the data values.

OK, so since Stefan has already given you the version with a vector, I'm
just going to explain why the for loop would work.

The for loop is like this:

for(int i; source >> i; )
collection.push_back(i); // The collection is the vector

It works because "source >> i" returns a reference to the stream, in
this case source, so as far as the return value goes it is the same as
putting source in the test field. When an istream is evaluated as a
bool (a type conversion from istream to bool happens here), it yields
true unless a previous operation has failed, and false otherwise. Once
there is nothing left in the buffer, the next extraction fails: the
stream sets both its eof and fail flags (source.eof() and source.fail()
return true), so the expression evaluates to false and the loop ends.

And here is my version of your function (it returns the vector, and the
function is renamed):

#include <sstream>
#include <string> // for the string parameter
#include <vector>
using namespace std;

vector<int> parseLine(string line)
{
vector<int> coll; // "coll" for "collection", it's just my naming.
istringstream source(line);
for(int x; source >> x; )
coll.push_back(x);

return coll;
}

For a full version of the program with a test, see
<http://codepad.org/E5p2UR2r>.
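
As a rough sketch of the stream states described above, the function below reuses the same extraction loop and then checks why it stopped. The name parseInts and the bool return value are assumptions made for illustration, not part of any post in this thread. It is only a rough check: a token cut off by the end of the line (such as a lone "-") still looks like end of input.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): appends the blank-separated
// ints found in `line` to `out` and reports whether the end of the line
// was reached.
bool parseInts(const std::string& line, std::vector<int>& out)
{
    std::istringstream source(line);
    int x;
    while (source >> x)   // evaluates to false once an extraction fails
        out.push_back(x);

    // The loop only ends after a failed extraction. eof() is true when the
    // stream reached the end of the input; a malformed token in the middle
    // of the line leaves it false.
    return source.eof();
}
```

Called on "1 37 36" this fills the vector with three values and returns true; on "1 2 x 3" it stores 1 and 2, stops at the "x", and returns false.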
 
Rui Maciel

Mike said:
How do I (or can I?) parse a variable number of numeric values from a
line of text? Below is an example of a data line I need to parse (14
values, but there can be more or less):

Do you need to parse only positive integer values, or are you looking to
parse also decimal and exponential numbers?


Rui Maciel
 
88888 Dihedral

Luca Risolia wrote on Sunday, March 24, 2013 at 10:57:05 AM UTC+8:

Uhm, I have not tested C++ on new multi-core CPUs running
multiple threads, which encourage separating blocking I/O
operations into one thread that collects data into some
container object to be used by the main thread
running on another core.
 
Mike Copeland

Refactor »rename variable«: »vector« -> »target«
(corresponding to »source«), »i« -> »num« (corresponding to
»number«) and removed two include directives (untested):

#include <sstream>
#include <vector>

int main()
{ using number = int;
::std::stringstream source{ "1 37 36 4 7 5 6 8 9 10 20 32 23 16" };
::std::vector<number> target;
for( number num; source >> num; )target.push_back( num ); }

That's perfect! One thing that threw me was the variant of the
"for" loop which doesn't have a termination condition: I'm not sure how
it works, but it does. 8<}}
Follow up question: is there a way to use this technique to parse
non-integer values, such as doubles? I can't seem to get my compiler to
accept something like this:
for(number int, double xx; source >> xx;)
so I don't see how this technique can be applied. TIA
 
Mike Copeland

Are you seriously saying that you don't know how to change the std::cout
to eg. a std::vector push_back() call?

(Embarrassingly), yes. My background (in procedural programming) was
many years of very simple stuff. I've had to pick up STL piece-by-piece
as I try to convert old code to new concepts such as OOP, containers,
etc.
If your level of understanding is so primitive, I think you should start
with more basic stuff than this.

I'm stumbling along slowly, and at age 73 I don't think I have the
time or energy to start from scratch. 8<{{
 
Mike Copeland

Do you need to parse only positive integer values, or are you looking to
parse also decimal and exponential numbers?
All types of data, but mostly positive integers. I have a number of
"sscanf"s which process doubles and reals. I would like to parse them
"more elegantly", too.
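
For the doubles mentioned above, the same extraction loop can be reused with only the element type changed. This is a minimal sketch, assuming the values are separated by whitespace; the function name parseNumbers is hypothetical, and the alias is written as a typedef so that it does not depend on C++11.

```cpp
#include <sstream>
#include <string>
#include <vector>

typedef double number;   // pre-C++11 spelling of "using number = double;"

// Hypothetical helper: returns every value that could be extracted from
// `line` before the first failed extraction.
std::vector<number> parseNumbers(const std::string& line)
{
    std::vector<number> target;
    std::istringstream source(line);
    for (number num; source >> num; )   // same loop as for int
        target.push_back(num);
    return target;
}
```

With the input "1 2.5 .3 4e2 -7" this yields five values, covering integer, decimal, and exponent notation.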
 
Tobias Müller

Mike Copeland said:
That's perfect! One thing that threw me was the variant of the
"for" loop which doesn't have a termination condition: I'm not sure how
it works, but it does. 8<}}

Actually it does have one. Remember that the syntax of for-loops is:
for (start-expression; termination-condition; step-expression) {...}
The 'step-expression' is missing, not the 'termination-condition'.

In this case, the termination condition does two things:
- Parse the value
- test for failure

Maybe it is easier to understand if you do everything step by step:

number num;
while (true)
{
source >> num;
if (!source)
break;
target.push_back(num);
}

Follow up question: is there a way to use this technique to parse
non-integer values, such as doubles? I can't seem to get my compiler to
accept something like this:
for(number int, double xx; source >> xx;)

how did you come to that solution?
so I don't see how this technique can be applied.

Just change 'int' for 'double' and you're done:
using number = double;

Tobi
 
Mike Copeland

Actually it does have one. Remember that the syntax of for-loops is:
for (start-expression; termination-condition; step-expression) {...}
The 'step-expression' is missing, not the 'termination-condition'.

In this case, the termination condition does two things:
- Parse the value
- test for failure

Maybe it is easier to understand if you do everything step by step:

number num;
while (true)
{
source >> num;
if (!source)
break;
target.push_back(num);
}


how did you come to that solution?

I just tried to use what I know of for loop syntax in an
extrapolation of this new form. (Obviously, it didn't compile...)
Just change 'int' for 'double' and you're done:
using number = double;

In my system (both VS6.0 and VS10.0 Express) the "using number = x"
nomenclature doesn't compile. That's why I tried the above. 8<{{
 
Rui Maciel

Mike said:
All types of data, but mostly positive integers. I have a number of
"sscanf"s which process doubles and reals. I would like to parse them
"more elegantly", too.

It looks like a job for a proper parser. Any solution that is pulled
together with sscanf calls and the like will inevitably be broken and fallible.

Here's a starting point:
http://en.wikipedia.org/wiki/LL_parser


Hope this helps,
Rui Maciel
 
Öö Tiib

It is. However, as I've tried to work with solutions I've received
I've learned that the approaches I've taken aren't as good as I'd like.

Perhaps you should elaborate on what the issue is with the solutions
proposed to you so far. Why do you feel they aren't as good as you'd like?
Consequently, I've had to refine my query...and this has been the
result.
Sorry if you consider it a waste of time.

People propose the solutions that work for them in similar situations.
If you expected something different, then you should describe that
difference. What criteria did the solutions not meet? Asking
again without those criteria is not likely to get different answers.
I continue to work at
improving my skills, as I evolve from procedural coding to more up-to-
date technologies and methods. At my age (73), it's not as easy as it
may be for most here.

No one has too sharp a mind, whatever their age. Sharpening it with
intellectual puzzles like coding is possible at any age. You need to
realize that measuring an algorithm's efficiency, robustness and
correctness are all part of the puzzle ... it is just that the criteria
are often harder to specify for efficiency and robustness.
 
Haochen Xie

Mike said:
I just tried to use what I know of for loop syntax in an
extrapolation of this new form. (Obviously, it didn't compile...)

In my system (both VS6.0 and VS10.0 Express) the "using number = x"
nomenclature doesn't compile. That's why I tried the above. 8<{{

Tobias must not have read your whole code snippet. You should write the
for loop like this:

for(double xx; source >> xx; )

He must have thought your code was "for(number xx; source >> xx;)".
In that case, simply adding "using number = double;" would work. Just to
point out, that form of using (an alias declaration) comes from the
C++11 standard and may not work on older C++ compilers. Even if you have
an up-to-date one, you will likely have to tell it to treat your code
according to the new standard. For example, with gcc you should pass
"-std=c++0x". The equivalent in the old standard is
"typedef double number;", which makes "number" an alias of "double".
So the following pieces of code are semantically equal:

typedef double number; // or "using number=double;"
for(number xx; source >> xx; )

and

for(double xx; source >> xx; ) // you should actually write code like
// this...

And just to mention, it is impossible to declare variables of more than
one type in the initialization section of a for statement without using
tricks like the unnamed struct trick (you probably don't want to know
about it. It's the dark side of C++ and it may be too early for you.)
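
For the curious, the unnamed struct trick mentioned above can be sketched roughly as follows. The function averageOf and its member names are made up for illustration, and this is not a recommendation: declaring the variables before the loop is usually clearer, and not every compiler may accept a type defined in this position.

```cpp
#include <sstream>
#include <string>

// Hypothetical example: averages the values on a line. The for statement
// declares an int and two doubles at once by wrapping them in an unnamed
// struct, since a single declaration can only name one type.
double averageOf(const std::string& line)
{
    std::istringstream source(line);
    double result = 0.0;
    for (struct { int count; double sum, value; } s = { 0, 0.0, 0.0 };
         source >> s.value; )
    {
        ++s.count;
        s.sum += s.value;
        result = s.sum / s.count;   // s goes out of scope after the loop
    }
    return result;
}
```

Here averageOf("1 2 3") gives 2, and an empty line gives 0 because the loop body never runs.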
 
Stefan Ram

Rui Maciel said:
It looks like a job for a proper parser. Any solution that is pulled
together with sscanf calls and the like will inevitably be broken and fallible.

The following code uses a simplistic parser to parse numbers from

"1 2. .3 4"

into a heterogenous target, so that

for( numval * v : target )
::std::cout << static_cast< ::std::string >( *v )<< '\n';

prints

( int )1
( double )2.000000
( double )0.300000
( int )4

and

double sum = 0; for( numval * v : target )sum += *v;
::std::cout << sum << '\n';

prints

7.3

. I am not sure whether my use of dynamic_cast is correct since it's
the first time that I actually used it.

#include <iostream>
#include <ostream>
#include <sstream>
#include <vector>
#include <cstdio>
#include <cctype>
#include <string>

struct numval
{ virtual ::std::string text() const = 0;
virtual operator ::std::string() const = 0;
virtual operator double() const = 0;};

struct intval : public numval
{ int val;
intval( ::std::string & text )
{ val = ::std::stoi( text ); }
virtual operator double() const
{ return this->val; }
virtual operator ::std::string() const
{ return text(); }
::std::string text() const
{ ::std::string s;
s += "( int )";
s += ::std::to_string( this->val );
return s; }};

struct doubleval : public numval
{ double val;
doubleval( ::std::string & text )
{ val = ::std::stod( text ); }
virtual operator double() const
{ return this->val; }
virtual operator ::std::string() const
{ return text(); }
::std::string text() const
{ ::std::string s;
s += "( double )";
s += ::std::to_string( this->val );
return s; }};

numval * new_numval
( bool point, ::std::string & text )
{ return point ? dynamic_cast< numval * >( new doubleval( text )):
dynamic_cast< numval * >( new intval( text )); }

void numeral
( ::std::stringstream & source,
::std::string & seen,
::std::vector<numval *> & target )
{ int c;
bool looping = true;
bool point = false;
while( looping )
{ c = source.peek();
if( c == EOF )looping = false; else
if( isdigit( c ))seen.push_back( source.get() );
else if( c == '.' )
{ if( point )looping = false; else
{ point = true; seen.push_back( source.get() ); }}
else looping = false; }
if( seen == "." )seen.push_back( '0' );
target.push_back( new_numval( point, seen ));
seen.clear(); }

void other
( ::std::stringstream & source,
::std::string & seen,
::std::vector<numval *> & target )
{ int c = source.peek();
while( c != EOF && !( isdigit( c )|| c == '.' )) // stop at end of input, e.g. after trailing blanks
{ seen.push_back( source.get() );
c = source.peek(); }
seen.clear(); }

int main()
{ ::std::stringstream source{ "1 2. .3 4" };
::std::string seen;
::std::vector<numval *> target;
int c; do
{ c = source.peek();
if( isdigit( c ) || c == '.' )numeral( source, seen, target );
else other( source, seen, target ); }
while( source.good() );
for( numval * v : target )
::std::cout << static_cast< ::std::string >( *v )<< '\n';
double sum = 0;
for( numval * v : target )sum += *v;
::std::cout << sum << '\n'; }
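
On the dynamic_cast question raised above: converting a pointer to a derived class into a pointer to its public base class happens implicitly, so a factory like new_numval does not need a cast at all. The following is a minimal sketch under that observation, not a rewrite of the program above. The names make_numval and value() are hypothetical, and std::unique_ptr plus a virtual destructor are added here so that the objects are released automatically.

```cpp
#include <memory>
#include <string>
#include <vector>

struct numval
{
    virtual ~numval() {}                 // allows deletion through numval*
    virtual double value() const = 0;
};

struct intval : numval
{
    int val;
    explicit intval(const std::string& text) : val(std::stoi(text)) {}
    double value() const override { return val; }
};

struct doubleval : numval
{
    double val;
    explicit doubleval(const std::string& text) : val(std::stod(text)) {}
    double value() const override { return val; }
};

// Hypothetical factory: the derived pointers convert to numval* without
// any cast when the unique_ptr<numval> is constructed.
std::unique_ptr<numval> make_numval(bool point, const std::string& text)
{
    if (point)
        return std::unique_ptr<numval>(new doubleval(text));
    return std::unique_ptr<numval>(new intval(text));
}
```

A std::vector<std::unique_ptr<numval>> can then hold the mixed values, and summing them is a loop that adds v->value() for each element.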
 
