parsing a string

Allan Bruce

I have a string like:

"FL:1234ABCD:3:FileName With Spaces.txt\n"

and I want to read the values separated by ':' into variables. I tried to
use sscanf like this:

sscanf(xiBuffer, "FL:%s:%d:%s\n", lGuid, &lID, lFileName);

but lGuid just continues until whitespace or \0 is found. I need a way
to set the ':' as a separator, and a way for lFileName to be readable even
with whitespace in it.
Basically, a version of sscanf with specifiable separators/terminators is
what I require. Is there anything to help me? At the moment, I am skipping
colons and using memcpy(), which is messy.

Thanks
Allan
 
Unforgiven

How about something like this?

#include <string>
#include <sstream>
using namespace std;

int main()
{
    string s = "FL:1234ABCD:3:FileName With Spaces.txt\n";
    string FileName;
    int ID;
    string GUID;
    istringstream stream(s);
    stream.ignore(3); // ignore "FL:"
    getline(stream, GUID, ':'); // read up to, and discard, the next ':'
    stream >> ID;
    stream.ignore(); // ignore the ':' after id
    getline(stream, FileName, '\n'); // rest of the line, spaces included

    return 0;
}
 
Hendrik Schober

// invalid_input and isValid() are not defined here; supply your own
// exception type and GUID check.
const std::string str = "FL:1234ABCD:3:FileName With Spaces.txt\n";
std::istringstream iss( str );

std::string FL;
std::getline( iss, FL, ':' );
if( FL!="FL" || !iss.good() ) throw invalid_input(str);

std::string GUID;
std::getline( iss, GUID, ':' );
if( !isValid(GUID) || !iss.good() ) throw invalid_input(str);

int ID;
iss >> ID;
if( !iss.good() ) throw invalid_input(str);

char ch = '\0';
iss >> ch;
if( ch!=':' || !iss.good() ) throw invalid_input(str);

std::string fname;
std::getline( iss, fname );
// getline() consumes the trailing '\n' without setting eofbit, so look
// ahead to confirm that nothing follows the file name
if( !iss || iss.peek()!=std::char_traits<char>::eof() ) throw invalid_input(str);

HTH,

Schobi

--
(e-mail address removed) is never read
I'm Schobi at suespammers dot org

"Sometimes compilers are so much more reasonable than people."
Scott Meyers
 
Allan Bruce

Thanks for the help guys - I wanted something shorter, and found out
some things about sscanf. Now I just need to use:

sscanf(xiBuffer, "FD:%[^:]%s:%[^:]%d:%d", lGUID, &lLastPktFlag,
&lDataLength);

and it works a treat
Allan
 
Unforgiven

It is shorter, but I feel obliged to point out that it is not safer. If you
get any of the types wrong, either in the format string or in the argument
list, the behavior is undefined, and sscanf cannot detect the mismatch. If
the type sscanf writes is smaller than the variable you pass it, it will
likely fail rather benignly, but if it's the other way around, sscanf writes
past the end of the variable, and you have the possibility of stack
corruption and all the security problems associated with that. The same goes
for overrunning your string buffers: dangerous on the heap, even more so on
the stack.

The question you need to ask yourself is whether you can trust yourself (and
any future maintainers of the code) to always ensure type safety and buffer
length safety where sscanf does not enforce it, and (even more importantly)
whether you can always trust the input to be well-formed. If the string is
coming from a network/internet source especially, you need something a lot
more secure than sscanf, since network input can't be trusted. Local file
input is slightly better, but might still be malformed, either on purpose or
through filesystem malfunction. This still applies even if some other part
of your program generates the string, because then you are assuming that
that part of the program is 100% correct and can never generate a wrong
string, and that the memory in which the string resides is never corrupted
by some external (or internal) force.

Of course I can't force you to use any kind of construct, but I just hope
you're aware of the dangers that lie with sscanf, and that the few extra
lines of code in the examples Hendrik and I provided can easily help avoid
them.
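
For what it's worth, here is a minimal sketch of how the stream approach can
reject malformed input instead of trusting it. The names FileRecord and
parseFileRecord are made up for illustration, and "non-empty GUID" is only an
assumed validity rule; adapt both to the real protocol.

```cpp
#include <sstream>
#include <string>

// Hypothetical record type for a "FL:<guid>:<id>:<file name>\n" line.
struct FileRecord {
    std::string guid;
    int id = 0;
    std::string fileName;
};

// Returns true and fills 'out' only if every field parses; on any
// failure 'out' is left untouched.
bool parseFileRecord(const std::string& line, FileRecord& out)
{
    std::istringstream in(line);
    std::string tag;
    FileRecord rec;
    char colon = '\0';

    // Each step checks the stream state, so a short or garbled line is
    // reported instead of silently producing garbage.
    if (!std::getline(in, tag, ':') || tag != "FL") return false;
    if (!std::getline(in, rec.guid, ':') || rec.guid.empty()) return false;
    if (!(in >> rec.id)) return false;
    if (!in.get(colon) || colon != ':') return false;
    if (!std::getline(in, rec.fileName) || rec.fileName.empty()) return false;

    out = rec;
    return true;
}
```

Because the fields are std::string, there is no buffer length to get wrong,
and a type mismatch between a field and its variable is a compile error
rather than undefined behavior.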
 
Dave Moore

Use std::istringstream; it has get and getline member functions that
let you specify a termination character. You might find
something like this useful:

#include <string>
#include <sstream>
#include <iostream>

class my_parser {
    std::string text;
public:
    static const int max_field_size=100;

    explicit my_parser(const char *s) : text(s) {}

    void tokenize(char *fields[], int nf, const char delimiter=':') {
        // parses internal string, breaking at instances of <delimiter>
        // which are thrown away. Returns separated values as fields
        std::istringstream buffer(text);
        int f=0;
        while (f<nf) {
            buffer.getline(fields[f], max_field_size, delimiter);
            ++f;
        }
    }
};

int main() {
    using namespace std;
    const char* s="FL:1234ABCD:3:FileName With Spaces.txt\n";
    char **fields = new char*[4];
    for (int i=0; i<4; ++i)
        fields[i]=new char[my_parser::max_field_size];

    my_parser parse(s);
    parse.tokenize(fields, 4);
    for (int i=0; i<4; ++i)
        cout << fields[i] << endl;

    // release the buffers allocated above
    for (int i=0; i<4; ++i)
        delete[] fields[i];
    delete[] fields;

    return 0;
}

Obviously this is somewhat lobotomized, but I only spent a few minutes
on it .. it compiles (under gcc 3.3) and runs, giving the desired
result. Hopefully you can get the gist and adapt it to your purposes.

HTH, Dave Moore
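
The same idea can be sketched with std::string and std::vector, which avoids
the manual new/delete and the fixed field size. The helper name splitFields
and its maxFields parameter are assumptions for illustration, not something
from the posts above.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Illustrative helper: splits 'text' at 'delimiter' into at most
// 'maxFields' pieces. The last piece runs to the end of the line and
// keeps any further delimiters, so a file name containing ':' survives.
std::vector<std::string> splitFields(const std::string& text,
                                     std::size_t maxFields,
                                     char delimiter = ':')
{
    std::vector<std::string> fields;
    std::istringstream buffer(text);
    std::string field;

    // Read all but the last field, discarding each delimiter.
    while (fields.size() + 1 < maxFields &&
           std::getline(buffer, field, delimiter))
        fields.push_back(field);

    // Whatever is left on the line is the final field; the trailing
    // newline, if present, is consumed and dropped.
    if (maxFields > 0 && std::getline(buffer, field, '\n'))
        fields.push_back(field);

    return fields;
}
```

Calling splitFields(s, 4) on the example string should yield four strings,
and the caller can compare the returned size against 4 to detect a line with
missing fields.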
 
Kevin Goodsell

Allan said:
Thanks for the help guys - I wanted something shorter hand, and found out
some things about sscanf. Now I just need to use:

sscanf(xiBuffer, "FD:%[^:]%s:%[^:]%d:%d", lGUID, &lLastPktFlag,
&lDataLength);

and it works a treat

Dear god, man, think of the children!!

Seriously, that code has very serious problems and should not be used.
Never, NEVER use %s or %[ in scanf() without specifying a correct field
width. It's a security exploit waiting to happen. It's gets()
reinvented. Just... no.

By the time you manage to get proper field widths in there, you'll find
a few things: 1) It's nearly as complex as the stringstream solutions.
Maybe even more complex. 2) It's less flexible than the stringstream
solutions. 3) It's less clear and maintainable than the stringstream
solutions. 4) Over the life of your program, it will harbor more bugs
than the stringstream solution.

Besides that, it's still wrong. I count five conversion fields, and only
three additional arguments. That gives undefined behavior (likely
result: stack corruption leading to a program crash or random bizarre
behavior). I also see %s followed by a ':'. That can never succeed,
since %s will not stop on a ':' character.

-Kevin
 
