Parsing through a file and collecting data ...

Johnny Sandaire

Greetings,

I have a file into which I have written data in the following
format:

Charlene1719056:2011392059"1.908.555.1212"07083

The data is arranged in this order:
name, size, unique key, phone number, zip code

There may be hundreds of these entries in the file. I would like to
parse through it, collect this information, and assign each value to a
variable, which I can later insert into a database.

I count the number of entries at the beginning of the read, so I know how
many records I need to parse. I am having difficulty
parsing past the colon and the two quotation marks to grab what is in
between and after.

Any assistance would be greatly appreciated.

Thank you and best regards,

Johnny
 
Elliot

Hi Johnny,

My first suggestion is to delimit each field, something like:

Charlene|1719056|:2011392059|"1.908.555.1212"|07083

Then separate each record with a DIFFERENT delimiter. Some people prefer a newline, but you could use a space or any other character that never appears inside a field.
Example:

Charlene|1719056|:2011392059|"1.908.555.1212"|07083 Charlene|1719056|:2011392059|"1.908.555.1212"|07083
-or-
Charlene|1719056|:2011392059|"1.908.555.1212"|07083
Charlene|1719056|:2011392059|"1.908.555.1212"|07083

Then you could simply do:

open( file );

while ( read_a_record() )   /* Take each record one at a time. Try looking into strtok() */
    split_record_up();      /* Take each field of the record. Try strtok() again */

/* the loop ends when read_a_record() hits EOF */

close( file );
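
The outline above could be sketched in C++ roughly as follows. This is only an illustration, assuming '|' between fields and a newline between records; split_record and read_records are hypothetical helper names, not anything from the original post. It uses std::getline rather than strtok() because strtok() keeps a single internal position, so one strtok() loop cannot be nested inside another.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Split one record into its fields on a single-character delimiter.
// (Hypothetical helper; '|' is the assumed field delimiter.)
std::vector<std::string> split_record(const std::string& record, char delim = '|') {
    std::vector<std::string> fields;
    std::istringstream in(record);
    std::string field;
    while (std::getline(in, field, delim)) {
        fields.push_back(field);
    }
    return fields;
}

// Read newline-separated records from any input stream, such as a std::ifstream.
// Blank lines are skipped.
std::vector<std::vector<std::string>> read_records(std::istream& in) {
    std::vector<std::vector<std::string>> records;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty()) {
            records.push_back(split_record(line));
        }
    }
    return records;
}
```

To use it on a real file, open a std::ifstream and pass it to read_records; each inner vector then holds name, size, key, phone and zip in that order.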

Hope this helped!

-Elliot :)
 
Johnny Sandaire

Elliot,

Thank you for your suggestions; however, I have no control over the
structure of the data. I have to deal with it as is and manipulate it
as posted.

Any suggestions for loops or string-manipulation concepts and techniques
would be greatly appreciated.

Thanks,

Johnny
 
Elliot

On 15 Nov 2003 21:49:12 -0800, Johnny Sandaire said:
Thank you for your suggestions; however, I have no control over the
structure of the data. I have to deal with it as is and manipulate it
as posted.

Any suggestions for loops or string-manipulation concepts and techniques
would be greatly appreciated.

Thanks,

Johnny

Eek!
That makes things a bit tougher, but it shouldn't be too hard.
The only way the computer can parse the data is if most of the fields have a DEFINITE LENGTH.

Charlene1719056:2011392059"1.908.555.1212"07083

name          Charlene
size          1719056
unique key    :2011392059
phone number  "1.908.555.1212"
zip code      07083

My first logical guess would be to do this:

read the whole string
strrev( string ); /* note: strrev() is not standard C or C++, only some compilers provide it */
zip_code = strrev( read_chars( 5 ) ); /* 07083 */
phone = strrev( read_chars( 16 ) ); /* "1.908.555.1212" */
key_no = strrev( read_chars( 11 ) ); /* :2011392059 */
size = strrev( read_chars( 7 ) ); /* 1719056 */

name = strrev( read_chars( strlen( string ) - 39 ) ); /* 39 = (5+16+11+7) */

This, of course, assumes that the zip, phone, key, and size fields each have a fixed length. Unfortunately, I don't expect your size field to stay constant, i.e. it won't always be a 7-digit number. In that case the problem becomes a little more technical: if the sizes have different lengths, you would have to read one digit at a time (as characters!) from the reversed string until you reached a non-digit character, as that would mark the end of the size and the start of the (reversed) name.
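
A minimal sketch of the same idea without reversing the string, assuming the last three fields keep the widths shown above (5, 16 and 11 characters) and that the size is the run of digits just before the colon. FixedRecord and parse_fixed_tail are made-up names for illustration, and a name that itself ends in a digit would be split incorrectly.

```cpp
#include <cctype>
#include <optional>
#include <string>

// Hypothetical container for one parsed record.
struct FixedRecord {
    std::string name, size, key, phone, zip;
};

// Assumed widths: zip = 5, phone = 16 (with quotes), key = 11 (with colon).
std::optional<FixedRecord> parse_fixed_tail(const std::string& line) {
    constexpr std::size_t kZip = 5, kPhone = 16, kKey = 11;
    constexpr std::size_t kTail = kZip + kPhone + kKey;
    if (line.size() <= kTail) {
        return std::nullopt;  // too short to hold the fixed-width tail
    }

    FixedRecord r;
    const std::size_t key_pos = line.size() - kTail;  // where the ':' should be
    r.key   = line.substr(key_pos, kKey);
    r.phone = line.substr(key_pos + kKey, kPhone);
    r.zip   = line.substr(key_pos + kKey + kPhone, kZip);

    // Walk left from the key over the digits of the variable-length size.
    std::size_t size_pos = key_pos;
    while (size_pos > 0 &&
           std::isdigit(static_cast<unsigned char>(line[size_pos - 1]))) {
        --size_pos;
    }
    r.size = line.substr(size_pos, key_pos - size_pos);
    r.name = line.substr(0, size_pos);
    return r;
}
```

Counting from the end of the line with substr() gives the same fields as the reverse-and-read approach, and needs only standard library calls.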

I hope this helps!
-Elliot :)
 
Johnny Sandaire

Elliot,

Thank you for your advice. Since I am not sure how the string will
change over time, I used the string functions to search it for
the first instance of " and the last instance of ", etc. Then I
used a substring function call to grab the data in between. It seems to
be working now.
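
Johnny did not post his code, so the following is only a guess at what the marker-based approach could look like with std::string. MarkedRecord and parse_by_markers are hypothetical names; the sketch assumes exactly one ':' before the phone number and that the size is the run of digits directly before that colon.

```cpp
#include <cctype>
#include <optional>
#include <string>

// Hypothetical container; key and phone are stored without ':' and quotes.
struct MarkedRecord {
    std::string name, size, key, phone, zip;
};

std::optional<MarkedRecord> parse_by_markers(const std::string& line) {
    const std::size_t colon = line.find(':');
    const std::size_t open  = line.find('"');   // first quotation mark
    const std::size_t close = line.rfind('"');  // last quotation mark
    if (colon == std::string::npos || open == std::string::npos ||
        close == open || colon > open) {
        return std::nullopt;  // a marker is missing or out of order
    }

    MarkedRecord r;
    r.key   = line.substr(colon + 1, open - colon - 1);
    r.phone = line.substr(open + 1, close - open - 1);
    r.zip   = line.substr(close + 1);

    // The size is the run of digits immediately before the colon.
    std::size_t name_end = colon;
    while (name_end > 0 &&
           std::isdigit(static_cast<unsigned char>(line[name_end - 1]))) {
        --name_end;
    }
    r.name = line.substr(0, name_end);
    r.size = line.substr(name_end, colon - name_end);
    return r;
}
```

Because nothing here depends on field widths, this version keeps working if the size, key or phone number change length, as long as the colon and the two quotation marks stay in place.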

Thanks,

Johnny
 
Simon Elliott

Johnny Sandaire said:
I have a file into which I have written data in the following
format:

Charlene1719056:2011392059"1.908.555.1212"07083

The data is arranged in this order:
name, size, unique key, phone number, zip code

I'd read this record by record into a char array, and use sscanf to
split it up. Or perhaps read it record by record into a std::string and
use sscanf and the std::string c_str() method.
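
One possible shape for the sscanf suggestion, offered as a sketch rather than as Simon's actual code. It assumes the name contains no digits and that the field lengths fit the buffer sizes chosen here; ScannedFields and scan_record are invented names. The 0-9 range inside a scanset is widely supported but formally implementation-defined.

```cpp
#include <cstdio>
#include <string>

// Hypothetical fixed-size buffers; each width in the format string
// is one less than the buffer it fills, leaving room for the '\0'.
struct ScannedFields {
    char name[64];
    char size[16];
    char key[16];
    char phone[32];
    char zip[16];
};

// Returns true only if all five fields were converted.
bool scan_record(const std::string& line, ScannedFields& out) {
    const int converted = std::sscanf(
        line.c_str(),
        "%63[^0-9]%15[0-9]:%15[0-9]\"%31[^\"]\"%15[0-9]",
        out.name, out.size, out.key, out.phone, out.zip);
    return converted == 5;
}
```

The scansets do the work here: %[^0-9] reads up to the first digit (the name), %[0-9] reads a run of digits, and the literal : and " characters in the format must match the input exactly.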
 
Johnny Sandaire

Simon Elliott said:
I'd read this record by record into a char array, and use sscanf to
split it up. Or perhaps read it record by record into a std::string and
use sscanf and the std::string c_str() method.

Elliott,

If I have the following:

char ScannedData[256] = "proc x86 family 6 model 7 type 3";

How can I use sscanf to grab x86, 6, 7 and 3?

I then want to replace the x with the value that is after family to create 686.

Thanks,

Johnny
 
Karl Heinz Buchegger

Johnny said:
Elliott,

If I have the following:

char ScannedData[256] = "proc x86 family 6 model 7 type 3";

How can I use sscanf to grab x86, 6, 7 and 3?

Depends.
Are those texts constant or can they vary? Is the format fixed or is it variable?

I assume the simplest case:

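char Filler1[80], Filler2[80], Filler3[80], Filler4[80];
char Proc[80], Family[80], Model[80], Type[80];

/* %79s keeps each token within its 80-byte buffer;
   sscanf returns the number of fields it converted */
if( sscanf( ScannedData, "%79s %79s %79s %79s %79s %79s %79s %79s",
            Filler1, Proc,
            Filler2, Family,
            Filler3, Model,
            Filler4, Type ) != 8 ) {
  /* input did not have the expected eight tokens */
}

Johnny said:
I then want to replace the x with the value that is after family to create 686.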

So Family is always 1 character?

Proc[0] = Family[0];


Of course, the above would need some error checking, etc.
Additionally: this is just one (simple) way to do it. Since your
requirements may vary, so may the way to solve the problem.
Also: since this is C++, a switch from character arrays and sscanf
to std::string and std::stringstream would be a good idea.
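
For illustration, a stream-based version might look like the sketch below. It assumes the fixed "label value" layout of the example string; CpuInfo, parse_cpu and make_arch are made-up names, and unlike the single-character copy above, make_arch inserts the whole family number.

```cpp
#include <sstream>
#include <string>

// Hypothetical holder for the values of interest.
struct CpuInfo {
    std::string proc;  // e.g. "x86"
    int family = 0;
    int model = 0;
    int type = 0;
};

// Expects input shaped like "proc x86 family 6 model 7 type 3".
// The label words are read and discarded; returns false if any read fails.
bool parse_cpu(const std::string& text, CpuInfo& out) {
    std::istringstream in(text);
    std::string label1, label2, label3, label4;
    return static_cast<bool>(in >> label1 >> out.proc
                                >> label2 >> out.family
                                >> label3 >> out.model
                                >> label4 >> out.type);
}

// Replace a leading 'x' in the processor name with the family number.
std::string make_arch(const CpuInfo& info) {
    std::string arch = info.proc;
    if (!arch.empty() && arch[0] == 'x') {
        arch.replace(0, 1, std::to_string(info.family));
    }
    return arch;
}
```

Reading the numbers straight into int members means a non-numeric family, model or type makes the extraction fail, which gives some of the error checking mentioned above for free.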
 
