extracting data from a file

Tony Clarke

Hi All,

I have been trying to extract data from a text file using the fscanf()
and sscanf() functions. The file consists of character and integer fields
separated by semicolons. The problem I'm having is that each line is of
varying length, and the fields separated by semicolons are of varying
length too. Is there a way to check the first field and, depending on its
value, extract data from certain other fields on that line? An example of
the type of information in the text file is given below. What I want to do
is, depending on the first field (e.g. "1031"), extract the time (e.g.
"15:09:27") or some other details. I'm just wondering if anyone could
suggest an appropriate method for approaching this. I think the problem is
that the lines are not all formatted the same way.

1031;00005882;admin;5;Printer;2;103001-;STD;Lodg;12.06.2003;15:09:27;13.06.2003;08:30:31;1;1

1032;00005882;;;;;;;

1040;00005882;12.06.2003;15:09:33;12.06.2003;17:01:21;1;0;;3;12400;0;;;12400;0;0;;11366

1041;00005882;1;1
 
Ben Fitzgerald

You have to be able to describe your desired system. Try writing some
pseudo-code and work out what you need to do, *then* write the program.

Sounds to me like you need to work out which field you are in (based on
how many ';' have been read). If it's, say, the 4th field, then read it;
otherwise just skip on through until you hit '\n' and then start testing
again on the next line.
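
As a rough illustration of the counting idea, here is a minimal sketch in C.
The function name copy_field() and its parameters are made up for this
example and do not come from the thread. It walks along the line counting
';' characters until it reaches the wanted field, then copies that field
out, and it assumes the whole record is already in memory as one string.

```c
#include <stddef.h>

/* Hypothetical helper: copy field number `want` (0-based) of a
 * ';'-separated line into out[]. Returns 1 if the field exists and 0 if
 * the line ends first (or outsize is 0). A field longer than outsize - 1
 * characters is silently truncated. */
int copy_field(const char *line, size_t want, char *out, size_t outsize)
{
    size_t seen = 0;    /* number of ';' passed so far */
    size_t len = 0;     /* characters copied so far */

    if (outsize == 0)
        return 0;

    /* Skip forward until `want` separators have been read. */
    while (seen < want) {
        if (*line == '\0' || *line == '\n')
            return 0;   /* line ended before the wanted field */
        if (*line == ';')
            seen++;
        line++;
    }

    /* Copy up to the next separator or the end of the line. */
    while (*line != '\0' && *line != '\n' && *line != ';') {
        if (len + 1 < outsize)
            out[len++] = *line;
        line++;
    }
    out[len] = '\0';
    return 1;
}
```

With this sketch, copy_field(line, 0, code, sizeof code) would fetch the
record type, and a second call with a different index would fetch whichever
field that record type needs. The line itself is left unmodified.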

Anyway, write the pseudo-code first!

Good luck,
 
Dan Pop

The easiest solution is to use a regexp (regular expression) library.
There are some portable ones floating around.
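
For illustration only, here is a sketch using the POSIX <regex.h>
interface (regcomp, regexec, regfree). That choice is an assumption: POSIX
regex is not part of standard C and may not be the library meant here. The
function name match_1031_time() and the pattern are invented for this
example. The pattern assumes a "1031" record is a single line with the time
in its 11th field, as in the sample data.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: if `line` is a "1031" record, copy its 11th field
 * into out[] and return 1. Returns 0 if the line does not match, the
 * pattern fails to compile, or out[] is too small. */
int match_1031_time(const char *line, char *out, size_t outsize)
{
    regex_t re;
    regmatch_t m[3];    /* m[0] whole match, m[1] repeated group, m[2] time */
    int found = 0;

    /* "1031;" then nine more "field;" groups, then capture the next field.
     * Compiling on every call keeps the sketch short; real code would
     * compile the pattern once and reuse it. */
    if (regcomp(&re, "^1031;([^;]*;){9}([^;\n]*)", REG_EXTENDED) != 0)
        return 0;

    if (regexec(&re, line, 3, m, 0) == 0 && m[2].rm_so != -1) {
        size_t len = (size_t)(m[2].rm_eo - m[2].rm_so);
        if (len < outsize) {
            memcpy(out, line + m[2].rm_so, len);
            out[len] = '\0';
            found = 1;
        }
    }

    regfree(&re);
    return found;
}
```

Each record type would need its own pattern, so this approach suits cases
where only a few record types matter.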

Depending on what the rest of the application consists of, you may want
to use a language with built-in support for regular expressions, like
Perl.

Dan
 
Kevin Easton

Read each line using fgets(), then use strchr() to find the ';'
characters, replacing them with a '\0' and retaining a pointer to the
following character. Then you'll end up with something like:

f1 => "1031"
f2 => "00005882"
f3 => "admin"
f4 => "5"
f5 => "Printer"
f6 => "2"
f7 => "103001-"
f8 => "STD"
f9 => "Lodg"

where f1 to f9 are char * objects, and => denotes what they are pointing
to.
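
A minimal sketch of that splitting step might look like the following. The
names split_fields(), time_of_1031() and MAX_FIELDS are made up for
illustration, and the field pointers are kept in an array rather than in
separate f1 to f9 variables. The second function assumes that a "1031"
record is really one line in the file (the sample above may simply have
been wrapped by the news client), which would make the time the 11th field.

```c
#include <stddef.h>
#include <string.h>

#define MAX_FIELDS 32   /* assumed upper bound on fields per line */

/* Split `line` in place: every ';' becomes '\0' and fields[] receives a
 * pointer to the start of each field. Returns the number of fields stored.
 * Empty fields (";;") come out as empty strings. Fields beyond max_fields
 * are dropped. */
size_t split_fields(char *line, char *fields[], size_t max_fields)
{
    size_t n = 0;
    char *p = line;

    /* Drop the line ending left behind by fgets(), if there is one. */
    line[strcspn(line, "\r\n")] = '\0';

    while (n < max_fields) {
        char *sep;

        fields[n++] = p;
        sep = strchr(p, ';');
        if (sep == NULL)
            break;          /* that was the last field */
        *sep = '\0';        /* terminate the current field */
        p = sep + 1;        /* the next field starts after the ';' */
    }
    return n;
}

/* Hypothetical dispatch on the first field: for a "1031" record return the
 * 11th field (the time in the sample data), otherwise NULL. */
const char *time_of_1031(char *fields[], size_t n)
{
    if (n > 10 && strcmp(fields[0], "1031") == 0)
        return fields[10];
    return NULL;
}
```

Because the split writes '\0' into the buffer, the line has to be a
modifiable array (for example the buffer filled by fgets()), not a string
literal.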

After that it should be pretty simple to code the logic you want. You
will need to decide what to do about really long lines - depending on
your data source, you may be able to set a fixed maximum line length and
silently or non-silently truncate or ignore lines that exceed it.
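
One possible way to handle over-long lines is sketched below. The function
name read_line() and its return codes are invented for this example. It
reads with fgets() into a fixed buffer and, when no newline was stored,
discards the remainder of the line so the next call starts on a fresh
record.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper around fgets(). `size` must be at least 2.
 * Returns  1 if a complete line was read (newline removed),
 *          0 if the line did not fit; buf holds the truncated start and
 *            the rest of the line has been discarded,
 *         -1 at end of file or on a read error. */
int read_line(FILE *fp, char *buf, int size)
{
    size_t len;
    int c;

    if (fgets(buf, size, fp) == NULL)
        return -1;

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return 1;
    }

    /* No newline was stored: either this is a final line without one, or
     * the line was too long for buf. Read one more character to tell. */
    c = getc(fp);
    if (c == EOF)
        return 1;

    /* Too long: skip everything up to and including the newline. A line
     * that fills buf exactly, leaving no room for its newline, is also
     * reported as too long. */
    while (c != '\n' && c != EOF)
        c = getc(fp);
    return 0;
}
```

A caller could then either skip records for which read_line() returns 0 or
report them, depending on how much the data source can be trusted.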

- Kevin.
 