Question about reading a comma-delimited file

Hilary Cotter

Thanks for all the help you gave me yesterday.

Here is another question.

I have a comma delimited file called redirect.txt which looks like
this

test, /test.htm
test 123,/test123.htm

I am reading these values and processing them, but it seems like the
way I am doing it is not efficient. I was hoping for pointers on how
to make this more efficient.

// testparse.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"

//#include <stdio.h>
//#include <stdlib.h>
//#include <string.h>
//#include <ctype.h>

int main(int argc, char* argv[])
{

FILE *fp;
int i;

struct test
{
char in[100];
char out[100];
} my_test [150];


fp =fopen("c:\\Redirect.txt", "r");
if (!fp)
{
printf ("Can't open test file!\n");
return 1;
}
i=0;

while ((fscanf(fp, "%[a-z \\.] %[a-z \\.,]", &my_test[i].in)) != EOF)
{
fgetc(fp);
fscanf(fp, "%s", &my_test[i].out);
fgetc(fp);
printf("in %s out %s\n",my_test[i].in, my_test[i].out);
++i;
}
fclose( fp);
return 0;
}
 
Irrwahn Grausewitz

Hilary Cotter wrote:
<SNIP>
I won't dwell on the non-standard header file or the minor
flaws in your code. Instead, here is a sketch of an algorithm:

- read the file char-by-char, checking for EOF
- skip any leading whitespace
- copy characters to your 1st buffer till you hit ','
- skip the ',' and any following whitespace
- copy characters to your 2nd buffer till you hit '\n'
or whitespace
- continue till EOF

And, of course, make sure you're not producing any buffer overflows;
consider dynamic memory (re)allocation for your buffers.
Problems with implementing this? Don't hesitate to ask.
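
The steps above could be sketched roughly as follows. This is only one possible reading of the algorithm: the helper name read_record and the FIELD_MAX size are assumptions (the size mirrors the 100-byte arrays in the original struct), and fixed buffers with truncation are used here in place of dynamic reallocation to keep the example short.

```c
#include <ctype.h>
#include <stdio.h>

#define FIELD_MAX 100   /* assumed size, matching the original struct */

/*
 * Reads one "in,out" record from fp, character by character.
 * Returns 1 if a record was stored, 0 at end of file.
 * read_record is a hypothetical helper name, not from the thread.
 */
int read_record(FILE *fp, char in[FIELD_MAX], char out[FIELD_MAX])
{
    int c;
    size_t n;

    /* Skip leading whitespace, including blank lines. */
    do {
        c = getc(fp);
    } while (c != EOF && isspace(c));
    if (c == EOF)
        return 0;

    /* Copy to the first buffer until ',' (or end of line / file). */
    n = 0;
    while (c != EOF && c != ',' && c != '\n') {
        if (n < FIELD_MAX - 1)      /* silently truncate overlong fields */
            in[n++] = (char)c;
        c = getc(fp);
    }
    in[n] = '\0';

    /* Skip the ',' and any following blanks, but stay on this line. */
    if (c == ',') {
        do {
            c = getc(fp);
        } while (c == ' ' || c == '\t');
    }

    /* Copy to the second buffer until whitespace or end of file. */
    n = 0;
    while (c != EOF && !isspace(c)) {
        if (n < FIELD_MAX - 1)
            out[n++] = (char)c;
        c = getc(fp);
    }
    out[n] = '\0';

    /* Discard the rest of the line so the next call starts fresh. */
    while (c != EOF && c != '\n')
        c = getc(fp);

    return 1;
}
```

A caller would loop with something like while (i < 150 && read_record(fp, my_test[i].in, my_test[i].out)) ++i; so that the array bound is checked as well. Note that this sketch does not trim blanks before the comma.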

Regards

Irrwahn
 
Al Bowers

Hilary said:
while ((fscanf(fp, "%[a-z \\.] %[a-z \\.,]", &my_test[i].in)) != EOF)

You might try using the format string "%99[^,],%99s".
Another possibility is using function strtok.

{
fgetc(fp);
fscanf(fp, "%s", &my_test[i].out);
fgetc(fp);
printf("in %s out %s\n",my_test[i].in, my_test[i].out);
++i;
}
fclose( fp);
return 0;
}


If the file's data is formatted as you describe with each line
containing the "in" data and the "out" data then you could use
function fgets and function sscanf.

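#include <stdio.h>
#include <string.h>

#define MAX_ENTRIES 150

int main(int argc, char* argv[])
{
    FILE *fp;
    int i, count;
    char buf[256], *s;   /* room for two 99-char fields, comma and newline */
    struct test
    {
        char in[100];
        char out[100];
    } my_test[MAX_ENTRIES];

    fp = fopen("c:\\Redirect.txt", "r");
    if (!fp)
    {
        printf("Can't open test file!\n");
        return 1;
    }
    /* Read one line at a time; stop before overrunning the array. */
    for (count = 0;
         count < MAX_ENTRIES && NULL != fgets(buf, sizeof buf, fp);
         count++)
    {
        /* Strip the newline. A missing newline means the line was too
           long for buf (or the last line is unterminated). */
        if ((s = strchr(buf, '\n')) != NULL) *s = '\0';
        else {
            puts("File format error");
            fclose(fp);
            return 1;
        }
        /* Everything up to the comma goes to in, the next word to out. */
        if (2 != sscanf(buf, "%99[^,],%99s", my_test[count].in,
                        my_test[count].out))
        {
            puts("File format error");
            fclose(fp);
            return 1;
        }
    }
    fclose(fp);
    /* Testing */
    for (i = 0; i < count; i++)
        printf("my_test[%d].in = %s\n"
               "my_test[%d].out = %s\n\n",
               i, my_test[i].in, i, my_test[i].out);
    return 0;
}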
 
j

Why not read the entire file into memory instead (fseek and ftell to get the
file size, malloc that size + 1, then fread) and then tokenize it with strtok,
using "\n" as the delimiter? You can then split each line that strtok returns
at the ',' with strchr.

Although, I am not sure if this is the most efficient way.
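
A minimal sketch of that approach might look like the following. The names slurp, parse_text and struct redirect are made up for illustration, "\r\n" is used as the strtok delimiter set (instead of just "\n") on the assumption that the file may have Windows line endings, and snprintf (C99) is used for the bounded copies. Seeking to the end to find the size is common practice but is not guaranteed to be exact for text-mode streams, so the count returned by fread is what terminates the buffer.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct redirect {            /* hypothetical name for the in/out pair */
    char in[100];
    char out[100];
};

/* Reads all of fp into a malloc'd, NUL-terminated buffer (caller frees).
 * Returns NULL on failure. */
char *slurp(FILE *fp)
{
    long size;
    size_t got;
    char *text;

    if (fseek(fp, 0L, SEEK_END) != 0 || (size = ftell(fp)) < 0)
        return NULL;
    rewind(fp);
    text = malloc((size_t)size + 1);
    if (text == NULL)
        return NULL;
    got = fread(text, 1, (size_t)size, fp);  /* may be < size in text mode */
    text[got] = '\0';
    return text;
}

/* Splits text (modified in place) into at most max records.
 * Lines without a comma are skipped. Returns the number stored. */
size_t parse_text(char *text, struct redirect *table, size_t max)
{
    size_t count = 0;
    char *line, *comma, *value;

    for (line = strtok(text, "\r\n"); line != NULL && count < max;
         line = strtok(NULL, "\r\n")) {
        comma = strchr(line, ',');
        if (comma == NULL)
            continue;
        *comma = '\0';                    /* terminate the "in" part */
        value = comma + 1;
        value += strspn(value, " \t");    /* skip blanks after the comma */

        /* snprintf truncates instead of overflowing the fixed arrays. */
        snprintf(table[count].in, sizeof table[count].in, "%s", line);
        snprintf(table[count].out, sizeof table[count].out, "%s", value);
        count++;
    }
    return count;
}
```

Typical use would be text = slurp(fp); count = parse_text(text, my_test, 150); free(text); with a NULL check on text in between. Whether this beats reading line by line probably depends on the file size; for a file of 150 short lines the difference is unlikely to matter.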

Oh, and you might want to run ``indent -kr -nut'' on your code before you
post it next time (if you have a copy of indent) :)
 
Dave Thompson

Hilary Cotter wrote:
while ((fscanf(fp, "%[a-z \\.] %[a-z \\.,]", &my_test[i].in)) != EOF)


The range format a-z (rather than abcdef etc.) is nonstandard.

You might try using the format string "%99[^,],%99s".

While the length limit is certainly an improvement, and the simpler
complement class probably is too (though you might want [^,\n] in case the
input contains any misformatted lines), both of these format strings contain
two conversions but are given only one variable. You should either do one
conversion here and the other in the body of the loop below, or do both here,
using %*c to skip the separator instead of fgetc(), and none in the body.
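
As a rough illustration of the second option (both conversions in the loop condition, nothing in the body), something like the following could work. The names read_pairs and struct redirect are hypothetical, and a literal ',' in the format is used here where %*c would also do; the loop also checks the array bound.

```c
#include <stdio.h>

struct redirect {            /* hypothetical name for the in/out pair */
    char in[100];
    char out[100];
};

/* Reads up to max records from fp; returns the number stored.
 * Reading stops at end of file or at the first malformed line. */
size_t read_pairs(FILE *fp, struct redirect *table, size_t max)
{
    size_t count = 0;

    /* The leading space skips whitespace, including the newline left
     * over from the previous record; [^,\n] stops at a comma or at the
     * end of the line; %99s skips blanks and reads one word. */
    while (count < max &&
           fscanf(fp, " %99[^,\n],%99s",
                  table[count].in, table[count].out) == 2)
        count++;
    return count;
}
```

Comparing the return value against 2 rather than EOF means a line with no comma ends the loop instead of being stored half-filled.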
Another possibility is using function strtok.
Presumably on lines read with fgets(), copying the results with
strcpy() plus overflow checks, or an alternative such as zeroing the
destination and then using strncat().
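
A small sketch of the strtok variant, using the "zero + strncat" idiom for the copies, might look like this. The helper names copy_bounded and split_line are invented for the example, and overlong fields are truncated here as one possible policy rather than reported as errors.

```c
#include <stdio.h>
#include <string.h>

/* Bounded copy: empty the destination, then append at most size-1 chars.
 * strncat always writes the terminating NUL. size must be at least 1. */
static void copy_bounded(char *dst, size_t size, const char *src)
{
    dst[0] = '\0';
    strncat(dst, src, size - 1);
}

/* Splits one line as read by fgets() (modified in place) into in and out.
 * Returns 1 on success, 0 if either field is missing. */
int split_line(char *line, char *in, size_t in_size,
               char *out, size_t out_size)
{
    char *first = strtok(line, ",");
    char *second = (first != NULL) ? strtok(NULL, "\r\n") : NULL;

    if (first == NULL || second == NULL)
        return 0;
    second += strspn(second, " \t");   /* drop blanks after the comma */
    copy_bounded(in, in_size, first);
    copy_bounded(out, out_size, second);
    return 1;
}
```

It would be called once per line, for example split_line(buf, my_test[count].in, sizeof my_test[count].in, my_test[count].out, sizeof my_test[count].out). Because the second token runs to the end of the line, whitespace inside the second value is kept.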
{
fgetc(fp);
fscanf(fp, "%s", &my_test[i].out);


This doesn't allow whitespace within the second value; did you want
that?
fgetc(fp);
printf("in %s out %s\n",my_test[i].in, my_test[i].out);
++i;


There is no protection against i overflowing the declared array size.
If the file's data is formatted as you describe with each line
containing the "in" data and the "out" data then you could use
function fgets and function sscanf.
Or strtok() and strcpy() or variant as above.

if(2 != sscanf(buf,"%99[^,],%99s",my_test[count].in,
my_test[count].out))

This doesn't allow whitespace in the second value, per above.
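
If whitespace in the second value should be allowed, one possible change is to replace %99s with a scanset that runs to the end of the line. The wrapper name parse_line below is hypothetical and only there to make the format string easy to try out.

```c
#include <stdio.h>

/* Parses "in,out" from one line; out may contain embedded spaces.
 * Returns 1 on success, 0 on a malformed line. */
int parse_line(const char *buf, char in[100], char out[100])
{
    /* The space after the comma skips any blanks there; %99[^\n] then
     * takes everything up to the end of the line (or of the string). */
    return sscanf(buf, "%99[^,], %99[^\n]", in, out) == 2;
}
```

Trailing blanks at the end of the line would end up in the second value with this format, so they may need trimming separately.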

<snip>

- David.Thompson1 at worldnet.att.net
 
