Looking for a C program to parse CSV

vvk4

I have an Excel spreadsheet that I need to parse. I was thinking of saving
it as a CSV file and then reading the file using C. The actual data in Excel
looks like:
a,b a"b a","b a,",b

In CSV format it looks like:
"a,b","a""b","a"",""b","a,"",b"

Does anybody have suggestions, or C code, for parsing CSV?
Please reply to the message board itself. I do not wish to get spam.
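
For reference, the quoting shown above follows the usual CSV convention: a field is wrapped in double quotes and every embedded double quote is doubled. A minimal sketch of that encoding step, assuming a caller-supplied output buffer (the name csv_quote and its return convention are made up for illustration):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: write `field` to `out` as a quoted CSV field.
 * Returns the number of characters written (not counting the '\0'),
 * or 0 if `outsize` is too small to hold the result. */
size_t csv_quote(const char *field, char *out, size_t outsize)
{
    size_t n = 0;

    assert(field != NULL && out != NULL);
    if (outsize < 3)                /* need room for "" and '\0' */
        return 0;
    out[n++] = '"';
    for (; *field != '\0'; field++) {
        /* worst case: two chars now, then closing quote and '\0' */
        if (n + 4 > outsize)
            return 0;
        if (*field == '"')
            out[n++] = '"';         /* double every embedded quote */
        out[n++] = *field;
    }
    out[n++] = '"';
    out[n] = '\0';
    return n;
}
```

Under these assumptions, the cell a"b comes out as "a""b" and the cell a","b comes out as "a"",""b", matching the second and third fields in the CSV line above.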
 
Flash Gordon

vvk4 said:
> I have an excel spreadsheet that I need to parse. I was thinking of saving
> this as a CSV file. And then reading the file using C. The actual in EXCEL
> looks like:
> a,b a"b a","b a,",b
>
> In CSV format looks like:
> "a,b","a""b","a"",""b","a,"",b"

Yes, this is a sensible approach.

> Does anybody have suggestions or have C program based code to parse CSV.

Yes, write a program in standard C and post here with any problems. This
is not a sources wanted group, it is a group for discussing the language.

> Please reply to the message board itself.

This is not a message board, it is a news group. The fact that you use a
web interface does not change this, and most users do *not* use a web
interface.

> I do not wish to get spam.

We don't wish to get off topic posts. Did you read the FAQ, welcome
message, and a few days worth of posts before posting here? I think not.
 
Christopher Benson-Manica

vvk4 said:
> Does anybody have suggestions or have C program based code to parse CSV.
> Please reply to the message board itself. I do not wish to get spam.

Read the strings from the file with fgets(), then parse the strings
with strtok() and do whatever you like with them.
 
Anonymous 7843

Christopher Benson-Manica said:
> Read the strings from the file with fgets(), then parse the strings
> with strtok() and do whatever you like with them.

I think this is bad advice for the kind of file the original poster had
in mind. strtok() doesn't understand quotes, plus it treats runs of
separators as a single unit so empty fields in a CSV file can be
inadvertently skipped. You can use strtok, but you'll have to do quite
some work to overcome its limitations.
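
To make the problem concrete, here is a minimal sketch that counts the fields strtok() reports for one line. The helper name count_fields_strtok is made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: count the fields strtok() reports when splitting
 * `line` on commas. Note that strtok() modifies `line` in place. */
int count_fields_strtok(char *line)
{
    int count = 0;
    char *tok;

    assert(line != NULL);
    for (tok = strtok(line, ","); tok != NULL; tok = strtok(NULL, ","))
        count++;
    return count;
}
```

Called on a writable copy of a,,c this returns 2, not 3, because the two adjacent commas are treated as one separator. Called on "a,b",c it returns 3, not 2, because the comma inside the quotes is treated like any other.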

Better advice might be to just look at each character in the string
left to right and keep track of whether you're in a quoted string or
not. When you reach an unquoted comma, that's a field separator.
That, combined with turning "" into " inside of quotes would be
sufficient and probably simpler than strtok().

Along the way, one might learn things about parsing, which is perhaps
more interesting than the code itself.
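
The character-by-character approach described above could be sketched roughly as follows. This is one possible shape, not a definitive implementation: the name csv_next_field, the cursor-style interface, and the choice to truncate over-long fields silently are all assumptions made for illustration. It also assumes the line came from fgets(), so a newline ends the record.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy the next CSV field starting at *pos into `out`,
 * undoing the quoting, and advance *pos past the field and its comma.
 * Returns 1 if a field was read, 0 if the line was already exhausted
 * (signalled by *pos == NULL). */
int csv_next_field(const char **pos, char *out, size_t outsize)
{
    const char *p;
    size_t n = 0;
    int in_quotes = 0;

    assert(pos != NULL && out != NULL && outsize > 0);
    if (*pos == NULL)
        return 0;
    for (p = *pos; *p != '\0' && *p != '\n'; p++) {
        if (in_quotes) {
            if (*p == '"' && p[1] == '"') {
                p++;                 /* "" inside quotes: one literal " */
            } else if (*p == '"') {
                in_quotes = 0;       /* closing quote */
                continue;
            }
        } else if (*p == '"') {
            in_quotes = 1;           /* opening quote */
            continue;
        } else if (*p == ',') {
            break;                   /* unquoted comma ends the field */
        }
        if (n + 1 < outsize)
            out[n++] = *p;           /* over-long fields are truncated */
    }
    out[n] = '\0';
    *pos = (*p == ',') ? p + 1 : NULL;
    return 1;
}
```

A caller would point a const char * at the start of the line and call csv_next_field() in a loop until it returns 0. Unlike strtok(), this keeps empty fields and leaves the input line unmodified.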
 
Christopher Benson-Manica

Anonymous 7843 said:
> Better advice might be to just look at each character in the string
> left to right and keep track of whether you're in a quoted string or
> not. When you reach an unquoted comma, that's a field separator.
> That, combined with turning "" into " inside of quotes would be
> sufficient and probably simpler than strtok().

I had forgotten about commas in the quoted fields, but OP didn't
specify what quality of suggestions he or she was looking for :)
 
vvk4

Here is a C program that needs to be debugged. It reads at most the first
998 characters of each line.
#include <stdio.h>
#include <string.h>


void main()
{
    FILE *myfile;
    char line1[1000];
    char line2[1000];
    char line3[1000]; /* Each field in the line */
    char *stptr;
    int flag = 0;
    int idx = 0;
    int lcount = 0; /* Loop counter for debugging */
    myfile = fopen("alive.txt","r");
    if(!myfile)
    {
        puts("Some kind of file error!");
        exit(0);
    }

    /* Get a line from file */
    while (fgets(line1,999,myfile) != NULL)
    {
        strcpy(line2,line1);
        stptr = line2;

        /* start going character by character thro the line */
        while (*stptr != '\0')
        {
            lcount++;
            printf("%d",lcount);
            /* If field begins with " */
            if (*stptr == '"')
            {
                int flag = 0;
                while (flag = 0)
                {
                    idx = 0;
                    stptr++;
                    /* Find corresponding closing " */
                    while (*stptr != '"')
                    {
                        line3[idx] = *stptr;
                        idx++;
                        stptr++;
                    }
                    stptr++;
                    idx++;
                    if (*stptr == ',' || *stptr == '\0')
                    {
                        line3[idx] = '\0';
                        printf("%s",line3);
                        flag = 1;
                    }
                    else if (*stptr == '"')
                    {
                        line3[idx] = *stptr;
                        stptr++;
                        idx++;
                    }
                }
            }
            else
            {
                idx = 0;
                while (*stptr != '\0' && *stptr != ',')
                {
                    line3[idx] = *stptr;
                    idx++;
                    stptr++;
                }
                line3[idx] = '\0';
                printf("%s",line3);
            }
            if (*stptr != '\0' && *stptr == ',')
                stptr++;
            strcpy(line2,stptr);
            stptr = line2;
        }
    }
    fclose(myfile);
}
 
vvk4

Here is the program. It needs debugging. It is looping.
 
Martin Ambuhl

vvk4 said:
> Here is a C program that needs to be debugged. It will read first 999
> characters of a line.

Your code (preserved below) does far too much. Here is a cut-down
version that may provide you with a place to start in properly designing
your program:

#include <stdio.h>
#include <stdlib.h>

#define LBUFSIZ 1000

int main(void)
{
    FILE *myfile;
    char input_line[LBUFSIZ];
    char token[LBUFSIZ];
    char *cptr, *tptr;

    if (!(myfile = fopen("alive.txt", "r"))) {
        fprintf(stderr, "Could not open \"alive.txt\" for reading\n");
        exit(EXIT_FAILURE);
    }

    /* Get a line from file */
    while (fgets(input_line, sizeof input_line, myfile))
        for (cptr = input_line; *cptr;) {
            tptr = token;
            *tptr = 0;
            if (*cptr == '"') {
                for (tptr = token, cptr++; *cptr && *cptr != '"';)
                    *tptr++ = *cptr++;
                *tptr = 0;
                if (*cptr == '"')
                    cptr++;     /* step over the closing quote */
            }
            else {
                for (tptr = token; *cptr && *cptr != ',';)
                    *tptr++ = *cptr++;
                *tptr = 0;
            }
            if (*cptr && *cptr == ',')
                cptr++;
            if (*token)
                printf("%s", token);
        }
    putchar('\n');
    fclose(myfile);
    return 0;
}



[OP's code]
#include <stdio.h>
#include <string.h>


void main()
{
FILE *myfile;
char line1[1000];
char line2[1000];
char line3[1000]; /* Each field in the line */
char *stptr;
int flag = 0;
int idx = 0;
int lcount = 0; /* Loop counter for debugging */
myfile = fopen("alive.txt","r");
if(!myfile)
{
puts("Some kind of file error!");
exit(0);
}

/* Get a line from file */
while (fgets(line1,999,myfile) != NULL)
{
strcpy(line2,line1);
stptr = line2;

/* start going character by character thro the line */
while (*stptr != '\0')
{ lcount++;
printf("%d",lcount);
/* If field begins with " */
if (*stptr == '"')
{ int flag = 0;
while (flag = 0)
{ idx = 0;
stptr++;
/* Find corresponding closing " */
while (*stptr != '"')
{ line3[idx] = *stptr;
idx++;
stptr++;
}
stptr++;
idx++;
if (*stptr == ',' || *stptr == '\0')
{
line3[idx] = '\0';
printf("%s",line3);
flag = 1;
}
else if (*stptr == '"')
{ line3[idx] = *stptr;
stptr++;
idx++;
}
}
}
else
{ idx = 0;
while (*stptr != '\0' && *stptr != ',')
{ line3[idx] = *stptr;
idx++;
stptr++;
}
line3[idx] = '\0';
printf("%s",line3);
}
if (*stptr != '\0' && *stptr == ',')
stptr++;
strcpy(line2,stptr);
stptr = line2;
}
}
fclose(myfile);
}
 
vvk4

Here is the program that correctly reads a CSV file and prints the value
of each cell. I have numbered each cell just for verification. Thanks to
Martin and everybody else for the suggestions. You may be able to condense
this further.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define LBUFSIZ 1000

int main(void)
{
    FILE *myfile;
    char line1[LBUFSIZ];
    char line3[LBUFSIZ]; /* Each field in the line */
    char *stptr;
    int idx = 0;
    int lcount = 0; /* Cell number within the current line */

    if (!(myfile = fopen("alive.txt", "r")))
    {
        fprintf(stderr, "Could not open \"alive.txt\" for reading\n");
        exit(EXIT_FAILURE);
    }

    /* Get a line from file */
    while (fgets(line1, sizeof line1, myfile) != NULL)
    {
        lcount = 0;
        line1[strcspn(line1, "\n")] = '\0'; /* drop the trailing newline */
        stptr = line1;

        /* Go character by character through the line */
        while (*stptr != '\0')
        {
            lcount++;
            printf("%d", lcount);
            idx = 0;
            /* If field begins with " */
            if (*stptr == '"')
            {
                int flag = 0;
                while (flag == 0)
                {
                    stptr++;
                    /* Find corresponding closing " (or the end of the line) */
                    while (*stptr != '\0' && *stptr != '"')
                    {
                        line3[idx] = *stptr;
                        idx++;
                        stptr++;
                    }
                    if (*stptr == '"')
                        stptr++;
                    if (*stptr == '"')
                    {
                        /* "" inside quotes stands for one literal " */
                        line3[idx] = *stptr;
                        idx++;
                    }
                    else
                    {
                        /* Comma or end of line: the field is complete */
                        flag = 1;
                    }
                }
            }
            else
            {
                while (*stptr != '\0' && *stptr != ',')
                {
                    line3[idx] = *stptr;
                    idx++;
                    stptr++;
                }
            }
            line3[idx] = '\0';
            printf("%s", line3);
            if (*stptr == ',')
                stptr++;
        }
        putchar('\n');
    }
    fclose(myfile);
    return 0;
}
 
Anonymous 7843

Christopher Benson-Manica said:
> I had forgotten about commas in the quoted fields, but OP didn't
> specify what quality of suggestions he or she was looking for :)

It's not like they listen to our suggestions about the quality
of questions...
 
Christopher Benson-Manica

vvk4 said:
> Here is the program. It needs debugging. It is looping.

It is proper Usenet etiquette to include the text you are replying to.
To do this using Google groups, please follow the instructions below,
penned by Keith Thompson:

If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers.
Flash Gordon wrote (context restored):

Did you understand the import of these words? Get thee to the FAQ and
welcome message.

http://www.ungerhu.com/jxh/clc.welcome.txt
http://www.eskimo.com/~scs/C-faq/top.html

> #include <stdio.h>
> #include <string.h>
> void main()

This is how we know you didn't read the FAQ. void main() is patently
wrong.

> {
> FILE *myfile;
> char line1[1000];
> char line2[1000];
> char line3[1000]; /* Each field in the line */
> char *stptr;
> int flag = 0;
> int idx = 0;
> int lcount = 0; /* Loop counter for debugging */
> myfile = fopen("alive.txt","r");
> if(!myfile)
> {
> puts("Some kind of file error!");
> exit(0);

#include <stdlib.h>

exit( EXIT_FAILURE );

Zero denotes success. A file error is not success.

> /* Get a line from file */
> while (fgets(line1,999,myfile) != NULL)

Better is

while( fgets(line1,sizeof line1,myfile) != NULL )

If you change the size of line1, this version changes with it. Yours
does not.

The rest of your code is more complicated than it has to be. Think
about using strcspn() to find the commas and double quotes more
easily.
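
As a rough illustration of the strcspn() idea, the sketch below measures how long the raw, still-quoted field at the start of a string is. The name csv_field_span is hypothetical, and this is only one way strcspn() might be applied here.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: return the length of the raw (still quoted) field
 * at the start of `s`, i.e. the offset of the comma that ends it, or of
 * the end of the string if there is no such comma. */
size_t csv_field_span(const char *s)
{
    size_t len = 0;

    assert(s != NULL);
    for (;;) {
        len += strcspn(s + len, ",\"");  /* jump to next comma or quote */
        if (s[len] != '"')
            return len;                  /* comma or '\0': field ends here */
        len++;                           /* step over the opening quote */
        len += strcspn(s + len, "\"");   /* jump to the closing quote */
        if (s[len] == '"')
            len++;                       /* step over it, if present */
    }
}
```

For the input "a""b",c this returns 6, the length of "a""b", because the doubled quote is simply seen as one quoted section closing and another opening. Turning "" back into " is still a separate step when the field is copied out.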
 
Keith Thompson

Christopher Benson-Manica said:
> It is proper Usenet etiquette to include the text you are replying to.

A quibble: It is proper Usenet etiquette to include *the relevant
portions of* the text you are replying to. Sometimes that's the
entire article; more often it isn't.

(If I were being really pedantic I would have written "to which you
are replying".)
 
Mike Wahler

Default User said:
> Keith Thompson wrote:
>
> Why is that?

It's because it is incorrect English grammar to
end a sentence with a preposition (e.g.
"... which you are replying to.").

BTW I once ended a sentence with a proposition,
and I'm still serving my sentence. (Just kidding,
she's great.) :)

-Mike
 
Skarmander

Mike said:
> It's because it is incorrect English grammar to
> end a sentence with a preposition (e.g.
> "... which you are replying to."

Off-topic, but before someone starts quoting Churchill: most people feel
free to ignore this rather artificial rule. See
http://www.grammartips.homestead.com/prepositions1.html, just one of
many sites that explain this. See also http://en.wikipedia.org/wiki/Preposition.

Fact is, people adhere to this rule because they adhere to this rule, so
violating it is wrong because they "just know" it's wrong. It has little
to do with "live" English grammar.

"It is proper Usenet etiquette to include the text you are replying to"
is a fine English sentence. Ask anyone who's not a grammarian, then ask
the grammarians why it shouldn't be. :)

> BTW I once ended a sentence with a proposition,
> and I'm still serving my sentence. (Just kidding,
> she's great.) :)

The site I linked to even uses your pun, so it can't be coincidence. :)

S.
 
Default User

Mike said:
> It's because it is incorrect English grammar to
> end a sentence with a preposition (e.g.
> "... which you are replying to."


No, it's not. Nor is it incorrect grammar to split the infinitive, the
other great urban legend of English grammar.



Brian
 
Christopher Benson-Manica

Keith Thompson said:
> A quibble: It is proper Usenet etiquette to include *the relevant
> portions of* the text you are replying to. Sometimes that's the
> entire article; more often it isn't.

If the only quibble you had with my post was that I included too much
quoted text (I did trim some, but perhaps not adequately), then I'm
happy with it.
 
Keith Thompson

Christopher Benson-Manica said:
> If the only quibble you had with my post was that I included too much
> quoted text (I did trim some, but perhaps not adequately), then I'm
> happy with it.

My quibble was that you didn't mention the need to trim text, not that
you failed to do so yourself. Your statement was:

] It is proper Usenet etiquette to include the text you are replying to.

Of course, if "the text you are replying to" referred only to the
relevant text, rather than the entire article, then I have hardly any
quibble at all.
 
Mark McIntyre

On Sat, 08 Oct 2005 01:02:51 +0200, in comp.lang.c, Skarmander wrote
(of the rule of English grammar that you may not end a sentence with a
preposition):
> Fact is, people adhere to this rule because they adhere to this rule, so
> violating it is wrong because they "just know" it's wrong. It has little
> to do with "live" English grammar.

If you can't see how bogus this argument is, I pity you. One might as
well say that slang is proper engish, innit like, y'know?
 
Skarmander

Mark said:
> On Sat, 08 Oct 2005 01:02:51 +0200, in comp.lang.c , Skarmander
> (of the rule of english grammar that you may not end a sentence with a
> preposition)
>
> If you can't see how bogus this argument is, I pity you. One might as
> well say that slang is proper engish, innit like, y'know?

There is a world of difference between using slang in contexts where
it's not appropriate and ending your sentences with a preposition. So no
-- one might not "as well" say that.

If you're looking for an absolute stance on what makes language use
right and wrong, however, I have none to offer you, and wouldn't trust
people who claim they do. I was merely pointing out that I cannot agree
with people who judge that ending a sentence with a preposition is
wrong, because they have nothing more concrete to offer than "because we
say so".

That argument is equally bogus: we should speak and write only as
grammarians tell us to, because they know how the language works and the
majority of its speakers don't. That's not how natural language works,
no matter how hard you wish it worked that way.

Clearly the truth is in the middle -- grammar has rules, but rules
evolve from use. In this case, the argument is that the rule being
applied never evolved from anything but the imagination of language
lawyers, and while that may work for a programming language, it doesn't
work for English.

But let's not drag comp.lang.c into a discussion of prescriptivist vs.
descriptivist grammar. In C, it's all much simpler: consult the standard
and it will tell you.

S.
 
