Spaces in C

Joriveek · Feb 13, 2006

Hi,

I have a little piece of program here

Basically what it does is, it copies the strings of variable widths. The
basis is until it finds a comma ",". The input is a CSV/Comma Separated
file.

Now the problem is that it is not counting Spaces. For example to read the
following line with the below Program is OK:

123, Hello, 3422C
3994,Hii,39948D

Result: Fine, works;

But if I have the strings like the below;
123, Hi How are you,99399 C

Result: Fails to read after "Hi" because of the space, can you suggest any
code changes to below stub?

Thanks
J
--------------------------
int j = 0;
char c;
c = ptr[0];
ibic[0] = c;
while(c != ',')
{
++j;
c = ptr[j];
ibic[j] = c;
}
ibic[j] = '\0';
return 0;
-------------------------

Eric Sosman · Feb 13, 2006

Joriveek wrote On 02/13/06 11:00,:

Hi,

I have a little piece of program here

Please post the entire thing -- reduce it to its
essentials if it's long, but post a complete compilable
program. When you are sick, do you take your entire
body to the doctor or just send a lock of your hair?

Rod Pemberton · Feb 13, 2006

Joriveek said:
Hi,

I have a little piece of program here

Basically what it does is, it copies the strings of variable widths. The
basis is until it finds a comma ",". The input is a CSV/Comma Separated
file.

Now the problem is that it is not counting Spaces. For example to read the
following line with the below Program is OK:

123, Hello, 3422C
3994,Hii,39948D

Result: Fine, works;

But if I have the strings like the below;
123, Hi How are you,99399 C

Result: Fails to read after "Hi" because of the space, can you suggest any
code changes to below stub?

Thanks
J
--------------------------
int j = 0;
char c;
c = ptr[0];
ibic[0] = c;
while(c != ',')
{
++j;
c = ptr[j];
ibic[j] = c;
}
ibic[j] = '\0';
return 0;

Are you sure the problem is in that stub and not in the routine that reads
and fills 'ptr'?

Rod Pemberton

Joriveek · Feb 13, 2006

sorry, it is for reading a CSV file;
if there are spaces, it is not working, just reading if it is a continuous
string.

Rod Pemberton said:
Joriveek said:

Hi,

I have a little piece of program here

Basically what it does is, it copies the strings of variable widths. The
basis is until it finds a comma ",". The input is a CSV/Comma Separated
file.

Now the problem is that it is not counting Spaces. For example to read
the
following line with the below Program is OK:

123, Hello, 3422C
3994,Hii,39948D

Result: Fine, works;

But if I have the strings like the below;
123, Hi How are you,99399 C

Result: Fails to read after "Hi" because of the space, can you suggest
any
code changes to below stub?

Thanks
J
--------------------------
int j = 0;
char c;
c = ptr[0];
ibic[0] = c;
while(c != ',')
{
++j;
c = ptr[j];
ibic[j] = c;
}
ibic[j] = '\0';
return 0;

Click to expand...

Are you sure the problem is in that stub and not in the routine that reads
and fills 'ptr'?

Rod Pemberton

Sandeep · Feb 13, 2006

Eric Sosman wrote:

[snip]

When you are sick, do you take your entire
body to the doctor or just send a lock of your hair?

this is really a good one

)

stathis gotsis · Feb 13, 2006

Joriveek said:
Hi,

I have a little piece of program here

Basically what it does is, it copies the strings of variable widths. The
basis is until it finds a comma ",". The input is a CSV/Comma Separated
file.

Now the problem is that it is not counting Spaces. For example to read the
following line with the below Program is OK:

123, Hello, 3422C
3994,Hii,39948D

Result: Fine, works;

But if I have the strings like the below;
123, Hi How are you,99399 C

Result: Fails to read after "Hi" because of the space, can you suggest any
code changes to below stub?

Thanks
J
--------------------------
int j = 0;
char c;
c = ptr[0];
ibic[0] = c;
while(c != ',')
{
++j;
c = ptr[j];
ibic[j] = c;
}
ibic[j] = '\0';
return 0;
-------------------------

Try displaying the contents of ptr[], maybe there are no spaces in there
either.

Mark McIntyre · Feb 18, 2006

Hi,

I have a little piece of program here

you didn't post enough of your code. The sample you show doesn't make
any sense.

Basically what it does is, it copies the strings of variable widths. The
basis is until it finds a comma ",". The input is a CSV/Comma Separated
file.

You could try using strchr or strtok

Mark McIntyre

CBFalconer · Feb 19, 2006

Mark said:
you didn't post enough of your code. The sample you show doesn't
make any sense.

You could try using strchr or strtok

Here is a routine I just wrote down, totally untested, and not even
compiled yet. After this bunch gets through criticizing it it
should be bullet proof. Until then beware slippery slopes.

#include <stddef.h>

/* copy over the next token from an input string, after
skipping leading blanks (or other whitespace???). The
token is terminated by the first appearance of tokchar,
or by the end of the source string.
The caller must supply sufficient space in token to
receive any token, Otherwise tokens will be truncated.

Returns: a pointer past the terminating tokchar.

This will happily return an infinity of empty tokens if
called with src pointing to the end of a string. Tokens
will never include a copy of tokchar.
*/

const char *toksplit(const char *src, /* Source of tokens */
char tokchar, /* token delimiting char */
char *token, /* receiver of parsed token */
size_t lgh) /* length token can receive */
/* not including final '\0' */
{
while (' ' == *src) *src++;

while (*src && (tokchar != *src)) {
if (lgh) {
*token++ = *src;
--lgh;
}
src++;
}
if (*src && (tokchar == *src)) src++;
*token = '\0';
return src;
} /* toksplit */

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Also see <http://www.safalra.com/special/googlegroupsreply/>

Michael Mair · Feb 19, 2006

CBFalconer said:
Here is a routine I just wrote down, totally untested, and not even
compiled yet. After this bunch gets through criticizing it it
should be bullet proof. Until then beware slippery slopes.

I did not test it either...

#include <stddef.h>

/* copy over the next token from an input string, after
skipping leading blanks (or other whitespace???). The
token is terminated by the first appearance of tokchar,
or by the end of the source string.
The caller must supply sufficient space in token to
receive any token, Otherwise tokens will be truncated.

Returns: a pointer past the terminating tokchar.

This will happily return an infinity of empty tokens if
called with src pointing to the end of a string. Tokens
will never include a copy of tokchar.
*/

const char *toksplit(const char *src, /* Source of tokens */
char tokchar, /* token delimiting char */
char *token, /* receiver of parsed token */
size_t lgh) /* length token can receive */
/* not including final '\0' */
{
while (' ' == *src) *src++;

ITYM
while (*src && ' ' == *src) src++;

while (*src && (tokchar != *src)) {
if (lgh) {
*token++ = *src;
--lgh;
}

I'd break in an else. Why go through 100000 characters if
five suffice? This may imply a change of the loop structure.

src++;
}
if (*src && (tokchar == *src)) src++;
*token = '\0';
return src;
} /* toksplit */

Cheers
Michael

CBFalconer · Feb 19, 2006

Michael said:
I did not test it either...

ITYM
while (*src && ' ' == *src) src++;

if *src == ' ' then *src is true, unless ' ' == 0, which conflicts
with the idea that strings are terminated with '\0'.

I'd break in an else. Why go through 100000 characters if
five suffice? This may imply a change of the loop structure.

My attitude is that if a token is over-long and needs to be
truncated, do it and get the pointers set up for the next token.
That way a sequence of calls can always find, say, the third
token. I envision something like:

const char source = "Suitable stuff, , make, tokens";
char token[5];
const char *src = source;
int i;
....
for (i = 0; i < 3; i++) {
src = toksplit(src, ',', token, sizeof(token) - 1);
process(token);
}

finding the third token, "make". I don't think a source string of
length 10000 is especially likely to occur, so I am prepared for
inefficiencies in dealing with it. This would allow tokens to be
abbreviated to their first four chars.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Also see <http://www.safalra.com/special/googlegroupsreply/>

Michael Mair · Feb 19, 2006

CBFalconer said:
if *src == ' ' then *src is true, unless ' ' == 0, which conflicts
with the idea that strings are terminated with '\0'.

Argh. Did not think enough about it.
I would have checked all the input parameters and
have prematurely terminated and was somehow still caught
on that track.

I'd break in an else. Why go through 100000 characters if
five suffice? This may imply a change of the loop structure.

Click to expand...

My attitude is that if a token is over-long and needs to be
truncated, do it and get the pointers set up for the next token.
That way a sequence of calls can always find, say, the third
token. I envision something like:

const char source = "Suitable stuff, , make, tokens";
char token[5];
const char *src = source;
int i;
...
for (i = 0; i < 3; i++) {
src = toksplit(src, ',', token, sizeof(token) - 1);
process(token);
}

finding the third token, "make". I don't think a source string of
length 10000 is especially likely to occur, so I am prepared for
inefficiencies in dealing with it. This would allow tokens to be
abbreviated to their first four chars.

I see; I did not follow the discussion but I still would have
gone for a final call to strchr() after the loop rather than
test against lgh all the time.

Cheers
Michael

CBFalconer · Feb 20, 2006

Michael said:
CBFalconer schrieb:
.... snip ...

I did not test it either...

I got around to testing it. Use -DTESTING to compile a test
program with gcc. Without that define you get a linkable module.
The result follows:

/* ------- file toksplit.h ----------*/
#ifndef H_toksplit_h
# define H_toksplit_h

# ifdef __cplusplus
extern "C" {
# endif

#include <stddef.h>

/* copy over the next token from an input string, after
skipping leading blanks (or other whitespace?). The
token is terminated by the first appearance of tokchar,
or by the end of the source string.

The caller must supply sufficient space in token to
receive any token, Otherwise tokens will be truncated.

Returns: a pointer past the terminating tokchar.

This will happily return an infinity of empty tokens if
called with src pointing to the end of a string. Tokens
will never include a copy of tokchar.

released to Public Domain, by C.B. Falconer.
Published 2006-02-20. Attribution appreciated.
*/

const char *toksplit(const char *src, /* Source of tokens */
char tokchar, /* token delimiting char */
char *token, /* receiver of parsed token */
size_t lgh); /* length token can receive */
/* not including final '\0' */

# ifdef __cplusplus
}
# endif
#endif
/* ------- end file toksplit.h ----------*/

/* ------- file toksplit.c ----------*/
#include "toksplit.h"

/* copy over the next token from an input string, after
skipping leading blanks (or other whitespace?). The
token is terminated by the first appearance of tokchar,
or by the end of the source string.

The caller must supply sufficient space in token to
receive any token, Otherwise tokens will be truncated.

Returns: a pointer past the terminating tokchar.

This will happily return an infinity of empty tokens if
called with src pointing to the end of a string. Tokens
will never include a copy of tokchar.

A better name would be "strtkn", except that is reserved
for the system namespace. Change to that at your risk.

released to Public Domain, by C.B. Falconer.
Published 2006-02-20. Attribution appreciated.
*/

const char *toksplit(const char *src, /* Source of tokens */
char tokchar, /* token delimiting char */
char *token, /* receiver of parsed token */
size_t lgh) /* length token can receive */
/* not including final '\0' */
{
if (src) {
while (' ' == *src) *src++;

while (*src && (tokchar != *src)) {
if (lgh) {
*token++ = *src;
--lgh;
}
src++;
}
if (*src && (tokchar == *src)) src++;
}
*token = '\0';
return src;
} /* toksplit */

#ifdef TESTING
#include <stdio.h>

#define ABRsize 6 /* length of acceptable token abbreviations */

int main(void)
{
char teststring[] = "This is a test, ,, abbrev, more";

const char *t, *s = teststring;
int i;
char token[ABRsize + 1];

puts(teststring);
t = s;
for (i = 0; i < 4; i++) {
t = toksplit(t, ',', token, ABRsize);
putchar(i + '1'); putchar(':');
puts(token);
}

puts("\nHow to detect 'no more tokens'");
t = s; i = 0;
while (*t) {
t = toksplit(t, ',', token, 3);
putchar(i + '1'); putchar(':');
puts(token);
i++;
}

puts("\nUsing blanks as token delimiters");
t = s; i = 0;
while (*t) {
t = toksplit(t, ' ', token, ABRsize);
putchar(i + '1'); putchar(':');
puts(token);
i++;
}
return 0;
} /* main */

#endif
/* ------- end file toksplit.c ----------*/

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Also see <http://www.safalra.com/special/googlegroupsreply/>

websnarf · Feb 20, 2006

Joriveek said:
I have a little piece of program here

Basically what it does is, it copies the strings of variable widths. The
basis is until it finds a comma ",". The input is a CSV/Comma Separated
file.

CSV parsing is a little bit convoluted. CSV files are lines of fields,
which is the classic nested tokens problem that strtok is so useless as
dealing with. You can find a parser here:

http://www.pobox.com/~qed/bcsv.zip

Filter sober in c++ don't pass test	0	Dec 2, 2023
C exercise	1	Feb 3, 2022
C program: memory leak/ segmentation fault/ memory limit exceeded	0	Nov 12, 2022
A process take input from /proc/<pid>/fd/0, but won't process it	0	Oct 29, 2023
Comparison of Integer and Pointer (that's supposed to be an Integer). Where did I go wrong?	0	Nov 19, 2022
Boomer trying to learn coding in C and C++	6	Dec 16, 2022
Drawing missing in bitmap in a pure C win32 program	4	Jun 3, 2023
Function is not worked in C	2	Jun 27, 2023

Spaces in C

Joriveek

Eric Sosman

Rod Pemberton

Joriveek

Sandeep

stathis gotsis

Mark McIntyre

CBFalconer

Michael Mair

CBFalconer

Michael Mair

CBFalconer

websnarf

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads