insert a char in a char array

Sean Bartholomew

I have a bit of a problem.
I'm parsing a record string using strtok, but I'm encountering
back-to-back tabs (\t\t) caused by empty fields in my database export.
strtok treats a run of consecutive delimiters as a single delimiter, so
it reads up to the LAST \t in the run. This of course shifts every field
that follows an empty one.
Annoying... yes.
However, I may have 2 solutions:

1----If I can replace all instances of "\t\t" with "\t \t" (a space, or
any other placeholder character), I'd be happy.

OR

2----Replace all instances of "\t\t" with, say, "*", add "*" to
strtok's list of delimiters, and after each token extraction somehow
FIND OUT which delimiter it used. That way I could use IF statements on
each extraction, based on a variable updated from the PREVIOUS
extraction.
In other words: if the previous extraction was delimited by "*", then
don't even DO this one. Just store a null in the current field variable
manually and move on to the next token.

But how do I DO this?
How can I check which delimiter was used?
And if I can't, back to the 1st solution: how do I INSERT a space,
moving the following characters to the right, which of course means
that the array has to be 1 larger? HOW?
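
For the 1st solution, here is a minimal sketch of inserting a character into a char array, assuming the record sits in a buffer with spare capacity. The names insert_char and pad_empty_fields are hypothetical helpers, not library functions. The trick is memmove, which safely shifts the overlapping tail (including the terminating '\0') one place to the right.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: insert character c at index pos of the string held
   in buf. cap is the total size of buf in bytes. Returns 1 on success, or
   0 if pos lies past the end of the string or there is no room left. */
static int insert_char(char *buf, size_t cap, size_t pos, char c)
{
    size_t len;

    assert(buf != NULL);
    len = strlen(buf);
    if (pos > len || len + 2 > cap)   /* need len + 1 chars plus '\0' */
        return 0;

    /* Shift the tail, including the terminating '\0', one place right.
       memmove is used because source and destination overlap. */
    memmove(buf + pos + 1, buf + pos, len - pos + 1);
    buf[pos] = c;
    return 1;
}

/* Hypothetical helper: turn every "\t\t" into "\t \t" so that a
   strtok-style tokenizer sees a one-space field instead of nothing.
   Returns 0 if the buffer ran out of room, 1 otherwise. */
static int pad_empty_fields(char *buf, size_t cap)
{
    size_t i;

    for (i = 0; buf[i] != '\0'; i++) {
        if (buf[i] == '\t' && buf[i + 1] == '\t') {
            if (!insert_char(buf, cap, i + 1, ' '))
                return 0;
        }
    }
    return 1;
}
```

With a 64-byte buffer holding "name\t\t\tcity", pad_empty_fields leaves "name\t \t \tcity" in it. Note that this sketch does not handle an empty first field (a leading tab) or an empty last field (a trailing tab), and the buffer has to be allocated large enough up front: in the worst case one extra byte per tab.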
 
Victor Bazarov

Sean Bartholomew said:
However, I may have 2 solutions:
[...]

OR

3----Quit using strtok: simply look for the next \t yourself and
extract whatever lies between consecutive tabs in your string.

Victor
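
A minimal sketch of that 3rd option, assuming the record may be modified in place and that '\t' is the only delimiter. The name split_tabs is a hypothetical helper. It uses the standard strchr to find each tab and keeps empty fields as empty strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split line in place at every '\t', storing a
   pointer to the start of each field in fields[]. Empty fields are kept
   as empty strings. Returns the number of fields stored (at most max);
   any fields beyond max are ignored. */
static size_t split_tabs(char *line, char **fields, size_t max)
{
    size_t n = 0;
    char *start = line;

    assert(line != NULL && fields != NULL);
    while (n < max) {
        char *tab = strchr(start, '\t');   /* next delimiter, if any */

        fields[n++] = start;
        if (tab == NULL)                   /* that was the last field */
            break;
        *tab = '\0';                       /* terminate this field */
        start = tab + 1;                   /* next field begins after it */
    }
    return n;
}
```

For the record "a\t\tb\t" this yields four fields: "a", "", "b" and "". Because every tab ends exactly one field, the field positions stay aligned with the database columns.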
 
Alex Lyman

It might be simpler and more efficient to just re-make strtok() so that it
does _not_ skip over multiple delimiters. The other options you mentioned
would be: a) unreliable (if you replace every "\t\t" with "*", what happens
if the data already contains a '*'?), and b) a lot of fun processing
(expanding "\t\t" to "\t \t" needs extra space in the buffer for each ' ',
not to mention an extra buffer).

Just find your LIBC implementation's strtok.c file. Most of the high-grade
compilers come with their LIBC source (gcc, MSVC++, and CodeWarrior, to name
a few); if yours didn't, you can likely Google for it, or ask here to see if
anyone has a compatible source. Then create a new strtok2() function in your
own project's source (unless you want to create & distribute your own LIBC
variant), copying the original for the most part. Somewhere in it, you should
see a while-loop that does nothing but check whether the current character
is a delimiter and, if so, move on to the next character. All you have to do
is turn this while-loop into an if-statement, and you're good to go. Given
that the change is such a minor one (changing a single 'while' into an
'if'), I'm curious as to why LIBC doesn't already have a standardized
function for this -- anybody know?

Hope this helps
- Alex

Here's the code, based on my local MSVC++7 LIBC implementation (sans
original comments & multi-threaded support), to illustrate the entire
change. It isn't recommended for use with other compilers (which can and
will have issues with this):

--------------------------

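char * __cdecl strtok2 (
    char * string,
    const char * control
    )
{
    unsigned char *str;
    const unsigned char *ctrl = (const unsigned char *)control;

    unsigned char map[32];      /* one bit per possible char value */
    int count;

    static char *nextoken;      /* where the next call resumes */

    /* Clear the delimiter bitmap. */
    for (count = 0; count < 32; count++)
        map[count] = 0;

    /* Set the bit for every delimiter (and for the terminating '\0'). */
    do {
        map[*ctrl >> 3] |= (1 << (*ctrl & 7));
    } while (*ctrl++);

    /* Start on the new string, or resume where the last call stopped. */
    if (string)
        str = (unsigned char *)string;
    else
        str = (unsigned char *)nextoken;

    /* To make strtok2() _not_ skip over multiple delimiters, just
       change this while() into an if()! */
    // while ( (map[*str >> 3] & (1 << (*str & 7))) && *str )
    if ( (map[*str >> 3] & (1 << (*str & 7))) && *str )
        str++;

    string = (char *)str;

    /* Find the end of the token and terminate it with '\0'. */
    for ( ; *str ; str++ )
        if ( map[*str >> 3] & (1 << (*str & 7)) ) {
            *str++ = '\0';
            break;
        }

    nextoken = (char *)str;

    if ( string == (char *)str )
        return NULL;
    else
        return string;
}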
 
Sean Bartholomew

I tried dropping it in, but I get a wad of errors, starting with a
missing ";" on the 1st line.
 
Alex Lyman

Sean Bartholomew said:
I tried dropping it in, but I get a wad of errors, starting with a
missing ";" on the 1st line.

This was not unexpected. Note my statement: "isn't recommended for
use with other compilers (which can and will have issues with this)".
 
Alex Lyman

DOH!

Sometime in the future, maybe I'll learn to actually look at a new algorithm
implementation before submitting it.
My earlier recommendation and code do not produce the desired behavior.
Instead of being turned into an 'if', the aforementioned 'while' should be
removed entirely. The code as posted still swallows up to 2 consecutive
delimiters (being an idiot, I forgot that the while-statement actually
starts at the character after the last token delimiter that was overwritten
with \0). Removing it gets rid of the skipping entirely and lets the
for-loop just below it take care of finding the delimiters.

My bad.

- Alex
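
A minimal sketch of a tokenizer with the correction applied (no delimiter-skipping step at all), written from scratch in portable C rather than copied from any LIBC. The name next_field and its signature are hypothetical. It leaves out the MSVC-only __cdecl keyword, which is one plausible cause of the missing-";" error on other compilers. It keeps its position in a caller-supplied pointer instead of a static variable, and it reports which delimiter ended each field. It is similar in spirit to the BSD strsep() function, which is not part of ISO C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the next field starting at *cursor, or NULL
   once the input is exhausted. Consecutive delimiters produce empty
   fields. If used is not NULL, *used receives the delimiter that ended
   the field ('\0' for the last field). The input is modified in place. */
static char *next_field(char **cursor, const char *delims, char *used)
{
    char *field;
    char *end;

    assert(cursor != NULL && delims != NULL);
    field = *cursor;
    if (field == NULL)              /* previous call returned the last field */
        return NULL;

    /* strcspn gives the length of the run containing no delimiter. */
    end = field + strcspn(field, delims);
    if (used != NULL)
        *used = *end;

    if (*end != '\0') {
        *end = '\0';                /* terminate this field */
        *cursor = end + 1;          /* resume right after the delimiter */
    } else {
        *cursor = NULL;             /* nothing left after this field */
    }
    return field;
}
```

To use it, point a char * at the start of the record and call next_field(&p, "\t", &d) in a loop until it returns NULL. For "a\t\tb" the calls return "a", "" and "b", so an empty database field shows up as an empty string in the right position.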
 
