splitting a string (and memory allocation)

fatted

I'm trying to write a function which splits a string (possibly multiple
times) on a particular character and returns the strings which have been
split. What I have below is kind of (oh dear!) printing the results I
expect, which I guess means my dynamic memory allocation is a mess.

Also, I was advised previously that I should really free memory in the
same place I declare it, but I'm not sure how I would go about doing
this in my code below.

Comments, advice and cleaner solutions welcome :)

--

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void * mymalloc(size_t);
void * myrealloc(void *, size_t);
char ** split(char *, int);

int main(void)
{
char string[] = "this is a string = sausage";
char ** data = split(string,' ');

while(*data)
{
printf("[main]%s\n",*data++);
}

return EXIT_SUCCESS;
}

/*
** This function splits a string s, on the character c and returns a
** pointer to an array of strings.
*/
char ** split(char * s, int c)
{
char * sptr;
int word_len;
/* according to man page on linux (hopefully standards compliant) */
char ** result = NULL;
int i = 0;

while(s)
{
if((sptr = strchr(s, c)) != NULL)
{
word_len = strlen(s) - strlen(sptr);
result = myrealloc(result,(sizeof(char *)*(i+1)));
result[i] = mymalloc(word_len + 1);
strncpy(result[i],s,word_len);
s = sptr + 1;
}
else
{
result[i] = mymalloc(strlen(s)+1);
result[i] = s;
s = NULL;
}
++i;
}
/* assign null to indicate end of array */
result[i] = mymalloc(1);
result[i] = '\0';

return(result);
}

void * myrealloc(void * ptr, size_t amount)
{
if((ptr = realloc(ptr, amount)) == NULL)
{
perror("Realloc Failed!\n");
}

return(ptr);
}

void * mymalloc(size_t amount)
{
void * ptr;

if((ptr = malloc(amount)) == NULL)
{
perror("Malloc Failed!\n");
}

return(ptr);
}
 
Kristofer Pettijohn

I don't have time to post any sample code, but what people usually do in
this case is strdup() the original string, to create a duplicate of it
in newly malloc()'d memory. They then operate on that copy, traversing
it character by character, writing '\0' over each delimiter and adding a
char * that points to the character after each '\0' to an array, which
is returned to the calling function.

Example:

If you're splitting on spaces, the string

"This is my test"

then becomes

"This\0is\0my\0test".

and assuming the string begins at 0x8000000 (for simplicity's sake), an
array would be returned as follows:

{ 0x8000000, 0x8000005, 0x8000008, 0x800000b }.

How you create that array is up to you, but those are the values that
would be in it, pointing to the character 'T', the character 'i', the
character 'm', and the character 't'.

Kristofer
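
A minimal sketch of the approach described above, for illustration only. The names split_copy and split_copy_free are hypothetical, and the copy is made with malloc() and memcpy() rather than strdup() so that the sketch stays within standard C. It assumes empty pieces (two delimiters in a row) should be kept, as in the original post.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: split a private copy of s on the character c.
 * Returns a NULL-terminated array of pointers into that copy, or NULL
 * if an allocation fails. result[0] always points at the copy itself.
 */
char **split_copy(const char *s, int c)
{
    assert(s != NULL);

    size_t len = strlen(s);
    size_t count = 1;                  /* one piece, plus one per delimiter */
    for (const char *p = s; *p != '\0'; p++)
        if (*p == (char)c)
            count++;

    char *copy = malloc(len + 1);
    char **result = malloc((count + 1) * sizeof *result);
    if (copy == NULL || result == NULL) {
        free(copy);
        free(result);
        return NULL;
    }
    memcpy(copy, s, len + 1);          /* includes the terminating '\0' */

    size_t n = 0;
    result[n++] = copy;                /* first piece starts the copy */
    for (char *p = copy; *p != '\0'; p++) {
        if (*p == (char)c) {
            *p = '\0';                 /* end the current piece here */
            result[n++] = p + 1;       /* next piece starts after it */
        }
    }
    result[n] = NULL;                  /* sentinel for the caller's loop */
    return result;
}

/* Free what split_copy returned: the string copy, then the array. */
void split_copy_free(char **pieces)
{
    if (pieces != NULL) {
        free(pieces[0]);
        free(pieces);
    }
}
```

Because every piece lives inside one copy, cleanup is two free() calls no matter how many pieces there are. For "This is my test" split on ' ', the pointers sit 0, 5, 8 and 11 bytes from the start of the copy, matching the addresses in the example.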
 
Bjørn Augestad

A very good implementation of what you want is available here:

http://tinyurl.com/6phlp

(http://cvs.sourceforge.net/viewcvs.py/libclc/libclc/src/string/clc_strsplit.c?rev=1.9&view=auto)

Bjørn

[snip]
 
Jens.Toerring

Sources snipped to the interesting part and indented for readability.

    /*
    ** This function splits a string s, on the character c and returns a
    ** pointer to an array of strings.
    */

    char **split(char *s, int c)
    {
        char *sptr;
        int word_len;

Since strlen() returns a size_t, it would probably be cleaner to use
that type for this variable as well.

        /* according to man page on linux (hopefully standards compliant) */

I wouldn't count on Linux man pages teaching you standard-compliant
programming ;-)

        char **result = NULL;
        int i = 0;

        while( s )
        {
            if ( ( sptr = strchr( s, c ) ) != NULL )
            {
                word_len = strlen( s ) - strlen( sptr );

Subtracting pointers would be faster, but that's not an issue here.

                result = myrealloc( result, ( sizeof( char * ) * ( i + 1 ) ) );
                result[ i ] = mymalloc( word_len + 1 );
                strncpy( result[ i ], s, word_len );

You still have to append a '\0' to the new string (or, to be precise,
to make it a string at all):

                result[ i ][ word_len ] = '\0';
                s = sptr + 1;
            }
            else
            {
                result[ i ] = mymalloc( strlen( s ) + 1 );
                result[ i ] = s;

But you have never allocated memory for this element of 'result'! You
only did that in the case of strchr() finding something. And even if you
had, you would have to use strcpy() in the next step, not just assign
some pointer (which also loses the memory you just allocated). So you
seem to need

                result = myrealloc( result, sizeof( char * ) * ( i + 1 ) );
                result[ i ] = mymalloc( strlen( s ) + 1 );
                strcpy( result[ i ], s );
                s = NULL;
            }
            ++i;
        }

        /* assign null to indicate end of array */

        result[ i ] = mymalloc( 1 );
        result[ i ] = '\0';

No. The last element is supposed to be a NULL pointer, which is quite
different from the character '\0'. Besides, the elements of 'result' are
char pointers, and a char pointer is not the place to store a char. (The
assignment happens to compile only because '\0' is an integer constant
with the value 0 and therefore also a null pointer constant; the byte
obtained from mymalloc( 1 ) is leaked either way.) And again, without
one more allocation 'result' has only 'i' elements, so the largest index
you can use is 'i - 1'. What you need here is

        result = realloc( result, sizeof ( char * ) * ( i + 1 ) );
        result[ i ] = NULL;
        return( result );
    }
Regards, Jens
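
Putting the corrections above together, one possible sketch follows. The names split_fixed and free_split are hypothetical and not from the thread. It uses size_t for lengths, finds each word's length by pointer subtraction, and keeps the array NULL-terminated after every step so that it can be released from any failure path. It assumes the caller owns the result and releases it with one call to free_split, which is one way to keep allocation and cleanup paired.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Release an array built by split_fixed; safe to call with NULL. */
void free_split(char **pieces)
{
    if (pieces == NULL)
        return;
    for (size_t i = 0; pieces[i] != NULL; i++)
        free(pieces[i]);
    free(pieces);
}

/*
 * Split s on the character c into separately allocated strings.
 * Returns a NULL-terminated array, or NULL if an allocation fails.
 */
char **split_fixed(const char *s, int c)
{
    assert(s != NULL);

    char **result = NULL;
    size_t n = 0;                       /* number of pieces stored so far */

    for (;;) {
        const char *end = strchr(s, c);
        size_t len = end ? (size_t)(end - s) : strlen(s);

        /* room for this piece plus the terminating NULL pointer */
        char **tmp = realloc(result, (n + 2) * sizeof *tmp);
        if (tmp == NULL) {
            free_split(result);         /* old block is still valid */
            return NULL;
        }
        result = tmp;

        result[n] = malloc(len + 1);
        if (result[n] == NULL) {
            free_split(result);         /* result[n] == NULL ends the walk */
            return NULL;
        }
        memcpy(result[n], s, len);
        result[n][len] = '\0';          /* make the copy a proper string */
        result[++n] = NULL;             /* keep the array terminated */

        if (end == NULL || *end == '\0')
            break;                      /* last piece has been stored */
        s = end + 1;
    }
    return result;
}
```

A caller would loop with an index or a second pointer rather than incrementing the returned pointer itself, since free_split needs the original pointer.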
 
Flash Gordon

Kristofer Pettijohn said:
I don't have time to post any sample code, but what people usually do
in this case is strdup() the original string, to create a duplicate of
it with new malloc()'d memory. They then operate on that string,

<snip>

No they don't. strdup is not part of the C standard so people only use
it when they are not concerned with portability. Personally I think it
would have been a nice addition for C99 since there is a lot of existing
practice and it would not have broken anything, but it was not added.

You can trivially implement my_strdup using strlen, malloc and memcpy.
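
For illustration, such a function might look like the sketch below. Only the name my_strdup comes from the post; the body is an assumption about what was meant, built from strlen, malloc and memcpy as suggested.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a malloc()'d copy of the string s, or NULL if malloc fails.
 * The caller is responsible for passing the result to free().
 */
char *my_strdup(const char *s)
{
    assert(s != NULL);

    size_t size = strlen(s) + 1;       /* include the terminating '\0' */
    char *copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, s, size);
    return copy;
}
```

The copy is independent of the original, so it can be modified (for example by writing '\0' over delimiters) even when the original is a string literal.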
 
Sascha Springer

Hi!

If memory is not a bottleneck, I would suggest allocating a buffer of
twice the original string's length and writing the string terminators
('\0') into it as you copy, comparing each character of the original
string with the split character.

For the pointer table, just do the following: allocate a block of
(strlen(original_string) + 1) elements of type "char *".

At least this would save you from all the reallocation! ;)

Regards
Sascha
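
A rough sketch of this worst-case allocation idea, with some assumptions that differ slightly from the sizes given above. The name split_prealloc is hypothetical. The copy needs only strlen + 1 bytes, because each delimiter is overwritten by a terminator rather than added to. The pointer table here has strlen + 2 entries: a string made up entirely of delimiters yields strlen + 1 empty pieces, and one more slot holds a terminating NULL pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Split s on c in a single pass with exactly two allocations.
 * Returns a NULL-terminated table of pointers into one buffer, or NULL
 * if an allocation fails. Release with free(table[0]); free(table);
 */
char **split_prealloc(const char *s, int c)
{
    assert(s != NULL);

    size_t len = strlen(s);
    /* worst case: len + 1 pieces, plus the NULL sentinel */
    char **table = malloc((len + 2) * sizeof *table);
    char *buf = malloc(len + 1);
    if (table == NULL || buf == NULL) {
        free(table);
        free(buf);
        return NULL;
    }

    size_t n = 0;
    table[n++] = buf;                  /* first piece starts the buffer */
    for (size_t i = 0; i < len; i++) {
        if (s[i] == (char)c) {
            buf[i] = '\0';             /* terminator replaces delimiter */
            table[n++] = &buf[i + 1];  /* next piece starts after it */
        } else {
            buf[i] = s[i];             /* ordinary character: copy it */
        }
    }
    buf[len] = '\0';                   /* terminate the last piece */
    table[n] = NULL;
    return table;
}
```

The trade-off is memory: the table is sized for the worst case even when the string contains only a few delimiters.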
 