[newbie] strcpy, strtok and strcat problem...

  • Thread starter Une bévue

Une bévue

To avoid changing an input string, I strcpy it so that I can use strtok
and strcat on the copy. For another reason I need a second copy of the
same string. However, after running strtok and strcat on the first copy,
the second copy has exactly the same value as the first. Here is the
small piece of code for that part:

--- part of RAliasRecord.c --------------------------------------------
VALUE m_raliasrecord_init(VALUE self, VALUE from_path, VALUE
target_path)
{
char *target_path_cpy, *target_path_cpy2;
target_path_cpy = strcpy(target_path_cpy,
StringValuePtr(target_path));
target_path_cpy2 = strcpy(target_path_cpy2,
StringValuePtr(target_path));

//
// some print-out to verify input state :
//
printf("target_path=%s\n", StringValuePtr(target_path));
// => target_path = /Users/yvon/Desktop/raliasrecord-0.0.1
printf("target_path_cpy=%s\n", target_path_cpy);
// => target_path_cpy = /Users/yvon/Desktop/raliasrecord-0.0.1
printf("target_path_cpy2=%s\n", target_path_cpy2);
// => target_path_cpy2 = /Users/yvon/Desktop/raliasrecord-0.0.1

const char needle[] = "/";
char *token, *file_name;
char parent_path[512]="";
char *pieces[32];
int i = 1;
int j = 1;

token = strtok(target_path_cpy, needle);
strcat(parent_path, needle);
strcat(parent_path, token);
pieces[0]=token;

while(token != NULL) {
    pieces[i]=token;
    file_name = token;
    token = strtok(NULL, needle);
    i++;
}

while(j < i-1) {
    strcat(parent_path, needle);
    strcat(parent_path, pieces[j]);
    j++;
}

//
// some print-out to verify output state :
//
printf("target_path=%s\n", StringValuePtr(target_path));
// => target_path = /Users/yvon/Desktop/raliasrecord-0.0.1 GOOD
printf("parent_path=%s\n", parent_path);
// => parent_path = /Users/Users/yvon/Desktop GOOD
printf("file_name=%s\n", file_name);
// => file_name = raliasrecord-0.0.1 GOOD
printf("target_path_cpy=%s\n", target_path_cpy);
// => target_path_cpy = /Users ????
printf("target_path_cpy2=%s\n", target_path_cpy2);
// => target_path_cpy2 = /Users ????

-----------------------------------------------------------------------

I'm not surprised to get "target_path_cpy = /Users", because of strtok
and strcat. However, I am surprised to get:

target_path_cpy2 = /Users = target_path_cpy

And, as expected, target_path remains unchanged (thanks to strcpy).
 
Dang

Hi, if this is the actual code, then I can see an issue. In the function
"m_raliasrecord_init" you declare two char pointers, target_path_cpy and
target_path_cpy2, and strcpy into them directly. Who is allocating
memory for them?

Regards,
Dang
 
CBFalconer

Une said:
in order not to change an input string i strcpy it to be able to
use strtok and strcat with it, for another reason, i need a second
copy of this string, however, after strtok and strcat with the
first, the second does have exactly the same value, here is the
small piece of code for that part :

You have some problems here. First, strtok is not a standard
function, and even the various implementations have problems, such
as lack of re-entrancy, fouling the original string, failure to
detect omitted tokens, etc. In addition strcpy and strcat are
inherently unsafe, and are much better replaced by the BSD proposed
strlcpy and strlcat routines. You can get these last two, in
portable standard C, at:

<http://cbfalconer.home.att.net/download/>
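
For readers without strlcpy (it is a BSD extension, not part of ISO C), a bounded copy can be approximated with C99 snprintf. The sketch below is only an illustration: the helper name bounded_copy is hypothetical and is not taken from the download above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the thread): copy src into dst, which
   has room for cap bytes including the terminating '\0'.
   Returns 1 if the whole string fit, 0 if it was truncated. */
static int bounded_copy(char *dst, size_t cap, const char *src)
{
    int n;

    if (cap == 0)
        return 0;                    /* nowhere to write, not even '\0' */
    n = snprintf(dst, cap, "%s", src);   /* always terminates dst */
    return n >= 0 && (size_t)n < cap;
}
```

Unlike strcpy, the destination can never be overrun, and the return value tells the caller whether the result was cut short.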

I have also written a replacement for strtok without the faults,
but simultaneously with different characteristics. Note that it
includes a conditionally compiled testing mechanism. Its source
follows:

/* ------- file toksplit.h ----------*/
#ifndef H_toksplit_h
# define H_toksplit_h

# ifdef __cplusplus
extern "C" {
# endif

#include <stddef.h>

/* copy over the next token from an input string, after
skipping leading blanks (or other whitespace?). The
token is terminated by the first appearance of tokchar,
or by the end of the source string.

The caller must supply sufficient space in token to
receive any token, Otherwise tokens will be truncated.

Returns: a pointer past the terminating tokchar.

This will happily return an infinity of empty tokens if
called with src pointing to the end of a string. Tokens
will never include a copy of tokchar.

released to Public Domain, by C.B. Falconer.
Published 2006-02-20. Attribution appreciated.
*/

const char *toksplit(const char *src, /* Source of tokens */
char tokchar, /* token delimiting char */
char *token, /* receiver of parsed token */
size_t lgh); /* length token can receive */
/* not including final '\0' */

# ifdef __cplusplus
}
# endif
#endif
/* ------- end file toksplit.h ----------*/

/* ------- file toksplit.c ----------*/
#include "toksplit.h"

/* copy over the next token from an input string, after
skipping leading blanks (or other whitespace?). The
token is terminated by the first appearance of tokchar,
or by the end of the source string.

The caller must supply sufficient space in token to
receive any token, Otherwise tokens will be truncated.

Returns: a pointer past the terminating tokchar.

This will happily return an infinity of empty tokens if
called with src pointing to the end of a string. Tokens
will never include a copy of tokchar.

A better name would be "strtkn", except that is reserved
for the system namespace. Change to that at your risk.

released to Public Domain, by C.B. Falconer.
Published 2006-02-20. Attribution appreciated.
Revised 2006-06-13
*/

const char *toksplit(const char *src, /* Source of tokens */
                     char tokchar,    /* token delimiting char */
                     char *token,     /* receiver of parsed token */
                     size_t lgh)      /* length token can receive */
                                      /* not including final '\0' */
{
    if (src) {
        while (' ' == *src) src++;

        while (*src && (tokchar != *src)) {
            if (lgh) {
                *token++ = *src;
                --lgh;
            }
            src++;
        }
        if (*src && (tokchar == *src)) src++;
    }
    *token = '\0';
    return src;
} /* toksplit */

#ifdef TESTING
#include <stdio.h>

#define ABRsize 6 /* length of acceptable token abbreviations */

/* ---------------- */

static void showtoken(int i, char *tok)
{
    putchar(i + '1'); putchar(':');
    puts(tok);
} /* showtoken */

/* ---------------- */

int main(void)
{
    char teststring[] = "This is a test, ,, abbrev, more";

    const char *t, *s = teststring;
    int i;
    char token[ABRsize + 1];

    puts(teststring);
    t = s;
    for (i = 0; i < 4; i++) {
        t = toksplit(t, ',', token, ABRsize);
        showtoken(i, token);
    }

    puts("\nHow to detect 'no more tokens' while truncating");
    t = s; i = 0;
    while (*t) {
        t = toksplit(t, ',', token, 3);
        showtoken(i, token);
        i++;
    }

    puts("\nUsing blanks as token delimiters");
    t = s; i = 0;
    while (*t) {
        t = toksplit(t, ' ', token, ABRsize);
        showtoken(i, token);
        i++;
    }
    return 0;
} /* main */

#endif
/* ------- end file toksplit.c ----------*/
 
Une bévue

Dang said:
Hi, if this is the actual code then I can see some issue. In function
"m_raliasrecord_init" you have taken two char pointers target_path_cpy,
target_path_cpy2 and you are directly doing a strcpy. Who is allocating
memory for them?

Fine, thanks ))

char *pieces[32];

pieces being an array of char, I suppose I also need to malloc each
element of this array?
 
Michael Mair

Une said:
in order not to change an input string i strcpy it to be able to use
strtok and strcat with it, for another reason, i need a second copy of
this string, however, after strtok and strcat with the first, the second
does have exactly the same value, here is the small piece of code for
that part :

--- part of RAliasRecord.c --------------------------------------------
VALUE m_raliasrecord_init(VALUE self, VALUE from_path, VALUE
target_path)
{
char *target_path_cpy, *target_path_cpy2;

You have uninitialised pointers, which may contain a random
valid address or a null pointer or something not a valid pointer
representation at all.

target_path_cpy = strcpy(target_path_cpy,
StringValuePtr(target_path));
target_path_cpy2 = strcpy(target_path_cpy2,
StringValuePtr(target_path));

Now you try to copy something to a storage location you
_cannot_ own.
There are non-Standard-C extensions like strdup() which
duplicate a string and provide the necessary memory; these
can be used instead of the above.
Or you can get the memory yourself previous to the call to
strcpy():
char *target_path_cpy, *target_path_cpy2;
char *temp = StringValuePtr(target_path);
size_t size = strlen(temp) + 1;
target_path_cpy = malloc(size);
target_path_cpy2 = malloc(size);
if (NULL == target_path_cpy
    || NULL == target_path_cpy2) {
    free(target_path_cpy);
    free(target_path_cpy2);
    return INVALID_VALUE; /* or whatever you do to indicate failure */
}
strcpy(target_path_cpy, temp);
strcpy(target_path_cpy2, temp);

There you go...
And at the end of m_raliasrecord_init(), you call
free(target_path_cpy);
free(target_path_cpy2);
before returning the VALUE return value.
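
Putting these pieces together, a minimal sketch of the whole split might look like the following. It assumes '/' as the separator and leaves out the Ruby VALUE handling so that it stands alone as plain C; split_path and its parameters are hypothetical names, not part of the original RAliasRecord.c.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split "/a/b/c" into parent "/a/b" and name "c".
   Returns 0 on success, -1 on allocation failure, when the path has no
   components, or when an output buffer is too small. */
static int split_path(const char *path, char *parent, size_t parent_cap,
                      char *name, size_t name_cap)
{
    size_t size = strlen(path) + 1;
    char *copy = malloc(size);   /* storage we own; strtok modifies it */
    char *token, *next;
    int rc = -1;

    if (copy == NULL || parent_cap == 0 || name_cap == 0) {
        free(copy);
        return -1;
    }
    memcpy(copy, path, size);
    parent[0] = '\0';
    name[0] = '\0';

    token = strtok(copy, "/");
    while (token != NULL) {
        next = strtok(NULL, "/");
        if (next == NULL) {
            /* the last component is the file name */
            if (strlen(token) < name_cap) {
                strcpy(name, token);
                rc = 0;
            }
        } else {
            /* every earlier component belongs to the parent path */
            if (strlen(parent) + 1 + strlen(token) >= parent_cap)
                break;
            strcat(parent, "/");
            strcat(parent, token);
        }
        token = next;
    }
    free(copy);                  /* the caller's path was never touched */
    return rc;
}
```

Because strtok only ever sees the malloc'ed copy, the input string stays intact, and a second copy is needed only if the caller wants one for other reasons.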

Cheers
Michael
 
Eric Sosman

CBFalconer wrote On 08/31/06 14:20,:
[...] First, strtok is not a standard
function, [...]

I suspect you meant to type something other than
"standard" here, but I don't know what it was.
 
Michael Mair

CBFalconer said:
You have some problems here. First, strtok is not a standard
function,
<snip>

Either you are wrong or my standards are broken:
ISO9899:1999 "7.21.5.8 The strtok function"
C89 draft: "4.11.5.8 The strtok function"
(the ISO9899:1990 is at work but I can look it up tomorrow
if you wish).

The rest of your article, of course, is correct but essentially
only advertisement for your code without addressing the
immediate problem.


Cheers
Michael
 
Une bévue

Michael Mair said:
You have uninitialised pointers, which may contain a random
valid address or a null pointer or something not a valid pointer
representation at all.

Right! I've found that in between...

Now you try to copy something to a storage location you
_cannot_ own.
There are non-Standard-C extensions like strdup() which
duplicate a string and provide the necessary memory; these
can be used instead of the above.
Or you can get the memory yourself previous to the call to
strcpy():
char *target_path_cpy, *target_path_cpy2;
char *temp = StringValuePtr(target_path);
size_t size = strlen(temp) + 1;

I've used a constant size of 512 for a full path and 256 for a relative
one. Even though target_path_cpy has the same length as target_path at
the start, it might become longer (or shorter) afterwards.

target_path_cpy = malloc(size);
target_path_cpy2 = malloc(size);
if (NULL == target_path_cpy
|| NULL == target_path_cpy2) {
free(target_path_cpy);
free(target_path_cpy2);
return INVALID_VALUE; /* or whatever you do to indicate failure */
}
strcpy(target_path_cpy, temp);
strcpy(target_path_cpy2, temp);

There you go...
And at the end of m_raliasrecord_init(), you call
free(target_path_cpy);
free(target_path_cpy2);
before returning the VALUE return value.

Yes, thanks a lot. However, I have a question about freeing memory. The
skeleton of my program is:

VALUE m_raliasrecord_init(VALUE self, VALUE from_path, VALUE target_path)
{
    /* compute all the internal values */
    rb_iv_set(self, "@from_path", from_path);
    //<the same for all the values>
    return self;
}

But values such as from_path are used elsewhere, like this:

VALUE m_from_path(VALUE self) {
    return rb_iv_get(self, "@from_path");
}

(rb_iv_set/get are specific to ruby.h)

Part of ruby.h:
VALUE rb_iv_get(VALUE, const char*);
VALUE rb_iv_set(VALUE, const char*, VALUE);

Can I be sure that, after deallocating the memory for from_path in
"m_raliasrecord_init", for example, I'll still get the correct value in
"m_from_path"?
 
Une bévue

CBFalconer said:
You have some problems here. First, strtok is not a standard
function, and even the various implementations have problems, such
as lack of re-entrancy, fouling the original string, failure to
detect omitted tokens, etc. In addition strcpy and strcat are
inherently unsafe, and are much better replaced by the BSD proposed
strlcpy and strlcat routines. You can get these last two, in
portable standard C, at:

<http://cbfalconer.home.att.net/download/>

I have also written a replacement for strtok without the faults,
but simultaneously with different characteristics. Note that it
includes a conditionally compiled testing mechanism. Its source
follows:

ok, fine, thanks for all, i'll examine that asap !!!
 
Michael Mair

Une said:
right ! i've found that in between...

Um, "in between" what?
Do you mean that you found this out in the meantime?
I am not playing language teacher here (at least not for
English); I just do not understand what you mean.

i've put a constant size 512 for a full path and 256 for a relative one,
even if target_path_cpy does have the same length, at start, than
target_path it might be longer (or shorter afterwards.

Note: Fixed path lengths probably _will_ give you trouble at some
point in the future as there is no "large enough" buffer to hold
all possible paths.
If you want some room for change, you can just use

size_t size = INITIAL_FULL_PATH_ADJUSTMENT + strlen(temp) + 1;

and realloc() if necessary.
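
The "some room for change, then realloc()" idea can be sketched as follows. This is a hypothetical helper, append_str, not code from the thread, and the doubling growth policy is an arbitrary choice made for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: append suffix to the heap string *buf, whose
   current capacity is *cap bytes, growing the buffer when needed.
   *buf must point to a valid, terminated, malloc'ed string.
   Returns 0 on success, -1 if realloc fails (buffer left unchanged). */
static int append_str(char **buf, size_t *cap, const char *suffix)
{
    size_t used = strlen(*buf);
    size_t extra = strlen(suffix);
    size_t need = used + extra + 1;      /* +1 for the final '\0' */

    if (need > *cap) {
        size_t new_cap = need * 2;       /* leave room for later growth */
        char *tmp = realloc(*buf, new_cap);
        if (tmp == NULL)
            return -1;                   /* old buffer is still valid */
        *buf = tmp;
        *cap = new_cap;
    }
    memcpy(*buf + used, suffix, extra + 1);
    return 0;
}
```

Assigning the result of realloc to a temporary first means the original buffer is not lost, and can still be freed, when the reallocation fails.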
yes, thanks a lot, however i've a question about freeing memory the
skeleton of my prog being :

VALUE m_raliasrecord_init(VALUE self, VALUE from_path, VALUE
target_path)
{
computes all the internal values
rb_iv_set(self, "@from_path", from_path);
//<the same for all the values>
return self;
}

BUT the values as from_path are used elsewhere like that :

VALUE m_from_path(VALUE self) {
return rb_iv_get(self, "@from_path");
}

(rb_iv_set/get are specific to ruby.h)

part of ruby.h :
VALUE rb_iv_get(VALUE, const char*);
VALUE rb_iv_set(VALUE, const char*, VALUE);

am i sure that, after deallocating memory for from_path in
"m_raliasrecord_init" , for example, i'll get the correct value in
"m_from_path" ???

You did not give enough information.
Does rb_iv_get() create a copy of the passed source VALUE's
respective property?
Or does it create a copy of some pointer within the source VALUE
and store this pointer in the destination VALUE?
The same for the strings:
Will it store a copy of the string?
Find some property by using the string?
Store a copy of the string's address?

All of this should be covered by the documentation.
If no pointers are stored, then you can and should free() whatever
you allocated in m_raliasrecord_init(). If only pointers are stored,
then you must not free() them in m_raliasrecord_init(); instead,
free a VALUE's property pointers when deleting the last
VALUE that stores them.
Hopefully, the former is the case.


Cheers
Michael
 
Une bévue

Michael Mair said:
Um, "in between" what?

in the meantime )))
Note: Fixed path lengths probably _will_ give you trouble at some
point in the future as there is no "large enough" buffer to hold
all possible paths.
If you want some room for change, you can just use

size_t size = INITIAL_FULL_PATH_ADJUSTMENT + strlen(temp) + 1;

and realloc() if necessary.

Yes, you're right! This evening I discovered from Apple's Tech Note 2078
<http://developer.apple.com/technotes/tn2002/tn2078.html>

that a file name might take more than 2k bytes, because it is in
UTF-16, even if names are generally 40-50 chars long!
So I'll be careful about that point...

[...]
You did not give enough information.
Does rb_iv_get() create a copy of the passed source VALUE's
respective property?

i don't know actually...
Or does it create a copy of some pointer within the source VALUE
and store this pointer in the destination VALUE?
The same for the strings:
Will it store a copy of the string?
Find some property by using the string?
Store a copy of the string's address?

In fact, when I last posted I had not even found the *.c file containing
rb_iv_set and rb_iv_get. I have found them now, but that is not
sufficient: I have to walk up the tree of functions.

Here is what I've found:
VALUE
rb_iv_get(VALUE obj, const char *name)
{
    ID id = rb_intern(name);

    return rb_ivar_get(obj, id);
}

VALUE
rb_iv_set(VALUE obj, const char *name, VALUE val)
{
    ID id = rb_intern(name);

    return rb_ivar_set(obj, id, val);
}

with rb_intern declared as:
ID rb_intern(const char*);

I'm still looking for the implementation...

I'll ask about that on comp.lang.ruby...

All of this should be told to you by the documentation.
If not pointers are stored, then you can and should free() whatever
you allocated in m_raliasrecord_init(). If only pointers are stored,
then you must not free() these pointers in m_raliasrecord_init()
but should free a VALUE's property pointers when deleting the last
VALUE storing them.
Hopefully, the former is the case.

Yes, that's what I thought.

Thanks for everything!
 
CBFalconer

Eric said:
CBFalconer wrote On 08/31/06 14:20,:
[...] First, strtok is not a standard
function, [...]

I suspect you meant to type something other than
"standard" here, but I don't know what it was.

Maybe "re-entrant"? A cow flew by.
 
Barry Schwarz

char *pieces[32];

pieces being an array of char, i suppose i need also to malloc each
element of this array ?

pieces is an array of pointers to char.

You need to initialize each pointer in the array before you evaluate
that pointer. Initializing a pointer with the value returned by
malloc is one way. Another is to assign the address of an existing
char to a pointer. A third is to assign the value of an existing
char* to the pointer. There may be others.
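
The three ways listed above might look like this in code. The function name init_three_ways and its parameters are made up for the illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical demo: give each of three pointers a valid value in one
   of the three ways described. Returns 0 on success, -1 if malloc
   fails. The caller must free(pieces[0]) afterwards. */
static int init_three_ways(char *pieces[3], char *buffer, char *existing)
{
    /* 1. the value returned by malloc */
    pieces[0] = malloc(sizeof "Users");
    if (pieces[0] == NULL)
        return -1;
    strcpy(pieces[0], "Users");

    /* 2. the address of an existing char */
    pieces[1] = &buffer[0];

    /* 3. the value of an existing char* */
    pieces[2] = existing;

    return 0;
}
```

In the code from the original post, storing the token returned by strtok in pieces is the third form: strtok returns pointers into target_path_cpy, so no separate malloc per element is needed as long as that buffer itself is validly allocated and outlives the array.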


 
Une bévue

Barry Schwarz said:
pieces being an array of pointer to char.

Yes, true ;-) I have to get to grips with C vocabulary ))

You need to initialize each pointer in the array before you evaluate
that pointer. Initializing a pointer with the value returned by
malloc is one way. Another is to assign the address of an existing
char to a pointer. A third is to assign the value of an existing
char* to the pointer. There may be others.

fine, thanks.
 
Une bévue

Michael Mair said:
Or does it create a copy of some pointer within the source VALUE
and store this pointer in the destination VALUE?

I got the answer on that point: it is not rb_iv_set() that makes the copy,
but rather the functions meant to transform a C value into a "Ruby" one:
Assuming you're using one of the rb_str_new family of
functions to create Ruby String objects from C strings,
Ruby makes a copy of the C string in Ruby-managed memory.
In general Ruby objects are always in Ruby-managed memory
unless you're wrapping one of your own C structures
into a Ruby object with Data_Wrap_Struct.

(given by Timothy Hunter msg ID : <[email protected]>)

So I must free all the internal C variables...
 
