disturbed between two versions

  • Thread starter =?ISO-8859-1?Q?Une_b=E9vue?=
  • Start date
?

=?ISO-8859-1?Q?Une_b=E9vue?=

I'm still a neby in C, i do have two versions of about the same algo.
The first one works fine but de second. I don't see what's the prob with
the second.

the purpose to cut out in sections a unix path.

the test source :
--- essai_token.c ------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main (void)
{
const char *unix_sep = "/";
char *token = malloc(sizeof (*token) * 256);
if(token == NULL) return EXIT_FAILURE;

//
// Version 1
//
printf("// Version 1\n");
char *path_v1 = strdup("/path_v1/to/something");
printf("path_v1 = %s\n", path_v1);
token = strtok(path_v1, unix_sep);
printf("token = %s\n", token);
while(token != NULL) {
token = strtok(NULL, unix_sep);
printf("token = %s\n", token);
}
free(path_v1);
path_v1 = NULL;
printf("\n");

//
// Version 2
//
printf("// Version 2\n");
char *path_v2 = malloc(sizeof (*path_v2) * 256);
if(path_v2 == NULL) return EXIT_FAILURE;
path_v2 = "/path_v2/to/something";
printf("path_v2 = %s\n", path_v2);
token = strtok(path_v2, unix_sep);// <= broken HERE
printf("token = %s\n", token);
while(token != NULL) {
token = strtok(NULL, unix_sep);
printf("token = %s\n", token);

}
free(path_v2);
path_v2 = NULL;
free(token);
token = NULL;
return EXIT_SUCCESS;
}
------------------------------------------------------------------------

run output :

~/work/C/developpez/listes_chainees_double%> ./essai_token
// Version 1
path_v1 = /path_v1/to/something
token = path_v1
token = to
token = something
token = (null)

// Version 2
path_v2 = /path_v2/to/something
zsh: bus error ./essai_token

then, the broken line is "token = strtok(path_v2, unix_sep);"

no warnings nor errors at the compil time using the following gcc
options :

alias ycc='cc -W -Wall -Wextra -Wuninitialized -Wstrict-prototypes
-Wmissing-prototypes -pedantic -std=c99 -O2 -pipe -o '

pratically i make use only of the first version, however i'd like to
understand what's my mistake in the second one...


i know that working with strtok needs a writable mem zone, i assume
malloc does that ???

my setup :
Mac OS X 10.4.7
~/work/C/developpez/listes_chainees_double%> gcc --version
powerpc-apple-darwin8-gcc-4.0.1 (GCC) 4.0.1 (Apple Computer, Inc. build
5363)
 
S

Spiros Bousbouras

Une said:
I'm still a neby in C, i do have two versions of about the same algo.
The first one works fine but de second. I don't see what's the prob with
the second.

the purpose to cut out in sections a unix path.

the test source :
--- essai_token.c ------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main (void)
{
const char *unix_sep = "/";
char *token = malloc(sizeof (*token) * 256);

token points to char so sizeof(*token) is guaranteed
to be 1.
if(token == NULL) return EXIT_FAILURE;

//
// Version 1
//
printf("// Version 1\n");
char *path_v1 = strdup("/path_v1/to/something");

strdup is not standard C. But since this is the only non
standard part I'll give you my comments. You need
to check that strdup did not return NULL.
printf("path_v1 = %s\n", path_v1);
token = strtok(path_v1, unix_sep);

You have already assigned to token a value returned
by malloc. Now , without having used this value in
any manner , you simply discard it and you assign to
token a new value. We can conclude then that the call
to malloc was redundant.
printf("token = %s\n", token);
while(token != NULL) {
token = strtok(NULL, unix_sep);
printf("token = %s\n", token);
}
free(path_v1);
path_v1 = NULL;

What's the reason for assigning NULL to
path_v1 ?
printf("\n");

Quicker to write putchar('\n')
//
// Version 2
//
printf("// Version 2\n");
char *path_v2 = malloc(sizeof (*path_v2) * 256);
if(path_v2 == NULL) return EXIT_FAILURE;
path_v2 = "/path_v2/to/something";

You probably think that this copies to the area assigned
to path_v2 by malloc the string "/path_v2/to/something".
It does not. What it does instead is create an area in memory,
possibly ***read only*** , which contains the string
"/path_v2/to/something" and assigns to path_v2 the beginning
of this area. In other words the value returned by malloc is
discarded.
printf("path_v2 = %s\n", path_v2);
token = strtok(path_v2, unix_sep);// <= broken HERE

Because path_v2 points to an area in memory which may
be read only.
printf("token = %s\n", token);
while(token != NULL) {
token = strtok(NULL, unix_sep);
printf("token = %s\n", token);

}
free(path_v2);
path_v2 = NULL;
free(token);
token = NULL;

The assignments to NULL are pointless.

return EXIT_SUCCESS;
}
------------------------------------------------------------------------

run output :

~/work/C/developpez/listes_chainees_double%> ./essai_token
// Version 1
path_v1 = /path_v1/to/something
token = path_v1
token = to
token = something
token = (null)

// Version 2
path_v2 = /path_v2/to/something
zsh: bus error ./essai_token

then, the broken line is "token = strtok(path_v2, unix_sep);"

no warnings nor errors at the compil time using the following gcc
options :

alias ycc='cc -W -Wall -Wextra -Wuninitialized -Wstrict-prototypes
-Wmissing-prototypes -pedantic -std=c99 -O2 -pipe -o '

pratically i make use only of the first version, however i'd like to
understand what's my mistake in the second one...


i know that working with strtok needs a writable mem zone, i assume
malloc does that ???

Yes , malloc does that but you're not using the
value returned by malloc.
 
B

Barry Schwarz

I'm still a neby in C, i do have two versions of about the same algo.
The first one works fine but de second. I don't see what's the prob with
the second.

Actually neither one works.
the purpose to cut out in sections a unix path.

the test source :
--- essai_token.c ------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main (void)
{
const char *unix_sep = "/";
char *token = malloc(sizeof (*token) * 256);
if(token == NULL) return EXIT_FAILURE;

//
// Version 1
//
printf("// Version 1\n");
char *path_v1 = strdup("/path_v1/to/something");

strdup is not a standard function. A typical implementation builds
the desired string in allocated memory and I assume this is what yours
does also.
printf("path_v1 = %s\n", path_v1);
token = strtok(path_v1, unix_sep);
printf("token = %s\n", token);
while(token != NULL) {
token = strtok(NULL, unix_sep);

After the last part of the path name has been processed, this call to
strtok will set token to NULL.
printf("token = %s\n", token);

At that point, this statement will invoke undefined behavior.
Unfortunately, your system did not catch that behavior.
}
free(path_v1);
path_v1 = NULL;
printf("\n");

//
// Version 2
//
printf("// Version 2\n");
char *path_v2 = malloc(sizeof (*path_v2) * 256);
if(path_v2 == NULL) return EXIT_FAILURE;
path_v2 = "/path_v2/to/something";

These three lines cause a memory leak. You allocate memory and save
the address in a pointer. You then change the pointer to point to a
string literal. You no longer have any pointer to the allocated
memory.

Did you think the last statement stores data in the allocated memory?
If that was your intent, you need to use strcpy.
printf("path_v2 = %s\n", path_v2);
token = strtok(path_v2, unix_sep);// <= broken HERE

Even though a string literal is not const, you are not allowed to
modify it. strtok modifies the string it is working on. Your code
invokes undefined behavior. Fortunately in this case, the undefined
behavior caused your program to do something that got your attention.
printf("token = %s\n", token);
while(token != NULL) {
token = strtok(NULL, unix_sep);
printf("token = %s\n", token);

}
free(path_v2);

Since path_v2 no longer points to allocated memory, this call to free
would also invoke undefined behavior.
path_v2 = NULL;
free(token);

You already know token is NULL (or you would still be in the while
loop). Calling free with an argument of NULL is equivalent to a time
consuming no-op.
token = NULL;

token cannot be made anymore NULL than it already is.
return EXIT_SUCCESS;
}
------------------------------------------------------------------------

run output :

~/work/C/developpez/listes_chainees_double%> ./essai_token
// Version 1
path_v1 = /path_v1/to/something
token = path_v1
token = to
token = something
token = (null)

This is an indication that something is wrong.
// Version 2
path_v2 = /path_v2/to/something
zsh: bus error ./essai_token

then, the broken line is "token = strtok(path_v2, unix_sep);"

no warnings nor errors at the compil time using the following gcc
options :

alias ycc='cc -W -Wall -Wextra -Wuninitialized -Wstrict-prototypes
-Wmissing-prototypes -pedantic -std=c99 -O2 -pipe -o '

pratically i make use only of the first version, however i'd like to
understand what's my mistake in the second one...

It is also broken.
i know that working with strtok needs a writable mem zone, i assume
malloc does that ???

Only if you then use the allocated area.
my setup :
Mac OS X 10.4.7
~/work/C/developpez/listes_chainees_double%> gcc --version
powerpc-apple-darwin8-gcc-4.0.1 (GCC) 4.0.1 (Apple Computer, Inc. build
5363)


Remove del for email
 
S

Spiros Bousbouras

After the last part of the path name has been processed, this call to
strtok will set token to NULL.


At that point, this statement will invoke undefined behavior.
Unfortunately, your system did not catch that behavior.

Actually it did catch it. See below.
 
?

=?ISO-8859-1?Q?Une_b=E9vue?=

Barry Schwarz said:
After the last part of the path name has been processed, this call to
strtok will set token to NULL.

it's exactly what i want otherwise the while(token != NULL) loop
wouldn't work ?
At that point, this statement will invoke undefined behavior.
Unfortunately, your system did not catch that behavior.

no the print-out in such a case gives :
"token = (null)"
the print-out is here only for try-out.
These three lines cause a memory leak. You allocate memory and save
the address in a pointer. You then change the pointer to point to a
string literal. You no longer have any pointer to the allocated
memory.


FINE, THANXS, i've understood where's my mistake ))
Did you think the last statement stores data in the allocated memory?
yes...

If that was your intent, you need to use strcpy.


Even though a string literal is not const, you are not allowed to
modify it. strtok modifies the string it is working on. Your code
invokes undefined behavior.

YES, i knew that point.
Fortunately in this case, the undefined
behavior caused your program to do something that got your attention.


Since path_v2 no longer points to allocated memory, this call to free
would also invoke undefined behavior.


You already know token is NULL (or you would still be in the while
loop). Calling free with an argument of NULL is equivalent to a time
consuming no-op.

i don't understand clearly here. did you mean, in order to free memory
allocation, it is sufficient to assign a pointer to NULL ?
token cannot be made anymore NULL than it already is.

ok, i didn't see that point ))
This is an indication that something is wrong.

i don't think so, it's the way to break the while loop isn't it ?
It is also broken.

because of token == NUL ???
Only if you then use the allocated area.

yes that's my *** BIGGEST *** mistake here, i think.
 
?

=?ISO-8859-1?Q?Une_b=E9vue?=

Spiros Bousbouras said:
token points to char so sizeof(*token) is guaranteed
to be 1.

yes that's true but i do have some plan to use UTF8|16 instead of ASCII
in the near future...
strdup is not standard C. But since this is the only non
standard part I'll give you my comments. You need
to check that strdup did not return NULL.

yes i don't have the habbit for that...
You have already assigned to token a value returned
by malloc. Now , without having used this value in
any manner , you simply discard it and you assign to
token a new value. We can conclude then that the call
to malloc was redundant.

OK )))
What's the reason for assigning NULL to
path_v1 ?

i've heard that' "the best" way to free memory ???
Quicker to write putchar('\n')

OK, i'll remember that too.
You probably think that this copies to the area assigned
to path_v2 by malloc the string "/path_v2/to/something".
It does not. What it does instead is create an area in memory,
possibly ***read only*** , which contains the string
"/path_v2/to/something" and assigns to path_v2 the beginning
of this area. In other words the value returned by malloc is
discarded.

that's by giggest mistake here )) (two times for token too)
Because path_v2 points to an area in memory which may
be read only.

fine thanks, i understood my mistakes, i'll rewrite that small test.
 
S

Spiros Bousbouras

Une said:
i've heard that' "the best" way to free memory ???

The best , by virtue of being the only ,
way to free memory is to use free().
Assigning any value to the pointer which
you used with free() isn't going to make the
memory any more free. Perhaps what you
heard is that if you assign NULL to the pointer
then you cannot accidentally try to write to
the memory which has been freed. But dereferencing
a NULL pointer is undefined behaviour anyway
so the trick doesn't give you much.
 
S

Spiros Bousbouras

Une said:
it's exactly what i want otherwise the while(token != NULL) loop
wouldn't work ?


no the print-out in such a case gives :
"token = (null)"

A message of "token = (null)" is consistent with
undefined behaviour. The problem is that anything
is consistent with undefined behaviour including
something that will cause damage or produce strange
bugs. In your platform it may be that passing a NULL
pointer is harmless but there is no guarantee that it
will be the same on some other system. So you should
modify your loop in such a way that printf does not get
called when token has value NULL. For example you
could write
while (token != NULL && printf("token = %s\n", token))
i don't understand clearly here. did you mean, in order to free memory
allocation, it is sufficient to assign a pointer to NULL ?

No , in order to free memory it is necessary and sufficient
to call free() with a pointer which was previously returned
by malloc() or one of the related functions. In your case
path_v2 no longer contains a value returned by malloc()
so calling free() with it invokes undefined behaviour.

Furthermore , free(NULL) does not do anything.
 
M

Mark McIntyre

On Sun, 10 Sep 2006 08:41:50 +0200, in comp.lang.c ,
char *path_v2 = malloc(sizeof (*path_v2) * 256);

here you allocate some memory to path_v2...
if(path_v2 == NULL) return EXIT_FAILURE;
path_v2 = "/path_v2/to/something";

....and here you leak that memory, and point path_v2 at some new
string.

Note: The = operator does NOT copy. It assigns values - in this case,
it assigns the value of path_p2 (which is a pointer) to be a pointer
to the string "/path_v2/to/something".
printf("path_v2 = %s\n", path_v2);
token = strtok(path_v2, unix_sep);// <= broken HERE

You can't do this. The string you pointed path_v2 at is a string
literal. This sort of string is non-editable, and strtok works by
altering its argument. So it can't work.

To fix this, copy the string properly.
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
 
C

CBFalconer

Barry said:
.... snip ...

strdup is not a standard function. A typical implementation builds
the desired string in allocated memory and I assume this is what
yours does also.

For the purposes of this operation, you can safely use the
following function, which is implemented in standard C.

/* ------- file toksplit.c ----------*/
#include "toksplit.h"

/* released to Public Domain, by C.B. Falconer.
Published 2006-02-20. Attribution appreciated.
Revised 2006-06-13
*/

const char *toksplit(const char *src, /* Source of tokens */
char tokchar, /* token delimiting char */
char *token, /* receiver of parsed token */
size_t lgh) /* length token can receive */
/* not including final '\0' */
{
if (src) {
while (' ' == *src) src++;

while (*src && (tokchar != *src)) {
if (lgh) {
*token++ = *src;
--lgh;
}
src++;
}
if (*src && (tokchar == *src)) src++;
}
*token = '\0';
return src;
} /* toksplit */

/* ------- file toksplit.h ----------*/
#ifndef H_toksplit_h
# define H_toksplit_h

# ifdef __cplusplus
extern "C" {
# endif

#include <stddef.h>

/* copy over the next token from an input string, after
skipping leading blanks (or other whitespace?). The
token is terminated by the first appearance of tokchar,
or by the end of the source string.

The caller must supply sufficient space in token to
receive any token, Otherwise tokens will be truncated.

Returns: a pointer past the terminating tokchar.

This will happily return an infinity of empty tokens if
called with src pointing to the end of a string. Tokens
will never include a copy of tokchar.

released to Public Domain, by C.B. Falconer.
Published 2006-02-20. Attribution appreciated.
*/

const char *toksplit(const char *src, /* Source of tokens */
char tokchar, /* token delimiting char */
char *token, /* receiver of parsed token */
size_t lgh); /* length token can receive */
/* not including final '\0' */

# ifdef __cplusplus
}
# endif
#endif
/* ------- end file toksplit.h ----------*/
 
S

Stephen Sprunk

Spiros Bousbouras said:
The best , by virtue of being the only , way to free memory is to use
free(). Assigning any value to the pointer which you used with free()
isn't going to make the memory any more free. Perhaps what you
heard is that if you assign NULL to the pointer then you cannot
accidentally try to write to the memory which has been freed. But
dereferencing a NULL pointer is undefined behaviour anyway
so the trick doesn't give you much.

It definitely helps when debugging*, and if it is indeed useless the
compiler will optimize away the dead code so it should never hurt. At
worst you've wasted a half-second of programmer time.

* NULL typically stands out in debugger/stack trace output, whereas the
value of a free()d pointer does not. Clearing a pointer you free()
helps makes bugs more obvious. Also, many systems will happily allow
you to dereference a pointer after free()ing it, hiding the location of
the bug, whereas dereferencing NULL is caught immediately by nearly all
modern systems.

S
 
?

=?ISO-8859-1?Q?Une_b=E9vue?=

Spiros Bousbouras said:
perhaps what you
heard is that if you assign NULL to the pointer
then you cannot accidentally try to write to
the memory which has been freed.
right

But dereferencing
a NULL pointer is undefined behaviour anyway
so the trick doesn't give you much.

then, you mean, i might use this pointer still again with unpredictable
results ?

(in case i don't test it to NULL)
 
?

=?ISO-8859-1?Q?Une_b=E9vue?=

Spiros Bousbouras said:
A message of "token = (null)" is consistent with
undefined behaviour. The problem is that anything
is consistent with undefined behaviour including
something that will cause damage or produce strange
bugs. In your platform it may be that passing a NULL
pointer is harmless but there is no guarantee that it
will be the same on some other system. So you should
modify your loop in such a way that printf does not get
called when token has value NULL. For example you
could write
while (token != NULL && printf("token = %s\n", token))

yes ok, i'll rewrite that even if it is only a small test piece the
print out would deseapeare in a "real" function)...
No , in order to free memory it is necessary and sufficient
to call free() with a pointer which was previously returned
by malloc() or one of the related functions. In your case
path_v2 no longer contains a value returned by malloc()
so calling free() with it invokes undefined behaviour.

Furthermore , free(NULL) does not do anything.

ok obvious !
 
?

=?ISO-8859-1?Q?Une_b=E9vue?=

Mark McIntyre said:
here you allocate some memory to path_v2...


...and here you leak that memory, and point path_v2 at some new
string.

Note: The = operator does NOT copy. It assigns values - in this case,
it assigns the value of path_p2 (which is a pointer) to be a pointer
to the string "/path_v2/to/something".

yes, right, i don't must forget that !!!
You can't do this. The string you pointed path_v2 at is a string
literal. This sort of string is non-editable, and strtok works by
altering its argument. So it can't work.

To fix this, copy the string properly.
allready done, thanks !
 
K

Keith Thompson

Mark McIntyre said:
On Sun, 10 Sep 2006 08:41:50 +0200, in comp.lang.c ,


here you allocate some memory to path_v2...


...and here you leak that memory, and point path_v2 at some new
string.

Note: The = operator does NOT copy. It assigns values - in this case,
it assigns the value of path_p2 (which is a pointer) to be a pointer
to the string "/path_v2/to/something".

Yes, the "=" operator *does* copy. (In this case, what it copies is a
pointer value, not what it points to.)
 
B

Barry Schwarz

It definitely helps when debugging*, and if it is indeed useless the
compiler will optimize away the dead code so it should never hurt. At
worst you've wasted a half-second of programmer time.

Adding **unnecessary** code in the belief that the compiler will
optimize it away does not qualify as a good technique. Depending on
how often the product undergoes update/maintenance and the quantity of
unnecessary code, it can become a significant cost factor.
* NULL typically stands out in debugger/stack trace output, whereas the
value of a free()d pointer does not. Clearing a pointer you free()
helps makes bugs more obvious. Also, many systems will happily allow

While I don't usually do this, I would have no objection if it were a
documented standard applied consistently.
you to dereference a pointer after free()ing it, hiding the location of
the bug, whereas dereferencing NULL is caught immediately by nearly all
modern systems.

But not on this poster's system, as indicated by the illegal call to
printf.


Remove del for email
 
B

Barry Schwarz

Please don't snip attributions. I didn't write this code, you did. I
just commented on it.


Remove del for email
 
B

Barry Schwarz

then, you mean, i might use this pointer still again with unpredictable
results ?

(in case i don't test it to NULL)

Any attempt to dereference a pointer which a) has not been
initialized, b) has been set to NULL, or c) points to an area which
has been deallocated by free() or realloc(), invokes undefined
behavior. There are other cases also, such as pointing beyond the end
of an array.

There used to be a statement in the standard that the value of a
pointer passed to free() would be indeterminate when free() returns.
In that case, even evaluating the pointer (such as comparing it to
NULL) would also invoke undefined behavior. However, I cannot find
that statement in my May 6, 2006 copy of N1124 (C99) so I don't know
if that is still part of the language definition. On the other hand,
there is a statement that you cannot use (evaluate) this value in
Appendix J.2 but that appendix is marked informative, not normative.


Remove del for email
 
K

Keith Thompson

Barry Schwarz said:
Any attempt to dereference a pointer which a) has not been
initialized, b) has been set to NULL, or c) points to an area which
has been deallocated by free() or realloc(), invokes undefined
behavior. There are other cases also, such as pointing beyond the end
of an array.

There used to be a statement in the standard that the value of a
pointer passed to free() would be indeterminate when free() returns.
In that case, even evaluating the pointer (such as comparing it to
NULL) would also invoke undefined behavior. However, I cannot find
that statement in my May 6, 2006 copy of N1124 (C99) so I don't know
if that is still part of the language definition. On the other hand,
there is a statement that you cannot use (evaluate) this value in
Appendix J.2 but that appendix is marked informative, not normative.

It's still there.

N1124 6.2.4p2:

The value of a pointer becomes indeterminate when the object it
points to reaches the end of its lifetime.

(This is unchanged from C99.)
 
M

Mark McIntyre

Yes, the "=" operator *does* copy.

Ok, I'll bite. For a weak definition of copy, which doen't necessarily
include the result being identical to the original.
(In this case, what it copies is a pointer value, not what it points to.)

Bear in mind that it does not necessarily create a copy of the value,
since the value is type-converted en route if required. (6.5.16.1 p2).

(I admit I left the word "strings" out after the word "copy", but then
strings don't exist in C so it seemed reasonable not to leave it
in...)
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,582
Members
45,065
Latest member
OrderGreenAcreCBD

Latest Threads

Top