library implementation strncpy() modifies string literal, legal? more doubts need more suggestion?

lovecreatesbeauty

1. In the following code, is line 11 legal? Does the documentation tell
callers that the parameter s1 must receive an array, i.e. something of
type char[], rather than a variable of type char *? p1 and p2 point to
the same kind of thing, yet they must be declared with different types.
Is that natural?

char p1[] = "hello123456";
char *p2 = "world";
strncpy(p1, p2, strlen(p2));

And is the lack of the const keyword in the declaration of the
parameter s1 (line 2) an indication that s1 expects an array argument?


2. The temporary variable os1 (line 4) is used only once, as the return
value (line 12). Would it be better to use os1 instead of s1 throughout
the function, apart from the declaration (line 4)? Or is it also a good
idea not to introduce the temporary os1 at all? Which style do you prefer?

/* from:
   http://cvs.opensolaris.org/source/xref/on/usr/src/common/util/string.c */
char *
strncpy(char *s1, const char *s2, size_t n)    /* line 2 */
{
    char *os1 = s1;                            /* line 4 */

    n++;
    while (--n != 0 && (*s1++ = *s2++) != '\0')
        ;
    if (n != 0)
        while (--n != 0)
            *s1++ = '\0';                      /* line 11 */
    return (os1);                              /* line 12 */
}
 
Peter Nilsson

lovecreatesbeauty said:
Subject: library implementation strncpy() modifies string literal, legal?

It's perfectly legal for an implementation to do whatever it wants, so
long as it fulfills the required semantics of the C standard [assuming
the implementation claims conformance.]
more doubts need more suggestion?

If it hasn't been suggested to you before, then I'll suggest it to you
now:

Read the FAQ.

http://c-faq.com/

e.g...

http://c-faq.com/decl/strlitinit.html

The strncpy function is allowed to modify the string pointed to by the
first argument because that is what it is _supposed_ to do.

String literals have const[] type, but are non-modifiable (for
hysterical reasons, as the saying goes...)

<snip>
 
Richard Heathfield

lovecreatesbeauty said:
1. In the following code, is the code (line 11) legal? Is there a
notice in the document to tell callers that the parameter s1 should
receive an array variable, i.e. type char[], but not a variable of char
*? p1 and p2 point to the same things but they must be declared as
different types? Is it nature?

char p1[] = "hello123456";
char *p2 = "world";
strncpy(p1, p2, strlen(p2));

Legal but often unwise. Note that p1[] now has the contents "world123456",
not just "world".
And is the lack of the const keyword in the declaration of the
parameter s1 (line 2) an indication that s1 receives an argument of
array?

No. It means that the function requires to be able to write to the n
characters starting at the address indicated by s1, so you'd better make
sure it can.
2. The temporary variable os1 (line 4) is used only once as the return
value (line 12). Is it better to use os1 instead of s1 throughout the
function except the declaration (line 4). Or just don't introduce the
temporary os1, is it also a good idea? Which style do you prefer?

The Standard, rightly or wrongly, requires strncpy to return the pointer
value it receives in the first parameter. It also needs to move a pointer
along from that point if it is to achieve its goal. So either way, a copy
of that value has to be made. Which pointer is used for moving along the
array and which is used for storing the original value doesn't make an
ounce of difference.

You are given a photocopy which you can scribble on if you wish. You need to
scribble on it, but you also need to keep hold of an unscribbled copy. So
you photocopy the photocopy. This being programming, the copy is perfect.
You now have two copies - one for best, and one for scribbling on. Does it
matter which you use for which usage? Of course not. They are identical for
any practical purpose.
 
lovecreatesbeauty

Peter said:
lovecreatesbeauty said:
Subject: library implementation strncpy() modifies string literal, legal?

It's perfectly legal for an implementation to do whatever it wants, so
long as it fulfills the required semantics of the C standard [assuming
the implementation claims conformance.]

I'm not doubting the modification itself, but rather its
implementation or interface. You could share your expertise and be
more patient with others.
String literals have const[] type, but are non-modifiable (for
hysterical reasons, as the saying goes...)

Don't you think the following 2 lines are modifying a string literal?

char *s = "hysterical";
s[1] = 'H';
 
lovecreatesbeauty

Richard said:
No. It means that the function requires to be able to write to the n
characters starting at the address indicated by s1, so you'd better make
sure it can.

But the compiler can not prevent callers from doing this:

char *p1 = "hello123456"; /*char p1[] = "hello123456";*/
char *p2 = "world";
strncpy(p1, p2, strlen(p2));

The language gives no guarantee that this bad behavior is detected at
compile time; in effect, the illegal thing is allowed by the language
itself. All that is left is undefined behavior at run time.
The Standard, rightly or wrongly, requires strncpy to return the pointer
value it receives in the first parameter. It also needs to move a pointer
along from that point if it is to achieve its goal. So either way, a copy
of that value has to be made. Which pointer is used for moving along the
array and which is used for storing the original value doesn't make an
ounce of difference.

Thanks, Richard. What you said is clear and helpful to me.
You are given a photocopy which you can scribble on if you wish. You need to
scribble on it, but you also need to keep hold of an unscribbled copy. So
you photocopy the photocopy. This being programming, the copy is perfect.
You now have two copies - one for best, and one for scribbling on. Does it
matter which you use for which usage? Of course not. They are identical for
any practical purpose.

I don't quite understand this. Could you explain it in more detail?
 
Richard Heathfield

lovecreatesbeauty said:
Richard said:
No. It means that the function requires to be able to write to the n
characters starting at the address indicated by s1, so you'd better make
sure it can.

But the compiler can not prevent callers from doing this:

char *p1 = "hello123456"; /*char p1[] = "hello123456";*/
char *p2 = "world";
strncpy(p1, p2, strlen(p2));

The compiler is not forbidden from preventing callers from doing that, but
in general it would, I think, be fairly hard to detect.
It's no guaranty in the language to detect the bad behavior at
compiling time, or the illegal thing is allowed in the language itself.

No, it's not *allowed*. It's just not *punished*...
Just a runtime undefined error left.

...until runtime, maybe.
 
Ian Collins

Richard said:
lovecreatesbeauty said:

Richard said:
No. It means that the function requires to be able to write to the n
characters starting at the address indicated by s1, so you'd better make
sure it can.

But the compiler can not prevent callers from doing this:

char *p1 = "hello123456"; /*char p1[] = "hello123456";*/
char *p2 = "world";
strncpy(p1, p2, strlen(p2));


The compiler is not forbidden from preventing callers from doing that, but
in general it would, I think, be fairly hard to detect.
It shouldn't be; C++ compilers tend to emit a nasty warning if you
assign a string literal to a char*. One of the things I find strange
about the C standard is that it does not require a diagnostic here.
 
Richard Heathfield

Ian Collins said:
Richard said:
lovecreatesbeauty said:

Richard Heathfield wrote:

No. It means that the function requires to be able to write to the n
characters starting at the address indicated by s1, so you'd better make
sure it can.

But the compiler can not prevent callers from doing this:

char *p1 = "hello123456"; /*char p1[] = "hello123456";*/
char *p2 = "world";
strncpy(p1, p2, strlen(p2));


The compiler is not forbidden from preventing callers from doing that,
but in general it would, I think, be fairly hard to detect.
It shouldn't be, C++ compilers tend to emit a nasty warning if you
assign a string literal to a char*.

Oh, that's easy enough to detect. But because it's legal in C, that's why it
becomes hard to detect when that privilege is abused.
One of the things I find strange
about the C standard is this not requiring a diagnostic.

Alas, it's one of those stupid "mustn't break already-broken code" things,
which is also why we still have stupid stupid gets().
 
CBFalconer

Richard said:
Ian Collins said:
Richard said:
lovecreatesbeauty said:
Richard Heathfield wrote:

No. It means that the function requires to be able to write to
the n characters starting at the address indicated by s1, so
you'd better make sure it can.

But the compiler can not prevent callers from doing this:

char *p1 = "hello123456"; /*char p1[] = "hello123456";*/
char *p2 = "world";
strncpy(p1, p2, strlen(p2));

The compiler is not forbidden from preventing callers from doing
that, but in general it would, I think, be fairly hard to detect.
It shouldn't be, C++ compilers tend to emit a nasty warning
if you assign a string literal to a char*.

Oh, that's easy enough to detect. But because it's legal in C,
that's why it becomes hard to detect when that privilege is
abused.
One of the things I find strange
about the C standard is this not requiring a diagnostic.

Alas, it's one of those stupid "mustn't break already-broken
code" things, which is also why we still have stupid stupid gets().

Historically, it wasn't broken code. It was just systems that
stored literals where they could be modified, and then programs
that did just that. Never mind that it drove the maintainers nuts.

It still can work, as shown by the following baby program on _some_
systems:

[1] c:\c\junk>cat junk.c
#include <stdio.h>

int main(void)
{
    char *junk = "Original";

    puts(junk);
    junk[0] = 'o';
    puts(junk);
    return 0;
}

[1] c:\c\junk>gcc -W -Wall -ansi -pedantic junk.c

[1] c:\c\junk>.\a
Original
original

The way to catch it is to add "-Wwrite-strings" to the gcc call,
although the resulting warning can be confusing.

--
Some informative links:
http://www.geocities.com/nnqweb/
http://www.catb.org/~esr/faqs/smart-questions.html
http://www.caliburn.nl/topposting.html
http://www.netmeister.org/news/learn2quote.html
 
lovecreatesbeauty

CBFalconer said:
lovecreatesbeauty said:
char *s = "hysterical";
s[1] = 'H';
They are causing Undefined Behaviour.

So does strncpy in the following code:

char *p1 = "hello123456"; /*char p1[] = "hello123456";*/
char *p2 = "world";
strncpy(p1, p2, strlen(p2));

Sometimes a call to strncpy will be given char *p1 = "hello123456";
as the first argument, and undefined behavior occurs inside the
library function strncpy. In those cases, are programmers considered
unqualified because of the carelessness of both themselves and the
language?

Thank you for the "-Wwrite-strings" tip in the next post.
 
Jack Klein

Peter said:
lovecreatesbeauty said:
Subject: library implementation strncpy() modifies string literal, legal?

It's perfectly legal for an implementation to do whatever it wants, so
long as it fulfills the required semantics of the C standard [assuming
the implementation claims conformance.]

I'm not doubting the modification itself, but rather its
implementation or interface. You could share your expertise and be
more patient with others.
String literals have const[] type, but are non-modifiable (for
hysterical reasons, as the saying goes...)

Don't you think the following 2 lines are modifying a string literal?

char *s = "hysterical";
s[1] = 'H';

No, the two lines above are "attempting to modify" a string literal.
There is a big difference between "modifying" and "attempting to
modify".

Attempting to modify a string literal produces undefined behavior.
Whether or not it actually manages the task of "modifying" the string
literal is a matter about which the C standard says nothing.

On some implementations that I know of, the string literal would
actually be modified. On others, the write would have no effect. On
still others, a hardware trap would cause the underlying operating
system to terminate the program immediately.
 
Keith Thompson

Peter Nilsson said:
String literals have const[] type,

No, string literals do *not* have const[] type.
but are non-modifiable (for hysterical reasons, as the saying
goes...)

More precisely, attempting to modify a string literal invokes
undefined behavior (the implementation is not required to diagnose the
error).
 
Richard Bos

CBFalconer said:
Richard said:
Ian Collins said:

Oh, that's easy enough to detect. But because it's legal in C,
that's why it becomes hard to detect when that privilege is
abused.


Alas, it's one of those stupid "mustn't break already-broken
code" things, which is also why we still have stupid stupid gets().

Historically, it wasn't broken code. It was just systems that
stored literals where they could be modified, and then programs
that did just that. Never mind that it drove the maintainers nuts.

It still can work, as shown by the following baby program on _some_
systems:

[1] c:\c\junk>cat junk.c
#include <stdio.h>

int main(void)
{
    char *junk = "Original";

    puts(junk);
    junk[0] = 'o';
    puts(junk);
    return 0;
}

Moreover, this program:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    char *lang_names[]={"Nederlands", "English", "Italiano"};
    char *lang_name;
    unsigned int language;

    if (argc>1)
        language=strtoul(argv[1], 0, 0);
    else
        language=0;

    if (language>=sizeof lang_names/sizeof *lang_names)
        lang_name="(Invalid)";
    else
        lang_name=lang_names[language];

    printf("You chose language %s\n", lang_name);

    return 0;
}

is, as it stands, valid ISO C, and this principle is on occasion quite
useful. Forbidding the assignment to char pointers of string literals or
pointers to string literals would break this quite reasonable code.

Richard
 
CBFalconer

Richard said:
CBFalconer said:
Richard said:
Ian Collins said:

It shouldn't be, C++ compilers tend to emit a nasty warning
if you assign a string literal to a char*.

Oh, that's easy enough to detect. But because it's legal in C,
that's why it becomes hard to detect when that privilege is
abused.

One of the things I find strange
about the C standard is this not requiring a diagnostic.

Alas, it's one of those stupid "mustn't break already-broken
code" things, which is also why we still have stupid stupid gets().

Historically, it wasn't broken code. It was just systems that
stored literals where they could be modified, and then programs
that did just that. Never mind that it drove the maintainers nuts.

It still can work, as shown by the following baby program on _some_
systems:

[1] c:\c\junk>cat junk.c
#include <stdio.h>

int main(void)
{
    char *junk = "Original";

    puts(junk);
    junk[0] = 'o';
    puts(junk);
    return 0;
}

Moreover, this program:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    char *lang_names[]={"Nederlands", "English", "Italiano"};
    char *lang_name;
    unsigned int language;

    if (argc>1)
        language=strtoul(argv[1], 0, 0);
    else
        language=0;

    if (language>=sizeof lang_names/sizeof *lang_names)
        lang_name="(Invalid)";
    else
        lang_name=lang_names[language];

    printf("You chose language %s\n", lang_name);

    return 0;
}

is, as it stands, valid ISO C, and this principle is on occasion
quite useful. Forbidding the assignment to char pointers of
string literals or pointers to string literals would break this
quite reasonable code.

I don't think anyone is going to quibble that your demo is
legitimate, and should never cause a problem. Unlike mine, which
causes undefined behaviour, but will quite likely 'work' on many
systems. My point was that legacy code often contains such
operations (which were legal at the time of writing), so the use of
-Wwrite-strings or the equivalent needs to be optional.
 
Richard Heathfield

Richard Bos said:
Moreover, this program:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    char *lang_names[]={"Nederlands", "English", "Italiano"};
    char *lang_name;
    unsigned int language;

    if (argc>1)
        language=strtoul(argv[1], 0, 0);
    else
        language=0;

    if (language>=sizeof lang_names/sizeof *lang_names)
        lang_name="(Invalid)";
    else
        lang_name=lang_names[language];

    printf("You chose language %s\n", lang_name);

    return 0;
}

is, as it stands, valid ISO C, and this principle is on occasion quite
useful. Forbidding the assignment to char pointers of string literals or
pointers to string literals would break this quite reasonable code.

True enough. Nevertheless, changing your code to:

const char *lang_names[]={"Nederlands", "English", "Italiano"};
const char *lang_name;

retains the semantics of the original code, and reduces the number of gcc
warnings I get from 4 to 0.
 
Keith Thompson

CBFalconer said:
I don't think anyone is going to quibble that your demo is
legitimate, and should never cause a problem. Unlike mine, which
causes undefined behaviour, but will quite likely 'work' on many
systems. My point was that legacy code often contains such
operations (which were legal at the time of writing), so the use of
-Wwrite-strings or the equivalent needs to be optional.

Code that "threatens" to write to a string literal (by assigning its
address to a non-const char*) is legal. gcc's -Wwrite-strings option
triggers warnings for such code, and I agree that it should be
optional.

Code that actually writes to a string literal invokes undefined
behavior. Compilers are under no obligation to warn about this, but
they're also under no obligation to enable it to work. (gcc had an
option to make string literals writable, but I believe it's been
removed from more recent versions, and I have no problem with that.)
 
