Writable strings

Jason Curl

Hello C coders, et al.,

Please have a look at the snippet of code below.

01 int main(void)
02 {
03 char mystr[] = "Hello world\n";
04 char *pmystr2 = "My String\n";
05
06 mystr[0] = 0;
07 pmystr2[0] = 0;
08
09 return 0;
10 }

Is the second assignment with 'pmystr2' allowed at line 7? There should
be no problem with "Hello world\n" as it is assigned to a char array on
line 3, but the second instance might be writing to a section of memory
that is marked read only, in which case the problem originates at line 4.

Or, would the compiler be required to take the constant string "My
String" on line 4, copy it to somewhere it is writable, and then assign
pmystr2 to the address in the new section?

Any insights would help.

Jason.
 
Lawrence Kirby

Jason Curl said:
Is the second assignment with 'pmystr2' allowed at line 7? There should
be no problem with "Hello world\n" as it is assigned to a char array on
line 3, but the second instance might be writing to a section of memory
that is marked read only, in which case the problem originates at line 4.

Modifying a string literal results in undefined behaviour, so as you say
line 7 is invalid.

Jason Curl said:
Or, would the compiler be required to take the constant string "My
String" on line 4, copy it to somewhere it is writable, and then assign
pmystr2 to the address in the new section?

No, it can't do that because line 7 is not allowed to change the value of
the pointer pmystr2.

Lawrence
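
For readers following along, here is a minimal sketch of how the snippet could be reworked so that both writes are well defined. Since the compiler will not silently copy the literal, the copy is made explicitly. The function name `clear_both_strings` and the array name `buffer` are hypothetical and are not part of the original post.

```c
#include <string.h>

/* Hypothetical rework of the snippet: both strings live in writable
   arrays, so both assignments are well defined. Returns the combined
   length of the two strings after the writes. */
size_t clear_both_strings(void)
{
    char mystr[] = "Hello world\n";  /* array initialised with a copy of the literal */
    char buffer[] = "My String\n";   /* explicit writable copy, made by the programmer */
    char *pmystr2 = buffer;          /* points at the array, not at the literal */

    mystr[0] = 0;    /* fine, as in the original line 6 */
    pmystr2[0] = 0;  /* fine here: modifies buffer, not a string literal */

    return strlen(mystr) + strlen(pmystr2);
}
```

Both strings are empty after the writes, so the function returns 0. The pointer still behaves as in the original code; only the storage it points at has changed.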
 
Omri Barel

Lawrence Kirby said:
Modifying a string literal results in undefined behaviour, so as you say
line 7 is invalid.

Undefined behaviour is not necessarily invalid. It may not be portable,
but if you know what your compiler is doing (and hopefully it's well
documented), then there's nothing inherently wrong with this code. For
example, if you have a very tight memory budget, and you know for a fact
that the compiler generates code that works for line 7 (and you're aware
of the fact that it's not portable, but even if you're not aware), then
there's no problem with line 7.


From the rationale:

Undefined behavior gives the implementor license not to catch certain
program errors that are difficult to diagnose. It also identifies areas
of possible conforming language extension: the implementor may augment
the language by providing a definition of the officially undefined
behavior.


If the implementor decided to allow line 7, then there's nothing wrong
with that.
 
Malcolm

Omri Barel said:
Undefined behavior gives the implementor license not to catch certain
program errors that are difficult to diagnose. It also identifies areas
of possible conforming language extension: the implementor may augment the
language by providing a definition of the officially undefined behavior.

Implementation-defined behaviour is provided for that purpose. Of course
there's nothing to prevent a compiler vendor from slightly altering the C
language to make programs which would have been illegal legal in what is now
technically a new version.
 
Jack Klein

Malcolm said:
Implementation-defined behaviour is provided for that purpose. Of course
there's nothing to prevent a compiler vendor from slightly altering the C
language to make programs which would have been illegal legal in what is now
technically a new version.

No, the quote from the rationale that Omri Barel posted is absolutely
not the same thing as implementation-defined behavior.

Implementation-defined behavior is itself a very specifically defined
concept, with a precise definition in the C standard:

"implementation-defined behavior
unspecified behavior where each implementation documents how the
choice is made"

The only place where an implementation produces implementation-defined
behavior is in places where the standard states that there is
implementation-defined behavior. Examples are whether plain char is
equivalent to signed or unsigned char, whether bit-fields of type
"int" are signed or unsigned, and whether right shifting signed
integer types with negative values sign extends or not.

This has nothing at all to do with an implementation defining what
happens for specific cases of undefined behavior. A typical example
is the practice, common on 2's complement platforms, of stating that
signed integer overflow results in truncating the true result to the
number of bits in the destination type. The fact that the
implementation documents what the underlying hardware does in this
case of undefined behavior does not make it implementation-defined.
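
As an illustration of two of the implementation-defined choices listed above, the following sketch asks the implementation what it chose. The function names are hypothetical; `CHAR_MIN` comes from the standard header `<limits.h>`.

```c
#include <limits.h>

/* Returns 1 if plain char has the same range as signed char on this
   implementation, 0 if it has the range of unsigned char. Either
   answer is conforming; the implementation must document its choice. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Right-shifts a negative value by one bit. The resulting value is
   implementation-defined: commonly -1 when the sign bit is copied in,
   or a large positive value when a zero bit is shifted in. */
int shift_negative_right(void)
{
    return -1 >> 1;
}
```

Neither function invokes undefined behavior; the results simply differ between implementations, which is the distinction being drawn in this post.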
 
Rajan

Hi Omri,
I saw the code snippet. Your assignment char* mypstr = "My string"
is as good as const char*, and the string is in a read-only data location,
so it cannot be changed. The only difference is that when you declare it as
const char* and compile, you do get a warning, whereas when you use
char* you don't get any warning.
 
pete

Rajan said:
Hi Omri,
I saw the code snippet. Your assignment char* mypstr = "My string"
is as good as const char*, and the string is in a read-only data location,

It may or may not be in read only memory.
It doesn't have to be.
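
Whether or not the literal ends up in read-only memory, pointing at it through `const char *` lets the compiler catch accidental writes. The sketch below assumes nothing beyond standard C; the function names `count_char` and `count_in_literal` are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: counts occurrences of c in s without modifying s. */
size_t count_char(const char *s, char c)
{
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (*s == c)
            ++n;
    }
    return n;
}

size_t count_in_literal(void)
{
    /* Pointing at the literal through const char * makes any later
       write through mypstr a constraint violation, which the compiler
       is required to diagnose. */
    const char *mypstr = "My string";

    /* mypstr[0] = 0;    <- would be diagnosed: assignment to a const object */
    return count_char(mypstr, 'M');
}
```

With plain `char *` the same write typically compiles without complaint and is undefined at run time, which matches the difference in warnings described in the thread.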
 
Lawrence Kirby

Lawrence Kirby said:
Modifying a string literal results in undefined behaviour, so as you say
line 7 is invalid.

Omri Barel said:
Undefined behaviour is not necessarily invalid.

It is invalid in standard C, i.e. the topic of this newsgroup.

Omri Barel said:
It may not be portable,
but if you know what your compiler is doing (and hopefully it's well
documented), then there's nothing inherently wrong with this code.

Except that it is not valid C. It might be valid in some other language
which is very similar to C but that isn't particularly helpful. An
important reason for having a standardised language is to be able to write
code that will work with a C compiler, without having to know all of the
little wrinkles of the particular compiler in question.

Omri Barel said:
For example, if you have a very tight memory budget, and you know for a fact
that the compiler generates code that works for line 7 (and you're aware
of the fact that it's not portable, but even if you're not aware), then
there's no problem with line 7.

If you have a tight memory budget you use an alternative method that
supports writable data, such as a statically declared array. It is better to
find a language feature that supports what you want than to misuse one that
doesn't. That isn't always possible but it is surprising how often it is.

Omri Barel said:
From the rationale:

Undefined behavior gives the implementor license not to catch certain
program errors that are difficult to diagnose. It also identifies areas
of possible conforming language extension: the implementor may augment
the language by providing a definition of the officially undefined
behavior.

If the implementor decided to allow line 7, then there's nothing wrong
with that.

You just need to understand that when you do so the C language doesn't
provide guarantees about what your code will do, IOW the language you are
programming in is no longer C. A LOT of code has to use extensions
to the language, e.g. anything that handles sockets explicitly. There's
nothing wrong with that. But there is value in writing code that is well
defined in standard C. Besides portability considerations, such code can be
more readable. For example, code that writes to string literals looks
suspicious, whereas code that uses standard techniques such as static arrays
of char is just more obviously correct to anybody reading the code.

Lawrence
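
The static-array technique recommended above might look like the following sketch. The names `message`, `strip_newline` and `current_message` are hypothetical, chosen only for illustration.

```c
#include <string.h>

/* A single static array, initialised from the literal and writable
   afterwards. No string literal is ever modified. */
static char message[] = "My String\n";

/* Hypothetical helper: removes a trailing newline from message in
   place and returns the resulting length. */
size_t strip_newline(void)
{
    size_t len = strlen(message);

    if (len > 0 && message[len - 1] == '\n')
        message[--len] = '\0';
    return len;
}

/* Read-only access for callers that only need to look at the text. */
const char *current_message(void)
{
    return message;
}
```

Calling `strip_newline()` leaves `message` holding "My String" and returns 9; calling it again changes nothing. The array is the only object the program writes to, so the behaviour is defined on every conforming implementation.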
 
Rajan

Do you mean you can change the char value of the string, like say char*
a = "My string";
Can you say *a='a'; This will certainly not work, at least it does not on
Solaris.
 
pete

Rajan said:
Do you mean you can change the char value of the string, like say char*
a = "My string";

No.
I'm saying that the object referred to by
the string literal "My string"
may or may not reside in read-only memory
according to the rules of C.

Rajan said:
Can you say *a='a';

That's undefined simply because the standard says that any attempt
to modify the contents of the object referred to by a string literal
is undefined.

Rajan said:
This will certainly not work, at least it does not on
Solaris.

The rules say that that code is undefined.

This newsgroup focuses on what's true for all conforming
implementations of C.
If something is guaranteed to work by the C standard,
then it doesn't matter what kind of conforming system you have.

If something isn't guaranteed to be true by the standard,
then whether or not it's true on your implementation,
is off topic for the clc newsgroup.
 
Default User

Rajan said:
Do you mean you can change the char value of the string, like say char*
a = "My string";
Can you say *a='a'; This will certainly not work, at least it does not on
Solaris.

Please quote a relevant portion of the previous message when replying.
To do so from the Google interface, don't use the Reply at the bottom
of the message. Instead, click "show options" and use the Reply shown
in the expanded headers.


Brian
 
