Read-only functionality without 'const'

karthikbalaguru

Hi,

While trying to understand the difference between the following two
methods, I have some questions.
Method 1) char *s = "Hello";
and
Method 2) char s[] = "Hello";

How does the string "Hello" in the first method lie in read-only memory
and the string "Hello" in the second method lie in modifiable memory?
Only 'const' provides read-only functionality in C. How does
"char *s" provide that functionality? What actually happens internally
to make this work?

The following is the snapshot of the info that has prompted me to
raise this query :-
In any context, char *s = "Hello"; just means that the pointer s is
assigned the address of the string literal "Hello". Normally, that
string literal will reside in read-only memory which means that it's
not legal to do:
char *s = "Hello"; s[1] = 'a';

while it's perfectly legal to do
char s[] = "Hello"; s[1] = 'a';

Thx in advans,
Karthik Balaguru
 
Richard Bos

karthikbalaguru said:
While trying to understand the difference between the following 2
methods, i have some interesting queries.
Method 1) char *s = "Hello";
and
Method 2) char s[] = "Hello";

How does the string 'hello' in first method lie in read-only memory

Who says it does? It _may_, because it's a string literal, and string
literals are allowed to be non-modifiable. To be exact, modifying the
contents of the array that is created to hold the string literal has
undefined behaviour, which means that anything may happen, from the
modification succeeding, through it being ignored, crashing the program,
and if the implementation is being intentionally malevolent, anything up
to sending lurid emails in your name to CERT-In.
and the string 'hello' in second method lie in a modifiable memory ?

Because it's a normal object, which is initialised from a string
literal; but once initialised, it is a normal, modifiable array object.
Only 'const' provides the 'Read-only' functionality in C .
Wrong.

How come this "char *s" provides that functionality ?

It doesn't; the fact that it points at a string literal does. Had you
pointed the pointer at a char array _object_ rather than at a string
literal, you could have modified it.
What is the internal of this functionality actually ?

That sentence no meaning.

Richard
 
Chris Dollin

karthikbalaguru said:
While trying to understand the difference between the following 2
methods, i have some interesting queries.
Method 1) char *s = "Hello";
and
Method 2) char s[] = "Hello";

How does the string 'hello' in first method lie in read-only memory

It /may/ lie in read-only memory. It's not /required/ to.
and the string 'hello' in second method lie in a modifiable memory ?

The standard says it's undefined behaviour to write into a string
literal. So an implementation can put string literals in read-only
memory because, if anyone writes into them, the standard says All
Bets Are Off.
Only 'const' provides the 'Read-only' functionality in C .

Not really true.
How come this "char *s" provides that functionality ?

It doesn't. It's not to do with `char *` ness. It's to do with
"Hello" being a string literal. You're not allowed to modify
the contents thereof.

In the second initialisation `char s[] = "Hello";`, the /content/
of that literal is copied into the store allocated for `s`. (And
that content doesn't change, since if you tried, you'd get UB.)
Since you don't get access to the insides of the literal, whether
or not it happened to be read-only doesn't matter. Indeed, the
implementation might not need to keep the literal around at all.

[It might, as an implementation-specific example, implement

char s[] = "f";

inside a function as

mov r0, #'f'
str r0, [fp], #offsetOf(s)

so that the content of the literal string ends up encoded in
the immediate value of the `mov` instruction.]
 
pete

karthikbalaguru said:
Hi,

While trying to understand the difference between the following 2
methods, i have some interesting queries.
Method 1) char *s = "Hello";
and
Method 2) char s[] = "Hello";

How does the string 'hello' in first method lie in read-only memory

What should the following code do?

char *s = "Hello";

*s = '\0';
puts("Hello");

Even if the meaning of that code wasn't explicitly
made undefined by the C standard,
it is obviously ambiguous code.
 
karthikbalaguru

Not really true.

Sorry, I should have put that more clearly.
For a true compile-time constant, we need to use '#define' or perhaps
'enum'.
For run-time constants, we need to use 'const', which says that the value
cannot be changed.
I think the above is true now.
Is there something else that makes it false?
 
Chris Dollin

karthikbalaguru said:
Sorry, i should have used that in a clear fashion.
For true compile-time constant , we need to use '#define' or perhaps
'enum'.
For run time constants, we need to use 'const' telling that it can not
be manipulated/changed.
I think, the above is true now.
Is there something else that makes it false ?

String literals are effectively read-only. Hence, it is not true that
only 'const' provides read-only functionality in C.
 
Richard

pete said:
karthikbalaguru said:
Hi,

While trying to understand the difference between the following 2
methods, i have some interesting queries.
Method 1) char *s = "Hello";
and
Method 2) char s[] = "Hello";

How does the string 'hello' in first method lie in read-only memory

What should the following code do?

char *s = "Hello";

*s = '\0';
puts("Hello");

Even if the meaning of that code wasn't explicitly
made undefined by the C standard,
it is obviously ambiguous code.

I don't think it is obvious at all.
 
Jens Thoms Toerring

Richard said:
pete said:
karthikbalaguru said:
While trying to understand the difference between the following 2
methods, i have some interesting queries.
Method 1) char *s = "Hello";
and
Method 2) char s[] = "Hello";

How does the string 'hello' in first method lie in read-only memory

What should the following code do?

char *s = "Hello";

*s = '\0';
puts("Hello");

Even if the meaning of that code wasn't explicitly
made undefined by the C standard,
it is obviously ambiguous code.
I don't think it is obvious at all.

I guess Pete meant (but didn't explicitly write) that the compiler
is free to use the same memory location for identical string literals
wherever they appear. In this case the "Hello" from the initialization of
the pointer could be at the same address as the "Hello" used as the
argument to puts(). So if one managed to change the memory where "Hello"
resides, then what puts() prints would depend on whether the compiler
did this space optimization or not.

Regards, Jens
 
Richard

Richard said:
pete said:
karthikbalaguru wrote:
While trying to understand the difference between the following 2
methods, i have some interesting queries.
Method 1) char *s = "Hello";
and
Method 2) char s[] = "Hello";

How does the string 'hello' in first method lie in read-only memory

What should the following code do?

char *s = "Hello";

*s = '\0';
puts("Hello");

Even if the meaning of that code wasn't explicitly
made undefined by the C standard,
it is obviously ambiguous code.
I don't think it is obvious at all.

I guess Pete meant (but didn't explicitly write) that the compiler
is free to use the same memory location for identical string literals
wherever they appear. In this case the "Hello" from the initialization of
the pointer could be at the same address as the "Hello" used as the
argument to puts(). So if one managed to change the memory where "Hello"
resides, then what puts() prints would depend on whether the compiler
did this space optimization or not.

No, I know how it works (or could work). I am just saying that it is not
obvious for a new C programmer.
 
Keith Thompson

karthikbalaguru said:
Sorry, i should have used that in a clear fashion.
For true compile-time constant , we need to use '#define' or perhaps
'enum'.
For run time constants, we need to use 'const' telling that it can not
be manipulated/changed.
I think, the above is true now.
Is there something else that makes it false ?

Remember that "const" doesn't really mean "constant"; it merely means
"read-only". For example, 42 is a constant, but:
const int x = 42;
x is not a constant; it's merely read-only. (But the compiler can
choose to store x in read-only memory, or not store it at all if its
address is never used; nevertheless, x can't be used where a constant
expression is required.)

Attempting to *directly* modify something declared as "const" is
illegal (a constraint violation, requiring a diagnostic):
x = 43; /* ILLEGAL */

If you attempt to *indirectly* modify something declared as "const",
the compiler isn't required to complain, but the behavior is
undefined:
int *ptr = (int*)&x; /* the cast discards the "const"; bad idea */
*ptr = 43; /* UNDEFINED BEHAVIOR */

A string literal is not "const" (for historical reasons) but any
attempt to modify the contents of a string literal also invokes
undefined behavior. This isn't because it's const (it isn't); it's
because the standard explicitly says that it's undefined behavior.
The language would be a bit cleaner if string literals actually were
"const", but that would have broken old code written before "const"
was introduced to the language.

If you write:
char *s = "hello"; /* DANGEROUS */
you're treading on dangerous ground. The initialization is legal,
because the string literal isn't const, but it means that the compiler
isn't required to complain if you try to modify *s or s[0]. Doing so
might happen to work, or it might blow up in your face. To be safe,
*pretend* that string literals are really const:
const char *s = "hello"; /* BETTER */
 
Richard

pete said:
Then what do you think that code looks like it's supposed to do?

It's not about what I think. It is obvious to anyone with any experience
of giving training what the nOOb might assume from the code above. What
is "obvious" to you is not necessarily obvious to others.
 
Old Wolf

While trying to understand the difference between the following 2
methods, i have some interesting queries.
Method 1) char *s = "Hello";
and
Method 2) char s[] = "Hello";

How does the string 'hello' in first method lie in read-only memory
and the string 'hello' in second method lie in a modifiable memory ?

The "Hello" is in the same place in both cases. In the
first case, 's' points directly to "Hello". But in the
second case, 's' allocates new, writable memory and
copies "Hello" into it.
 
pete

Richard said:
It's not about what I think. It is obvious to anyone with any training
experience (giving) what the nOOb might assume from the code above. What
is "obvious" to you is not necessarily obvious to others.

I think that it might either output "Hello" followed by a newline,
or just output a newline.
What would a n00b expect it to do?
 
Richard

pete said:
I think that it might either output "Hello" followed by a newline,
or just output a newline.
What would a n00b expect it to do?

I think we are going off track.

IMO it is NOT "obviously ambiguous" code for a nOOb. Possibly we are at
cross purposes about what we mean by that.

I would think that a nOOb who had just learnt about null-terminated
strings would think nothing was output at all, never mind a newline. I
certainly don't think it is obvious to a nOOb that writing to the memory
that s points to is a no-no.
 
karthikbalaguru

Thx. That sounds interesting.
Other queries pop up in my mind :( :( :( :(
Remember that "const" doesn't really mean "constant"; it merely means
"read-only". For example, 42 is a constant, but:
const int x = 42;
x is not a constant; it's merely read-only. (But the compiler can
choose to store x in read-only memory, or not store it at all if its
address is never used; nevertheless, x can't be used where a constant
expression is required.)

Attempting to *directly* modify something declared as "const" is
illegal (a constraint violation, requiring a diagnostic):
x = 43; /* ILLEGAL */

If you attempt to *indirectly* modify something declared as "const",
the compiler isn't required to complain, but the behavior is
undefined:
int *ptr = (int*)&x; /* the cast discards the "const"; bad idea */
*ptr = 43; /* UNDEFINED BEHAVIOR */


Why does C support such undefined behaviour? It should immediately
raise an error at compile time.
Is there any good use of the indirect method above?
A string literal is not "const" (for historical reasons) but any
attempt to modify the contents of a string literal also invokes
undefined behavior. This isn't because it's const (it isn't); it's
because the standard explicitly says that it's undefined behavior.
The language would be a bit cleaner if string literals actually were
"const", but that would have broken old code written before "const"
was introduced to the language.

If you write:
char *s = "hello"; /* DANGEROUS */
you're treading on dangerous ground. The initialization is legal,
because the string literal isn't const, but it means that the compiler
isn't required to complain if you try to modify *s or s[0]. Doing so
might happen to work, or it might blow up in your face. To be safe,
*pretend* that string literals are really const:
const char *s = "hello"; /* BETTER */

Why does C support such dangerous methods?
Why is it that it "might happen to work" or "might blow up"?

It should either work correctly or not be supported at all.
Are there really any reasons for such methods to exist and
continue to exist?
Is there any use for them?

Thx in advans,
- Karthik Balaguru
 
Chris Dollin

karthikbalaguru said:
Thx. That sounds interesting.
Other queries pop up in my mind :( :( :( :(



Why does C support such undefined behaviour? It should immediately
raise an error at compile time.

In general, it cannot tell. And the C philosophy is to assume that
the programmer knows what they are doing, in the interests of
having pretty minimal implementations.
Why does C support such dangerous methods ?

To allow implementations flexibility and to support legacy code.
Why does it 'Might Happen to Work" Or " It might blow up " ?

Because it might happen to work, or it might blow up.
It should either work correctly or should not be supported at all.

It turns out that that's too expensive a choice: the alternative
"we don't define this, it's up to the implementation" is more
effective (/for C/).
Is there really any other reasons for such methods to exist and
continue to exist ?
Is there any use for them?

Yes. It allows more implementations to fit the standard, and allows
informed programmers to make appropriate choices. C isn't an intrinsically
safe language; it's not meant to be. There are plenty of safer languages,
if one wants to / can / prefers to / needs to use them.
 
karthikbalaguru

String literals are effectively read-only. Hence, it is not true that

Thx.
Consider the following : -
const int *rvar;

Where is rvar stored in memory?
What happens internally to make it read-only?
How is this done?

Normally, I find that there are linker scripts / LCFs (linker command
files) that manage the different segments of memory
by making them read-only, RW, or RWX. But that is done by the
linker, and it concerns the external memory organisation, which is known
to the programmer.
In fact, it is done by the programmer.

How is that done in C?
Is there some kind of internal linker that makes rvar read-only
based on the const qualifier?
If so, where does it put that variable in memory?

Thx in advans,
Karthik Balaguru
 
Richard Tobin

karthikbalaguru said:
Consider the following : -
const int *rvar;

Where is rvar stored in the memory??

rvar points to a const int, so there would be no reason for rvar
itself to be stored any differently than other variables. rvar
itself is not constant. So let's assume you're asking about the
int (or ints) that rvar points to.
What happens internal so that its a read only.

Nothing. Your declaration is a promise that you won't modify the data
through rvar. For example, you won't do *rvar=1 or rvar[2]=3. The
compiler can check this in many cases at compile time. However, there
may be other pointers that point to the same data as rvar without the
const qualifier, so it is perfectly possible for the data to change.

If you had written "const int i = 1", then the compiler could indeed
put the value in read-only memory, because there is no defined way to
modify it.

-- Richard
 
karthikbalaguru

rvar points to a const int, so there would be no reason for rvar
itself to be stored any differently than other variables. rvar
itself is not constant. So let's assume you're asking about the
int (or ints) that rvar points to.

Sorry, it should have been the contents pointed to by rvar,
that is, the const int that rvar points to.
Thx. Your assumption about my question is correct.
What happens internal so that its a read only.

Nothing. Your declaration is a promise that you won't modify the data
through rvar. For example, you won't do *rvar=1 or rvar[2]=3. The
compiler can check this in many cases at compile time. However, there
may be other pointers that point to the same data as rvar without the
const qualifier, so it is perfectly possible for the data to change.

That's very interesting.
So the data is read-only only with respect to rvar, that is, only when
it is accessed through that particular pointer.
The data it points to can still be changed by any means other than
rvar.
Thx.
If you had written "const int i = 1", then the compiler could indeed
put the value in read-only memory, because there is no defined way to
modify it.

My query revolves around this point.
What does read-only memory mean here?
What does it refer to? Is it a portion of the stack or the heap, or
something else?
Where is it?

Thx in advans,
Karthik Balaguru
 
