Allocating memory for strings

Win Sock

Hi All,
somebody told me this morning that the following is leagal.

char *a = "Hello wrold";

The memory is automatically allocated on the fly. Is this correct?
 
santosh

Win said:
Hi All,
somebody told me this morning that the following is leagal.

char *a = "Hello wrold";

The memory is automatically allocated on the fly. Is this correct?

The storage for the string is set aside during translation and the
pointer 'a' is set to point to its beginning.

If the pointer is not static and no other pointers point to the string,
then the string becomes irretrievable when 'a' goes out of scope.
 
christian.bau

Hi All,
somebody told me this morning that the following is leagal.

char *a = "Hello wrold";

The memory is automatically allocated on the fly. Is this correct?

No.

A string literal like "Hello wrold" works exactly as if you had a
static array of const char, and got a pointer to that array, cast to
char* instead of const char*. So

char *a = "Hello wrold";

works exactly the same as

const char secret_array [] = "Hello wrold";
char *a = (char *) secret_array;
 
Joe Wright

Win said:
Hi All,
somebody told me this morning that the following is leagal.

char *a = "Hello wrold";

The memory is automatically allocated on the fly. Is this correct?

Not in the sense of malloc() and friends. Calling free(a) is undefined
behavior. The string literal "Hello wrold" is placed somewhere in memory
as an anonymous array of char, the address of which is placed in a.

Spelling? legal and world.
 
Ben Pfaff

christian.bau said:
char *a = "Hello wrold";

works exactly the same as

const char secret_array [] = "Hello wrold";
char *a = (char *) secret_array;

If it's outside any function, yes; otherwise, secret_array must
be declared static.
 
Keith Thompson

christian.bau said:
somebody told me this morning that the following is leagal.

char *a = "Hello wrold";

The memory is automatically allocated on the fly. Is this correct?

No.

A string literal like "Hello wrold" works exactly as if you had a
static array of const char, and got a pointer to that array, cast to
char* instead of const char*. So

char *a = "Hello wrold";

works exactly the same as

const char secret_array [] = "Hello wrold";
char *a = (char *) secret_array;

Except that string literals aren't const. (Attempting to modify a
string literal invokes undefined behavior, but only because the
standard explicitly says so.) It would be better if string literals
*were* const, but that would have broken existing code back in 1989
when the ANSI standard first introduced the "const" keyword.
 
Tor Rustad

Win said:
Hi All,
somebody told me this morning that the following is leagal.

char *a = "Hello wrold";

The memory is automatically allocated on the fly. Is this correct?

String literals may be placed in read-only memory, and it's undefined
behavior (UB) altering what *a points to. Hence, you should rather use:

const char *a = "Hello wrold";

Otherwise the compiler might not catch this error:

$ cat -n main.c
     1  #include <stdio.h>
     2
     3
     4  int main(void)
     5  {
     6      char *a = "Hello";
     7      const char *b = "Hello";
     8
     9      printf("%s %s\n", a, b);
    10
    11      a[0]='\0';
    12      b[0]='\0';
    13
    14      return 0;
    15  }

$ gcc -ansi -pedantic -W -Wall main.c
main.c: In function ‘main’:
main.c:12: error: assignment of read-only location

Above, there was no warning about the UB at line 11!

--
Tor <torust [at] online [dot] no>

"There are two ways of constructing a software design. One way is to
make it so simple that there are obviously no deficiencies. And the
other way is to make it so complicated that there are no obvious
deficiencies"
 
santosh

Tor said:
[...]

above, there was no warning about the UB at line 11!

Interestingly if the const qualifier is removed, compilation succeeds
under gcc, but the executable terminates with a segmentation fault.
This indicates that gcc places the strings in read-only storage.

On the other hand under the lcc-linux32 compiler nothing unexpected
happens. Apparently string literals are _not_ placed into read-only
storage by lcc-linux32.
 
Ben Pfaff

Keith Thompson said:
christian.bau said:
char *a = "Hello wrold";

works exactly the same as

const char secret_array [] = "Hello wrold";
char *a = (char *) secret_array;

Except that string literals aren't const. (Attempting to modify a
string literal invokes undefined behavior, but only because the
standard explicitly says so.)

I think that's why Christian included the cast to char *. With
the cast, the effect is the same.
 
Tor Rustad

santosh wrote:

[...]
Interestingly if the const qualifier is removed, compilation succeeds
under gcc, but the executable terminates with a segmentation fault.
This indicates that gcc places the strings in read-only storage.

Yes.

On the other hand under the lcc-linux32 compiler nothing unexpected
happens. Apparently string literals are _not_ placed into read-only
storage by lcc-linux32.

Did you get a warning with lcc-linux32? I don't have the lcc-linux32
compiler, but it could use the same storage location for those two
string literals, which is why I purposely used "Hello" for both.

So something "unexpected" could still happen.

Note that splint issues two warnings for the sample code I posted.

santosh

Tor said:
[...]

Actually, for the sake of brevity I omitted to mention that the test
program was not the one you provided, but a similar one I wrote. Below
is its source:

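#include <stdio.h>

int main(void)
{
    char *a = "Hello ";
    char *b = "world!\n";

    printf("%s%s", a, b);
    *a = *b;        /* undefined behaviour: writes to a string literal */
    *b = *(a+2);    /* undefined behaviour: writes to a string literal */
    printf("%s%s", a, b);
    return 0;
}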

$ gcc -Wall -Wextra -ansi -pedantic -o t11_gcc t11.c
$ ./t11_gcc
Hello world!
Segmentation fault
$

$ lcc -ansic t11.c
$ gcc -o t11_lcc t11.o
[This is needed because lcc-linux32 does not yet do linking]
$ ./t11_lcc
Hello world!
wello lorld!
$

This seems to indicate that gcc places the string literals in read-only
storage while lcc-linux32 doesn't. Both are of course perfectly
conforming behaviour and the difference in behaviour is merely a QoI
issue.
 
Tor Rustad

santosh said:
[...]

Could you try, e.g., this:

char *a = "Hello";
char *b = "Hello";

and check whether lcc uses the same storage location for both?

santosh

Tor said:
[...]

Could you try e.g. this:

char *a = "Hello";
char *b = "Hello";

and check whether lcc uses the same storage location for both?

With your changes to the above program, and an additional line of the
form:

printf("a = %p\tb = %p\n", (void *)a, (void *)b);

I get for gcc:

$ ./t11_gcc
a = 0x80484e4 b = 0x80484e4
Segmentation fault
$

and for lcc:

$ ./t11_lcc
a = 0x80495dc b = 0x80495dc
HelloHellolellolello
$

So the same storage location is being used for both strings by both
compilers with the difference that gcc seems to be placing them in
read-only storage while lcc-linux32 places them in modifiable storage.

Of course since the code invokes undefined behaviour any result
is "correct."
 
Tor Rustad

santosh said:
[...]

Thanks santosh, I think this last example program illustrates quite well
to the OP the dangers of modifying string literals. The "unexpected" can
happen, including with the current version of the lcc compiler.

santosh said:
Of course since the code invokes undefined behaviour any result
is "correct."

Let us just call the result undefined, like 0/0 is in mathematics. :)

Charles Richmond

santosh said:
The storage for the string is set aside during translation and the
pointer 'a' is set to point to its beginning.

If the pointer is not static and no other pointers point to the string,
then the string becomes irretrievable when 'a' goes out of scope.

If the pointer a goes out of scope or is set to a different
value, the string may *not* be irretrievable. Some compilers
store only *one* copy of each string literal and use that
one copy everywhere the literal appears in the source.

So it is possible that this literal may be accessed in other
ways than through pointer a.
 
¬a\\/b

On Sun, 07 Oct 2007 03:14:25 +0530, santosh wrote:
[...]

With your changes to the above program, and an additional line of the
form:

printf("a = %p\tb = %p\n", (void *)a, (void *)b);

I get for gcc:

$ ./t11_gcc
a = 0x80484e4 b = 0x80484e4
Segmentation fault
$

and for lcc:

$ ./t11_lcc
a = 0x80495dc b = 0x80495dc
HelloHellolellolello
$

How smart...
and all this to save 2 unsigneds in memory ("Hello", 0)
 
Army1987

Except that string literals aren't const. (Attempting to modify a
string literal invokes undefined behavior, but only because the
standard explicitly says so.) It would be better if string literals
*were* const, but that would have broken existing code back in 1989
when the ANSI standard first introduced the "const" keyword.

They could have made string literals const while allowing implicit
conversion of const char * to char *, and made the return type of
strchr, strstr etc. const char *, as in C++.
 
Jack Klein

They could have made string literals const while allowing implicit
conversion of const char * to char *, and made the return type of
strchr, strstr etc. const char *, as in C++.

What does that fix? What errors does it prevent that are not
prevented by making it undefined behavior to attempt to modify string
literals?

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://c-faq.com/
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.club.cc.cmu.edu/~ajo/docs/FAQ-acllc.html
 
Army1987

What does that fix? What errors does it prevent that are not
prevented by making it undefined behavior to attempt to modify string
literals?

I don't like the idea of having something I can't write to while
its type isn't const. Consider that in C99
(const char []){ 'a', 'b', 0 } and "ab" are allowed to be the same
object. This doesn't sound like a logical choice to me.
Also, if it is true that I am allowed to modify argv and
argv[i][j] but not necessarily argv[i] (an erratum to K&R2 says "It
isn't forbidden, but it isn't allowed either"), I'd prefer it to
be declared as char *const *argv.
 
karthikbalaguru

santosh said:
[...]
Of course since the code invokes undefined behaviour any result
is "correct."

Some compilers have a switch controlling whether string literals are
writable or not (for compiling old code), and some may have options to
cause string literals to be formally treated as arrays of const char
(for better error catching).

Early C did not have the 'const' keyword, so if you wanted to pass a
string literal to a function (even one that does not modify the
string), that function had to take a 'char *' argument. That's all.

Karthik Balaguru
 