int * vs char *


yugandhar

Hi Forum,

I have a quick question.

The following works:

char *s="hello";
*s="world";

but the following gives a segmentation fault:

int *p = 0;
*p = 17;

Could anyone please clarify this for me?

yugandhar
 

John Gordon

yugandhar said:
Hi Forum,
I have a quick question.
The following works:
char *s="hello";
*s="world";

It shouldn't work. A character pointer declared in that way should be
non-writable. It's an accident that it "worked" (did the wrong thing)
for you.
but the following gives a segmentation fault:
int *p = 0;
*p = 17;
Could anyone please clarify me about this?

Setting a pointer equal to zero is a special case.

Zero is another way of saying NULL. So when you declare a pointer that
is equal to zero, it's really a NULL pointer. It doesn't point anywhere.

So, when you say "Insert 17 at the location pointed to by this pointer",
it has no place to put the value 17, since you told it to point nowhere.
 

Ian Collins

Hi Forum,

I have a quick question.

The following works:

char *s="hello";
*s="world";

It is certainly not guaranteed to work. It would barf on any of the
compilers I use. These all place string literals in a read-only segment.
but the following gives a segmentation fault:

int *p = 0;
*p = 17;

Could anyone please clarify me about this?

You are invoking the demons of undefined behaviour.
 

Ike Naar

The following works:

In what way do you think it works?
char *s="hello";
*s="world";

You're assigning a pointer-to-char to a char.
That doesn't work. Your compiler should have issued a diagnostic.
but the following gives a segmentation fault:

int *p = 0;
*p = 17;

p is a null pointer; you're trying to dereference it; don't do that.
Could anyone please clarify me about this?

Sorry.
 

Keith Thompson

yugandhar said:
I have a quick question.

The following works:

char *s="hello";
*s="world";

but the following gives a segmentation fault:

int *p = 0;
*p = 17;

Could anyone please clarify me about this?

It's already been explained why "*p = 17;" blows up, but I don't believe
that the first block of code is what you actually compiled.

s is of type char*, so *s is of type char. You're attempting to assign
a string literal (which will decay to char*) to a char object. Early
pre-ANSI C compilers might have accepted that without complaint,
implicitly treating the pointer value as an integer and then narrowing
it to one byte, but in modern C it's a constraint violation, and any
compiler that doesn't at least warn you about it is broken.

Even if you corrected the type mismatch by writing:

*s = 'w';

your program's behavior would be undefined. s points to a string
literal (more precisely, it points to the first element of the static
array associated with the string literal), and any attempt to modify a
string literal has undefined behavior.

I can't be sure what your actual code looks like, but the most plausible
variant I can think of is:

char *s = "hello";
s = "world";

That's perfectly legal; it stores the address of the string "world" in
the variable s.
 

Shao Miller

Hi Forum,

I have a quick question.

The following works:

char *s="hello";
*s="world";

The type of 's' is 'char *'.

The type of '*s' is 'char'.

The type of '"world"' is not 'char'.

So the last line up above is erroneous.
but the following gives a segmentation fault:

int *p = 0;
*p = 17;

The type of 'p' is 'int *'.

The type of '*p' is 'int'.

Since 'p' is a null pointer, use of '*p' is undefined behaviour. That
explains your segmentation fault.
Could anyone please clarify me about this?

Maybe.

int my_int = 13;
^^^--type

int my_int = 13;
    ^^^^^^--object identifier

int * foo, * bar;
^^^-^--type

int * foo, * bar;
      ^^^--object identifier

int * foo, * bar;
^^^--------^--type

int * foo, * bar;
             ^^^--object identifier

Also:

int my_int = 13;

is similar to:

int my_int;
my_int = 13;

A different example:

int * ip = &my_int;

is similar to:

int * ip;
ip = &my_int;

and not:

int * ip;
*ip = &my_int;
 

Morris Keesan

It shouldn't work. A character pointer declared in that way should be
non-writable. It's an accident that it "worked" (did the wrong thing)
for you.

Not the declaration, but the initialization (making s point to a string
literal). But whether the pointed-to char is writable is undefined:
"should be non-writable" is a bit of an overstatement. But it should be
treated as unwritable by the programmer, even if not by an implementation.
 

Eric Sosman

Hi Forum,

I have a quick question.

The following works:

char *s="hello";
*s="world";

Get a new compiler. Or, possibly, post your question to a forum
devoted to the language you're using: It's not C. (Or, possibly,
post the actual code you're asking about rather than a paraphrase.)
but the following gives a segmentation fault:

int *p = 0;
*p = 17;

In C, if you're using C, the first line declares a pointer-to-int
and initializes it with the value usually known as NULL, the "pointer
to nowhere." The second line attempts to store 17 at the location
"nowhere," and things go haywire. (Technically, "the behavior is
undefined" -- "haywire" is a reasonable shorthand.)
 

Stephen Sprunk

The following works:

char *s="hello";
*s="world";

It does?

% cat foo.c
#include <stdio.h>
int main(void) {
char *s="hello";
*s="world";
puts(s);
}
% gcc foo.c
foo.c: In function ‘main’:
foo.c:4: warning: assignment makes integer from pointer without a cast
% ./a.out
Segmentation fault

What is the shortest _compilable_ program that "works" on your
implementation? Does your compiler generate any warning messages that
indicate undefined behavior, as mine does?
but the following gives a segmentation fault:

int *p = 0;
*p = 17;

You're dereferencing a null pointer here, which is also undefined
behavior. However, my compiler isn't smart enough to notice that
problem; it just crashes when executed--for a somewhat different reason
than the former example.

S
 

Noob

Keith said:
Even if you corrected the type mismatch by writing:

char *s = "hello";
*s = 'w';

your program's behavior would be undefined. s points to a string
literal (more precisely, it points to the first element of the static
array associated with the string literal), and any attempt to modify a
string literal has undefined behavior.

Since modifying a character string literal already has UB in
the current standard, then why doesn't the next standard
specify that string literal have type const char[] instead
of just char[] ?

Regards.
 

Shao Miller

Keith said:
Even if you corrected the type mismatch by writing:

char *s = "hello";
*s = 'w';

your program's behavior would be undefined. s points to a string
literal (more precisely, it points to the first element of the static
array associated with the string literal), and any attempt to modify a
string literal has undefined behavior.

Since modifying a character string literal already has UB in
the current standard, then why doesn't the next standard
specify that string literal have type const char[] instead
of just char[] ?

Undefined behaviour implies "non-portable," in my view. But why remove
an implementation's freedom to define string literals' storage as writable?
 

James Kuyper

On 06/22/2011 04:28 AM, Noob wrote:
....
Since modifying a character string literal already has UB in
the current standard, then why doesn't the next standard
specify that string literal have type const char[] instead
of just char[] ?

It would break a large amount of existing code. The most common problem
would be code which assigns the pointer value of a string literal to a
char* object. You could argue that this is bad practice - if a pointer
might be pointing at a string literal, it should be declared 'const
char*', rather than 'char*'. However, there's no actual problem with
such code so long as no attempt is made to write through that pointer
value, and there's an awful lot of existing code which relies upon that
fact.
 

Stephen Sprunk

Keith said:
Even if you corrected the type mismatch by writing:

char *s = "hello";
*s = 'w';

your program's behavior would be undefined. s points to a string
literal (more precisely, it points to the first element of the static
array associated with the string literal), and any attempt to modify a
string literal has undefined behavior.

Since modifying a character string literal already has UB in
the current standard, then why doesn't the next standard
specify that string literal have type const char[] instead
of just char[] ?

That has been proposed numerous times over the years, but it is always
shot down due to widespread fears of "const poisoning". Frankly, I
think more use of const in C would be a good thing and eliminate many
lurking bugs, but to date ISO is unwilling to "break" code that assumes
string literals are not const but doesn't actually write to them.

S
 

Kleuskes & Moos

Get a new compiler. Or, possibly, post your question to a forum
devoted to the language you're using: It's not C. (Or, possibly,
post the actual code you're asking about rather than a paraphrase.)

In C, if you're using C, the first line declares a pointer-to-int
and initializes it with the value usually known as NULL, the "pointer
to nowhere." The second line attempts to store 17 at the location
"nowhere," and things go haywire. (Technically, "the behavior is
undefined" -- "haywire" is a reasonable shorthand.)

As an addendum: if you write through a pointer, you promise the
computer (so to speak) that there exists a valid object at that
address. If it does not, all bets are off.

That is to say:

int *p = 42;
*p = 17;

Would have the same problem.
 

Ian Collins

On 06/22/2011 04:28 AM, Noob wrote:
...
Since modifying a character string literal already has UB in
the current standard, then why doesn't the next standard
specify that string literal have type const char[] instead
of just char[] ?

It would break a large amount of existing code. The most common problem
would be code which assigns the pointer value of a string literal to a
char* object. You could argue that this is bad practice - if a pointer
might be pointing at a string literal, it should be declared 'const
char*', rather than 'char*'. However, there's no actual problem with
such code so long as no attempt is made to write through that pointer
value, and there's an awful lot of existing code which relies upon that
fact.

The existing code problem was acknowledged when C++ changed the type
of string literals. Compilers may choose not to issue a diagnostic for
this case. Now we have had over a decade to fix the smelly code, I
believe a diagnostic is now required by the new C++ standard.

C could and should have done the same, but as usual those worried about
breaking already broken code appear to have won the day.
 

Keith Thompson

Kleuskes & Moos said:
As an addendum: if you write through a pointer, you promise the
computer (so to speak) that there exists a valid object at that
address. If it does not, all bets are off.

That is to say:

int *p = 42;
*p = 17;

Would have the same problem.

Not exactly. "int *p = 42;" has the problem that 42 is not implicitly
convertible to int*, so it's a constraint violation. "int *p = 0;" is a
special case, because 0 is a null pointer constant.
 

Kleuskes & Moos

Not exactly. "int *p = 42;" has the problem that 42 is not implicitly
convertible to int*, so it's a constraint violation. "int *p = 0;" is a
special case, because 0 is a null pointer constant.

You're right. There should have been a cast in there.
 

Ian Collins

On 06/22/2011 04:28 AM, Noob wrote:
...
Since modifying a character string literal already has UB in
the current standard, then why doesn't the next standard
specify that string literal have type const char[] instead
of just char[] ?

It would break a large amount of existing code. The most common problem
would be code which assigns the pointer value of a string literal to a
char* object. You could argue that this is bad practice - if a pointer
might be pointing at a string literal, it should be declared 'const
char*', rather than 'char*'. However, there's no actual problem with
such code so long as no attempt is made to write through that pointer
value, and there's an awful lot of existing code which relies upon that
fact.

The existing code problem was acknowledged when C++ changed the type
of string literals. Compilers may choose not to issue a diagnostic for
this case. Now we have had over a decade to fix the smelly code, I
believe a diagnostic is now required by the new C++ standard.

C could and should have done the same, but as usual those worried about
breaking already broken code appear to have won the day.

Again, why should a C implementation be rendered non-conforming [to some
future Standard] thusly?

Because in most cases such code is an accident waiting to happen. It
was amusing to see how much cruft was removed from the OpenSolaris code
base when the default action of the native C compiler changed to use
read only literals! Although it's a specific example, it does
illustrate a more general point - assuming writeable literals is
non-portable. All those code changes were also required to get the code
to compile with gcc.
In a "bare metal" environment, one might very well wish to overwrite
their string literals' storage, no? The "bare metal" implementation
might need to define such action as being appropriate.

Most "bare metal" environments I have used place literals in a read-only
segment; RAM is usually a more precious (and expensive) resource than
ROM/FLASH on embedded systems.
By using "proper" static arrays, we lose out on the "shared storage"
benefit. Writing for bare metal, hopefully one knows what one is doing.

Embedded tools usually have a rich set of pragmas and link options to
specify where various types of object live. It's nigh on impossible to
write a pure standard C embedded application.
Is this example silly?

No, but it is easily worked around; tool sets with a C++ compiler already
have to.
 

Shao Miller

On 06/22/2011 04:28 AM, Noob wrote:
...
Since modifying a character string literal already has UB in
the current standard, then why doesn't the next standard
specify that string literal have type const char[] instead
of just char[] ?

It would break a large amount of existing code. The most common problem
would be code which assigns the pointer value of a string literal to a
char* object. You could argue that this is bad practice - if a pointer
might be pointing at a string literal, it should be declared 'const
char*', rather than 'char*'. However, there's no actual problem with
such code so long as no attempt is made to write through that pointer
value, and there's an awful lot of existing code which relies upon that
fact.

The existing code problem was acknowledged when C++ changed the type
of string literals. Compilers may choose not to issue a diagnostic for
this case. Now we have had over a decade to fix the smelly code, I
believe a diagnostic is now required by the new C++ standard.

C could and should have done the same, but as usual those worried about
breaking already broken code appear to have won the day.

Again, why should a C implementation be rendered non-conforming [to some
future Standard] thusly?

In a "bare metal" environment, one might very well wish to overwrite
their string literals' storage, no? The "bare metal" implementation
might need to define such action as being appropriate.

By using "proper" static arrays, we lose out on the "shared storage"
benefit. Writing for bare metal, hopefully one knows what one is doing.

Is this example silly?
 

Keith Thompson

Shao Miller said:
On 06/22/11 10:25 PM, James Kuyper wrote: [...]
The existing code problem was acknowledged when C++ changed the type
of string literals. Compilers may choose not to issue a diagnostic for
this case. Now we have had over a decade to fix the smelly code, I
believe a diagnostic is now required by the new C++ standard.

C could and should have done the same, but as usual those worried about
breaking already broken code appear to have won the day.

Again, why should a C implementation be rendered non-conforming [to some
future Standard] thusly?

In a "bare metal" environment, one might very well wish to overwrite
their string literals' storage, no? The "bare metal" implementation
might need to define such action as being appropriate.

By using "proper" static arrays, we lose out on the "shared storage"
benefit. Writing for bare metal, hopefully one knows what one is doing.

Is this example silly?

If there's a need for a "bare metal" environment to be able to modify
string literals, that can be provided as an extension. Any code that
currently takes advantage of that ability already has undefined
behavior.

If such a feature were desirable, we could have an optional 'M' (for
modifiable) prefix for string literals, similar to the existing 'L'
prefix for wide string literals. For example, "hello" could be of type
const char[6], and M"hello" could be of type char[6].

The fact that I've never heard of anyone implementing something like
this suggests (though not strongly) that there's no demand for it.

I suspect that the vast majority of code that attempts to modify
string literals does so as the result of bugs. A lot more code
uses string literals in contexts that don't treat them as const,
but doesn't actually try to modify them; for example:

void func(char *s) {
    /* pass s so the %s conversion has a matching argument */
    printf("In func(), s = \"%s\"\n", s);
}

...

func("hello");

In short, I think the issue is not that anyone wants to modify
string literals; it's that making them const would break existing
code that *doesn't* actually modify string literals.

(Stroustrup was able to do this in C++ because there was no existing
C++ code before he invented the language.)
 
