char s[] = "This string literal";
or
char *s= "This string literal";
Both define a string literal. ...
Actually, neither defines a string literal. Even more precisely,
string literals *cannot* be defined. String literals are, instead,
merely source-code constructs, much like tokens and character-constants
(remember that the latter are simply ways to write constants of
type "int", in C).
Both are supposed to be read-only and not to be modified, according to
the Standard.
With one or two exceptions, a string literal produces an anonymous
array of type "array N of char" (where N is one more than the number
of chars in the quotes, after escape-sequence interpretation and
string-literal concatenation and so forth) that is in principle
read-only, and thus the programmer should not attempt to modify
it, yes.
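As a rough check of that size rule, sizeof can be applied directly to a string literal, since sizeof is one of the contexts in which the array does not turn into a pointer. This is only a sketch; the function name check_literal_sizes is made up for illustration:

```c
#include <assert.h>

/* Hypothetical helper: each assert states the size of the array that
   the string literal stands for, counting the terminating '\0'. */
void check_literal_sizes(void)
{
    assert(sizeof "four" == 5);      /* 4 chars plus '\0' */
    assert(sizeof "a\tb" == 4);      /* the escape \t is a single char */
    assert(sizeof "ab" "cd" == 5);   /* adjacent literals are joined first */
    assert(sizeof "" == 1);          /* just the '\0' */
}
```

Calling check_literal_sizes() from main() should return quietly if the counts are as described.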
And both have type of "const char *". Right?
No.
Again, the array the compiler whisks up, on seeing a string literal
that is not one of these special exceptions, has type "array N of
char", not "array N of const char". And -- per The Rule about
arrays and pointers in C -- an object of type "array N of T" often
becomes a value of type "pointer to T". In this case, that would
be a value of type "pointer to char" or "char *" -- i.e., no "const".
(C and C++ differ greatly here, incidentally. Be sure you are
using a C compiler, not a C++ compiler.)
But there are those pesky exceptions. The big exception for string
literals occurs when using one as an initializer:
char sa[] = "initialized";
char *sp = "initialized";
The initializer for "sa" *is* one of these exceptions; the initializer
for "sp" is *not* one of these exceptions.
When a string literal is used to initialize an object of type "array
N of char", or "array FILL_IN_SIZE_AUTOMATICALLY_PLEASE of char",
or "array (N or FILL_IN...) of const char", the string literal does
not create an anonymous object [see also footnote]. Instead, it
simply fills in the named "array of char" or "array of const char"
object it is initializing. The type of the array, and its size if
this is specified, override the usual effects, so that the array
has its read/write or read-only state set by the programmer. If
the size is specified and is *much* too small, a diagnostic is
required; but if the size is specified and is one character too
small to hold the string, the trailing '\0' is omitted:
char four[4] = "four"; /* has no terminating '\0'! */
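A small sketch of the three sizing cases, assuming a C (not C++) compiler. The names five, fill and check_four are invented for this example; only "four" comes from the text above:

```c
#include <assert.h>
#include <string.h>

char four[4] = "four";    /* exactly full: no room for '\0' */
char five[5] = "four";    /* one extra element holds the '\0' */
char fill[]  = "four";    /* compiler chooses the size: 5 */

/* Hypothetical helper; returns 1 if the arrays look as described. */
int check_four(void)
{
    assert(sizeof four == 4);
    assert(sizeof five == 5 && sizeof fill == 5);
    assert(five[4] == '\0' && fill[4] == '\0');
    /* four is not a string, so compare with memcmp() and an explicit
       length; strlen(four) would run off the end of the array. */
    return memcmp(four, five, sizeof four) == 0;
}
```

Some compilers offer an optional warning for the unterminated case, but the initialization itself is valid C.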
In the case of:
char sa[] = "initialized";
the array's size is left for the compiler to fill in, so the array
has size 12 (if I counted correctly), just large enough to hold
the sequence {'i', 'n', ..., 'e', 'd', '\0'}. Thus, here, "sa"
has type "array 12 of char" and is read/write -- sa[i], where i is
in [0..12), can be replaced with a new value.
In the case of "sp", however, we have:
char *sp = "initialized";
Here "sp" has type "pointer to char", not "array of (possibly
qualified) char", so the string literal goes ahead and produces
an anonymous (unnamed) array, "array 12 of char", that is in
principle read-only. This array "decays", as the FAQ puts it,
to a value of type "pointer to char", pointing to the first
element of the unnamed array -- the first letter 'i'. The
variable "sp" is then initialized to point to that 'i'.
But why does the compiler I am using allow s to be modified, instead
of generating a compile error?
In the quote above, you have *two* variables named 's'; which
one do you mean, and modified in what way?
Given sa and sp as defined above, this is valid:
sp = "a different literal";
Here the string literal generates an anonymous array as usual, and
the variable sp is modified to point to its first 'a', via the
usual "decay" trick (that which I call "The Rule about pointers
and arrays in C"; see also http://web.torek.net/torek/c/index.html
and sub-pages.) But:
sp[3] = 'x';
is *not* valid, even though no diagnostic is required.
On the other hand:
sa = "error";
is *not* valid, and a diagnostic is required, because sa has type
"array 12 of char" and is thus not what the Standard calls a
"modifiable lvalue". At the same time:
sa[3] = 'x';
*is* valid, and merely changes the fourth letter (subscript 3) in
the array.
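The four cases can be collected into one sketch. The two invalid statements are left as comments so that the code compiles; check_sa_sp is a made-up name:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper showing which modifications are valid. */
int check_sa_sp(void)
{
    char sa[] = "initialized";   /* a named, writable array of 12 chars */
    char *sp = "initialized";    /* points into an anonymous array */

    sa[3] = 'x';                 /* valid: changes the array's own copy */
    assert(strcmp(sa, "inixialized") == 0);

    sp = "a different literal";  /* valid: only the pointer changes */
    assert(sp[0] == 'a');        /* reading through sp is fine */

    /* sp[3] = 'x';   undefined behavior, no diagnostic required */
    /* sa = "error";  constraint violation, diagnostic required */
    return sizeof sa == 12;
}
```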
Let's say a function is declared as: void funct1(char sss[]); to accept a char
string as its argument. Why does it also accept a string literal, s, above?
Due to The Rule plus the fact that C passes all arguments by
value, the declaration:
void funct1(char sss[]);
"means" exactly the same thing as the declaration:
void funct1(char *sss);
That is, the function takes a single argument value of type "pointer
to char". As I noted above, a string literal normally produces an
anonymous object of type "array N of char", and The Rule converts
objects of type "array N of T" into values of type "pointer to T".
If T is "char", this is a value of type "pointer to char", which
is precisely what funct1() requires.
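A sketch of this, using a hypothetical stand-in for funct1() that returns a count so that there is something to check (count_chars, same_type and count_both are names invented here):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for funct1(): counts chars before the '\0'. */
size_t count_chars(char sss[])
{
    size_t n = 0;
    assert(sss != NULL);
    while (sss[n] != '\0')
        n++;
    return n;
}

/* The array-style parameter was adjusted to "char *", so this
   function pointer needs no cast. */
size_t (*same_type)(char *) = count_chars;

/* Both a named array and a string literal arrive as "char *". */
size_t count_both(void)
{
    char sa[] = "initialized";
    return count_chars(sa) + count_chars("literal");
}
```

Here count_both() adds 11 and 7. Note that count_chars() only reads through sss; writing through it would be a mistake when the caller passed a string literal.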
To declare a function as: void funct2(const char sss[]); Does this restrict
what kinds of char string can be passed into sss at all, or does it just imply
that sss will not be modified in "funct2"?
This declaration "means" the same thing as:
void funct2(const char *sss);
The const qualifier largely means "read-only", not "constant"; and
yes, this implies -- but does not guarantee! -- that funct2() will
not write on sss[i]. It does not mean that sss[i] must *be*
read-only; it only says -- weakly -- that funct2() itself will not
write on it. Some *other* function might write on it at any time,
and funct2() cannot in general depend on sss[i] not changing.
(This sort of optimization problem is what led to C99's "restrict".
Combine "restrict" with "const" and funct2() *can* depend on it,
allowing a C compiler to generate smaller and/or faster code.)
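To illustrate the "some other function might write on it" point, here is a minimal sketch. All of the names (shared, toggle_shared, saw_change, check_const_view) are invented, and saw_change() merely plays the role of funct2():

```c
#include <assert.h>
#include <stddef.h>

static char shared[] = "abc";

/* Some other function that writes on the same array. */
static void toggle_shared(void)
{
    shared[0] = (shared[0] == 'a') ? 'z' : 'a';
}

/* Hypothetical stand-in for funct2(): it never writes through sss,
   yet the char it reads can still change underneath it. */
int saw_change(const char *sss)
{
    char before;

    assert(sss != NULL);
    before = sss[0];
    toggle_shared();            /* may alter what sss points to */
    return sss[0] != before;
}

/* Passing a "char *" where "const char *" is wanted is allowed. */
int check_const_view(void)
{
    return saw_change(shared);
}
```

Because sss points at the writable array shared, the value read through the const-qualified pointer differs before and after the call to toggle_shared().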
Does the Standard define something like: void funct3(char sss[] const) since I
saw some code like that but am not sure what it means?
No. The "const" qualifier can move slightly:
void f(const char *);
void f(char const *);
both "mean" the same thing. If you use the array syntax for
formal parameters -- I prefer not to, because the compiler just
has to rewrite it internally anyway -- these are the only
possible placements:
void f(const char []);
void f(char const []);
(the identifiers are always optional in prototypes). Using
the pointer syntax, however, it is possible to add another
const:
void f(char const * const);
and:
void f(char const * const p) {
...
}
The second "const" in the prototype has no effect. The second
"const" in the definition affects "p", making the variable p itself
read-only. This is the same as if we were to write:
void f(char const *p0) {
char const * const p = p0;
...
}
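A compilable sketch of the same idea, with a hypothetical function count_until_nul() whose prototype omits the second const while its definition includes it. The two declarations are compatible:

```c
#include <assert.h>
#include <stddef.h>

size_t count_until_nul(char const *);          /* prototype: no second const */

size_t count_until_nul(char const * const p)   /* definition: p is read-only */
{
    size_t n = 0;

    assert(p != NULL);
    /* p++;  would draw a diagnostic here: p itself is const */
    while (p[n] != '\0')
        n++;
    return n;
}
```

Callers see no difference either way; the second const only restricts what the body of the function may do with p.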
The trick here is that any formal parameter in any function acts
just like an ordinary local variable inside that function. It is
automatically initialized with the value delivered by whoever calls
the function. If the formal parameter (here p) is labeled "const",
it is a read-only variable, just like any other "const" in C:
const int i = 3;
const double pi = 3.141592653589793238462643383279502884;
Here "i" and "pi" are not constants, just variables that you are
forbidden to change! (The semantic difference shows up in all
kinds of places, e.g.:
% cat error.c
const int i = 3;
char a[3];
char b[i];
%
The third line is invalid and must draw a diagnostic, even in C99
with its variable length arrays. Although "i" is "const", it is
not a constant. Again, C++ is quite different here.)
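One common workaround in C, sketched here under the assumption that an integer constant is what was really wanted, is an enumeration constant. The names N and check_sizes are made up for the example:

```c
#include <assert.h>
#include <stddef.h>

const int i = 3;        /* a read-only variable, not a constant */
enum { N = 3 };         /* N is a genuine integer constant */

char a[3];
char b[N];              /* fine at file scope */
/* char c[i];              invalid at file scope: i is not a constant */

size_t check_sizes(void)
{
    assert(sizeof a == sizeof b);
    assert(i == N);     /* i is still usable as an ordinary value */
    return sizeof b;
}
```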
[footnote] [dig out C99 standard and check on string literals,
should allow volatile qualifiers too I believe. Note case of sizeof
"foo"; note that a compiler can still generate the anonymous array,
wasting space in an executable image, but the only way you can find
out it did so is to step outside the C standard.]
[Sorry, no time to fill out footnote -- must dash]