Constant strings

Discussion in 'C Programming' started by BartC, Apr 18, 2014.

  1. BartC

    BartC Guest

    As far as I can gather from my experiment below, a string constant in source
    code has a 'char*' type, not 'const char*'. Why is that?

    Here, I can't get a compiler to complain about passing a string constant as
    a char* parameter where it is clearly going to be modified. But it doesn't
    like the q=p line which does the same. The q="Bart" line shows up the issue
    more simply:

    char* change_initial(char* s,char c){
        *s=c;
        return s;
    }

    int main (void) {
        const char *p;
        char* q;

        change_initial("Bart",'C');

        q=p;
        q="Bart";
        *q='C';

    }
     
    BartC, Apr 18, 2014
    #1

  2. BartC

    G G Guest

    {
    *s=c;
    return s;
    }

    int main (void)
    {
    const char *p;
    char* q;

    change_initial("Bart",'C');

    q=p;

    q="Bart";

    *q='C';
    }
    i'm learning but, i have a question about the code. (that's the alert, i'll be no real help, sorry)

    the program calls a function,

    change_initial("Bart",'C');

    change_initial("Bart",'C'); is to return a char *, but the return value is not assigned to anything.

    so, well, not looking at intent, i guess, q is just asked to point to another char * that is const.
    then q is asked to point to a character string. q is not defined as a constant, so shouldn't it be allowed to change and point to another char *?

    another question please: in practice, when a const char * is declared, should it not also be defined?
    or does it really not make a difference, because when it is later defined it can then not be changed?

    my compiler does issue a warning.

    thanks everyone for taking time out to teach a bit.

    g.
     
    G G, Apr 18, 2014
    #2

  3. BartC

    James Kuyper Guest

    That's not quite correct. A string literal actually has the type "array
    of char of length n", where n is the number of characters in the string
    literal plus 1 for the terminating null character. However, in most
    contexts, lvalue expressions of array type get automatically converted
    into a pointer to the first element of the array, giving the impression
    that string literals have the type "char*". The only two contexts where
    that is not the case are sizeof("string"), which is equivalent to
    sizeof(char[7]), and

    char array[] = "array_initializer";

    which would be a constraint violation if the string literal actually had
    the type "char*".

    The fact that the type is not "array of const char of length n", like
    many other poorly designed features of C, was the result of the fact
    that it was not all designed at the same time, combined with the need
    for backwards compatibility. 'const' was not added to C until long after
    string literals were, and giving string literals that type would have
    broken an unacceptably large amount of existing code. I don't think
    there's anyone who'd recommend copying this feature in a new language
    that didn't require backwards compatibility with C.

    Even C++, for which compatibility with C was a major design goal,
    corrected this one error. There really was no choice about that: with
    the correct overloaded function being automatically chosen based upon
    the argument's type, C++ couldn't afford to allow string literals to
    have the wrong type.
     
    James Kuyper, Apr 18, 2014
    #3
  4. [...]

    There's a third context: &"hello" is a pointer value of type char(*)[6]
    (pointer to array 6 of char).
     
    Keith Thompson, Apr 18, 2014
    #4
  5. BartC

    BartC Guest

    OK, but I would have expected some warning at least to have been added over
    the last few decades, considering the number of minor matters that compilers
    do pounce on.

    I've used gcc -Wall -Wpedantic -Wextra, and not a peep out of it!
    (Apparently, -Wwrite-strings is needed to enable the warning; I only
    discovered that by running g++ which does warn.)

    (I'm not that concerned, just wondering how seriously compilers take the
    issue of const qualifiers. Because if I can write:

    char* q;

    q="ABC";
    *q='X'; /* Crashes in Windows and Linux */

    with nothing at all emitted by the compiler unless I go considerably out of
    my way, then the answer seems to be not very. It just seems a gaping
    loop-hole in the const-qualifier system.)
     
    BartC, Apr 18, 2014
    #5
  6. If you happen to be using gcc, the "-Wwrite-strings" option causes
    string literals to be treated as if they were const. This causes gcc to
    be non-conforming if you use it along with "-pedantic-errors", but it's
    useful for detecting certain potential errors.

    Why is it non-conforming? Because this program:

    #include <stdio.h>

    void print(char *s) { /* "const char *s" would be better */
    puts(s);
    }

    int main(void) {
    print("hello");
    }

    is perfectly legal because it doesn't actually attempt to modify the
    string literal, but it would be illegal if string literals were const
    (as they are in C++).
     
    Keith Thompson, Apr 18, 2014
    #6
  7. BartC

    Kaz Kylheku Guest

    I conducted an experiment today that shows you can stick two expressions
    together with a comma, and the value that comes out is evidently the right one.

    With a bit more time, I will have this whole mysterious C thing
    reverse-engineered, and then I will document it for everyone.
     
    Kaz Kylheku, Apr 18, 2014
    #7
  8. BartC

    BartC Guest

    It did do. But the code I posted was simplified to remove any
    processing/display of the result (because it was tested with a writeable
    string before passing the const string, when it didn't return anyway because
    it had crashed.)

    The return value of char* allows it to be used like this:

    printf("New string = %s\n",change_initial("bart etc",'C'));

    Not using it for some calls doesn't matter (not using it ever, then perhaps
    the return value could be eliminated).
     
    BartC, Apr 18, 2014
    #8
  9. The standard does not specify warnings. It requires *diagnostics* in
    many cases, but in all those cases the diagnostics are permitted to be
    fatal error messages.

    Individual compilers may issue whatever additional warnings they like.
    The gcc documentation explains the rationale for not warning about
    non-const pointers to string literals by default:

    These warnings will help you find at compile time code that can try
    to write into a string constant, but only if you have been very
    careful about using `const' in declarations and prototypes.
    Otherwise, it will just be a nuisance. This is why we did not make
    `-Wall' request these warnings.

    (Personally, I think programmers *should* be very careful about using
    "const", but that doesn't mean a compiler should enforce it by default.)
    Yes, it's a gaping loophole in the const-qualifier system.
    It's unfortunate that it wasn't practical to close it when "const"
    was added to the language by the 1989 ANSI C standard, but we're
    stuck with it.

    The strchr() and memchr() functions can also be used to violate
    const-correctness, since they can quietly return a non-const pointer
    into a const array. That could have been fixed by splitting both
    functions into a const version and a non-const version. (C++
    uses overloading to do that.)

    I'm not aware of any other such loopholes, but I could be missing
    something.

    There's not much point in arguing that this is a flaw in the
    language; I think everyone here agrees with you. We're not defending
    the rule, we're merely explaining why it (unfortunately) exists.
     
    Keith Thompson, Apr 18, 2014
    #9
  10. BartC

    BartC Guest

    If you want me to stop posting in this group, just say the word.

    I had been in a slow process of migrating away from using C to implement my
    projects, but thanks to your piss-taking, that is now a priority.
     
    BartC, Apr 18, 2014
    #10
  11. If one particular person is annoying you, a killfile might be a better
    solution than leaving the group. (Kaz happens to be in mine.)
     
    Keith Thompson, Apr 18, 2014
    #11
  12. And the int value returned by printf is discarded.

    Ignoring the value returned by a function is quite common. Many
    functions return a value that's likely to be ignored more often than not
    (memset, for example).
     
    Keith Thompson, Apr 18, 2014
    #12
  13. BartC

    Kaz Kylheku Guest

    To hell with the newsgroup. Look around, it's mostly insipid twits who
    can't program their way out of a paper bag.

    What I'd like to see you do is (I mean, for pete's sake!): stop reverse
    engineering things *that are documented*, and even done the same way by
    multiple implementations.

    Yes, we need experimentation desperately in our daily work: to make progress in
    uncharted regions, like uncovering the root causes of bugs. No piece of
    documentation will tell me why this USB driver I'm trying to fix is locking up
    the device.

    But we don't need to experiment to find out the type of a literal constant;
    that's a waste of time.

    Moreover, when you experiment, you're only discovering facts about one
    dialect, and sometimes not even that.

    Experiment shows that gcc will accept ({int x = 3; x;}) as an expression,
    which returns 3. Yet that's not in standard C, which is useful to know.

    Experimenting could also convince you that i = i++ has a stable behavior.

    String literals being char * could just be a bug in your compiler,
    for all you know, or a feature of the default dialect.
    Why. I am not C.

    Kiki has me in his killfile because I told him to go **** himself.

    And look, he still uses C!

    Sheesh ...
     
    Kaz Kylheku, Apr 18, 2014
    #13
  14. BartC

    G G Guest

    ok. thanks Bartc.
     
    G G, Apr 18, 2014
    #14
  15. BartC

    Ian Collins Guest

    Maybe you should start compiling your C with a C++ compiler? The const
    rules in C++ are much closer to what you are expecting.

    cat /tmp/x.c

    int main()
    {
    char* q="ABC";
    }

    g++ /tmp/x.c
    /tmp/x.c: In function ‘int main()’:
    /tmp/x.c:3:11: warning: deprecated conversion from string constant to
    ‘char*’
     
    Ian Collins, Apr 18, 2014
    #15
  16. Awful documentation is a fact of life in programming. Yesterday I was trying
    to warp the pointer on an Apple computer for example. They have lots of
    co-ordinate systems going, plus two types of 2D point, and what should be a
    simple process is in fact very involved. You need to warp the pointer
    experimentally to see where it goes.

    It is a waste of time. There's endless timewasting involved in learning the
    intricacies of usually proprietary systems, most of which do largely the
    same thing, just with slightly different syntax, identifiers, and so on.
     
    Malcolm McLean, Apr 18, 2014
    #16
  17. BartC

    BartC Guest

    This particular issue isn't bothering me at the minute. I was just intrigued
    at the more rigorous enforcement of const types on one hand, compared with
    the lax approach used with string constants. It's never really come up
    before because I refuse to use 'const' anywhere.

    (And I have tried using g++ to compile my code (with a view to simplifying
    using libraries only having a C++ interface), but it's even more of a
    nightmare getting it to compile my C code, most of which is auto-generated
    in various ways.)
     
    BartC, Apr 18, 2014
    #17
  18. BartC

    Kaz Kylheku Guest

    Although C++ has "const char *" string literals, that was a relatively late
    decision in the history of C++.

    C++ has the function overloading tools to make this nicer.

    In C, the safety benefits go out the window as soon as you use a function
    like:

    char *strchr(const char *str, int c);

    the function returns a pointer with the const qualifier stripped.

    In C++, the #include <cstring> compatibility library provides overloads:

    const char *strchr(const char *str, int c);
    char *strchr(char *str, int c);

    A const char * argument selects the first overload; a char * argument
    selects the second overload.

    You can benefit from these overloads if you write in "Clean C": the hybrid
    dialect which compiles as C or C++. Just put this somewhere:

    #ifdef __cplusplus
    #include <cstring>
    #else
    #include <string.h>
    #endif

    Suddenly your strchr calls and whatnot are more type safe, with situations
    like:

    char *t = strchr("abc", x);

    being nicely diagnosed by your C++ compiler. (Here, the const char * returning
    overload is selected because of "abc", and so the initialization of t strips
    qualifiers, which requires a diagnostic.)
    That wouldn't be a problem if your auto-generator spits out "Clean C".

    "Clean C" isn't formalized anywhere: if you know C and C++ well (or at
    least the C-like subset of C++ well), you can write in it.

    The name "Clean C" was coined in _C: A Reference Manual_ by Harbison and
    Steele.
     
    Kaz Kylheku, Apr 19, 2014
    #18
  19. And you very rarely want to call strchr with an explicit string literal, but
    you quite often want to parse a string literal passed from above.
    The poor rules for string literals are seldom much of a practical problem,
    because if in a real program a function modifies a string passed to it, the
    caller is going to want to examine the result. So he can't pass a string
    literal.
    const rules mean that you have to have two versions of strchr, which isn't
    too bad. But they also mean that anyone writing a strchr-like function to
    parse a string needs to write two versions. That's unacceptable.
     
    Malcolm McLean, Apr 19, 2014
    #19
  20. BartC

    Ian Collins Guest

    No, they don't.

    They would only need two versions if one were to modify the input. If
    that were the case, two functions with different names would be in order.
     
    Ian Collins, Apr 19, 2014
    #20