Is argv array modifiable ?

Discussion in 'C Programming' started by mnaydin, Dec 15, 2005.

  1. mnaydin

    mnaydin Guest

    Assume the main function is defined with
    int main(int argc, char *argv[]) { /*...*/ }

    So, is it permitted to modify the argv array? The standard says
    "The parameters argc and argv and the strings pointed to by the
    argv array shall be modifiable by the program,[...]". According to
    my reading of the standard, for example, ++argv and ++argv[0][0]
    are both permitted, but not ++argv[0] because it says nothing about
    the argv array itself. Is my interpretation correct ?
    mnaydin, Dec 15, 2005

  2. mnaydin said:
    <caveat class="this is from memory, not the Standard">
    I believe so, yes. You can modify argv because you get a
    copy of the caller's value, so why should the caller care
    what you do with it? You can modify the contents of each
    string because there's no particular reason to forbid you
    to, so long as you don't try to stretch the string - i.e.
    scribble over or past the null terminator. But for all you
    know, the implementation might have used dynamic allocation
    to get the memory it needs for storing those strings, and
    might have no spare copy of the pointer values returned by
    the allocator - so (if I recall correctly) the Standard
    doesn't offer any behaviour guarantees whatsoever if you
    mess with those pointers.
    Richard Heathfield, Dec 15, 2005

  3. mnaydin

    Jordan Abel Guest

    Can you swap two of them? [suppose you want to bring all arguments
    starting with '-' to the beginning of the array]
    Jordan Abel, Dec 15, 2005
  4. mnaydin

    Eric Sosman Guest

    Not reliably. There are three different things one might
    be talking about when one says `argv':

    - The function parameter variable: This is modifiable.

    - The individual pointers argv[0], argv[1], ... The
    Standard says nothing about whether these are modifiable.

    - The strings whose first characters are *argv[0],
    *argv[1], ... The Standard says these are modifiable.

    Section, paragraph 2, final constraint.
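    A minimal sketch of that three-way distinction, assuming a hypothetical
    helper named upcase_args (the name and the upper-casing task are
    illustrative, not from the thread). It writes only to the parameter and
    to the characters of the strings, the two things the quoted text
    guarantees, and never assigns to argv[0], argv[1], ...:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not from the thread): upper-cases every argument in
   place and returns how many arguments it visited. */
int upcase_args(int argc, char **argv)
{
    int visited = 0;

    (void)argc;                 /* the loop relies on the terminating NULL */
    while (*argv != NULL) {     /* argv[argc] is a null pointer */
        for (char *s = *argv; *s != '\0'; ++s)
            *s = (char)toupper((unsigned char)*s);  /* string contents */
        ++argv;                 /* the parameter itself, a local copy */
        ++visited;
    }
    /* Deliberately absent: any assignment of the form argv[i] = ...,
       which is the case the Standard is silent about. */
    return visited;
}
```

    Calling it as upcase_args(argc, argv) from main leaves the pointers
    stored in the array exactly as the environment supplied them.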
    Eric Sosman, Dec 15, 2005
  5. mnaydin

    bluejack Guest

    Given that there are no const keywords in use, one would expect that
    argv is modifiable in any and all senses. Naturally, main is something
    of an exception case, but even so, I trust that the people who
    established the standard were reasonably sensible and rigorous people,
    and if they had meant for something to be const, they would have used
    the const keyword to so indicate.

    As for compiler designers...

    bluejack, Dec 15, 2005
  6. mnaydin

    mnaydin Guest

    Yes, my primary intention is to bring some arguments to the beginning
    of the array. But swapping two of them in the argv array is not a
    solution because the assignment argv[i] = argv[j] is not guaranteed to
    work since the argv array may not be modifiable, as Richard and Eric said
    in this thread. On the other hand, I thought this was a common
    practice. At least in K&R2 there is an example on page 117,
    section 5.10, where argv[0] is modified, though with a different
    purpose from mine. Interestingly, in the K&R1 version of the same
    example, on page 113, section 5.11, argv[0] was not modified
    and a pointer to char, named s, was used to loop through the string.

    In any case, I think one of the easy and guaranteed solutions is to
    clone the original argv array and work on the cloned array,
    something like this:
    /* inside main; needs <stdlib.h> and <string.h> */
    char **arglist = malloc((argc + 1) * sizeof *arglist);
    if (arglist == NULL)
        return EXIT_FAILURE; /* Ouch! */
    memcpy(arglist, argv, (argc + 1) * sizeof *arglist);
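    A fuller sketch of the cloning idea, under stated assumptions: the
    helper names clone_argv and options_first are hypothetical, and "option"
    is taken to mean any argument whose first character is '-'. Only the
    malloc'ed copy of the pointer array is reordered; the original argv
    pointers are read but never written.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'ed copy of the argv pointer array
   (argc + 1 entries, including the terminating NULL), or NULL on failure.
   The strings are shared with the original, not duplicated. */
char **clone_argv(int argc, char **argv)
{
    size_t n = (size_t)argc + 1;
    char **copy = malloc(n * sizeof *copy);

    if (copy != NULL)
        memcpy(copy, argv, n * sizeof *copy);
    return copy;
}

/* Hypothetical helper: moves the entries of list[1..argc-1] that start
   with '-' ahead of the others, keeping the relative order within each
   group. list[0] (the program name) stays where it is. */
void options_first(int argc, char **list)
{
    int next = 1;   /* slot where the next option belongs */

    for (int i = 1; i < argc; ++i) {
        if (list[i][0] == '-') {
            char *opt = list[i];
            /* shift the non-options in list[next..i-1] one slot right */
            memmove(&list[next + 1], &list[next],
                    (size_t)(i - next) * sizeof *list);
            list[next++] = opt;
        }
    }
}
```

    In main this would be used as arglist = clone_argv(argc, argv), then
    options_first(argc, arglist) after checking for NULL, with a final
    free(arglist) once the list is no longer needed.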
    mnaydin, Dec 15, 2005
  7. mnaydin

    Eric Sosman Guest

    On the other hand, the authors of the Standard stated
    explicitly that the pointed-to strings are modifiable, even
    though the "no `const' appears" argument would apply to them
    with equal force. Why did they bother?

    Keep in mind the large body of C code already in existence
    before `const' entered the language. The ANSI committee could
    not invalidate two-plus decades' worth of existing code because
    they'd thought of a better way. They codified existing practice,
    even though (with the new tools) more explicit practice was
    possible.

    It seems to me not unlike the situation with string literals:
    They are not `const', yet you are forbidden to try to alter them.
    The Rationale explains that they were not made `const' because a
    lot of existing code would break; instead, they are non-`const'
    and the Standard has special language warning you not to modify
    them.

    The argv question seems similar (although the Rationale does
    not confirm it): Pre-`const' code declared argv as `char**', and
    the Standard adopted that use but added special language describing
    the writeability of argv[i][j]. I think it a "curious incident"
    that the Standard says nothing about the writeability of argv[i].
    Eric Sosman, Dec 15, 2005
  8. mnaydin

    mnaydin Guest

    But, by the same logic, one could argue that it is explicitly stated in
    the standard that the parameters argc, argv, and the strings pointed to
    by argv array shall be modifiable, even though there is no const
    keyword qualifying them, but nothing is stated on the modifiability
    of the argv array itself (ie, argv[0],...,argv[argc]), so there is a
    strong indication that the argv array is not supposed to be modifiable.
    I think relying on the absence of the const keyword is not a valid
    argument.
    mnaydin, Dec 15, 2005
  9. [You might like to quote some context. If your message was not related
    to Eric Sosman's then perhaps you should reply to the OP's message
    rather than somewhere downthread.]
    That is a naive, even dangerous, form of reasoning. C has many quirks
    which are counter-intuitive. Some of them are far from sensible, e.g.

    Trusting (or blaming) the Committee is an irrelevance. At the end of
    the day, the language is that written in the Standard. It is up to
    programmers to educate themselves on what that language is.

    C is one of the worst languages for programming by intuition and hope!
    Peter Nilsson, Dec 15, 2005
  10. mnaydin

    bluejack Guest

    And, while there are several good approaches to educating yourself
    on what the language is ... and I realize this is going to endear me
    to nobody ... my preferred method is "trial and error" -- despite my
    "naive and dangerous" form of reasoning, it's a perfectly effective
    approach, assuming you start out by trusting nobody. I don't trust
    the standard (in part because there's no guarantee it has been
    implemented correctly, but mostly because I don't have a copy),
    I don't trust compiler designers (because they don't necessarily
    implement correctly), I don't trust secondary documentation (it's
    like a photocopy of a photocopy), I *certainly* don't trust usenet,
    and I trust my own memory *least of all*. What I trust are demonstrable

    Naturally, with that mentality, I tend to code defensively. It would
    never even occur to me to *want* to change argv (or use gets). Still, I do
    find these conversations fascinating, and I always enjoy the cranky
    attitude found on usenet!

    bluejack, Dec 15, 2005
  11. mnaydin

    Chuck F. Guest

    This is fairly meaningless due to the total lack of context. See
    my sig below for a way to use the broken google interface sanely.
    Chuck F., Dec 15, 2005
  12. mnaydin

    Flash Gordon Guest

    These quirks won't be learnt by trial and error. The *most* you will
    learn is how the specific version of the specific implementation you are
    using works.
    No, it is most definitely NOT a perfectly effective method. All sorts of
    things that you might think are correct, and might work on your compiler
    this week, might fail abysmally when it actually matters to you.
    Start by not trusting trial and error, because it has repeatedly
    been shown that people posting here who relied on it to learn C
    have learnt to do things which are definitely wrong.
    In that case build your own chip factory, design and build your own
    chips, and write your own compiler.
    Google for n1124.pdf to get a free public draft of the next version, or
    buy a copy of the current version from a standards body (you can get it
    for $18 last I heard).
    In that case don't use any you have not implemented. You also can't
    trust assemblers, text editors or the OS by that reasoning.
    It is easy to find reviews of books to see if they are reliable, and you
    can cross-reference to the standard if you are not sure.
    I can demonstrate with one compiler that you can safely modify string
    literals and get the expected result. I can also demonstrate with a
    later version of the *same* compiler that you can't modify string
    literals because it causes a SIGSEGV (I might be wrong on the exact
    signal, but definitely a crash). The reality is that anything can happen
    because it is undefined behaviour. However, had I relied on your method
    of trial and error all my code could have suddenly gone from "working"
    to "crashing".

    If I could be bothered I could come up with lots of other examples, but
    the above is one I know to be demonstrably true.
    Coding defensively REQUIRES understanding how the language is DEFINED to
    work; what you are doing by relying on trial and error rather than a
    reliable source of information is coding stupidly.
    Well, if you think trial and error is a substitute for a good text book
    expect responses a lot more cranky than mine.
    Flash Gordon, Dec 15, 2005
  13. mnaydin

    Jordan Abel Guest

    They could have permitted an additional prototype:

    int main(int argc, char * const *argv); which I think they would have
    done if they had intended that the pointers may not be modifiable.
    Except, of course, that you are inferring that by lack of analogy to the
    explicit permission to write their targets, not from any actual language
    in the standard.
    The standard does not have such special language for the argv pointers.
    The behavior in modifying a non-const variable that is not a string
    literal and was not cast from the address of a const variable is
    defined.

    I think it's more curious that it does add such language for the
    writeability of argv[i][j], given that it's non-const (and not a string
    literal) and hence "should" be modifiable anyway.
    Jordan Abel, Dec 16, 2005
  14. Or, better yet, read the more detailed description at
    Keith Thompson, Dec 16, 2005
  15. mnaydin

    Netocrat Guest

    [on string literals as an analogy to argv]
    The ultimate declaration of the argv variable passed into the program is
    not specified though, all the program gets is the declaration of the
    function parameter.

    It's legal to cast a const-qualified variable to a non-const version of
    the same and pass it into a function, it's just not legal to write to it
    within the function.
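
    A small sketch of that point, assuming hypothetical names count_colons,
    fixed, and count_in_const_object (none of them from the thread). The
    address of a const object is cast to a pointer to non-const char and
    passed to a function that only reads through it, which is fine; a write
    through the same pointer would compile but would be undefined behaviour.

```c
#include <stddef.h>

/* Hypothetical function: its parameter type permits writes, but the body
   only reads. */
size_t count_colons(char *s)
{
    size_t n = 0;

    for (; *s != '\0'; ++s)
        if (*s == ':')
            ++n;
    /* A write such as s[0] = 'x' would compile here, yet it would be
       undefined behaviour whenever the caller passed the address of a
       const object. Nothing in the parameter type reveals which case
       applies. */
    return n;
}

static const char fixed[] = "this:that";   /* a genuinely const object */

size_t count_in_const_object(void)
{
    /* The cast discards the qualifier; reading through the result is fine. */
    return count_colons((char *)fixed);
}
```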

    Without that language it would be implicitly undefined behaviour to
    attempt to modify argv and argv[i][j], as it is now for argv[i]. The
    curiosity is that the Standard left it implicit rather than making it
    explicit.
    Netocrat, Dec 16, 2005
  16. mnaydin

    Jordan Abel Guest

    There is, however, no basis in the text for supposing that this is the
    case for *argv (...etc).

    It would not. Without that language, **argv (...etc) would still be of
    type char, not const char, and since it's not a string literal (a listed
    exception to an object of type char being modifiable), there's no basis
    for supposing that it would be non-modifiable.

    There is no basis in the text for believing that it might be the case,
    other than your interpretation of a conspicuous lack of a similar
    statement for argv[i] as for argv[i][j].
    Jordan Abel, Dec 20, 2005
  17. mnaydin

    Netocrat Guest

    [I worded the above sloppily. More correctly the first sentence should
    begin: "It's legal to take the address of a const-declared variable, cast
    it to a pointer to a non-const qualified version of the variable's type,
    and pass that pointer into a function, ..."]
    (Assuming that you interpreted my sloppy wording as intended) I'd express
    that in reverse: there's no basis in the text for supposing that the
    variables passed into main() are uniquely unaffected by this possibility.

    The mention that argc and argv are modifiable does seem redundant, but
    useful clarification given that they are coming from an external
    environment. The claim in my last post that it would be implicit UB to
    attempt to modify argv without this mention may have been too strong, but
    I'm not convinced that modifying argv[i][j] would be legal and defined
    without mention.
    Netocrat, Dec 20, 2005
  18. mnaydin

    Jordan Abel Guest

    comp.std.c added, it seems appropriate: for those who haven't been
    following along, the issue is whether the text of the standard supports
    a view that modification of the elements of argv [i.e. the individual
    pointers] results in undefined behavior.

    But there's no reason to think that this has been done by whatever calls
    main().

    It's hardly unique.
    Jordan Abel, Dec 20, 2005
  19. Yes.

    Section of the ISO/IEC 9899:1999 seems quite explicit:

    The parameters argc and argv and the strings pointed to by the argv
    array shall be modifiable by the program, and retain their last-stored
    values between program startup and program termination.

    There's a note about the conventional but non-mandatory use of argc and
    argv as the names of the parameters. Looks pretty clear to me...
    Jonathan Leffler, Dec 20, 2005
  20. mnaydin

    Chris Torek Guest

    I think this claim is a bit premature....
    That text guarantees that, in the "code" part of:

    int main(int argc, char **argv) {
    ... code ...
    }

    the programmer may change argv itself (though this is hardly
    controversial) and, for appropriate values of i and j, the programmer
    may change argv[i][j] by ordinary assignment. Thus, e.g., the code
    fragment below is fine given suitable i, p, and q:

    /* suppose at this point, strcmp(argv[i], "this:that") == 0 */
    p = argv[i];
    p[4] = '\0';
    q = p + 5;
    /* now strcmp(p, "this") == 0 && strcmp(q, "that") == 0 */

    The question in question (it is late, pardon the phrasing :) ) is
    whether this is also proper, given suitable i, k, and p:

    p = argv[i];
    argv[i] = argv[k];
    argv[k] = p;

    This writes on argv[i] and argv[k], rather than argv[i][j]. The
    fact that the Standard explicitly allows the programmer to write
    on argv[i][j] should make one wonder why it fails to mention whether
    the programmer may write on argv[i] itself. The lack of a "const"
    qualifier is not in itself permission, since:

    void f(void) {
    char *x = "this:that";

    x[4] = '\0';
    }

    violates a "shall" outside a constraints section, rendering the
    behavior undefined, yet no part of the declaration of x uses "const".
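
    For contrast, a sketch of the same split done on storage the program is
    certainly allowed to write, assuming a hypothetical helper named
    split_at_colon (not from the thread). Passing it an array initialised
    from a literal, such as char x[] = "this:that", is well defined; passing
    it the literal itself would be the undefined case described above.

```c
#include <string.h>

/* Hypothetical helper: overwrites the first ':' in buf with '\0' and
   returns a pointer to the text after it, or NULL if buf has no ':'.
   The caller must pass storage it is allowed to modify. */
char *split_at_colon(char *buf)
{
    char *colon = strchr(buf, ':');

    if (colon == NULL)
        return NULL;
    *colon = '\0';      /* same effect as p[4] = '\0' for "this:that" */
    return colon + 1;   /* same role as q = p + 5 */
}
```

    Because the argv strings are guaranteed modifiable, split_at_colon(argv[i])
    falls under the same guarantee, provided argv[i] is not a null pointer.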
    Chris Torek, Dec 20, 2005