'strlen' : Why valgrind reports invalid read ?

Discussion in 'C Programming' started by m.labanowicz@gmail.com, Jan 18, 2013.

  1. Guest

    Hello,

    + gcc --version
    01: gcc (Ubuntu/Linaro 4.7.2-11precise2) 4.7.2
    02: Copyright (C) 2012 Free Software Foundation, Inc.
    03: This is free software; see the source for copying conditions. There is NO
    04: warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    05:
    + cat main.c
    01: #include <stdio.h>
    02: #include <string.h>
    03: #include <stdlib.h>
    04: char * bar(char const * str);
    05: char * bar(char const * str) {
    06: char * buf = (char *)malloc(strlen(str) + 1);
    07: if (buf) {
    08: size_t len;
    09: strcpy(buf, str);
    10: printf("buf=%p\r\n", (void *)buf);
    11: len = strlen(buf);
    12: if (len) {
    13: return buf;
    14: }
    15: free(buf);
    16: }
    17: return NULL;
    18: }
    19: int main(void) {
    20: free((void *)bar(""));
    21: return EXIT_SUCCESS;
    22: }
    + gcc -O2 -ansi -pedantic -W -Wall -Werror -Wextra main.c
    + ./a.out
    01: buf=0x7e0010
    + valgrind ./a.out
    01: ==28564== Memcheck, a memory error detector
    02: ==28564== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
    03: ==28564== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
    04: ==28564== Command: ./a.out
    05: ==28564==
    06: ==28564== Invalid read of size 4
    07: ==28564== at 0x4006F4: bar (in /home/M.Labanowicz/work/valgissue/a.out)
    08: ==28564== by 0x40057D: main (in /home/M.Labanowicz/work/valgissue/a.out)
    09: ==28564== Address 0x51f1040 is 0 bytes inside a block of size 1 alloc'd
    10: ==28564== at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    11: ==28564== by 0x4006C6: bar (in /home/M.Labanowicz/work/valgissue/a.out)
    12: ==28564== by 0x40057D: main (in /home/M.Labanowicz/work/valgissue/a.out)
    13: ==28564==
    14: buf=0x51f1040
    15: ==28564==
    16: ==28564== HEAP SUMMARY:
    17: ==28564== in use at exit: 0 bytes in 0 blocks
    18: ==28564== total heap usage: 1 allocs, 1 frees, 1 bytes allocated
    19: ==28564==
    20: ==28564== All heap blocks were freed -- no leaks are possible
    21: ==28564==
    22: ==28564== For counts of detected and suppressed errors, rerun with: -v
    23: ==28564== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 2 from 2)

    Best Regards

    --
    Maciej Labanowicz
    , Jan 18, 2013
    #1
    1. Advertising

  2. Xavier Roche Guest

    On 01/18/2013 11:42 AM, wrote:
    > + gcc -O2 -ansi -pedantic -W -Wall -Werror -Wextra main.c


    You may add -g3 to get some debugging infos even with O2.

    > + valgrind ./a.out


    Humm, old version of valgrind ?
    ==8217== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 4)

    This might be an optimized version of strlen() triggering a false
    warning, BTW.
    Xavier Roche, Jan 18, 2013
    #2
    1. Advertising

  3. Guest

    >
    > You may add -g3 to get some debugging infos even with O2.
    >


    gcc -g3 -O2 -ansi -pedantic -W -Wall -Werror -Wextra main.c
    ....
    06: ==28834== Invalid read of size 4
    07: ==28834== at 0x4006F4: bar (main.c:11)
    08: ==28834== by 0x40057D: main (main.c:20)
    09: ==28834== Address 0x51f1040 is 0 bytes inside a block of size 1 alloc'd
    10: ==28834== at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    11: ==28834== by 0x4006C6: bar (main.c:6)
    12: ==28834== by 0x40057D: main (main.c:20)
    ....

    It seems that this is an 'strlen' bug in [gcc+valgrind].

    --
    Maciej Labanowicz
    , Jan 18, 2013
    #3
  4. Shao Miller Guest

    On 1/18/2013 05:42, wrote:
    > Hello,
    >
    > + gcc --version
    > 01: gcc (Ubuntu/Linaro 4.7.2-11precise2) 4.7.2
    > 02: Copyright (C) 2012 Free Software Foundation, Inc.
    > 03: This is free software; see the source for copying conditions. There is NO
    > 04: warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    > 05:
    > + cat main.c
    > 01: #include <stdio.h>
    > 02: #include <string.h>
    > 03: #include <stdlib.h>
    > 04: char * bar(char const * str);
    > 05: char * bar(char const * str) {
    > 06: char * buf = (char *)malloc(strlen(str) + 1);
    > 07: if (buf) {
    > 08: size_t len;
    > 09: strcpy(buf, str);
    > 10: printf("buf=%p\r\n", (void *)buf);
    > 11: len = strlen(buf);
    > 12: if (len) {
    > 13: return buf;
    > 14: }
    > 15: free(buf);
    > 16: }
    > 17: return NULL;
    > 18: }
    > 19: int main(void) {
    > 20: free((void *)bar(""));
    > 21: return EXIT_SUCCESS;
    > 22: }
    > + gcc -O2 -ansi -pedantic -W -Wall -Werror -Wextra main.c
    > + ./a.out
    > 01: buf=0x7e0010
    > + valgrind ./a.out
    > 01: ==28564== Memcheck, a memory error detector
    > 02: ==28564== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
    > 03: ==28564== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
    > 04: ==28564== Command: ./a.out
    > 05: ==28564==
    > 06: ==28564== Invalid read of size 4
    > 07: ==28564== at 0x4006F4: bar (in /home/M.Labanowicz/work/valgissue/a.out)
    > 08: ==28564== by 0x40057D: main (in /home/M.Labanowicz/work/valgissue/a.out)
    > 09: ==28564== Address 0x51f1040 is 0 bytes inside a block of size 1 alloc'd
    > 10: ==28564== at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    > 11: ==28564== by 0x4006C6: bar (in /home/M.Labanowicz/work/valgissue/a.out)
    > 12: ==28564== by 0x40057D: main (in /home/M.Labanowicz/work/valgissue/a.out)
    > 13: ==28564==
    > 14: buf=0x51f1040
    > 15: ==28564==
    > 16: ==28564== HEAP SUMMARY:
    > 17: ==28564== in use at exit: 0 bytes in 0 blocks
    > 18: ==28564== total heap usage: 1 allocs, 1 frees, 1 bytes allocated
    > 19: ==28564==
    > 20: ==28564== All heap blocks were freed -- no leaks are possible
    > 21: ==28564==
    > 22: ==28564== For counts of detected and suppressed errors, rerun with: -v
    > 23: ==28564== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 2 from 2)
    >


    I would take a guess that 'strcpy' is trying to copy four bytes at a
    time. It's possible that it's got a bug for when the string length is <
    4. Perhaps you could experiment with the strings "1", "12", "123", and
    see if the behaviour is the same for all but the last one.

    --
    - Shao Miller
    --
    "Thank you for the kind words; those are the kind of words I like to hear.

    Cheerily," -- Richard Harter
    Shao Miller, Jan 18, 2013
    #4
  5. Guest

    > I would take a guess that 'strcpy' is trying to copy four bytes at a
    > time. It's possible that it's got a bug for when the string length is <
    > 4. Perhaps you could experiment with the strings "1", "12", "123", and
    > see if the behaviour is the same for all but the last one.


    Hmm, in such case violation should take place before 'printf'.

    I have modified code:
    ....
    strcpy(buf, str);
    printf("buf=%p\r\n", (void *)buf);fflush(stdout);
    len = strlen(buf);
    printf("KOKO\r\n");fflush(stdout);
    ....
    Result:
    ....
    06: buf=0x51f1040
    07: ==4252== Invalid read of size 4
    08: ==4252== at 0x400784: bar (main.c:11)
    09: ==4252== by 0x4005FD: main (main.c:21)
    10: ==4252== Address 0x51f1040 is 0 bytes inside a block of size 1 alloc'd
    11: ==4252== at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    12: ==4252== by 0x400746: bar (main.c:6)
    13: ==4252== by 0x4005FD: main (main.c:21)
    14: ==4252==
    15: KOKO
    ....

    I have also tried with the different length of string:
    "" -> error
    "1" -> error
    "12" -> error
    "123" -> OK:CLEAR
    "1234" -> error
    "12345" -> error
    "123456" -> error
    "1234567" -> OK:CLEAR
    "12345678" -> failed

    It seems that strlen reads 4 bytes-words.
    I assume that it is illegal.

    --
    Maciej Labanowicz
    , Jan 18, 2013
    #5
  6. Shao Miller Guest

    On 1/18/2013 08:30, wrote:
    >> I would take a guess that 'strcpy' is trying to copy four bytes at a
    >> time. It's possible that it's got a bug for when the string length is <
    >> 4. Perhaps you could experiment with the strings "1", "12", "123", and
    >> see if the behaviour is the same for all but the last one.

    >
    > Hmm, in such case violation should take place before 'printf'.
    >
    > I have modified code:
    > ...
    > strcpy(buf, str);
    > printf("buf=%p\r\n", (void *)buf);fflush(stdout);
    > len = strlen(buf);
    > printf("KOKO\r\n");fflush(stdout);
    > ...
    > Result:
    > ...
    > 06: buf=0x51f1040
    > 07: ==4252== Invalid read of size 4
    > 08: ==4252== at 0x400784: bar (main.c:11)
    > 09: ==4252== by 0x4005FD: main (main.c:21)
    > 10: ==4252== Address 0x51f1040 is 0 bytes inside a block of size 1 alloc'd
    > 11: ==4252== at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    > 12: ==4252== by 0x400746: bar (main.c:6)
    > 13: ==4252== by 0x4005FD: main (main.c:21)
    > 14: ==4252==
    > 15: KOKO
    > ...
    >
    > I have also tried with the different length of string:
    > "" -> error
    > "1" -> error
    > "12" -> error
    > "123" -> OK:CLEAR
    > "1234" -> error
    > "12345" -> error
    > "123456" -> error
    > "1234567" -> OK:CLEAR
    > "12345678" -> failed
    >
    > It seems that strlen reads 4 bytes-words.
    > I assume that it is illegal.
    >


    Yes after posting I refreshed my news-reader and saw that you suspected
    'strlen'. That makes more sense, given the valgrind output about the
    allocated storage.

    --
    - Shao Miller
    --
    "Thank you for the kind words; those are the kind of words I like to hear.

    Cheerily," -- Richard Harter
    Shao Miller, Jan 18, 2013
    #6
  7. wrote:

    >> I would take a guess that 'strcpy' is trying to copy four bytes at a
    >> time. It's possible that it's got a bug for when the string length is <
    >> 4. Perhaps you could experiment with the strings "1", "12", "123", and
    >> see if the behaviour is the same for all but the last one.


    > Hmm, in such case violation should take place before 'printf'.


    > I have modified code:
    > ...
    > strcpy(buf, str);
    > printf("buf=%p\r\n", (void *)buf);fflush(stdout);
    > len = strlen(buf);
    > printf("KOKO\r\n");fflush(stdout);


    (snip)
    > It seems that strlen reads 4 bytes-words.
    > I assume that it is illegal.


    Seems to me that if an implementation knew how it allocated strings,
    such that reading the three bytes after one didn't cause any problems,
    it should be able to get away with it.

    For many systems, unless you are at the end of addressing space and
    doing unaligned reads, it should not cause a problem.

    In that case, I would call it a valgrind bug. That is, it isn't
    consistent with the access patterns of the underlying implementation.

    -- glen
    glen herrmannsfeldt, Jan 18, 2013
    #7
  8. Nick Bowler Guest

    On Fri, 18 Jan 2013 16:21:23 +0000, glen herrmannsfeldt wrote:
    > wrote:

    [...]
    >> It seems that strlen reads 4 bytes-words.
    >> I assume that it is illegal.

    >
    > Seems to me that if an implementation knew how it allocated strings,
    > such that reading the three bytes after one didn't cause any problems,
    > it should be able to get away with it.
    >
    > For many systems, unless you are at the end of addressing space and
    > doing unaligned reads, it should not cause a problem.
    >
    > In that case, I would call it a valgrind bug. That is, it isn't
    > consistent with the access patterns of the underlying implementation.


    I doubt there's any bug in the tools (valgrind or C implementation) here.
    Most likely, the OP's configured suppressions are incomplete or out of
    date, perhaps due to a botched installation of valgrind.

    Cheers,
    Nick
    Nick Bowler, Jan 18, 2013
    #8
  9. Unrelated to your problem but:

    Line 04 serves no purpose. A prototype is needed only if the
    definition is not in scope when the function is called. Since the
    definition of bar() follows immediately, this can never be the case.

    The cast on line 6 serves no purpose. malloc() returns a void*
    and there is an implicit conversion to any "pointer to object" type.

    The \r on line 10 may not do what you think. The C library
    combined with your operating system will convert the \n to the
    appropriate "end of line" character(s). The extra \r may just confuse
    any future attempt to read the data.

    The cast in line 20 serves no purpose. A prototype for free() is
    in scope so the compiler knows to convert the argument to void*.

    On Fri, 18 Jan 2013 02:42:33 -0800 (PST),
    wrote:

    >Hello,
    >
    >+ gcc --version
    >01: gcc (Ubuntu/Linaro 4.7.2-11precise2) 4.7.2
    >02: Copyright (C) 2012 Free Software Foundation, Inc.
    >03: This is free software; see the source for copying conditions. There is NO
    >04: warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    >05:
    >+ cat main.c
    >01: #include <stdio.h>
    >02: #include <string.h>
    >03: #include <stdlib.h>
    >04: char * bar(char const * str);
    >05: char * bar(char const * str) {
    >06: char * buf = (char *)malloc(strlen(str) + 1);
    >07: if (buf) {
    >08: size_t len;
    >09: strcpy(buf, str);
    >10: printf("buf=%p\r\n", (void *)buf);
    >11: len = strlen(buf);
    >12: if (len) {
    >13: return buf;
    >14: }
    >15: free(buf);
    >16: }
    >17: return NULL;
    >18: }
    >19: int main(void) {
    >20: free((void *)bar(""));
    >21: return EXIT_SUCCESS;
    >22: }
    >+ gcc -O2 -ansi -pedantic -W -Wall -Werror -Wextra main.c
    >+ ./a.out
    >01: buf=0x7e0010
    >+ valgrind ./a.out
    >01: ==28564== Memcheck, a memory error detector
    >02: ==28564== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
    >03: ==28564== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
    >04: ==28564== Command: ./a.out
    >05: ==28564==
    >06: ==28564== Invalid read of size 4
    >07: ==28564== at 0x4006F4: bar (in /home/M.Labanowicz/work/valgissue/a.out)
    >08: ==28564== by 0x40057D: main (in /home/M.Labanowicz/work/valgissue/a.out)
    >09: ==28564== Address 0x51f1040 is 0 bytes inside a block of size 1 alloc'd
    >10: ==28564== at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    >11: ==28564== by 0x4006C6: bar (in /home/M.Labanowicz/work/valgissue/a.out)
    >12: ==28564== by 0x40057D: main (in /home/M.Labanowicz/work/valgissue/a.out)
    >13: ==28564==
    >14: buf=0x51f1040
    >15: ==28564==
    >16: ==28564== HEAP SUMMARY:
    >17: ==28564== in use at exit: 0 bytes in 0 blocks
    >18: ==28564== total heap usage: 1 allocs, 1 frees, 1 bytes allocated
    >19: ==28564==
    >20: ==28564== All heap blocks were freed -- no leaks are possible
    >21: ==28564==
    >22: ==28564== For counts of detected and suppressed errors, rerun with: -v
    >23: ==28564== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 2 from 2)
    >
    >Best Regards


    --
    Remove del for email
    Barry Schwarz, Jan 18, 2013
    #9
  10. writes:

    >> I would take a guess that 'strcpy' is trying to copy four bytes at a
    >> time. It's possible that it's got a bug for when the string length is <
    >> 4. Perhaps you could experiment with the strings "1", "12", "123", and
    >> see if the behaviour is the same for all but the last one.

    >
    > Hmm, in such case violation should take place before 'printf'.
    >
    > I have modified code:
    > ...
    > strcpy(buf, str);
    > printf("buf=%p\r\n", (void *)buf);fflush(stdout);
    > len = strlen(buf);
    > printf("KOKO\r\n");fflush(stdout);
    > ...
    > Result:
    > ...
    > 06: buf=0x51f1040
    > 07: ==4252== Invalid read of size 4
    > 08: ==4252== at 0x400784: bar (main.c:11)
    > 09: ==4252== by 0x4005FD: main (main.c:21)
    > 10: ==4252== Address 0x51f1040 is 0 bytes inside a block of size 1 alloc'd
    > 11: ==4252== at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    > 12: ==4252== by 0x400746: bar (main.c:6)
    > 13: ==4252== by 0x4005FD: main (main.c:21)
    > 14: ==4252==
    > 15: KOKO
    > ...
    >
    > I have also tried with the different length of string:
    > "" -> error
    > "1" -> error
    > "12" -> error
    > "123" -> OK:CLEAR
    > "1234" -> error
    > "12345" -> error
    > "123456" -> error
    > "1234567" -> OK:CLEAR
    > "12345678" -> failed
    >
    > It seems that strlen reads 4 bytes-words.
    > I assume that it is illegal.


    You used -O2, yes? When I do that the second strlen is inlined and does
    indeed check in 4-byte chunks:

    movq %rbx, %rsi
    ..L3:
    movl (%rsi), %eax
    addq $4, %rsi
    leal -16843009(%rax), %ecx
    notl %eax
    andl %eax, %ecx
    andl $-2139062144, %ecx
    je .L3

    The first strlen is not inlined, because gcc can't assume that 4-byte
    accesses are safe -- it can assume that after the malloc.

    At least one source says "using Valgrind with code that has been
    compiled with optimisation options could give incorrect results" and
    recommends use -O0. This seems reasonable since I don't think valgrind
    can do anything to avoid reporting an error in cases like this.

    --
    Ben.
    Ben Bacarisse, Jan 18, 2013
    #10
  11. Jorgen Grahn Guest

    On Fri, 2013-01-18, Xavier Roche wrote:
    > On 01/18/2013 11:42 AM, wrote:
    >> + gcc -O2 -ansi -pedantic -W -Wall -Werror -Wextra main.c

    >
    > You may add -g3 to get some debugging infos even with O2.


    I'd formulate that a bit differently. In gcc, there's no connection
    between -g and optimization; the problem here is simply that -g is
    missing. The symptom is that valgrind cannot report the source file
    and line for the violations it finds.

    (-g versus -g3 should be irrelevant in this case. I had to look it
    up: -g3 is like -g, but also includes macro definitions. Unclear to
    me when that's useful.)

    /Jorgen

    --
    // Jorgen Grahn <grahn@ Oo o. . .
    \X/ snipabacken.se> O o .
    Jorgen Grahn, Jan 19, 2013
    #11
  12. Jorgen Grahn Guest

    On Fri, 2013-01-18, Nick Bowler wrote:
    > On Fri, 18 Jan 2013 16:21:23 +0000, glen herrmannsfeldt wrote:
    >> wrote:

    > [...]
    >>> It seems that strlen reads 4 bytes-words.
    >>> I assume that it is illegal.

    >>
    >> Seems to me that if an implementation knew how it allocated strings,
    >> such that reading the three bytes after one didn't cause any problems,
    >> it should be able to get away with it.
    >>
    >> For many systems, unless you are at the end of addressing space and
    >> doing unaligned reads, it should not cause a problem.
    >>
    >> In that case, I would call it a valgrind bug. That is, it isn't
    >> consistent with the access patterns of the underlying implementation.

    >
    > I doubt there's any bug in the tools (valgrind or C implementation) here.
    > Most likely, the OP's configured suppressions are incomplete or out of
    > date, perhaps due to a botched installation of valgrind.


    Yes. The "suppressions" are Valgrind's tool for letting the library
    get away with making assumptions about its own internals.

    /Jorgen

    --
    // Jorgen Grahn <grahn@ Oo o. . .
    \X/ snipabacken.se> O o .
    Jorgen Grahn, Jan 19, 2013
    #12
  13. Xavier Roche Guest

    Le 18/01/2013 12:02, a écrit :
    > It seems that this is an 'strlen' bug in [gcc+valgrind].


    Not really a bug, but a careful optimization.

    What I understand (not perfectly, I did not dig into the actual code) is
    that strlen() reads chunks of bytes (4 or 8 bytes) at once to find the
    ending \0. This allows to speed up the whole process.

    This is illegal strictly speaking, except that the low-level
    implementation knows that if you stay on the same address page, reading
    few bytes after the ending is harmless. Technically, you will read the
    terminating \0 AND few garbage bytes, but theses ones will be simply
    ignored since the \0 is seen.

    Typically, if you reserve one (blank) memory page and ask strlen() on
    the last byte of the page, you should not end up with a SEGV.
    Xavier Roche, Jan 19, 2013
    #13
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Replies:
    2
    Views:
    33,983
    Victor Bazarov
    Feb 17, 2005
  2. Nathan Bates
    Replies:
    1
    Views:
    1,203
    Tim Peters
    Jul 4, 2006
  3. Mr. SweatyFinger
    Replies:
    2
    Views:
    1,675
    Smokey Grindel
    Dec 2, 2006
  4. Flash Gordon

    Invalid read of size 1 [valgrind warning]

    Flash Gordon, Dec 12, 2005, in forum: C Programming
    Replies:
    5
    Views:
    1,882
    Robert Gamble
    Dec 13, 2005
  5. sanket

    Invalid write Valgrind

    sanket, Sep 21, 2011, in forum: C Programming
    Replies:
    1
    Views:
    618
    Harald van Dijk
    Sep 21, 2011
Loading...

Share This Page