valgrind spews avalanche of messages

Discussion in 'C Programming' started by bruce56@topmail.co.nz, Jun 29, 2013.

  1. Guest

    I built an open-source package that comprises hundreds of source files
    in C, C++, F90 among others. It has a few bugs suggesting memory
    corruption, such as SIGSEGV or malloc and free aborts.

    So I tried valgrind --tool=memcheck
    It throws up hundreds of warnings about
    "Conditional jump or move depends on uninitialised value(s)"
    Most of these came from c code, a few from Intel libraries such
    as _intel_fast_memcmp or __intel_sse2_strlen

    So I presume these are not really fatal, or else the program
    would not get to first base. So when using valgrind, should
    one start at the end and work back, as the last few are
    more likely to indicate what killed it?
    , Jun 29, 2013
    #1

  2. James Kuyper Guest

    On 06/29/2013 09:30 AM, wrote:
    > I built an open-source package that comprises hundreds of source files
    > in C, C++, F90 among others. It has a few bugs suggesting memory
    > corruption, such as SIGSEGV or malloc and free aborts.
    >
    > So I tried valgrind --tool=memcheck
    > It throws up hundreds of warnings about
    > "Conditional jump or move depends on uninitialised value(s)"
    > Most of these came from c code, a few from Intel libraries such
    > as _intel_fast_memcmp or __intel_sse2_strlen


    They might have occurred inside those functions, but the real problem is
    the arguments that were passed to those functions, directly or
    indirectly. You should use the options that allow valgrind to transfer
    control to a debugger, and then trace your way up the call stack until
    you're inside your own code, so you can find out what it did to trigger
    that message.

    With the proper command line arguments (which I don't remember right
    now), if you build your executable in debug mode, valgrind can tell you
    the memory address containing the uninitialized value, and where the
    object was allocated that contains that address. This is not a precise
    identification: if it points to the start of a block, it could be
    referring to any of the objects initialized in that vicinity. However,
    if you check the address of each of those items, you should find one
    that fits.

    > So I presume these are not really fatal, or else the program
    > would not get to first base. ...


    They are not, in themselves, fatal, otherwise the program would have
    halted. However, those non-fatal defects are likely to have been at
    least part of the cause of other defects. The message means that you
    have a variable somewhere whose value was never initialized, and your
    program is making a decision about what to do next based upon the
    uninitialized value of that variable. That is almost always the result
    of a code defect (I say "almost always" only because, on certain poorly
    protected systems, malware sometimes looks at uninitialized memory in
    the hopes that it still contains valuable information left over from the
    last thing it was used for).

    Assuming that your code was actually intended to be making a decision
    based upon an initialized value, the fact that it isn't doing so means
    that your program will not be functioning the way that you intended.
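
As a minimal sketch of the kind of defect that message usually points to (the `struct job` type and all three functions are hypothetical, invented for illustration, and not taken from the package in question): one constructor leaves a member unset, and a later branch on that member is what memcheck would complain about.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record type, for illustration only. */
struct job {
    int id;
    int is_urgent;   /* decides a branch later on */
};

/* Defective: malloc() does not initialise the block, and is_urgent is
 * never assigned. A later `if (j->is_urgent)` is then a conditional jump
 * that depends on an uninitialised value. */
struct job *job_new_buggy(int id)
{
    struct job *j = malloc(sizeof *j);
    if (j != NULL)
        j->id = id;
    return j;
}

/* Fixed: calloc() zero-fills the block, so every member has a known
 * value before anything branches on it. */
struct job *job_new(int id)
{
    struct job *j = calloc(1, sizeof *j);
    if (j != NULL)
        j->id = id;
    return j;
}

/* The branch memcheck would flag when handed a job_new_buggy() result. */
const char *job_queue_name(const struct job *j)
{
    assert(j != NULL);
    return j->is_urgent ? "fast" : "normal";
}
```

Note that the warning would be reported at the branch in `job_queue_name()`, not at the `malloc()` call that is really at fault. Running under `valgrind --track-origins=yes` should additionally report where the uninitialised value was allocated.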

    > ... So when using valgrind, should
    > one start at the end and work back, as the last few are
    > more likely to indicate what killed it?


    No, as with all debugging, it's generally better to remove the first
    problem it finds, because that problem might have caused the other
    problems, and even if it did not, the symptoms of that problem might
    interfere with identification of the other problems.
    --
    James Kuyper
    James Kuyper, Jun 29, 2013
    #2

  3. On Saturday, June 29, 2013 2:30:26 PM UTC+1, wrote:
    > I built an open-source package that comprises hundreds of source files
    > in C, C++, F90 among others. It has a few bugs suggesting memory
    > corruption, such as SIGSEGV or malloc and free aborts.
    >
    >
    > So I tried valgrind --tool=memcheck
    > It throws up hundreds of warnings about
    > "Conditional jump or move depends on uninitialised value(s)"
    > Most of these came from c code, a few from Intel libraries such
    > as _intel_fast_memcmp or __intel_sse2_strlen
    >
    > So I presume these are not really fatal, or else the program
    > would not get to first base. So when using valgrind, should
    > one start at the end and work back, as the last few are
    > more likely to indicate what killed it?
    >

    It's unlikely that intel_fast_memcmp really has a bug in it. More
    likely it's written in a strange, highly optimal way that valgrind
    can't understand.
    When the program segfaults, valgrind will halt. Your last
    message should be the instruction that caused it to crash. However,
    the root of the problem is unlikely to be there. What will have
    happened is that a pointer will have been set to an invalid
    value earlier on. So second port of call, after the crash itself,
    is finding where the bad pointer got its value from.
    Malcolm McLean, Jun 29, 2013
    #3
  4. Öö Tiib Guest

    On Saturday, 29 June 2013 16:30:26 UTC+3, wrote:
    > I built an open-source package that comprises hundreds of source files
    > in C, C++, F90 among others. It has a few bugs suggesting memory
    > corruption, such as SIGSEGV or malloc and free aborts.


    Open source is often of rather terrible quality. Not as rule, just often.

    > So I tried valgrind --tool=memcheck
    > It throws up hundreds of warnings about
    > "Conditional jump or move depends on uninitialised value(s)"
    > Most of these came from c code, a few from Intel libraries such
    > as _intel_fast_memcmp or __intel_sse2_strlen
    >
    > So I presume these are not really fatal, or else the program
    > would not get to first base. So when using valgrind, should
    > one start at the end and work back, as the last few are
    > more likely to indicate what killed it?


    Note that the places where a defect is in code and where it manifests
    itself (as crash or other misbehavior) are often quite distant.
    Therefore I would start from first warnings. The warnings in library
    are often caused by caller of library supplying invalid arguments. Last
    warnings are usually where the fatally wounded program finally died.
    That can be far from where it got the mortal wounds.
    Öö Tiib, Jun 29, 2013
    #4
  5. Les Cargill Guest

    wrote:
    > I built an open-source package that comprises hundreds of source files
    > in C, C++, F90 among others. It has a few bugs suggesting memory
    > corruption, such as SIGSEGV or malloc and free aborts.
    >
    > So I tried valgrind --tool=memcheck
    > It throws up hundreds of warnings about
    > "Conditional jump or move depends on uninitialised value(s)"
    > Most of these came from c code, a few from Intel libraries such
    > as _intel_fast_memcmp or __intel_sse2_strlen
    >
    > So I presume these are not really fatal, or else the program
    > would not get to first base.


    That is a bad assumption. Your "machine" has a loose part
    that will wreck it every so-random often.

    > So when using valgrind, should
    > one start at the end and work back, as the last few are
    > more likely to indicate what killed it?
    >
    >


    You have, somewhere, an uninitialized value used as an argument
    to those functions, unless those libraries are seriously
    broken.

    Yes, you have to fix this :)

    --
    Les Cargill
    Les Cargill, Jun 29, 2013
    #5
  6. James Kuyper Guest

    On 06/29/2013 10:48 AM, Malcolm McLean wrote:
    > On Saturday, June 29, 2013 2:30:26 PM UTC+1, wrote:
    >> I built an open-source package that comprises hundreds of source files
    >> in C, C++, F90 among others. It has a few bugs suggesting memory
    >> corruption, such as SIGSEGV or malloc and free aborts.
    >>
    >>
    >> So I tried valgrind --tool=memcheck
    >> It throws up hundreds of warnings about
    >> "Conditional jump or move depends on uninitialised value(s)"
    >> Most of these came from c code, a few from Intel libraries such
    >> as _intel_fast_memcmp or __intel_sse2_strlen
    >>
    >> So I presume these are not really fatal, or else the program
    >> would not get to first base. So when using valgrind, should
    >> one start at the end and work back, as the last few are
    >> more likely to indicate what killed it?
    >>

    > It's unlikely that intel_fast_memcmp really has a bug in it. More
    > likely it's written in a strange, highly optimal way that valgrind
    > can't understand.


    The way that valgrind works, it doesn't need to understand how
    _intel_fast_memcmp() is written - it essentially runs the executable in
    an instrumented emulator of the target platform. It keeps track of which
    pieces of memory have been initialized, and when a conditional jump is
    executed based upon the value stored in such memory, valgrind generates
    this message. It doesn't have any need to understand why the jump is
    being executed.
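
The bookkeeping described above can be sketched as a toy model. This is only an illustration of the idea, not valgrind's implementation (memcheck tracks definedness per bit and covers registers as well); every name below is made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MEM_SIZE 16

/* Toy model only: one "defined" flag per byte of memory. */
struct toy_mem {
    unsigned char data[TOY_MEM_SIZE];
    bool defined[TOY_MEM_SIZE];
    unsigned reports;          /* number of complaints so far */
};

void toy_init(struct toy_mem *m)
{
    memset(m, 0, sizeof *m);   /* all bytes start out undefined */
}

/* A store makes the byte defined. */
void toy_store(struct toy_mem *m, size_t addr, unsigned char value)
{
    assert(addr < TOY_MEM_SIZE);
    m->data[addr] = value;
    m->defined[addr] = true;
}

/* A copy carries the flag along with the value and never complains. */
void toy_copy(struct toy_mem *m, size_t dst, size_t src)
{
    assert(dst < TOY_MEM_SIZE && src < TOY_MEM_SIZE);
    m->data[dst] = m->data[src];
    m->defined[dst] = m->defined[src];
}

/* A conditional branch is where the complaint is raised. */
bool toy_branch_if_nonzero(struct toy_mem *m, size_t addr)
{
    assert(addr < TOY_MEM_SIZE);
    if (!m->defined[addr])
        m->reports++;          /* "depends on uninitialised value(s)" */
    return m->data[addr] != 0;
}
```

The point of the model is that the report is tied to the branch, wherever it happens, and says nothing about who failed to initialise the byte. That is why the warning can surface deep inside a library routine.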

    This message was almost certainly the result of uninitialized memory
    being passed to _intel_fast_memcmp() by higher level code. Assuming it's
    reasonably named, that function is likely to execute a conditional jump
    based upon the value stored in each and every byte of both buffers
    passed to it.
    --
    James Kuyper
    James Kuyper, Jun 29, 2013
    #6
  7. Guest

    On Saturday, 29 June 2013 22:59:31 UTC+8, Öö Tiib wrote:
    > Note that the places where a defect is in code and where it manifests
    >
    > itself (as crash or other misbehavior) are often quite distant.
    >
    > Therefore I would start from first warnings. The warnings in library
    >
    > are often caused by caller of library supplying invalid arguments. Last
    >
    > warnings are usually where the fatally wounded program finally died.
    >
    > That can be far from where it got the mortal wounds.


    I know this. One of the free() crashes gives an address of 2020202020 hex,
    which suggests a string of ASCII spaces is overwriting the allocated
    chunk. But the module in question does no string handling.
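
A quick way to test that reading of a corrupted pointer is to check whether every significant byte is the same fill character. The helper below is a hypothetical sketch, not part of any tool. One possibility worth keeping in mind, and it is only a guess, is that the writer is not C string code at all: the package includes F90, and Fortran pads character variables with blanks.

```c
#include <assert.h>
#include <stdint.h>

/* Returns how many low-order bytes of `value` equal `fill`, provided every
 * byte up to the most significant non-zero byte matches; returns 0 if any
 * of those bytes differs or if value is 0. */
unsigned count_fill_bytes(uint64_t value, unsigned char fill)
{
    unsigned count = 0;

    assert(fill != 0);
    while (value != 0) {
        if ((value & 0xFFu) != fill)
            return 0;
        count++;
        value >>= 8;
    }
    return count;
}
```

For the address quoted above, `count_fill_bytes(0x2020202020, ' ')` yields 5, i.e. five consecutive space characters where a pointer should be.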
    , Jun 30, 2013
    #7
  8. On Saturday, June 29, 2013 9:43:51 PM UTC+1, James Kuyper wrote:
    > On 06/29/2013 10:48 AM, Malcolm McLean wrote:
    > > On Saturday, June 29, 2013 2:30:26 PM UTC+1, wrote:
    > >> I built an open-source package that comprises hundreds of source files
    > >> in C, C++, F90 among others. It has a few bugs suggesting memory
    > >> corruption, such as SIGSEGV or malloc and free aborts.
    > >>
    > >> So I tried valgrind --tool=memcheck
    > >> It throws up hundreds of warnings about
    > >> "Conditional jump or move depends on uninitialised value(s)"
    > >> Most of these came from c code, a few from Intel libraries such
    > >> as _intel_fast_memcmp or __intel_sse2_strlen
    > >>
    > >> So I presume these are not really fatal, or else the program
    > >> would not get to first base. So when using valgrind, should
    > >> one start at the end and work back, as the last few are
    > >> more likely to indicate what killed it?
    > >>
    > > It's unlikely that intel_fast_memcmp really has a bug in it. More
    > > likely it's written in a strange, highly optimal way that valgrind
    > > can't understand.
    >
    > The way that valgrind works, it doesn't need to understand how
    > _intel_fast_memcmp() is written - it essentially runs the executable in
    > an instrumented emulator of the target platform. It keeps track of which
    > pieces of memory have been initialized, and when a conditional jump is
    > executed based upon the value stored in such memory, valgrind generates
    > this message. It doesn't have any need to understand why the jump is
    > being executed.
    >
    > This message was almost certainly the result of uninitialized memory
    > being passed to _intel_fast_memcmp() by higher level code. Assuming it's
    > reasonably named, that function is likely to execute a conditional jump
    > based upon the value stored in each and every byte of both buffers
    > passed to it.

    I'm guessing (it's only a guess) that intel_fast_memcmp takes arbitrary
    unsigned char *s and lengths, aligns them on 64 bit boundaries, and if
    all 64 bit chunks match, returns 0. If the edge bits don't match, it does
    AND and OR masking to get the correct answer.
    So valgrind will think it's using uninitialised memory, which it is,
    but legitimately. (Not legal in C, but it's not written in C).
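
The masking step guessed at above might look roughly like the following sketch. It says nothing about how Intel's routine is really written. To stay within standard C, both inputs here are full 8-byte arrays, so nothing is read out of bounds; a real word-at-a-time routine would instead load whatever bytes happen to follow the data. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WORD_BYTES 8

/* Assemble a word with byte i in bits 8*i .. 8*i+7, so the result does not
 * depend on the host's endianness. */
static uint64_t load_word(const unsigned char bytes[WORD_BYTES])
{
    uint64_t w = 0;
    for (size_t i = 0; i < WORD_BYTES; i++)
        w |= (uint64_t)bytes[i] << (8 * i);
    return w;
}

/* Compare only the first n bytes of two full words: the trailing bytes are
 * loaded, but masked off before they can influence the result. */
bool tail_words_equal(const unsigned char a[WORD_BYTES],
                      const unsigned char b[WORD_BYTES], size_t n)
{
    assert(n <= WORD_BYTES);
    uint64_t mask = (n == WORD_BYTES) ? UINT64_MAX
                                      : ((UINT64_C(1) << (8 * n)) - 1);
    return ((load_word(a) ^ load_word(b)) & mask) == 0;
}
```

With `n` set to 3, two words that agree in their first three bytes compare equal no matter what the other five contain.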
    Malcolm McLean, Jun 30, 2013
    #8
  9. Ike Naar Guest

    On 2013-06-29, Öö Tiib <> wrote:
    > On Saturday, 29 June 2013 16:30:26 UTC+3, wrote:
    >> I built an open-source package that comprises hundreds of source files
    >> in C, C++, F90 among others. It has a few bugs suggesting memory
    >> corruption, such as SIGSEGV or malloc and free aborts.

    >
    > Open source is often of rather terrible quality. Not as rule, just often.


    Same for closed source.
    At least with open source one can see how bad things are.
    Ike Naar, Jun 30, 2013
    #9
  10. James Kuyper Guest

    On 06/30/2013 05:11 AM, Malcolm McLean wrote:
    ....
    > I'm guessing (it's only a guess) that intel_fast_memcmp takes arbitrary
    > unsigned char *s and lengths, aligns them on 64 bit boundaries, and if
    > all 64 bit chunks match, returns 0. If the edge bits don't match, it does
    > AND and OR masking to get the correct answer.
    > So valgrind will think it's using uninitialised memory, which it is,
    > but legitimately. (Not legal in C, but it's not written in C).


    I'll concede that's a possibility, but it seems far more plausible to me
    that calling code is defective by reason of calling _intel_fast_memcmp()
    (directly or indirectly) to compare two buffers, one or both of which
    has uninitialized memory within the specified length. Developers often
    make mistakes like that, which is one of the main reasons for the very
    existence of tools like valgrind.
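
A sketch of that kind of caller-side mistake, using a hypothetical fixed-width key type (none of these names come from the package under discussion): the buggy setter writes only the string and its terminator, yet the comparison looks at the whole array.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NAME_LEN 16

/* Hypothetical fixed-width key, for illustration only. */
struct key {
    char name[NAME_LEN];
};

/* Defective: only strlen(s) + 1 bytes are written; the rest of name[]
 * keeps whatever happened to be there before. */
void key_set_buggy(struct key *k, const char *s)
{
    assert(strlen(s) < NAME_LEN);
    strcpy(k->name, s);
}

/* Fixed: clear the whole array first so all NAME_LEN bytes are defined. */
void key_set(struct key *k, const char *s)
{
    assert(strlen(s) < NAME_LEN);
    memset(k->name, 0, sizeof k->name);
    strcpy(k->name, s);
}

/* Compares every byte of both arrays, initialised or not. */
bool key_equal(const struct key *a, const struct key *b)
{
    return memcmp(a->name, b->name, NAME_LEN) == 0;
}
```

Two keys filled by `key_set_buggy()` with the same string may or may not compare equal, depending on stale bytes after the terminator, and memcheck would report the branch inside the comparison routine rather than the setter.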
    --
    James Kuyper
    James Kuyper, Jun 30, 2013
    #10
  11. Öö Tiib Guest

    On Sunday, 30 June 2013 12:53:41 UTC+3, Ike Naar wrote:
    > On 2013-06-29, Öö Tiib <> wrote:
    > > On Saturday, 29 June 2013 16:30:26 UTC+3, wrote:
    > >> I built an open-source package that comprises hundreds of source files
    > >> in C, C++, F90 among others. It has a few bugs suggesting memory
    > >> corruption, such as SIGSEGV or malloc and free aborts.

    > >
    > > Open source is often of rather terrible quality. Not as rule, just often.

    >
    > Same for closed source.
    > At least with open source one can see how bad things are.


    That is not a benefit. When someone asks me to look into the source
    code of a closed-source product, they usually offer money for it. With
    open source ... maybe others are better at that ... but I have been paid
    noteworthy money for working on open source only once.
    Öö Tiib, Jun 30, 2013
    #11
  12. Jorgen Grahn Guest

    On Sat, 2013-06-29, wrote:
    > I built an open-source package that comprises hundreds of source files
    > in C, C++, F90 among others. It has a few bugs suggesting memory
    > corruption, such as SIGSEGV or malloc and free aborts.
    >
    > So I tried valgrind --tool=memcheck
    > It throws up hundreds of warnings about


    These may well be your fault, as the others say.

    But note that valgrind comes with "suppressions" -- warnings from
    popular libraries (like the Gnu libc) which have been investigated and
    found harmless. If you're working on an exotic combination of OS,
    compiler and libc you may not have the best possible set of
    suppressions.

    The valgrind documentation can tell you more.

    /Jorgen

    --
    // Jorgen Grahn <grahn@ Oo o. . .
    \X/ snipabacken.se> O o .
    Jorgen Grahn, Jun 30, 2013
    #12
  13. James Kuyper <> wrote:
    > On 06/30/2013 05:11 AM, Malcolm McLean wrote:


    (snip)
    >> I'm guessing (it's only a guess) that intel_fast_memcmp takes
    >> arbitrary unsigned char *s and lengths, aligns them on 64 bit
    >> boundaries, and if all 64 bit chunks match, returns 0. If the
    >> edge bits don't match, it does AND and OR masking to get the
    >> correct answer.


    >> So valgrind will think it's using uninitialised memory,
    >> which it is, but legitimately. (Not legal in C, but it's
    >> not written in C).


    Or maybe written in non-standard C. While one can't portably
    write things like that, within an implementation one can either
    write the assembler code for it, or write it using the C compiler.
    (Note that I didn't say write it in C.)

    > I'll concede that's a possibility, but it seems far more
    > plausible to me that calling code is defective by reason of
    > calling _intel_fast_memcmp() (directly or indirectly) to
    > compare two buffers, one or both of which has uninitialized
    > memory within the specified length. Developers often make mistakes
    > like that, which is one of the main reasons for the very
    > existence of tools like valgrind.


    If the implementation is doing that, then a version of valgrind for
    that implementation should know about it.

    -- glen
    glen herrmannsfeldt, Jul 1, 2013
    #13
  14. James Kuyper Guest

    On 07/01/2013 03:24 PM, glen herrmannsfeldt wrote:
    > James Kuyper <> wrote:

    ....
    >> I'll concede that's a possibility, but it seems far more
    >> plausible to me that calling code is defective by reason of
    >> calling _intel_fast_memcmp() (directly or indirectly) to
    >> compare two buffers, one or both of which has uninitialized
    >> memory within the specified length. Developers often make mistakes
    >> like that, which is one of the main reasons for the very
    >> existence of tools like valgrind.

    >
    > If the implementation is doing that, then a version of valgrind for
    > that implementation should know about it.


    I was referring to a defect in the user code, not a defect in the
    implementation.
    James Kuyper, Jul 1, 2013
    #14
  15. Nobody Guest

    On Sat, 29 Jun 2013 22:23:31 -0500, Gordon Burditt wrote:

    > It *is* possible that a highly-optimized C library reads beyond the end of
    > a string, say, 8 bytes at a time, yet still operates correctly because the
    > compiler knows that its malloc() implementation won't ever allocate the
    > last 7 bytes of a virtual memory chunk, so the code won't segfault by
    > going beyond the end of the string, but valgrind doesn't know this.


    It doesn't matter whether malloc() allocates the last few bytes of a page.

    A string-processing algorithm which works in e.g. 8-byte units will
    invariably work in *aligned* 8-byte units, so it will only read bytes
    beyond the end of a string when those bytes are within the same alignment
    unit as one or more bytes of the string, and thus within the same page.
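
The arithmetic behind that argument can be checked directly. The sketch below works on plain address values and reads no memory; the 4096-byte page size is an assumption, and the argument only needs the page size to be a multiple of the unit size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define UNIT 8u       /* read width discussed above */
#define PAGE 4096u    /* assumed page size; a multiple of UNIT */

/* First and last address of the aligned unit containing addr. */
uintptr_t unit_first(uintptr_t addr) { return addr - (addr % UNIT); }
uintptr_t unit_last(uintptr_t addr)  { return unit_first(addr) + (UNIT - 1); }

/* An aligned read of the unit holding addr never leaves addr's page. */
bool aligned_read_in_page(uintptr_t addr)
{
    return unit_first(addr) / PAGE == addr / PAGE
        && unit_last(addr)  / PAGE == addr / PAGE;
}

/* An unaligned UNIT-byte read starting at addr may cross into the next page. */
bool unaligned_read_in_page(uintptr_t addr)
{
    assert(addr <= UINTPTR_MAX - (UNIT - 1));
    return (addr + (UNIT - 1)) / PAGE == addr / PAGE;
}
```

For a string whose last byte sits at address 4093, the aligned unit spans 4088 to 4095 and stays in the first page, whereas an unaligned 8-byte read starting at 4093 would run on to 4100 and into the next page.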
    Nobody, Jul 1, 2013
    #15
  16. On 01-Jul-13 14:55, Nobody wrote:
    > On Sat, 29 Jun 2013 22:23:31 -0500, Gordon Burditt wrote:
    >> It *is* possible that a highly-optimized C library reads beyond the
    >> end of a string, say, 8 bytes at a time, yet still operates
    >> correctly because the compiler knows that its malloc()
    >> implementation won't ever allocate the last 7 bytes of a virtual
    >> memory chunk, so the code won't segfault by going beyond the end of
    >> the string, but valgrind doesn't know this.

    >
    > It doesn't matter whether malloc() allocates the last few bytes of a
    > page.
    >
    > A string-processing algorithm which works in e.g. 8-byte units will
    > invariably work in *aligned* 8-byte units, so it will only read
    > bytes beyond the end of a string when those bytes are within the same
    > alignment unit as one or more bytes of the string, and thus within
    > the same page.


    malloc() implementations often allocate smallish blocks in a set of
    fixed sizes, typically multiples of 8, to reduce heap fragmentation.
    Since the start of each block needs to be aligned for any data type,
    also typically 8 bytes, that means that reading from such blocks in
    aligned chunks of 8 bytes will be safe from segfaults. As long as an
    optimized memcmp() or strcmp() doesn't actually _use_ data from beyond
    the proper length, it's not an error--at least for code, eg. Intel's
    optimized libraries, that can be reasonably considered part of the
    implementation.

    S

    --
    Stephen Sprunk "God does not play dice." --Albert Einstein
    CCIE #3723 "God is an inveterate gambler, and He throws the
    K5SSS dice at every possible opportunity." --Stephen Hawking
    Stephen Sprunk, Jul 2, 2013
    #16