Flash said:
<snip>
Stop talking complete rubbish. As you well know, Valgrind can be used
with any standard C program, whereas garbage collection cannot.
Who is talking complete rubbish?
Here is a list of limitations of valgrind:
Valgrind will run Linux ELF binaries, on a kernel 2.4.X or 2.6.X system,
on the x86, amd64, ppc32 and ppc64 architectures, subject to the
following constraints:
*
On x86 and amd64, there is no support for 3DNow! instructions. If
the translator encounters these, Valgrind will generate a SIGILL when
the instruction is executed. Apart from that, on x86 and amd64,
essentially all instructions are supported, up to and including SSE2.
Version 3.1.0 includes limited support for SSE3 on x86. This could be
improved if necessary.
On ppc32 and ppc64, almost all integer, floating point and
Altivec instructions are supported. Specifically: integer and FP insns
that are mandatory for PowerPC, the "General-purpose optional" group
(fsqrt, fsqrts, stfiwx), the "Graphics optional" group (fre, fres,
frsqrte, frsqrtes), and the Altivec (also known as VMX) SIMD instruction
set, are supported.
*
Atomic instruction sequences are not properly supported, in the
sense that their atomicity is not preserved. This will affect any use of
synchronization via memory shared between processes. Such sequences will
appear to work, but fail sporadically.
*
If your program does its own memory management, rather than using
malloc/new/free/delete, it should still work, but Valgrind's error
checking won't be so effective. If you describe your program's memory
management scheme using "client requests" (see The Client Request
mechanism), Memcheck can do better. Nevertheless, using malloc/new and
free/delete is still the best approach.
*
Valgrind's signal simulation is not as robust as it could be.
Basic POSIX-compliant sigaction and sigprocmask functionality is
supplied, but it's conceivable that things could go badly awry if you do
weird things with signals. Workaround: don't. Programs that do non-POSIX
signal tricks are in any case inherently unportable, so should be
avoided if possible.
*
Machine instructions, and system calls, have been implemented on
demand. So it's possible, although unlikely, that a program will fall
over with a message to that effect. If this happens, please report ALL
the details printed out, so we can try to implement the missing feature.
*
Memory consumption of your program is greatly increased whilst
running under Valgrind. This is due to the large amount of
administrative information maintained behind the scenes. Another cause
is that Valgrind dynamically translates the original executable.
Translated, instrumented code is 12-18 times larger than the original,
so you can easily end up with 50+ MB of translations when running (eg) a
web browser.
*
Valgrind can handle dynamically-generated code just fine. If you
regenerate code over the top of old code (ie. at the same memory
addresses) and the code is on the stack, Valgrind will realise the code
has changed, and work correctly. This is necessary to handle the
trampolines GCC uses to implement nested functions. If you regenerate
code somewhere other than the stack, you will need to use the
--smc-check=all flag, and Valgrind will run more slowly than normal.
*
As of version 3.0.0, Valgrind has the following limitations in
its implementation of x86/AMD64 floating point relative to IEEE754.
Precision: There is no support for 80 bit arithmetic. Internally,
Valgrind represents all such "long double" numbers in 64 bits, and so
there may be some differences in results. Whether or not this is
critical remains to be seen. Note that the x86/amd64 fldt/fstpt
instructions (which read and write 80-bit numbers) are correctly
simulated, using conversions to/from 64 bits, so that in-memory images
of 80-bit numbers look correct if anyone cares to inspect them.
The impression from many FP regression tests is that the
accuracy differences aren't significant. Generally speaking, if a
program relies on 80-bit precision, there may be difficulties porting it
to non x86/amd64 platforms which only support 64-bit FP precision. Even
on x86/amd64, the program may get different results depending on whether
it is compiled to use SSE2 instructions (64-bits only), or x87
instructions (80-bit). The net effect is to make FP programs behave as
if they had been run on a machine with 64-bit IEEE floats, for example
PowerPC. On amd64 FP arithmetic is done by default on SSE2, so amd64
looks more like PowerPC than x86 from an FP perspective, and there are
far fewer noticeable accuracy differences than with x86.
Rounding: Valgrind does observe the 4 IEEE-mandated rounding
modes (to nearest, to +infinity, to -infinity, to zero) for the
following conversions: float to integer, integer to float where there is
a possibility of loss of precision, and float-to-float rounding. For all
other FP operations, only the IEEE default mode (round to nearest) is
supported.
Numeric exceptions in FP code: IEEE754 defines five types of
numeric exception that can happen: invalid operation (sqrt of negative
number, etc), division by zero, overflow, underflow, inexact (loss of
precision).
For each exception, two courses of action are defined by 754:
either (1) a user-defined exception handler may be called, or (2) a
default action is defined, which "fixes things up" and allows the
computation to proceed without throwing an exception.
Currently Valgrind only supports the default fixup actions.
Again, feedback on the importance of exception support would be appreciated.
When Valgrind detects that the program is trying to exceed any of
these limitations (setting exception handlers, rounding mode, or
precision control), it can print a message giving a traceback of where
this has happened, and continue execution. This behaviour used to be the
default, but the messages are annoying and so showing them is now
optional. Use --show-emwarns=yes to see them.
The above limitations define precisely the IEEE754 'default'
behaviour: default fixup on all exceptions, round-to-nearest operations,
and 64-bit precision.
*
As of version 3.0.0, Valgrind has the following limitations in
its implementation of x86/AMD64 SSE2 FP arithmetic, relative to IEEE754.
Essentially the same: no exceptions, and limited observance of
rounding mode. Also, SSE2 has control bits which make it treat
denormalised numbers as zero (DAZ) and a related action, flush denormals
to zero (FTZ). Both of these cause SSE2 arithmetic to be less accurate
than IEEE requires. Valgrind detects, ignores, and can warn about,
attempts to enable either mode.
*
As of version 3.2.0, Valgrind has the following limitations in
its implementation of PPC32 and PPC64 floating point arithmetic,
relative to IEEE754.
Scalar (non-Altivec): Valgrind provides a bit-exact emulation of
all floating point instructions, except for "fre" and "fres", which are
done more precisely than required by the PowerPC architecture
specification. All floating point operations observe the current
rounding mode.
However, fpscr[FPRF] is not set after each operation. That could
be done but would give measurable performance overheads, and so far no
need for it has been found.
As on x86/AMD64, IEEE754 exceptions are not supported: all
floating point exceptions are handled using the default IEEE fixup
actions. Valgrind detects, ignores, and can warn about, attempts to
unmask the 5 IEEE FP exception kinds by writing to the floating-point
status and control register (fpscr).
Vector (Altivec, VMX): essentially as with x86/AMD64 SSE/SSE2: no
exceptions, and limited observance of rounding mode. For Altivec, FP
arithmetic is done in IEEE/Java mode, which is more accurate than the
Linux default setting. "More accurate" means that denormals are handled
properly, rather than simply being flushed to zero.
Programs which are known not to work:
Emacs starts up but immediately concludes it is out of memory and
aborts. It may be that Memcheck does not provide a good enough emulation
of the mallinfo function. Emacs works fine if you build it to use the
standard malloc/free routines.