Debugger "print" clears memory corruption

Gavin Kreuiter

I am looking for some advice on how to debug a program when the
debugger "print" command actually clears the corruption. This is not
the usual non-initialised memory problem, because the program aborts
with a SIGBUS inside the debugger as well. But when I use the print
command inside the debugger, the program completes normally.

I am using gdb on a linux system. The offending C code is:

memcpy(new_entry, &newloc, IRECPTRLEN);

I display these values just before the memcpy:

printf("Calling memcpy(%p, %p, %d)\n", (void *)new_entry,
       (void *)&newloc, IRECPTRLEN);

... which works. When run straight from gdb (output trimmed a bit):

$ gdb xwif
(gdb) b src/c_library.c:598
Breakpoint 1 at 0x804bca3: file src/c_library.c, line 598.
(gdb) run
Starting program: /home/dev/bin/xwif -p
Calling memcpy(0x4001f000, 0xbffff04c, 4)

Breakpoint 1, c$keyed_write (p=0x80520a0, record=0x80658a0 "\002") at
src/c_library.c:598
598 memcpy(new_entry, &newloc, IRECPTRLEN);
(gdb) s

Program received signal SIGBUS, Bus error.
0x4207c46c in memcpy () from /lib/i686/libc.so.6

But when I use "print" before "step":

$ gdb xwif
(gdb) b src/c_library.c:598
Breakpoint 1 at 0x804bca3: file src/c_library.c, line 598.
(gdb) r

Starting program: /home/dev/bin/xwif -p
Calling memcpy(0x4001f000, 0xbffff04c, 4)

Breakpoint 1, c$keyed_write (p=0x80520a0, record=0x80658a0 "\002") at
src/c_library.c:598
598 memcpy(new_entry, &newloc, IRECPTRLEN);
(gdb) p new_entry
$1 = 0x4001f000 ""
(gdb) s
599 new_entry += IRECPTRLEN;
(gdb)

... and it completes successfully.

I *know* that I am corrupting memory somewhere (I am calling mmap). I
wrote a small program to test the way I am using mmap(), and it works.
But when I try to include it in a much larger application, it aborts.
I am not asking you to debug my program, nor for help on mmap()
(although, if you really want to spend hours stepping through my code,
I won't object :) But I am requesting help with techniques to debug
programs exhibiting symptoms like the above.
 
Grumble

Gavin said:
I am looking for some advice on how to debug a program when the
debugger "print" command actually clears the corruption. This is not
the usual non-initialised memory problem, because the program aborts
with a SIGBUS inside the debugger as well. But when I use the print
command inside the debugger, the program completes normally.

I am using gdb on a linux system. The offending C code is:

memcpy(new_entry, &newloc, IRECPTRLEN);

How is new_entry declared? It is probably read-only. Did you try to
dynamically allocate space for it before you call memcpy()?

Do you know about gcc's -Wwrite-strings and -fwritable-strings?
 
Kevin Goodsell

Gavin said:
I am looking for some advice on how to debug a program when the
debugger "print" command actually clears the corruption. This is not
the usual non-initialised memory problem, because the program aborts
with a SIGBUS inside the debugger as well. But when I use the print
command inside the debugger, the program completes normally.

Can't you let the program die then see where it happened? gdb usually
reports where a program crashed.

SIGBUS is an indication that you have an alignment issue. Remember that
you can't simply address an arbitrary memory location as if it were an
int, or a double, or whatever:

int main(void)
{
    char a[sizeof(int)];
    int *p = (int *)a;

    *p = 100; /* Possible alignment error! */

    return 0;
}

-Kevin
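
As a hedged aside on the example above: the usual portable way to store an int at an arbitrary byte address is to copy its bytes with memcpy() rather than dereference a cast pointer. The sketch below shows that approach; store_int and load_int are hypothetical helper names, not anything from the thread. Note that the failing line in the original post already goes through memcpy(), which places no alignment requirement on its arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: store an int at any byte address, aligned or not. */
static void store_int(void *dst, int value)
{
    assert(dst != NULL);
    memcpy(dst, &value, sizeof value);   /* byte-wise copy, no alignment needed */
}

/* Hypothetical helper: read an int back from any byte address. */
static int load_int(const void *src)
{
    int value;

    assert(src != NULL);
    memcpy(&value, src, sizeof value);
    return value;
}
```

With these, store_int(a + 1, 100) is well defined even when a + 1 is not suitably aligned for an int, provided the buffer has room for sizeof(int) bytes at that offset.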
 
Chris Torek

Gavin said:
I am looking for some advice on how to debug a program when the
debugger "print" command actually clears the corruption. This is not
the usual non-initialised memory problem, because the program aborts
with a SIGBUS inside the debugger as well. But when I use the print
command inside the debugger, the program completes normally.

I am using gdb on a linux system. The offending C code is:

memcpy(new_entry, &newloc, IRECPTRLEN);

[examples snipped]

I suspect neither answer so far is right, and that the problem is
something more subtle having to do with whether the page(s) is/are
allocated at the time memcpy() first touches them. Using the
debugger's "print" command forces a read access to the address, so
that the page is in RAM (and may even be r/w) by the time you step
into memcpy().

There are any number of ways to find out if this is the case, and
what else might be going on, but all of them are off-topic save one:
you can force a write access to the first byte at new_entry via:

*(unsigned char *)new_entry = *(unsigned char *)&newloc;

before the memcpy() operation. If the behavior changes, you at least
have some additional information.
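
The one-byte write above can be wrapped in a small helper, sketched here under two assumptions: the name touch_for_write is made up for illustration, and the volatile qualifier is an addition that is not in the original suggestion. It is there because the memcpy() that follows overwrites the same byte, so an optimizing compiler might otherwise discard the probe as a dead store.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: force a write access to the first byte at dst
 * by copying the first byte of src into it.  The volatile-qualified
 * store is an assumption added here so that the compiler cannot drop
 * the write as redundant.
 */
static void touch_for_write(void *dst, const void *src)
{
    assert(dst != NULL && src != NULL);
    *(volatile unsigned char *)dst = *(const unsigned char *)src;
}
```

In the code from the original post it would be called as touch_for_write(new_entry, &newloc); on the line just before the memcpy(). If the SIGBUS then moves to the helper, the fault is tied to the first write to that page rather than to anything memcpy() does internally.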

(A Linux-specific group -- which one is not clear -- would be the
right place to go for information on what extra debugging information
is available after a SIGBUS is caught in the debugger, and how to
trace relevant system activity up to that point.)
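
For readers who want to experiment, here is a hedged sketch of one well-known way a mmap()ed page can raise SIGBUS on first touch on POSIX systems: the mapping extends past the end of the underlying file. Nothing in this thread confirms that this is what happens in the original program, and it does not by itself explain why "print" changes the outcome. The names probe_write and probe_file_mapping, and the temporary path under /tmp, are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

static sigjmp_buf probe_env;
static volatile sig_atomic_t probe_sig;

/* Record which signal arrived and jump back into probe_write(). */
static void probe_handler(int sig)
{
    probe_sig = sig;
    siglongjmp(probe_env, 1);
}

/* Write one byte to addr; return 0 on success, else the signal raised. */
static int probe_write(volatile unsigned char *addr)
{
    struct sigaction sa, old_bus, old_segv;

    assert(addr != NULL);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = probe_handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGBUS, &sa, &old_bus);
    sigaction(SIGSEGV, &sa, &old_segv);

    probe_sig = 0;
    if (sigsetjmp(probe_env, 1) == 0)
        *addr = 0;              /* may fault and return via the handler */

    /* Restore whatever handlers were installed before the probe. */
    sigaction(SIGBUS, &old_bus, NULL);
    sigaction(SIGSEGV, &old_segv, NULL);
    return probe_sig;
}

/*
 * Map one page of a fresh temporary file and probe its first byte.
 * If extend_first is nonzero, the file is grown to one page before
 * mapping.  Returns the probe result, or -1 if the setup failed.
 */
static int probe_file_mapping(int extend_first)
{
    char path[] = "/tmp/sigbus_demo_XXXXXX";
    long page = sysconf(_SC_PAGESIZE);
    int fd = mkstemp(path);
    int sig = -1;
    void *p;

    if (fd < 0)
        return -1;
    unlink(path);               /* the file disappears once fd is closed */
    if (page > 0 && (!extend_first || ftruncate(fd, (off_t)page) == 0)) {
        p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                 MAP_SHARED, fd, 0);
        if (p != MAP_FAILED) {
            sig = probe_write(p);
            munmap(p, (size_t)page);
        }
    }
    close(fd);
    return sig;
}
```

On Linux, probe_file_mapping(0) should report SIGBUS because the file is empty and no page backs the mapping, while probe_file_mapping(1) should report 0 because ftruncate() has extended the file first. Comparing the size of the mapped file against the length passed to mmap() is a cheap check to make before reaching for heavier tools.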
 
