how to debug python application crashed occasionally

J

jacky wang

Hello

recently, I met a problem with one python application running with
python2.5 | debian/lenny adm64 system: it crashed occasionally in our
production environment. The problem started to happen just after we
upgraded the python application from python2.4 | debian/etch amd64.

after configuring the system to enable core dump & debugging with
the core dumps by following the guide line from http://wiki.python.org/moin/DebuggingWithGdb,
I became more confused about that.

The first crash case was happening in calling python-xml module,
which is claimed as a pure python module, and it's not supposed to
crash python interpreter. because the python application is relatively
a big one, I can not show u guys the exact source code related with
the crash, but only the piece of python modules. GDB shows it's
crashed at string join operation:

#0 string_join (self=0x7f7075baf030, orig=<value optimized out>)
at ../Objects/stringobject.c:1795
1795 ../Objects/stringobject.c: No such file or directory.
in ../Objects/stringobject.c

and pystack macro shows the details:

gdb) pystack
/usr/lib/python2.5/StringIO.py (271): getvalue
/usr/lib/python2.5/site-packages/_xmlplus/dom/minidom.py (62):
toprettyxml
/usr/lib/python2.5/site-packages/_xmlplus/dom/minidom.py (47): toxml

at that time, we also found python-xml module has performance issue
for our application, so we decided to use python-lxml to replace
python-xml. After that replacement, the crash was gone. That's a bit
weird for me, but anyway, it's gone.

Unfortunately, another two 'kinds' of crashes happening after that,
and the core dumps show they are not related with the replacement.

One is crashed with "Program terminated with signal 11", and the
pystack macro shows it's crashed at calling the built-in id()
function.

#0 visit_decref (op=0x20200a3e22726574, data=0x0) at ../Modules/
gcmodule.c:270
270 ../Modules/gcmodule.c: No such file or directory.
in ../Modules/gcmodule.c


Another is crashed with "Program terminated with signal 7", and the
pystack macro shows it's crashed at the exactly same operation (string
join) as the first one (python-xml), but in different library python-
simplejson:

#0 string_join (self=0x7f5149877030, orig=<value optimized out>)
at ../Objects/stringobject.c:1795
1795 ../Objects/stringobject.c: No such file or directory.
in ../Objects/stringobject.c

(gdb) pystack
/var/lib/python-support/python2.5/simplejson/encoder.py (367): encode
/var/lib/python-support/python2.5/simplejson/__init__.py (243): dumps

I'm not good at using gdb & C programming, then I tried some other
ways to dig further:
* get the source code of python2.5, but can not figure out the
crash reason :(
* since Debian distribution provides python-dbg package, and I
tried to use python2.5-dbg interpreter, but not the python2.5, so that
I can get more debug information in the core dump file. Unfortunately,
my python application is using a bunch of C modules, and not all of
them provides -dbg package in Debian/Lenny. So it still doesn't make
any progress yet.

I will be really appreciated if somebody can help me about how to
debug the python crashes.

Thanks in advance!

BR
Jacky Wang
 
J

jacky wang

could anyone help me?


Hello

  recently, I met a problem with one python application running with
python2.5 | debian/lenny adm64 system: it crashed occasionally in our
production environment. The problem started to happen just after we
upgraded the python application from python2.4 | debian/etch amd64.

  after configuring the system to enable core dump & debugging with
the core dumps by following the guide line fromhttp://wiki.python.org/moin/DebuggingWithGdb,
I became more confused about that.

  The first crash case was happening in calling python-xml module,
which is claimed as a pure python module, and it's not supposed to
crash python interpreter. because the python application is relatively
a big one, I can not show u guys the exact source code related with
the crash, but only the piece of python modules. GDB shows it's
crashed at string join operation:

#0  string_join (self=0x7f7075baf030, orig=<value optimized out>)
at ../Objects/stringobject.c:1795
1795    ../Objects/stringobject.c: No such file or directory.
        in ../Objects/stringobject.c

and pystack macro shows the details:

gdb) pystack
/usr/lib/python2.5/StringIO.py (271): getvalue
/usr/lib/python2.5/site-packages/_xmlplus/dom/minidom.py (62):
toprettyxml
/usr/lib/python2.5/site-packages/_xmlplus/dom/minidom.py (47): toxml

  at that time, we also found python-xml module has performance issue
for our application, so we decided to use python-lxml to replace
python-xml. After that replacement, the crash was gone. That's a bit
weird for me, but anyway, it's gone.

  Unfortunately, another two 'kinds' of crashes happening after that,
and the core dumps show they are not related with the replacement.

  One is crashed with "Program terminated with signal 11", and the
pystack macro shows it's crashed at calling the built-in id()
function.

#0  visit_decref (op=0x20200a3e22726574, data=0x0) at ../Modules/
gcmodule.c:270
270     ../Modules/gcmodule.c: No such file or directory.
        in ../Modules/gcmodule.c

  Another is crashed with "Program terminated with signal 7", and the
pystack macro shows it's crashed at the exactly same operation (string
join) as the first one (python-xml), but in different library python-
simplejson:

#0  string_join (self=0x7f5149877030, orig=<value optimized out>)
at ../Objects/stringobject.c:1795
1795    ../Objects/stringobject.c: No such file or directory.
        in ../Objects/stringobject.c

(gdb) pystack
/var/lib/python-support/python2.5/simplejson/encoder.py (367): encode
/var/lib/python-support/python2.5/simplejson/__init__.py (243): dumps

  I'm not good at using gdb & C programming, then I tried some other
ways to dig further:
   * get the source code of python2.5, but can not figure out the
crash reason :(
   * since Debian distribution provides python-dbg package, and I
tried to use python2.5-dbg interpreter, but not the python2.5, so that
I can get more debug information in the core dump file. Unfortunately,
my python application is using a bunch of C modules, and not all of
them provides -dbg package in Debian/Lenny. So it still doesn't make
any progress yet.

  I will be really appreciated if somebody can help me about how to
debug the python crashes.

  Thanks in advance!

BR
Jacky Wang
 
H

Helmut Jarausch

could anyone help me?

For me, it sounds like a hardware problem. Have run memory tests like
memtest86+ and/or memtester?

You could try a more recent version of Python like 2.6.5 and see if
you get the same sort of errors.

Try to run your application on a different machine if possible, to
exclude hardware errors.

Have you rebuild all Python packages you're using and which use an
extension in C / C++ ? after upgrading?

I hope this helps a bit,
Helmut.



--
Helmut Jarausch

Lehrstuhl fuer Numerische Mathematik
RWTH - Aachen University
D 52056 Aachen, Germany
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,768
Messages
2,569,575
Members
45,053
Latest member
billing-software

Latest Threads

Top