segfault in extension module

  • Thread starter Nathaniel Echols

Nathaniel Echols

I've written a function in C to perform protein sequence alignment. This
works fine in a standalone C program. I've added the necessary packaging
to use it in Python; it returns three strings and an integer. However, as
soon as the function is complete, I get a segfault and the interpreter
dies.

If I run Python interactively, just calling the function causes a
segfault. If I'm running a script, I can actually print out the return
values (which are what I'd expect - so something's working) but as soon as
the script is done I get the segfault again. I can even call the function
twice, with different arguments - and it works both times. So it appears
that the problem is with tying up loose ends.

How do I determine what is going wrong? I do not get any problem like
this in the C version. I am not using free() anywhere - I will eventually
need to fix this, but I cannot find any place where I might be accessing
unavailable memory. (Adding in free() does not make any difference in the
module, for what it's worth, but I've had some issues with the C program
so I've left it out.)

[I've also tried using PyMem_Malloc instead, throughout the C code.
Doesn't help.]

thanks,
Nat
(please reply directly!)
 

Martin v. Löwis

Nathaniel Echols said:
> How do I determine what is going wrong?

I recommend running your code (or the interactive python) in a debugger.
For example, with gdb, you'd get

gdb /usr/bin/python
(gdb) run
[program crashes]
(gdb) bt

The latter command will give a backtrace, which should tell you where
it crashes. If you don't get enough detail, make sure you compile your
module with debugging information. If you are on Windows, then make
sure that Python is compiled for debugging as well.

Regards,
Martin
 

Miki Tebeka

Hello Nat,
> I've written a function in C to perform protein sequence alignment. This
> works fine in a standalone C program. I've added the necessary packaging
> to use it in Python;

Which one? Are you using the C API?

> it returns three strings and an integer. However, as
> soon as the function is complete, I get a segfault and the interpreter
> dies.

Looks like refcount problems, check out
http://www.python.org/doc/current/ext/refcounts.html .

IMO just avoid all this stuff and use SWIG/Boost.Python/Pyrex.

HTH.
Miki
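
A toy model may make the reference-counting rules behind this suggestion easier to follow. The names `ToyStr`, `toy_new`, `toy_incref` and `toy_decref` are invented for illustration, and this is not how CPython implements its objects; it is only a sketch of the ownership idea, assuming one counter per object and a free when the counter reaches zero.

```c
#include <stdlib.h>
#include <string.h>

/* Toy model only: NOT CPython's PyObject, just the same ownership idea. */
typedef struct {
    int refcnt;   /* number of owners currently holding this object */
    char *text;   /* payload, owned by the object */
} ToyStr;

int toy_live_objects = 0;   /* objects created and not yet freed */

/* Returns a NEW reference: the caller owns it and must toy_decref() it. */
ToyStr *toy_new(const char *s) {
    ToyStr *o = malloc(sizeof *o);
    if (o == NULL) return NULL;
    o->text = malloc(strlen(s) + 1);   /* +1 for the terminating '\0' */
    if (o->text == NULL) { free(o); return NULL; }
    strcpy(o->text, s);
    o->refcnt = 1;
    toy_live_objects++;
    return o;
}

/* Take one additional ownership share. */
void toy_incref(ToyStr *o) { o->refcnt++; }

/* Give up one share; the last owner to leave frees the object. */
void toy_decref(ToyStr *o) {
    if (--o->refcnt == 0) {
        free(o->text);
        free(o);
        toy_live_objects--;
    }
}
```

In this model, one `toy_decref` too many frees memory that another owner still uses, and the crash tends to show up later, inside some unrelated `free()`. One `toy_incref` too many only leaks.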
 

Nathaniel Echols

>> I've written a function in C to perform protein sequence alignment. This
> Which one? Are you using the C API?

Yup - I'm reading right out of the manual.

> Looks like refcount problems, check out
> http://www.python.org/doc/current/ext/refcounts.html .

I read this before and couldn't figure out what it meant. This does seem
like it would relate, but I can't figure out what I'm doing incorrectly.
I just have one function which calls a pure C function and returns a
tuple of strings from it. I'm guessing I need to add a Py_INCREF()
somewhere but so far this just makes it segfault sooner. (I'm
not sure what argument to use for Py_INCREF(), either.)

I've looked at several other pages, and they all seem to involve setups
more complicated than what I'm doing. I'm already using Py_BuildValue()
to generate the returned tuple, and my understanding is that this should
avoid major problems. . .
> IMO just avoid all this stuff and use SWIG/Boost.Python/Pyrex.

I'll look at these, but I only have a tiny little bit of code I need to do
this with - I coded it from scratch with the intention of using it this
way, and could have written it in Python if I didn't care about speed.
Would I really benefit from using one of the other methods? The goal here
is explicitly to put only the very time-dependent code in C; everything
else stays in Python.

thanks,
Nat
 

Nathaniel Echols

> The latter command will give a backtrace, which should tell you where it
> crashes. If you don't get enough detail, make sure you compile your
> module with debugging information. If you are on Windows, then make sure
> that Python is compiled for debugging as well. Regards, Martin

Okay:

#0 0x420744fe in _int_free () from /lib/tls/libc.so.6
#1 0x420734d6 in free () from /lib/tls/libc.so.6
#2 0x0809dc0d in _PyObject_GC_Del ()
#3 0x080ce86f in PyDict_Next ()
#4 0x080d191d in _PyModule_Clear ()

I guess this makes sense, but I'm still not sure how to fix it. . .
 

John J. Lee

Nathaniel Echols said:
>> IMO just avoid all this stuff and use SWIG/Boost.Python/Pyrex.
>
> I'll look at these, but I only have a tiny little bit of code I need to do [...]
> Would I really benefit from using one of the other methods?

Why not take advantage of it? You've already discovered how
hand-writing extensions can be painful. SWIG is probably best for
you. Ignore all the fancy SWIG features, just ask it to wrap the
function, it's very easy.

> The goal here
> is explicitly to put only the very time-dependent code in C; everything
> else stays in Python.

SWIG should be able to do that. There's probably some overhead above
a hand-written wrapper, but it's so trivial to use SWIG in this simple
way that you probably shouldn't begin to worry about that -- it's
unlikely to be a problem.


John
 

Michael Hudson

Nathaniel Echols said:
> I read this before and couldn't figure out what it meant.

Then I am pretty sure this is your problem :)

> This does seem like it would relate, but I can't figure out what I'm
> doing incorrectly. I just have one function which calls a pure C
> function and returns a tuple of strings from it. I'm guessing I
> need to add a Py_INCREF() somewhere but so far this just makes it
> segfault sooner. (I'm not sure what argument to use for
> Py_INCREF(), either.)
>
> I've looked at several other pages, and they all seem to involve setups
> more complicated than what I'm doing. I'm already using Py_BuildValue()
> to generate the returned tuple, and my understanding is that this should
> avoid major problems. . .

Post some code.

Cheers,
mwh
 

Martin v. Löwis

Nathaniel Echols said:
> Okay:
>
> #0 0x420744fe in _int_free () from /lib/tls/libc.so.6
> #1 0x420734d6 in free () from /lib/tls/libc.so.6
> #2 0x0809dc0d in _PyObject_GC_Del ()
> #3 0x080ce86f in PyDict_Next ()
> #4 0x080d191d in _PyModule_Clear ()
>
> I guess this makes sense, but I'm still not sure how to fix it. . .

Ah, tls/libc.so.6. I think you lose, being confronted with a buggy C
library. Try making it not use /lib/tls.

If that does not change the behaviour, you probably have a
ref-counting bug somewhere. Try building a debugging version of
Python.

Regards,
Martin
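
A backtrace that ends in `free()` during shutdown often means the heap was damaged earlier, for example by a write past the end of a buffer, whatever the eventual cause turns out to be here. The thread does not show the body of `nw()`, so the following is only a sketch under the assumption that it allocates its buffers with `malloc`: pad each allocation with marker bytes and check them afterwards. `guarded_alloc`, `guard_intact`, `GUARD_LEN` and `GUARD_BYTE` are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

#define GUARD_LEN  8
#define GUARD_BYTE 0xAB

/* Allocate `size` usable bytes followed by GUARD_LEN marker bytes. */
char *guarded_alloc(size_t size) {
    char *p = malloc(size + GUARD_LEN);
    if (p == NULL) return NULL;
    memset(p + size, GUARD_BYTE, GUARD_LEN);
    return p;
}

/* Return 1 if the marker bytes after the first `size` bytes are untouched,
   0 if something wrote past the end of the usable area. */
int guard_intact(const char *p, size_t size) {
    size_t i;
    for (i = 0; i < GUARD_LEN; i++) {
        if ((unsigned char)p[size + i] != GUARD_BYTE) return 0;
    }
    return 1;
}
```

Calling `guard_intact` on each buffer right after the C routine returns would catch an off-by-one write, such as a forgotten byte for the terminating `'\0'`, at the point where it happens instead of at interpreter exit. A memory checker such as Valgrind does the same job more thoroughly.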
 

Nat Echols

> Post some code.

may god have mercy on my soul:

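#include <Python.h>
#include "nw.h"

/* Python-callable wrapper around the C routine nw(). */
static PyObject *nw_align (PyObject *self, PyObject *args) {
    char *seq1, *seq2, *mfile;
    char *out1, *out2, *match;
    int penalty, status, score;
    PyObject *results;

    /* "sssi": three C strings and an int. The string pointers are borrowed
       from the argument objects and must not be freed or modified. */
    if (! PyArg_ParseTuple(args, "sssi", &seq1, &seq2, &mfile, &penalty)) {
        return NULL;
    }

    status = nw(seq1, seq2, mfile, penalty, &out1, &out2, &match, &score);

    if (status == -1) {
        return PyErr_NoMemory();   /* sets MemoryError and returns NULL */
    }

    /* Py_BuildValue copies each "s" string into a new Python string and
       returns a new reference, so no Py_INCREF is needed here. out1, out2
       and match still belong to this function and are never released
       (a leak, but not by itself a cause of the crash). */
    results = Py_BuildValue("(sssi)", out1, out2, match, score);
    return results;
}

static PyMethodDef nwMethods[] = {
    {"align", nw_align, METH_VARARGS,
     "Perform Needleman-Wunsch alignment of two protein sequences."},
    {NULL, NULL, 0, NULL}   /* This is required! It is the sentinel that
                               marks the end of the table. */
};

void initnw (void) {
    (void) Py_InitModule("nw", nwMethods);
}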

This is just the wrapper; the function that it calls is declared like this:

int nw (char *seq1, char *seq2, char *matrixfile, int penalty,
        char **_out1, char **_out2, char **_match, int *_score);

(I can supply this too, but I already know this works fine in a standalone
C program.)

In Python, I simply do this:
(out1, out2, match, score) = nw.align(seq1, seq2, "BLOSUM62", 0)

This is it; should be simple to fix, no?
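
Since the body of `nw()` is not shown, here is a hedged sketch of how a function with this kind of out-parameter interface might allocate its result strings safely. `align_stub` is a hypothetical stand-in, not the poster's code: it only copies its inputs, and its "score" is a placeholder. The thing to compare against the real routine is the `+ 1` for the terminating `'\0'` on every buffer.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for nw(): hands back freshly malloc'ed strings
   through out-parameters. Returns 0 on success, -1 if memory runs out. */
int align_stub(const char *seq1, const char *seq2,
               char **out1, char **out2, int *score) {
    size_t n1 = strlen(seq1), n2 = strlen(seq2);

    *out1 = malloc(n1 + 1);            /* +1 for the terminating '\0' */
    *out2 = malloc(n2 + 1);
    if (*out1 == NULL || *out2 == NULL) {
        free(*out1);                   /* free(NULL) is a no-op */
        free(*out2);
        *out1 = *out2 = NULL;
        return -1;
    }
    memcpy(*out1, seq1, n1 + 1);       /* copies the '\0' as well */
    memcpy(*out2, seq2, n2 + 1);
    *score = (int)(n1 < n2 ? n1 : n2); /* placeholder, not a real score */
    return 0;
}
```

The caller owns both buffers and releases them with `free()` when done. In a wrapper like the one above, that point comes after `Py_BuildValue` has returned, because the `"s"` format copies each C string into a new Python string.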
 

Michael Hudson

Nat Echols said:
> may god have mercy on my soul:

Well, I can't see anything flagrantly wrong with that.

Does a debugger provide any hints?

Cheers,
mwh
 

Nat Echols

> Well, I can't see anything flagrantly wrong with that.
> Does a debugger provide any hints?

Someone else suggested this, and this is what the backtrace from gdb
indicated:

#0 0x420744fe in _int_free () from /lib/tls/libc.so.6
#1 0x420734d6 in free () from /lib/tls/libc.so.6
#2 0x0809dc0d in _PyObject_GC_Del ()
#3 0x080ce86f in PyDict_Next ()
#4 0x080d191d in _PyModule_Clear ()
(It goes on like this for a while. . .)

As far as the tls C library, which someone told me was buggy - I don't
know how to force use of a different library; Python appears to be linked
to the same one. At any rate, I seem to have the same problem on two
different machines, one running SuSE 8.1, one running RedHat 9.0.

thanks,
Nat
 

Michael Hudson

Nat Echols said:
> Someone else suggested this, and this is what the backtrace from gdb
> indicated:
>
> #0 0x420744fe in _int_free () from /lib/tls/libc.so.6
> #1 0x420734d6 in free () from /lib/tls/libc.so.6
> #2 0x0809dc0d in _PyObject_GC_Del ()
> #3 0x080ce86f in PyDict_Next ()
> #4 0x080d191d in _PyModule_Clear ()
                   ^^^^^^^^^^^^^^^
*That's* very odd.

> (It goes on like this for a while. . .)

Can we see a little more?

Cheers,
mwh
 

Nat Echols

> *That's* very odd.
> Can we see a little more?

Yup:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1074141248 (LWP 5500)]
0x420744b0 in _int_free () from /lib/tls/libc.so.6
(gdb) bt
#0 0x420744b0 in _int_free () from /lib/tls/libc.so.6
#1 0x420734d6 in free () from /lib/tls/libc.so.6
#2 0x08055a1f in _PyObject_Del ()
#3 0x080593c5 in PyString_AsEncodedString ()
#4 0x0807db19 in _PyEval_SliceIndex ()
#5 0x080c41c9 in PyFunction_SetClosure ()
#6 0x080cdf49 in PyDict_New ()
#7 0x080ce35e in PyDict_SetItem ()
#8 0x080d1799 in _PyModule_Clear ()
#9 0x0808e81a in PyImport_Cleanup ()
#10 0x08096114 in Py_Finalize ()
#11 0x080539bf in Py_Main ()
#12 0x08053469 in main ()
#13 0x42015574 in __libc_start_main () from /lib/tls/libc.so.6

Now, if I run it interactively instead (still within gdb), I get this
backtrace when it segfaults after I try to exit Python:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1074141248 (LWP 5502)]
0x420744b0 in _int_free () from /lib/tls/libc.so.6
(gdb) bt
#0 0x420744b0 in _int_free () from /lib/tls/libc.so.6
#1 0x420734d6 in free () from /lib/tls/libc.so.6
#2 0x08055a1f in _PyObject_Del ()
#3 0x080593c5 in PyString_AsEncodedString ()
#4 0x080ce4d9 in PyDict_DelItem ()
#5 0x0805f1f2 in PyString_Fini ()
#6 0x08096153 in Py_Finalize ()
#7 0x080539bf in Py_Main ()
#8 0x08053469 in main ()
#9 0x42015574 in __libc_start_main () from /lib/tls/libc.so.6
 
