Segmentation faults using threads

Mathias

Dear ng,

I use the thread module (not threading) for a client/server app where I
distribute large amounts of pickled data over ssh tunnels.
Now I get regular Segmentation Faults during high load episodes. I use a
semaphore to have pickle/unpickle run nonthreaded, but I still get
frequent nondeterministic segmentation faults.
Since there is no traceback after a sf, I have no clue what exactly
happened, and debugging a multithreaded app is no fun anyway :(

Can someone recommend how to get extra info during such a crash?
Or any other opinion on where the problem might lie?

Thanks a lot,
Mathias
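
The serialization described in this post (a semaphore so that only one thread pickles or unpickles at a time) might look roughly like the sketch below. It is an illustration only: it assumes a current Python 3 with the `threading` and `pickle` modules rather than the `thread` module used in the post, and the helper names `locked_dumps` and `locked_loads` are hypothetical.

```python
import pickle
import threading

# One process-wide lock so that only a single thread is inside pickle at a time.
_pickle_lock = threading.Lock()


def locked_dumps(obj):
    """Serialize obj while holding the lock (hypothetical helper)."""
    with _pickle_lock:
        return pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)


def locked_loads(data):
    """Deserialize data while holding the lock (hypothetical helper)."""
    with _pickle_lock:
        return pickle.loads(data)
```

Note that a lock like this only serializes the pickle calls themselves. It does not protect calls into other C extensions made elsewhere in the program.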
 
Daniel Nogradi

Mathias said:
I use the thread module (not threading) for a client/server app where I
distribute large amounts of pickled data over ssh tunnels.
Now I get regular Segmentation Faults during high load episodes. I use a
semaphore to have pickle/unpickle run nonthreaded, but I still get
frequent nondeterministic segmentation faults.
Since there is no traceback after a sf, I have no clue what exactly
happened, and debugging a multithreaded app is no fun anyway :(

Can someone recommend how to get extra info during such a crash?
Or any other opinion on where the problem might lie?


Hi, it would be helpful if you posted a minimalistic code snippet
which showed the problem you describe.

Daniel
 
Mathias

Daniel said:
Hi, it would be helpful if you posted a minimalistic code snippet
which showed the problem you describe.

Daniel

I wish I could! If I knew exactly where the effect takes place I could
probably circumvent it. All I know is that it happens under high
load, and that with a lot of wait states I can reduce the probability of
crashing. So I think there must be a race condition somewhere.

Is there a way to analyze where the crash took place? I guess I can have
a core dumped and somehow analyze it, but that's probably very hard to do.

Would a profiler work which records the function call structure?

Does someone have experience with threading in python - are there
non-threadsafe functions I should know about?

Thanks,
Mathias
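
On the question of getting extra information when the interpreter crashes: on newer Python versions (3.3 and later, so not the 2.4 used in this thread) the standard `faulthandler` module can write the Python-level stack of every thread when a fatal signal such as SIGSEGV arrives. The sketch below is a minimal illustration under that assumption; the function names and the log path are hypothetical.

```python
import faulthandler


def enable_crash_tracebacks(log_path):
    """Open log_path and ask faulthandler to write all thread stacks there
    on a fatal signal (SIGSEGV, SIGFPE, SIGABRT, SIGBUS, SIGILL).

    The returned file object must stay open for as long as the handler
    is enabled, because faulthandler writes to its file descriptor.
    """
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


def dump_thread_stacks(log_file):
    """Write the current stack of every thread on demand, without crashing."""
    faulthandler.dump_traceback(file=log_file, all_threads=True)
```

The dump shows only Python frames. To see where the crash happened inside a C extension, a core dump opened in a debugger such as gdb is still needed.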
 
John Nagle

What module are you using for SSH?

What's in your program that isn't pure Python?
The problem is probably in some non-Python component; you shouldn't
be able to force a memory protection error from within Python code.

Also note that the "marshal" module may be unsafe.

John Nagle
 
Mathias

John said:
What module are you using for SSH?

What's in your program that isn't pure Python?
The problem is probably in some non-Python component; you shouldn't
be able to force a memory protection error from within Python code.

Also note that the "marshal" module may be unsafe.

John Nagle


I'm using os.popen2() to pipe into an ssh session via stdin/stdout.
That's probably not the elegant way...
Other modules: scipy 0.3.2 (with Numeric 24.2) and python 2.4

Does pickle/cPickle count as part of the marshal module?

Mathias
 
Mathias

PS: setting sys.setcheckinterval(1) reduces the probability of a failure
as well, but definitely at a performance cost.
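
For readers on current Python versions: `sys.setcheckinterval()` was deprecated in Python 3.2 and removed in 3.9. The closest equivalent is `sys.setswitchinterval()`, which takes a time in seconds instead of a bytecode count. The sketch below shows that call; the wrapper name `set_switch_interval` is hypothetical, and whether a shorter interval would mask the crash in the same way is only an assumption.

```python
import sys


def set_switch_interval(seconds):
    """Set the thread switch interval and return the previous value,
    so the caller can restore it afterwards (hypothetical helper)."""
    previous = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    return previous
```

As with the check interval, changing this only alters how often threads are switched. It can hide or expose a race, but it does not fix one.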
 
Hendrik van Rooyen

Mathias said:
Does someone have experience with threading in python - are there
non-threadsafe functions I should know about?

how do your threads communicate with one another - are there any
globals that are accessed from different threads?

strange this - you should get an exception, not a segmentation fault, if
it's in the python bits..

- Hendrik
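
One common way to avoid globals shared between threads is to pass all data through thread-safe queues. The sketch below illustrates that pattern with the standard `queue` and `threading` modules of Python 3. It is not the poster's code: the `worker` and `run_pipeline` names and the doubling step are placeholders for the real work.

```python
import queue
import threading


def worker(inbox, outbox):
    """Take items from inbox until a None sentinel arrives; put results on outbox."""
    while True:
        item = inbox.get()
        if item is None:
            break
        outbox.put(item * 2)  # placeholder for the real work


def run_pipeline(values, n_workers=4):
    """Process a list of values with worker threads that share no globals."""
    inbox, outbox = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(inbox, outbox))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    for v in values:
        inbox.put(v)
    for _ in threads:
        inbox.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    # Completion order is nondeterministic, so sort before returning.
    return sorted(outbox.get() for _ in range(len(values)))
```

Because each item is owned by exactly one thread at a time, no explicit locking is needed around the data itself.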
 
Mathias

John said:
What module are you using for SSH?

What's in your program that isn't pure Python?
The problem is probably in some non-Python component; you shouldn't
be able to force a memory protection error from within Python code.

It looks like the error could be in scipy/Numeric, when a large array's
type is changed, like this:
Segmentation fault

if I use zeros directly for allocation of the doubles it works as expected:
Traceback (most recent call last):

I use python 2.4, but my scipy and Numeric aren't quite up-to-date:
scipy version 0.3.2, Numeric v 24.2
 
John Nagle

Mathias said:
It looks like the error could be in scipy/Numeric, when a large array's
type is changed, like this:

Segmentation fault

if I use zeros directly for allocation of the doubles it works as expected:

Traceback (most recent call last):


I use python 2.4, but my scipy and Numeric aren't quite up-to-date:
scipy version 0.3.2, Numeric v 24.2

That sounds like the reallocation of the array from 4-byte floats to
8-byte doubles is being botched.

Take a look at
"http://www.mail-archive.com/[email protected]/msg02033.html"

and then at array_cast in arraymethods.c of scipy. There may be a
reference count bug in that C code. I'm not familiar enough with
Python reference count internals to be sure, though.

John Nagle
 
