Segmentation faults using threads

Discussion in 'Python' started by Mathias, Feb 13, 2007.

  1. Mathias

    Mathias Guest

    Dear ng,

    I use the thread module (not threading) for a client/server app where I
    distribute large amounts of pickled data over ssh tunnels.
    Now I get regular Segmentation Faults during high load episodes. I use a
    semaphore to have pickle/unpickle run nonthreaded, but I still get
    frequent nondeterministic segmentation faults.
    Since there is no traceback after a sf, I have no clue what exactly
    happened, and debugging a multithreaded app is no fun anyway :(

    Can someone recommend me how to get extra info during such a crash?
    Or any other opinion on where the problem might lie?

    Thanks a lot,
    Mathias
    Mathias, Feb 13, 2007
    #1
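
The "semaphore around pickle/unpickle" idea from the post above might look roughly like this. It is only a minimal sketch, assuming Python 3's `threading` and `pickle` modules rather than the `thread` module and Python 2.4 used in the thread; `pickle_lock`, `safe_dumps`, `safe_loads` and `round_trip` are hypothetical names.

```python
import pickle
import threading

# One lock shared by all threads: only one thread pickles or unpickles at a time.
pickle_lock = threading.Lock()

def safe_dumps(obj):
    """Serialize obj while holding the shared lock."""
    with pickle_lock:
        return pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)

def safe_loads(data):
    """Deserialize data while holding the shared lock."""
    with pickle_lock:
        return pickle.loads(data)

def round_trip(index, results):
    """Pickle and unpickle a small payload and record whether it survived."""
    payload = {"worker": index, "values": list(range(index))}
    results[index] = safe_loads(safe_dumps(payload)) == payload

results = {}
threads = [threading.Thread(target=round_trip, args=(i, results)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(all(results.values()))
```

A lock like this only serializes the guarded calls; it does nothing for code that runs outside them.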

  2. > I use the thread module (not threading) for a client/server app where I
    > distribute large amounts of pickled data over ssh tunnels.
    > Now I get regular Segmentation Faults during high load episodes. I use a
    > semaphore to have pickle/unpickle run nonthreaded, but I still get
    > frequent nondeterministic segmentation faults.
    > Since there is no traceback after a sf, I have no clue what exactly
    > happened, and debugging a multithreaded app is no fun anyway :(
    >
    > Can someone recommend me how to get extra info during such a crash?
    > Or any other opinion on where the problem might lie?



    Hi, it would be helpful if you posted a minimalistic code snippet
    which showed the problem you describe.

    Daniel
    Daniel Nogradi, Feb 13, 2007
    #2

  3. Mathias

    Mathias Guest

    > Hi, it would be helpful if you posted a minimalistic code snippet
    > which showed the problem you describe.
    >
    > Daniel


    I wish I could! If I knew exactly where the effect takes place I could
    probably circumvent it. All I know now is that it happens under high
    load and with a lot of wait states I can reduce the probability of
    crashing. So there must be some race condition somewhere I think.

    Is there a way to analyze where the crash took place? I guess I can have
    a core dumped and somehow analyze it, but that's probably very hard to do.

    Would a profiler work which records the function call structure?

    Does someone have experience with threading in python - are there
    non-threadsafe functions I should know about?

    Thanks,
    Mathias
    Mathias, Feb 13, 2007
    #3
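
On the question of how to see where a crash took place: Python versions from 3.3 on ship a `faulthandler` module that writes the Python stack of every thread when the process receives a fatal signal such as SIGSEGV. It did not exist in the Python 2.4 used in this thread, so the following is only a sketch of how it could be set up today; `crash_log`, `dump_all_threads` and `worker` are hypothetical names.

```python
import faulthandler
import tempfile
import threading

# Keep this file open for the life of the process; faulthandler writes to its
# file descriptor directly if a fatal signal (e.g. SIGSEGV) arrives.
crash_log = tempfile.NamedTemporaryFile(mode="w", suffix=".log", delete=False)
faulthandler.enable(file=crash_log, all_threads=True)

def dump_all_threads():
    """Return the current Python stack of every thread as text (no crash needed)."""
    with tempfile.TemporaryFile(mode="w+") as out:
        faulthandler.dump_traceback(file=out, all_threads=True)
        out.seek(0)
        return out.read()

def worker(started, release):
    """Stand-in for a real worker thread: signal readiness, then block."""
    started.set()
    release.wait()

started, release = threading.Event(), threading.Event()
thread = threading.Thread(target=worker, args=(started, release))
thread.start()
started.wait()

report = dump_all_threads()  # same format faulthandler would emit on a crash
release.set()
thread.join()
print(report)
```

The report shows only Python frames. If the fault is inside a C extension, the last Python frame listed merely points at the call that entered it.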
  4. John Nagle

    John Nagle Guest

    Daniel Nogradi wrote:
    >> I use the thread module (not threading) for a client/server app where I
    >> distribute large amounts of pickled data over ssh tunnels.


    What module are you using for SSH?

    What's in your program that isn't pure Python?
    The problem is probably in some non-Python component; you shouldn't
    be able to force a memory protection error from within Python code.

    Also note that the "marshal" module may be unsafe.

    John Nagle
    John Nagle, Feb 13, 2007
    #4
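
For readers unsure how the `marshal` module mentioned above relates to `pickle`: they are two separate standard-library modules with different byte formats, and the documentation of both warns against loading data from untrusted sources. A small sketch, assuming Python 3 (the `record` value is made up for illustration):

```python
import marshal
import pickle

record = {"name": "chunk-1", "values": [1, 2.5, "three"]}

# Serialize the same object with each module.
marshal_bytes = marshal.dumps(record)
pickle_bytes = pickle.dumps(record)

# Each module is used to read back its own output.
from_marshal = marshal.loads(marshal_bytes)
from_pickle = pickle.loads(pickle_bytes)

print(from_marshal == record, from_pickle == record)
print(marshal_bytes != pickle_bytes)  # the two byte formats differ
```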
  5. Mathias

    Mathias Guest

    John Nagle wrote:
    > Daniel Nogradi wrote:
    >>> I use the thread module (not threading) for a client/server app where I
    >>> distribute large amounts of pickled data over ssh tunnels.

    >
    > What module are you using for SSH?
    >
    > What's in your program that isn't pure Python?
    > The problem is probably in some non-Python component; you shouldn't
    > be able to force a memory protection error from within Python code.
    >
    > Also note that the "marshal" module may be unsafe.
    >
    > John Nagle



    I'm using os.popen2() to pipe into an ssh session via stdin/stdout.
    That's probably not the elegant way...
    Other modules: scipy 0.3.2 (with Numeric 24.2) and python 2.4

    Does pickle/cPickle count as part of the marshal module?

    Mathias
    Mathias, Feb 13, 2007
    #5
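
The pipe-based transport described above can be sketched with `subprocess.Popen`, which is the modern replacement for `os.popen2()`. Everything here is an assumption for illustration: a local child process that copies stdin to stdout stands in for the ssh session, and the 8-byte length prefix is one possible framing, not necessarily what the original program used. `send_obj`, `recv_obj` and `read_exact` are hypothetical helpers.

```python
import pickle
import struct
import subprocess
import sys

# Stand-in for the remote end: a child that copies its stdin to its stdout.
# In the setup described above, this command would be an ssh invocation.
ECHO_CHILD = [
    sys.executable, "-c",
    "import sys, shutil; shutil.copyfileobj(sys.stdin.buffer, sys.stdout.buffer)",
]

def send_obj(stream, obj):
    """Write one pickled object, prefixed with its length as 8 bytes."""
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    stream.write(struct.pack("!Q", len(payload)))
    stream.write(payload)
    stream.flush()

def read_exact(stream, n):
    """Read exactly n bytes or raise EOFError if the pipe closes early."""
    chunks = []
    while n:
        chunk = stream.read(n)
        if not chunk:
            raise EOFError("pipe closed mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)

def recv_obj(stream):
    """Read one length-prefixed pickled object."""
    (size,) = struct.unpack("!Q", read_exact(stream, 8))
    return pickle.loads(read_exact(stream, size))

proc = subprocess.Popen(ECHO_CHILD, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
send_obj(proc.stdin, {"job": 1, "data": list(range(5))})
proc.stdin.close()            # signal end of input to the child
result = recv_obj(proc.stdout)
proc.stdout.close()
proc.wait()
print(result)
```

For large payloads, writing everything before reading anything can block once the pipe buffers fill, so a real program would read and write concurrently.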
  6. Mathias

    Mathias Guest

    PS: setting sys.setcheckinterval(1) reduces the probability of a failure
    as well, but definitely at a performance cost.
    Mathias, Feb 13, 2007
    #6
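
`sys.setcheckinterval()` was deprecated in Python 3.2 and later removed. The nearest current counterpart is `sys.setswitchinterval()`, which takes a duration in seconds instead of a bytecode count. The following sketch assumes Python 3; the `switch_interval` context manager is a hypothetical helper, and whether a shorter interval affects a crash like this one is not something the sketch can show.

```python
import sys
from contextlib import contextmanager

@contextmanager
def switch_interval(seconds):
    """Temporarily change the interpreter's thread switch interval."""
    previous = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    try:
        yield
    finally:
        sys.setswitchinterval(previous)  # always restore the old value

default_interval = sys.getswitchinterval()
with switch_interval(0.0005):
    inside = sys.getswitchinterval()     # shorter interval: more frequent switches
restored = sys.getswitchinterval()
print(default_interval, inside, restored)
```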
  7. "Mathias" <> wrote:


    > Does someone have experience with threading in python - are there
    > non-threadsafe functions I should know about?


    how do your threads communicate with one another - are there any
    globals that are accessed from different threads?

    strange, this - you should get an exception, not a segmentation fault,
    if it's in the python bits..

    - Hendrik
    Hendrik van Rooyen, Feb 14, 2007
    #7
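
One common way to avoid globals shared between threads, as asked about above, is to pass all data through `queue.Queue` objects, which do their own locking. This is a generic sketch assuming Python 3, not a reconstruction of the original program; `worker`, `tasks` and `results` are hypothetical names.

```python
import queue
import threading

def worker(tasks, results):
    """Take numbers from tasks until a None sentinel arrives; put squares on results."""
    while True:
        item = tasks.get()
        if item is None:          # sentinel: no more work for this thread
            break
        results.put(item * item)

tasks, results = queue.Queue(), queue.Queue()
workers = [threading.Thread(target=worker, args=(tasks, results)) for _ in range(4)]
for w in workers:
    w.start()

for n in range(10):
    tasks.put(n)
for _ in workers:                 # one sentinel per worker thread
    tasks.put(None)
for w in workers:
    w.join()

squares = sorted(results.get() for _ in range(10))
print(squares)
```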
  8. Mathias

    Mathias Guest

    >
    > What module are you using for SSH?
    >
    > What's in your program that isn't pure Python?
    > The problem is probably in some non-Python component; you shouldn't
    > be able to force a memory protection error from within Python code.
    >


    It looks like the error could be in scipy/Numeric, when a large array's
    type is changed, like this:

    >>> from scipy import *
    >>> a=zeros(100000000,'b') #100 MiB
    >>> b=a.copy().astype('d') #800 MiB, ok
    >>> a=zeros(1000000000,'b') #1GiB
    >>> b=a.copy().astype('d') #8GiB, fails with sf

    Segmentation fault

    if I use zeros directly for allocation of the doubles it works as expected:

    >>> from scipy import *
    >>> a=zeros(1000000000,'d') #8GiB, fails with python exception

    Traceback (most recent call last):
    File "<stdin>", line 1, in ?
    MemoryError: can't allocate memory for array
    >>>


    I use python 2.4, but my scipy and Numeric aren't quite up-to-date:
    scipy version 0.3.2, Numeric v 24.2
    Mathias, Feb 14, 2007
    #8
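
The memory arithmetic behind the example above can be checked without allocating anything. This sketch uses the standard-library `array` module only for its item sizes, on the assumption that its `'b'` and `'d'` typecodes occupy 1 and 8 bytes per element like the Numeric typecodes in the post; `bytes_needed` is a hypothetical helper.

```python
from array import array

def bytes_needed(n_elements, typecode):
    """Return the raw data size in bytes of n_elements items of the given typecode."""
    return n_elements * array(typecode).itemsize

n = 10**9
as_bytes = bytes_needed(n, "b")     # 1 byte per element
as_doubles = bytes_needed(n, "d")   # 8 bytes per element (C double)

print(as_bytes, as_doubles)
print(round(as_doubles / 2**30, 2), "GiB for the converted array")
```

Note that during a conversion the source and the result exist at the same time, so the peak requirement is at least the sum of the two sizes.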
  9. John Nagle

    John Nagle Guest

    Mathias wrote:
    >>
    >> What module are you using for SSH?
    >>
    >> What's in your program that isn't pure Python?
    >> The problem is probably in some non-Python component; you shouldn't
    >> be able to force a memory protection error from within Python code.
    >>

    >
    > It looks like the error could be in scipy/Numeric, when a large array's
    > type is changed, like this:
    >
    > >>> from scipy import *
    > >>> a=zeros(100000000,'b') #100 MiB
    > >>> b=a.copy().astype('d') #800 MiB, ok
    > >>> a=zeros(1000000000,'b') #1GiB
    > >>> b=a.copy().astype('d') #8GiB, fails with sf

    > Segmentation fault
    >
    > if I use zeros directly for allocation of the doubles it works as expected:
    >
    > >>> from scipy import *
    > >>> a=zeros(1000000000,'d') #8GiB, fails with python exception

    > Traceback (most recent call last):
    > File "<stdin>", line 1, in ?
    > MemoryError: can't allocate memory for array
    > >>>

    >
    > I use python 2.4, but my scipy and Numeric aren't quite up-to-date:
    > scipy version 0.3.2, Numeric v 24.2


    That sounds like the case where the array has to be reallocated from
    4-byte floats to 8-byte doubles is being botched.

    Take a look at
    "http://www.mail-archive.com//msg02033.html"

    and then at array_cast in arraymethods.c of scipy. There may be a
    reference count bug in that C code. I'm not familiar enough with
    Python reference count internals to be sure, though.

    John Nagle
    John Nagle, Feb 14, 2007
    #9