Python on Altix

Discussion in 'Python' started by Todd Miller, Oct 7, 2003.

  1. Todd Miller


    I'm trying to set up Python on a 64-bit Linux supercomputer which looks
    like this:

    Linux somewhere.in.au 2.4.20-sgi221r3 #1 SMP Tue Jul 22 15:32:18 PDT
    2003 ia64 unknown

    I configure Python like this:

    ./configure --prefix=$HOME --with-pydebug --without-threads

    But as soon as I try to run my software self-tests (for the numarray
    Numeric-like array package), I get this:

    python(11030): unaligned access to 0x60000fffffff61cc, ip=0x40000000002b9221

    Debug memory block at address p=0x6000000000718310:
    88 bytes originally requested
    The 4 pad bytes at p-4 are FORBIDDENBYTE, as expected.
    The 4 pad bytes at tail=0x6000000000718368 are not all
    FORBIDDENBYTE (0xfb):
    at tail+0: 0x02 *** OUCH
    at tail+1: 0x00 *** OUCH
    at tail+2: 0x00 *** OUCH
    at tail+3: 0x00 *** OUCH
    The block was made by call #0 to debug malloc/realloc.
    Data at p: 00 00 00 00 00 00 00 00 ... 01 00 00 00 00 00 00 00
    Fatal Python error: bad trailing pad byte
    Abort (core dumped)

    I've poked around with this for a few hours, but I'm not getting much
    out of GDB. Does anyone have any suggestions on how to get Python up
    and running on an Altix or how to solve this problem more generally?

    Please CC me if you respond.

    Thanks,
    Todd
     
    Todd Miller, Oct 7, 2003
    #1

  2. You wrote:

    Todd> I configure Python like this:

    Todd> ./configure --prefix=$HOME --with-pydebug --without-threads

    then you jump to:

    Todd> But as soon as I try to run my software self-tests (for the
    Todd> numarray Numeric-like array package), I get this:

    ...

    then later:

    Todd> I've poked around with this for a few hours, but I'm not getting
    Todd> much out of GDB. Does anyone have any suggestions on how to get
    Todd> Python up and running on an Altix or how to solve this problem
    Todd> more generally?

    Maybe I'm being too pedantic, but did Python's own test suite run
    successfully? You seem to be trying to run before you're walking.
    32->64-bit issues do pop up from time to time. It's possible that the
    Python developers have tackled this already but that the numarray folks
    haven't. If you've run "make test" successfully in your Python build
    directory, I would imagine you've figured out "how to get Python up and
    running on an Altix", and that the problem lives in the numarray code. If
    not, you probably need to ignore numarray for the time being while you debug
    your basic Python configuration.

    At first glance, it looks like whoever wrote to the block in question got
    carried away and scribbled off the end of the block. You might try gdb's
    "watch" command to keep an eye on that address. You'll probably want to be
    careful when you set the watchpoint (whittle your failing test case down as
    small as possible, set the watchpoint as close as possible to the error) to
    avoid slowing your code down to an unmanageable crawl.
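
    The "scribbled off the end" diagnosis can be illustrated with a toy model
    of guard bytes. This is only a sketch, not CPython's actual debug
    allocator: the names debug_alloc and damaged_tail_offsets are hypothetical,
    and the overrun shown (an eight-byte store of the value 2 just past an
    88-byte block) is an assumption chosen to reproduce the tail bytes in the
    report. The thread does not say what actually wrote them.

```python
FORBIDDENBYTE = 0xFB  # fill value named in the error report
PAD = 4               # guard bytes on each side, per the report


def debug_alloc(nbytes):
    """Return a buffer laid out as [PAD guards][nbytes of zeroed data][PAD guards]."""
    guard = bytes([FORBIDDENBYTE]) * PAD
    return bytearray(guard + bytes(nbytes) + guard)


def damaged_tail_offsets(buf, nbytes):
    """Return offsets in the trailing guard that no longer hold FORBIDDENBYTE."""
    tail = buf[PAD + nbytes:PAD + nbytes + PAD]
    return [i for i, b in enumerate(tail) if b != FORBIDDENBYTE]


# The addresses in the report agree with this layout: tail == p + 88.
assert 0x6000000000718310 + 88 == 0x6000000000718368

# Hypothetical overrun: code that treats the 88-byte block as holding twelve
# 8-byte slots (it holds eleven) stores the value 2 into slot index 11.
nbytes = 88
buf = debug_alloc(nbytes)
start = PAD + 11 * 8                      # first byte past the data area
value = (2).to_bytes(8, "little")
buf[start:start + 8] = value[:len(buf) - start]  # clip to the modeled region

print(damaged_tail_offsets(buf, nbytes))  # offsets of the overwritten guards
```

    A check like this only fires when the block is inspected later, which is
    why a watchpoint on the tail address is useful for finding the store
    itself.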

    Skip
     
    Skip Montanaro, Oct 7, 2003
    #2

  3. Todd Miller


    On Tue, 2003-10-07 at 11:10, Skip Montanaro wrote:
    >
    > You wrote:
    >
    > Todd> I configure Python like this:
    >
    > Todd> ./configure --prefix=$HOME --with-pydebug --without-threads
    >
    > then you jump to:
    >
    > Todd> But as soon as I try to run my software self-tests (for the
    > Todd> numarray Numeric-like array package), I get this:
    >
    > ...
    >
    > then later:
    >
    > Todd> I've poked around with this for a few hours, but I'm not getting
    > Todd> much out of GDB. Does anyone have any suggestions on how to get
    > Todd> Python up and running on an Altix or how to solve this problem
    > Todd> more generally?
    >
    > Maybe I'm being too pedantic, but did Python's own test suite run
    > successfully?


    Not completely, but it looked close. There were two failures which may
    even be "expected":

    2 tests failed:
    test_bsddb test_imp
    38 tests skipped:
    test_aepack test_al test_asynchat test_audioop test_bsddb185
    test_bsddb3 test_cd test_cl test_curses test_dl test_email_codecs
    test_fork1 test_gl test_imageop test_imgfile test_linuxaudiodev
    test_logging test_macfs test_macostools test_normalization
    test_ossaudiodev test_pep277 test_plistlib test_queue test_rgbimg
    test_scriptpackages test_socket test_socket_ssl test_socketserver
    test_sunaudiodev test_thread test_threaded_import
    test_threadedtempfile test_threading test_timeout test_urllibnet
    test_winreg test_winsound
    12 skips unexpected on linux2:
    test_threadedtempfile test_threaded_import test_fork1 test_rgbimg
    test_threading test_socket test_thread test_queue test_asynchat
    test_audioop test_imageop test_logging

    Here are the details:

    test test_imp failed -- expected imp.lock_held() to be True

    test_bsddb
    test test_bsddb failed -- errors occurred; run in verbose mode for
    details
    test_bsddb185
    test_bsddb185 skipped -- No module named bsddb185
    test_bsddb3
    test_bsddb3 skipped -- Use of the `bsddb' resource not enabled


    > You seem to be trying to run before you're walking.
    > 32->64-bit issues do pop up from time to time. It's possible that the
    > Python developers have tackled this already but that the numarray folks
    > haven't.


    Bingo. I maintain and extend numarray.

    > If you've run "make test" successfully in your Python build
    > directory, I would imagine you've figured out "how to get Python up and
    > running on an Altix", and that the problem lives in the numarray code. If
    > not, you probably need to ignore numarray for the time being while you debug
    > your basic Python configuration.


    Sounds like I may be OK already.

    > At first glance, it looks like whoever wrote to the block in question got
    > carried away and scribbled off the end of the block. You might try gdb's
    > "watch" command to keep an eye on that address. You'll probably want to be
    > careful when you set the watchpoint (whittle your failing test case down as
    > small as possible, set the watchpoint as close as possible to the error) to
    > avoid slowing your code down to an unmanageable crawl.


    This sounds like good advice. I'll give it a try and report back
    later.

    Thanks for the help,
    Todd

    --
    Todd Miller
    STSCI / ESS / SSB
     
    Todd Miller, Oct 7, 2003
    #3
  4. Todd Miller


    Re: Python on Altix (re-cap)

    Todd Miller wrote:
    > I'm trying to set up Python on a 64-bit Linux super computer which looks
    > like this:
    >
    > Linux somewhere.in.au 2.4.20-sgi221r3 #1 SMP Tue Jul 22 15:32:18 PDT
    > 2003 ia64 unknown
    >
    > I configure Python like this:
    >
    > ./configure --prefix=$HOME --with-pydebug --without-threads
    >
    > But as soon as I try to run my software self-tests (for the numarray
    > Numeric-like array package), I get this:
    >
    > python(11030): unaligned access to 0x60000fffffff61cc,
    > ip=0x40000000002b9221


    This message is apparently a kernel-level debug message; it does not
    appear on stdout or stderr. It went away when I switched off
    --with-pydebug.
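
    As a quick check on the address in that kernel message, the arithmetic
    below tests which power-of-two boundaries it falls on. The helper name
    alignment_report is hypothetical. The result (a multiple of 4 but not of
    8) would be consistent with an 8-byte access through a pointer that is
    only 4-byte aligned, but that is an inference; the message itself does not
    say how wide the access was.

```python
def alignment_report(address, sizes=(2, 4, 8, 16)):
    """Map each access size in bytes to whether `address` is a multiple of it."""
    return {size: address % size == 0 for size in sizes}


# Address copied from the "unaligned access" kernel message.
addr = 0x60000fffffff61cc
report = alignment_report(addr)
for size, aligned in report.items():
    print(f"{size:2d}-byte aligned: {aligned}")
```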

    >
    > Debug memory block at address p=0x6000000000718310:
    > 88 bytes originally requested
    > The 4 pad bytes at p-4 are FORBIDDENBYTE, as expected.
    > The 4 pad bytes at tail=0x6000000000718368 are not all FORBIDDENBYTE
    > (0xfb):
    > at tail+0: 0x02 *** OUCH
    > at tail+1: 0x00 *** OUCH
    > at tail+2: 0x00 *** OUCH
    > at tail+3: 0x00 *** OUCH
    > The block was made by call #0 to debug malloc/realloc.
    > Data at p: 00 00 00 00 00 00 00 00 ... 01 00 00 00 00 00 00 00
    > Fatal Python error: bad trailing pad byte
    > Abort (core dumped)


    This message was related to a bug in the numarray setup.py. Once I
    configured numarray for LP64=1, it disappeared.

    Interestingly, there is no discernible difference between the
    sys.platform of i386 Linux and the sys.platform of ia64 Linux;
    they are both "linux2". I worked around this by adding a branch based
    on sys.maxint > 2**31-1 to distinguish between 32-bit and 64-bit Linux.
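
    For readers on a current interpreter, here is a minimal sketch of the same
    kind of check. It is not the code that went into numarray's setup.py: the
    function names are hypothetical, and it uses sys.maxsize because
    sys.maxint was removed in Python 3. The struct-based part is an added
    assumption about what an LP64 test might look at (8-byte C long and 8-byte
    pointers).

```python
import struct
import sys


def is_64bit_python():
    """True when the interpreter's native index type is wider than 32 bits."""
    # sys.maxsize plays the role of the sys.maxint comparison in the post.
    return sys.maxsize > 2**31 - 1


def c_type_sizes():
    """Sizes in bytes of the native C long and pointer types."""
    return {"long": struct.calcsize("l"), "pointer": struct.calcsize("P")}


def is_lp64():
    """True when both C long and pointers are 8 bytes wide (the LP64 model)."""
    sizes = c_type_sizes()
    return sizes["long"] == 8 and sizes["pointer"] == 8


print(sys.platform, is_64bit_python(), c_type_sizes(), is_lp64())
```

    Checking the C type sizes directly avoids relying on sys.platform, which
    does not encode the word size.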

    I fixed a total of 3 other bugs in numarray: one out-and-out bug, one
    64-bit self-test issue, and one C type casting portability issue.

    >
    > I've poked around with this for a few hours, but I'm not getting much
    > out of GDB. Does anyone have any suggestions on how to get Python up
    > and running on an Altix or how to solve this problem more generally?
    >


    Even with Skip's suggestion to use watchpoints, I still had poor luck
    with GDB compared to 32-bit Linux. I did learn the *<hex_address>
    syntax for specifying addresses rather than symbolic names for
    breakpoints and watchpoints.

    With pydebug switched on, numarray still dumps core (on exit?) like this:

    python: Modules/gcmodule.c:231: visit_decref: Assertion `gc->gc.gc_refs
    != 0' failed.

    With pydebug off, all is quiet and all the numarray self-tests now pass.


    > Please CC me if you respond.


    Thanks again for the pointers.

    Todd
     
    Todd Miller, Oct 8, 2003
    #4