Strange crash issue on Windows w/ PyGTK, Cairo...

CJ Kucera

Hello list!

I'm having a strange issue, and I'm not entirely certain yet where
the actual problem is (ie, Python, PyGTK, or gtk+), but I figure I'll
start here. Bear with me, this'll probably be a long explanation...

I've been building an app which is meant to be run on both Linux and
Windows. It uses PyGTK for its GUI, and the main area of the app is
a gtk.DrawingArea which I draw on using PyCairo. I've been developing
on Linux, and it works great on that platform, with no issues that
I'm aware of. When running on Windows, though, the app exhibits the
following behavior:

1) When the .py of the main file which runs the application GUI first
gets compiled to a .pyc (ie: the first time it's run, or the first
time after .py modification), the application runs totally fine, with
no apparent problems.

2) Any attempt AFTER that, the application will start up, *start* to
do its data-loading, but then almost immediately crash with an
enigmatic "python.exe has generated errors and will be closed by
Windows." When it does so, there is no output whatsoever to the
console that the application was launched from, and the crash doesn't
always happen in exactly the same place.

The pattern remains the same, though - if the .pyc needs to be compiled,
the application works fine, but if not: boom.

I've been steadily stripping the program down to what I hoped would be a
small, reproducible app that I could post here, and I do intend to do so
still, but it's rather slow going. For now, I was hoping to see if
anyone's ever heard of behavior like this before, and might know what
to do about it, or at least a possible avenue of attack.

As I've been reducing the program down, I've encountered even stranger
(IMO) behavior... In one instance, changing a function name seemed to
make the program work. I took out the handler which draws my app's
"About" box, and suddenly my problem went away. Occasionally I would
remove a function and the app would suddenly *always* fail with that
Windows crash error, and I'd have to put the function back in. Keep
in mind, these are functions which *aren't being called anywhere.*

Sometimes I could replace a function's entire contents with just "pass"
and the app would suddenly behave properly, or not behave at all.

It's almost as if whatever's doing the byte-compilation is getting
screwed up somehow, and really small changes to parts of the file which
aren't even being touched are having a huge impact on the application as
a whole. It's seriously vexing, and certainly one of the oddest problems
I've seen in Python.

Windows versions I can reproduce this on: XP and win2k
Python versions I've reproduced this on:
Python 2.5.4 with:
PyGTK 2.12.1-2-win32-py2.5
PyGObject 2.14.1-1.win32-py2.5
PyCairo 1.4.12-1.win32-py2.5
Python 2.6.1 with:
PyGTK 2.12.1-3-win32-py2.6
PyGObject 2.14.2-2.win32-py2.6
PyCairo 1.4.12-2.win32-py2.6
gtk+ 2.12.9-win32-2 (from http://sf.net/projects/gladewin32 , which is
the version linked to from pygtk.org)

The 2.6 Python stuff I've actually only tried on win2k so far, not XP,
though given my history with this, I suspect that that wouldn't make a
difference.

Since gtk+ is the one bit of software that hasn't been swapped out for
another version, I suppose that perhaps that's where the issue is, but
it seems like Python should be able to at least throw an Exception or
something instead of just having a Windows crash. And having it work
the FIRST time, when the .pyc's getting compiled, is rather suspicious.

Anyway, I'll continue trying to pare this app down to one manageable
script which I can post here, but until then I'd be happy to hear ideas
from anyone else about this.

Thanks!

-CJ
 
bieffe62

It looks like one of the C extensions you are using is causing a
segfault or similar in the Python interpreter (or it could be a bug in
the interpreter itself, but that is a lot less likely).
I would suggest filling the startup portion of your code with trace
statements to try to understand which module function is the
troublesome one, then go looking in the bug tracking system of the
module, try the newest version, and ask on the dedicated mailing list,
if any.
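
One way to read the "trace statements" suggestion is sketched below. It is only an illustration, not code from the thread: it uses sys.settrace to log every Python-level function call and flushes after each line, so the last entry should still reach the log even if a C extension then takes the whole process down. The names make_call_tracer, run_traced, load_data and parse are hypothetical, and the sketch is written for Python 3 although the thread used 2.5 and 2.6. On Python 3.3 and later the standard faulthandler module may be a simpler first step.

```python
import sys

def make_call_tracer(log):
    """Return a trace function that logs each Python function call to `log`."""
    def tracer(frame, event, arg):
        if event == "call":
            code = frame.f_code
            log.write("%s:%d %s\n" % (code.co_filename, frame.f_lineno, code.co_name))
            log.flush()  # hand the line to the OS now, before any later hard crash
        return None  # no per-line tracing inside the called function
    return tracer

def run_traced(func, log):
    """Call func() with call tracing enabled, then switch tracing off again."""
    sys.settrace(make_call_tracer(log))
    try:
        return func()
    finally:
        sys.settrace(None)

# Hypothetical stand-ins for the application's startup code.
def parse(raw):
    return raw.upper()

def load_data():
    return parse(b"abc")
```

Calling run_traced(load_data, open("trace.log", "w")) would leave one line per call in trace.log, and the final line points at the last Python function entered before the crash. Only Python-level calls are seen, so the faulty C function itself does not appear, just its caller.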

Making a small script that can reproduce the bug is also a very good
idea, and will help speed up the problem solution.

Ciao
 
CJ Kucera

bieffe62 said:
It looks like one of the C extensions you are using is causing a
segfault or similar in the Python interpreter (or it could be a bug in
the interpreter itself, but that is a lot less likely).

Okay... I assume by "C extension" you'd include the PyGTK stuff, right?
(ie: pycairo, pygobject, and pygtk) Those are the only extras I've got
installed, otherwise it's just a base Python install.

Would a bad extension really cause this kind of behavior though?
Specifically the working-the-first-time and crash-subsequent-times? Do
C extensions contribute to the bytecode generated while compiling?

bieffe62 said:
I would suggest filling the startup portion of your code with trace
statements to try to understand which module function is the
troublesome one, then go looking in the bug tracking system of the
module, try the newest version, and ask on the dedicated mailing list,
if any.

Are you talking about just throwing in various print statements, to find
out where exactly it's dying? Or I see that there is an actual "trace"
module in Python... I did do the former a while ago and didn't find
anything conclusive really. It was when approaching it from that angle
that I stumbled across the case that if I simply renamed one of my
functions, everything started working again. I'll do this a bit more
once I've gotten the program down to a more manageable level.

bieffe62 said:
Making a small script that can reproduce the bug is also a very good
idea, and will help speed up the problem solution.

Right, that's the goal. Right now it's still pretty unwieldy.
I'll keep on it.

Thanks for the response!

-CJ
 
MRAB

CJ said:
Okay... I assume by "C extension" you'd include the PyGTK stuff, right?
(ie: pycairo, pygobject, and pygtk) Those are the only extras I've got
installed, otherwise it's just a base Python install.

Would a bad extension really cause this kind of behavior though?
Specifically the working-the-first-time and crash-subsequent-times? Do
C extensions contribute to the bytecode generated while compiling?
[snip]
One time I was doing some speed tests and I found that the code was
faster the first time it ran, when it had to compile to .pyc. If there
had been caching then it would've been faster after the first time, but
_slower_?

My conclusion was that it was down to memory allocation. If the
interpreter had claimed memory from the OS because it had to compile to
.pyc then more would be immediately available during my timing, hence
higher speed.

So perhaps your problem is that one of the C extensions segfaults unless
more memory is immediately available because the interpreter had to
compile to .pyc.
 
bieffe62

CJ said:
Okay... I assume by "C extension" you'd include the PyGTK stuff, right?
(ie: pycairo, pygobject, and pygtk) Those are the only extras I've got
installed, otherwise it's just a base Python install.

Would a bad extension really cause this kind of behavior though?
Specifically the working-the-first-time and crash-subsequent-times? Do
C extensions contribute to the bytecode generated while compiling?

If you have worked with C/C++, you know that memory-related bugs can
be very tricky.
More than once, working with C code, I had crashes that disappeared
if I just added a 'printf', because the memory allocation scheme
changed and the memory being corrupted was no longer relevant.

Ciao
 
CJ Kucera

bieffe62 said:
If you have worked with C/C++, you know that memory-related bugs can
be very tricky.
More than once, working with C code, I had crashes that disappeared
if I just added a 'printf', because the memory allocation scheme
changed and the memory being corrupted was no longer relevant.

Well, you turned out to be dead right about this, as I suppose should
have been pretty obvious given the nature of the problems I was having.

Anyway, the issue turned out to be zlib.decompress() - for larger sets
of data, if I wasn't specifying "bufsize," the malloc()s that it was
doing behind-the-scenes must have been clobbering memory. As soon as I
specified bufsize, everything was totally kosher.

Once I'm a bit more awake tomorrow I'll put together a testcase and send
it in to the bug tracker.

This does bring up one question: for larger chunks of data, is it More
Appropriate to use a zlib decompression object instead of just passing
it all through zlib.decompress()?
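
For reference, the two approaches in that question can be put side by side. This is a minimal sketch under assumptions rather than the application's actual code: the helper decompress_in_chunks, the 64 KiB chunk size and the sample payload are made up for illustration, and it is written for Python 3 (bytes rather than the str objects of Python 2.5/2.6).

```python
import zlib

def decompress_in_chunks(compressed, chunk_size=64 * 1024):
    """Decompress a zlib stream piece by piece using a decompression object."""
    decomp = zlib.decompressobj()
    pieces = []
    for start in range(0, len(compressed), chunk_size):
        # Feed the decompressor one slice of the compressed input at a time.
        pieces.append(decomp.decompress(compressed[start:start + chunk_size]))
    pieces.append(decomp.flush())  # collect anything still buffered
    return b"".join(pieces)

# Hypothetical sample data standing in for the application's data files.
payload = b"map tile data " * 100000
packed = zlib.compress(payload)

# One-shot call; the third argument is the initial output buffer size
# (the "bufsize" discussed in this thread).
one_shot = zlib.decompress(packed, zlib.MAX_WBITS, len(payload))

# Same result via a decompression object.
streamed = decompress_in_chunks(packed)
```

Both calls return identical data. Because this sketch joins all the pieces at the end, it does not save memory by itself; the decompression object mainly helps when each piece can be processed or written out as it arrives.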

Thanks, everyone...

-CJ
 
CJ Kucera

CJ said:
Anyway, the issue turned out to be zlib.decompress() - for larger sets
of data, if I wasn't specifying "bufsize," the malloc()s that it was
doing behind-the-scenes must have been clobbering memory. As soon as I
specified bufsize, everything was totally kosher.

Okay, I've got a reproducible testcase of this available up here:

http://apocalyptech.com/pygtk-zlib/

I'm no longer *totally* convinced that it's a zlib issue... zlib's call
actually returns a valid string, and the error happens later in the app.
I've yet to be able to engineer a crash using anything other than that
cairo.ImageSurface.create_from_png() function, so it's possible that
specifying "bufsize" in zlib.decompress() merely allocates memory in
such a way that a bug in PyCairo doesn't come to light in that case.

So, I'm not really sure if I should submit this to Python or PyGTK's
tracker yet. Could someone check it out and let me know what you think?
That'd be great. Thanks!

As I mention on that page, removing "import os" and "import sys" will
"fix" the issue on XP, though you can remove them on win2k and still see
the crash.

Thanks,
CJ
 
CJ Kucera

CJ said:
Okay, I've got a reproducible testcase of this available up here:
http://apocalyptech.com/pygtk-zlib/

Hello, two brief notes here:

1) Someone on the PyGTK list mentioned that I should really be using
StringIO instead of my own hacky attempt at one, in there, and of course
he was right. Replacing my class with StringIO doesn't result in any
changed behavior, at least, so I can definitely rule out any weirdness
that my own file-like class could have introduced. I've uploaded a new
version of run.py to the above URL which includes that change.

2) It looks like specifying "bufsize" isn't actually the totally magic
bullet for this. I had put in that fix to my application, so it was
using bufsize, and things seemed to be working all right, but after four
or five data loads, I got that Windows crash again, regardless. The fix
does make the application stable enough that I'm not that worried about
it, at least, but it doesn't seem to have totally bypassed whatever bug
I'm running into.
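
For anyone following along without the test case, the StringIO change in point 1 amounts to something like the sketch below. It is an assumption-laden illustration, not the contents of run.py: as_file_object and the fake payload are hypothetical, and io.BytesIO is used as the Python 3 counterpart for binary data of the Python 2 StringIO mentioned above. The resulting object is presumably what would be handed to cairo.ImageSurface.create_from_png(); that call is left out so the sketch runs without PyCairo.

```python
import io
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the 8 bytes every PNG file starts with

def as_file_object(compressed):
    """Decompress `compressed` and wrap the result in an in-memory file object."""
    return io.BytesIO(zlib.decompress(compressed))

# Stand-in payload: a PNG signature plus filler, NOT a valid image.
fake_png = PNG_SIGNATURE + b"\x00" * 32

stream = as_file_object(zlib.compress(fake_png))
header = stream.read(8)  # behaves like a file opened in binary mode
stream.seek(0)           # rewind before handing the object to a consumer
```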

-CJ
 
CJ Kucera

CJ said:
Okay, I've got a reproducible testcase of this available up here:
http://apocalyptech.com/pygtk-zlib/

I'm no longer *totally* convinced that it's a zlib issue... zlib's call
actually returns a valid string, and the error happens later in the app.

Hello, again, list. One last update on this, in case anyone happened to
be curious about it. After talking with the PyCairo folks, it looks
like this *was* a PyCairo issue after all. The changes to the zlib
call which had apparently fixed the issue were incidental: the modified
variables changed how memory ended up being allocated, which hid the
bug.

Anyway, a recent fix in PyCairo CVS seems to take care of it. zlib, as
probably should have been expected, is vindicated!

-CJ
 
