Larry Hastings
I'm an indie shareware Windows game developer. In indie shareware
game development, download size is terribly important; conventional
wisdom holds that--even today--your download should be 5MB or less.
I'd like to use Python in my games. However, python24.dll is 1.86MB,
and zips down to 877k. I can't afford to devote 1/6 of my download
to just the scripting interpreter; I've got music, and textures, and
my own crappy code to ship.
Following a friend's suggestion, as an experiment I downloaded the
Python 2.4.2 source, then set about stripping out everything I could.
I removed:
* Unicode support, including the CJK codecs
* All doc strings
* *Every* module written in C
Now when I build, python24.dll is 570k, and zips down to about 260k.
But I learned some things on the way.
First and foremost: turning off Py_USING_UNICODE *breaks the build*
on Windows. The following breakages were all fixed with judicious
applications of #ifdef Py_USING_UNICODE:
* The implementation of "multi-byte codecs" (CJK codecs) implicitly
assumes that they can use all the Unicode facilities. So all the
files in "Modules/cjkcodecs" fail to build.
* Obviously, the Unicode string object depends on Unicode support,
so Objects/unicode* doesn't build.
* There are several spots in the code that need to handle Unicode
strings in some slightly special way, and assume Unicode is turned
on. E.g.:
    * Modules/posixmodule.c, posix__getfullpathname(), line 1745
    * same file, posix_open(), starting on line 5201
    * Objects/fileobject.c, open_the_file(), starting on line 158
    * _winreg.c, Py2Reg(), starting on lines 724 and 777
In addition, there was one slightly more complicated problem: _winreg.c
assumes it should call PyUnicode_DecodeMBCS() to turn strings pulled
from the registry into Unicode strings. I'm not sure what the correct
thing to do here is; I went with changing the calls from
PyUnicode_DecodeMBCS() to PyString_FromStringAndSize() for non-Unicode
builds.
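
To make the _winreg.c change concrete, here is a minimal, self-contained
sketch of the guard pattern. It does not use the real Python headers:
struct fake_object, fake_decode_mbcs() and fake_string_from_size() are
hypothetical stand-ins for a PyObject, PyUnicode_DecodeMBCS() and
PyString_FromStringAndSize(), and registry_value_to_object() is an
invented name. Only the #ifdef Py_USING_UNICODE structure reflects the
change described above.

```c
#include <stddef.h>
#include <string.h>

/* Uncomment to simulate a Unicode-enabled build. */
/* #define Py_USING_UNICODE */

/* Hypothetical stand-in for a Python object; it only records which
   constructor produced it and how many bytes it was given. */
struct fake_object {
    const char *type_name;
    size_t length;
};

#ifdef Py_USING_UNICODE
/* Stand-in for PyUnicode_DecodeMBCS(); compiled only in Unicode builds. */
static struct fake_object fake_decode_mbcs(const char *data, size_t size)
{
    struct fake_object o = { "unicode", size };
    (void)data;
    return o;
}
#endif

/* Stand-in for PyString_FromStringAndSize(); always available. */
static struct fake_object fake_string_from_size(const char *data, size_t size)
{
    struct fake_object o = { "str", size };
    (void)data;
    return o;
}

/* Convert a string pulled from the registry, choosing the constructor
   at compile time so the non-Unicode build never references the
   Unicode API. */
static struct fake_object registry_value_to_object(const char *data, size_t size)
{
#ifdef Py_USING_UNICODE
    return fake_decode_mbcs(data, size);
#else
    return fake_string_from_size(data, size);
#endif
}
```

With Py_USING_UNICODE left undefined, registry_value_to_object("abc", 3)
takes the byte-string branch. Whether plain byte strings are the right
thing to hand back from the registry is still the open question.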
Of course, it's not the most important thing in the world--after all,
I'm the first person to even *notice*, right? But it seems a shame that
one can break the build so easily. If it pleases the stewards of
Python, I would be happy to submit patches that fix the non-"using
Unicode" build.
Second of all, the dumb-as-a-bag-of-rocks Windows linker (at least
the one used by VC++ under MSVS .Net 2003) *links in unused static
symbols*. If I want to excise the code for a module, it is not
sufficient to comment out the relevant _inittab line in config.c.
Nor does it help if I comment out the "extern" prototype for the
init function. As far as I can tell, the only way to *really* get
rid of a module, including all its static functions and static data,
is to actually *remove all the code* (with comments, or #if, or
whatnot). What a nosebleed, huh?
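
As a rough sketch of the "#if" approach: put everything a module owns,
statics included, inside one guard, and put its table entry inside the
same guard. The names here are hypothetical. WITHOUT_SPAM is not a macro
the Python source defines, the spam and eggs modules are invented, and
struct inittab_entry is a simplified stand-in for the table in config.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical switch: define WITHOUT_SPAM at compile time to drop the
   module entirely. Not a macro from the Python source. */

#ifndef WITHOUT_SPAM
/* Everything the module owns lives inside the guard, so none of its
   static data or functions reach the compiler when it is disabled. */
static const char spam_doc[] = "spam module";
static void initspam(void) { (void)spam_doc; }
#endif

/* A second invented module that is always built. */
static void initeggs(void) {}

/* Simplified stand-in for the built-in module table in config.c. */
struct inittab_entry {
    const char *name;
    void (*initfunc)(void);
};

static const struct inittab_entry inittab[] = {
#ifndef WITHOUT_SPAM
    {"spam", initspam},
#endif
    {"eggs", initeggs},
    {NULL, NULL}  /* sentinel */
};

/* Return 1 if a module with this name is in the table, else 0. */
static int module_is_builtin(const char *name)
{
    const struct inittab_entry *p;
    for (p = inittab; p->name != NULL; p++) {
        if (strcmp(p->name, name) == 0)
            return 1;
    }
    return 0;
}
```

Built as-is, both modules are in the table. Built with WITHOUT_SPAM
defined, the compiler never sees spam's code or data, so there is
nothing left for the linker to drag in.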
So in order to build my *really* minimal python24.dll, I have to hack
up the source something fierce. It would be pleasant if the Python
source code provided an easy facility for turning off modules at
compile-time. I would be happy to propose something / write a PEP
/ submit patches to do such a thing, if there is a chance that such
a thing could make it into the official Python source. However, I
realize that this has terribly limited appeal; that, and the fact
that Python releases are infrequent, makes me think it's not a
terrible hardship if I have to re-hack each new Python release
by hand.
Whatcha think, froods?
/larry/