PEP 3147 - new .pyc format

John Roth

PEP 3147 has just been posted, proposing that, beginning in release
3.2 (and possibly 2.7) compiled .pyc and .pyo files be placed in a
directory with a .pyr extension. The reason is so that compiled
versions of a program can coexist, which isn't possible now.

Frankly, I think this is a really good idea, although I've got a few
comments.

1. Apple's MAC OS X should be mentioned, since 10.5 (and presumably
10.6) ship with both Python release 2.3 and 2.5 installed.

2. I think the proposed logic is too complex. If this is installed in
3.2, then that release should simply store its .pyc file in the .pyr
directory, without the need for either a command line switch or an
environment variable (both are mentioned in the PEP.)

3. Tool support. There are tools that look for the .pyc files; these
need to be upgraded somehow. The ones that ship with Python should, of
course, be fixed with the PEP, but there are others.

4. I'm in favor of putting the source in the .pyr directory as well,
but that's got a couple more issues. One is tool support, which is
likely to be worse for source, and the other is some kind of algorithm
for identifying which source goes with which object.

Summary: I like it, but I think it needs a bit more work.

John Roth
 
Mensanator

PEP 3147 has just been posted, proposing that, beginning in release
3.2 (and possibly 2.7) compiled .pyc and .pyo files be placed in a
directory with a .pyr extension. The reason is so that compiled
versions of a program can coexist, which isn't possible now.

Frankly, I think this is a really good idea, although I've got a few
comments.

1. Apple's MAC OS X should be mentioned, since 10.5 (and presumably
10.6) ship with both Python release 2.3 and 2.5 installed.

Mac OSX 10.6 has 2.6 installed.
 
MRAB

John said:
PEP 3147 has just been posted, proposing that, beginning in release
3.2 (and possibly 2.7) compiled .pyc and .pyo files be placed in a
directory with a .pyr extension. The reason is so that compiled
versions of a program can coexist, which isn't possible now.

Frankly, I think this is a really good idea, although I've got a few
comments.

1. Apple's MAC OS X should be mentioned, since 10.5 (and presumably
10.6) ship with both Python release 2.3 and 2.5 installed.

2. I think the proposed logic is too complex. If this is installed in
3.2, then that release should simply store its .pyc file in the .pyr
directory, without the need for either a command line switch or an
environment variable (both are mentioned in the PEP.)

3. Tool support. There are tools that look for the .pyc files; these
need to be upgraded somehow. The ones that ship with Python should, of
course, be fixed with the PEP, but there are others.

4. I'm in favor of putting the source in the .pyr directory as well,
but that's got a couple more issues. One is tool support, which is
likely to be worse for source, and the other is some kind of algorithm
for identifying which source goes with which object.

Summary: I like it, but I think it needs a bit more work.

The PEP has a .pyr directory for each .py file:

foo.py
foo.pyr/
    f2b30a0d.pyc # Python 2.5
    f2d10a0d.pyc # Python 2.6
    f2d10a0d.pyo # Python 2.6 -O
    f2d20a0d.pyc # Python 2.6 -U
    0c4f0a0d.pyc # Python 3.1

Other possibilities are:

1. A single pyr directory:

foo.py
pyr/
    foo.f2b30a0d.pyc # Python 2.5
    foo.f2d10a0d.pyc # Python 2.6
    foo.f2d10a0d.pyo # Python 2.6 -O
    foo.f2d20a0d.pyc # Python 2.6 -U
    foo.0c4f0a0d.pyc # Python 3.1

2. A .pyr directory for each version of Python:

foo.py
f2b30a0d.pyr/ # Python 2.5
    foo.pyc
f2d10a0d.pyr/ # Python 2.6/Python 2.6 -O
    foo.pyc
    foo.pyo
f2d20a0d.pyr/ # Python 2.6 -U
    foo.pyc
0c4f0a0d.pyr/ # Python 3.1
    foo.pyc
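
To compare the three layouts side by side, here is a minimal sketch that maps a source path and a magic-number string to the compiled file's location under each scheme. The function names are hypothetical and are not part of the PEP or of Python; posixpath is used only so the output always uses "/" separators.

```python
import posixpath


def per_module_dir(source, magic):
    """PEP layout: foo.py -> foo.pyr/<magic>.pyc"""
    base, _ = posixpath.splitext(source)
    return posixpath.join(base + ".pyr", magic + ".pyc")


def single_dir(source, magic):
    """Alternative 1: foo.py -> pyr/foo.<magic>.pyc"""
    directory, filename = posixpath.split(source)
    name, _ = posixpath.splitext(filename)
    return posixpath.join(directory, "pyr", "%s.%s.pyc" % (name, magic))


def per_version_dir(source, magic):
    """Alternative 2: foo.py -> <magic>.pyr/foo.pyc"""
    directory, filename = posixpath.split(source)
    name, _ = posixpath.splitext(filename)
    return posixpath.join(directory, magic + ".pyr", name + ".pyc")


for layout in (per_module_dir, single_dir, per_version_dir):
    print(layout("pkg/foo.py", "f2b30a0d"))
```

Only the first layout creates one directory per source file; the other two keep the directory count independent of the number of modules.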
 
John Bokma

MRAB said:
The PEP has a .pyr directory for each .py file:

foo.py
foo.pyr/
    f2b30a0d.pyc # Python 2.5
    f2d10a0d.pyc # Python 2.6
    f2d10a0d.pyo # Python 2.6 -O
    f2d20a0d.pyc # Python 2.6 -U
    0c4f0a0d.pyc # Python 3.1

wow: so much for human readable file names :-(
 
Carl Banks

PEP 3147 has just been posted, proposing that, beginning in release
3.2 (and possibly 2.7) compiled .pyc and .pyo files be placed in a
directory with a .pyr extension. The reason is so that compiled
versions of a program can coexist, which isn't possible now.

Frankly, I think this is a really good idea, although I've got a few
comments.


-1

I think it's a terrible, drastic approach to a minor problem. I'm not
sure why the simple approach of just appending a number (perhaps the
major-minor version, or a serial number) to the filename wouldn't
work, like this:

foo.pyc25

All I can think of is they are concerned with the typically minor
expense of listing the directory (to see if there's already a .pyc??
file present). This operation can be reasonably cached; when scanning
a directory listing it need only record all occurrences of .pyc?? and
mark those modules as subject to version-specific .pyc files. Anyway,
I'd expect the proposed -R switch would only be used in special cases
(like installation time) when a minor inefficiency would be tolerable.
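
The caching idea can be sketched roughly as follows: scan a directory listing once and remember which modules have version-suffixed bytecode. The two-digit suffix pattern and the function name are assumptions made for illustration; nothing like this exists in Python's import machinery.

```python
import re

# Assumed naming scheme from the suggestion above: foo.pyc25, foo.pyc26, ...
VERSIONED = re.compile(r"^(?P<module>.+)\.pyc(?P<tag>\d\d)$")


def versioned_modules(filenames):
    """Map module name -> set of version tags seen in one directory listing."""
    found = {}
    for name in filenames:
        match = VERSIONED.match(name)
        if match:
            found.setdefault(match.group("module"), set()).add(match.group("tag"))
    return found


listing = ["foo.py", "foo.pyc25", "foo.pyc26", "bar.py", "bar.pyc"]
cache = versioned_modules(listing)
print(cache)
```

Modules that do not appear in the cache (here, bar) would fall back to the plain .pyc lookup, so the extra cost is one pass over a listing per directory.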


1. Apple's MAC OS X should be mentioned, since 10.5 (and presumably
10.6) ship with both Python release 2.3 and 2.5 installed.

2. I think the proposed logic is too complex. If this is installed in
3.2, then that release should simply store its .pyc file in the .pyr
directory, without the need for either a command line switch or an
environment variable (both are mentioned in the PEP.)

This is utterly unacceptable. Versioned *.pyc files should only be
optionally requested by people who have to deal with multiple versions,
such as distro package maintainers. For my projects I don't give a
flying F about versioned *.pyc and I don't want my project directory
cluttered with a million subdirectories. (It would be a bit more
tolerable if my directory was merely cluttered with *.pyc?? files, but
I'd still rather Python didn't do that unless asked.)

3. Tool support. There are tools that look for the .pyc files; these
need to be upgraded somehow. The ones that ship with Python should, of
course, be fixed with the PEP, but there are others.

How will this affect tools like Py2exe? Now you have a bunch of
identically-named *.pyc files.

4. I'm in favor of putting the source in the .pyr directory as well,
but that's got a couple more issues. One is tool support, which is
likely to be worse for source, and the other is some kind of algorithm
for identifying which source goes with which object.

Now this is just too much. I didn't like the suggestion that I should be
forced to put up with dozens of subdirectories, and now you want to
force me to put the source files into the subdirectories as well?
That would be a deal-breaker. Thankfully it is too ridiculous to ever
happen.

Summary: I like it, but I think it needs a bit more work.

I hope it's replaced with something less drastic.


Carl Banks
 
John Bokma

MRAB said:
The names are the magic numbers. It's all in the PEP.

Naming files using magic numbers is really beyond me. The fact that the
above needs comments to explain what's what already shows to me that
there's a problem with this naming scheme. What if for one reason or
another I want to delete all pyc files for Python 2.5? Where do I look
up the magic number?
 
Neil Hodgson

John Roth:
4. I'm in favor of putting the source in the .pyr directory as well,
but that's got a couple more issues. One is tool support, which is
likely to be worse for source, and the other is some kind of algorithm
for identifying which source goes with which object.

Many tools work recursively except for hidden directories so would
return both the source in the repository as well as the original source.
If you want to do this then the repository directory should be hidden by
starting with ".".

Neil
 
MRAB

John said:
Naming files using magic numbers is really beyond me. The fact that the
above needs comments to explain what's what already shows to me that
there's a problem with this naming scheme. What if for one reason or
another I want to delete all pyc files for Python 2.5? Where do I look
up the magic number?

True. You might also want to note that "Python 2.6 -U" appears to have a
different magic number from "Python 2.6" and "Python 2.6 -O".

I don't know whether they always change for each new version.
 
Steven D'Aprano

PEP 3147 has just been posted, proposing that, beginning in release 3.2
(and possibly 2.7) compiled .pyc and .pyo files be placed in a directory
with a .pyr extension. The reason is so that compiled versions of a
program can coexist, which isn't possible now.


http://www.python.org/dev/peps/pep-3147/


Reading through the PEP, I went from an instinctive "oh no, that sounds
horrible" reaction to "hmmm, well, that doesn't sound too bad". I don't
think I need this, but I could live with it.

Firstly, it does sound like there is a genuine need for a solution to the
problem of multiple Python versions. This is not the first PEP trying to
solve it, so even if you personally don't see the need, we can assume
that others do.

Secondly, the current behaviour will remain unchanged. Python will
compile spam.py to spam.pyc (or spam.pyo with the -O switch) by default.
If you don't need to support multiple versions, you don't need to care
about this PEP. I like this aspect of the PEP very much. I would argue
that any solution MUST support the status quo for those who don't care
about multiple versions.

To get the new behaviour, you have to explicitly ask for it. You ask for
it by calling python with the -R switch, by setting an environment
variable, or explicitly providing the extra spam/<magic>.pyc files.

Thirdly, the magic file names aren't quite as magic as they appear at
first glance. They represent the hexified magic number of the version of
Python. More about the magic number here:

http://nedbatchelder.com/blog/200804/the_structure_of_pyc_files.html

Unfortunately the magic number doesn't seem to be documented anywhere I
can find other than in the source code (import.c). The PEP gives some
examples:

f2b30a0d.pyc # Python 2.5
f2d10a0d.pyc # Python 2.6
f2d10a0d.pyo # Python 2.6 -O
f2d20a0d.pyc # Python 2.6 -U
0c4f0a0d.pyc # Python 3.1

but how can one map magic numbers to versions, short of reading import.c?
I propose that sys grow an object sys.magic which is the hexlified magic
number.
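
As a rough illustration of what such a value could look like, the sketch below hexlifies the running interpreter's magic number. It assumes a later Python 3 release in which importlib.util.MAGIC_NUMBER is available; the helper name hex_magic is hypothetical, and no sys.magic attribute is implied to exist.

```python
import binascii
import importlib.util


def hex_magic():
    """Return the running interpreter's bytecode magic number as hex text."""
    return binascii.hexlify(importlib.util.MAGIC_NUMBER).decode("ascii")


magic = hex_magic()
print(magic)  # eight hex digits; the value depends on the interpreter
```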

2. I think the proposed logic is too complex. If this is installed in
3.2, then that release should simply store its .pyc file in the .pyr
directory, without the need for either a command line switch or an
environment variable (both are mentioned in the PEP.)

I disagree. Making the new behaviour optional is an advantage, even if it
leads to extra complexity. It is pointless forcing .pyc files to be in a
subdirectory if you don't need multiple versions.

3. Tool support. There are tools that look for the .pyc files; these
need to be upgraded somehow. The ones that ship with Python should, of
course, be fixed with the PEP, but there are others.

Third party tools will be the responsibility of the third parties.

4. I'm in favor of putting the source in the .pyr directory as well, but
that's got a couple more issues. One is tool support, which is likely to
be worse for source, and the other is some kind of algorithm for
identifying which source goes with which object.

It certainly does.

What's the advantage of forcing .py files to live inside a directory with
the same name?

Modules:
mymodule.py => mymodule/mymodule.py

Packages:
mypackage/__init__.py => mypackage/__init__/__init__.py
mypackage/spam.py => mypackage/spam/spam.py


Seems like a pointless and annoying extra layer to me.
 
Paul Rubin

Ben Finney said:
Mapping magic numbers to versions is infeasible and will be incomplete:
Any mapping that exists in (say) Python 3.1 can't know in advance what
the magic number will be for Python 4.5.

But why do the filenames have magic numbers instead of version numbers?
 
Steven D'Aprano

But why do the filenames have magic numbers instead of version numbers?

The magic number changes with each incompatible change in the byte code
format, which is not the same as each release. Selected values taken from
import.c:

Python 2.5a0: 62071
Python 2.5a0: 62081 (ast-branch)
Python 2.5a0: 62091 (with)
Python 2.5a0: 62092 (changed WITH_CLEANUP opcode)
Python 2.5b3: 62101 (fix wrong code: for x, in ...)
Python 2.5b3: 62111 (fix wrong code: x += yield)
Python 2.5c1: 62121 (fix wrong lnotab with for loops and
                     storing constants that should have been
                     removed)
Python 2.5c2: 62131 (fix wrong code: for x, in ... in
                     listcomp/genexp)


http://svn.python.org/view/python/trunk/Python/import.c?view=markup

The relationship between byte code magic number and release version
number is not one-to-one. We could have, for the sake of the argument,
releases 3.2.3 through 3.5.0 (say) all having the same byte codes. What
version number should the .pyc file show?
 
Alf P. Steinbach

* Steven D'Aprano:
The magic number changes with each incompatible change in the byte code
format, which is not the same as each release. Selected values taken from
import.c:

Python 2.5a0: 62071
Python 2.5a0: 62081 (ast-branch)
Python 2.5a0: 62091 (with)
Python 2.5a0: 62092 (changed WITH_CLEANUP opcode)
Python 2.5b3: 62101 (fix wrong code: for x, in ...)
Python 2.5b3: 62111 (fix wrong code: x += yield)
Python 2.5c1: 62121 (fix wrong lnotab with for loops and
                     storing constants that should have been
                     removed)
Python 2.5c2: 62131 (fix wrong code: for x, in ... in
                     listcomp/genexp)


http://svn.python.org/view/python/trunk/Python/import.c?view=markup

The relationship between byte code magic number and release version
number is not one-to-one. We could have, for the sake of the argument,
releases 3.2.3 through 3.5.0 (say) all having the same byte codes. What
version number should the .pyc file show?

I don't know enough about Python yet to comment on your question, but, just an
idea: how about a human readable filename /with/ some bytecode version id (that
added id could be the magic number)?

I think that combo could serve both the human and machine needs, so to speak. :)


Cheers,

- Alf
 
Steven D'Aprano

I don't know enough about Python yet to comment on your question, but,
just an idea: how about a human readable filename /with/ some bytecode
version id (that added id could be the magic number)?

Sorry, that still doesn't work. Consider the hypothetical given above.
For simplicity, I'm going to drop the micro point versions, so let's say
that releases 3.2 through 3.5 all use the same byte-code. (In reality,
you'll be likely looking at version numbers like 3.2.1 rather than just
3.2.) Now suppose you have 3.2 and 3.5 both installed, and you look
inside your $PYTHONPATH and see this:

spam.py
spam.pyr/
    3.2-f2e70a0d.pyc


It would be fairly easy to have the import machinery clever enough to
ignore the version number prefix, so that Python 3.5 correctly uses
3.2-f2e70a0d.pyc. That part is easy.
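
A minimal sketch of that "easy part", assuming the hypothetical <version>-<magic>.pyc naming used in this example: the lookup matches on the magic-number suffix only and ignores whatever version prefix is present. The function name is made up for illustration.

```python
def find_bytecode(filenames, magic):
    """Return the first file whose name ends in -<magic>.pyc, else None.

    The version prefix (for example "3.2") is deliberately ignored.
    """
    suffix = "-" + magic + ".pyc"
    for name in filenames:
        if name.endswith(suffix):
            return name
    return None


# An interpreter whose magic number is f2e70a0d finds the file regardless
# of which release wrote it.
print(find_bytecode(["3.2-f2e70a0d.pyc"], "f2e70a0d"))
```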

(I trust nobody is going to suggest that Python should create multiple
redundant, identical, copies of the .pyc files. That would be just lame.)

But now consider the human reader. You want human-readable file names for
the benefit of the human reader. How is the human reader supposed to know
that Python 3.5 is going to use 3.2-f2e70a0d.pyc?

Suppose I'm running Python 3.5, and have a troubling bug, and I think "I
know, maybe there's some sort of problem with the compiled byte code, I
should delete it". I go looking for spam.pyr/3.5-*.pyc and don't find
anything.

Now I have two obvious choices:

(1) Start worrying about why Python 3.5 isn't writing .pyc files. Is my
installation broken? Have I set the PYTHONDONTWRITEBYTECODE environment
variable? Do I not have write permission to the folder? WTF is going on?
Confusion will reign.

(2) Learn that the 3.2- prefix is meaningless, and ignore it.

Since you have to ignore the version number prefix anyway, let's not lay
a trap for people by putting it there. You need to know the magic number
to do anything sensible with the .pyc files other than delete them, so
let's not pretend otherwise.

If you don't wish to spend time looking up the magic number, the solution
is simple: hit it with a sledgehammer. Why do you care *which*
specific .pyc file is being used anyway?

rm -r spam.pyr/

or for the paranoid:

rm spam.pyr/*.py[co]

(Obviously there will be people who care about the specific .pyc file
being deleted. Those people can't use a sledgehammer, they need one of
those little tack-hammers with the narrow head. But 99% of the time, a
sledgehammer is fine.)
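
For the tack-hammer case, a sketch along these lines would delete only the bytecode for one magic number. It assumes the foo.pyr/<magic>.pyc layout from the PEP examples; remove_bytecode is a hypothetical helper, and the temporary directory exists only to make the demonstration self-contained.

```python
import glob
import os
import tempfile


def remove_bytecode(root, magic):
    """Delete <root>/*.pyr/<magic>.pyc and .pyo; return the removed paths."""
    pattern = os.path.join(glob.escape(root), "*.pyr", magic + ".py[co]")
    removed = []
    for path in sorted(glob.glob(pattern)):
        os.remove(path)
        removed.append(path)
    return removed


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    pyr = os.path.join(root, "spam.pyr")
    os.mkdir(pyr)
    for name in ("f2b30a0d.pyc", "f2d10a0d.pyc"):
        open(os.path.join(pyr, name), "wb").close()
    removed = remove_bytecode(root, "f2b30a0d")
    remaining = sorted(os.listdir(pyr))

print(remaining)
```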

Frankly, unless you're a core developer or you're hacking byte-code, you
nearly always won't care about the .pyc files. You want the compiler to
manage them. And for the few times you do care, it isn't hard to find out:

>>> import imp, binascii
>>> binascii.hexlify(imp.get_magic())
b'4f0c0d0a'

Stick that in your "snippets" folder (you do have a snippets folder,
don't you?) and, as they say Down Under, "She'll be right mate".
 
Martin v. Loewis

Naming files using magic numbers is really beyond me. The fact that the
above needs comments to explain what's what already shows to me that
there's a problem with this naming scheme. What if for one reason or
another I want to delete all pyc files for Python 2.5? Where do I look
up the magic number?

py> import imp, binascii
py> binascii.hexlify(imp.get_magic())
'b3f20d0a'

(note: this is on a little-endian system)

Regards,
Martin
 
Martin v. Loewis

True. You might also want to note that "Python 2.6 -U" appears to have a
different magic number from "Python 2.6" and "Python 2.6 -O".

I don't know whether they always change for each new version.

Here is a recent list of magic numbers:

Python 2.6a0: 62151 (peephole optimizations and STORE_MAP opcode)
Python 2.6a1: 62161 (WITH_CLEANUP optimization)
Python 2.7a0: 62171 (optimize list comprehensions/change LIST_APPEND)
Python 2.7a0: 62181 (optimize conditional branches:
                     introduce POP_JUMP_IF_FALSE and POP_JUMP_IF_TRUE)
Python 2.7a0 62191 (introduce SETUP_WITH)
Python 2.7a0 62201 (introduce BUILD_SET)
Python 2.7a0 62211 (introduce MAP_ADD and SET_ADD)

#define MAGIC (62211 | ((long)'\r'<<16) | ((long)'\n'<<24))

Regards,
Martin
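
The MAGIC macro above can be mirrored in Python to see how a number from the list turns into the four bytes at the start of a .pyc file. This is a sketch that assumes the value is written least-significant byte first, which agrees with the 'b3f20d0a' output quoted earlier in the thread for Python 2.5; pack_magic is a hypothetical helper name.

```python
import binascii
import struct


def pack_magic(number):
    """Combine a magic number with CR and LF as the MAGIC macro does,
    then pack the result as four little-endian bytes."""
    value = number | (ord("\r") << 16) | (ord("\n") << 24)
    return struct.pack("<I", value)


magic_25 = binascii.hexlify(pack_magic(62131))  # last Python 2.5 entry listed
magic_27 = binascii.hexlify(pack_magic(62211))  # value in the #define above
print(magic_25, magic_27)
```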
 
John Bokma

Steven D'Aprano said:
Sorry, that still doesn't work. Consider the hypothetical given above.
For simplicity, I'm going to drop the micro point versions, so let's say
that releases 3.2 through 3.5 all use the same byte-code. (In reality,

Based on the magic numbers I've seen so far it looks like that's not an
option. They increment with every minor change. So to me, at this moment
(and maybe it's my ignorance) it looks like a made up example to justify
what to me still looks like a bad decision.
 
Sean DiZazzo

Here is a recent list of magic numbers:

       Python 2.6a0: 62151 (peephole optimizations and STORE_MAP opcode)
       Python 2.6a1: 62161 (WITH_CLEANUP optimization)
       Python 2.7a0: 62171 (optimize list comprehensions/change LIST_APPEND)
       Python 2.7a0: 62181 (optimize conditional branches:
                            introduce POP_JUMP_IF_FALSE and POP_JUMP_IF_TRUE)
       Python 2.7a0  62191 (introduce SETUP_WITH)
       Python 2.7a0  62201 (introduce BUILD_SET)
       Python 2.7a0  62211 (introduce MAP_ADD and SET_ADD)

#define MAGIC (62211 | ((long)'\r'<<16) | ((long)'\n'<<24))

Regards,
Martin

Does "magic" really need to be used? Why not just use the revision
number?
 
