Apparent magic number problem

  • Thread starter Colin J. Williams

Colin J. Williams

I have a program that I wish to run in both Python 2.7 and Python 3.2.

The program runs correctly under each version, but it runs more slowly
under 3.2.

This is probably due to the fact that the .pyc file is created for the
Python 2.7 execution.

When Python 3.2 is run, it fails to create a new .pyc file, and if the 2.7
.pyc is offered directly, a magic number problem is reported.

Is there a bug here? It seems to me that the Magic Number exception
should lead to a new compile of the program.

Colin W.
 

Steven D'Aprano

I have a program that I wish to run in both Python 2.7 and Python 3.2

The program runs correctly under each version, but it runs more slowly
under 3.2.

Without knowing what your program does, it is impossible to comment on
why it is slower under 3.2.

This is probably due to the fact that the .pyc file is created for the
Python 2.7 execution.

I doubt it.

When Python 3.2 is run it fails to create a new .pyc file and if the 2.7
.pyc is offered directly a magic number problem is reported.

What do you mean, "offered directly"? Can you show exactly how you are
running the program?

Is there a bug here? it seems to me that the Magic Number exception
should lead to a new compile of the program.

Certainly not. Suppose I try to run a .pyc file directly:

python32 myprogram.pyc


If myprogram.pyc is compiled for the correct version, it will run. If it
is not, then an error occurs. You want Python to recompile that. But
consider what that would mean:

- Python would have to *guess* which .py file it should compile. Just
because it can find something called "myprogram.py", doesn't mean that it
is the right file. You might have renamed the file after compiling it, or
moved it into a different folder. Who knows?

- After guessing what file to compile, it would have to compile it, and
*delete* the existing .pyc file, overwriting it with the newly compiled
version. This potentially loses data.

- And finally it would run the brand new .pyc file, which could do
something *completely different* from the .pyc file you thought you were
running.
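
As a rough illustration of the check behind that error (not of how the interpreter implements it): every .pyc begins with a 4-byte magic number identifying the bytecode format, and a mismatch is what gets reported. The sketch below assumes Python 3.4 or later, where importlib.util.MAGIC_NUMBER is available (on 3.2 the equivalent was imp.get_magic()). The helper name and the file names are invented for the example.

```python
import importlib.util
import os
import py_compile
import tempfile


def pyc_matches_running_interpreter(pyc_path):
    """Return True if the .pyc header carries this interpreter's magic number."""
    with open(pyc_path, "rb") as f:
        header = f.read(4)
    return header == importlib.util.MAGIC_NUMBER


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "myprogram.py")  # hypothetical file name
    with open(src, "w") as f:
        f.write('print("hello world")\n')

    # Compile with the running interpreter: the header should match.
    good = py_compile.compile(src, cfile=os.path.join(tmp, "good.pyc"), doraise=True)
    good_ok = pyc_matches_running_interpreter(good)

    # Simulate a .pyc from another version by overwriting the first 4 bytes.
    with open(good, "rb") as f:
        data = f.read()
    bad = os.path.join(tmp, "bad.pyc")
    with open(bad, "wb") as f:
        f.write(b"\x00\x00\x00\x00" + data[4:])
    bad_ok = pyc_matches_running_interpreter(bad)

print(good_ok, bad_ok)
```

A file that fails this comparison is simply rejected; nothing in the header says which source file it came from, which is the guessing problem described above.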


Python only writes .pyc files when you import a module, or if you use a
tool like compileall. Running a file as a script compiles it in memory
but does not save a .pyc file.

Likewise, if you run:

python myprogram.py

any myprogram.pyc file is ignored. .pyc files are only used if you
directly run them, or if you import them.
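
For the compileall route mentioned above, a minimal sketch might look like the following. It assumes a Python 3 interpreter and uses a throwaway directory and a made-up file name. compileall.compile_dir and importlib.util.cache_from_source are standard-library calls; everything else is scaffolding for the example.

```python
import compileall
import importlib.util
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "myprogram.py")  # hypothetical file name
    with open(src, "w") as f:
        f.write('print("hello world")\n')

    # Byte-compile every .py file in the directory without importing or running it.
    # quiet=1 suppresses the per-file listing; the result is truthy on success.
    ok = compileall.compile_dir(tmp, quiet=1)

    # Ask the interpreter where the cached bytecode for this source belongs.
    cached = importlib.util.cache_from_source(src)
    exists = os.path.exists(cached)

print(bool(ok), exists)
```

Note that the source file is never executed here, so "hello world" is not printed; only the bytecode file is written.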
 

Peter Otten

Colin said:
The program runs correctly under each version, but it runs more slowly
under 3.2.
This is probably due to the fact that the .pyc file is created for the
Python 2.7 execution.
When Python 3.2 is run it fails to create a new .pyc file and if the 2.7
.pyc is offered directly a magic number problem is reported.

(1) .pyc files are only created if a module is imported
(2) The 2.7 .pyc file is put alongside the .py file whereas the 3.2 .pyc is
put into the __pycache__ subfolder. No clash can occur.

A simple example:

$ ls
mod.py
$ cat mod.py
print("hello world")

Run it; no pyc is created:

$ python2.7 mod.py
hello world
$ ls
mod.py

Import it using 2.7:

$ python2.7 -c 'import mod'
hello world
$ ls
mod.py mod.pyc

Import it using 3.2:

$ python3.2 -c 'import mod'
hello world
$ ls
mod.py mod.pyc __pycache__
$ ls __pycache__/
mod.cpython-32.pyc

Run the compiled code:

$ python2.7 mod.pyc
hello world
$ python3.2 __pycache__/mod.cpython-32.pyc
hello world

But I'm with Steven, it's unlikely that the module compilation phase is
responsible for a noticeable slowdown.
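
The naming scheme in point (2) can also be queried from Python itself. This is a small sketch assuming Python 3.4 or later, where the function lives in importlib.util (on 3.2 it was imp.cache_from_source). "mod.py" is the same example file as above and does not need to exist for the call to work.

```python
import importlib.util
import os
import sys

# Where would this interpreter cache the bytecode for mod.py?
cached = importlib.util.cache_from_source("mod.py")

# The tag embedded in the file name identifies the implementation and version.
tag = sys.implementation.cache_tag

print(cached)
print(os.path.basename(cached) == "mod." + tag + ".pyc")
```

Because the tag differs between interpreter versions, each Python 3 version gets its own file in the cache directory and none of them collides with a 2.7 mod.pyc next to the source.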
 

Colin J. Williams

Thanks to Steven and Peter for their responses.

My main problem appears to be with:

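Profile with Python 2.7
ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
    11   25.736    2.340   25.736    2.340  {numpy.linalg.lapack_lite.dgesv}

Profile with Python 3.2
ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
    11  152.111   13.828  152.111   13.828  {built-in method dgesv}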

In other words, the Python 3.2 linear equation solve takes longer than
with Python 2.7. I'll pursue this with the numpy folk.
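
To compare a single call across interpreters without reading the whole report, the per-function figures can be pulled out programmatically. The following is only a sketch: workload() is a hypothetical stand-in for the expensive solve, and get_stats_profile() assumes Python 3.9 or later, so it would not run on 2.7 or 3.2 as written.

```python
import cProfile
import pstats


def workload():
    """Placeholder for the expensive call (hypothetical; stands in for the solve)."""
    return sum(i * i for i in range(20000))


profiler = cProfile.Profile()
profiler.enable()
for _ in range(11):  # same call count as in the profiles above
    workload()
profiler.disable()

# get_stats_profile() requires Python 3.9+; it exposes the same columns
# that print_stats() shows: ncalls, tottime, percall, cumtime, percall.
profile = pstats.Stats(profiler).get_stats_profile()
entry = profile.func_profiles["workload"]

print(entry.ncalls, entry.tottime, entry.cumtime)
```

Running the same script under each interpreter and comparing the tottime values isolates the one function from import and start-up costs.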

There also appears to be a problem with the generation of the .pyc.
Please see the example below:

rem temp.bat
dir *.pyc
del *.pyc
C:\python32\python.exe profiler.py Intel P4 2.8GHz 2MB Ram 221 GB Free Disk cjw> prof3.txt
dir *.pyc

This is executed with: tmp.bat > tmp.lst

profiler.py contains:
#-------------------------------------------------------------------------------
# Name: profiler.py
# Purpose:
#
# Author: cjw
#
# Created: 17/02/2013
# Copyright: (c) cjw 2013
# Licence: <your licence>
#-------------------------------------------------------------------------------

import cProfile as p, pstats as s, sys

def main():
    v = sys.version
    # v[0] and v[2] are the major and minor digits, e.g. '2' and '7' -> 'FPSStats27.txt'
    statsFile = 'FPSStats' + v[0] + v[2] + '.txt'
    p.run('import testFPSpeed; testFPSpeed.main()',
          statsFile)
    t = s.Stats(statsFile)
    t.strip_dirs().sort_stats('cumulative').print_stats(40)

if __name__ == '__main__':
    main()

tmp.lst contains:

C:\Documents and Settings\cjw.P4A\My Documents\devpy\raspi>rem temp.bat

C:\Documents and Settings\cjw.P4A\My Documents\devpy\raspi>dir *.pyc
Volume in drive C has no label.
Volume Serial Number is D001-FAC4

Directory of C:\Documents and Settings\cjw.P4A\My Documents\devpy\raspi


C:\Documents and Settings\cjw.P4A\My Documents\devpy\raspi>del *.pyc

C:\Documents and Settings\cjw.P4A\My Documents\devpy\raspi>C:\python32\python.exe profiler.py Intel P4 2.8GHz 2MB Ram 221 GB Free Disk cjw 1>prof3.txt

C:\Documents and Settings\cjw.P4A\My Documents\devpy\raspi>dir *.pyc
Volume in drive C has no label.
Volume Serial Number is D001-FAC4

Directory of C:\Documents and Settings\cjw.P4A\My Documents\devpy\raspi


From this we see that no .pyc was reported for testFPSpeed.py.

However, looking at the directory itself, the .pyc is in fact created.

Thus, the .pyc is, if necessary, generated upon the import of a .py and
so this does not explain the time difference between 2.7 and 3.2.

Thanks to Steven for pointing to the __pycache__ directory; I find no
reference to it in the docs.

Colin W.
PS I wish we could format the text in these messages.
 

Peter Otten

Colin said:
My main problem appears to be with:

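Profile with Python 2.7
ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
    11   25.736    2.340   25.736    2.340  {numpy.linalg.lapack_lite.dgesv}

Profile with Python 3.2
ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
    11  152.111   13.828  152.111   13.828  {built-in method dgesv}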

In other words, the Python 3.2 linear equation solve takes longer than
with Python 2.7. I'll pursue this with the numpy folk.

That seems to be a more promising route for further research than byte code
compilation.

Colin said:
There also appears to be a problem with the generation of the .pyc.
Please see the example below:

Sorry, the signal-to-noise ratio is too low for me to bother with that.
 
