Strange behaviour of floating point constants in imported modules

Tomasz Lisowski

Hi,

We distribute our Python application as a short main script (.py
file) plus a set of modules compiled to .pyc files. So far, we have
always treated .pyc files as portable between platforms, but recently we
discovered an annoying problem. One module contains the following
code fragment:

Deg2Rad = math.pi/180.0
angleEPS = 0.5
angle0B = angleEPS*Deg2Rad

which calculates 'angle0B' as an angle of half a degree, converted
to radians. The module was compiled on an English Windows XP
machine, and then tested on a Polish Windows XP workstation.

To our astonishment, various exceptions started to be raised
on the test machine (no problem on the original English-version Windows
XP). We traced them to the fact that both angleEPS and angle0B
were found to be ZERO (!!!), whereas in reality angle0B is about 0.008.
And all this happened silently, without any error during the import of
the module!
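
For reference, the expected value can be checked in a few lines. This is only a sketch of the arithmetic from the fragment above; the comparison against math.radians() is an addition for illustration and is not part of the original module.

```python
import math

Deg2Rad = math.pi / 180.0   # radians per degree
angleEPS = 0.5              # half a degree
angle0B = angleEPS * Deg2Rad

# math.radians() performs the same conversion, so the two should agree.
assert abs(angle0B - math.radians(angleEPS)) < 1e-15
print("%.6f" % angle0B)     # roughly 0.0087, clearly not zero
```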

What is the reason for this error? I am starting to think that it may be
related to the fact that the decimal point on the English Windows XP is
the '.' character, and on the Polish one it is ','.

Is there a good way to avoid this kind of problem? How can we make such
distributed modules really portable?

Thanks in advance
 
John Machin

> Hi,
>
> We distribute our Python application as a short main script (.py
> file) plus a set of modules compiled to .pyc files. So far, we have
> always treated .pyc files as portable between platforms,

There is no guarantee at all that a .pyc file is good for any purpose
outside the machine that produced it. In practice, however, you
*should* be able to rely on no surprises if you have the same platform
and the same version of Python; do you?

How did you transfer the .pyc files from one box to the other? A
Windows installer? A ZIP file? FTP using text mode? Plain-text
attachments to an e-mail message?

> but recently we
> discovered an annoying problem. One module contains the following
> code fragment:
>
> Deg2Rad = math.pi/180.0
> angleEPS = 0.5
> angle0B = angleEPS*Deg2Rad
>
> which calculates 'angle0B' as an angle of half a degree, converted
> to radians. The module was compiled on an English Windows XP
> machine, and then tested on a Polish Windows XP workstation.
>
> To our astonishment, various exceptions started to be raised
> on the test machine (no problem on the original English-version Windows
> XP). We traced them to the fact that both angleEPS and angle0B
> were found to be ZERO (!!!), whereas in reality angle0B is about 0.008.

What evidence do you have? Have you disassembled the .pyc file on both
boxes and diff'ed the results? Have you computed checksums on both
boxes?
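
A minimal sketch of the checksum comparison suggested here, assuming nothing beyond the standard library. The helper name file_digest and the file name in the usage comment are hypothetical; run it on both machines and compare the printed digests.

```python
import hashlib

def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in binary mode."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Hypothetical usage on each box:
#     print(file_digest("mymodule.pyc"))
```

If the digests match, the file was not damaged in transit and the cause has to be sought elsewhere.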

> And all this happened silently, without any error during the import of
> the module!
>
> What is the reason for this error? I am starting to think that it may be
> related to the fact that the decimal point on the English Windows XP is
> the '.' character, and on the Polish one it is ','.

This is *extremely* unlikely. Firstly, you are (I understand) talking
about a .pyc file, that was produced on an English Windows box. Even
though the "180.0" and the "0.5" are visible as character strings in
the .pyc file, Python sure doesn't use the locale when it loads a .pyc
file.

Secondly, even if you are talking about a .py file, Python takes
absolutely no notice of the locale when it compiles the .py file.
Polish programmers write "0.5", not "0,5". Read the language reference
manual, section 2.4.5 -- it uses ".", not "whatever the decimal point
character might be in your locale". If it did depend on locale, you
would need a locale declaration at the top of the file, if one wanted
.py files to be portable internationally; ever seen or heard of such a
declaration?

Thirdly, if the dot was interpreted as something other than a decimal
point, then what? Perhaps assign a tuple (0, 5), or perhaps a syntax
error; zero is achieved under what conditions?

It's more likely that the .pyc file has been damaged somehow. AFAIK
they don't have checksums.

> Is there a good way to avoid this kind of problem? How can we make such
> distributed modules really portable?

Distribute source.

HTH,

John
 
Jeff Epler

This may be relevant to the problems you're seeing:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=774665&group_id=5470

The short story, as the tracker item paints it, is that setting
LC_NUMERIC to anything other than 'C' can give results like the ones you
describe. Python itself should never do this, but third-party code
may.

A web search for python LC_NUMERIC should turn up more about this topic,
probably even some past threads on this mailing list/newsgroup.

Jeff

 
Tomasz Lisowski

> There is no guarantee at all that a .pyc file is good for any purpose
> outside the machine that produced it. In practice, however, you
> *should* be able to rely on no surprises if you have the same platform
> and the same version of Python; do you?

Python 2.3.5, and 2.3.3 on the test machine, so the major and minor
version of the interpreter are the same. I think that should be fine. The
platform is Windows XP on both machines; the differences are the language
version of Windows and the locale setting for the decimal point ('.'
and ',').

> How did you transfer the .pyc files from one box to the other? A
> Windows installer? A ZIP file? FTP using text mode? Plain-text
> attachments to an e-mail message?

E-mailed in a ZIP file

> What evidence do you have? Have you disassembled the .pyc file on both
> boxes and diff'ed the results? Have you computed checksums on both
> boxes?

I have prepared a 'debug' version of the module, printing out some
variables. The printouts on a test machine showed ZERO values, where
there should be non-zero ones (0.008). The original machine, where the
modules were compiled, showed the correct values.

> This is *extremely* unlikely. Firstly, you are (I understand) talking
> about a .pyc file, that was produced on an English Windows box. Even
> though the "180.0" and the "0.5" are visible as character strings in
> the .pyc file, Python sure doesn't use the locale when it loads a .pyc
> file.

If I modify the decimal point setting in the Regional Settings in the
Control Panel to the dot character '.' - everything seems to work fine.
Whenever it is set to the comma ',' - floating point constants, like 0.5
are considered ZERO by the import statement.

> Secondly, even if you are talking about a .py file, Python takes
> absolutely no notice of the locale when it compiles the .py file.
> Polish programmers write "0.5", not "0,5". Read the language reference
> manual, section 2.4.5 -- it uses ".", not "whatever the decimal point
> character might be in your locale". If it did depend on locale, you
> would need a locale declaration at the top of the file, if one wanted
> .py files to be portable internationally; ever seen or heard of such a
> declaration?

Right! The language syntax requires the dot regardless of the
locale, BUT the constants are written to the .pyc file in string form,
probably using repr(), WHICH APPARENTLY DEPENDS ON THE LOCALE (!), even
though the documentation states that the built-in float() and str()
functions are locale-unaware (the locale module provides the
locale-aware conversion functions).
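
A sketch of how the "constants stored as strings" idea can be examined. As far as I know, constants in a .pyc file are serialized with the marshal module, whose older format versions (0 and 1) write floats as text, while version 2 and later use a binary encoding. The snippet only shows the two encodings on a current Python; it does not reproduce the zero values seen on Python 2.3, and whether the old loader parsed the text in a locale-sensitive way remains an assumption here.

```python
import marshal

value = 0.5

text_form = marshal.dumps(value, 1)     # old format: float stored as text
binary_form = marshal.dumps(value, 2)   # newer format: 8-byte binary float

print(repr(text_form))
print(repr(binary_form))

# Both encodings load back to the same value on a current interpreter.
assert marshal.loads(text_form) == value
assert marshal.loads(binary_form) == value
```

The text form is the one that has to be parsed back into a float at load time, which is where a locale-dependent conversion routine could go wrong.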

> Thirdly, if the dot was interpreted as something other than a decimal
> point, then what? Perhaps assign a tuple (0, 5), or perhaps a syntax
> error; zero is achieved under what conditions?

No, it is not a problem with possibly using the comma instead of a dot
in the SOURCE - there only a dot can be used. That's clear.

> It's more likely that the .pyc file has been damaged somehow. AFAIK
> they don't have checksums.

Very unlikely. I have also made these tests directly, sharing the folder
with the .pyc files on the LAN, and running the program from the test
machine. Then, the .pyc files were not manipulated at all.

> Distribute source.

Yes, that's an option, but not in this case :)

Tomasz Lisowski
 
Tomasz Lisowski

> This may be relevant to the problems you're seeing:
> https://sourceforge.net/tracker/?func=detail&atid=305470&aid=774665&group_id=5470
>
> The short story, as the tracker item paints it, is that setting
> LC_NUMERIC to anything other than 'C' can give results like the ones you
> describe. Python itself should never do this, but third-party code
> may.
>
> A web search for python LC_NUMERIC should turn up more about this topic,
> probably even some past threads on this mailing list/newsgroup.

You've got the point. My code uses wxLocale class from wxPython, and
sets the wxLANGUAGE_POLISH locale. After setting this locale, I have
added the statement:

locale.setlocale(locale.LC_NUMERIC, "C")

and everything seems to be normal now. I agree with the comments in the
tracker item that the float(), str() and repr() functions should be
locale-independent. We have the functions in the locale module if
someone needs locale-dependent string-float conversions.
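
The workaround can be wrapped in a small helper that is called after any third-party code that may have changed the locale. This is only a sketch: the function name force_c_numeric is hypothetical, and it uses nothing beyond the standard locale module.

```python
import locale

def force_c_numeric():
    """Reset LC_NUMERIC to "C" and return the active decimal point."""
    locale.setlocale(locale.LC_NUMERIC, "C")
    return locale.localeconv()["decimal_point"]

# Call this right after the code that switches the locale
# (for example, after a GUI toolkit has installed a national locale).
decimal_point = force_c_numeric()
assert decimal_point == "."
assert float("0.5") == 0.5
```

Only the LC_NUMERIC category is reset, so other locale-dependent behaviour, such as date formats or message translations set up by the toolkit, is left alone.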
 
