Engineering numerical format PEP discussion

Keith

I am considering writing a PEP for the inclusion of an engineering
format specifier, and would appreciate input from others.

Background (for those who don't already know about engineering
notation):

Engineering notation (EN) is a type of floating point representation.
The idea with EN is that the powers of 10 are all multiples of 3,
which correspond to the familiar Greek unit prefixes that engineers
use when describing the different sizes of all sorts of real-world
devices and phenomena:

1e-12 == pico
1e-9 == nano
1e-6 == micro
1e-3 == milli
1e+3 == kilo
1e+6 == mega
1e+9 == giga

When people are talking about Ohms, Farads, Henries, Hz, and many
others, they routinely have to normalize to EN. Fancy calculators
from HP and TI routinely allow the users to go into engineering mode,
but mysteriously things like C, Python, Excel, etc. don't.

For instance, no one talks about 4.7e-5F, as they would rather see
47e-6 (micro). Instead of 2.2e-2, engineers need to see 22.0e-3
(milli).

Originally to address this issue, I wrote for myself an "EFloat" class
that subclassed float:

import math

class EFloat(float):
    """EFloat(x) -> floating point number with engineering
    representation when printed

    Convert a string or a number to a floating point number, if
    possible.
    When asked to render itself for printing (via str() or print)
    it is normalized to engineering style notation at powers of 10
    in multiples of 3 (for micro, milli, kilo, mega, giga, etc.)
    """

    def __new__(cls, value=0.0, prec=12):
        # float is immutable, so the value must be consumed in __new__;
        # prec is stored afterwards by __init__.
        return super(EFloat, cls).__new__(cls, value)

    def __init__(self, value=0.0, prec=12):
        self.precision = prec

    def _get_precision(self):
        return self._precision
    def _set_precision(self, p):
        self._precision = p
        self.format_string = "%3." + ("%d" % self._precision) + "fe%+d"
    precision = property(_get_precision, _set_precision,
                         doc="The number of decimal places printed")

    def _exponent(self):
        # Power of ten of the leading digit (0 for zero itself).
        if self == 0.0:
            ret = 0
        else:
            ret = math.floor(math.log10(abs(self)))
        return ret

    def _mantissa(self):
        return self/math.pow(10, self._exponent())

    def _asEng(self):
        # Move 0, 1 or 2 digits left of the decimal point so that the
        # remaining exponent is a multiple of 3.
        shift = self._exponent() % 3
        retval = self.format_string % (self._mantissa()*math.pow(10, shift),
                                       self._exponent() - shift)
        return retval

    def __str__(self):
        return self._asEng()

    def __repr__(self):
        return str(self)

    def __add__(self, x):
        return EFloat(float.__add__(self, float(x)))

    def __radd__(self, x):
        return EFloat(float.__add__(self, float(x)))

    def __mul__(self, x):
        return EFloat(float.__mul__(self, float(x)))

    def __rmul__(self, x):
        return EFloat(float.__mul__(self, float(x)))

    def __sub__(self, x):
        return EFloat(float.__sub__(self, float(x)))

    def __rsub__(self, x):
        return EFloat(float.__rsub__(self, float(x)))

    # __div__ and __rdiv__ are only used by Python 2.
    def __div__(self, x):
        return EFloat(float.__div__(self, float(x)))

    def __rdiv__(self, x):
        return EFloat(float.__rdiv__(self, float(x)))

    def __truediv__(self, x):
        return EFloat(float.__truediv__(self, float(x)))

    def __rtruediv__(self, x):
        return EFloat(float.__rtruediv__(self, float(x)))

    def __pow__(self, x):
        return EFloat(float.__pow__(self, float(x)))

    def __rpow__(self, x):
        return EFloat(float.__rpow__(self, float(x)))

    def __divmod__(self, x):
        # float.__divmod__ returns a (quotient, remainder) tuple.
        q, r = float.__divmod__(self, float(x))
        return EFloat(q), EFloat(r)

    def __neg__(self):
        return EFloat(float.__neg__(self))

    def __floordiv__(self, x):
        return EFloat(float.__floordiv__(self, float(x)))


which works well for working with interactive Python.

There are places on the web where I've read that people have to work
their butts off trying to "trick" Excel or OpenOffice to do
engineering notation, or there is some work-around that is purported
to work if you use the right version of the spreadsheet.

After many months of using my EFloat class extensively with lots of
apps dealing with embedded engineering tasks, it dawns on me that what
we really need is simply a new format specifier.

I am thinking that if we simply added something like %n (for eNgineer)
to the list of format specifiers that we could make life easier for
engineers:

("%n" % 12345) == "12.345e+03"
("%n" % 1234) == "1.234e+03"
("%n" % 123) == "123e+00"
("%n" % 1.2345e-5) == "12.345e-06"

Of course, the normal dot fields would be put to use to allow us to
specify how many total digits or digits of precision we want, or
whether we want zero padding (whatever makes the most sense and keeps
the standard most like what is already in the language):

("%.12n" % 12345678) == "12.345678000000e+06"
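Since no such specifier exists, the proposed behaviour can only be approximated with an ordinary function. The sketch below is one way to do that; the name `eng_format` and the default `prec=3` are assumptions made for illustration, and unlike the "123e+00" example above it always prints `prec` decimal places.

```python
import math

def eng_format(x, prec=3):
    """Format x so that the power-of-ten exponent is a multiple of 3."""
    x = float(x)
    if not math.isfinite(x):
        return str(x)                      # 'inf', '-inf' or 'nan'
    if x == 0.0:
        exp3 = 0
    else:
        # Largest multiple of 3 not above the decimal exponent of x.
        exp3 = 3 * math.floor(math.log10(abs(x)) / 3)
    mantissa = x / 10.0 ** exp3
    # Rounding to prec places can push the mantissa up to 1000;
    # in that case move on to the next multiple of 3.
    if abs(round(mantissa, prec)) >= 1000.0:
        exp3 += 3
        mantissa = x / 10.0 ** exp3
    return "%.*fe%+03d" % (prec, mantissa, exp3)

print(eng_format(12345))          # 12.345e+03
print(eng_format(1.2345e-5))      # 12.345e-06
print(eng_format(12345678, 12))   # 12.345678000000e+06
```

This is not hardened for extreme inputs such as subnormal floats, where `10.0 ** exp3` underflows.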

Do you think this idea has enough merit to make it to PEP status?

--Keith Brafford
 
Chris Rebert

I am considering writing a PEP for the inclusion of an engineering
format specifier, and would appreciate input from others.

Background (for those who don't already know about engineering
notation):

Engineering notation (EN) is a type of floating point representation.
The idea with EN is that the powers of 10 are all multiples of 3,
which correspond to the familiar Greek unit prefixes that engineers
use when describing the different sizes of all sorts of real-world
devices and phenomena:

1e-12 == pico
1e-9  == nano
1e-6  == micro
1e-3  == milli
1e+3  == kilo
1e+6  == mega
1e+9  == giga

When people are talking about Ohms, Farads, Henries, Hz, and many
others, they routinely have to normalize to EN.  Fancy calculators
from HP and TI routinely allow the users to go into engineering mode,
but mysteriously things like C, Python, Excel, etc. don't.

For instance, no one talks about 4.7e-5F, as they would rather see
47e-6 (micro).  Instead of 2.2e-2, engineers need to see 22.0e-3
(milli).
There are places on the web where I've read that people have to work
their butts off trying to "trick" Excel or OpenOffice to do
engineering notation, or there is some work-around that is purported
to work if you use the right version of the spreadsheet.

Relevant related information:
The Decimal datatype supports engineering format directly:
http://docs.python.org/library/decimal.html#decimal.Decimal.to_eng_string

Cheers,
Chris
 
Steven D'Aprano

I am considering writing a PEP for the inclusion of an engineering
format specifier, and would appreciate input from others. [...]
For instance, no one talks about 4.7e-5F, as they would rather see 47e-6
(micro). Instead of 2.2e-2, engineers need to see 22.0e-3 (milli).

I'd be cautious about making claims about "no one", because not everyone
wants to see engineering notation. You may recall that the other common
display format on scientific calculators is Scientific Notation, which
*does* display 2.2e-2.

After many months of using my EFloat class extensively with lots of apps
dealing with embedded engineering tasks, it dawns on me that what we
really need is simply a new format specifier.

I am thinking that if we simply added something like %n (for eNgineer)
to the list of format specifiers that we could make life easier for
engineers:

I for one don't like %n. I already write %n for integer, at least now I
get an error immediately instead of code that silently does the wrong
thing. But I don't have a better idea of what letter to use.

However, for good or ill the general consensus among the Python language
developers is that % formatting is to be discouraged in favour of the
format() method. For this reason, I expect that there will be zero (or
negative) interest in extending the list of % format specifiers. But
there may be some interest in adding a specifier to format().

http://docs.python.org/library/string.html#formatstrings


It may be worth mentioning in the PEP that Decimals already have a method
for converting to engineering notation, to_eng_string.
 
Keith

snip
Relevant related information:
The Decimal datatype supports engineering format directly:http://docs.python.org/library/decimal.html#decimal.Decimal.to_eng_st...

Cheers,
Chris

Thanks for pointing that out. Does the engineering community get by
with the decimal module?

Even though this uses the to_eng_string() function, and even though I
am using the decimal.Context class:
'1234567'

That is not an engineering notation string.

--Keith Brafford
 
Keith

I'd be cautious about making claims about "no one"

Good point, and I don't intend to belittle scientific computing folks
for whom traditional floating point representation is expected.

Nor am I suggesting that any of the six format specifiers that we
already have for scientific notation (e, E, f, F, g, G) be altered in
any way.

I guess I wasn't clear about the F in the 4.7e-5F in the example.
People doing engineering don't use 4.7e-5 Farads. They typically have
to do extra work to get that number to print out correctly, as 47 e-6
Farads. The same goes for lots of signal processing entities. People
doing things with Hz, seconds, you name it, have the same problem.

--Keith
 
Chris Rebert

Thanks for pointing that out.  Does the engineering community get by
with the decimal module?

Even though this uses the to_eng_string() function, and even though I
am using the decimal.Context class:

'1234567'

That is not an engineering notation string.

Apparently either you and the General Decimal Arithmetic spec differ
on what constitutes engineering notation, there's a bug in the Python
decimal library, or you're hitting some obscure part of the spec's
definition. I don't have the expertise to know which is the case.

The spec: http://speleotrove.com/decimal/decarith.pdf
(to-engineering-string is on page 20 if you're interested)

Cheers,
Chris
 
Chris Rebert

I just gave Page 20 a quick read, and it says:


Obviously that would make  '1234567' not an Engineering notation?

Well, I saw that too, but note how it's prefixed by (emphasis mine):

"The conversion **exactly follows the rules for conversion to
scientific numeric string** except in the case of finite numbers
**where exponential notation is used.**"

The description of to-scientific-string explains exactly when it uses
exponential notation, but it gets slightly technical and I'm not
interested enough, nor do I have the time at the moment, to read the
entire spec.

Cheers,
Chris
 
Keith

Apparently either you and the General Decimal Arithmetic spec differ
on what constitutes engineering notation, there's a bug in the Python
decimal library, or you're hitting some obscure part of the spec's
definition. snip
The spec:http://speleotrove.com/decimal/decarith.pdf
(to-engineering-string is on page 20 if you're interested)

Thanks for that. I didn't realize that Mike Cowlishaw wrote the spec
we're discussing. It's too bad OS/2 didn't fare better, or we'd
possibly be discussing a proposal for a REP (Rexx Enhancement
Proposal) ;-)

From that document it appears that my decimal.Decimal(1234567) example
shows that the module has a bug:

Doc says:
[0,123,3] ===> "123E+3"

But Python does: '123000'

Regardless, given that the whole point of format specifiers (whether
they are the traditional python 2.x/C style %[whatever] strings, or
the new format() function) is to make it easy for you to format
numbers for printing, wouldn't the language be better off if we added
engineering notation to the features that already offer scientific
notation?

--Keith Brafford
 
Terry Reedy

I am considering writing a PEP for the inclusion of an engineering
format specifier, and would appreciate input from others.

I tested that input is no problem, so the only question is output.
Do you think this idea has enough merit to make it to PEP status?

I think it has enough merit to be considered. A minor addition to
.format() specifiers for 3.whenever would probably not require a PEP.
(It is too late at night for me to think about anything concrete at the
moment, though.) A concrete proposal on the python-ideas list might be
enough. I am not sure if this would be covered by the current moratorium
on core changes, though.

Terry Jan Reedy
 
Stefan Krah

Chris Rebert said:
Apparently either you and the General Decimal Arithmetic spec differ
on what constitutes engineering notation, there's a bug in the Python
decimal library, or you're hitting some obscure part of the spec's
definition. I don't have the expertise to know which is the case.

The spec: http://speleotrove.com/decimal/decarith.pdf
(to-engineering-string is on page 20 if you're interested)

The module is correct. Printing without exponent follows the same rules
as to-scientific-string:

"If the exponent is less than or equal to zero and the adjusted exponent
is greater than or equal to -6, the number will be converted to a
character form without using exponential notation."
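The quoted rule can be checked directly against the decimal module. The sample values below are chosen for illustration and are not taken from the earlier posts, apart from 1234567:

```python
from decimal import Decimal

samples = ["1234567", "4.7E-5", "4.7E-8", "1234567E+1"]
results = {s: Decimal(s).to_eng_string() for s in samples}

# Exponent 0, adjusted exponent 6: printed without an exponent.
print(results["1234567"])      # 1234567
# Exponent -6, adjusted exponent -5 (>= -6): still no exponent.
print(results["4.7E-5"])       # 0.000047
# Adjusted exponent -8 (< -6): engineering exponent is used.
print(results["4.7E-8"])       # 47E-9
# Positive exponent: engineering exponent is used.
print(results["1234567E+1"])   # 12.34567E+6
```

Note that 4.7E-5, the capacitor example from the start of the thread, comes out as 0.000047 rather than 47E-6.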


Stefan Krah
 
Stefan Krah

Keith said:
Even though this uses the to_eng_string() function, and even though I
am using the decimal.Context class:

'1234567'

That is not an engineering notation string.

To clarify further: The spec says that the printing functions are not
context sensitive, so to_eng_string does not *apply* the context.

The context is only passed in for the 'capitals' value, which determines
whether the exponent letter is printed in lower or upper case.


This is one of the unfortunate situations where passing a context can
create great confusion for the user. Another one is:

Decimal('12345678')


Here the context is passed only for the 'flags' and 'traps' members:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.2/decimal.py", line 548, in __new__
"Invalid literal for Decimal: %r" % value)
File "/usr/lib/python3.2/decimal.py", line 3836, in _raise_error
raise error(explanation)
decimal.InvalidOperation: Invalid literal for Decimal: 'wrong'

c.traps[InvalidOperation] = False
Decimal("wrong", c)
Decimal('NaN')
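The interactive input lines of the session above did not survive, so the following is a reconstruction under assumptions rather than the original commands: the context `c = Context(prec=3)` and the value 12345678 are guesses consistent with the output shown.

```python
from decimal import Context, Decimal, InvalidOperation

c = Context(prec=3)

# The constructor does not round to the context's precision ...
ignored = Decimal(12345678, c)
# ... whereas Context.create_decimal() does apply it.
applied = c.create_decimal(12345678)
print(ignored, applied)            # 12345678 1.23E+7

# The 'traps' setting, however, is honoured by the constructor.
try:
    Decimal("wrong", c)
    raised = False
except InvalidOperation:
    raised = True
c.traps[InvalidOperation] = False
quiet = Decimal("wrong", c)
print(raised, quiet)               # True NaN

# to_eng_string() only consults the context's 'capitals' setting.
lower = Decimal("123E+3").to_eng_string(Context(capitals=0))
print(lower)                       # 123e+3
```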


Stefan Krah
 
Grant Edwards

I am considering writing a PEP for the inclusion of an engineering
format specifier, and would appreciate input from others.

I very regularly do something similar in various apps, though I often
want to specify the exponent (e.g. I always want to print a given
value in "kilo" scaling even if that ends up as 0.090e3 or 1000.010e3).

So I would suggest adding an optional "exponent" value such that "%3n"
would always result in <whatever>e+03
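A fixed-exponent variant like this is simple to sketch as a helper function. The name `fixed_exp_format` and its parameters are hypothetical, and the output uses a signed two-digit exponent rather than the bare "e3" written above:

```python
def fixed_exp_format(x, exp=3, prec=3):
    """Format x scaled to a caller-chosen power of ten."""
    mantissa = float(x) / 10.0 ** exp
    return "%.*fe%+03d" % (prec, mantissa, exp)

print(fixed_exp_format(90))         # 0.090e+03
print(fixed_exp_format(1000010))    # 1000.010e+03
```

Because the exponent is fixed by the caller, the mantissa is free to fall outside the usual 1 to 999 range.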
 
Mark Dickinson

I am considering writing a PEP for the inclusion of an engineering
format specifier, and would appreciate input from others.

I am thinking that if we simply added something like %n (for eNgineer)
to the list of format specifiers that we could make life easier for
engineers:

("%n" % 12345)  == "12.345e+03"
("%n" %  1234)  == "1.234e+03"
("%n" %   123)  == "123e+00"
("%n" % 1.2345e-5)  == "12.345e-06"

I don't think there's much chance of getting changes to old-style
string formatting accepted; you might be better off aiming at the new-
style string formatting. (And there, the 'n' modifier is already
taken for internationalization, so you'd have to find something
different. :)
 
Mark Dickinson

From that document it appears that my decimal.Decimal(1234567) example
shows that the module has a bug:

Doc says:
[0,123,3] ===>  "123E+3"

But Python does:
>>> import decimal
'123000'

That's not a bug. The triple [0,123,3] is Decimal('123e3'), which is
not the same thing as Decimal('123000'). The former has an exponent
of 3 (in the language of the specification), while the latter has an
exponent of 0.
'123E+3'
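The difference between the two values is visible through `Decimal.as_tuple()`. This is a small check written for illustration, not the session from the original post:

```python
from decimal import Decimal

a = Decimal("123e3")     # the triple [0,123,3]
b = Decimal("123000")    # same numeric value, exponent 0

print(a == b)                  # True
print(a.as_tuple().exponent)   # 3
print(b.as_tuple().exponent)   # 0
print(a.to_eng_string())       # 123E+3
print(b.to_eng_string())       # 123000
```

The two compare equal, yet they print differently because the stored exponent is part of the representation.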
 
Keith

Apparently either you and the General Decimal Arithmetic spec differ
on what constitutes engineering notation, there's a bug in the Python
decimal library,

You've distilled it precisely, and as you've shown in a different
post, it's the former.

The Python decimal module seems to implement correctly Mike
Cowlishaw's spec, but what that spec refers to as "engineering
notation" isn't really what engineers actually use. That is, even
with decimal.Decimal.to_eng_string(), engineers still end up having to
write their own string formatting code.

I think it's worth making the print statement (or print function, as
the case may be) let us do engineering notation, just like it lets us
specify scientific notation.

--Keith Brafford
 
Keith

Keith said:
Even though this uses the to_eng_string() function, and even though I
am using the decimal.Context class:

That is not an engineering notation string.

To clarify further: The spec says that the printing functions are not
context sensitive, so to_eng_string does not *apply* the context.

The context is only passed in for the 'capitals' value, which determines
whether the exponent letter is printed in lower or upper case.

This is one of the unfortunate situations where passing a context can
create great confusion for the user. Another one is:

Decimal('12345678')

Here the context is passed only for the 'flags' and 'traps' members:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.2/decimal.py", line 548, in __new__
    "Invalid literal for Decimal: %r" % value)
  File "/usr/lib/python3.2/decimal.py", line 3836, in _raise_error
    raise error(explanation)
decimal.InvalidOperation: Invalid literal for Decimal: 'wrong'
c.traps[InvalidOperation] = False
Decimal("wrong", c)

Decimal('NaN')

Stefan Krah

Thank you for that illustrative clarification, Stefan. I should not
have used decimal.Context in that case, nor should I have implied that
it would have helped prove my case.

--Keith Brafford
 
MRAB

Mark said:
I am considering writing a PEP for the inclusion of an engineering
format specifier, and would appreciate input from others.

I am thinking that if we simply added something like %n (for eNgineer)
to the list of format specifiers that we could make life easier for
engineers:

("%n" % 12345) == "12.345e+03"
("%n" % 1234) == "1.234e+03"
("%n" % 123) == "123e+00"
("%n" % 1.2345e-5) == "12.345e-06"

I don't think there's much chance of getting changes to old-style
string formatting accepted; you might be better off aiming at the new-
style string formatting. (And there, the 'n' modifier is already
taken for internationalization, so you'd have to find something
different. :)

"e" is already used, "n" is already used, "g" is already used, etc.

"t" for "powers of a thousand", perhaps? (Or "m"?)
 
Keith

From that document it appears that my decimal.Decimal(1234567) example
shows that the module has a bug:
Doc says:
[0,123,3] ===>  "123E+3"
But Python does:
>>> import decimal

That's not a bug.  The triple [0,123,3] is Decimal('123e3'), which is
not the same thing as Decimal('123000').  The former has an exponent
of 3 (in the language of the specification), while the latter has an
exponent of 0.

'123E+3'

Thanks, Mark, you're right. It's clear that Decimal.to_eng_string()
doesn't solve the problem that I am trying to solve, which is: take
the Python "float" type, and print it in engineering format.

--Keith Brafford
 
Keith

"t" for "powers of a thousand", perhaps? (Or "m"?)

Both of those letters are fine. I kinda like "m" for the whole Greco-
Roman angle, now that you point it out :)

--Keith Brafford
 
Lie Ryan

I think it's worth making the print statement (or print function, as
the case may be) let us do engineering notation, just like it lets us
specify scientific notation.

The print statement/function does no magic at all in deciding how
numbers look when printed. The apparently magical formatting is done by
the %-operator and, for {}-style formatting, by obj.__format__().

If you're using the {}-format, you can override .__format__() to define
your own formatting mini-language.

If you need to, you can monkey patch format() so that it wraps int/float
in an Engineer wrapper whose Engineer.__format__() implements whatever
formatting style you're using.
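A minimal sketch of the `__format__` approach follows. The type code "m" is the letter floated earlier in the thread, not an existing specifier, and the small grammar accepted here (an optional ".N" precision followed by "m") is an assumption made for the example:

```python
import math

class Engineer(float):
    """float wrapper whose __format__ understands a hypothetical 'm' code."""

    def __format__(self, spec):
        if not spec.endswith("m"):
            # Everything else goes through the normal float mini-language.
            return float.__format__(self, spec)
        body = spec[:-1]
        if body == "":
            prec = 3
        elif body.startswith(".") and body[1:].isdigit():
            prec = int(body[1:])
        else:
            raise ValueError("unsupported format spec: %r" % spec)
        if self == 0.0 or not math.isfinite(self):
            return float.__format__(self, ".%de" % prec)
        exp3 = 3 * math.floor(math.log10(abs(self)) / 3)
        mantissa = float(self) / 10.0 ** exp3
        # Rounding may carry the mantissa up to 1000; renormalize if so.
        if abs(round(mantissa, prec)) >= 1000.0:
            exp3 += 3
            mantissa = float(self) / 10.0 ** exp3
        return "%.*fe%+03d" % (prec, mantissa, exp3)

print(format(Engineer(12345), "m"))            # 12.345e+03
print("{:.1m}".format(Engineer(4.7e-5)))       # 47.0e-06
print("{:.1f}".format(Engineer(2.5)))          # 2.5
```

Values have to be wrapped explicitly in `Engineer`, which is the inconvenience a built-in specifier would remove.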
 
