Raymond Hettinger
If anyone here is interested, here is a proposal I posted on the
python-ideas list.
The idea is to make number formatting a little easier with the new
format() builtin in Py2.6 and Py3.0:
http://docs.python.org/library/string.html#formatspec
-------------------------------------------------------------
Motivation:
Provide a simple, non-locale aware way to format a number
with a thousands separator.
Adding thousands separators is one of the simplest ways to
improve the professional appearance and readability of
output exposed to end users.
In the finance world, output with commas is the norm. Finance users
and non-professional programmers find the locale approach to be
frustrating, arcane and non-obvious.
It is not the goal to replace locale or to accommodate every
possible convention. The goal is to make a common task easier
for many users.
Research so far:
Scanning the web, I've found that thousands separators are
usually one of COMMA, PERIOD, SPACE, or UNDERSCORE. The
COMMA is used when a PERIOD is the decimal separator.
James Knight observed that Indian/Pakistani numbering systems
group by hundreds. Ben Finney noted that Chinese group by
ten-thousands.
Visual Basic and its brethren (like MS Excel) use a completely
different style and have ultra-flexible custom format specifiers
like: "_($* #,##0_)".
Proposal I (from Nick Coghlan):
A comma will be added to the format() specifier mini-language:
[[fill]align][sign][#][0][minimumwidth][,][.precision][type]
The ',' option indicates that commas should be included in the
output as a thousands separator. As with locales which do not use a
period as the decimal point, locales which use a different convention
for digit separation will need to use the locale module to obtain
appropriate formatting.
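To make the intended output concrete, here is a minimal hand-written sketch of the grouping that the ',' option is meant to perform, for ints only. The helper name group_thousands is hypothetical and not part of the proposal; it only illustrates the rule of counting digits in threes from the right.

```python
def group_thousands(n, sep=","):
    """Illustrative only: insert sep between groups of three digits,
    counting from the right. Hypothetical helper, ints only."""
    sign = "-" if n < 0 else ""
    digits = str(abs(n))
    groups = []
    while digits:
        groups.append(digits[-3:])   # peel off the last three digits
        digits = digits[:-3]
    return sign + sep.join(reversed(groups))


print(group_thousands(1234567))   # 1,234,567
print(group_thousands(-1234))     # -1,234
print(group_thousands(999))       # 999
```

The sign is handled separately so that it is never counted as part of a digit group.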
The proposal works well with floats, ints, and decimals. It also
allows easy substitution for other separators. For example:
format(n, "6,f").replace(",", "_")
This technique is completely general but it is awkward in the one
case where the commas and periods need to be swapped.
format(n, "6,f").replace(",", "X").replace(".", ",").replace("X", ".")
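As a possible alternative to the placeholder chain, the swap can be done in one pass with a translation table. This is only a sketch: it assumes Python 3's str.maketrans, and the helper name swap_separators is made up for illustration. It works on any string that already contains commas as thousands separators and a period as the decimal point.

```python
# Map each separator to the other in a single pass, so no temporary
# placeholder character such as "X" is needed.
_SWAP = str.maketrans({",": ".", ".": ","})


def swap_separators(text):
    """Hypothetical helper: exchange commas and periods in text."""
    return text.translate(_SWAP)


print(swap_separators("1,234,567.89"))   # 1.234.567,89
```

Because both characters are translated simultaneously, there is no risk of the placeholder colliding with a character already in the string.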
Proposal II (to meet Antoine Pitrou's request):
Make both the thousands separator and decimal separator user
specifiable but not locale aware. For simplicity, limit the choices
to a comma, period, space, or underscore.
[[fill]align][sign][#][0][minimumwidth][T[tsep]][dsep precision][type]
Examples:
format(1234, "8.1f") --> '  1234.0'
format(1234, "8,1f") --> '  1234,0'
format(1234, "8T.,1f") --> ' 1.234,0'
format(1234, "8T ,1f") --> ' 1 234,0'
format(1234, "8d") --> '    1234'
format(1234, "8T,d") --> '   1,234'
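The outputs in the examples above can be reproduced with a small helper that takes the separators as ordinary arguments instead of parsing the proposed spec string. This is a rough sketch, not the proposed implementation: the name format_sep and its parameters are hypothetical, and it assumes an interpreter in which the ',' option from Proposal I is already available.

```python
ALLOWED = {"", ",", ".", " ", "_"}   # choices named in Proposal II, plus "none"


def format_sep(value, width=0, tsep="", dsep=".", precision=None):
    """Hypothetical helper mimicking Proposal II's results.

    precision=None formats value as an int, otherwise as a fixed-point
    float with that many decimals.
    """
    if tsep not in ALLOWED or dsep not in ALLOWED:
        raise ValueError("unsupported separator")
    if precision is None:
        body = "{:,d}".format(int(value))
    else:
        body = "{:,.{p}f}".format(value, p=precision)
    # Replace both separators in one pass so they cannot clobber each other.
    out = "".join(tsep if ch == "," else dsep if ch == "." else ch
                  for ch in body)
    return out.rjust(width)


print(repr(format_sep(1234, 8, ".", ",", 1)))   # ' 1.234,0'
print(repr(format_sep(1234, 8, ",")))           # '   1,234'
```

Passing the separators as arguments sidesteps the parsing question entirely, which is also why it says nothing about how hard the spec string would be to parse inside a custom __format__ method.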
This proposal meets most needs (except for people wanting grouping
for hundreds or ten-thousands), but it comes at the expense of
being a little more complicated to learn and remember. Also, it
makes it more challenging to write custom __format__ methods that
follow the format specification mini-language.
For the locale module, just the "T" is necessary in a formatting
string since the tool already has procedures for figuring out the
actual separators from the locale context.
Comments and suggestions are welcome but I draw the line at supporting
Mayan numbering conventions ;-)
Raymond