Rough draft: Proposed format specifier for a thousands separator

Discussion in 'Python' started by Raymond Hettinger, Mar 12, 2009.

  1. If anyone here is interested, here is a proposal I posted on the
    python-ideas list.

    The idea is to make number formatting a little easier with the new
    format() builtin in Py2.6 and Py3.0:
    http://docs.python.org/library/string.html#formatspec


    -------------------------------------------------------------


    Motivation:

    Provide a simple, non-locale aware way to format a number
    with a thousands separator.

    Adding thousands separators is one of the simplest ways to
    improve the professional appearance and readability of
    output exposed to end users.

    In the finance world, output with commas is the norm. Finance
    users
    and non-professional programmers find the locale approach to be
    frustrating, arcane and non-obvious.

    It is not the goal to replace locale or to accommodate every
    possible convention. The goal is to make a common task easier
    for many users.


    Research so far:

    Scanning the web, I've found that thousands separators are
    usually one of COMMA, PERIOD, SPACE, or UNDERSCORE. The
    COMMA is used when a PERIOD is the decimal separator.

    James Knight observed that Indian/Pakistani numbering systems
    group by hundreds. Ben Finney noted that Chinese group by
    ten-thousands.

    Visual Basic and its brethren (like MS Excel) use a completely
    different style and have ultra-flexible custom format specifiers
    like: "_($* #,##0_)".



    Proposal I (from Nick Coghlan):

    A comma will be added to the format() specifier mini-language:

    [[fill]align][sign][#][0][minimumwidth][,][.precision][type]

    The ',' option indicates that commas should be included in the
    output as a
    thousands separator. As with locales which do not use a period as
    the
    decimal point, locales which use a different convention for digit
    separation will need to use the locale module to obtain
    appropriate
    formatting.

    The proposal works well with floats, ints, and decimals. It also
    allows easy substitution for other separators. For example:

    format(n, "6,f").replace(",", "_")

    This technique is completely general but it is awkward in the one
    case where the commas and periods need to be swapped.

    format(n, "6,f").replace(",", "X").replace(".", ",").replace
    ("X", ".")


    Proposal II (to meet Antoine Pitrou's request):

    Make both the thousands separator and decimal separator user
    specifiable
    but not locale aware. For simplicity, limit the choices to a
    comma, period,
    space, or underscore.

    [[fill]align][sign][#][0][minimumwidth][T[tsep]][dsep precision]
    [type]

    Examples:

    format(1234, "8.1f") --> '  1234.0'
    format(1234, "8,1f") --> '  1234,0'
    format(1234, "8T.,1f") --> ' 1.234,0'
    format(1234, "8T .f") --> ' 1 234,0'
    format(1234, "8d") --> '    1234'
    format(1234, "8T,d") --> '   1,234'

    This proposal meets most needs (except for people wanting
    grouping
    for hundreds or ten-thousands), but it comes at the expense of
    being a little more complicated to learn and remember. Also, it
    makes it
    more challenging to write custom __format__ methods that follow
    the
    format specification mini-language.

    For the locale module, just the "T" is necessary in a formatting
    string
    since the tool already has procedures for figuring out the actual
    separators from the local context.



    Comments and suggestions are welcome but I draw the line at supporting
    Mayan numbering conventions ;-)


    Raymond
     
    Raymond Hettinger, Mar 12, 2009
    #1

  2. Re: Rough draft: Proposed format specifier for a thousands separator

    > If anyone here is interested, here is a proposal I posted on the
    > python-ideas list.
    >
    > The idea is to make number formatting a little easier with
    > the new format() builtin:
    > http://docs.python.org/library/string.html#formatspec


    Here's a re-post (hopefully without the line wrapping problems
    in the previous post).

    Raymond

    -------------------------------------------------------------



    Motivation:
    -----------

    Provide a simple, non-locale aware way to format a number
    with a thousands separator.

    Adding thousands separators is one of the simplest ways to
    improve the professional appearance and readability of output
    exposed to end users.

    In the finance world, output with commas is the norm. Finance
    users and non-professional programmers find the locale
    approach to be frustrating, arcane and non-obvious.

    It is not the goal to replace locale or to accommodate every
    possible convention. The goal is to make a common task easier
    for many users.


    Research so far:
    ----------------

    Scanning the web, I've found that thousands separators are
    usually one of COMMA, PERIOD, SPACE, or UNDERSCORE. The
    COMMA is used when a PERIOD is the decimal separator.

    James Knight observed that Indian/Pakistani numbering systems
    group by hundreds. Ben Finney noted that Chinese group by
    ten-thousands.

    Visual Basic and its brethren (like MS Excel) use a completely
    different style and have ultra-flexible custom format
    specifiers like: "_($* #,##0_)".



    Proposal I (from Nick Coghlan):
    -------------------------------

    A comma will be added to the format() specifier mini-language:

    [[fill]align][sign][#][0][minimumwidth][,][.precision][type]

    The ',' option indicates that commas should be included in the
    output as a thousands separator. As with locales which do not
    use a period as the decimal point, locales which use a
    different convention for digit separation will need to use the
    locale module to obtain appropriate formatting.

    The proposal works well with floats, ints, and decimals.
    It also allows easy substitution for other separators.
    For example:

    format(n, "6,f").replace(",", "_")

    This technique is completely general but it is awkward in the
    one case where the commas and periods need to be swapped:

    format(n, "6,f").replace(",", "X").replace(".", ",").replace("X", ".")


    Proposal II (to meet Antoine Pitrou's request):
    -----------------------------------------------

    Make both the thousands separator and decimal separator user
    specifiable but not locale aware. For simplicity, limit the
    choices to a comma, period, space, or underscore.

    [[fill]align][sign][#][0][minimumwidth][T[tsep]][dsep precision][type]

    Examples:

    format(1234, "8.1f") --> '  1234.0'
    format(1234, "8,1f") --> '  1234,0'
    format(1234, "8T.,1f") --> ' 1.234,0'
    format(1234, "8T .f") --> ' 1 234,0'
    format(1234, "8d") --> '    1234'
    format(1234, "8T,d") --> '   1,234'

    This proposal meets most needs (except for people wanting
    grouping for hundreds or ten-thousands), but it comes at the
    expense of being a little more complicated to learn and
    remember. Also, it makes it more challenging to write custom
    __format__ methods that follow the format specification
    mini-language.

    For the locale module, just the "T" is necessary in a
    formatting string since the tool already has procedures for
    figuring out the actual separators from the local context.
     
    Raymond Hettinger, Mar 12, 2009
    #2

  3. Re: Rough draft: Proposed format specifier for a thousands separator

    Raymond Hettinger wrote:
    >> The idea is to make number formatting a little easier with
    >> the new format() builtin:
    >> http://docs.python.org/library/string.html#formatspec

    [...]
    > Scanning the web, I've found that thousands separators are
    > usually one of COMMA, PERIOD, SPACE, or UNDERSCORE. The
    > COMMA is used when a PERIOD is the decimal separator.
    >
    > James Knight observed that Indian/Pakistani numbering systems
    > group by hundreds. Ben Finney noted that Chinese group by
    > ten-thousands.


    IIRC, some cultures use a non-uniform grouping, e.g. "123 456 78.9".
    For that, there is also a grouping setting reserved in the locale (at
    least in those of C++ IOStreams, that is). Further, and that seems to
    also be one of your concerns, there are different ways to represent
    negative numbers, e.g. "(123)" or "-456".


    > Make both the thousands separator and decimal separator user
    > specifiable but not locale aware. For simplicity, limit the
    > choices to a comma, period, space, or underscore.
    >
    > [[fill]align][sign][#][0][minimumwidth][T[tsep]][dsep precision][type]
    >
    > Examples:
    >
    > format(1234, "8.1f") --> '  1234.0'
    > format(1234, "8,1f") --> '  1234,0'
    > format(1234, "8T.,1f") --> ' 1.234,0'
    > format(1234, "8T .f") --> ' 1 234,0'
    > format(1234, "8d") --> '    1234'
    > format(1234, "8T,d") --> '   1,234'



    How about this?
    format(1234, "8.1", tsep=",")
    --> ' 1,234.0'
    format(1234, "8.1", tsep=".", dsep=",")
    --> ' 1.234,0'
    format(123456, tsep=" ", grouping=(3, 2,))
    --> '1 234 56'

    IOW, why not explicitly say what you want using keyword arguments with
    defaults instead of inventing an IMHO cryptic, read-only mini-language?
    Seriously, the problem I see with this proposal is that its aim to be as
    short as possible actually makes the resulting format specifications
    unreadable. Could you even guess what "8T.,1f" should mean if you had not
    written this?
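
As a rough illustration of the keyword-argument idea above, here is a minimal sketch. It assumes an interpreter where a ',' grouping option already exists in the format mini-language (Python 2.7/3.1 and later; the str.maketrans form used here needs Python 3). The helper name format_number and its width, precision, tsep and dsep parameters are made up for this example; format() itself accepts no such keywords, and the grouping= argument shown above is left out.

```python
def format_number(value, width=0, precision=0, tsep="", dsep="."):
    """Hypothetical helper: fixed-point formatting with explicit separators."""
    # Let the builtin do the numeric work, grouping with ',' and using '.'.
    text = format(value, ",.{}f".format(precision))
    # Map the builtin's separators to the requested ones in a single pass;
    # an empty tsep simply drops the grouping character.
    table = str.maketrans({",": tsep, ".": dsep})
    # Right-justify to the requested minimum width.
    return text.translate(table).rjust(width)


print(repr(format_number(1234, width=8, precision=1, tsep=",")))
print(repr(format_number(1234, width=8, precision=1, tsep=".", dsep=",")))
print(repr(format_number(1234, width=8, precision=1)))
```

The first two calls reproduce the ' 1,234.0' and ' 1.234,0' results from the examples above; whether keywords like these read better than a spec string is exactly the point under discussion.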

    > This proposal meets most needs (except for people wanting
    > grouping for hundreds or ten-thousands), but it comes at the
    > expense of being a little more complicated to learn and
    > remember.


    Too expensive for my taste.

    Uli

    --
    Sator Laser GmbH
    Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932
     
    Ulrich Eckhardt, Mar 12, 2009
    #3
  4. Re: Rough draft: Proposed format specifier for a thousands separator

    [Ulrich Eckhardt]
    > IOW, why not explicitly say what you want using keyword arguments with
    > defaults instead of inventing an IMHO cryptic, read-only mini-language?


    That makes sense to me but I don't think that's the way the format()
    builtin was implemented (see PEP 3101, which was implemented in Py2.6
    and 3.0). It is a simple pass-through to a __format__ method for each
    formattable object. I don't see how keywords would fit in that
    framework. What is proposed is similar to the locale module's existing
    "n" specifier except that this lets you say exactly what you want
    instead of deferring to the locale settings.
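
To make the pass-through concrete, here is a minimal sketch. Money is a hypothetical class invented for this example; the only real machinery shown is that format(obj, spec) hands the spec string to the object's __format__ method.

```python
class Money:
    """Hypothetical class used only to show the __format__ hook."""

    def __init__(self, amount):
        self.amount = amount

    def __format__(self, spec):
        # The spec arrives as one plain string; the object decides what it
        # means. Here it is simply forwarded to the float's own formatting.
        return "$" + format(self.amount, spec)


price = Money(1234.5)
print(format(price, ".2f"))        # builtin format() calls __format__
print("{0:.2f}".format(price))     # str.format() takes the same path
```

Because the spec string is the only thing that reaches __format__, any new option has to be expressible inside that string.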

    The mini-language seems to already be the way of things (just as it is
    in many other languages including PHP, C, Fortran, and whatnot). I'm
    just proposing an addition "T," so you add commas as a thousands
    separator.


    Raymond
     
    Raymond Hettinger, Mar 12, 2009
    #4
  5. Re: Rough draft: Proposed format specifier for a thousands separator

    On Mar 12, 9:56 pm, Raymond Hettinger <> wrote:
    > [Ulrich Eckhardt]
    >
    > > IOW, why not explicitly say what you want using keyword arguments with
    > > defaults instead of inventing an IMHO cryptic, read-only mini-language?
    >
    > That makes sense to me but I don't think that's the way the format()
    > builtin was implemented (see PEP 3101, which was implemented in Py2.6
    > and 3.0). It is a simple pass-through to a __format__ method for each
    > formattable object. I don't see how keywords would fit in that
    > framework. What is proposed is similar to the locale module's existing
    > "n" specifier except that this lets you say exactly what you want
    > instead of deferring to the locale settings.
    >
    > The mini-language seems to already be the way of things (just as it is
    > in many other languages including PHP, C, Fortran, and whatnot). I'm
    > just proposing an addition "T," so you add commas as a thousands
    > separator.


    ... and why not C (centum) for hundreds (can't have H(ollerith)) and W
    for wan (the Chinese word for 10 thousand)?
     
    John Machin, Mar 12, 2009
    #5
  6. Re: Rough draft: Proposed format specifier for a thousands separator

    "Ulrich Eckhardt" <eck...aser.com> wrote:

    >IOW, why not explicitly say what you want using keyword arguments with
    >defaults instead of inventing an IMHO cryptic, read-only mini-language?
    >Seriously, the problem I see with this proposal is that its aim to be as
    >short as possible actually makes the resulting format specifications
    >unreadable. Could you even guess what "8T.,1f" should mean if you had not
    >written this?


    +1

    Look back in history, and see how COBOL did it with the
    PICTURE - dead easy and easily understandable.
    Compared to that, even the C printf stuff and python's %
    are incomprehensible.

    - Hendrik
     
    Hendrik van Rooyen, Mar 12, 2009
    #6
  7. Re: Rough draft: Proposed format specifier for a thousands separator

    Raymond Hettinger wrote:
    [snip]
    > Proposal I (from Nick Coghlan):
    > -------------------------------
    >
    > A comma will be added to the format() specifier mini-language:
    >
    > [[fill]align][sign][#][0][minimumwidth][,][.precision][type]
    >
    > The ',' option indicates that commas should be included in the
    > output as a thousands separator. As with locales which do not
    > use a period as the decimal point, locales which use a
    > different convention for digit separation will need to use the
    > locale module to obtain appropriate formatting.
    >
    > The proposal works well with floats, ints, and decimals.
    > It also allows easy substitution for other separators.
    > For example:
    >
    > format(n, "6,f").replace(",", "_")
    >
    > This technique is completely general but it is awkward in the
    > one case where the commas and periods need to be swapped:
    >
    > format(n, "6,f").replace(",", "X").replace(".", ",").replace("X",
    > ".")
    >
    >
    > Proposal II (to meet Antoine Pitrou's request):
    > -----------------------------------------------
    >
    > Make both the thousands separator and decimal separator user
    > specifiable but not locale aware. For simplicity, limit the
    > choices to a comma, period, space, or underscore.
    >
    > [[fill]align][sign][#][0][minimumwidth][T[tsep]][dsep precision][type]
    >
    > Examples:
    >
    > format(1234, "8.1f") --> '  1234.0'
    > format(1234, "8,1f") --> '  1234,0'
    > format(1234, "8T.,1f") --> ' 1.234,0'
    > format(1234, "8T .f") --> ' 1 234,0'
    > format(1234, "8d") --> '    1234'
    > format(1234, "8T,d") --> '   1,234'
    >
    > This proposal meets most needs (except for people wanting
    > grouping for hundreds or ten-thousands), but it comes at the
    > expense of being a little more complicated to learn and
    > remember. Also, it makes it more challenging to write custom
    > __format__ methods that follow the format specification
    > mini-language.
    >
    > For the locale module, just the "T" is necessary in a
    > formatting string since the tool already has procedures for
    > figuring out the actual separators from the local context.
    >

    [snip]
    I'd probably prefer Proposal I with "." representing the decimal point
    and "," representing the grouping (thousands) separator, although I'd
    add an "L" flag to indicate that it should use the locale to provide the
    actual characters to be used and even the number of digits for the
    grouping:

    [[fill]align][sign][#][0][minimumwidth][,][.precision][L][type]

    Examples:

    Assuming the locale has:

    decimal point: ","
    grouping separator: "."
    grouping spacing: 3

    format(123456, "10.1f") --> '  123456.0'
    format(123456, "10.1Lf") --> ' 123.456,0'
    format(123456, "10,.1f") --> ' 123,456.0'
    format(123456, "10,.1Lf") --> ' 123.456,0'
     
    MRAB, Mar 12, 2009
    #7
  8. Re: Rough draft: Proposed format specifier for a thousands separator

    On Mar 12, 3:30 am, Raymond Hettinger <> wrote:
    > If anyone here is interested, here is a proposal I posted on the
    > python-ideas list.
    > [...]


    As far as I am concerned the most simple version plus a way to swap
    around commas and period is all that is needed. The rest can be done
    using one replace (because the decimal separator is always one of two
    options). This should cover everywhere but the far east. 80% of cases
    for 20% of implementation complexity.

    For example:

    [[fill]align][sign][#][0][,|.][minimumwidth][.precision][type]

    > format(1234, ".8.1f") --> ' 1.234,0'
    > format(1234, ",8.1f") --> ' 1,234.0'
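
A sketch of how "simplest version plus a swap" could look in practice, assuming an interpreter where the ',' option from Proposal I is available (Python 2.7/3.1 and later). The sample value and variable names are assumptions for illustration; str.translate (in its Python 3 form) stands in for the swap, and a single replace() covers the other separators.

```python
# Sketch only: assumes the ',' format option is available.
n = 1234567.891  # made-up sample value

with_commas = format(n, ",.2f")

# Swap ',' and '.' in one pass; chained replace() calls would need a
# temporary placeholder character to avoid clobbering each other.
swap_table = str.maketrans({",": ".", ".": ","})
swapped = with_commas.translate(swap_table)

# Any other thousands separator is then a single replace() away.
with_spaces = with_commas.replace(",", " ")
with_underscores = with_commas.replace(",", "_")

for text in (with_commas, swapped, with_spaces, with_underscores):
    print(text)
```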
     
    , Mar 12, 2009
    #8
  9. Re: Rough draft: Proposed format specifier for a thousands separator

    On Mar 12, 7:51 am, wrote:
    > On Mar 12, 3:30 am, Raymond Hettinger <> wrote:
    > > [...]
    >
    > As far as I am concerned the most simple version plus a way to swap
    > around commas and period is all that is needed.


    Thanks for the feedback.

    FWIW, posted a cleaned-up version of the proposal at
    http://www.python.org/dev/peps/pep-0378/


    Raymond
     
    Raymond Hettinger, Mar 12, 2009
    #9
  10. Re: Rough draft: Proposed format specifier for a thousands separator

    Raymond Hettinger <> writes:
    > FWIW, posted a cleaned-up version of the proposal at
    > http://www.python.org/dev/peps/pep-0378/


    It would be nice if the PEP included a comparison between the proposed
    scheme and how it is done in other programs and languages. For
    example, I think Common Lisp has a feature for formatting thousands.
    Spreadsheets like Excel probably have something similar. Those
    programs are pretty well evolved and probably address the important
    real use cases by now. It might be best to follow an existing example
    (with adjustments for Pythonification as necessary) to the extent
    possible.
     
    Paul Rubin, Mar 12, 2009
    #10
  11. Re: Rough draft: Proposed format specifier for a thousands separator

    [Paul Rubin]
    > It would be nice if the PEP included a comparison between the proposed
    > scheme and how it is done in other programs and languages.


    Good idea. I'm hoping that people will post those here.
    In my quick research, it looks like many languages offer
    nothing more than the usual C style % formatting and defer
    the rest to a locale-aware module.


    >  For
    > example, I think Common Lisp has a feature for formatting thousands.


    Do you have more detail?


    > Spreadsheets like Excel probably have something similar.


    I addressed that in the PEP in the section on VB and relatives. Their
    approach doesn't graft onto our existing approach. They use format
    specifiers like: "_($* #,##0_)".


    Raymond
     
    Raymond Hettinger, Mar 12, 2009
    #11
  12. Re: Rough draft: Proposed format specifier for a thousands separator

    [Paul Rubin]
    > I think Common Lisp has a feature for formatting thousands.


    I found the Common Lisp spec for this and added it to the PEP.


    Raymond
     
    Raymond Hettinger, Mar 12, 2009
    #12
  13. Re: Rough draft: Proposed format specifier for a thousands separator

    Raymond Hettinger <> writes:
    > In my quick research, it looks like many languages offer
    > nothing more than the usual C style % formatting and defer
    > the rest to a locale-aware module.


    Hendrik van Rooyen's mention of Cobol's "picture" (aka PIC)
    specifications might be added to the list. Cautionary tale: I once
    had a similar idea and suggested including a bastardized version of
    PIC in an extension language for something I worked on once. Another
    programmer then coded a reasonable PIC subset and we shipped it.
    Turned out that a number of our users were Cobol experts and once we
    had anything like PIC, they expected the weirdest and most obscure
    features (of which there were quite a few) of real Cobol PIC to work.
    We ended up having to assign someone a fairly lengthy task of figuring
    out the Cobol spec and implementing every last damn PIC feature. But
    I digress.


    > > example, I think Common Lisp has a feature for formatting thousands.

    > Do you have more detail?


    http://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node200.html

    gives as an example:

    (format nil "The answer is ~:D." (expt 47 x))
    => "The answer is 229,345,007."
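
For comparison, assuming a Python where the proposed ',' option is available (Python 2.7/3.1 and later), the Common Lisp example has a close counterpart. The value x = 5 is an assumption, chosen because 47 ** 5 reproduces the number shown in the Lisp output.

```python
# Sketch: Python counterpart of the ~:D directive, using the ',' option.
x = 5  # assumed exponent; 47 ** 5 == 229345007
message = "The answer is {0:,d}.".format(47 ** x)
print(message)  # The answer is 229,345,007.
```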
     
    Paul Rubin, Mar 12, 2009
    #13
  14. Re: Rough draft: Proposed format specifier for a thousands separator

    Raymond Hettinger <> writes:
    > I found the Common Lisp spec for this and added it to the PEP.


    Ah, cool, I simultaneously looked for it and posted about it.
     
    Paul Rubin, Mar 12, 2009
    #14
  15. Re: Rough draft: Proposed format specifier for a thousands separator

    Hendrik van Rooyen wrote:
    > "Ulrich Eckhardt" <eck...aser.com> wrote:
    >
    >> IOW, why not explicitly say what you want using keyword arguments with
    >> defaults instead of inventing an IMHO cryptic, read-only mini-language?
    >> Seriously, the problem I see with this proposal is that its aim to be as
    >> short as possible actually makes the resulting format specifications
    >> unreadable. Could you even guess what "8T.,1f" should mean if you had not
    >> written this?

    >
    > +1
    >
    > Look back in history, and see how COBOL did it with the
    > PICTURE - dead easy and easily understandable.
    > Compared to that, even the C printf stuff and python's %
    > are incomprehensible.
    >
    > - Hendrik


    Seeing how many people complained about the proposal being unreadable
    (although it tries to be simple by not including too many features), why
    not go all the way to unreadability and teach people to always use some
    sort of convenience function and never use the microlanguage except in
    very simple cases (or extremely complex cases, in which case you might
    actually be better served by writing your own formatting function)?

    Hypothetical code using a convenience function and the microlanguage
    could look like this:

    >>> num = 213210.3242
    >>> fmt = create_format(sep='-', decsep='@')
    >>> print fmt
    50|\/|3_v3ry_R34D4|3L3_C0D3
    >>> '{0!{1}}'.format(num, fmt)
    '213-210@3242'
     
    Lie Ryan, Mar 13, 2009
    #15
  16. Re: Rough draft: Proposed format specifier for a thousands separator

    [Lie Ryan]
    > Hypothetical code using a convenience function and the microlanguage
    > could look like this:
    >
    >  >>> num = 213210.3242
    >  >>> fmt = create_format(sep='-', decsep='@')
    >  >>> print fmt
    > 50|\/|3_v3ry_R34D4|3L3_C0D3
    >  >>> '{0!{1}}'.format(num, fmt)
    > '213-210@3242'


    LOL, it's like APL all over again ;-)

    FWIW, the latest version of the proposal is dirt simple:

    >>> format(1234567, 'd') # what we have now
    '1234567'
    >>> format(1234567, ',d') # proposed new option
    '1,234,567'
    >>> format(1234.5, '.2f') # what we have now
    '1234.50'
    >>> format(1234.5, ',.2f') # proposed new option
    '1,234.50'


    The proposal is roughly:
    If you want commas in the output,
    put a comma in the format string.
    It's not rocket science.

    What is rocket science is what you have to do now
    to achieve the same effect. If someone finds the
    above to be baffling, how the heck are they going
    to do the same thing using the locale module?
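
    For comparison, a minimal sketch of the kind of hand-rolled helper
    people write when they want grouping without touching the locale
    module. group_thousands is a made-up name, the code assumes plain
    integers, and the last line assumes an interpreter where the ',' option
    exists (Python 2.7 / 3.1 or later).

```python
def group_thousands(n, sep=','):
    """Group the digits of an integer in threes, separated by sep."""
    sign = '-' if n < 0 else ''
    digits = str(abs(n))
    groups = []
    while digits:
        groups.append(digits[-3:])   # peel off the last three digits
        digits = digits[:-3]
    return sign + sep.join(reversed(groups))


print(group_thousands(1234567))   # 1,234,567
print(format(1234567, ',d'))      # 1,234,567
```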


    Raymond
     
    Raymond Hettinger, Mar 13, 2009
    #16
  17. Raymond Hettinger

    Tim Rowe Guest

    Re: Rough draft: Proposed format specifier for a thousands separator

    2009/3/12 Raymond Hettinger <>:
    > If anyone here is interested, here is a proposal I posted on the
    > python-ideas list.
    >
    > The idea is to make numbering formatting a little easier with the new
    > format() builtin
    > in Py2.6 and Py3.0:  http://docs.python.org/library/string.html#formatspec


    As far as I can see you're proposing an amendment to *encourage*
    writing code that is not locale aware, with the amendment itself being
    locale specific, which surely has to be a regressive move in the 21st
    century. Frankly, I'd sooner see it made /harder/ to write code that
    is not locale aware (warnings, like FxCop gives on .net code?) than
    /easier/. Perhaps that's because I'm British, not American and I'm
    sick of having date fields get the date wrong because the programmer
    thinks the USA is the world. It makes me sympathetic to the problems
    caused to others by programmers who think the English-speaking world
    is the world.

    By the way, to others who think that 123,456.7 and 123.456,7 are the
    only conventions in common use in the West, no they're not. 123 456.7
    is in common use in engineering, at least in Europe, precisely to
    reduce (though not eliminate) problems caused by dot and comma
    confusion.

    --
    Tim Rowe
     
    Tim Rowe, Mar 13, 2009
    #17
  18. Raymond Hettinger

    Paul Rubin Guest

    Re: Rough draft: Proposed format specifier for a thousands separator

    Raymond Hettinger <> writes:
    > The proposal is roughly:
    > If you want commas in the output,
    > put a comma in the format string.
    > It's not rocket science.


    What if you want to change the separator? Europeans usually
    use periods instead of commas: one thousand = 1.000.
     
    Paul Rubin, Mar 13, 2009
    #18
  19. Re: Rough draft: Proposed format specifier for a thousands separator

    [andrew cooke]
    > would it break anything to also allow
    >
    > >>> format(1234567, 'd')       # what we have now
    >  '1234567'
    > >>> format(1234567, '.d')      # proposed new option
    >  '1.234.567'
    > >>> format(1234.5, ',2f')      # proposed new option
    >  '1234,50'
    > >>> format(1234.5, '.,2f')     # proposed new option


    Yes, that's allowed too! The separators can be any one of COMMA,
    SPACE, DOT, UNDERSCORE, or NON-BREAKING-SPACE.
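
    A note for readers trying this on a current interpreter: the draft
    discussed here allowed several separators, but released Python accepts
    only ',' as a grouping option (Python 2.7 / 3.1), with '_' added in
    Python 3.6. The sketch below assumes Python 3.6 or later; is_valid_spec
    is a made-up helper name used only to show which specifiers format()
    accepts.

```python
def is_valid_spec(value, spec):
    """Return True if format() accepts spec for value (illustrative helper)."""
    try:
        format(value, spec)
    except ValueError:
        return False
    return True


print(format(1234567, ',d'))          # 1,234,567
print(format(1234567, '_d'))          # 1_234_567
print(is_valid_spec(1234567, '.d'))   # False: '.' needs a precision after it

# An integer has no decimal point, so a plain replace is enough here.
dotted = format(1234567, ',d').replace(',', '.')
print(dotted)                         # 1.234.567
```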
     
    Raymond Hettinger, Mar 13, 2009
    #19
  20. Re: Rough draft: Proposed format specifier for a thousands separator

    [Paul Rubin]
    > What if you want to change the separator?  Europeans usually
    > use periods instead of commas: one thousand = 1.000.


    That is supported also.
     
    Raymond Hettinger, Mar 13, 2009
    #20