Re: Annoying octal notation

Discussion in 'Python' started by Derek Martin, Aug 21, 2009.

  1. Derek Martin

    Derek Martin Guest

    John Nagle wrote:
    > Yes, and making lead zeros an error as suggested in PEP 3127 is a
    > good idea. It will be interesting to see what bugs that flushes
    > out.


    James Harris wrote:
    > It maybe made sense once but this relic of the past should have been
    > consigned to the waste bin of history long ago.


    Sigh. Nonsense. I use octal notation *every day*, for two extremely
    prevalent purposes: file creation umask, and Unix file permissions
    (i.e. the chmod() function/command).
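
For readers who have not met the first of these uses, here is a minimal sketch of a file creation umask at work, written in Python 3 syntax (where octal is spelled with a `0o` prefix). The helper name `create_with_umask` and the file name are made up for illustration; `os.umask`, `os.open` and `stat.S_IMODE` are standard-library calls. It assumes a POSIX system.

```python
import os
import stat
import tempfile


def create_with_umask(path, mask, requested=0o666):
    """Create `path` under a temporary umask and return its permission bits."""
    old = os.umask(mask)              # os.umask returns the previous mask
    try:
        fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_EXCL, requested)
        os.close(fd)
    finally:
        os.umask(old)                 # always restore the caller's umask
    return stat.S_IMODE(os.stat(path).st_mode)


with tempfile.TemporaryDirectory() as tmp:
    # The umask bits are cleared from the requested mode: 0o666 & ~0o027
    mode = create_with_umask(os.path.join(tmp, "demo.txt"), 0o027)
    print(oct(mode))
```

Each octal digit lines up with one of the owner, group and other permission triplets, which is why these values are conventionally written in base 8.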

    I fail to see how 0O012, or even 0o012 is more intelligible than 012.
    The latter reads like a typo, and the former is virtually
    indistinguishable from 00012, O0012, or many other combinations that
    someone might accidentally type (or intentionally type, having to do
    this in dozens of other programming languages). I can see how 012 can
    be confusing to new programmers, but at least it's legible, and the
    great thing about humans is that they can be taught (usually). I for
    one think this change is completely misguided. More than flushing out
    bugs, it will *cause* them in ubiquity, requiring likely terabytes of
    code to be pored over and fixed. Changing decades-old behaviors
    common throughout a community for the sake of avoiding a minor
    inconvenience of the n00b is DUMB.

    --
    Derek D. Martin
    http://www.pizzashack.org/
    GPG Key ID: 0x81CFE75D


     
    Derek Martin, Aug 21, 2009
    #1

  2. >>>>> Derek Martin <> (DM) wrote:

    >DM> I fail to see how 0O012, or even 0o012 is more intelligible than 012.
    >DM> The latter reads like a typo, and the former is virtually
    >DM> indistinguishable from 00012, O0012, or many other combinations that
    >DM> someone might accidentally type (or intentionally type, having to do
    >DM> this in dozens of other programming languages).


    You're right. Either hexadecimal should have been 0h or octal should
    have been 0t :=)
    --
    Piet van Oostrum <>
    URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
    Private email:
     
    Piet van Oostrum, Aug 21, 2009
    #2

  3. MRAB

    MRAB Guest

    Piet van Oostrum wrote:
    >>>>>> Derek Martin <> (DM) wrote:

    >
    >> DM> I fail to see how 0O012, or even 0o012 is more intelligible than 012.
    >> DM> The latter reads like a typo, and the former is virtually
    >> DM> indistinguishable from 00012, O0012, or many other combinations that
    >> DM> someone might accidentally type (or intentionally type, having to do
    >> DM> this in dozens of other programming languages).

    >
    > You're right. Either hexadecimal should have been 0h or octal should
    > have been 0t :=)


    I have seen the use of Q/q instead in order to make it clearer. I still
    prefer Smalltalk's 16rFF and 8r377.
     
    MRAB, Aug 21, 2009
    #3
  4. James Harris

    James Harris Guest

    On 21 Aug, 20:48, Derek Martin <> wrote:

    ....

    > James Harris wrote:
    > > It maybe made sense once but this relic of the past should have been
    > > consigned to the waste bin of history long ago.

    >
    > Sigh.  Nonsense.  I use octal notation *every day*, for two extremely
    > prevalent purposes: file creation umask, and Unix file permissions
    > (i.e. the chmod() function/command).  


    You misunderstand. I was saying that taking a leading zero as
    indicating octal is archaic. Octal itself is fine where appropriate.

    The chmod command doesn't require a leading zero.

    James
     
    James Harris, Aug 22, 2009
    #4
  5. James Harris

    James Harris Guest

    On 21 Aug, 22:18, MRAB <> wrote:
    > Piet van Oostrum wrote:
    > >>>>>> Derek Martin <> (DM) wrote:

    >
    > >> DM> I fail to see how 0O012, or even 0o012 is more intelligible than 012.
    > >> DM> The latter reads like a typo, and the former is virtually
    > >> DM> indistinguishable from 00012, O0012, or many other combinations that
    > >> DM> someone might accidentally type (or intentionally type, having to do
    > >> DM> this in dozens of other programming languages).  

    >
    > > You're right. Either hexadecimal should have been 0h or octal should
    > > have been 0t :=)

    >
    > I have seen the use of Q/q instead in order to make it clearer. I still
    > prefer Smalltalk's 16rFF and 8r377.


    Two interesting options. In a project I have on I have also considered
    using 0q as indicating octal. I maybe saw it used once somewhere else
    but I have no idea where. 0t was a second choice and 0c third choice
    (the other letters of oct). 0o should NOT be used for obvious reasons.

    So you are saying that Smalltalk has <base in decimal>r<number> where
    r is presumably for radix? That's maybe best of all. It preserves the
    syntactic requirement of starting a number with a digit and seems to
    have greatest flexibility. Not sure how good it looks but it's
    certainly not bad.

    0xff & 0x0e | 0b1101
    16rff & 16r0e | 2r1101

    Hmm. Maybe a symbol would be better than a letter.
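
To make the comparison concrete, here is a small sketch that reads the `<base in decimal>r<number>` form and checks it against the equivalent `0x`/`0b` expression. The function `parse_radix` is a hypothetical helper written for this illustration, not part of Python or Smalltalk; it leans on the built-in `int(text, base)`, which accepts bases 2 to 36.

```python
import re

# Hypothetical helper: parse "<base in decimal>r<digits>", e.g. "16rff".
_RADIX = re.compile(r"(\d+)[rR]([0-9A-Za-z]+)")


def parse_radix(text):
    match = _RADIX.fullmatch(text)
    if match is None:
        raise ValueError(f"not a radix literal: {text!r}")
    base = int(match.group(1))
    if not 2 <= base <= 36:
        raise ValueError(f"unsupported base: {base}")
    return int(match.group(2), base)   # int() rejects digits invalid for base


left = 0xff & 0x0e | 0b1101
right = parse_radix("16rff") & parse_radix("16r0e") | parse_radix("2r1101")
print(left, right)
```

Because the literal still starts with a decimal digit, a tokenizer can tell it is a number before it knows the base, which is the property mentioned above.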

    James
     
    James Harris, Aug 22, 2009
    #5
  6. On Fri, 21 Aug 2009 14:48:57 -0500, Derek Martin wrote:

    >> It maybe made sense once but this relic of the past should have been
    >> consigned to the waste bin of history long ago.

    >
    > Sigh. Nonsense. I use octal notation *every day*, for two extremely
    > prevalent purposes: file creation umask, and Unix file permissions (i.e.
    > the chmod() function/command).


    And you will still be able to, by explicitly using octal notation.


    > I fail to see how 0O012, or even 0o012 is more intelligible than 012.


    The first is wrong, bad, wicked, and if I catch anyone using it, they
    will be soundly slapped with a halibut. *wink*

    Using O instead of o for octal is so unreadable that I think it should be
    prohibited by the language, no matter that hex notation accepts both x
    and X.


    > The latter reads like a typo,


    *Everything* reads like a typo if you're unaware of the syntax being used.


    > and the former is virtually
    > indistinguishable from 00012, O0012, or many other combinations that
    > someone might accidentally type (or intentionally type, having to do
    > this in dozens of other programming languages).


    Agreed.


    > I can see how 012 can
    > be confusing to new programmers, but at least it's legible, and the
    > great thing about humans is that they can be taught (usually).


    And the great thing is that now you get to teach yourself to stop writing
    octal numbers implicitly and write them explicitly with a leading 0o
    instead :)

    It's not just new programmers -- it's any programmer who is unaware of
    the notation (possibly derived from C) that a leading 0 means "octal".
    That's a strange and bizarre notation to use, because 012 is a perfectly
    valid notation for decimal 12, as are 0012, 00012, 000012 and so forth.
    Anyone who has learnt any mathematics beyond the age of six will almost
    certainly expect 012 to equal 12. Having 012 equal 10 comes as a surprise
    even to people who are familiar with octal.
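
A short sketch of how this plays out under the Python 3 rules being discussed (the Python 2 interpreter current at the time of this thread still accepted `012` as octal). The helper `compiles` is just an illustrative name; `compile` and `int` are built-ins.

```python
# Python 3 syntax: the explicit prefix is required for octal literals.
assert 0o12 == 10            # explicit octal
assert int("012") == 12      # int() on a string treats leading zeros as decimal
assert int("012", 8) == 10   # the base has to be requested explicitly


def compiles(source):
    """Return True if `source` is a valid Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


# 0o12 is accepted, 012 is rejected outright, and a run of zeros is still zero.
print(compiles("0o12"), compiles("012"), compiles("000"))
```

Rejecting `012` rather than silently reading it as twelve means old octal code fails loudly instead of changing value.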


    > I for
    > one think this change is completely misguided. More than flushing out
    > bugs, it will *cause* them in ubiquity, requiring likely terabytes of
    > code to be pored over and fixed. Changing decades-old behaviors common
    > throughout a community for the sake of avoiding a minor inconvenience of
    > the n00b is DUMB.


    Use of octal isn't common. You've given two cases where octal notation is
    useful, but for every coder who frequently writes umasks on Unix systems,
    there are a thousand who don't.

    It's no hardship to write 0o12 instead of 012.


    --
    Steven
     
    Steven D'Aprano, Aug 22, 2009
    #6
  7. On Fri, 21 Aug 2009 16:52:29 -0700 (PDT), James Harris
    <> declaimed the following in
    gmane.comp.python.general:

    > So you are saying that Smalltalk has <base in decimal>r<number> where
    > r is presumably for radix? That's maybe best of all. It preserves the
    > syntactic requirement of starting a number with a digit and seems to
    > have greatest flexibility. Not sure how good it looks but it's
    > certainly not bad.
    >
    > 0xff & 0x0e | 0b1101
    > 16rff & 16r0e | 2r1101
    >
    > Hmm. Maybe a symbol would be better than a letter.
    >

    Or Ada's 16#FF#, 8#377#...

    I forget if DEC/VMS FORTRAN or Xerox Sigma FORTRAN used x'FF' or
    'FF'x, and o'377' or '377'o
    --
    Wulfraed Dennis Lee Bieber KD6MOG
    HTTP://wlfraed.home.netcom.com/
     
    Dennis Lee Bieber, Aug 22, 2009
    #7
  8. MRAB

    MRAB Guest

    Dennis Lee Bieber wrote:
    > On Fri, 21 Aug 2009 16:52:29 -0700 (PDT), James Harris
    > <> declaimed the following in
    > gmane.comp.python.general:
    >
    >> So you are saying that Smalltalk has <base in decimal>r<number> where
    >> r is presumably for radix? That's maybe best of all. It preserves the
    >> syntactic requirement of starting a number with a digit and seems to
    >> have greatest flexibility. Not sure how good it looks but it's
    >> certainly not bad.
    >>
    >> 0xff & 0x0e | 0b1101
    >> 16rff & 16r0e | 2r1101
    >>
    >> Hmm. Maybe a symbol would be better than a letter.
    >>

    > Or Ada's 16#FF#, 8#377#...
    >

    '#' starts a comment, so that's right out! :)

    > I forget if DEC/VMS FORTRAN or Xerox Sigma FORTRAN used x'FF' or
    > 'FF'x, and o'377' or '377'o
     
    MRAB, Aug 22, 2009
    #8
  9. MRAB

    MRAB Guest

    David wrote:
    > Il Fri, 21 Aug 2009 16:52:29 -0700 (PDT), James Harris ha scritto:
    >
    >
    >> 0xff & 0x0e | 0b1101
    >> 16rff & 16r0e | 2r1101
    >>
    >> Hmm. Maybe a symbol would be better than a letter.

    >
    > What about 2_1011, 8_7621, 16_c26h or 2;1011, 8;7621, 16;c26h ?
    >

    '_': what if in the future we want to allow them in numbers for clarity?

    ';': used to separate multiple statements on a line (but not used that
    often).
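
As it turned out, later Python releases (3.6 and up, through PEP 515) did adopt the underscore as a digit separator, which is exactly the clash anticipated here. A brief sketch, assuming Python 3.6 or newer:

```python
million = 1_000_000          # underscores group digits; the value is unchanged
mask = 0b1111_0000           # they also work after a base prefix
perms = 0o7_5_5

print(million, mask, oct(perms))
```

Had `8_7621` meant "base 8", the same character could not also have served as a plain separator.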
     
    MRAB, Aug 22, 2009
    #9
  10. Derek Martin

    Derek Martin Guest

    On Sat, Aug 22, 2009 at 10:03:35AM +1000, Ben Finney wrote:
    > > and the former is virtually indistinguishable from 00012, O0012, or
    > > many other combinations that someone might accidentally type (or
    > > intentionally type, having to do this in dozens of other programming
    > > languages).

    >
    > Only if you type the letter in uppercase. The lower-case ‘o’ is much
    > easier to distinguish.


    It is easier, but I dispute that it is much easier.

    > Whether or not you find ‘0o012’ easily distinguishable as a non-decimal
    > notation, it's undoubtedly easier to distinguish than ‘012’.


    012 has meant decimal 10 in octal to me for so long, from its use in
    MANY other programming languages, that I disagree completely.

    > > I can see how 012 can be confusing to new programmers, but at least
    > > it's legible, and the great thing about humans is that they can be
    > > taught (usually). I for one think this change is completely misguided.

    >
    > These human programmers, whether newbies or long-experienced, also deal
    > with decimal numbers every day, many of which are presented as a
    > sequence of digits with leading zeros — and we continue to think of them
    > as decimal numbers regardless. Having the language syntax opposed to
    > that is


    ...consistent with virtually every other popular programming language.

    --
    Derek D. Martin
    http://www.pizzashack.org/
    GPG Key ID: 0x81CFE75D


     
    Derek Martin, Aug 22, 2009
    #10
  11. 22-08-2009 o 21:04:17 Derek Martin <> wrote:

    > On Sat, Aug 22, 2009 at 10:03:35AM +1000, Ben Finney wrote:


    >> These human programmers, whether newbies or long-experienced, also deal
    >> with decimal numbers every day, many of which are presented as a
    >> sequence of digits with leading zeros — and we continue to think of them
    >> as decimal numbers regardless. Having the language syntax opposed to
    >> that is


    > ...consistent with virtually every other popular programming language.


    Probably not every other...

    Anyway -- being (as it was said) inconsistent with every-day-convention --
    it'd be also inconsistent with *Python* conventions, i.e.:

    0x <- hex prefix
    0b <- bin prefix
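
The consistency argument can be seen in a few lines of Python 3, where `0o` joins the other two prefixes. This is only a sketch; `hex`, `bin`, `oct` and `int(text, 0)` are built-ins, and base `0` asks `int` to infer the base from the prefix.

```python
n = 0o12                      # explicit octal, alongside 0x and 0b

# The built-in formatters emit the same prefixes the parser accepts.
for text in (hex(n), bin(n), oct(n)):
    assert int(text, 0) == n

print(hex(255), bin(13), oct(10))
```

All three notations follow one pattern: a zero, a letter naming the base, then the digits.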

    Cheers,
    *j

    --
    Jan Kaliszewski (zuo) <>
     
    Jan Kaliszewski, Aug 22, 2009
    #11
  12. James Harris

    James Harris Guest

    Numeric literals in other than base 10 - was Annoying octal notation

    On 22 Aug, 10:27, David <> wrote:

    .... (snipped a discussion on languages and other systems interpreting
    numbers with a leading zero as octal)

    > > Either hexadecimal should have been 0h or octal should
    > > have been 0t :=)

    >
    >
    > I have seen the use of Q/q instead in order to make it clearer. I still
    > prefer Smalltalk's 16rFF and 8r377.
    >
    >
    > Two interesting options. In a project I have on I have also considered
    > using 0q as indicating octal. I maybe saw it used once somewhere else
    > but I have no idea where. 0t was a second choice and 0c third choice
    > (the other letters of oct). 0o should NOT be used for obvious reasons.
    >
    > So you are saying that Smalltalk has <base in decimal>r<number> where
    > r is presumably for radix? That's maybe best of all. It preserves the
    > syntactic requirement of starting a number with a digit and seems to
    > have greatest flexibility. Not sure how good it looks but it's
    > certainly not bad.
    >
    >
    > >   0xff & 0x0e | 0b1101
    > >   16rff & 16r0e | 2r1101

    >
    > > Hmm. Maybe a symbol would be better than a letter.


    ....

    > > Or Ada's 16#FF#, 8#377#...


    > > I forget if DEC/VMS FORTRAN or Xerox Sigma FORTRAN used x'FF' or
    > > 'FF'x, and o'377' or '377'o


    ....

    >
    > What about 2_1011, 8_7621, 16_c26h or 2;1011, 8;7621, 16;c26h ?


    They look good - which is important. The trouble (for me) is that I
    want the notation for a new programming language and already use these
    characters. I have underscore as an optional separator for groups of
    digits - 123000 and 123_000 mean the same. The semicolon terminates a
    statement. Based on your second idea, though, maybe a colon could be
    used instead as in

    2:1011, 8:7621, 16:c26b

    I don't (yet) use it as a range operator.

    I could also use a hash sign as although I allow hash to begin
    comments it cannot be preceded by anything other than whitespace so
    these would be usable

    2#1011, 8#7621, 16#c26b

    I have no idea why Ada which uses the # also apparently uses it to end
    a number

    2#1011#, 8#7621#, 16#c26b#

    Copying this post also to comp.lang.misc. Folks there may either be
    interested in the discussion or have comments to add.

    James
     
    James Harris, Aug 22, 2009
    #12
  13. Mel

    Mel Guest

    Re: Numeric literals in other than base 10 - was Annoying octal notation

    James Harris wrote:

    > I have no idea why Ada which uses the # also apparently uses it to end
    > a number
    >
    > 2#1011#, 8#7621#, 16#c26b#


    Interesting. They do it because of this example from
    <http://archive.adaic.com/standards/83rat/html/ratl-02-01.html#2.1>:

    2#1#E8 -- an integer literal of value 256

    where the E prefixes an exponent applied to the base, and the closing #
    keeps it from being taken as a digit (as it could be in base 16). That is
    to say

    16#1#E2

    would also equal 256, since it's 1*16**2 .
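
A rough sketch of that rule, for anyone who wants to try it. The parser `parse_ada_based` is a hypothetical helper covering only the integer form `<base>#<digits>#[E<exponent>]`; it ignores underscores and fractional parts, which real Ada literals also permit.

```python
import re

# Hypothetical parser for the integer form <base>#<digits>#[E<exponent>].
_BASED = re.compile(r"(\d+)#([0-9A-Fa-f]+)#(?:[Ee]\+?(\d+))?")


def parse_ada_based(text):
    match = _BASED.fullmatch(text)
    if match is None:
        raise ValueError(f"not a based literal: {text!r}")
    base = int(match.group(1))
    if not 2 <= base <= 16:
        raise ValueError(f"base out of range: {base}")
    mantissa = int(match.group(2), base)
    exponent = int(match.group(3) or 0)
    return mantissa * base ** exponent   # the exponent scales by the base


print(parse_ada_based("2#1#E8"), parse_ada_based("16#1#E2"))
# Without the closing '#', the E in 1E2 reads as a hex digit instead:
print(int("1E2", 16))
```

The last line shows the ambiguity the closing `#` removes: `1E2` in base 16 is a three-digit number, not one times sixteen squared.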


    Mel.
     
    Mel, Aug 23, 2009
    #13
  14. Carl Banks

    Carl Banks Guest

    On Aug 22, 12:04 pm, Derek Martin <> wrote:
    > On Sat, Aug 22, 2009 at 10:03:35AM +1000, Ben Finney wrote:
    > > These human programmers, whether newbies or long-experienced, also deal
    > > with decimal numbers every day, many of which are presented as a
    > > sequence of digits with leading zeros — and we continue to think of them
    > > as decimal numbers regardless. Having the language syntax opposed to
    > > that is

    >
    > ...consistent with virtually every other popular programming language.



    If you know anything about Python, you should know that "consistent
    with virtually every other programming language" is, at most, a polite
    suggestion for how Python should do it.


    Carl Banks
     
    Carl Banks, Aug 23, 2009
    #14
  15. Carl Banks

    Carl Banks Guest

    On Aug 21, 12:48 pm, Derek Martin <> wrote:
    > John Nagle wrote:
    > > Yes, and making lead zeros an error as suggested in PEP 3127 is a
    > > good idea.  It will be interesting to see what bugs that flushes
    > > out.

    > James Harris wrote:
    > > It maybe made sense once but this relic of the past should have been
    > > consigned to the waste bin of history long ago.

    >
    > Sigh.  Nonsense.  I use octal notation *every day*, for two extremely
    > prevalent purposes: file creation umask, and Unix file permissions
    > (i.e. the chmod() function/command).


    Unix file permissions maybe made sense once but this relic of the past
    should have been consigned to the waste bin of history long ago. :)


    Carl Banks
     
    Carl Banks, Aug 23, 2009
    #15
  16. Derek Martin

    Derek Martin Guest

    On Sat, Aug 22, 2009 at 02:55:51AM +0000, Steven D'Aprano wrote:
    > > I can see how 012 can
    > > be confusing to new programmers, but at least it's legible, and the
    > > great thing about humans is that they can be taught (usually).

    >
    > And the great thing is that now you get to teach yourself to stop writing
    > octal numbers implicitly and write them explicitly with a leading 0o
    > instead :)


    Sorry, I don't write them implicitly. A leading zero explicitly
    states that the numeric constant that follows is octal. It is so in 6
    out of 7 computer languages I have more than a passing familiarity
    with (the 7th being scheme, which is a thing unto itself), including
    Python. It's that way on Bourne-compatible and POSIX-compatible Unix
    shells (though it requires a leading backslash before the leading zero
    there). I'm quite certain it can not be the case on only those 6
    languages that I happen to be familiar with...

    While it may be true that people commonly write decimal numbers with
    leading zeros (I dispute even this, having to my recollection only
    recently seen it as part of some serial number, which in practice is
    really more of a string identifier than a number, often containing
    characters other than numbers), it's also true that in the context of
    computer programming languages, for the last 40+ years, a number
    represented with a leading zero is most often an octal number. This
    has been true even in Python for nearly *twenty years*. Why the
    sudden need to change it?

    So no, I don't get to teach myself to stop writing octal numbers with
    a leading zero. Instead, I have to remember an exception to the rule.

    Also I don't think it's exactly uncommon for computer languages to do
    things differently than they are done in non-CS circles. A couple of
    easy examples: we do not write x+=y except in computer languages. The
    POSIX system call to create a file is called creat(). If you think
    about it, I'm sure you can come up with lots of examples where even
    Python takes liberties. Is this a bad thing? Not inherently, no.
    Will it be confusing to people who aren't familiar with the usage?
    Quite possibly, but that is not inherently bad either. It's all about
    context.

    > Use of octal isn't common.


    It's common enough. Peruse the include files for your C libraries, or
    the source for your operating system's kernel, or system libraries,
    and I bet you'll find plenty of octal. I did. [Note that it is
    irrelevant that these are C/C++ files; here we are only concerned with
    whether they use octal, not how it is represented therein.] I'd guess
    there's a fair chance that any math or scientific software package
    uses octal. Octal is a convenient way to represent bit fields that
    naturally occur in groups of 3, of which there are potentially
    limitless cases.
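
The groups-of-three point is easy to show with Unix permission bits. This is a minimal sketch in Python 3 syntax; `rwx` is an illustrative helper name, not a standard function.

```python
def rwx(mode):
    """Render the low nine permission bits as an 'rwxr-x---' style string."""
    out = []
    for shift in (6, 3, 0):                 # owner, group, other
        triplet = (mode >> shift) & 0o7     # one octal digit = three bits
        out.append("r" if triplet & 0o4 else "-")
        out.append("w" if triplet & 0o2 else "-")
        out.append("x" if triplet & 0o1 else "-")
    return "".join(out)


print(rwx(0o750), rwx(0o644))
print(hex(0o750))   # the same value in hex hides the digit-per-triplet structure
```

In octal each digit maps to exactly one triplet, whereas a hex digit covers four bits and so straddles the boundaries.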

    > You've given two cases where octal notation is useful, but for every
    > coder who frequently writes umasks on Unix systems, there are a
    > thousand who don't.


    I gave two cases that I use *daily*, or very nearly daily. My hats
    currently include system admin, SQA, and software development, and I
    find it convenient to use octal in each of those. But those are
    hardly the only places where octal is useful. Have a look at the
    ncurses library, for example. Given that Python has an ncurses
    interface, I'm guessing it's used there too. In fact if the Python
    source had no octal in it, I would find that very surprising.

    > It's no hardship to write 0o12 instead of 012.


    Computer languages are not write-only, excepting maybe Perl. ;-)
    Writing 0o12 presents no hardship; but I assert, with at least some
    support from others here, that *reading* it does.

    --
    Derek D. Martin
    http://www.pizzashack.org/
    GPG Key ID: 0x81CFE75D


     
    Derek Martin, Aug 23, 2009
    #16
  17. Derek Martin

    Derek Martin Guest

    On Fri, Aug 21, 2009 at 04:23:57PM -0700, James Harris wrote:
    > You misunderstand. I was saying that taking a leading zero as
    > indicating octal is archaic. Octal itself is fine where appropriate.


    I don't see that the leading zero is any more archaic than the use of
    octal itself... Both originate from around the same time period, and
    are used in the same cases. We should just prohibit octal entirely
    then.

    But I suppose it depends on which definition of "archaic" you use. In
    the other common sense of the word, the leading zero is no more
    archaic than the C programming language. Let's ban the use of all
    three. :) (I believe C is still the language in which the largest
    number of lines of new code are written, but if not, it's way up
    there.)

    > The chmod command doesn't require a leading zero.


    No, but it doesn't need any indicator that the number given to it is
    in octal; in the case of the command line tool, octal is *required*,
    and the argument is *text*. However, the chmod() system call, and the
    interfaces to it in every language I'm familiar with that has one, do
    require the leading zero (because that's how you represent octal).
    Including Python, for some 20 years or so.
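
The difference between the text argument of the command and the integer argument of the call can be sketched as follows, using Python 3 spelling (`0o640`) rather than the leading-zero form under discussion. It assumes a POSIX system; `os.chmod` and `stat.S_IMODE` are standard-library calls.

```python
import os
import stat
import tempfile

with tempfile.NamedTemporaryFile(delete=False) as handle:
    path = handle.name

try:
    os.chmod(path, 0o640)                    # integer argument: rw-r-----
    mode = stat.S_IMODE(os.stat(path).st_mode)
finally:
    os.remove(path)

print(oct(mode))
print(oct(640))   # dropping the prefix passes decimal 640, a different bit pattern
```

The command-line tool can parse its text argument as octal itself, while the function only ever sees an integer, so the base has to be settled in the source code.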

    --
    Derek D. Martin
    http://www.pizzashack.org/
    GPG Key ID: 0x81CFE75D


     
    Derek Martin, Aug 23, 2009
    #17
  18. On Sat, 22 Aug 2009 22:19:01 -0500, Derek Martin <>
    declaimed the following in gmane.comp.python.general:


    > While it may be true that people commonly write decimal numbers with
    > leading zeros (I dispute even this, having to my recollection only
    > recently seen it as part of some serial number, which in practice is
    > really more of a string identifier than a number, often containing
    > characters other than numbers), it's also true that in the context of
    > computer programming languages, for the last 40+ years, a number
    > represented with a leading zero is most often an octal number. This
    > has been true even in Python for nearly *twenty years*. Why the
    > sudden need to change it?
    >

    About the only place one commonly sees leading zeros on decimal
    numbers, in my experience, is zero-filled COBOL data decks (and since
    classic COBOL stores in BCD anyway... binary (usage is
    computational/comp-1) was a later add-on to the data specification model
    as I recall...)
    --
    Wulfraed Dennis Lee Bieber KD6MOG
    HTTP://wlfraed.home.netcom.com/
     
    Dennis Lee Bieber, Aug 23, 2009
    #18
  19. On Sat, 22 Aug 2009 14:04:17 -0500, Derek Martin wrote:

    >> These human programmers, whether newbies or long-experienced, also deal
    >> with decimal numbers every day, many of which are presented as a
    >> sequence of digits with leading zeros — and we continue to think of
    >> them as decimal numbers regardless. Having the language syntax opposed
    >> to that is

    >
    > ...consistent with virtually every other popular programming language.


    A mistake is still a mistake even if it is shared with others.

    Treating ints with a leading zero as octal was a design error when it was
    first thought up (possibly in C?) and it remains a design error no matter
    how many languages copy it. I feel your pain of having to unlearn
    something you have learned, but just because you have been led astray by
    the languages you use doesn't mean we should compound the error by
    leading the next generation of coders astray too.

    Octal is of little importance today; as near as I can tell, it only has
    two common uses in high level languages: file umasks and permissions on
    Unix systems. It simply isn't special enough to justify implicit notation
    that surprises people, leads to silent errors, and is inconsistent with
    standard mathematical notation and treatment of floats:

    >>> 123.2000 # insignificant trailing zeroes don't matter
    123.2
    >>> 000123.2 # neither do insignificant leading zeroes
    123.2
    >>> 001.23e0023 # not even if it is an integer
    1.23e+23
    >>> 000123 # but here it matters
    83


    --
    Steven
     
    Steven D'Aprano, Aug 23, 2009
    #19
  20. On Sat, 22 Aug 2009 22:19:01 -0500, Derek Martin wrote:

    > On Sat, Aug 22, 2009 at 02:55:51AM +0000, Steven D'Aprano wrote:
    >> > I can see how 012 can
    >> > be confusing to new programmers, but at least it's legible, and the
    >> > great thing about humans is that they can be taught (usually).

    >>
    >> And the great thing is that now you get to teach yourself to stop
    >> writing octal numbers implicitly and write them explicitly with a
    >> leading 0o instead :)

    >
    > Sorry, I don't write them implicitly. A leading zero explicitly states
    > that the numeric constant that follows is octal.


    That is incorrect.

    Decimal numbers implicitly use base 10, because there's nothing in the
    literal 12340 (say) to indicate the base is ten, rather than 16 or 9 or
    23. Although implicit is usually bad, when it's as common and expected as
    decimal notation, it's acceptable.

    Hexadecimal literals explicitly use base 16, because the leading 0x is defined to
    mean "base 16". 0x is otherwise not a legal decimal number, or hex number
    for that matter. (It would be legal in base 34 or greater, but that's
    rare enough that we can ignore this.) For the bases we care about, a
    leading 0x can't have any other meaning -- there's no ambiguity, so we
    can treat it as a synonym for "base 16".

    (Explicitness isn't a binary state, and it would be even more explicit if
    the base was stated in full, as in e.g. Ada where 16#FF# = decimal 255.)

    However, octal numbers are defined implicitly: 012 is a legal base 10
    number, or base 3, or base 9, or base 16. There's nothing about a leading
    zero that says "base 8" apart from familiarity. We can see the difference
    between leading 0x and leading 0 if you repeat it: repeating an explicit
    0x, as in 0x0xFF, is a syntax error, while repeating an implicit 0
    silently does nothing different:

    >>> 0x0xFF
      File "<stdin>", line 1
        0x0xFF
        ^
    SyntaxError: invalid syntax
    >>> 0077
    63
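
The claims about which bases can read a given digit string are easy to check with the built-in `int(text, base)`. A small sketch; note that `int` only strips a `0x` prefix when the base is 16 or 0, so in other bases the `x` is treated as an ordinary digit.

```python
# "012" is a valid digit string in many bases; only the base changes its value.
for base in (3, 8, 9, 10, 16):
    print(base, int("012", base))

# "0x" read as two digits needs a base above 33, since x is digit value 33.
print(int("0x", 34))
try:
    int("0x", 33)
except ValueError as exc:
    print("base 33:", exc)
```

Nothing in the string `012` itself selects any one of those readings, which is the sense in which the leading zero is an implicit marker.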


    > It is so in 6 out of 7
    > computer languages I have more than a passing familiarity with (the 7th
    > being scheme, which is a thing unto itself), including Python. It's
    > that way on Bourne-compatible and POSIX-compatible Unix shells (though
    > it requires a leading backslash before the leading zero there). I'm
    > quite certain it can not be the case on only those 6 languages that I
    > happen to be familiar with...


    No, of course not. There are a bunch of languages, pretty much all
    heavily influenced by C, which treat integer literals with leading 0s as
    oct: C++, Javascript, Python 2.x, Ruby, Perl, Java. As so often is the
    case, C's design mistakes become common practice. Sigh.

    However, there are many, many languages that don't, or otherwise do
    things differently to C. Even some modern C-derived languages reject the
    convention:

    C# doesn't have octal literals at all.

    As far as I can tell, Objective-C and Cocoa require you to explicitly
    enable support for octal literals before you use them.

    In D, at least some people want to follow Python's lead and either drop
    support for oct literals completely, or require a 0o prefix:
    http://d.puremagic.com/issues/show_bug.cgi?id=2656

    E makes a leading 0 a syntax error.


    As far as other, non-C languages go, leading 0 = octal seems to be rare
    or non-existent:

    Basic and VB use a leading &O for octal.

    FORTRAN 90 uses a leading O (uppercase o) for octal, and surrounds the
    literal in quotation marks: O"12" would be ten in octal. 012 would be
    decimal 12.

    As far as I can tell, COBOL also ignores leading zeroes.

    Forth interprets literals according to the current value of BASE (which
    defaults to 10). There's no special syntax for it. To enter ten in octal,
    you might say:

    8 BASE ! 12

    or if your system provides it:

    OCT 12

    Standard Pascal ignores leading 0s in integers, and doesn't support octal
    at all. A leading $ is used for hex. At least one non-standard Pascal
    uses leading zero for octal.

    Haskell requires an explicit 0o:
    http://www.haskell.org/onlinereport/lexemes.html#lexemes-numeric

    So does OCaml.

    Ada uses decimal unless you explicitly give the base:
    http://archive.adaic.com/standards/83lrm/html/lrm-02-04.html

    Leading zeroes are insignificant in bc:

    [steve@sylar ~]$ bc
    bc 1.06
    Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc.
    This is free software with ABSOLUTELY NO WARRANTY.
    For details type `warranty'.
    012 + 011
    23

    Leading zeroes are also insignificant in Hewlett-Packard RPN language
    (e.g. HP-48GX calculators), Hypertalk and languages derived from it.

    I'm not sure, but it looks to me like Boo doesn't support octal literals,
    although it supports hex with 0x and binary with 0b.

    Algol uses an explicit base: 8r12 means octal 12, i.e. decimal 10.

    Common Lisp and Scheme use a #o prefix.

    As far as *languages* go, 0-based octal literals are in the tiny
    minority. As far as *programmers* go, they may be in a plurality, perhaps
    even a small majority, but remember there are still millions of VB
    programmers out there who are just as unfamiliar with C conventions.

    > While it may be true that people commonly write decimal numbers with
    > leading zeros (I dispute even this

    [...]

    Leading zeroes in decimal numbers are *very* common in dates and times.
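
A quick sketch of why dates are the awkward case: fields such as `08` and `09` are zero-padded decimal and are not even valid octal digit strings. The sample timestamp is made up for illustration; `datetime.strptime` and `int` are standard.

```python
from datetime import datetime

stamp = datetime.strptime("2009-08-09 07:05", "%Y-%m-%d %H:%M")
print(stamp.month, stamp.day, stamp.hour, stamp.minute)

# int() on text reads leading zeros as decimal padding.
fields = [int(part) for part in "2009-08-09".split("-")]
print(fields)

# Zero-padded output is ordinary decimal formatting.
print(f"{stamp.month:02d}/{stamp.day:02d}")
```

Someone who pastes such fields into source code as bare literals is the person most likely to be caught out by a leading-zero-means-octal rule.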


    [...]
    > Given that Python has an ncurses interface, I'm
    > guessing it's used there too. In fact if the Python source had no octal
    > in it, I would find that very surprising.


    I can't see any oct literals in the standard library, not even in the
    ncurses interface, but then my grep-fu is weak and I may have made a
    mistake. I encourage you to look for yourself.


    >> It's no hardship to write 0o12 instead of 012.

    >
    > Computer languages are not write-only, excepting maybe Perl. ;-) Writing
    > 0o12 presents no hardship; but I assert, with at least some support from
    > others here, that *reading* it does.


    No more so than 0x or 0b literals. If anything, 0o12 stands out as "not
    twelve" far more than 012 does.



    --
    Steven
     
    Steven D'Aprano, Aug 23, 2009
    #20