negative integer division

Discussion in 'Python' started by Imbaud Pierre, Feb 8, 2005.

  1. integer division and modulo gives different results in c and python,
    when negative numbers
    are involved. take gdb as a widely available c interpreter
    print -2 /3
    0 for c, -1 for python.
    more amazing, modulos of negative number give negative values! (in c).
    from an algebraic point of view, python seems right, but I thought
    python conformity to the underlying c compiler was a strong commitment,
    obviously not here, and I found no explanation whatsoever in python doc.
    no actual programming challenge for me here, I just had this bug, after
    confidently translating from python to c.
     
    Imbaud Pierre, Feb 8, 2005
    #1
    1. Advertisements

  2. Skip Montanaro, Feb 8, 2005
    #2
    1. Advertisements

  3. Imbaud Pierre

    Robert Kern Guest

    I don't think there's much *commitment* to follow the behaviour of your
    C compiler. It's more of a default stance when the issues get too
    complicated for Python to implement itself. Thus, much of the floating
    point behaviour defaults to the behaviour of your platform because
    floating point math is hard and getting consistent behaviour on lots of
    platforms is really hard.

    Integers are a piece of cake, relatively speaking, and I think that it's
    worth fixing up a few warts from the C behaviour.

    As for documentation, see
    http://www.python.org/doc/current/lib/typesnumeric.html

    --
    Robert Kern


    "In the fields of hell where the grass grows high
    Are the graves of dreams allowed to die."
    -- Richard Harter
     
    Robert Kern, Feb 8, 2005
    #3
  4. Imbaud Pierre

    Mark Jackson Guest

    AIUI the C standard is silent on the issue, and hence the C behavior is
    implementation-dependent. Anyway back in 2000 I found and fixed a
    Y2K-related problem in an open-source C program (xvtdl) which was down
    to precisely this misbehavior. While diagnosing the problem I
    implemented the algorithm in Python for test purposes, and was led
    astray for a while by the fact that it *didn't* fail!

    A: 42

    Q: What multiple of 7 did I add to the critical expression in the Zeller
    algorithm so it would remain nonnegative for the next few centuries?
     
    Mark Jackson, Feb 8, 2005
    #4
  5. If you read the C standard, you can see that the C compiler is free to
    make -1 / 3 either -1 or 0 (but it must describe what it does). The
    decision is usually to do whatever the underlying CPU does. The
    rationale is that often you will be dividing positive by positive,
    and you don't want the compiler sprinkling around a lot of test code.

    While the difference in compiled C is a substantial time penalty
    (DIVIDE vs. DIVIDE; COMPARE; JUMPGE), in Python the trade-off is
    adding two instructions to many (50? 100?). Python doesn't really
    mind paying for a few extra instructions to get the canonical value,
    and the guarantee simplifies the user's code.

    --Scott David Daniels
     
    Scott David Daniels, Feb 8, 2005
    #5
  6. Imbaud Pierre

    Jive Dadson Guest

    Python does it right. C is allowed to do it anyway it likes, which was
    a stupifyingly horrible decision, IMHO.

    Way back when, there was a language named Pascal. I lobbied the Pascal
    standards committee to define the modulus operator correctly, which they
    eventually did. To my astonishment, they proceded to define the
    division operator backwards to the modulus operator, so that division
    and mod did not play together correctly. Duh.
     
    Jive Dadson, Feb 9, 2005
    #6
  7. Imbaud Pierre

    Dan Bishop Guest

    The *original* C standard is silent on the issue, but the current
    standard requires the wrong behavior.
     
    Dan Bishop, Feb 9, 2005
    #7
  8. Imbaud Pierre

    Mike Meyer Guest

    C only does it wrong if you think that C is a high level language. It
    isn't - it's a portable assembler. As such, low level things (like
    this, or what happens on integer overflow, or ...) are left up to the
    implementation, so it can do what's most natural for the underlying
    hardware. This means that when you don't care - which I'd argue is
    most of the time - you get the fastest thing the machine will do. When
    you do care, you have to take care of it yourself. Of course, if you
    care, you probably shouldn't be writing in assembler, you should
    probably be writing in a high level language - which will make sure
    the low level things get done right, irregardless of what the
    underlying machine does.

    Now, I'll agree with you if you want to argue that some machines do
    negative integer division in stupifyingly horrible ways.

    <mike
     
    Mike Meyer, Feb 9, 2005
    #8
  9. Imbaud Pierre

    Jive Dadson Guest

    I didn't say it does it wrong. I said it does it anyway it likes --
    maybe right, maybe wrong. There *is* a right way, IMHO. Python does it
    that way.
    I've heard that many times, but it makes no sense to me. By definition,
    the syntax of an assembly language closely resembles the format of
    individual hardware instructions for a particular processor. An
    assembler assembles individual hardware instructions. Back in the day,
    Unix (written largely in C) and Steve Johnson's pcc (the *portable* C
    compiler) together constituted a big leap forward. Implementing Unix on
    new processors was infinitely easier than porting OS's written in
    assembly language. So I disagree on two counts: C code is not entirely
    portable, (for example, division and mod may not work the same on two
    different machines), and a C compiler is most certainly not an
    assembler.

    That's why I think it was a stupifyingly horrible decision.
    Understandable, but in the end an s.h.d. nonetheless. It would have
    been okay to define / and % correctly, in the mathematical sense, but
    also provide functions quick_div() and quick_mod() that were guaranteed
    to work correctly only when both arguments were positive. The way they
    did it is just too error prone, akin to early optimization. It's bitten
    me before, when I was called on to port code (which I did not write)
    from one machine to another. Having standard operators with
    under-defined behavior is just inviting trouble: long debugging
    sessions, or worse, unexplained failures in the field. Of course you
    and I would avoid all the pitfalls at the start. :)

    .... and now back to your regularly scheduled Python newsgroup.
     
    Jive Dadson, Feb 9, 2005
    #9
  10. I think the point is that C is a low-level, hardware twiddling
    language to be used by people writing things like kernel code --
    something that was always done in assembler before C came
    along.
    For a language meant to write user-space applications where one
    probably cares what happens when a division results in a
    negative integer, it is a horrible decision. For a "portable
    assembler" used to write device drivers it makes sense. People
    writing that sort of code presumably know how their hardware
    behaves, don't expect that everything write is portable, and
    just don't do division with negative numbers. When they do
    division, it's with postive numbers and they don't want to
    waste the extra clock cycles to do it in a way that's
    machine-independant for negative numbers.

    The fact that C ended up in the rather inappropriate role of
    a user-land application language is different problem.
     
    Grant Edwards, Feb 9, 2005
    #10
  11. Imbaud Pierre

    Jive Dadson Guest

    intentionally
     
    Jive Dadson, Feb 9, 2005
    #11
  12. Imbaud Pierre

    Jive Dadson Guest

    And Python interpreters?
    In the early 80's, either C was the "appropriate language" or there was
    none ... and that's coming from someone who wrote a commercial
    Pascal compiler, runtime support, and debugger. I did it all in C.
    Pascal, as we all know, was ill-conceived. C++ was a momentous
    advance, but it intensionally inherited many of C's warts.

    I've forgotten what we are arguing about, but I'm sure I'm right.

    J.
     
    Jive Dadson, Feb 9, 2005
    #12
  13. No. That's my point. C isn't a good language in which to write
    user applications such as a Python language. One uses C when
    the other choice is assembly, not when the other choice is
    "real" high level language like Python, or Modula-3, or
    Smalltalk, or whatever.
    I never thought so. I did embedded systems development using
    Pascal and quite enjoyed it. Later when Pascal waned and C
    waxed, I thought C was a definite step backwards. I thought
    Modula-2 and Modula-3 were both good languages as well.
    Now C++ _was_ ill-conceived. It's more of an agglomeration of
    features than Perl. And dangerous to boot: at least Perl tries
    to protect you from memory leaks and stray pointers.
    And added a bunch of it's own.

    This is pretty much completely off-topic now. :)
     
    Grant Edwards, Feb 9, 2005
    #13
  14. I disagree!
     
    Grant Edwards, Feb 9, 2005
    #14
  15. Imbaud Pierre

    Peter Hansen Guest

    No discussion of how lame other languages are is ever
    completely off-topic in comp.lang.python. After all,
    these discussions continue to remind us how lucky we
    all are to be able to program in Python, and that
    can only be a good thing. ;-)

    -Peter
     
    Peter Hansen, Feb 9, 2005
    #15
  16. ^^^ QOTW

    --Scott David Daniels
     
    Scott David Daniels, Feb 10, 2005
    #16
  17. Imbaud Pierre

    Carl Banks Guest


    C language is chock-full of things that are stupidly horrible
    decisions. This is one of very least of them.

    The philosophy of C was to turn opeations into one or two instructions,
    if they could. Just about any operator in C can be, and having /
    invoke a small subprogram would have been the absolute wrong thing to
    do. The other philosophy of C was to make it easy for the programmer
    to hand-optimize stuff. Which means no way was it going to force the
    programmer to use quick_div to get the division in two instructions.
    It was the right decision.

    I would say that a better thing to call this is a stupidly horrible
    circumstance. The circumstance is that an archaic language designed to
    be hand-optimized and has unfortuntate importabilities has ever became
    everyone's shizzle.

    (I'm hating C today; I was asked to write something in C and I can't
    use anything else because someone has to use to code.)
     
    Carl Banks, Feb 10, 2005
    #17
  18. Imbaud Pierre

    John Machin Guest

    What are you calling "the Zeller algorithm", and what is the "critical
    expression"?

    There _is_ something called "Zeller's congruence":

    if m is the month number in an adjusted or computational year (m == 0
    => March, m == 11 => Feb), then the number of days in the adjusted
    year before month m is [Python notation] (13*m + 2) // 5 + m * 28

    Alternatively (153*m + 2) // 5

    I believe that strictly the term "Zeller's congruence" applies only to
    the (13*m + 2)//5 part. In any case, all of the above is quite
    independent of the year, and there is no possibility of going
    negative.

    I've no doubt you came across a stuffed-up date-to-days calculation
    routine and fixed it, but it's a bit unfair to lumber Zeller with the
    blame. If it was a days-to-date routine, then Zeller is not even
    standing next to the real target.

    Cheers,
    John
     
    John Machin, Feb 10, 2005
    #18
  19. Imbaud Pierre

    Mike Meyer Guest

    Um, no. The syntax of an assembly language is totally unrelated to the
    format of the individual hardware instructions for a particular
    processor. Typically, the syntax of an assembly language is such that
    one obvious syntactical element (for instance, a line) generates one
    hardware instruction. Usually, mnemonics of some kind were used to
    denote which instruction to generate. However, it doesn't have to be
    that way. Whitesmith had a z80/8080 assembler in which you wrote "a +=
    b" to add the contents of register b to register a.
    A big leap forward *from assembler*. The development space that C
    occupied at that time was the kernel and system utilities - things
    that were usually (though by no means always) done in assembler before
    the arrival of C. As a tool for producing robust, reliable code it
    pretty much sucks, because it has so many of the flaws that assembler
    has. Calling it a high level language is a disservice to the HLLs of
    the era.
    No language is entirely portable. C is *much* more portable than the
    assembler languages that it displaced.

    Note that C has displaced assembler in other areas - for instance,
    it's not uncommon to find compilers that use "ANSI C" as the target
    machine. By using C as a portable assembler instead of generating
    machine code, the number of supported platforms increases
    dramatically.
    The way they did it was exactly right for the target applications
    (i.e. - the v6 Unix kernel and system utilities).
    There is no such thing as a portable program - merely ported
    programs. I've been bitten by the code assuming that ints were two
    bytes long, and know of cases where people were bitten by code that
    assumed that chars were 8 bits. The simple fact of the matter is that
    these things - like the behavior of the low-level div and mod
    instructions - change from machine to machine. You have to deal with
    such things when you are writing what is basically assembler. If you
    were using a real HLL, none of these things would be a problem.
    It's no worse than having standard types with loosely defined
    sizes. That's part of the price for working that close to the machine.

    <mike
     
    Mike Meyer, Feb 10, 2005
    #19
  20. Imbaud Pierre

    Mark Jackson Guest

    A C function in calendar.c, encountered in the source code for xvtdl:

    int zeller (month, day, year)
    int month, day, year;
    {
    int century;
    month -= 2; /* Years start on March 1 so adjust standard date */
    if (month < 1) {
    month += 12;
    year--;
    }
    century = year / 100;
    year = (int)year % (int)100;
    return ((int)((2.6 * month - 0.1) + day + year + year / 4 + century / 4 - century * 2) % 7);
    }

    The expression upon which "% 7" acts is negative when "year" is small.
    This caused errors beginning in March 2000; which could be deferred by
    adding a suitably-large multiple of 7 to the expression. The choice
    was obvious. :)
    Fair enough, although I'm not responsible for having named the function
    (which appears to date from 1991). The original author is identified
    in the code (available at
    http://www.alumni.caltech.edu/~mjackson/xvtdl.html) and is findable via
    the Web; you might take the matter up with him.
     
    Mark Jackson, Feb 10, 2005
    #20
    1. Advertisements

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments (here). After that, you can post your question and our members will help you out.