negative integer division

I

Imbaud Pierre

integer division and modulo gives different results in c and python,
when negative numbers
are involved. take gdb as a widely available c interpreter
print -2 /3
0 for c, -1 for python.
more amazing, modulos of negative number give negative values! (in c).
from an algebraic point of view, python seems right, but I thought
python conformity to the underlying c compiler was a strong commitment,
obviously not here, and I found no explanation whatsoever in python doc.
no actual programming challenge for me here, I just had this bug, after
confidently translating from python to c.
 
R

Robert Kern

Imbaud said:
integer division and modulo gives different results in c and python,
when negative numbers
are involved. take gdb as a widely available c interpreter
print -2 /3
0 for c, -1 for python.
more amazing, modulos of negative number give negative values! (in c).
from an algebraic point of view, python seems right, but I thought
python conformity to the underlying c compiler was a strong commitment,
obviously not here, and I found no explanation whatsoever in python doc.
no actual programming challenge for me here, I just had this bug, after
confidently translating from python to c.

I don't think there's much *commitment* to follow the behaviour of your
C compiler. It's more of a default stance when the issues get too
complicated for Python to implement itself. Thus, much of the floating
point behaviour defaults to the behaviour of your platform because
floating point math is hard and getting consistent behaviour on lots of
platforms is really hard.

Integers are a piece of cake, relatively speaking, and I think that it's
worth fixing up a few warts from the C behaviour.

As for documentation, see
http://www.python.org/doc/current/lib/typesnumeric.html

--
Robert Kern
(e-mail address removed)

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
 
M

Mark Jackson

Imbaud Pierre said:
integer division and modulo gives different results in c and python,
when negative numbers
are involved. take gdb as a widely available c interpreter
print -2 /3
0 for c, -1 for python.
more amazing, modulos of negative number give negative values! (in c).
from an algebraic point of view, python seems right, but I thought
python conformity to the underlying c compiler was a strong commitment,

AIUI the C standard is silent on the issue, and hence the C behavior is
implementation-dependent. Anyway back in 2000 I found and fixed a
Y2K-related problem in an open-source C program (xvtdl) which was down
to precisely this misbehavior. While diagnosing the problem I
implemented the algorithm in Python for test purposes, and was led
astray for a while by the fact that it *didn't* fail!

A: 42

Q: What multiple of 7 did I add to the critical expression in the Zeller
algorithm so it would remain nonnegative for the next few centuries?
 
S

Scott David Daniels

Imbaud said:
integer division and modulo gives different results in c and python,
when negative numbers
are involved. take gdb as a widely available c interpreter
print -2 /3
0 for c, -1 for python.
more amazing, modulos of negative number give negative values! (in c).
from an algebraic point of view, python seems right, but I thought
python conformity to the underlying c compiler was a strong commitment,
obviously not here, and I found no explanation whatsoever in python doc.
no actual programming challenge for me here, I just had this bug, after
confidently translating from python to c.
If you read the C standard, you can see that the C compiler is free to
make -1 / 3 either -1 or 0 (but it must describe what it does). The
decision is usually to do whatever the underlying CPU does. The
rationale is that often you will be dividing positive by positive,
and you don't want the compiler sprinkling around a lot of test code.

While the difference in compiled C is a substantial time penalty
(DIVIDE vs. DIVIDE; COMPARE; JUMPGE), in Python the trade-off is
adding two instructions to many (50? 100?). Python doesn't really
mind paying for a few extra instructions to get the canonical value,
and the guarantee simplifies the user's code.

--Scott David Daniels
(e-mail address removed)
 
J

Jive Dadson

Python does it right. C is allowed to do it anyway it likes, which was
a stupifyingly horrible decision, IMHO.

Way back when, there was a language named Pascal. I lobbied the Pascal
standards committee to define the modulus operator correctly, which they
eventually did. To my astonishment, they proceded to define the
division operator backwards to the modulus operator, so that division
and mod did not play together correctly. Duh.
 
D

Dan Bishop

Mark said:
commitment,

AIUI the C standard is silent on the issue, and hence the C behavior is
implementation-dependent.

The *original* C standard is silent on the issue, but the current
standard requires the wrong behavior.
 
M

Mike Meyer

Jive Dadson said:
Python does it right. C is allowed to do it anyway it likes, which was
a stupifyingly horrible decision, IMHO.

C only does it wrong if you think that C is a high level language. It
isn't - it's a portable assembler. As such, low level things (like
this, or what happens on integer overflow, or ...) are left up to the
implementation, so it can do what's most natural for the underlying
hardware. This means that when you don't care - which I'd argue is
most of the time - you get the fastest thing the machine will do. When
you do care, you have to take care of it yourself. Of course, if you
care, you probably shouldn't be writing in assembler, you should
probably be writing in a high level language - which will make sure
the low level things get done right, irregardless of what the
underlying machine does.

Now, I'll agree with you if you want to argue that some machines do
negative integer division in stupifyingly horrible ways.

<mike
 
J

Jive Dadson

Mike said:
C only does it wrong if you think that C is a high level language.

I didn't say it does it wrong. I said it does it anyway it likes --
maybe right, maybe wrong. There *is* a right way, IMHO. Python does it
that way.
[C] isn't - it's a portable assembler.

I've heard that many times, but it makes no sense to me. By definition,
the syntax of an assembly language closely resembles the format of
individual hardware instructions for a particular processor. An
assembler assembles individual hardware instructions. Back in the day,
Unix (written largely in C) and Steve Johnson's pcc (the *portable* C
compiler) together constituted a big leap forward. Implementing Unix on
new processors was infinitely easier than porting OS's written in
assembly language. So I disagree on two counts: C code is not entirely
portable, (for example, division and mod may not work the same on two
different machines), and a C compiler is most certainly not an
assembler.

Now, I'll agree with you if you want to argue that some machines do
negative integer division in stupifyingly horrible ways.

That's why I think it was a stupifyingly horrible decision.
Understandable, but in the end an s.h.d. nonetheless. It would have
been okay to define / and % correctly, in the mathematical sense, but
also provide functions quick_div() and quick_mod() that were guaranteed
to work correctly only when both arguments were positive. The way they
did it is just too error prone, akin to early optimization. It's bitten
me before, when I was called on to port code (which I did not write)
from one machine to another. Having standard operators with
under-defined behavior is just inviting trouble: long debugging
sessions, or worse, unexplained failures in the field. Of course you
and I would avoid all the pitfalls at the start. :)

.... and now back to your regularly scheduled Python newsgroup.
 
G

Grant Edwards

[C] isn't - it's a portable assembler.

I've heard that many times, but it makes no sense to me.

I think the point is that C is a low-level, hardware twiddling
language to be used by people writing things like kernel code --
something that was always done in assembler before C came
along.
That's why I think it was a stupifyingly horrible decision.

For a language meant to write user-space applications where one
probably cares what happens when a division results in a
negative integer, it is a horrible decision. For a "portable
assembler" used to write device drivers it makes sense. People
writing that sort of code presumably know how their hardware
behaves, don't expect that everything write is portable, and
just don't do division with negative numbers. When they do
division, it's with postive numbers and they don't want to
waste the extra clock cycles to do it in a way that's
machine-independant for negative numbers.

The fact that C ended up in the rather inappropriate role of
a user-land application language is different problem.
 
J

Jive Dadson

Grant said:
[C] isn't - it's a portable assembler.

I've heard that many times, but it makes no sense to me.

I think the point is that C is a low-level, hardware twiddling
language to be used by people writing things like kernel code --

And Python interpreters?
The fact that C ended up in the rather inappropriate role of
a user-land application language is different problem.

In the early 80's, either C was the "appropriate language" or there was
none ... and that's coming from someone who wrote a commercial
Pascal compiler, runtime support, and debugger. I did it all in C.
Pascal, as we all know, was ill-conceived. C++ was a momentous
advance, but it intensionally inherited many of C's warts.

I've forgotten what we are arguing about, but I'm sure I'm right.

J.
 
G

Grant Edwards

[C] isn't - it's a portable assembler.

I've heard that many times, but it makes no sense to me.

I think the point is that C is a low-level, hardware twiddling
language to be used by people writing things like kernel code --

And Python interpreters?

No. That's my point. C isn't a good language in which to write
user applications such as a Python language. One uses C when
the other choice is assembly, not when the other choice is
"real" high level language like Python, or Modula-3, or
Smalltalk, or whatever.
In the early 80's, either C was the "appropriate language" or
there was none ... and that's coming from someone who wrote a
commercial Pascal compiler, runtime support, and debugger. I
did it all in C. Pascal, as we all know, was ill-conceived.

I never thought so. I did embedded systems development using
Pascal and quite enjoyed it. Later when Pascal waned and C
waxed, I thought C was a definite step backwards. I thought
Modula-2 and Modula-3 were both good languages as well.
C++ was a momentous advance,

Now C++ _was_ ill-conceived. It's more of an agglomeration of
features than Perl. And dangerous to boot: at least Perl tries
to protect you from memory leaks and stray pointers.
but it intensionally inherited many of C's warts.

And added a bunch of it's own.

This is pretty much completely off-topic now. :)
 
P

Peter Hansen

Grant said:
This is pretty much completely off-topic now. :)

No discussion of how lame other languages are is ever
completely off-topic in comp.lang.python. After all,
these discussions continue to remind us how lucky we
all are to be able to program in Python, and that
can only be a good thing. ;-)

-Peter
 
C

Carl Banks

Jive said:
That's why I think it was a stupifyingly horrible decision.
Understandable, but in the end an s.h.d. nonetheless.


C language is chock-full of things that are stupidly horrible
decisions. This is one of very least of them.

The philosophy of C was to turn opeations into one or two instructions,
if they could. Just about any operator in C can be, and having /
invoke a small subprogram would have been the absolute wrong thing to
do. The other philosophy of C was to make it easy for the programmer
to hand-optimize stuff. Which means no way was it going to force the
programmer to use quick_div to get the division in two instructions.
It was the right decision.

I would say that a better thing to call this is a stupidly horrible
circumstance. The circumstance is that an archaic language designed to
be hand-optimized and has unfortuntate importabilities has ever became
everyone's shizzle.

(I'm hating C today; I was asked to write something in C and I can't
use anything else because someone has to use to code.)
 
J

John Machin

A: 42

Q: What multiple of 7 did I add to the critical expression in the Zeller
algorithm so it would remain nonnegative for the next few centuries?

What are you calling "the Zeller algorithm", and what is the "critical
expression"?

There _is_ something called "Zeller's congruence":

if m is the month number in an adjusted or computational year (m == 0
=> March, m == 11 => Feb), then the number of days in the adjusted
year before month m is [Python notation] (13*m + 2) // 5 + m * 28

Alternatively (153*m + 2) // 5

I believe that strictly the term "Zeller's congruence" applies only to
the (13*m + 2)//5 part. In any case, all of the above is quite
independent of the year, and there is no possibility of going
negative.

I've no doubt you came across a stuffed-up date-to-days calculation
routine and fixed it, but it's a bit unfair to lumber Zeller with the
blame. If it was a days-to-date routine, then Zeller is not even
standing next to the real target.

Cheers,
John
 
M

Mike Meyer

Jive Dadson said:
Mike said:
[C] isn't - it's a portable assembler.

I've heard that many times, but it makes no sense to me. By definition,
the syntax of an assembly language closely resembles the format of
individual hardware instructions for a particular processor.

Um, no. The syntax of an assembly language is totally unrelated to the
format of the individual hardware instructions for a particular
processor. Typically, the syntax of an assembly language is such that
one obvious syntactical element (for instance, a line) generates one
hardware instruction. Usually, mnemonics of some kind were used to
denote which instruction to generate. However, it doesn't have to be
that way. Whitesmith had a z80/8080 assembler in which you wrote "a +=
b" to add the contents of register b to register a.
An
assembler assembles individual hardware instructions. Back in the day,
Unix (written largely in C) and Steve Johnson's pcc (the *portable* C
compiler) together constituted a big leap forward.

A big leap forward *from assembler*. The development space that C
occupied at that time was the kernel and system utilities - things
that were usually (though by no means always) done in assembler before
the arrival of C. As a tool for producing robust, reliable code it
pretty much sucks, because it has so many of the flaws that assembler
has. Calling it a high level language is a disservice to the HLLs of
the era.
Implementing Unix on
new processors was infinitely easier than porting OS's written in
assembly language. So I disagree on two counts: C code is not entirely
portable, (for example, division and mod may not work the same on two
different machines), and a C compiler is most certainly not an
assembler.

No language is entirely portable. C is *much* more portable than the
assembler languages that it displaced.

Note that C has displaced assembler in other areas - for instance,
it's not uncommon to find compilers that use "ANSI C" as the target
machine. By using C as a portable assembler instead of generating
machine code, the number of supported platforms increases
dramatically.
That's why I think it was a stupifyingly horrible decision.
Understandable, but in the end an s.h.d. nonetheless. It would have
been okay to define / and % correctly, in the mathematical sense,
but also provide functions quick_div() and quick_mod() that were
guaranteed to work correctly only when both arguments were positive.
The way they did it is just too error prone, akin to early
optimization.

The way they did it was exactly right for the target applications
(i.e. - the v6 Unix kernel and system utilities).
It's bitten me before, when I was called on to port
code (which I did not write) from one machine to another.

There is no such thing as a portable program - merely ported
programs. I've been bitten by the code assuming that ints were two
bytes long, and know of cases where people were bitten by code that
assumed that chars were 8 bits. The simple fact of the matter is that
these things - like the behavior of the low-level div and mod
instructions - change from machine to machine. You have to deal with
such things when you are writing what is basically assembler. If you
were using a real HLL, none of these things would be a problem.
Having standard operators with
under-defined behavior is just inviting trouble: long debugging
sessions, or worse, unexplained failures in the field. Of course you
and I would avoid all the pitfalls at the start. :)

It's no worse than having standard types with loosely defined
sizes. That's part of the price for working that close to the machine.

<mike
 
M

Mark Jackson

What are you calling "the Zeller algorithm", and what is the "critical
expression"?

A C function in calendar.c, encountered in the source code for xvtdl:

int zeller (month, day, year)
int month, day, year;
{
int century;
month -= 2; /* Years start on March 1 so adjust standard date */
if (month < 1) {
month += 12;
year--;
}
century = year / 100;
year = (int)year % (int)100;
return ((int)((2.6 * month - 0.1) + day + year + year / 4 + century / 4 - century * 2) % 7);
}

The expression upon which "% 7" acts is negative when "year" is small.
This caused errors beginning in March 2000; which could be deferred by
adding a suitably-large multiple of 7 to the expression. The choice
was obvious. :)
I've no doubt you came across a stuffed-up date-to-days calculation
routine and fixed it, but it's a bit unfair to lumber Zeller with the
blame. If it was a days-to-date routine, then Zeller is not even
standing next to the real target.

Fair enough, although I'm not responsible for having named the function
(which appears to date from 1991). The original author is identified
in the code (available at
http://www.alumni.caltech.edu/~mjackson/xvtdl.html) and is findable via
the Web; you might take the matter up with him.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,754
Messages
2,569,528
Members
45,000
Latest member
MurrayKeync

Latest Threads

Top