Alternative to Decimal type


Frank Millman

Hi all

I have a standard requirement for a 'decimal' type, to instantiate and
manipulate numeric data that is stored in a database. I came up with a
solution long before the introduction of the Decimal type, which has
been working well for me. I know the 'scale' (number of decimal
places) of the number in advance. When I read the number in from the
database I scale it up to an integer. When I write it back I scale it
down again. All arithmetic is done using integers, so I do not lose
accuracy.

There is one inconvenience with this approach. For example, if I have
a product quantity with a scale of 4, and a price with a scale of 2,
and I want to multiply them to get a value with a scale of 2, I have
to remember to scale the result down by 4. This is a minor chore, and
errors are quickly picked up by testing, but it does make the code a
bit messy, so it would be nice to find a solution.
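
The chore described above can be sketched in a few lines. This is only an illustration in Python 3, not the code from my application: the helper names to_scaled, from_scaled and multiply_scaled are hypothetical, and half-up rounding is an assumption.

```python
from decimal import Decimal

def to_scaled(text, scale):
    """Scale a decimal string up to an integer, e.g. '12.5' at scale 4 -> 125000.
    Digits beyond the requested scale are truncated by int()."""
    return int(Decimal(text) * 10 ** scale)

def from_scaled(value, scale):
    """Scale an integer back down to a decimal string for writing out."""
    sign = '-' if value < 0 else ''
    digits = str(abs(value)).zfill(scale + 1)
    if scale == 0:
        return sign + digits
    return '%s%s.%s' % (sign, digits[:-scale], digits[-scale:])

def multiply_scaled(a, a_scale, b, b_scale, result_scale):
    """Multiply two scaled integers and rescale the product (half-up on ties)."""
    product = a * b                        # carries a_scale + b_scale places
    drop = a_scale + b_scale - result_scale
    if drop <= 0:
        return product * 10 ** (-drop)
    divisor = 10 ** drop
    sign = -1 if product < 0 else 1
    # Integer-only rounding: add half the divisor, then floor-divide.
    return sign * ((abs(product) + divisor // 2) // divisor)

qty = to_scaled('12.5', 4)       # 125000
price = to_scaled('123.45', 2)   # 12345
value = multiply_scaled(qty, 4, price, 2, 2)
print(from_scaled(value, 2))     # 1543.13
```

The step that is easy to forget is the rescale inside multiply_scaled: the raw product carries six decimal places, so four have to be dropped to get back to a scale of 2.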

I am now doing some refactoring, and decided to take a look at the
Decimal type. My initial impressions are that it is quite awkward to
use, that I do not need its advanced features, and that it does not
help solve the one problem I have mentioned above.

I therefore spent a bit of time experimenting with a Number type that
suits my particular requirements. I have come up with something that
seems to work, which I show below.

I have two questions.

1. Are there any obvious problems in what I have done?

2. Am I reinventing the wheel unnecessarily? i.e. can I do the
equivalent quite easily using the Decimal type?

--------------------
from __future__ import division

class Number(object):
    def __init__(self,value,scale):
        self.factor = 10.0**scale
        if isinstance(value,Number):
            value = value.value / value.factor
        self.value = long(round(value * self.factor))
        self.scale = scale

    def __add__(self,other):
        if isinstance(other,Number):
            other = other.value / other.factor
        return Number((self.value/self.factor)+other,self.scale)

    def __sub__(self,other):
        if isinstance(other,Number):
            other = other.value / other.factor
        return Number((self.value/self.factor)-other,self.scale)

    def __mul__(self,other):
        if isinstance(other,Number):
            other = other.value / other.factor
        return Number((self.value/self.factor)*other,self.scale)

    def __truediv__(self,other):
        if isinstance(other,Number):
            other = other.value / other.factor
        return Number((self.value/self.factor)/other,self.scale)

    def __radd__(self,other):
        return self.__add__(other)

    def __rsub__(self,other):
        return Number(other-(self.value/self.factor),self.scale)

    def __rmul__(self,other):
        return self.__mul__(other)

    def __rtruediv__(self,other):
        return Number(other/(self.value/self.factor),self.scale)

    def __cmp__(self,other):
        if isinstance(other,Number):
            other = other.value / other.factor
        this = self.value / self.factor
        if this < other:
            return -1
        elif this > other:
            return 1
        else:
            return 0

    def __str__(self):
        s = str(self.value)
        if s[0] == '-':
            minus = '-'
            s = s[1:].zfill(self.scale+1)
        else:
            minus = ''
            s = s.zfill(self.scale+1)
        return '%s%s.%s' % (minus, s[:-self.scale], s[-self.scale:])
--------------------

Example usage -
1543.13 [scale is taken from left-hand operand]
1543.1250 [scale is taken from left-hand operand]
1543.13 [scale is taken from Number instance]
--------------------

At this stage I have not built in any rounding options, but this can
be done later if I find that I need it.

Any comments will be welcome.

Thanks

Frank Millman
 

Paul Hankin

Hi Frank,
I don't know why you think Decimal is complicated: it has some
advanced features, but for what you seem to be doing it should be easy
to replace your 'Number' with it. In fact, it makes things simpler
since you don't have to worry about 'scale'.

Your examples convert easily:

from decimal import Decimal
qty = Decimal('12.5')
price = Decimal('123.45')

print price * qty
print qty * price
print (qty * price).quantize(Decimal('0.01'))
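
One detail worth checking when converting: under the default context, quantize rounds half to even, so 1543.125 comes out as 1543.12 rather than the 1543.13 that the Number example prints. A small Python 3 sketch of the same calculation, passing the rounding mode explicitly (choosing ROUND_HALF_UP is an assumption about what is wanted):

```python
from decimal import Decimal, ROUND_HALF_UP

qty = Decimal('12.5')
price = Decimal('123.45')
cent = Decimal('0.01')

product = price * qty
print(product)                                          # 1543.125

# Default context rounding is ROUND_HALF_EVEN, so the tie goes to the even digit.
default_rounded = product.quantize(cent)
print(default_rounded)                                  # 1543.12

# Asking for half-up rounding matches the Number output.
half_up_rounded = product.quantize(cent, rounding=ROUND_HALF_UP)
print(half_up_rounded)                                  # 1543.13
```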
 

Frank Millman

I thought I might be missing something obvious. This does indeed look
easy. Thanks, Paul

Frank
 

Mel

Frank said:
[snip]
--------------------
from __future__ import division

class Number(object):
    def __init__(self,value,scale):
        self.factor = 10.0**scale
        if isinstance(value,Number):
            value = value.value / value.factor

I think this could lead to trouble. One complaint against binary floating
point is that it messes up low-order decimal digits, and this ensures that
all calculations are effectively done in binary floating point. Better, I
think would be

        if isinstance (value, Number):
            self.value = value.value
            self.scale = scale + value.scale

and be done with it. Of course, this means self.scale no longer gives the
preferred number of fractional digits. My bias: I did a DecimalFloat class
way back when, when Decimal was being discussed, and separated the exponent
for calculations from the rounding precision for display.
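
A small Python 3 illustration of the low-order-digit complaint, offered as a sketch rather than as a statement about what the Number class does with any particular input. The value 1.005 is picked because it sits on a rounding tie at scale 2; as a float literal it is already slightly below 1.005 before any arithmetic happens.

```python
from decimal import Decimal, ROUND_HALF_UP

scale = 2
factor = 10 ** scale

# Float path: 1.005 * 100 evaluates to 100.49999999999999, which rounds to 100.
via_float = int(round(1.005 * factor))

# Decimal path: the string '1.005' is held exactly, so the tie rounds up to 101.
scaled = Decimal('1.005') * factor
via_decimal = int(scaled.quantize(Decimal('1'), rounding=ROUND_HALF_UP))

print(via_float)     # 100
print(via_decimal)   # 101
```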

Cheers, Mel
 

Frank Millman

I think this could lead to trouble.  One complaint against binary floating
point is that it messes up low-order decimal digits, and this ensures that
all calculations are effectively done in binary floating point.  Better, I
think would be

        if isinstance (value, Number):
            self.value = value.value
            self.scale = scale + value.scale

and be done with it.

Thanks for the reply, Mel. I don't quite understand what you mean.
Bear in mind my next line, which you did not quote -

        if isinstance(value,Number):
            value = value.value / value.factor
-->     self.value = long(round(value * self.factor))

I do understand that binary floating point does not always give the
expected results when trying to do decimal arithmetic.

However, given a float f1 and a scaling factor s, I thought that if I
did the following -

i1 = long(round(f1 * s)) # scale up to integer
f2 = i1 / s # reduce back to float
i2 = long(round(f2 * s)) # scale up again

then i2 would always be equal to i1.

If you are saying that there could be situations where this is not
guaranteed, then I agree with you that what I have written is
dangerous.

I will do some more testing to see if this could happen. I suppose
that if the scaling factor is very high it could cause a problem, but
I cannot envisage it exceeding 6 in my application.
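
The kind of test I have in mind could look like the sketch below, written for Python 3 (where / is true division and int replaces long). It is an empirical check over a limited range of integers, not a proof, and it says nothing about arithmetic performed on the floats in between. The names round_trip_ok and find_failures are made up for this example.

```python
def round_trip_ok(i1, scale):
    """Scale an integer down to a float and back up; report whether it survives."""
    s = 10.0 ** scale
    f2 = i1 / s                  # reduce to float
    i2 = int(round(f2 * s))      # scale up again
    return i2 == i1

def find_failures(limit, max_scale):
    """Collect every (integer, scale) pair in the tested range that does not round-trip."""
    return [(i1, scale)
            for scale in range(max_scale + 1)
            for i1 in range(-limit, limit + 1)
            if not round_trip_ok(i1, scale)]

failures = find_failures(20000, 6)
print(len(failures))
```

An empty list only covers the integers and scales that were tried; larger values would need their own run.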

Thanks

Frank
 

Frank Millman

Thanks for the reply, Mel. I don't quite understand what you mean.

As so often happens, after I sent my reply I re-read your post and I
think I understand what you are getting at.

One problem with my approach is that I am truncating the result down
to the desired scale factor every time I create a new instance. This
could result in a loss of precision if I chain a series of instances
together in a calculation. I think that what you are suggesting avoids
this problem.
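
To make the loss of precision concrete, here is a small Python 3 sketch using Decimal. The amounts are invented and half-up rounding is an assumption; the point is only that rounding every intermediate result can give a different answer from rounding once at the end.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal('0.01')

def round_cents(x):
    """Round a Decimal to two places, half-up."""
    return x.quantize(CENT, rounding=ROUND_HALF_UP)

unit = Decimal('0.335')   # invented intermediate amount with three decimal places

# Round every intermediate result, as happens when each new instance is cut to scale 2.
rounded_each_step = sum((round_cents(unit) for _ in range(3)), Decimal('0'))

# Keep full precision through the chain and round once at the end.
rounded_once = round_cents(unit * 3)

print(rounded_each_step)  # 1.02
print(rounded_once)       # 1.01
```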

I will read your message again carefully. I think it will lead to a
rethink of my approach.

Thanks again

Frank

P.S. Despite my earlier reply to Paul, I have not abandoned the idea
of using my Number class as opposed to the standard Decimal class.

I did a simple test of creating two instances and adding them
together, using both methods, and timing them. Decimal came out 6
times slower than Number.

Is that important? Don't know, but it might be.
 

Diez B. Roggisch

Frank said:
[snip]

P.S. Despite my earlier reply to Paul, I have not abandoned the idea
of using my Number class as opposed to the standard Decimal class.

I did a simple test of creating two instances and adding them
together, using both methods, and timing them. Decimal came out 6
times slower than Number.

Is that important? Don't know, but it might be.

That is because it uses arbitrary-precision integers instead of IEEE
floats. It pays this price so that you get decimal rounding errors instead
of binary ones. Rounding errors you still get, though...

If you are doing money calculations with your Number class, you certainly
want Decimal instead.

If all you want is auto-rounding... then you might not care.

Diez
 

Ethan Furman

Mel said:
Frank said:
[snip]
--------------------
from __future__ import division

class Number(object):
    def __init__(self,value,scale):
        self.factor = 10.0**scale
        if isinstance(value,Number):
            value = value.value / value.factor

I think this could lead to trouble. One complaint against binary floating
point is that it messes up low-order decimal digits, and this ensures that
all calculations are effectively done in binary floating point. Better, I
think would be

        if isinstance (value, Number):
            self.value = value.value
            self.scale = scale + value.scale

and be done with it. Of course, this means self.scale no longer gives the
preferred number of fractional digits. My bias: I did a DecimalFloat class
way back when, when Decimal was being discussed, and separated the exponent
for calculations from the rounding precision for display.

What about a little rewrite so the current implementation is like the
original, and all calculations are done as integers? Or is this just
getting closer and closer to what Decimal does?

[only lightly tested]

from __future__ import division

class Number(object):
    def __init__(self, value, scale):
        # scale and factor are set first so that both branches below have them
        if not scale:
            scale += 1
        self.scale = scale
        self.factor = 10**self.scale
        if isinstance(value, Number):
            # rescale the other instance's integer to this scale
            delta = value.scale - scale
            if delta > 0:
                self.value = value.value // 10**delta
            elif delta < 0:
                self.value = value.value * 10**abs(delta)
            else:
                self.value = value.value
        else:
            self.value = long(round(value * self.factor))

    def __add__(self, other):
        answer = Number(other, self.scale)
        answer.value += self.value
        return answer

    def __sub__(self, rhs):
        answer = Number(rhs, self.scale)
        answer.value = self.value - answer.value
        return answer

    def __mul__(self, other):
        answer = Number(other, self.scale)
        answer.value *= self.value
        answer.value //= answer.factor
        return answer

    def __truediv__(self, rhs):
        # long division, one decimal digit per pass
        answer = Number(rhs, self.scale)
        quotient = 0
        divisor = answer.value
        dividend = self.value
        for i in range(self.scale+1):
            quotient = (quotient * 10) + (dividend // divisor)
            dividend = (dividend % divisor) * 10
        answer.value = quotient
        return answer

    def __radd__(self, lhs):
        return self.__add__(lhs)

    def __rsub__(self, lhs):
        answer = Number(lhs, self.scale)
        answer.value = answer.value - self.value
        return answer

    def __rmul__(self, lhs):
        return self.__mul__(lhs)

    def __rtruediv__(self, lhs):
        answer = Number(lhs, self.scale)
        quotient = 0
        divisor = self.value
        dividend = answer.value
        for i in range(self.scale+1):
            quotient = (quotient * 10) + (dividend // divisor)
            dividend = (dividend % divisor) * 10
        answer.value = quotient
        return answer

    def __cmp__(self, rhs):
        other = Number(rhs, self.scale)
        if self.value < other.value:
            return -1
        elif self.value > other.value:
            return 1
        else:
            return 0

    def __str__(self):
        s = str(self.value)
        if s[0] == '-':
            minus = '-'
            s = s[1:].zfill(self.scale+1)
        else:
            minus = ''
            s = s.zfill(self.scale+1)
        return '%s%s.%s' % (minus, s[:-self.scale], s[-self.scale:])
 

Frank Millman

Thanks to all for the various replies. They have all helped me to
refine my ideas on the subject. These are my latest thoughts.

Firstly, the Decimal type exists, it clearly works well, it is written
by people much cleverer than me, so I would need a good reason not to
use it. Speed could be a good reason, provided I am sure that any
alternative is 100% accurate for my purposes.

My approach is based on expressing a decimal number as a combination
of an integer and a scale, where scale means the number of digits to
the right of the decimal point.

Therefore 0.04 is integer 4 with scale 2, 1.1 is integer 11 with scale
1, -123.456 is integer -123456 with scale 3. I am pretty sure that any
decimal number can be accurately represented in this form.
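
For what it is worth, Decimal itself holds a sign, a tuple of digits and an exponent, which is much the same idea. The Python 3 sketch below uses Decimal.as_tuple() to pull out the integer and scale for the three examples; the function name integer_and_scale is invented here, and only finite values are handled.

```python
from decimal import Decimal

def integer_and_scale(text):
    """Return (integer, scale) for a finite decimal string, e.g. '0.04' -> (4, 2)."""
    sign, digits, exponent = Decimal(text).as_tuple()
    integer = int(''.join(str(d) for d in digits))
    if sign:
        integer = -integer
    if exponent > 0:
        # Values such as '1E+2' have no fractional digits; fold the exponent in.
        integer *= 10 ** exponent
        exponent = 0
    return integer, -exponent

print(integer_and_scale('0.04'))      # (4, 2)
print(integer_and_scale('1.1'))       # (11, 1)
print(integer_and_scale('-123.456'))  # (-123456, 3)
```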

All arithmetic is carried out using integer arithmetic, so although
there may be rounding differences, there will not be the spurious
differences thrown up by trying to use floats for decimal arithmetic.

I use a class called Number, with two attributes - an integer and a
scale. My first attempt required these two to be provided every time
an instance was created. Then I realised that this would cause loss of
precision if I chain a series of instances together in a calculation.
The constructor can now accept any of the following forms -

1. A number (either integer or float) and a scale. It uses the scale
factor to scale the number up and round it to the appropriate integer.

2. Another Number instance. It takes the integer and scale from the
other instance.

3. An integer, with no scale. It uses the integer, and assumes a scale
of zero.

4. A float in string format (e.g. '1.1') with no scale. It uses the
number of digits to the right as the scale, and scales the number up
to the appropriate integer.

For addition, subtraction, multiplication and division, the 'other'
number can be any of 2, 3, or 4 above. The result is a new Number
instance. The scale of the new instance is based on the following rules -

For addition and subtraction, the new scale is the greater of the two
scales on the left and right hand sides.

For multiplication, the new scale is the sum of the two scales on the
left and right hand sides.

For division, I could not think of an appropriate rule, so I just
hard-coded a scale of 9. I am sure this will give sufficient precision for
any calculation I am likely to encounter.
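
The scale rules above can be sketched on bare (integer, scale) pairs. This is a Python 3 illustration and not the actual Number class: the Scaled tuple and the function names are invented for it, and using floor division in div is an assumption about the rounding policy.

```python
from collections import namedtuple

# Hypothetical stand-in for a Number instance: an integer plus a scale.
Scaled = namedtuple('Scaled', 'value scale')

DIVISION_SCALE = 9   # the hard-coded scale for division results

def _align(a, b):
    """Bring both integers up to the greater of the two scales."""
    scale = max(a.scale, b.scale)
    return (a.value * 10 ** (scale - a.scale),
            b.value * 10 ** (scale - b.scale),
            scale)

def add(a, b):
    x, y, scale = _align(a, b)
    return Scaled(x + y, scale)

def sub(a, b):
    x, y, scale = _align(a, b)
    return Scaled(x - y, scale)

def mul(a, b):
    # The product of the integers carries the sum of the two scales.
    return Scaled(a.value * b.value, a.scale + b.scale)

def div(a, b):
    # Scale the numerator up first so the integer quotient keeps 9 decimal places.
    numerator = a.value * 10 ** (DIVISION_SCALE + b.scale)
    denominator = b.value * 10 ** a.scale
    return Scaled(numerator // denominator, DIVISION_SCALE)   # floors, by assumption

qty = Scaled(125000, 4)     # 12.5000
price = Scaled(12345, 2)    # 123.45
print(add(qty, price))      # Scaled(value=1359500, scale=4)
print(mul(qty, price))      # Scaled(value=1543125000, scale=6)
print(div(Scaled(1, 0), Scaled(3, 0)))   # Scaled(value=333333333, scale=9)
```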

My Number class is now a bit more complicated than before, so the
performance is not as great, but I am still getting a four-fold
improvement over the Decimal type, so I will continue running with my
version for now.

My main concern is that my approach may be naive, and that I will run
into situations that I have not catered for, resulting in errors. If
this is the case, I will drop this like a hot potato and stick to the
Decimal type. Can anyone point out any pitfalls I might be unaware of?

I will be happy to show the code for the new Number class if anyone is
interested.

Thanks

Frank
 

Frank Millman

Out of curiosity, what is the purpose of these numbers?  Do they
represent money, measurements, or something else?

I am writing a business/accounting application. The numbers represent
mostly, but not exclusively, money.

Examples -

Multiply a selling price (scale 2) by a product quantity (scale 4) to
get an invoice value (scale 2), rounded half-up.

Multiply an invoice value (scale 2) by a tax rate (scale 2) to get a
tax value (scale 2), rounded down.

Divide a currency value (scale 2) by an exchange rate (scale 6) to get
a value in a different currency (scale 2).

Divide a product quantity (scale 4) by a pack size (scale 2) to get an
equivalent quantity in a different pack size.

In my experience, the most important thing is to be consistent when
rounding. If you leave the rounding until the presentation stage, you
can end up with an invoice value plus a tax amount differing from the
invoice total by +/- 0.01. Therefore I always round a result to the
required scale before 'freezing' it, whether in a database or just in
an object instance.
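
The first two examples, written with Decimal in Python 3 as a sketch. The figures and the tax rate of 0.14 are invented for illustration; the rounding modes are the ones named above.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

CENT = Decimal('0.01')

price = Decimal('123.45')       # selling price, scale 2
quantity = Decimal('12.5000')   # product quantity, scale 4
tax_rate = Decimal('0.14')      # invented rate, held as a fraction

# Round each result to scale 2 before 'freezing' it.
invoice_value = (price * quantity).quantize(CENT, rounding=ROUND_HALF_UP)
tax_value = (invoice_value * tax_rate).quantize(CENT, rounding=ROUND_DOWN)
invoice_total = invoice_value + tax_value

print(invoice_value)   # 1543.13
print(tax_value)       # 216.03
print(invoice_total)   # 1759.16
```

Because the value and the tax are both rounded before they are added, the stored total always equals the sum of the two stored amounts.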

Frank
 

Ethan Furman

Frank said:
Thanks to all for the various replies. They have all helped me to
refine my ideas on the subject. These are my latest thoughts.

Firstly, the Decimal type exists, it clearly works well, it is written
by people much cleverer than me, so I would need a good reason not to
use it. Speed could be a good reason, provided I am sure that any
alternative is 100% accurate for my purposes.
[snip]

For addition, subtraction, multiplication and division, the 'other'
number can be any of 2, 3, or 4 above. The result is a new Number
instance. The scale of the new instance is based on the following rule

For addition and subtraction . . .
For multiplication . . .
For division . . .

Out of curiosity, what is the purpose of these numbers? Do they
represent money, measurements, or something else? The reason I ask is
way back in physics class (or maybe chemistry... it was way back :) I
was introduced to the idea of significant digits -- that idea being that
a measured number is only accurate to a certain degree, and calculations
using that number therefore could not be more accurate. Sort of like a
built-in error range.
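
A rough Python 3 sketch of the idea for multiplication only, leaning on the fact that a decimal Context's prec setting counts significant digits. The function name multiply_sig is invented, and this is nowhere near a full significant-figures class (addition, for one, follows decimal places rather than digit counts).

```python
from decimal import Context, Decimal

def multiply_sig(a, b):
    """Multiply two decimal strings, keeping only as many significant
    digits as the less precise operand has."""
    x, y = Decimal(a), Decimal(b)
    digits = min(len(x.as_tuple().digits), len(y.as_tuple().digits))
    # prec limits the number of significant digits in the result.
    return Context(prec=digits).multiply(x, y)

print(multiply_sig('1.23', '4.5678'))   # 5.62
print(multiply_sig('2.0', '3.14159'))   # 6.3
```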

I'm thinking of developing the class in the direction of maintaining the
significant digits through calculations... mostly as I think it would be
fun, and it also seems like a good test case to get me in the habit of
unit testing. I'll call it something besides Number, though. :)

Is anybody aware of such a class already in existence?
 

Terry Reedy

| Thanks to all for the various replies. They have all helped me to
| refine my ideas on the subject. These are my latest thoughts.
|
| Firstly, the Decimal type exists, it clearly works well, it is written
| by people much cleverer than me, so I would need a good reason not to
| use it. Speed could be a good reason, provided I am sure that any
| alternative is 100% accurate for my purposes.

The Decimal module is a Python (now C coded in 2.6/3.0, I believe)
implementation of a particular IBM-sponsored standard. The standard is
mostly good, but it is somewhat large with lots of options (such as
rounding modes) and a bit of garbage (the new 'logical' operations, for
instance) added by IBM for proprietary purposes. Fortunately, one can
ignore the latter once you recognize them for what they are.

As Nick indicated, it is not the first in its general category. And I
believe the Decimal implementation could have been done differently.

By writing your own class, you get just what you need with the behavior you
want. I think just as important is the understanding of many of the issues
involved, even if you eventually switch to something else. That is a
pretty good reason in itself.

tjr
 

Frank Millman

Thanks to all for the various replies. They have all helped me to
refine my ideas on the subject. These are my latest thoughts.
[snip]

My main concern is that my approach may be naive, and that I will run
into situations that I have not catered for, resulting in errors. If
this is the case, I will drop this like a hot potato and stick to the
Decimal type. Can anyone point out any pitfalls I might be unaware of?

I will be happy to show the code for the new Number class if anyone is
interested.

Thanks again for all the really useful replies.

I will have a look at gmpy, and I will study FixedPoint.py closely.

Frank
 
