My own accounting python euler problem

V

vsoler

In the accounting department I am working for we are from time to time
confronted to the following problem:

A customer sends us a check for a given amount, but without specifying
what invoices it cancels. It is up to us to find out which ones the
payment corresponds to.

For example, say that the customer has the following outstanding
invoices: $300, $200, $50; and say that the check is for $250. This
time it is clear, the customer is paying bills $200 and $50.

However, let's now say that the outstanding invoices are $300, $200,
$100 and that the check is for $300. In this case there are already
two possibilities. The customer is paying the $300 invoice or the $200
and $100. In other words, there is more than one solution to the
problem.

My first question is:
1. given a list of invoives I=[500, 400, 450, 200, 600, 700] and a
check Ch=600
how can I print all the different combinations of invoices that the
check is possibly cancelling

1.a. first approach using "brute force", that is, exploring all
different combinations: every single invoice, all of 2-element tuples,
3-element tuples, etc...

1.b can all the solutions be found without exploring all possible
combinations? some problems can be solved by discarding some invoices,
for example those whose amounts are greater than the amount of the
check. Any ideas?

My second question is:
2. this time there are also credit notes outstanding, that is,
invoices with negative amounts. For example, I=[500, 400, -100, 450,
200, 600, -200, 700] and a check Ch=600

2.a is the "brute force" method used in 1.a still applicable now that
"I" contains negative values?

2.b same as 1.b. However, this time I can imagen that the number of
invoices that can be discarded is a lot more reduced.

I am a fan of Python, which I find very elegant, powerful and easy to
develop with. I would like to find answers to the problems described
above, partially because I am still learning python, and I would like
to make use of it.

Can anybody help?
 
R

Robert P. J. Day

In the accounting department I am working for we are from time to
time confronted to the following problem:

A customer sends us a check for a given amount, but without
specifying what invoices it cancels. It is up to us to find out
which ones the payment corresponds to.

For example, say that the customer has the following outstanding
invoices: $300, $200, $50; and say that the check is for $250. This
time it is clear, the customer is paying bills $200 and $50.

However, let's now say that the outstanding invoices are $300, $200,
$100 and that the check is for $300. In this case there are already
two possibilities. The customer is paying the $300 invoice or the
$200 and $100. In other words, there is more than one solution to
the problem.

My first question is: 1. given a list of invoives I=[500, 400, 450,
200, 600, 700] and a check Ch=600 how can I print all the different
combinations of invoices that the check is possibly cancelling

that sounds like the classic knapsack problem:

http://www.itl.nist.gov/div897/sqg/dads/HTML/knapsackProblem.html

it's NP-complete.

rday
--

========================================================================
Robert P. J. Day Waterloo, Ontario, CANADA

Linux Consulting, Training and Kernel Pedantry.

Web page: http://crashcourse.ca
Twitter: http://twitter.com/rpjday
========================================================================
 
R

Robert P. J. Day

In the accounting department I am working for we are from time to
time confronted to the following problem:

A customer sends us a check for a given amount, but without
specifying what invoices it cancels. It is up to us to find out
which ones the payment corresponds to.

For example, say that the customer has the following outstanding
invoices: $300, $200, $50; and say that the check is for $250. This
time it is clear, the customer is paying bills $200 and $50.

However, let's now say that the outstanding invoices are $300, $200,
$100 and that the check is for $300. In this case there are already
two possibilities. The customer is paying the $300 invoice or the
$200 and $100. In other words, there is more than one solution to
the problem.

My first question is:
1. given a list of invoives I=[500, 400, 450, 200, 600, 700] and a
check Ch=600
how can I print all the different combinations of invoices that the
check is possibly cancelling

by the way, there's a bit more to it than just seeing if you can
match the cheque amount exactly. some python solutions are here:

http://rosettacode.org/wiki/Knapsack_Problem

and a general solution allows you to place different "values" on which
items you pack into your knapsack.

say a customer has outstanding invoices for 200, 400 and 600, and
you get a cheque for 600. what do you apply that against? the single
invoice for 600, or the two for 200 and 400? that depends.

if all invoices have the same "value", it won't matter. but if the
invoice for 600 just went out, while the two others are just about to
become, say, overdue so that a penalty is about to be applied, your
customer would probably *really* appreciate it if you applied that
cheque to the older invoices.

in general, then, you can not only see what matches exactly but,
for the sake of your customer, you can give higher value to paying off
older invoices. that's how the general knapsack problem works.

rday
--

========================================================================
Robert P. J. Day Waterloo, Ontario, CANADA

Linux Consulting, Training and Kernel Pedantry.

Web page: http://crashcourse.ca
Twitter: http://twitter.com/rpjday
========================================================================
 
M

Martin P. Hellwig

vsoler wrote:
As mentioned in the other answers, your problem is best solved by a long
overdue organisational decision instead of a technical one.

Most popular solutions are:
- All invoices are essential a single credit amount, money coming in is
taken of from the big pile.
- Invoices are split by date, creating multiple credit amounts, any
money coming in is used the pay of the oldest one.

The latter one allows more easily for different interest and penalty rates.
 
O

Ozz

Hi,
My first question is:
1. given a list of invoives I=[500, 400, 450, 200, 600, 700] and a
check Ch=600
how can I print all the different combinations of invoices that the
check is possibly cancelling

Incidentally, I'm currently learning python myself, and was working on
more or less the same problem as an exercise;

For listing all different subsets of a list (This is what I came up
with. Can it be implemented shorter, btw?):

def subsets(L):
S = []
if (len(L) == 1):
return [L, []]
else:
for s in subsets(L[1:]):
S.append(s)
S.append(s + [ L[0]])
return S

Now, to use the above piece of code (after 'import subset'):
[[2], [2, 4], [2, 7], [2, 7, 4], [2, 8], [2, 8, 4], [2, 8, 7], [2, 8, 7,
4], [], [4], [7], [7, 4], [8], [8, 4], [8, 7], [8, 7, 4]]
>>> map(sum,subset.subsets([4,7,8,2]))
[2, 6, 9, 13, 10, 14, 17, 21, 0, 4, 7, 11, 8, 12, 15, 19]

It's not a real solution yet, and as others have pointed out the problem
is NP complete but it might help to get you going.

cheers,
Ozz
 
R

Robert P. J. Day

Hi,
My first question is:
1. given a list of invoives I=[500, 400, 450, 200, 600, 700] and a
check Ch=600
how can I print all the different combinations of invoices that the
check is possibly cancelling

Incidentally, I'm currently learning python myself, and was working
on more or less the same problem as an exercise;

For listing all different subsets of a list (This is what I came up
with. Can it be implemented shorter, btw?):

def subsets(L):
S = []
if (len(L) == 1):
return [L, []]
else:
for s in subsets(L[1:]):
S.append(s)
S.append(s + [ L[0]])
return S

Now, to use the above piece of code (after 'import subset'):
subset.subsets([4,7,8,2])
[[2], [2, 4], [2, 7], [2, 7, 4], [2, 8], [2, 8, 4], [2, 8, 7], [2, 8, 7, 4],
[], [4], [7], [7, 4], [8], [8, 4], [8, 7], [8, 7, 4]]
map(sum,subset.subsets([4,7,8,2]))
[2, 6, 9, 13, 10, 14, 17, 21, 0, 4, 7, 11, 8, 12, 15, 19]

It's not a real solution yet, and as others have pointed out the
problem is NP complete but it might help to get you going.

does your solution allow for the possibility of different invoices
of equal amounts? i would be reluctant to use the word "subset" in a
context where you can have more than one element with the same value.

rday
--


========================================================================
Robert P. J. Day Waterloo, Ontario, CANADA

Linux Consulting, Training and Kernel Pedantry.

Web page: http://crashcourse.ca
Twitter: http://twitter.com/rpjday
========================================================================
 
O

Ozz

Oops,
For listing all different subsets of a list (This is what I came up
with. Can it be implemented shorter, btw?):

def subsets(L):
S = []
if (len(L) == 1):
return [L, []]

better to check for the empty set too, thus;

if (len(L) == 0):
return [[]]

The order of the sets looks better too;
[[], [1], [2], [2, 1], [3], [3, 1], [3, 2], [3, 2, 1]]

cheers,
 
O

Ozz

Robert P. J. Day schreef:
does your solution allow for the possibility of different invoices
of equal amounts? i would be reluctant to use the word "subset" in a
context where you can have more than one element with the same value.

I think it handles different invoices of equal amounts correctly.
I agree with you though that the term subset may not be the best name in
this context because of those duplicates.

cheers,
Ozz
 
D

Dan Bishop

Hi,
My first question is:
1. given a list of invoives I=[500, 400, 450, 200, 600, 700] and a
check Ch=600
how can I print all the different combinations of invoices that the
check is possibly cancelling

Incidentally, I'm currently learning python myself, and was working on
more or less the same problem as an exercise;

For listing all different subsets of a list (This is what I came up
with. Can it be implemented shorter, btw?):

def subsets(L):
         S = []
         if (len(L) == 1):
                 return [L, []]
         else:
                 for s in subsets(L[1:]):
                         S.append(s)
                         S.append(s + [ L[0]])
         return S

You can avoid the S list my making it a generator:

def subsets(L):
if L:
for s in subsets(L[1:]):
yield s
yield s + [L[0]]
else:
yield []
 
V

vsoler

Robert P. J. Day schreef:


I think it handles different invoices of equal amounts correctly.
I agree with you though that the term subset may not be the best name in
this context because of those duplicates.

cheers,
Ozz

Ozz,

Instead of subsets, do you mean permutations/combinations? Since 2
invoices can have the same amount perhaps the terms permutation is
better.

Vicente Soler
 
M

MRAB

Ozz said:
Hi,
My first question is:
1. given a list of invoives I=[500, 400, 450, 200, 600, 700] and a
check Ch=600
how can I print all the different combinations of invoices that the
check is possibly cancelling

Incidentally, I'm currently learning python myself, and was working on
more or less the same problem as an exercise;

For listing all different subsets of a list (This is what I came up
with. Can it be implemented shorter, btw?):

def subsets(L):
S = []
if (len(L) == 1):
return [L, []]
else:
for s in subsets(L[1:]):
S.append(s)
S.append(s + [ L[0]])
return S

Now, to use the above piece of code (after 'import subset'):
subset.subsets([4,7,8,2])
[[2], [2, 4], [2, 7], [2, 7, 4], [2, 8], [2, 8, 4], [2, 8, 7], [2, 8, 7,
4], [], [4], [7], [7, 4], [8], [8, 4], [8, 7], [8, 7, 4]]
map(sum,subset.subsets([4,7,8,2]))
[2, 6, 9, 13, 10, 14, 17, 21, 0, 4, 7, 11, 8, 12, 15, 19]

It's not a real solution yet, and as others have pointed out the problem
is NP complete but it might help to get you going.
Here's my own take on it:

def list_possible_invoices(invoices, check):
if check:
# The check hasn't yet been fully consumed.
for index, inv in enumerate(invoices):
# If this invoice doesn't exceed the check then it consume
some of the check.
if inv <= check:
# Try to consume the remainder of the check.
for inv_list in list_possible_invoices(invoices[index +
1 : ], check - inv):
yield [inv] + inv_list
else:
# The check has been fully consumed.
yield []

invoices = [500, 400, 450, 200, 600, 700]
check = 600

# List all the possible subsets of invoices.
# Sorting the invoices first in descending order lets us reduce the
number of possibilities to try.
for inv_list in list_possible_invoices(sorted(invoices, reverse=True),
check):
print inv_list
 
G

geremy condra

Hi,
My first question is:
1. given a list of invoives I=[500, 400, 450, 200, 600, 700] and a
check Ch=600
how can I print all the different combinations of invoices that the
check is possibly cancelling

Incidentally, I'm currently learning python myself, and was working on
more or less the same problem as an exercise;

For listing all different subsets of a list (This is what I came up
with. Can it be implemented shorter, btw?):

def subsets(L):
         S = []
         if (len(L) == 1):
                 return [L, []]
         else:
                 for s in subsets(L[1:]):
                         S.append(s)
                         S.append(s + [ L[0]])
         return S

You can avoid the S list my making it a generator:

def subsets(L):
   if L:
       for s in subsets(L[1:]):
           yield s
           yield s + [L[0]]
   else:
       yield []

What you're describing is the powerset operation. Here's the example
from the python docs:

def powerset(iterable):
"powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
s = list(iterable)
return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))

What I find interesting is that running it through timeit, it is much
slower than the code suggested by Dan Bishop.

setup = """
from itertools import chain, combinations

x = list(range(100))

def subsets1(L):
S = []
if (len(L) == 1):
return [L, []]
else:
for s in subsets1(L[1:]):
S.append(s)
S.append(s + [ L[0]])
return S

def subsets2(L):
if L:
for s in subsets(L[1:]):
yield s
yield s + [L[0]]
else:
yield []

def subsets3(iterable):
s = list(iterable)
return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
"""

#timeit.timeit("subsets1(x)", setup) doesn't appear to terminate
timeit.timeit("subsets2(x)", setup)
timeit.timeit("subsets3(x)", setup)


I'm getting numbers roughly 3:1 in Dan's favor.

Geremy Condra
 
G

geremy condra

Hi,

My first question is:
1. given a list of invoives I=[500, 400, 450, 200, 600, 700] and a
check Ch=600
how can I print all the different combinations of invoices that the
check is possibly cancelling

Incidentally, I'm currently learning python myself, and was working on
more or less the same problem as an exercise;

For listing all different subsets of a list (This is what I came up
with. Can it be implemented shorter, btw?):

def subsets(L):
         S = []
         if (len(L) == 1):
                 return [L, []]
         else:
                 for s in subsets(L[1:]):
                         S.append(s)
                         S.append(s + [ L[0]])
         return S

You can avoid the S list my making it a generator:

def subsets(L):
   if L:
       for s in subsets(L[1:]):
           yield s
           yield s + [L[0]]
   else:
       yield []

What you're describing is the powerset operation. Here's the example
from the python docs:

def powerset(iterable):
   "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
   s = list(iterable)
   return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))

What I find interesting is that running it through timeit, it is much
slower than the code suggested by Dan Bishop.

setup = """
from itertools import chain, combinations

x = list(range(100))

def subsets1(L):
      S = []
      if (len(L) == 1):
              return [L, []]
      else:
              for s in subsets1(L[1:]):
                      S.append(s)
                      S.append(s + [ L[0]])
      return S

def subsets2(L):
  if L:
      for s in subsets(L[1:]):
          yield s
          yield s + [L[0]]
  else:
      yield []

def subsets3(iterable):
   s = list(iterable)
   return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
"""

#timeit.timeit("subsets1(x)", setup) doesn't appear to terminate
timeit.timeit("subsets2(x)", setup)
timeit.timeit("subsets3(x)", setup)


I'm getting numbers roughly 3:1 in Dan's favor.

Geremy Condra

I made a mistake copying it on line 18 of the above; it should be
subsets2, rather than just subsets.

Geremy Condra
 
O

Ozz

Dan Bishop schreef:
You can avoid the S list my making it a generator:

def subsets(L):
if L:
for s in subsets(L[1:]):
yield s
yield s + [L[0]]
else:
yield []

Nice one. Thanks!

Ozz
 
O

Ozz

vsoler schreef:
Instead of subsets, do you mean permutations/combinations? Since 2
invoices can have the same amount perhaps the terms permutation is
better.

As some other poster already suggested 'powerset' (
http://en.wikipedia.org/wiki/Power_set ) may be a better name, except
for those duplicates, of course. On the other hand, I think viewing it
as a powerset is the most 'natural' in this case. (imo permutations are
about the order of objects, not about whether the objects are included
in a set or not)

cheers,
Ozz
 
S

Steven D'Aprano

What you're describing is the powerset operation. Here's the example
from the python docs: [...]
What I find interesting is that running it through timeit, it is much
slower than the code suggested by Dan Bishop.

Your test doesn't show what you think it shows. You shouldn't just
blindly apply timeit without testing to see that the functions return
what you think they return. Your test is, I'm afraid, totally bogus and
the conclusion you draw is completely wrong.


[...]
#timeit.timeit("subsets1(x)", setup) doesn't appear to terminate
timeit.timeit("subsets2(x)", setup)
timeit.timeit("subsets3(x)", setup)

For the sake of speed, I've used a smaller x. Here are the results of
calling the three functions:

x = range(3)
subsets1(x) [[2], [2, 0], [2, 1], [2, 1, 0], [], [0], [1], [1, 0]]
subsets2(x)
subsets3(x)
<itertools.chain object at 0xb7c608ac>

The reason subsets1() "doesn't appear to terminate" is that you are
trying to list all the subsets of x = range(100). That's a LOT of
subsets: 2**100 to be precise, or approximately 1.2e30.

subsets2() and subsets3() return a generator function and an iterator
object respectively. There may be some overhead difference between those,
but that's trivial compared to the cost of generating the subsets
themselves.

A better way of doing the timing is as follows:



from itertools import chain, combinations
from timeit import Timer

# use a number small enough to calculate in a reasonable time
x = list(range(10))

def subsets1(L):
S = []
if (len(L) == 1):
return [L, []]
else:
for s in subsets1(L[1:]):
S.append(s)
S.append(s + [ L[0]])
return S

def subsets2(L):
if L:
for s in subsets2(L[1:]):
yield s
yield s + [L[0]]
else:
yield []

def subsets3(iterable):
s = list(iterable)
return chain.from_iterable(combinations(s, r) for r in
range(len(s)+1))

setup = "from __main__ import subsets1, subsets2, subsets3, x"


# Check the three functions return the same results:
x1 = sorted(subsets1(x))
x2 = sorted(subsets2(x))
x3 = sorted(list(t) for t in subsets1(x))
assert x1 == x2 == x3

# Set up some timers.
t1 = Timer("subsets1(x)", setup)
t2 = Timer("list(subsets2(x))", setup)
t3 = Timer("list(subsets3(x))", setup)

# And run them!
for t in (t1, t2, t3):
print min(t.repeat(number=1000, repeat=5))



The results I get are:

1.19647693634
0.901714801788
0.175387859344

Which as you can see, shows that the recipe in the docs is nearly ten
times faster than Dan's version. That's not surprising -- the docs
version uses highly optimized C code from itertools, while Dan's version
uses slow-ish Python code and recursion.


To show this is no fluke, I increased the size of x and ran the tests
again:

.... print min(t.repeat(number=1000, repeat=5))
....
45.283468008
33.9274909496
7.40781188011
 
G

geremy condra

What you're describing is the powerset operation. Here's the example
from the python docs: [...]
What I find interesting is that running it through timeit, it is much
slower than the code suggested by Dan Bishop.

Your test doesn't show what you think it shows. You shouldn't just
blindly apply timeit without testing to see that the functions return
what you think they return. Your test is, I'm afraid, totally bogus and
the conclusion you draw is completely wrong.

<snip>

Doh! I even noticed that the asymptotic times didn't match up
and still blew right past the obvious answer. Thanks (again) for
the correction, Steven.

Geremy Condra
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,580
Members
45,054
Latest member
TrimKetoBoost

Latest Threads

Top