Generating all combinations

J

Johannes Bauer

Hello group,

I'm trying to write a function in Python which does the following: For a
number of arguments which are all lists, return a list (or generator)
which yields all tuples of combination. E.g:

foofunction()
# returns [ ]

foofunction([1, 2, 3])
# returns [ (1), (2), (3) ]

foofunction([1, 2, 3], ["a", "b"], ["x"])
# returns [ (1, "a", "x"), (1, "b", "x"),
(2, "a", "x"), (2, "b", "x"),
(3, "a", "x"), (3, "b", "x") ]

I must admit that I have no clue on how to do that efficiently (and more
importantly: in a variadic manner) in Python. Any help is appreciated!

Kind regards,
Johannes
 
M

Mark Dickinson

I'm trying to write a function in Python which does the following: For a
number of arguments which are all lists, return a list (or generator)
which yields all tuples of combination. E.g:

foofunction()
# returns [ ]

Are you sure that's what you want? I'd expect [()] here (on the
basis that the empty product is a one-element set). And
indeed that's what itertools.product gives:
[()]

foofunction([1, 2, 3])
# returns [ (1), (2), (3) ]

Presumably you mean [(1,), (2,), (3,)] here?

Mark
 
P

pataphor

Johannes said:
Any help is appreciated!

This is on the fringe of exploitation, but hey, maybe the code helps you
think about the algorithm.

IMHO the following code is a glaring complaint about the injustice of
omission itertools inflicts on the perfectly natural and obvious
procedure of repeat_each (whatever it's name ought to be):

from itertools import izip, islice, cycle

def repeat_each(seq,n):
while True:
for x in seq:
for i in range(n):
yield x

def repeat_all(seq,n):
while True:
for i in range(n):
for x in seq:
yield x

def product(X):
N = []
total = 1
for x in X:
N.append(total)
total *= len(x)
R = [repeat_all(repeat_each(x,k),n)
for x,k,n in izip(X,N,reversed(N))]
return islice(izip(*R),total)

def test1():
L = ['a', 'bc','def' ]
for x in product(L):
print x
print

def test2():
repeat_all = cycle
test1()

if __name__ == '__main__':
test1()
test2()

See? Repeat_all and repeat_each are almost brothers, just separated by
the tiniest rearrangement of their genetic code (or should I say code
genetics?). Yet one is included as 'itertools.cycle' and the other is
doomed to live in limbo. Such injustice! Can it be that
'itertools.repeat' has usurped repeat_each's rightful position?

P.
 
M

Mensanator

This is on the fringe of exploitation, but hey, maybe the code helps you
think about the algorithm.

IMHO the following code is a glaring complaint about the injustice of
omission itertools inflicts on the perfectly natural and obvious
procedure of repeat_each (whatever it's name ought to be):

I believe the name you're looking for is
combinations_with_replacement.
It is one of the features being added to 3.1 which should give all
the subsets of the Cartesian Product:

permutations_with_replacement: product()
combinations_with_replacement: combinations_with_replacement()
permutations_without_replacement: permutations()
combinations_without_replacement: combinations()
from itertools  import izip, islice, cycle

def repeat_each(seq,n):
     while True:
         for x in seq:
             for i in range(n):
                 yield x

def repeat_all(seq,n):
     while True:
         for i in range(n):
             for x in seq:
                 yield x

def product(X):
     N = []
     total = 1
     for x in X:
         N.append(total)
         total *= len(x)
     R = [repeat_all(repeat_each(x,k),n)
                     for x,k,n in izip(X,N,reversed(N))]
     return islice(izip(*R),total)

def test1():
     L = ['a', 'bc','def' ]
     for x in product(L):
         print x
     print

def test2():
     repeat_all = cycle
     test1()

if __name__ == '__main__':
     test1()
     test2()

See? Repeat_all and repeat_each are almost brothers, just separated by
the tiniest rearrangement of their genetic code (or should I say code
genetics?). Yet one is included as 'itertools.cycle' and the other is
doomed to live in limbo. Such injustice! Can it be that
'itertools.repeat' has usurped repeat_each's rightful position?

P.
 
S

Steven D'Aprano

I believe the name you're looking for is combinations_with_replacement.
It is one of the features being added to 3.1 which should give all the
subsets of the Cartesian Product:

permutations_with_replacement: product()
combinations_with_replacement: combinations_with_replacement()
permutations_without_replacement: permutations()
combinations_without_replacement: combinations()

What, no partitions?

http://en.wikipedia.org/wiki/Partition_of_a_set
 
S

Steven D'Aprano

Itertools does partitions?

Er, no. That's why I asked "What, no partitions?" instead of saying
"Look, itertools also does partitions!"

I didn't see any reference to Cartesian Product there.

Wikipedia is the encyclopedia anyone can edit. Go right ahead and put it
in if you think it needs to be there. While you're at it, there is no
mention of Cartesian Product in any of

http://en.wikipedia.org/wiki/Permutations
http://en.wikipedia.org/wiki/Combinations

http://mathworld.wolfram.com/Permutation.html
http://mathworld.wolfram.com/k-Subset.html

either. Are you sure that permutations and combinations are subsets of
the Cartesian Product?
 
M

Mensanator

Er, no. That's why I asked "What, no partitions?" instead of saying
"Look, itertools also does partitions!"



Wikipedia is the encyclopedia anyone can edit. Go right ahead and put it
in if you think it needs to be there. While you're at it, there is no
mention of Cartesian Product in any of

http://en.wikipedia.org/wiki/Permutationshttp://en.wikipedia.org/wiki/Combinations

http://mathworld.wolfram.com/Permutation.htmlhttp://mathworld.wolfram.com/k-Subset.html

either.

You might have better luck with Google.
Are you sure that permutations and combinations are subsets of
the Cartesian Product?

Sure looks that way (SQL examples):

Cartesian Product (Permutaions w/replacement)

SELECT B.q, A.p
FROM A, B;

q p
a a
a b
b a
b b


Permutaions wo/replacement
SELECT B.q, A.p
FROM A, B
WHERE (((A.p)<>.[q]));

q p
a b
b a


Combinations w/replacement

SELECT B.q, A.p
FROM A, B
WHERE (((A.p)>=.[q]));

q p
a a
a b
b b



Combinations wo/replacement

SELECT B.q, A.p
FROM A, B
WHERE (((A.p)>.[q]));

q p
a b


I couldn't do that if they weren't subsets.
 
M

Mensanator

Er, no. That's why I asked "What, no partitions?" instead of saying
"Look, itertools also does partitions!"



Wikipedia is the encyclopedia anyone can edit. Go right ahead and put it
in if you think it needs to be there. While you're at it, there is no
mention of Cartesian Product in any of

http://en.wikipedia.org/wiki/Permutationshttp://en.wikipedia.org/wiki/Combinations

http://mathworld.wolfram.com/Permutation.htmlhttp://mathworld.wolfram.com/k-Subset.html

either. Are you sure that permutations and combinations are subsets of
the Cartesian Product?

Didn't notice this before - it says so in the docs!

itertools.product(*iterables[, repeat])¶
Cartesian product of input iterables.

The code for permutations() can be also expressed as a subsequence of
product(), filtered to exclude entries with repeated elements (those
from the same position in the input pool):

The code for combinations() can be also expressed as a subsequence of
permutations() after filtering entries where the elements are not in
sorted order (according to their position in the input pool):
 
P

pataphor

Mensanator said:
I couldn't do that if they weren't subsets.

Right. Sometimes one just has to assume things are different even if
they look the same on the surface. That is because else one wouldn't be
able to produce the other generators. I guess it would also work the
other way around, assuming things are the same even when they look
different.

For example see my two functions:

def repeat_each(seq,n):
while True:
for x in seq:
for i in range(n):
yield x

def repeat_all(seq,n):
while True:
for i in range(n):
for x in seq:
yield x

(I should probably not smoke stuff before posting, but anyway)


They are the same, except for switching two lines of code. But for the
second one ('repeat_all') the argument 'n' seems redundant. What does it
even mean to repeat a sequence n times, and do that forever? Isn't that
the same as just repeating the sequence itself, forever? So that's how
we arrive at itertools.cycle . The second argument is there, but it
would be foolish to include it, so it is left out.

But now let's look at how itertools.repeat should really be. It should
look like 'repeat_each' above. Here second argument ('n') *is*
necessary, or else the generator would just infinitely repeat only the
first element of the sequence, which is obviously nonsense. But that is
exactly why itertools.repeat does not accept a sequence (like cycle, its
virtual brother) but instead it has degenerated into something that just
repeats only one thing n times, which is stupid.

So to set things right one has to forget everything and just write
complete balderdash if necessary, if it only leads to finally
understanding how the two are related. Then one can see that
itertools.repeat should be an infinite generator *on sequences* that
however still needs a second argument specifying how many times each
individual item should be repeated, and that itertools.cycle's second
argument is there but hidden.

P.
 
S

Steven D'Aprano

Are you sure that permutations and combinations are subsets of the
Cartesian Product?

Sure looks that way (SQL examples): [snip]
I couldn't do that if they weren't subsets.

Permutations and combinations are arrangements of a single set. The
Cartesian product, on the other hand, are the arrangements of multiple
sets, one item from each set. As a general rule, for arbitrary arguments,
the items you get from product() aren't even the same size as the items
you get from permutations():

{(2, 1, 0), (0, 1, 2), (1, 0, 2), (2, 0, 1), (0, 2, 1), (1, 2, 0)}


Perhaps you mean that, in the special case of the Cartesian product of a
set with itself sufficient times, the permutations and combinations of
that set are subsets of the product? I could live with that:
True


Oh, there's another type of arrangement missing from the collection...
dearrangements. That's the arrangements where every item appears in the
"wrong" place, i.e. in a position where it *didn't* start off: e.g. given
(1, 2, 3), the dearrangements are: (2, 3, 1), (3, 1, 2). Typical problem:
you have N addressed letters and envelopes, but someone has shuffled the
letters before putting them into the envelopes. How many ways are there
such that no letter will be delivered to the intended recipient?
 
M

Mensanator

Sure looks that way (SQL examples): [snip]
I couldn't do that if they weren't subsets.

Permutations and combinations are arrangements of a single set.

The OP *did* say combinations, didn't he? So the thread *was* in the
context of a single set, wasn't it?
The
Cartesian product, on the other hand, are the arrangements of multiple
sets, one item from each set.

And the Cartesian Product of a set with itself gives you
permutations with replacement.
As a general rule, for arbitrary arguments,
the items you get from product() aren't even the same size as the items
you get from permutations():

<snip example>

But I wasn't talking about that case.

What I *was* talking about is this quote from the 3.1 What's New page:

<quote>
The itertools.combinations_with_replacement() function is one of
four for generating combinatorics including permutations and
Cartesian products.
</quote>

Is there some reason I *shouldn't* assume that the "four for
generating
combinatorics" are not permutations vs combinations & with vs without
replacement? Does the fact that product() can do more than operate on
a single set change that? I never said it couldn't.
Perhaps you mean that, in the special case of the Cartesian product of a
set with itself sufficient times, the permutations and combinations of
that set are subsets of the product? I could live with that:

<snip example>

That *is*, in fact, what I was refering to. I never intended my
comment
to be a complete tutorial on all that can be done with itertools. I
was
replying to pataphor's bitching about the itertools being incomplete,
that that issue has been addressed. What did I say wrong? Was *I* the
one
who was complaing? (I try not to do that anymore.)
Oh, there's another type of arrangement missing from the collection...
dearrangements.

Is that "one of four for generating combinatorics"?

Maybe you should be talking to the itertools developers?
That's the arrangements where every item appears in the
"wrong" place, i.e. in a position where it *didn't* start off: e.g. given
(1, 2, 3), the dearrangements are: (2, 3, 1), (3, 1, 2). Typical problem:
you have N addressed letters and envelopes, but someone has shuffled the
letters before putting them into the envelopes. How many ways are there
such that no letter will be delivered to the intended recipient?

Ok, but isn't that not within the scope of what we're discussing?
 
R

Raymond Hettinger

What, no partitions?

Seems like you could roll your own (using combinations as a starting
point):

def pairwise(iterable):
a, b = tee(iterable)
next(b, None)
return izip(a, b)

def partition(s):
n = len(s)
for i in range(n):
for div in combinations(range(1,n), i):
yield map(s.__getitem__, starmap(slice, pairwise(chain
([0], div, [n]))))

pprint(list(partition('abcd')))
[['abcd'],
['a', 'bcd'],
['ab', 'cd'],
['abc', 'd'],
['a', 'b', 'cd'],
['a', 'bc', 'd'],
['ab', 'c', 'd'],
['a', 'b', 'c', 'd']]
 
R

Raymond Hettinger

What, no partitions?
Simpler version:

def partition(s):
n = len(s)
parts = range(1, n)
for i in range(n):
for div in combinations(parts, i):
print map(s.__getslice__, chain([0], div), chain(div,
[n]))
['abcd']
['a', 'bcd']
['ab', 'cd']
['abc', 'd']
['a', 'b', 'cd']
['a', 'bc', 'd']
['ab', 'c', 'd']
['a', 'b', 'c', 'd']
 
S

Steven D'Aprano

Simpler version:

def partition(s):
n = len(s)
parts = range(1, n)
for i in range(n):
for div in combinations(parts, i):
print map(s.__getslice__, chain([0], div), chain(div,
[n]))



Nice one!


Raymond, as perhaps *the* principle contributor to itertools, do you feel
that the combinatorics-related tools should be in their own module? Or is
that dividing the module too fine?

Would you consider moving permutations() and friends into a module
together with functions for calculating the number of permutations etc?

e.g. permutations(iterable[, r]) --> permutations object
nPr(n, r) --> int

and similar.
 
S

Steven D'Aprano

On Wed, 03 Jun 2009 18:21:37 -0700, Mensanator wrote:

[mass snippage]
What I *was* talking about is this quote from the 3.1 What's New page:

<quote>
The itertools.combinations_with_replacement() function is one of four
for generating combinatorics including permutations and Cartesian
products.
</quote>

Is there some reason I *shouldn't* assume that the "four for generating
combinatorics" are not permutations vs combinations & with vs without
replacement? Does the fact that product() can do more than operate on a
single set change that? I never said it couldn't.

Settle down Mensanator! Don't take it so personally! You're sounding
awfully agitated.

The quoted doc doesn't imply that there are only four combinatorics
functions, only that there are four *in itertools*. Nor does it say that
permutations etc are subsets of the Cartesian Product, because *without
clarification* that is a fairly dubious thing to say. Remember, these are
terms with very precise technical meanings, and if you say to a
mathematician "the permutations P of some set S are a subset of the
Cartesian Product with no specified arguments", they'll probably react
just as I did -- with confusion.

Now that I've narrowed down what you actually meant, I'm happy to agree
with you, at least informally.


<snip example>

That *is*, in fact, what I was refering to. I never intended my comment
to be a complete tutorial on all that can be done with itertools.

Dear me. Did I say it was?

I was
replying to pataphor's bitching about the itertools being incomplete,
that that issue has been addressed. What did I say wrong? Was *I* the
one who was complaing? (I try not to do that anymore.)

The only thing you did "wrong" was to be less pedantic than me. All I was
trying to do was narrow down in exactly what sense you meant that
arrangements of a single set are subsets of the product of two or more
sets.


Is that "one of four for generating combinatorics"?

Obviously not. If it's *missing* how could it be one of the four?

Maybe you should be talking to the itertools developers?

If I had a good use-case for dearrangements, or a fast, reliable
implementation, then maybe I would.

Ok, but isn't that not within the scope of what we're discussing?

In the context of Pataphor complaining about itertools having an
incomplete set of combinatorics tools, missing partions() and
dearrangements() are certainly within the scope. Whether they are serious
lacks is another question.
 
M

Mensanator

On Wed, 03 Jun 2009 18:21:37 -0700, Mensanator wrote:

[mass snippage]
Settle down Mensanator! Don't take it so personally! You're sounding
awfully agitated.

Don't worry, I'm not.
Now that I've narrowed down what you actually meant, I'm happy to agree
with you, at least informally.

Ok. End of thread.
If I had a good use-case for dearrangements, or a fast, reliable
implementation, then maybe I would.

*I* happen to have a good user-case for "partitions
of DEPTH indistinguishable items into WIDTH ordered
bins such that DEPTH >= WIDTH and each bin must contain
at least 1 item". That comes up in the Collatz
Conjecture, specifically, a list of WIDTH integers
that sums to DEPTH such that the list cannot be
empty nor contain any number less than 1.

Horribly important.

For example, 7 items into 4 bins would be:

import collatz_functions as cf
for i in cf.partition_generator(7,4):
print i

## [1, 1, 1, 4]
## [1, 1, 2, 3]
## [1, 1, 3, 2]
## [1, 1, 4, 1]
## [1, 2, 1, 3]
## [1, 2, 2, 2]
## [1, 2, 3, 1]
## [1, 3, 1, 2]
## [1, 3, 2, 1]
## [1, 4, 1, 1]
## [2, 1, 1, 3]
## [2, 1, 2, 2]
## [2, 1, 3, 1]
## [2, 2, 1, 2]
## [2, 2, 2, 1]
## [2, 3, 1, 1]
## [3, 1, 1, 2]
## [3, 1, 2, 1]
## [3, 2, 1, 1]
## [4, 1, 1, 1]

But, as you can see, I already know how to calculate
it and I doubt anyone but me would be interested in
such a thing.
 
R

Raymond Hettinger

Nice one!

It only does partitions of a sequence. I haven't yet looked at a way
to
do partitions of a set. Any ideas?

Raymond, as perhaps *the* principle contributor to itertools, do you feel
that the combinatorics-related tools should be in their own module? Or is
that dividing the module too fine?

I like having them in itertools because they form part of the
"iterator algebra"
for splitting, filtering, joining, and combining streams of iterators.

Would you consider moving permutations() and friends into a module
together with functions for calculating the number of permutations etc?

e.g. permutations(iterable[, r]) --> permutations object
     nPr(n, r) --> int

Looked at this a while back. If the nPr style functions go in, they
will
probably be in the math module (or a new module for integer functions
like gcd and whatnot). The reason for not keeping them together is
that
the use cases are likely disjoint -- when I go to my calculator for
nCr
I don't expect a listing -- when I need to loop over permutations, I
typically only care about the contents of the permutation, not the
total
number of them (except for purposes of estimating how long a search
will
take). People look to the math module for things they find on their
calculator and to the itertools module for high-speed looping
constructs.

Also, there is a issue of naming the functions. For combinations, you
you might think to look for ncombs() or somesuch, but
binomial_coefficient()
is what someone else may be searching for.

What would be nice is to turn the itertool combinatorics into
classes with a len() method, an ability to index to the n-th
iteration, or to find at an arbitrary iteration:
c = combinations('monte', 3)
len(c) 10
c[2] ('m', 'o', 'e')
c.find(('m', 't', 'e')) 5
random.choice(c)
('o', 'n', 'e')

If the input iterable contains repeats, the find() method
would find the first match.

These things are interesting and fun, but they also smell
of feeping-creaturism.


Raymond
 
M

Mensanator

Nice one!

It only does partitions of a sequence.  I haven't yet looked at a way
to
do partitions of a set.  Any ideas?
Raymond, as perhaps *the* principle contributor to itertools, do you feel
that the combinatorics-related tools should be in their own module? Or is
that dividing the module too fine?

I like having them in itertools because they form part of the
"iterator algebra"
for splitting, filtering, joining, and combining streams of iterators.
Would you consider moving permutations() and friends into a module
together with functions for calculating the number of permutations etc?
e.g. permutations(iterable[, r]) --> permutations object
     nPr(n, r) --> int

Looked at this a while back.  If the nPr style functions go in, they
will
probably be in the math module (or a new module for integer functions
like gcd and whatnot).  The reason for not keeping them together is
that
the use cases are likely disjoint -- when I go to my calculator for
nCr
I don't expect a listing -- when I need to loop over permutations, I
typically only care about the contents of the permutation, not the
total
number of them (except for purposes of estimating how long a search
will
take).  People look to the math module for things they find on their
calculator and to the itertools module for high-speed looping
constructs.

Also, there is a issue of naming the functions.  For combinations, you
you might think to look for ncombs() or somesuch, but
binomial_coefficient()
is what someone else may be searching for.

What would be nice is to turn the itertool combinatorics into
classes with a len() method, an ability to index to the n-th
iteration, or to find at an arbitrary iteration:
c = combinations('monte', 3)
len(c) 10
c[2] ('m', 'o', 'e')
c.find(('m', 't', 'e')) 5
random.choice(c)

('o', 'n', 'e')

If the input iterable contains repeats, the find() method
would find the first match.

These things are interesting and fun, but they also smell
of feeping-creaturism.

After all, everybody knows that for m items taken n at
a time, the counts are

perm w/repl = m**n
comb w/repl = (m+n-1)!/(n!(m-1)!)
perm wo/repl = m!/(m-n)!
comb wo/repl = m!/(n!(m-n)!)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,579
Members
45,053
Latest member
BrodieSola

Latest Threads

Top