Programming challenge: wildcard exclusion in cartesian products

W

wkehowski

The python code below generates a cartesian product subject to any
logical combination of wildcard exclusions. For example, suppose I want
to generate a cartesian product S^n, n>=3, of [a,b,c,d] that excludes
'*a*b*' and '*c*d*a*'. See below for details.

CHALLENGE: generate an equivalent in ruby, lisp, haskell, ocaml, or in
a CAS like maple or mathematica.

#-------------------------------------------------------------------------------
# Short algorithm description
# using function _genAll the program generates
# cartesian product without sets, which match
# some wildcarts
# Sets generation uses recursion ->
# first of all sets will be generated with dimension 1 and than
filtered through wildcarts
# then sets will be generated with dimension 2 and filtered again
# until the required set dimension is reached
# Program avoids explicit generation of some part of CP sets
# if the end of whildcart is asterics (*) and if the first part of
whildcart (without astrics)
# matches current set => then this set will be filtered out and won't
be used in
# higher dimension set generation
# example *,1,*,2,* [1,2] dim = 10
# by dimension 2 only arrays [1,1],[2,1],[2,2] are will be generated
# => array [1,2] won't be used in next recursion levels
#-------------------------------------------------------------------------------
# To obtaine result use function
# CPWithoutWC first parameter is a list of any elements
(char,int,string,class exemplar ,.... any type)
# secont param is CP dimension
# other parameters are wildcarts => lists with any values then may
include
# special value ALL - asterics equivalent
#Example of usage: command line
# >>> import cartesianProduct as cp
# >>> for i in cp.CPWithoutWC([1,2],3,[1,cp.ALL,2]):
# print i
# [1, 1, 1]
# [1, 2, 1]
# [2, 1, 1]
# [2, 1, 2]
# [2, 2, 1]
# [2, 2, 2]
# >>> for i in
cp.CPWithoutWC(['a','b'],3,['a',cp.ALL,'b'],['b',cp.ALL,'a']):
# print i
# ['a', 'a', 'a']
# ['a', 'b', 'a']
# ['b', 'a', 'b']
# ['b', 'b', 'b']
# >>> for i in cp.CPWithoutWC([1,2],3,[1,cp.ALL,2],[2,cp.ALL,1]):
# print i
# [1, 1, 1]
# [1, 2, 1]
# [2, 1, 2]
# [2, 2, 2]
# >>>
# >>> for i in cp.CPWithoutWC([1,2],121212,[1,cp.ALL],[2,cp.ALL,1]):
# print i
## execute immediately
# >>>
# if You don't want to print cp. before ALL and CPWithoutWC use import
like this:
# from cartesianProduct import ALL,CPWithoutWC
# CPWithoutWC is a python generator. Which means that it returns values

# immediately and generate next in next cycle.
# Program example
#
## from cartesianProduct import ALL,CPWithoutWC
## def main():
## for i in
cp.CPWithoutWC(['a','b'],3,['a',cp.ALL,'b'],['b',cp.ALL,'a']):
## ## do what You want with current value
## .........
## ## go back to for statement and generate new
## if __name__ == "__main__":
## main()
#
"""
Using logical combinations of WC:
1) It's possible to pass on to the function CPWithoutWC
any number of wildcarts after first two parameters, for example:
CPWithoutWC([1,2],121212,[1,cp.ALL],[2,cp.ALL,1],...)
where ... - is any other wildcart's additional function parameters.
Number of additional WC is not limited.
Function will filter out all combinations, which match any passed on
WC.
It's equal to WC1 | WC2 | .... , where | is python analog of OR
logical operations.
2) To use more complex WC combinations follow these steps
a) First of all create all needed WC
b) Then use operators |, & and braces () to create combinations
required and then pass it on to function
CPWithoutWCEx as the third parameter. Don't use "or" and "and"
python statement, otherwise program will
work improper. First two parameters of this function are the same as
of CPWithoutWC function - set of
elements and CP dimension. An example of what was described above in
command line:
>>> from cartesianProduct import ALL,CPWithoutWC,CPWithoutWCEx,WC
>>> a = WC([ALL,1,ALL])
>>> b = WC([ALL,2,ALL])
>>> c = a & b #filter out all sets which match a and b
>>> for i in CPWithoutWCEx([1,2],3,c) : print i
[1, 1, 1]
[2, 2, 2]
>>> # all sets where both 1 and 2 are present will be filtered out
>>> d = a | b
>>> for i in CPWithoutWCEx([1,2],3,d) : print i
>>> # returns nothing
>>> for i in CPWithoutWCEx([1,2,3],3,d) : print i [3, 3, 3]
>>> a = WC([2,1,ALL])
>>> b = WC([1,2,ALL])
>>> c = WC([ALL,2])
>>> d = ( a | b ) & c
>>> for i in CPWithoutWCEx([1,2],3,d) : print i
[1, 1, 1]
[1, 1, 2]
[1, 2, 1]
[2, 1, 1]
[2, 2, 1]
[2, 2, 2]
>>> # filters out all combinations which start with [1,2] or [2,1]
and end with 2

Number of WC, which are used to form logical combinations is not
limited.
"""
"""
13.02.2006
a)Two new function - CPWithoutWCEx_L and CPWithoutWC_L are added.
Their interface is the same as of CPWithoutWCEx and CPWithoutWC
accordingly, except that the third parameter is WC list and
they accept strictly three parameters.

As You can see these functions are very simple because
python is quite flexible =>
>>> def s(x,y): return x * y
>>> d = [3,2]
>>> s(*d) ## == s(3,2)
6

b)Now WC can take string as parameter, and You can use string
as parameters of functions CPWithoutWC and CPWithoutWC_L
instead of WC lists.
Strings transform into WC according to these rules
1)If first symbol in the string is
alphanumeric (a-z or A-Z or 0-9) or '*'
character the every character of the string will be recognized as
a distinct set element. Examples:
"ad*d*" == ['a','d',cp.ALL,'d',cp.ALL]
"*A*b3*%^('" == [cp.ALL,'A',cp.ALL.'b','3',cp.ALL,'%','(',"'"]
2)If first character is not (alphanumeric or '*')
it will be treated as a delimitator. Examples:
":a:A:1:*" == ['a','A','1',cp.ALL]
":aA1:*" == ['aA1',cp.ALL]
it's not necessary to write delimitators around the asterics
":aA1*" == ['aA1',cp.ALL]
"%aA%1*" == ['aA','1',cp.ALL]
3)If all non delimit and non asterics character in elements
are digits => they will be treated as numbers.Examples:
"123*" == [1,2,3,cp.ALL]
":12:3*" == [12,3,cp.ALL]
but
":12:a:3*" == ['12','a','3',cp.ALL]
Examples of use:
for i in cp.CPWithoutWC(['a','b'],3,'a*b','b*a'):
print i
['a', 'a', 'a']
['a', 'b', 'a']
['b', 'a', 'b']
['b', 'b', 'b']
for i in cp.CPWithoutWC_L(['a','b'],3,['a*b','b*a']):
print i
['a', 'a', 'a']
['a', 'b', 'a']
['b', 'a', 'b']
['b', 'b', 'b']
#You can mixe strings and lists for wildcarts
for i in cp.CPWithoutWC_L(['a','b'],3,['a*b',['b',cp.ALL,'a']]):
print i
['a', 'a', 'a']
['a', 'b', 'a']
['b', 'a', 'b']
['b', 'b', 'b']
for i in cp.CPWithoutWC_L(['abc','xyz'],3,[':abc*xyz']):
print i
['abc', 'abc', 'abc']
['abc', 'xyz', 'abc']
['xyz', 'abc', 'abc']
['xyz', 'abc', 'xyz']
['xyz', 'xyz', 'abc']
['xyz', 'xyz', 'xyz']
"""
#-------------------------------------------------------------------------------
class ALL(object):pass
#-------------------------------------------------------------------------------
class NO_ONE(object):pass
#-------------------------------------------------------------------------------
class BFunctor(object):
def __init__(self,func):
self.func = func
def __call__(self,*dt,**mp):
return self.func(*dt,**mp)
@classmethod
def OR(cls,x,y):
return cls(lambda *dt,**mp : x(*dt,**mp) | y(*dt,**mp))
@classmethod
def AND(cls,x,y):
return cls(lambda *dt,**mp : x(*dt,**mp) & y(*dt,**mp))

#-----------------------------------------------------------------------------
def __or__(self,x):
return BFunctor.OR(self,x)

#-----------------------------------------------------------------------------
def __and__(self,x):
return BFunctor.AND(self,x)
#-------------------------------------------------------------------------------
def _genAll(head,n,WCF,curr):
if len(curr) != 0 and n != 0:
for i in curr:
nhead = head +
if n != 1 :
# needed dimension are not reached
# -> we mast tell WC that some other values
# may concatenate in the end of nhead in next recursion levels
# but if WC is ended with asterics (ALL), than dosn't matter
# so i use special walue NO_ONE to resolve this problem
# if WC with final asterics like [1,2,3,ALL] are matched nhead
=>
# they matched nhead + [NO_ONE] to
# but if WC is like [1,ALL,2,3] => they dont match
[1,2,3,NO_ONE] =>
# they don't prevent to generate [1,2,3,4] on next recursion
level
x = WCF(nhead + [NO_ONE],curr)
else : x = WCF(nhead,curr)
if False == x:
if n == 1 : yield nhead
else:
for i in _genAll(nhead,n - 1,WCF,curr):
yield i
elif n == 0 :
yield head
#-------------------------------------------------------------------------------
class WC(object):
def __init__(self,wc):
self.wc = wc
self.transformWC()
self.num_els = 0
self.compress()
self.comphdr = None
self.findMaxHeader()
self.ln = len(self.wc)

#-----------------------------------------------------------------------------
def transformWC(self):
if self.wc.__class__ not in (str,unicode) : return
if len(self.wc) == 0 : return
if self.wc[0].isalnum() or self.wc[0] == "*":
wc = self.wc
else:
wc = self.wc[1:].split(self.wc[0])
nwc = []
for i in wc:
if i == '*' : nwc.append(ALL)
elif '*' in i :
for j in i.split('*'):
if j : nwc.append(j)
nwc.append(ALL)
del nwc[-1]
else : nwc.append(i)
#check if all elements are numbers or *
allnum = True
for i in nwc:
if i is ALL : continue
try : int(i)
except :
allnum = False
break
if allnum:
for i,j in enumerate(nwc):
if j is not ALL:
nwc = int(j)
self.wc = nwc

#-----------------------------------------------------------------------------
def findMaxHeader(self):
return

#-----------------------------------------------------------------------------
def compress(self):
"delete dublicated * values"
if len(self.wc) == 0 : return
wc_ = self.wc[:1]
for i in self.wc[1:]:
if i == ALL and i == wc_[-1] : continue
wc_.append(i)
self.wc = wc_

#-----------------------------------------------------------------------------
def matchExact(self,hd,pos = 0):
if pos == len(self.wc) : return len(hd) == 0
if self.wc[pos] == ALL :
if pos + 1 == len(self.wc) : return True
vl = self.wc[pos + 1]
cpos = -1
while True:
try : cpos = hd.index(vl,cpos + 1)
except : return False
if self.matchExact(hd[cpos + 1:],pos + 2) : return True
else:
if len(hd) == 0 : return False
if hd[0] != self.wc[pos] : return False
return self.matchExact(hd[1:],pos + 1)

#-----------------------------------------------------------------------------
def __or__(self,x):
return BFunctor.OR(self,x)

#-----------------------------------------------------------------------------
def __and__(self,x):
return BFunctor.AND(self,x)

#-----------------------------------------------------------------------------
def __call__(self,hd,st):
return self.matchExact(hd)
#-------------------------------------------------------------------------------
def CPWithoutWCEx(set,n,wc):
for i in _genAll([],n,wc,set) :
yield i
#-------------------------------------------------------------------------------
def CPWithoutWC(set,n,*dt):
if len(dt) == 0 :
wc = lambda hd,st : True
else:
wc = WC(dt[0])
#print wc.wc
for i in dt[1:]:
wc = wc | WC(i)
for i in _genAll([],n,wc,set) :
yield i
#-------------------------------------------------------------------------------
def CPWithoutWC_L(set,n,WCs):
for i in CPWithoutWC(set,n,*WCs):
yield i
#-------------------------------------------------------------------------------
def CPWithoutWCEx_L(set,n,WCs):
for i in CPWithoutWCEx(set,n,*WCs):
yield i
#-------------------------------------------------------------------------------
def main():
for i in CPWithoutWC_L(['abc','xyz'],3,[':abc*xyz']):
print i
#-------------------------------------------------------------------------------
if __name__ == "__main__" : main()
#-------------------------------------------------------------------------------
 
T

Tomasz Zielonka

The python code below generates a cartesian product subject to any
logical combination of wildcard exclusions. For example, suppose I want
to generate a cartesian product S^n, n>=3, of [a,b,c,d] that excludes
'*a*b*' and '*c*d*a*'. See below for details.

CHALLENGE: generate an equivalent in ruby, lisp, haskell, ocaml, or in
a CAS like maple or mathematica.

What is your goal? You want to learn or to cause a flamewar? ;-)

Anyway, I found the problem entertaining, so here you go, here is my
Haskell code. It could be shorter if I didn't care about performance and
wrote in specification style. It's not very efficient either, because it
will generate all lists matching the given patterns.

In GHCi you can test it by:

$ ghci
:l WildCartesian.hs
test

I apologise for the lack of comments.

----8<--------8<--------8<--------8<--------8<--------8<--------8<--------8<----
module WildCartesian where

import Data.Set (Set)
import qualified Data.Set as Set
import Control.Monad
import Control.Exception (assert)
import Maybe
import List

data Pat a = All | Lit a deriving Show

generateMatching :: (Ord a) => Int -> Set a -> [Pat a] -> [[a]]
generateMatching 0 _ [] = [[]]
generateMatching 0 _ (_:_) = []
generateMatching len alphabet (Lit x : ps)
| x `Set.member` alphabet =
[ (x : xs) | xs <- generateMatching (len - 1) alphabet ps ]
| otherwise =
[ ]
generateMatching len alphabet (All : ps) =
[ (x : xs)
| x <- Set.toList alphabet
, xs <- unionSorted
(generateMatching (len - 1) alphabet ps)
(generateMatching (len - 1) alphabet (All : ps)) ]
`unionSorted`
generateMatching len alphabet ps
generateMatching _ _ [] = []

generateNotMatching :: (Ord a) => [a] -> Int -> [[Pat a]] -> [[a]]
generateNotMatching alphabet len patterns =
generateMatching len alphaSet [All]
`subtractSorted`
foldr unionSorted []
(map (generateMatching len alphaSet . simplifyPat) patterns)
where
alphaSet = Set.fromList alphabet

simplifyPat (All : All : ps) = simplifyPat (All : ps)
simplifyPat (p : ps) = p : simplifyPat ps
simplifyPat [] = []

joinSorted :: Ord a => [a] -> [a] -> [(Maybe a, Maybe a)]
joinSorted (x1:x2:_) _ | assert (x1 < x2) False = undefined
joinSorted _ (y1:y2:_) | assert (y1 < y2) False = undefined
joinSorted (x:xs) (y:ys) =
case x `compare` y of
LT -> (Just x, Nothing) : joinSorted xs (y:ys)
EQ -> (Just x, Just y) : joinSorted xs ys
GT -> (Nothing, Just y) : joinSorted (x:xs) ys
joinSorted (x:xs) [] = (Just x, Nothing) : joinSorted xs []
joinSorted [] (y:ys) = (Nothing, Just y) : joinSorted [] ys
joinSorted [] [] = []

unionSorted :: Ord a => [a] -> [a] -> [a]
unionSorted xs ys = catMaybes (map (uncurry mplus) (joinSorted xs ys))

subtractSorted :: Ord a => [a] -> [a] -> [a]
subtractSorted xs ys = catMaybes (map f (joinSorted xs ys))
where
f (Just x, Nothing) = Just x
f _ = Nothing

test = do
t [1,2] 3 [[Lit 1, All, Lit 2]]
t ['a','b'] 3 [[Lit 'a', All, Lit 'b'], [Lit 'b', All, Lit 'a']]
t [1,2] 3 [[Lit 1, All, Lit 2], [Lit 2, All, Lit 1]]
where
t a b c = do
putStrLn (concat (intersperse " " ["generateMatching", show a, show b, show c]))
mapM_ (putStrLn . (" "++) . show) (generateNotMatching a b c)
----8<--------8<--------8<--------8<--------8<--------8<--------8<--------8<----

Best regards
Tomasz
 
T

Tomasz Zielonka

Tomasz said:
putStrLn (concat (intersperse " " ["generateMatching", show a, show b, show c]))

Minor correction: it should be "generateNotMatching".

Best regards
Tomasz
 
T

Tomasz Zielonka

Major correction (missing case):

Tomasz said:
generateMatching :: (Ord a) => Int -> Set a -> [Pat a] -> [[a]]
generateMatching 0 _ [] = [[]]
generateMatching 0 alphabet (All:ps) = generateMatching 0 alphabet ps
generateMatching 0 _ (_:_) = []

Best regards
Tomasz
 
W

wkehowski

Flame war? Absolutely not. My reason is to learn. There are many sites
dedicated to reasonably objective comparisons between languages. Here
are two examples:

http://www.smallscript.org/Language Comparison Chart.asp
http://www.jvoegele.com/software/langcomp.html

The wildcard exclusion problem is interesting enough to have many
distinct, elegant solutions in as many languages. It would be
interesting to see if they converge to roughly the same solution or if
there are essential differences. And your code is a crash course in
Haskell! Tossing aside the 'flame war' inquiry your code response is my
only goal. I hope many others find the problem and responses as
fascinating as I do.
 
W

wkehowski

The point is to submit elegant code that showcases the features of each
language. And the problem is, just to clarify, given a set WC of
wildcards in any logical combination, and if WC(S^n) is the set all s
in S^n that matches the wildcards, then efficiently generate the
complement S^n\WC(S^n). You are free to restate the problem in any
equivalent way.
 
W

Wade Humeniuk

Without much testing. Common Lisp

Pattern exclusions are made lispy.


(defun all-lists (list length)
(unless (zerop length)
(if (= length 1) (mapcar #'list list)
(loop for elt in list
nconc
(mapcar (lambda (rest)
(cons elt rest))
(loop for rest on list
nconc (all-lists rest (1- length))))))))

(defun cp-without-wc (source-list &rest patterns)
(let* ((length (length (first patterns)))
(all-lists (all-lists source-list length)))
(dolist (pattern patterns)
(setf all-lists
(set-difference all-lists
(mapcar (lambda (insertion)
(let ((cp (copy-list pattern)))
(loop for place on cp
when (eql :any (car place)) do
(setf (car place) (pop insertion)))
cp))
(all-lists source-list (count :any pattern)))
:test #'equal)))
(remove-duplicates all-lists :test #'equal)))

CL-USER 22 > (cp-without-wc '(a b) '(a :any b) '(b :any a))
((A A A) (A B A) (B A B) (B B B))

CL-USER 23 > (cp-without-wc '(abc xyz) '(abc :any xyz))
((XYZ XYZ XYZ) (XYZ XYZ ABC) (XYZ ABC XYZ) (XYZ ABC ABC) (ABC XYZ ABC) (ABC ABC ABC))

CL-USER 24 > (cp-without-wc '(a b) '(a :any :any))
((B B B) (B B A) (B A B) (B A A))

CL-USER 25 > (cp-without-wc '(a b) '(a :any :any) '(b :any :any))
NIL

CL-USER 26 > (cp-without-wc '(a b) ':)any :any b))
((B B A) (B A A) (A B A) (A A A))

CL-USER 27 >

Wade
 
W

wkehowski

What I have in mind is the efficient, <enumerated> generation of the
complement S^n/WC(S^n). A good program should initialize, generate, and
terminate.

T=cartprodex(S,n,WC); //initialize
for all i in T do
what you want with i
test to see if any more
terminate if not

and it should do this without explicitly generating WC and then
complementing. For example, if the cardinality of S is m, and the WC is
just '*a*b*', with a != b, then EX(S^n):=S^n\WC(S^n) has cardinality
(m-1)^(n-1)*(m+n-1). Specifically, if m=5 and n=10, then |EX|=3670016
while |S^10|=9765625, so that |EX|/|S^10| is about 0.3758. In general
the program should directly generate EX from arbitrary WC. Of course,
in practice the WC should themselves occur in a logically consistent
manner, but let's just assume they're a given.
 
F

funkyj

here is my version of the same.

REPL output:

CL-USER> (tests)


set = (1 2)
n = 3
patterns = ((1 ANY 2))
-----------------------
(1 1 1)
(1 2 1)
(2 1 1)
(2 1 2)
(2 2 1)
(2 2 2)


set = (A B)
n = 3
patterns = ((A ANY B) (B ANY A))
-----------------------
(A A A)
(A B A)
(B A B)
(B B B)


set = (1 2)
n = 3
patterns = ((1 ANY 2) (2 ANY 1))
-----------------------
(1 1 1)
(1 2 1)
(2 1 2)
(2 2 2)
NIL
CL-USER>

source:

;;;; cartesian products minus wildcard patterns per:
;;;;
;;;; >Newsgroups: comp.lang.lisp, etc...
;;;; >Subject: Programming challenge: wildcard exclusion in cartesian
products
;;;; >Date: 16 Mar 2006 03:14:23 -0800
;;;;
;;;;

(defun show-me (x) (format t "~A~%" x))

(defun set^n (fn set n &optional acc)
"call `fn' on each permutation of `set' raised to the `n' power"
(if (<= n 0)
(funcall fn (reverse acc))
(dolist (e set)
(set^n fn set (- n 1) (cons e acc)))))

;; test set^n by printing and visually inspecting the result
(defun pr-set^n (set n) (set^n #'show-me set n))

;; curry `set^n' so that `fn' is the only parameter
(defun set^n-gen (set n)
(lambda (fn) (set^n fn set n)))

(defun mk-matchl-p (pat-list)
"return a function that tests a value against the patterns in
`pat-list'"
(labels ((matchp (pat val)
(cond ((null pat) t)
((or (eq (car pat) (car val))
(eq (car pat) :any))
(matchp (cdr pat) (cdr val))))))
(lambda (val)
"predicate: return true if val matches any pattern in `pat-list'"
(dolist (p pat-list)
(if (matchp p val)
(return t))))))

(defun not-fp (f-pred)
"return the complement of predicate `f-pred'"
(lambda (x) (not (funcall f-pred x))))

;; f-gen is a generator of the form returned by set^n-gen
(defun accumulate-if (f-gen f-pred)
"accumulate values generated by f-gen that satisfy f-pred"
(let (acc)
(funcall f-gen (lambda (x) (if (funcall f-pred x) (push x acc))))
(nreverse acc)))

;; `pr-set^n-withoutWC' is the lisp equivalent (more or less) of
;; python code:
;; >>> for i in cp.CPWithoutWC(x,y,z): print i
(defun pr-set^n-withoutWC (set n pat-list)
(format t "~%~%set = ~A~%n = ~A~%patterns = ~A~%~A~%"
set n pat-list "-----------------------")
(dolist (e (accumulate-if (set^n-gen set n)
(not-fp (mk-matchl-p pat-list))))
(format t "~A~%" e)))

(defun tests ()
"generate test output per the original problem examples"
(pr-set^n-withoutWC '(1 2) 3 '((1 :any 2)))
(pr-set^n-withoutWC '(a b) 3 '((a :any b) (b :any a)))
(pr-set^n-withoutWC '(1 2) 3 '((1 :any 2) (2 :any 1))))
 
F

funkyj

NOTE: I am a lisp newbie. I'm sure our resident lisp experts can
create much better (both faster, shorter and clearer) solutions than
the one above.

Even I could have created something shorter but I thought it would be
fun to apply the "utility function" approach in decomposing the
problem.

--jfc
 
S

sa

in k:

cp:{[c;n;p]+(n#c)_vs(!_ c^n)_dvl,/{2_sv+(,/,/:\:)/(),/:mad:[x;&x=-1;:[;!c]]}'p}

examples:

cp[2;3;,0 -1 1]
(0 0 0
0 1 0
1 0 0
1 0 1
1 1 0
1 1 1)

cp[2;3;(0 -1 1;1 -1 0)]
(0 0 0
0 1 0
1 0 1
1 1 1)

cp[2;3;(0 -1 1;1 -1 1)]
(0 0 0
0 1 0
1 0 0
1 1 0)

arguments of cp:

c = cardinality of the input set
n = power
p = list of patterns (-1 = wildcard)

the algorithm directly computes the target set. in other words,
it does not generate the set, then filter the matches from the
target.

modifying cp to accept s instead of the cardinality of s,
patterns expressed in terms of elements of s, &c. adds nothing
of interest to the problem.

The python code below generates a cartesian product subject to any
logical combination of wildcard exclusions. For example, suppose I want
to generate a cartesian product S^n, n>=3, of [a,b,c,d] that excludes
'*a*b*' and '*c*d*a*'. See below for details.

CHALLENGE: generate an equivalent in ruby, lisp, haskell, ocaml, or in
a CAS like maple or mathematica.

#-------------------------------------------------------------------------------
# Short algorithm description
# using function _genAll the program generates
# cartesian product without sets, which match
# some wildcarts
# Sets generation uses recursion ->
# first of all sets will be generated with dimension 1 and than
filtered through wildcarts
# then sets will be generated with dimension 2 and filtered again
# until the required set dimension is reached
# Program avoids explicit generation of some part of CP sets
# if the end of whildcart is asterics (*) and if the first part of
whildcart (without astrics)
# matches current set => then this set will be filtered out and won't
be used in
# higher dimension set generation
# example *,1,*,2,* [1,2] dim = 10
# by dimension 2 only arrays [1,1],[2,1],[2,2] are will be generated
# => array [1,2] won't be used in next recursion levels
#-------------------------------------------------------------------------------
# To obtaine result use function
# CPWithoutWC first parameter is a list of any elements
(char,int,string,class exemplar ,.... any type)
# secont param is CP dimension
# other parameters are wildcarts => lists with any values then may
include
# special value ALL - asterics equivalent
#Example of usage: command line
# >>> import cartesianProduct as cp
# >>> for i in cp.CPWithoutWC([1,2],3,[1,cp.ALL,2]):
# print i
# [1, 1, 1]
# [1, 2, 1]
# [2, 1, 1]
# [2, 1, 2]
# [2, 2, 1]
# [2, 2, 2]
# >>> for i in
cp.CPWithoutWC(['a','b'],3,['a',cp.ALL,'b'],['b',cp.ALL,'a']):
# print i
# ['a', 'a', 'a']
# ['a', 'b', 'a']
# ['b', 'a', 'b']
# ['b', 'b', 'b']
# >>> for i in cp.CPWithoutWC([1,2],3,[1,cp.ALL,2],[2,cp.ALL,1]):
# print i
# [1, 1, 1]
# [1, 2, 1]
# [2, 1, 2]
# [2, 2, 2]
# >>>
# >>> for i in cp.CPWithoutWC([1,2],121212,[1,cp.ALL],[2,cp.ALL,1]):
# print i
## execute immediately
# >>>
# if You don't want to print cp. before ALL and CPWithoutWC use import
like this:
# from cartesianProduct import ALL,CPWithoutWC
# CPWithoutWC is a python generator. Which means that it returns values

# immediately and generate next in next cycle.
# Program example
#
## from cartesianProduct import ALL,CPWithoutWC
## def main():
## for i in
cp.CPWithoutWC(['a','b'],3,['a',cp.ALL,'b'],['b',cp.ALL,'a']):
## ## do what You want with current value
## .........
## ## go back to for statement and generate new
## if __name__ == "__main__":
## main()
#
"""
Using logical combinations of WC:
1) It's possible to pass on to the function CPWithoutWC
any number of wildcarts after first two parameters, for example:
CPWithoutWC([1,2],121212,[1,cp.ALL],[2,cp.ALL,1],...)
where ... - is any other wildcart's additional function parameters.
Number of additional WC is not limited.
Function will filter out all combinations, which match any passed on
WC.
It's equal to WC1 | WC2 | .... , where | is python analog of OR
logical operations.
2) To use more complex WC combinations follow these steps
a) First of all create all needed WC
b) Then use operators |, & and braces () to create combinations
required and then pass it on to function
CPWithoutWCEx as the third parameter. Don't use "or" and "and"
python statement, otherwise program will
work improper. First two parameters of this function are the same as
of CPWithoutWC function - set of
elements and CP dimension. An example of what was described above in
command line:
from cartesianProduct import ALL,CPWithoutWC,CPWithoutWCEx,WC
a = WC([ALL,1,ALL])
b = WC([ALL,2,ALL])
c = a & b #filter out all sets which match a and b
for i in CPWithoutWCEx([1,2],3,c) : print i
[1, 1, 1]
[2, 2, 2]
# all sets where both 1 and 2 are present will be filtered out
d = a | b
for i in CPWithoutWCEx([1,2],3,d) : print i
# returns nothing
for i in CPWithoutWCEx([1,2,3],3,d) : print i [3, 3, 3]
a = WC([2,1,ALL])
b = WC([1,2,ALL])
c = WC([ALL,2])
d = ( a | b ) & c
for i in CPWithoutWCEx([1,2],3,d) : print i
[1, 1, 1]
[1, 1, 2]
[1, 2, 1]
[2, 1, 1]
[2, 2, 1]
[2, 2, 2]
# filters out all combinations which start with [1,2] or [2,1]
and end with 2

Number of WC, which are used to form logical combinations is not
limited.
"""
"""
13.02.2006
a)Two new function - CPWithoutWCEx_L and CPWithoutWC_L are added.
Their interface is the same as of CPWithoutWCEx and CPWithoutWC
accordingly, except that the third parameter is WC list and
they accept strictly three parameters.

As You can see these functions are very simple because
python is quite flexible =>
def s(x,y): return x * y
d = [3,2]
s(*d) ## == s(3,2)
6

b)Now WC can take string as parameter, and You can use string
as parameters of functions CPWithoutWC and CPWithoutWC_L
instead of WC lists.
Strings transform into WC according to these rules
1)If first symbol in the string is
alphanumeric (a-z or A-Z or 0-9) or '*'
character the every character of the string will be recognized as
a distinct set element. Examples:
"ad*d*" == ['a','d',cp.ALL,'d',cp.ALL]
"*A*b3*%^('" == [cp.ALL,'A',cp.ALL.'b','3',cp.ALL,'%','(',"'"]
2)If first character is not (alphanumeric or '*')
it will be treated as a delimitator. Examples:
":a:A:1:*" == ['a','A','1',cp.ALL]
":aA1:*" == ['aA1',cp.ALL]
it's not necessary to write delimitators around the asterics
":aA1*" == ['aA1',cp.ALL]
"%aA%1*" == ['aA','1',cp.ALL]
3)If all non delimit and non asterics character in elements
are digits => they will be treated as numbers.Examples:
"123*" == [1,2,3,cp.ALL]
":12:3*" == [12,3,cp.ALL]
but
":12:a:3*" == ['12','a','3',cp.ALL]
Examples of use:
for i in cp.CPWithoutWC(['a','b'],3,'a*b','b*a'):
print i
['a', 'a', 'a']
['a', 'b', 'a']
['b', 'a', 'b']
['b', 'b', 'b']
for i in cp.CPWithoutWC_L(['a','b'],3,['a*b','b*a']):
print i
['a', 'a', 'a']
['a', 'b', 'a']
['b', 'a', 'b']
['b', 'b', 'b']
#You can mixe strings and lists for wildcarts
for i in cp.CPWithoutWC_L(['a','b'],3,['a*b',['b',cp.ALL,'a']]):
print i
['a', 'a', 'a']
['a', 'b', 'a']
['b', 'a', 'b']
['b', 'b', 'b']
for i in cp.CPWithoutWC_L(['abc','xyz'],3,[':abc*xyz']):
print i
['abc', 'abc', 'abc']
['abc', 'xyz', 'abc']
['xyz', 'abc', 'abc']
['xyz', 'abc', 'xyz']
['xyz', 'xyz', 'abc']
['xyz', 'xyz', 'xyz']
"""
#-------------------------------------------------------------------------------
class ALL(object):pass
#-------------------------------------------------------------------------------
class NO_ONE(object):pass
#-------------------------------------------------------------------------------
class BFunctor(object):
def __init__(self,func):
self.func = func
def __call__(self,*dt,**mp):
return self.func(*dt,**mp)
@classmethod
def OR(cls,x,y):
return cls(lambda *dt,**mp : x(*dt,**mp) | y(*dt,**mp))
@classmethod
def AND(cls,x,y):
return cls(lambda *dt,**mp : x(*dt,**mp) & y(*dt,**mp))

#-----------------------------------------------------------------------------
def __or__(self,x):
return BFunctor.OR(self,x)

#-----------------------------------------------------------------------------
def __and__(self,x):
return BFunctor.AND(self,x)
#-------------------------------------------------------------------------------
def _genAll(head,n,WCF,curr):
if len(curr) != 0 and n != 0:
for i in curr:
nhead = head +
if n != 1 :
# needed dimension are not reached
# -> we mast tell WC that some other values
# may concatenate in the end of nhead in next recursion levels
# but if WC is ended with asterics (ALL), than dosn't matter
# so i use special walue NO_ONE to resolve this problem
# if WC with final asterics like [1,2,3,ALL] are matched nhead
=>
# they matched nhead + [NO_ONE] to
# but if WC is like [1,ALL,2,3] => they dont match
[1,2,3,NO_ONE] =>
# they don't prevent to generate [1,2,3,4] on next recursion
level
x = WCF(nhead + [NO_ONE],curr)
else : x = WCF(nhead,curr)
if False == x:
if n == 1 : yield nhead
else:
for i in _genAll(nhead,n - 1,WCF,curr):
yield i
elif n == 0 :
yield head
#-------------------------------------------------------------------------------
class WC(object):
def __init__(self,wc):
self.wc = wc
self.transformWC()
self.num_els = 0
self.compress()
self.comphdr = None
self.findMaxHeader()
self.ln = len(self.wc)

#-----------------------------------------------------------------------------
def transformWC(self):
if self.wc.__class__ not in (str,unicode) : return
if len(self.wc) == 0 : return
if self.wc[0].isalnum() or self.wc[0] == "*":
wc = self.wc
else:
wc = self.wc[1:].split(self.wc[0])
nwc = []
for i in wc:
if i == '*' : nwc.append(ALL)
elif '*' in i :
for j in i.split('*'):
if j : nwc.append(j)
nwc.append(ALL)
del nwc[-1]
else : nwc.append(i)
#check if all elements are numbers or *
allnum = True
for i in nwc:
if i is ALL : continue
try : int(i)
except :
allnum = False
break
if allnum:
for i,j in enumerate(nwc):
if j is not ALL:
nwc = int(j)
self.wc = nwc

#-----------------------------------------------------------------------------
def findMaxHeader(self):
return

#-----------------------------------------------------------------------------
def compress(self):
"delete dublicated * values"
if len(self.wc) == 0 : return
wc_ = self.wc[:1]
for i in self.wc[1:]:
if i == ALL and i == wc_[-1] : continue
wc_.append(i)
self.wc = wc_

#-----------------------------------------------------------------------------
def matchExact(self,hd,pos = 0):
if pos == len(self.wc) : return len(hd) == 0
if self.wc[pos] == ALL :
if pos + 1 == len(self.wc) : return True
vl = self.wc[pos + 1]
cpos = -1
while True:
try : cpos = hd.index(vl,cpos + 1)
except : return False
if self.matchExact(hd[cpos + 1:],pos + 2) : return True
else:
if len(hd) == 0 : return False
if hd[0] != self.wc[pos] : return False
return self.matchExact(hd[1:],pos + 1)

#-----------------------------------------------------------------------------
def __or__(self,x):
return BFunctor.OR(self,x)

#-----------------------------------------------------------------------------
def __and__(self,x):
return BFunctor.AND(self,x)

#-----------------------------------------------------------------------------
def __call__(self,hd,st):
return self.matchExact(hd)
#-------------------------------------------------------------------------------
def CPWithoutWCEx(set,n,wc):
for i in _genAll([],n,wc,set) :
yield i
#-------------------------------------------------------------------------------
def CPWithoutWC(set,n,*dt):
if len(dt) == 0 :
wc = lambda hd,st : True
else:
wc = WC(dt[0])
#print wc.wc
for i in dt[1:]:
wc = wc | WC(i)
for i in _genAll([],n,wc,set) :
yield i
#-------------------------------------------------------------------------------
def CPWithoutWC_L(set,n,WCs):
for i in CPWithoutWC(set,n,*WCs):
yield i
#-------------------------------------------------------------------------------
def CPWithoutWCEx_L(set,n,WCs):
for i in CPWithoutWCEx(set,n,*WCs):
yield i
#-------------------------------------------------------------------------------
def main():
for i in CPWithoutWC_L(['abc','xyz'],3,[':abc*xyz']):
print i
#-------------------------------------------------------------------------------
if __name__ == "__main__" : main()
#-------------------------------------------------------------------------------
 
M

Marcin 'Qrczak' Kowalczyk

The python code below generates a cartesian product subject to any
logical combination of wildcard exclusions. For example, suppose I want
to generate a cartesian product S^n, n>=3, of [a,b,c,d] that excludes
'*a*b*' and '*c*d*a*'. See below for details.

I'm afraid that different programs in this thread has understood the
asterisk differently: that it matches any single element, or that it
matches any sequence of elements.
 
W

wkehowski

The asterisk '*' matches any sequence of elements, not just one
element. The wildcard '*1*2*' would then correspond to a tuple with a 1
preceding a 2 in any positions. The wc '1*2' would correspond to a
starting 1 and an ending 2 with anything in between. The wc *12* would
correspond to a 1 adjacent to a 2 with the pair in any position.
Possibilities like '*a*a*b*' and '*a*a*a*' of any length are also
allowed. If n is the dimension, then any n-tuple wc is just a point. My
apologies for the confusion.
 
D

Dan Piponi

Is this Haskell implementation what you want? It does the wildcard
matching through a state machine and it essentially threads the
state machine through the cartesian product, switching to the
ordinary cartesian product when possible as an optimisation.
The execution of the state machine is shared by strings with the
same prefix making it reasonably efficient even though the state
machine itself isn't optimised.

If it doesn't work, I'm sure it's only a few typos away...

-- generate strings of length n from alphabet l such that
-- the state machine, with transition function t, is not on
-- a final state (determined by function f) at the
-- end of the string.
-- If the state is ever 'unmatchable' (as determined by u)
-- we just return the cartesian product as no rejection
-- can take place.
generate f u t s 0 l = if f s then [] else [[]]
generate f u t s n l | u s = sequence (replicate n l)
| otherwise =
[a:b | a <- l, let s' = t s a,
b <- generate f u t s' (n-1) l]

-- The states are lists of regular expressions
-- where [a,b,..] means match a or b or...

-- This is the transition function for our machine.
transition pat a = pat >>= d a where
-- Brzozowski derivative
d a [] = []
d a p@('*':pat) = p:d a pat
d a (p:pat) | a==p = [pat]
| otherwise = []

-- A terminal state is one that matches the null string
terminal p = or $ map terminal' p where
terminal' "" = True
terminal' ('*':p) = terminal' p
terminal' _ = False

run n alphabet pat =
generate terminal null transition [pat] n alphabet

test = run 3 "abc" "aa*a"
 
W

Wade Humeniuk

What I have in mind is the efficient, <enumerated> generation of the
complement S^n/WC(S^n). A good program should initialize, generate, and
terminate.

T=cartprodex(S,n,WC); //initialize
for all i in T do
what you want with i
test to see if any more
terminate if not

and it should do this without explicitly generating WC and then
complementing. For example, if the cardinality of S is m, and the WC is
just '*a*b*', with a != b, then EX(S^n):=S^n\WC(S^n) has cardinality
(m-1)^(n-1)*(m+n-1). Specifically, if m=5 and n=10, then |EX|=3670016
while |S^10|=9765625, so that |EX|/|S^10| is about 0.3758. In general
the program should directly generate EX from arbitrary WC. Of course,
in practice the WC should themselves occur in a logically consistent
manner, but let's just assume they're a given.

Another attempt. I have made no special attempt to create an
exclusion language, just used an anonymous lambda predicate.


;; Wade Humeniuk

(defclass odometer ()
((base :initform 0 :accessor base)
(meter :initform nil :accessor meter)
(n-digits :initarg :n-digits :accessor n-digits)
(digit-set :initarg :digit-set :accessor digit-set)))

(defmethod initialize-instance :after ((obj odometer) &rest initargs)
(setf (base obj) (length (digit-set obj))
(meter obj) (make-array (n-digits obj) :initial-element 0)
(digit-set obj) (coerce (digit-set obj) 'vector)))

(defun inc-odometer (odometer)
(loop with carry = 1
for i from (1- (n-digits odometer)) downto 0
for digit = (incf (aref (meter odometer) i) carry)
if (= digit (base odometer)) do
(setf (aref (meter odometer) i) 0)
(setf carry 1)
else do
(setf carry 0)
while (not (zerop carry))))

(defun zero-meter-p (odometer)
(every #'zerop (meter odometer)))

(defmethod next-set ((obj odometer))
(prog1 (map 'list (lambda (digit)
(aref (digit-set obj) digit))
(meter obj))
(inc-odometer obj)))

(defclass cs-with-wc (odometer)
((exclusion :initarg :exclusion :accessor exclusion)
(at-end :initform nil :accessor at-end)))

(defmethod next-set ((obj odometer))
(tagbody
:next
(unless (at-end obj)
(let ((set (call-next-method)))
(when (zero-meter-p obj) (setf (at-end obj) t))
(if (not (funcall (exclusion obj) set))
(return-from next-set set)
(go :next))))))

(defun print-all-cs (set length exclusion)
(let ((cs-with-wc (make-instance 'cs-with-wc :n-digits length :digit-set set
:exclusion exclusion)))
(loop for set = (next-set cs-with-wc)
while set do (print set))))

CL-USER 134 > (cs-with-wc '(a b) 3 (lambda (set)
(destructuring-bind (x y z)
set
(or (and (eql x 'a) (eql z 'b))
(and (eql x 'b) (eql z 'a))))))

(A A A)
(A B A)
(B A B)
(B B B)
NIL

CL-USER 135 > (cs-with-wc '(a b) 3 (lambda (set)
(eql (second set) 'a)))

(A B A)
(A B B)
(B B A)
(B B B)
NIL

CL-USER 136 > (cs-with-wc '(abc xyz) 3 (lambda (set)
(and (eql (first set) 'abc)
(eql (third set) 'xyz))))

(ABC ABC ABC)
(ABC XYZ ABC)
(XYZ ABC ABC)
(XYZ ABC XYZ)
(XYZ XYZ ABC)
(XYZ XYZ XYZ)
NIL
 
W

Wade Humeniuk

Oops, problems cutting an pasting, should be,

;; Wade Humeniuk

(defclass odometer ()
((base :initform 0 :accessor base)
(meter :initform nil :accessor meter)
(n-digits :initarg :n-digits :accessor n-digits)
(digit-set :initarg :digit-set :accessor digit-set)))

(defmethod initialize-instance :after ((obj odometer) &rest initargs)
(setf (base obj) (length (digit-set obj))
(meter obj) (make-array (n-digits obj) :initial-element 0)
(digit-set obj) (coerce (digit-set obj) 'vector)))

(defun inc-odometer (odometer)
(loop with carry = 1
for i from (1- (n-digits odometer)) downto 0
for digit = (incf (aref (meter odometer) i) carry)
if (= digit (base odometer)) do
(setf (aref (meter odometer) i) 0)
(setf carry 1)
else do
(setf carry 0)
while (not (zerop carry))))

(defun zero-meter-p (odometer)
(every #'zerop (meter odometer)))

(defmethod next-set ((obj odometer))
(prog1 (map 'list (lambda (digit)
(aref (digit-set obj) digit))
(meter obj))
(inc-odometer obj)))

(defclass cs-with-wc (odometer)
((exclusion :initarg :exclusion :accessor exclusion)
(at-end :initform nil :accessor at-end)))

(defmethod next-set ((obj cs-with-wc))
(tagbody
:next
(unless (at-end obj)
(let ((set (call-next-method)))
(when (zero-meter-p obj) (setf (at-end obj) t))
(if (not (funcall (exclusion obj) set))
(return-from next-set set)
(go :next))))))

(defun print-all-cs (set length exclusion)
(let ((cs-with-wc (make-instance 'cs-with-wc :n-digits length :digit-set set
:exclusion exclusion)))
(loop for set = (next-set cs-with-wc)
while set do (print set))))

CL-USER 7 > (print-all-cs '(a b) 3 (lambda (set)
(destructuring-bind (x y z)
set
(or (and (eql x 'a) (eql z 'b))
(and (eql x 'b) (eql z 'a))))))

(A A A)
(A B A)
(B A B)
(B B B)
NIL

CL-USER 8 > (print-all-cs '(abc xyz) 3 (lambda (set)
(and (eql (first set) 'abc)
(eql (third set) 'xyz))))

(ABC ABC ABC)
(ABC XYZ ABC)
(XYZ ABC ABC)
(XYZ ABC XYZ)
(XYZ XYZ ABC)
(XYZ XYZ XYZ)
NIL

CL-USER 9 >
 
W

wkehowski

-- The states are lists of regular expressions
-- where [a,b,..] means match a or b or...

I haven't run or studied your program yet myself but what I had in mind
was that the list of wc's are *all* to be excluded, so the list
[wc1..wcn] is to correspond generating all tuples matching not(wc1 and
... and wcn). Maybe you're already doing that. The wc's themselves
could be logical statements among the 'primitive' wc's. That's why I
named it the 'wildcard exclusion problem'. It's a lot easier to specify
a list of simpler wc's than create a long logical expression.

Thanks to all who are making this an interesting thread.
 
D

Dinko Tenev

Heh, here's a Prolog version:

==========================================================

gen( _, 0, [] ) :- !.
gen( S, N, [H | T] ) :- member( H, S ), M is N - 1, gen( S, M, T ).

==========================================================

Yep, that's it :)))

Here's how to test it:

==========================================================

1 ?- gen([a, b, c], 3, X), print(X), nl, fail.
[a, a, a]
[a, a, b]
[a, a, c]
[a, b, a]
[a, b, b]
[a, b, c]
[a, c, a]
[a, c, b]
[a, c, c]
[b, a, a]
[b, a, b]
[b, a, c]
[b, b, a]
[b, b, b]
[b, b, c]
[b, c, a]
[b, c, b]
[b, c, c]
[c, a, a]
[c, a, b]
[c, a, c]
[c, b, a]
[c, b, b]
[c, b, c]
[c, c, a]
[c, c, b]
[c, c, c]

No
2 ?- gen([a, b, c], 3, X), not(member(X, [[a, _, _], [_, b, _], [_, _,
c]])), print(X), nl, fail.
[b, a, a]
[b, a, b]
[b, c, a]
[b, c, b]
[c, a, a]
[c, a, b]
[c, c, a]
[c, c, b]

No

==========================================================
 
T

Tomasz Zielonka

Dan said:
Is this Haskell implementation what you want? It does the wildcard
matching through a state machine and it essentially threads the
state machine through the cartesian product, switching to the
ordinary cartesian product when possible as an optimisation.
The execution of the state machine is shared by strings with the
same prefix making it reasonably efficient even though the state
machine itself isn't optimised.

I've implemented the same concept yesterday evening:

----8<--------8<--------8<--------8<--------8<--------8<--------8<--------8<----
module WildCartesian where

import List

data Pat a = All | Lit a deriving (Show, Eq)

advancePattern :: Eq a => a -> [Pat a] -> [[Pat a]]
advancePattern y (Lit x : ps)
| x == y = [ps]
| otherwise = []
advancePattern y (All : ps) = [All : ps] ++ [ps] ++ advancePattern y ps
advancePattern _ [] = []

generateNotMatching :: Eq a => [a] -> Int -> [[Pat a]] -> [[a]]
generateNotMatching alphabet = gen []
where
gen _ n pats
| any (\ps -> all (== All) ps && (not (null ps) || n == 0)) pats = []
gen acc 0 _
= [reverse acc]
gen acc n pats
= [ w | x <- alphabet
, let pats' = [ p' | p <- pats, p' <- advancePattern x p ]
, w <- gen (x : acc) (n - 1) pats' ]

test :: IO ()
test = do
t [1,2] 3 [[Lit 1, All, Lit 2]]
t ['a','b'] 3 [[Lit 'a', All, Lit 'b'], [Lit 'b', All, Lit 'a']]
t [1,2] 3 [[Lit 1, All, Lit 2], [Lit 2, All, Lit 1]]
where
t a b c = do
putStrLn (concat (intersperse " " ["generateNotMatching", show a, show b, show c]))
mapM_ (putStrLn . (" "++) . show) (generateNotMatching a b c)
----8<--------8<--------8<--------8<--------8<--------8<--------8<--------8<----

Best regards
Tomasz
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,755
Messages
2,569,537
Members
45,020
Latest member
GenesisGai

Latest Threads

Top