Predictably Breaking Up a Cartesian Product

B

Brad

Here is some code I wrote that breaks up a Cartesian Product. It
compiles and runs fine:

#include <iostream>
#include <cmath>

void cartesian_product( const std::string& my_string, int b1, int e1 )
{

std::string::const_iterator it1, it2, it3;
std::string possibility;
possibility.reserve( 3 );

for ( it1 = my_string.begin()+b1; it1 != my_string.end()-e1; ++it1 )
{
possibility.push_back( *it1 );

for ( it2 = my_string.begin(); it2 != my_string.end(); ++it2 )
{
possibility.push_back( *it2 );

for ( it3 = my_string.begin(); it3 != my_string.end(); ++it3 )
{
possibility.push_back( *it3 );
std::cout << possibility << std::endl;
possibility.resize( 2 );
}

possibility.resize( 1 );
}
}
possibility.clear();

//// Show the break-ups
std::cout << "---" << std::endl;
}

int main ()
{
const std::string my_string ("abc"); // The string you wish to
generate the CP from.
int magic = my_string.size()-1; // Always 1 less than your string
size.
int j = 0;

// The number of character... must match number of loops in
cartesian_product()
int length = 3;
double total = pow( my_string.size(), length );
std::clog << "Total Combinations = " << total << std::endl;

// Generate CP chunks
for ( int i = magic; i >= 0; --i )
{
cartesian_product( my_string, j, i );
++j;
}
}

------------------------------

You can produce one full (unchunked) CP by doing this:

cartesian_product( my_string, 0, 0 );

I'd like to make it so that I pass an int (say 2, 3 or 4) to the
cartesian_product function so that it would produce 2, 3 or 4 chunks.
Basically control the number of chunks it produces. So for example if
there are 64 possibilities and I pass in 2, then the first run
produces the first 32 possibilities and the second the last 32. Any
suggestions?

Thanks,
Brad
 
B

Brad

Here is some code I wrote that breaks up a Cartesian Product. It
compiles and runs fine:

#include <iostream>
#include <cmath>

void cartesian_product( const std::string& my_string, int b1, int e1 )
{

        std::string::const_iterator it1, it2, it3;
        std::string possibility;
        possibility.reserve( 3 );

        for ( it1 = my_string.begin()+b1; it1 != my_string.end()-e1; ++it1 )
        {
                possibility.push_back( *it1 );

                for ( it2 = my_string.begin(); it2 != my_string.end(); ++it2 )
                {
                        possibility.push_back( *it2 );

                        for ( it3 = my_string.begin(); it3 != my_string.end(); ++it3 )
                        {
                                possibility.push_back( *it3 );
                                std::cout << possibility << std::endl;
                                possibility.resize( 2 );
                        }

                        possibility.resize( 1 );
                }
        }
        possibility.clear();

        //// Show the break-ups
        std::cout << "---" << std::endl;

}

int main ()
{
        const std::string my_string ("abc"); // The string you wish to
generate the CP from.
        int magic = my_string.size()-1; // Always 1 less than your string
size.
        int j = 0;

        // The number of character... must match number of loops in
cartesian_product()
        int length = 3;
        double total = pow( my_string.size(), length );
        std::clog << "Total Combinations = " << total << std::endl;

        // Generate CP chunks
        for ( int i = magic; i >= 0; --i )
        {
                cartesian_product( my_string, j, i );
                ++j;
        }

}

------------------------------

You can produce one full (unchunked) CP by doing this:

cartesian_product( my_string, 0, 0 );

I'd like to make it so that I pass an int (say 2, 3 or 4) to the
cartesian_product function so that it would produce 2, 3 or 4 chunks.
Basically control the number of chunks it produces. So for example if
there are 64 possibilities and I pass in 2, then the first run
produces the first 32 possibilities and the second the last 32. Any
suggestions?

Thanks,
Brad

Whoops... possibility.clear() is in the wrong place (I wrote that
quickly), it should be in the scope of the first for loop (moved up 2
lines). My apologies.
 
F

Francesco S. Carta

Here is some code I wrote that breaks up a Cartesian Product. It
compiles and runs fine:

You can produce one full (unchunked) CP by doing this:

cartesian_product( my_string, 0, 0 );

I'd like to make it so that I pass an int (say 2, 3 or 4) to the
cartesian_product function so that it would produce 2, 3 or 4 chunks.
Basically control the number of chunks it produces. So for example if
there are 64 possibilities and I pass in 2, then the first run
produces the first 32 possibilities and the second the last 32. Any
suggestions?

If you want a function that behaves depending on its previous calls you
need to give some static memory to it.

Such a function could, for example:

- store a local copy of all the incoming data;
- store an additional "progress" counter
- produce chunk_size combinations starting from progress
- update the progress counter

subsequent calls could then lookup the arguments and, if matching a
previously incomplete job, continue from there.

(all the above can be done passing chunks_number instead of chunk_size
as you request, the algorithm wouldn't be any trickier)

A functor could well done the job, maybe better, because you can throw
in additional features as, for example, querying the current progress of
a particular job, avoiding to lookup previously entered data - calling
the object with an empty pair of parentheses is just a minor, secondary
typing advantage :)

Hope these two cents help.
 
J

Jonathan Lee

Here is some code I wrote that breaks up a Cartesian Product. It
compiles and runs fine: [snip]

You can produce one full (unchunked) CP by doing this:

cartesian_product( my_string, 0, 0 );

I'd like to make it so that I pass an int (say 2, 3 or 4) to the
cartesian_product function so that it would produce 2, 3 or 4 chunks.
Basically control the number of chunks it produces. So for example if
there are 64 possibilities and I pass in 2, then the first run
produces the first 32 possibilities and the second the last 32. Any
suggestions?

A couple things:

As far as I can tell, you're only doing 3 dimensional products?
When I try changing it to, say, a 2 element string I get
"Total Combinations = 4" and then 8 outputs

You may want to look at Vol. 4 Fascicle 2 of Knuth's "The Art
of Computer Programming" for alternative ways of doing this.

As to your direct question you can either return the whole
(mathematical) set as a std::vector and break it up manually,
or keep a variable that increments every time you print. If
you get to (total combinations / number of blocks), output
a line and reset the counter.

--Jonathan
 
B

Brad

A couple things:

 As far as I can tell, you're only doing 3 dimensional products?
 When I try changing it to, say, a 2 element string I get
 "Total Combinations = 4" and then 8 outputs

2 elements work fine for me:

../cartesian
Total Combinations = 8

aaa
aab
aba
abb
---
baa
bab
bba
bbb
---
 
B

Brad

snip
 As to your direct question you can either return the whole
 (mathematical) set as a std::vector and break it up manually,

.... also forgot... big products and little memory do not go well
together.
 
J

Joe Greer

Whoops... possibility.clear() is in the wrong place (I wrote that
quickly), it should be in the scope of the first for loop (moved up 2
lines). My apologies.

What about wrapping it in a function object?

Then you have all sorts of possibilities for maintaining state. Just a
thought. As a start...

struct CartesianProduct
{
CartesianProduct(int chunks = 0) : chunks_(chunks) {}
void operator()(/* some appropriate args*/)
{
// some appropriate action
}
private:
int chunks_;
// any other state data
};

CartesianProduct f(2);

f(my_string, 0);

joe
 
J

Jonathan Lee

2 elements work fine for me:


Ah, now that I look more closely I see I made an assumption. In
particular, that changing "length" to 2 would produce only
combinations of 2 characters.
... also forgot... big products and little memory do not go well
together.

Err.. no, they don't. Hence my other suggestions.

--Jonathan
 
K

Keith H Duggar

Here is some code I wrote that breaks up a Cartesian Product. It
compiles and runs fine:

#include <iostream>
#include <cmath>

void cartesian_product( const std::string& my_string, int b1, int e1 )
{

        std::string::const_iterator it1, it2, it3;
        std::string possibility;
        possibility.reserve( 3 );

        for ( it1 = my_string.begin()+b1; it1 != my_string.end()-e1; ++it1 )
        {
                possibility.push_back( *it1 );

                for ( it2 = my_string.begin(); it2 != my_string.end(); ++it2 )
                {
                        possibility.push_back( *it2 );

                        for ( it3 = my_string.begin(); it3 != my_string.end(); ++it3 )
                        {
                                possibility.push_back( *it3 );
                                std::cout << possibility << std::endl;
                                possibility.resize( 2 );
                        }

                        possibility.resize( 1 );
                }
        }
        possibility.clear();

        //// Show the break-ups
        std::cout << "---" << std::endl;

}

int main ()
{
        const std::string my_string ("abc"); // The string you wish to
generate the CP from.
        int magic = my_string.size()-1; // Always 1 less than your string
size.
        int j = 0;

        // The number of character... must match number of loops in
cartesian_product()
        int length = 3;
        double total = pow( my_string.size(), length );
        std::clog << "Total Combinations = " << total << std::endl;

        // Generate CP chunks
        for ( int i = magic; i >= 0; --i )
        {
                cartesian_product( my_string, j, i );
                ++j;
        }

}

------------------------------

You can produce one full (unchunked) CP by doing this:

cartesian_product( my_string, 0, 0 );

I'd like to make it so that I pass an int (say 2, 3 or 4) to the
cartesian_product function so that it would produce 2, 3 or 4 chunks.
Basically control the number of chunks it produces. So for example if
there are 64 possibilities and I pass in 2, then the first run
produces the first 32 possibilities and the second the last 32. Any
suggestions?

There are many ways to solve this problem. Below is one (not the
best) solution. This solution can be improved in many ways which
I leave to you as an exercise. Also note that your algorithm and
the one below outputs a multiset not a set, if that matters for
your purpose.

<code>
#include <cstdlib>
#include <iostream>
#include <string>

using std::string ;

int TheNumBeg ;
int TheNumEnd ;
int TheNumCur ;

void
scanToCout (
string::const_iterator i
, string::const_iterator e
) {
if ( TheNumCur >= TheNumBeg && TheNumCur < TheNumEnd ) {
for ( ; i != e ; ++i ) {
std::cout << *i ;
}
std::cout << '\n' ;
}
++TheNumCur ;
}

template <
class miter
, class citer
, class Func >
void
scanProduct (
citer const & valseti
, citer const & valsete
, miter const & bufferi
, miter const & bufferm
, miter const & buffere
, Func const & scan
) {
if ( bufferm != buffere ) {
miter buffern = bufferm ;
++buffern ;
for ( citer vi = valseti ; vi != valsete ; ++vi ) {
*bufferm = *vi ;
scanProduct(valseti,valsete,bufferi,buffern,buffere,scan) ;
}
} else {
scan(bufferi,buffere) ;
}
}

int
main (
int argc
, char * argv[]
) {
if ( argc < 5 ) return -1 ;

string const valset(argv[1]) ;
int const bufsiz = std::atoi(argv[2]) ;
int const numbeg = std::atoi(argv[3]) ;
int const numend = std::atoi(argv[4]) ;

TheNumBeg = numbeg ;
TheNumEnd = numend ;

if ( valset.empty() ) return 0 ;

string buffer ;
buffer.resize(bufsiz>0?bufsiz:0) ;

scanProduct (
valset.begin(), valset.end()
, buffer.begin(), buffer.begin(), buffer.end()
, scanToCout
) ;

return 0 ;
}
</code>

For example to output value range [9,18)

cartprod abc 3 9 18

KHD
 
B

Brad

Thanks for the tips.

I wrote some code to calculate some preliminary things. I have very
big sets (thus the use of ULL). Now I just need to write a Cartesian
Product generator that given a set, a number and a starting point will
generate that number of possibilities from that set at that starting
point. Wow, that's a mouth full.

My goal is to generate a full Cartesian Product in small pieces using
multiple CPUs while having each CPU do some calculations, rather than
one doing it all.

I appreciate all the code examples and tips. If there are any more
ideas about the generator itself, let me know. This below code works
and compiles.


#include <iostream>
#include <vector>

// Boost Includes
#include <boost/thread/thread.hpp>

// To compile
g++ -static -O3 -Wall cartesian.cpp -o cartesian \
-I/usr/local/include/boost-1_43 /usr/local/lib/libboost_thread-mt.a \
-pthread

bool debug = true;

unsigned long long possibilities( std::string& n, unsigned int
length )
{
// Size of the character set to the power of length.
// pow( n ) is cleaner, but I don't want a double.
unsigned long long p = 1;
unsigned long long x = n.size();
for ( unsigned int i = 0; i < length; ++i )
{
p = x * p;
if ( debug == true )
{
std::clog << i << "\t" << p << std::endl;
}
}

if ( debug == true )
{
std::clog << "Possibilities = " << p << std::endl;
}

return p;
}


unsigned long long chunk_size( unsigned long long p, unsigned int dc )
{
// possibilities divided by the number of desired chunks.
unsigned long long cs = p/dc;

if ( debug == true )
{
std::clog << "Chunk Size = " << cs << std::endl;
}

return cs;
}


unsigned int hwt()
{
// Number of CPUs or CPU Cores in the computer
// set t manually to simulate more or less CPUs
unsigned int t = boost::thread::hardware_concurrency();

if ( debug == true )
{
std::clog << "HWTs = " << t << std::endl;
}

return t;
}


std::vector<unsigned long long> chunks( unsigned long long p,
unsigned long long cs,
unsigned int hwt )
{
std::vector<unsigned long long> c;
c.reserve( hwt );

// Chunks match possibilities perfectly... Yeah!
if ( p % cs == 0 )
{
for ( unsigned int i = 0; i < hwt; ++i )
{
c.push_back( cs );
if ( debug == true )
{
std::clog << "Vector " << i << " Value = " << cs << std::endl;
}
}
}

// Deal with non-perfect matches
else
{
// Push all but the last chunk into the vector
for ( unsigned int i = 1; i < hwt; ++i )
{
c.push_back( cs );
if ( debug == true )
{
std::clog << "Vector " << i << " Value = " << cs << std::endl;
}
}

// Push the last chunk into the vector
unsigned long long last = cs+(p % cs);
c.push_back( last );

if ( debug == true )
{
std::clog << "Last Vector" << " Value = " << last << std::endl;
std::clog << "Addition to last vector = " << ( p % cs )<<
std::endl;
}
}


if ( debug == true )
{
std::clog << "Vector Size = " << c.size() << " (Must match HWTs)" <<
std::endl;
}

return c;
}


int main ()
{
// Make this whatever you like.
std::string my_string = "abc";

unsigned long long p = possibilities( my_string, 4 );
unsigned int t = hwt();
unsigned long long cs = chunk_size( p, t );

chunks( p, cs, t );
}
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,763
Messages
2,569,562
Members
45,039
Latest member
CasimiraVa

Latest Threads

Top