A dynamic unit test framework in C


MC Andre

I'm writing a port of QuickCheck in C. Working code is available at
GitHub (https://github.com/mcandre/qc). There are several problems to
consider, but I'll limit myself to one for now.

There's a gen_array() function that should populate an array of random
length (e.g. { 1, 5, 7 } or "ArF"), using a generator function to fill
in values. It has the following signature:

typedef void (*fp)();

void* gen_array(fp gen, size_t size);

Where gen is a nullary function that returns a value, for example:

char gen_char() {
return (char) (rand() % 128);
}

To create a random string, one would write:

char* random_string = (char*) gen_array(gen_char, sizeof(char));

For now, I'm getting a compiler error.

$ make
gcc -o example example.c qc.c qc.h -lgc
qc.c: In function ‘gen_array’:
qc.c:29: warning: dereferencing ‘void *’ pointer
qc.c:29: error: invalid use of void expression
make: *** [example] Error 1

A StackOverflow user has posted code that almost works. It compiles,
but it requires generator functions to do pointer arithmetic,
something I don't want users of my framework to have to do.

http://stackoverflow.com/questions/...on-passed-to-another-function/7523699#7523699

If anyone can provide a version of gen_array that doesn't require
passing values to generators, and doesn't require globals, I'd be
elated to see it.

Cheers,
Andrew Pennebaker
 

Ike Naar

typedef void (*fp)();

Type fp is defined as a pointer to a function that returns no value.

void* gen_array(fp gen, size_t size);

Where gen is a nullary function that returns a value, for example:

gen returns a value?
But gen is an fp, and an fp does not return a value.
This seems inconsistent.
 

Stanley Rice

I once tried to write generic functions similar to C++ templates, and
I found it quite difficult. I'm not sure I fully understand your
question, but I modified some of your code so that it compiles and
runs on my machine.

However, the code below is dangerous and costly: I don't check the
return value of malloc, and calling malloc for every element is more
expensive than I would like.

/* The function pointer returns a pointer to void, which makes it easy
 * to cast to other types. */
typedef void *(*fp)();

/* size_type: the size of each element in the array, in bytes
 * arr_size: the number of elements in the array */
static void *gen_array(fp gen, size_t size_type, size_t arr_size)
{
    size_t i;
    char *result;

    result = malloc(size_type * arr_size);

    for (i = 0; i < arr_size; ++i)
    {
        void *p = gen();
        /* scale the index by the element size, not by one byte */
        memcpy(result + i * size_type, p, size_type);
        free(p);
    }

    return result;
}

/* I haven't measured how slow this is!
 * Maybe you could use a global buffer to hold the value,
 * or pass the buffer as a parameter. */
static void *gen_char()
{
    char *c = malloc(1);

    *c = rand() % 128;
    return c;
}

static int finalize(void *A)
{
free(A);
return 0;
}

I generate an array of 10 elements like this:

char *A = gen_array(gen_char, sizeof(char), 10);
/* TO-DO */
finalize(A);
 

James Kuyper

On 09/22/2011 11:19 PM, MC Andre wrote:
....
typedef void (*fp)();

void* gen_array(fp gen, size_t size);

Where gen is a nullary function that returns a value, for example:

You've declared gen to be an 'fp', which is a typedef for a pointer to
a nullary function that does NOT return a value. That's the
fundamental problem.
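
To illustrate the point, here is a minimal sketch of what works once the pointer type does say what the function returns. It gives up genericity and fixes the element type to char. The names char_gen and gen_char_array are hypothetical and are not taken from qc.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical typedef: pointer to a function that takes no
   arguments and returns a char. */
typedef char (*char_gen)(void);

/* Example generator, same idea as the one in the question. */
static char gen_char(void)
{
    return (char)(rand() % 128);
}

/* Allocates n chars and fills each one by calling gen.
   Returns NULL on allocation failure; the caller frees the result. */
static char *gen_char_array(char_gen gen, size_t n)
{
    char *result;
    size_t i;

    assert(gen != NULL);
    result = malloc(n ? n : 1);
    if (result == NULL)
        return NULL;
    for (i = 0; i < n; ++i)
        result[i] = gen();   /* the call has a value because the type says so */
    return result;
}
```

The call gen_char_array(gen_char, 16) compiles cleanly because the compiler knows the type of gen(). With a void return type there is nothing to assign.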
 

Ben Bacarisse

MC Andre said:
I'm writing a port of QuickCheck in C. Working code is available at
GitHub (https://github.com/mcandre/qc). There are several problems to
consider, but I'll limit myself to one for now.

There's a gen_array() function that should populate an array of random
length (e.g. { 1, 5, 7 } or "ArF"), using a generator function to fill
in values. It has the following signature:

typedef void (*fp)();

void* gen_array(fp gen, size_t size);

Where gen is a nullary function that returns a value, for example:

char gen_char() {
return (char) (rand() % 128);
}

To create a random string, one would write:

char* random_string = (char*) gen_array(gen_char, sizeof(char));

I don't think this approach will work. The trouble is that it requires
the generic code (gen_array) to know too much about the type of object
in the array. All gen_array can do is walk the array (it can do this if
it knows the element size, which it seems to be given) and call a
user's fill-me-up function. That will require a pointer to be passed to
the user's function.

For now, I'm getting a compiler error.

$ make
gcc -o example example.c qc.c qc.h -lgc
qc.c: In function ‘gen_array’:
qc.c:29: warning: dereferencing ‘void *’ pointer
qc.c:29: error: invalid use of void expression
make: *** [example] Error 1

I won't investigate that. I don't think it matters.

A StackOverflow user has posted code that almost works. It compiles,
but it requires generator functions to do pointer arithmetic,
something I don't want users of my framework to have to do.

I don't see any pointer arithmetic in the user code. The global is
ugly, but that can be got rid of simply enough.

http://stackoverflow.com/questions/...on-passed-to-another-function/7523699#7523699

If anyone can provide a version of gen_array that doesn't require
passing values to generators, and doesn't require globals, I'd be
elated to see it.

I'd like to know the reason for these restrictions. Passing data to
functions is fundamental to how they work. Banning it (and banning the
kludgey alternative of a shared variable) is seriously hampering the
language.

I think you have two ways to go:

(a) Accept that the user's generator function will be passed a pointer.
Any C programmer should be fine with this. The function must then fill
in the object the pointer points to. This does not even require a cast:

void make_random_struct(void *p)
{
    struct my_big_struct *sp = p;
    sp->member1 = rand();
    /* and so on... */
}

(b) Get the user's function to return a pointer to an object which the
generic array code copies into the correct location (using memcpy).
This will require, if not a global object, at least one with static
storage duration:

void *make_random_struct(void)
{
    static struct my_big_struct s;
    s.member1 = rand();
    /* and so on... */
    return &s;
}

Nothing passed and no global. I think (a) is more elegant, but (b) has
an interesting advantage for a test framework: the user code never gets
to write directly into the test array so it can't make a mistake that
messes up the test. As a result, I might favour (b) though I would be
worried about things like threaded code which don't play well with
static storage.
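
For option (b), the generic side might look like the sketch below. This is only one possible shape: the three-argument signature (with an explicit element count) and the names gen_fn and gen_int are assumptions for illustration, not the qc API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Generator type for option (b): returns a pointer to one freshly
   generated element. The pointee belongs to the generator. */
typedef void *(*gen_fn)(void);

/* Builds an array of count elements, elem_size bytes each, by copying
   every generated element into place. Returns NULL on failure. */
static void *gen_array(gen_fn gen, size_t elem_size, size_t count)
{
    unsigned char *result;
    size_t i;

    assert(gen != NULL && elem_size > 0);
    if (count > (size_t)-1 / elem_size)
        return NULL;                       /* total size would overflow */
    result = malloc(count ? count * elem_size : 1);
    if (result == NULL)
        return NULL;
    for (i = 0; i < count; ++i)
        memcpy(result + i * elem_size, gen(), elem_size);
    return result;
}

/* Example generator using static storage, as described in (b). */
static void *gen_int(void)
{
    static int value;
    value = rand() % 100;
    return &value;
}
```

A caller would write something like int *a = gen_array(gen_int, sizeof(int), 8); and free(a) afterwards. The user code never sees the array itself, which is the advantage mentioned above.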
 

ImpalerCore

If you want to create a random value generator for an object of any
type, here is one observation that may help.

If one has a function that can create a uniform distribution for the
range of a single byte, one should be able to create a uniform
distribution for any object by filling it like an array of 'sizeof
(type)' bytes.

The common method of using the modulo operator typically introduces a
small amount of bias because the range of values required is not
evenly divisible by the maximum range of the random number
(RAND_MAX). To illustrate this point, let's consider when RAND_MAX is
equal to 7, but we want the values between 0 and 2. A modulo bias is
introduced because 'rand() % 3' cannot represent all cases evenly.

rand() yields 0, so 'rand() % 3' is 0
rand() yields 1, so 'rand() % 3' is 1
rand() yields 2, so 'rand() % 3' is 2
rand() yields 3, so 'rand() % 3' is 0
rand() yields 4, so 'rand() % 3' is 1
rand() yields 5, so 'rand() % 3' is 2
rand() yields 6, so 'rand() % 3' is 0
rand() yields 7, so 'rand() % 3' is 1

As you can see, the values of '0' and '1' have a probability of '3/8',
and the value of '2' has a probability of '2/8'. To eliminate modulo
bias, one would need to re-roll if a value of '6' or '7' is found.
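
The table above can be reproduced mechanically. The helper below is a small sketch (modulo_histogram is a made-up name) that counts how often each remainder turns up when every possible generator output is reduced modulo n.

```c
#include <assert.h>
#include <stddef.h>

/* Counts how often each remainder 0..n-1 occurs when every value in
   0..max_value is reduced modulo n. counts needs room for n entries. */
static void modulo_histogram(int max_value, int n, int *counts)
{
    int r;

    assert(max_value >= 0 && n > 0 && counts != NULL);
    for (r = 0; r < n; ++r)
        counts[r] = 0;
    for (r = 0; r <= max_value; ++r)
        counts[r % n]++;
}
```

Calling modulo_histogram(7, 3, counts) fills counts with 3, 3 and 2, which matches the 3/8, 3/8 and 2/8 probabilities.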

You can remove modulo bias by using this function.

\code
int c_rand_int( int n )
{
  int limit = RAND_MAX - RAND_MAX % n;
  int r;

  do {
    r = rand();
  } while ( r >= limit );

  return r % n;
}
\endcode

If we substitute RAND_MAX of 7, and an 'n' of 3, the upper limit is '7
- (7 % 3) == 6'. If 'r' happens to be greater-than or equal to 6, we
re-roll the dice. When RAND_MAX is large (at least 32767), the number
of re-rolls needed goes down, so there is little risk of going
infinite. Still, if one wants to be extra safe, one could include a
safety counter to kick out of the while loop if rand() happens to be
broken.
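
A version with the safety counter could look roughly like this. It is a sketch only: the name c_rand_int_bounded, the max_tries parameter and the -1 failure value are assumptions made for the example, and it only handles n up to RAND_MAX.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns a value in [0, n) drawn by rejection sampling from rand(),
   or -1 if no acceptable value turned up within max_tries draws.
   Requires 0 < n <= RAND_MAX. */
static int c_rand_int_bounded(int n, int max_tries)
{
    int limit;
    int r;

    assert(n > 0 && n <= RAND_MAX);
    limit = RAND_MAX - RAND_MAX % n;    /* a multiple of n, at least n */

    while (max_tries-- > 0) {
        r = rand();
        if (r < limit)
            return r % n;
    }
    return -1;                           /* gave up; the caller decides what to do */
}
```

Since more than half of all draws are accepted whenever n <= RAND_MAX, a small max_tries such as 100 should practically never be exhausted with a working rand().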

Okay, so now we have a uniform randomizer for a byte by calling
'c_rand_int( UCHAR_MAX )'.

To make a randomized value for a value of arbitrary type, we just fill
in the bytes of the object.

\code
void rndset( void* p, size_t n )
{
  unsigned char* ucp = (unsigned char*)p;

  while ( n-- != 0 ) {
    *ucp++ = (unsigned char)c_rand_int( UCHAR_MAX );
  }
}
\endcode

If you want to generate a random string, I would recommend sampling
'c_rand_int' until you get a character that matches the character set
you want.

\code
char* generate_random_string( size_t n )
{
  char* rnd_str = NULL;
  char* s;
  char ch;

  if ( n > 0 )
  {
    rnd_str = malloc( n );
    if ( rnd_str )
    {
      s = rnd_str;  /* write position inside the buffer */

      while ( n-- > 1 ) /* keep one character open for '\0' */
      {
        /* sample random characters until one is found in
           the preferred character set, isprint, isalnum, ... */
        do {
          ch = (char)c_rand_int( CHAR_MAX );
        } while ( !isalnum( (unsigned char)ch ) );

        *s++ = ch;
      }

      *s = '\0';
    }
  }

  return rnd_str;
}
\endcode

Then one can use it like the following.

\code
#define RAND_STR_LENGTH 25

int main( void )
{
char* str;
int value;

str = generate_random_string( RAND_STR_LENGTH + 1 );
puts( str );
free( str );

do {
rndset( &value, sizeof (value) );
} while ( !( 10000 <= r && r <= 100000 ) );

printf( "%d\n", value );

return EXIT_SUCCESS;
}
\endcode

Should get something like "aXqtnzPu9h9ck0pVJjoo3oGYH" and '69191'.

Of course one could parameterize the character set into a function
pointer, so you could generate a random integer string, etc.
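
Parameterizing the character set might be sketched as follows. The names char_filter and random_string_matching are invented for the example. It uses plain rand() with a modulo for brevity, so the small bias discussed earlier applies here too.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdlib.h>

/* Predicate in the style of the <ctype.h> classifiers. */
typedef int (*char_filter)(int);

/* Returns a malloc'd string of len characters, each accepted by
   filter, followed by a terminating '\0'. Returns NULL on allocation
   failure. The filter must accept at least one value in 0..CHAR_MAX,
   otherwise the inner loop never ends. */
static char *random_string_matching(size_t len, char_filter filter)
{
    char *str;
    size_t i;
    int ch;

    assert(filter != NULL);
    str = malloc(len + 1);
    if (str == NULL)
        return NULL;
    for (i = 0; i < len; ++i) {
        do {
            ch = rand() % (CHAR_MAX + 1);   /* slight modulo bias accepted here */
        } while (!filter(ch));
        str[i] = (char)ch;
    }
    str[len] = '\0';
    return str;
}
```

For a string of random digits one would call random_string_matching(12, isdigit), and for alphanumerics random_string_matching(25, isalnum).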

I'm no statistical expert, so there may be caveats I'm not aware of.
I did not heavily test the functions either.

Best regards,
John D.
 

ImpalerCore

[snip]

  do {
    rndset( &value, sizeof (value) );
  } while ( !( 10000 <= r && r <= 100000 ) );

That should be 'value' instead of 'r':

  do {
    rndset( &value, sizeof (value) );
  } while ( !( 10000 <= value && value <= 100000 ) );
 

Ben Bacarisse

ImpalerCore said:
If you want to create a random value generator for an object of any
type, here is one observation that may help.

If one has a function that can create a uniform distribution for the
range of a single byte, one should be able to create a uniform
distribution for any object by filling it like an array of 'sizeof
(type)' bytes.

Any object? It is unlikely to work for double or float (or their
derived complex types). I say "unlikely" because I've not checked how
wacky an FP type C permits, but I suspect it won't work for any
permissible C floating type. Also, non-array aggregate types are going
to be a problem too, but I imagine you are implicitly ruling those out.

To be really picky, it's problematic even for integer types. You
probably don't want to generate trap representations with a probability
determined by the number of padding bits. In fact, you may want to avoid
them altogether. Also, for integer systems that permit two (non-trap)
zero representations, the result won't be "value" uniform since 0 will
be (ever so slightly) more common. Of course, having both zeros
represented might be seen as an advantage when testing.

<snip>
 

ImpalerCore

Any object?  It is unlikely to work for double or float (or their
derived complex types).  I say "unlikely" because I've not checked how
wacky an FP type C permits, but I suspect it won't work for any
permissible C floating type.  Also, non-array aggregate types are going
to be a problem too, but I imagine you are implicitly ruling those out.

Yeah, that was worded too broadly. It's certainly not going to generate
uniform distributions of floating point types in the normal sense.
You should be able to get a limited uniform floating point
distribution by taking the ratio of a random number over a decimal
base.

/* should generate a uniform 6-digit floating point
   distribution from 0 to 1 */
c_rand_int( 1000000 + 1 ) / 1000000.;

To be really picky, it's problematic even for integer types.  You
probably don't want to generate trap representations with a probability
determined by the number of padding bits.  In fact, you may want to avoid
them altogether.  Also, for integer systems that permit two (non-trap)
zero representations, the result won't be "value" uniform since 0 will
be (ever so slightly) more common.  Of course, having both zeros
represented might be seen as an advantage when testing.

Good points.

Best regards,
John D.
 

Nobody

There's a gen_array() function that should populate an array of random
length (e.g. { 1, 5, 7 } or "ArF"), using a generator function to fill
in values. It has the following signature:

typedef void (*fp)();

void* gen_array(fp gen, size_t size);

Where gen is a nullary function that returns a value, for example:

char gen_char() {
return (char) (rand() % 128);
}

To create a random string, one would write:

char* random_string = (char*) gen_array(gen_char, sizeof(char));
If anyone can provide a version of gen_array that doesn't require
passing values to generators, and doesn't require globals, I'd be
elated to see it.

The easy solution is to change the type to:

typedef void (*fp)(void *);

E.g.:

void gen_char(void *resultp) {
    *(char *)resultp = (rand() % 128);
}

If this isn't possible, then I suggest looking at libffi.
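
With that generator type, the array builder could be sketched like this. The name gen_array_fill, the fill_fn typedef and the explicit element count are assumptions for the example and not the qc interface.

```c
#include <assert.h>
#include <stdlib.h>

/* Generator type: writes one element through the pointer it is given. */
typedef void (*fill_fn)(void *);

/* Allocates count elements of elem_size bytes each and asks gen to
   fill every one in place. Returns NULL on failure; caller frees. */
static void *gen_array_fill(fill_fn gen, size_t elem_size, size_t count)
{
    unsigned char *result;
    size_t i;

    assert(gen != NULL && elem_size > 0);
    if (count > (size_t)-1 / elem_size)
        return NULL;                       /* total size would overflow */
    result = malloc(count ? count * elem_size : 1);
    if (result == NULL)
        return NULL;
    for (i = 0; i < count; ++i)
        gen(result + i * elem_size);       /* the arithmetic stays in the framework */
    return result;
}

/* Same generator as above, with an explicit cast on the value. */
static void gen_char(void *resultp)
{
    *(char *)resultp = (char)(rand() % 128);
}
```

The generator only dereferences the pointer it receives. All pointer arithmetic happens inside gen_array_fill, so users of the framework never have to write any.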
 

ImpalerCore

[snip]

\code
int c_rand_int( int n )
{
  int limit = RAND_MAX - RAND_MAX % n;
  int r;

  do {
    r = rand();
  } while ( r >= limit );

  return r % n;
}

\endcode

[snip]

I made a subtle mistake ... in the use of the c_rand_int() function,
the modulus goes from [0...N).

\code
void rndset( void* p, size_t n )
{
unsigned char* ucp = (unsigned char*)p;

while ( n-- != 0 ) {
*ucp++ = (unsigned char)c_rand_int( UCHAR_MAX + 1 );
^^^
}
}
\endcode

UCHAR_MAX for 8-bit characters is 255, which implies that the value
from 'r % 255' will be from 0...254. Replace UCHAR_MAX with UCHAR_MAX
+ 1 and CHAR_MAX with CHAR_MAX + 1.
 

ImpalerCore

[snip]

To be really picky, it's problematic even for integer types.  You
probably don't want to generate trap representations with a probability
determined by the number of padding bits.  In fact, you may want to avoid
them altogether.  Also, for integer systems that permit two (non-trap)
zero representations, the result won't be "value" uniform since 0 will
be (ever so slightly) more common.  Of course, having both zeros
represented might be seen as an advantage when testing.

I just discovered that my 'c_rand_int( 1000000 + 1 )' was infinite
looping because my 'limit' was negative. It appears that it needs
some adjustment for when INT_MAX > RAND_MAX, in my case RAND_MAX =
32767. Is there an alternative to filling bytes that doesn't tread
into risky territory?
 

Ben Bacarisse

ImpalerCore said:
[snip]

To be really picky, it's problematic even for integer types.  You
probably don't want to generate trap representations with a probability
determined by the number of padding bits.  In fact, you may want to avoid
them altogether.  Also, for integer systems that permit two (non-trap)
zero representations, the result won't be "value" uniform since 0 will
be (ever so slightly) more common.  Of course, having both zeros
represented might be seen as an advantage when testing.

I just discovered that my 'c_rand_int( 1000000 + 1 )' was infinite
looping because my 'limit' was negative. It appears that it needs
some adjustment for when INT_MAX > RAND_MAX, in my case RAND_MAX =
32767. Is there an alternative to filling bytes that doesn't tread
into risky territory?

If the risky territory is the possibility of padding bits and so on,
then I think you need to take a value-based approach. For example, one
could start with uniform double in [0, 1) and use that to get a uniform
int in [0, INT_MAX]. There's risky territory here also, because INT_MAX
might exceed the range of exact integers that double can represent.
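
A rough sketch of the value-based idea, using only rand(). The names rand_unit and rand_int_from_double are made up here. It assumes RAND_MAX is exactly representable as a double, which holds for the usual 15-bit and 31-bit values.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns a double in [0, 1). Only RAND_MAX + 1 distinct values are
   possible, so the granularity is coarse when RAND_MAX is small.
   Assumes RAND_MAX is exactly representable as a double. */
static double rand_unit(void)
{
    return rand() / (RAND_MAX + 1.0);
}

/* Maps rand_unit() onto the integers 0..n-1 by scaling and truncating. */
static int rand_int_from_double(int n)
{
    assert(n > 0);
    return (int)(rand_unit() * n);
}
```

This never loops, whatever n is. The price is that when n is larger than RAND_MAX + 1, some values in the range cannot be produced at all, and when n does not divide RAND_MAX + 1 the values are not exactly equally likely.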

Most good random number generators produce random bits (but if
RAND_MAX+1 is not a power of 2 this will probably not be true) so you
can often just take bytes from rand() and put them where you need them.
This is equivalent to what you were doing with % (UCHAR_MAX + 1). The
point about the bits being random is simply that you don't need to worry
about bias when reducing the range to some smaller number of bits.
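
Taking bits from rand() also gives a way to cover ranges larger than RAND_MAX, such as the 1000001 case mentioned earlier. The following is only a sketch with hypothetical names (rand_bits, rand_below). It assumes the low 15 bits of rand() are uniform, which some older generators do not deliver.

```c
#include <assert.h>
#include <stdlib.h>

/* Collects nbits random bits (1..32) from rand(), at most 15 per call.
   Assumes the low 15 bits of rand() are uniform, which is plausible
   when RAND_MAX + 1 is a power of two and the generator is decent. */
static unsigned long rand_bits(int nbits)
{
    unsigned long value = 0;

    assert(nbits >= 1 && nbits <= 32);
    while (nbits > 0) {
        int take = nbits < 15 ? nbits : 15;
        unsigned long chunk = (unsigned long)rand() & ((1UL << take) - 1UL);
        value = (value << take) | chunk;
        nbits -= take;
    }
    return value;
}

/* Returns a value in [0, n) for 1 <= n <= 2^32 - 1. Draws just enough
   bits to hold n - 1 and rejects draws that fall outside the range. */
static unsigned long rand_below(unsigned long n)
{
    unsigned long v;
    int nbits = 1;

    assert(n >= 1 && n <= 0xFFFFFFFFUL);
    while (nbits < 32 && ((n - 1) >> nbits) != 0)
        ++nbits;                           /* bits needed to hold n - 1 */
    do {
        v = rand_bits(nbits);
    } while (v >= n);                      /* reject out-of-range draws */
    return v;
}
```

Because the bit count is the smallest that can hold n - 1, more than half of all draws are accepted, so the loop ends quickly even when RAND_MAX is only 32767.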
 
