Thread-Safety in Functions: A Random String Generator

J

Jonathan Burd

Greetings everyone,

Here is a random string generator I wrote for an application
and I'm wondering about the thread-safety of this function.
I was told using static and global variables cause
potential problems for thread-safety. So far, I'm only confused.
I need a proper explanation for the concept so I can understand how
to write thread-safe functions in the future.

My apologies for posting a long routine.

#include <stdlib.h>
#include <time.h>

/* ---( PRIVATE )--- */
#define RANDSTR_MIN 32
#define RANDSTR_MAX 126
#define RANDSTR_RANGE ((RANDSTR_MAX + 1) - RANDSTR_MIN) + RANDSTR_MIN

char *
garbage (
char * buf, /* pointer to the buffer that will contain the garbage */
size_t count, /* including the last null character. */
const char * chrs /* character range used */
)
{
char * eosbuf = buf;
const char * eoschrs = NULL;
time_t timeval;

/* static & 0 IMPORTANT! Thread-safe? */
static unsigned int seed = 0;


/* (A)
* If the seed is 0 (not created yet), we generate a seed.
*/
if (seed == 0)
{
timeval = time((time_t*)NULL);

/*
* check if time is available on system
*/
if (timeval == (time_t)-1)
return (NULL);
seed = (unsigned) timeval;

/*
* Seed the random number generation algo, so we can call rand()
*/
srand (seed);
}


/* (B)
* if chrs is an empty string, choose our own range
* of characters from ASCII 32 to ASCII 126
*/
if (NULL == chrs || '\0' == *chrs)
{
/* generate only the required number of characters */
while (count-- > 1)
*eosbuf++ = (char) (rand () % RANDSTR_RANGE);
}
/* we are using a user-specified range. */
else
{
eoschrs = chrs;

/* get a pointer to point to the last character '\0' of the range.
*/
while (*eoschrs)
++eoschrs;

/* generate the random string. MAX == length of range. */
while (count-- > 1)
*eosbuf++ = *(chrs + (rand () % (size_t)(eoschrs - chrs)));
}


/* (C)
* Important: terminate the string!
*/
*eosbuf = (char) '\0';


/* (D)
* Return pointer to the buffer.
*/
return (buf);
}


Sample Usage:
=============
char buf[11]; /* 10 chars + 1 '\0' */

garbage (
buf,
sizeof (buf),
"0aA1bB2cC3dD4eE5fF6gG7hH8iI9jJk9Kl8Lm7Mn6No5Op4Pq3Qr2Rs1StT0uVvWwXxYyZz"
);

puts (buf);


Regards,
Jonathan.
 
A

Arthur J. O'Dwyer

Here is a random string generator I wrote for an application
and I'm wondering about the thread-safety of this function.
I was told using static and global variables cause
potential problems for thread-safety. So far, I'm only confused.
I need a proper explanation for the concept so I can understand how
to write thread-safe functions in the future.

"Thread safety" generally just means that you need to make sure
that if two or more threads are calling this function at the same
time, nothing bad will happen. "Bad," in this context, means
"undefined behavior."
(In fact, in this newsgroup we don't even have the concept of
"threading," which is not provided by standard C. You might get
much better answers if you asked in a newsgroup devoted to your
particular platform and/or thread model. So take everything I say
below --- and everything anyone else in this newsgroup says in this
thread --- with a large grain of salt.)
My apologies for posting a long routine.

My apologies for posting a long response. ;)
#include <stdlib.h>
#include <time.h>

Coding note 1:
Your "sample usage" requires #include said:
/* ---( PRIVATE )--- */
#define RANDSTR_MIN 32
#define RANDSTR_MAX 126
#define RANDSTR_RANGE ((RANDSTR_MAX + 1) - RANDSTR_MIN) + RANDSTR_MIN

Coding note 2:
For the benefit of your code's maintainer, put in the extra parentheses
around the whole expression in the definition of 'RANDSTR_RANGE'. That
way, when he writes 'half = RANDSTR_RANGE/2', he won't be surprised at
the result.
...Wait! On second glance, this is utterly stupid! You're not
using this macro as a "range" at all, but as a kind of half-calculation!
Save the obfuscated text-substitution stunts for the IOCCC, please.
char *
garbage (

Style note 1: 'garbage'?!
char * buf, /* pointer to the buffer that will contain the garbage */
size_t count, /* including the last null character. */
const char * chrs /* character range used */
)
{
char * eosbuf = buf;
const char * eoschrs = NULL;
time_t timeval;

/* static & 0 IMPORTANT! Thread-safe? */
static unsigned int seed = 0;

Thread-safety note 1:
The above line is thread-safe. But the 'if' statement below is not
really what you want. First of all, suppose you do generate a random
seed, and that seed turns out to be 0? Then you keep generating new
seeds. So your random number generator has a bias, which is just
intrinsically a Bad Thing.
And here's the thread-safety part: Suppose Thread A calls this
function, and 'seed' has not been initialized. Then Thread A enters
the 'if' body, and calls 'time'. Meanwhile, Thread B has called this
function also, and since 'seed' is still zero (Thread A hasn't changed
it yet), Thread B enters the 'if' body. Then we have a "race condition"
(look it up) on the assignment to 'seed' --- a harmless one in this
case (on some implementations), perhaps, but a race condition
nevertheless. So you need to (at least!) look up "mutex" and figure
out how to apply it to 'seed'. Or continue reading, and I'll show you
a Standard-conforming way to avoid the problem.
/* (A)
* If the seed is 0 (not created yet), we generate a seed.
*/
if (seed == 0)
{
timeval = time((time_t*)NULL);

Coding note 3:
The cast is redundant and distracts the reader from what's really
happening. I won't flag the other redundant and bogus casts in this
function, but remember that casts are hardly ever necessary in C.
/*
* check if time is available on system
*/
if (timeval == (time_t)-1)
return (NULL);

Style note 2: 'return' is not a function. I suggest you follow
widespread modern style in omitting the redundant parentheses.
seed = (unsigned) timeval;

/*
* Seed the random number generation algo, so we can call rand()
*/
srand (seed);

Thread-safety note 2:
Another possible race condition; 'srand' may not be thread-safe
itself. In general, if you use any library function in your threaded
code, you need to look it up to see whether it's guaranteed to be
thread-safe on your particular implementation. Usually (IMLE), functions
like 'sin' and 'time' will be safe; functions like 'printf' will be
safe but might give funny output if you call them with two threads at
once; and functions like 'strtok' and 'srand' probably won't be.
}


/* (B)
* if chrs is an empty string, choose our own range
* of characters from ASCII 32 to ASCII 126
*/
if (NULL == chrs || '\0' == *chrs)
{
/* generate only the required number of characters */
while (count-- > 1)

Style note 3: Ick.
*eosbuf++ = (char) (rand () % RANDSTR_RANGE);

Coding note 4: See the FAQ about how to generate "nice" random numbers.
Modding by the range is often not the best way to do it.
}
/* we are using a user-specified range. */
else
{
eoschrs = chrs;

/* get a pointer to point to the last character '\0' of the range.
*/
while (*eoschrs)
++eoschrs;

Coding note 5: Why use a hand-coded loop when your implementation
provides optimized assembly versions of 'strlen' and 'strchr' for
exactly this purpose? [Use eoschrs=chrs+strlen(chrs) or
eoschrs=strchr(chrs,'\0') instead of the loop.]
/* generate the random string. MAX == length of range. */
while (count-- > 1)
*eosbuf++ = *(chrs + (rand () % (size_t)(eoschrs - chrs)));

In fact, you never use the value of 'eoschrs' /per se/ at all;
just write 'size_t len = strlen(chrs);' and use 'len' in place
of '(size_t)(eoschrs - chrs)' here.
}


/* (C)
* Important: terminate the string!
*/
*eosbuf = (char) '\0';

Coding note 6: What if count==0?
/* (D)
* Return pointer to the buffer.
*/
return (buf);
}


Sample Usage:
=============
char buf[11]; /* 10 chars + 1 '\0' */

garbage (
buf,
sizeof (buf),
"0aA1bB2cC3dD4eE5fF6gG7hH8iI9jJk9Kl8Lm7Mn6No5Op4Pq3Qr2Rs1StT0uVvWwXxYyZz"
);

puts (buf);


Now for the standard method I promised. Instead of using a single
static variable in the function itself (which means one variable shared
by all the threads and therefore needing to be protected from race
conditions), just use one variable per thread. The way we do that is
to have each thread store its own copy of the "state" of the function,
and pass that state as a parameter, like this: [UNTESTED CODE]


#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define RANDSTR_MIN 32
#define RANDSTR_MAX 126

typedef struct {
unsigned int seed;
} rndstr_state;

char *rndstr(char *buf, size_t buflen, const char *chrs, rndstr_state *state)
{
size_t i, m;

if (state->seed == 0)
{
time_t timeval = time(NULL);
if (timeval == (time_t)-1)
return NULL;
state->seed = timeval;
/*
Note: |srand| is probably still not thread-safe. You
may need to find a third-party PRNG that will fit your
needs, and use it here. Its state information must be
kept in the |rndstr_state| struct as well.
*/
srand(state->seed);
}

if (NULL == chrs || '\0' == *chrs)
chrs = " !\"#$%&'()*+,-./0123456789:;<=>?@"
"ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`ab"
"cdefghijklmnopqrstuvwxyz{|}~";

m = strlen(chrs);
if (buflen <= 0)
return buf;
for (i=0; i < buflen-1; ++i)
buf = chrs[rand() % m];
buf = '\0';

return buf;
}


Sample Usage:
=============
char buf[11]; /* 10 chars + 1 '\0' */
rndstr_state state = {0};
rndstr(
buf,
sizeof (buf),
"0aA1bB2cC3dD4eE5fF6gG7hH8iI9jJk9Kl8Lm7Mn6No5Op4Pq3Qr2Rs1StT0uVvWwXxYyZz",
&state
);

puts(buf);


-Arthur
 
E

E. Robert Tisdale

Jonathan said:
Here is a random string generator [that] I wrote for an application.
I'm wondering about the thread-safety of this function.
I was told using static and global variables
[can] cause potential problems for thread-safety.
So far, I'm only confused.
I need a proper explanation for the concept [of thread safety]
so [that] I can understand how to write thread-safe functions in the future.

My apologies for posting a long routine.
> cat garbage.c
#include <stdlib.h>
#include <time.h>

/* ---( PRIVATE )--- */
#define RANDSTR_MIN 32
#define RANDSTR_MAX 126
#define RANDSTR_RANGE \
((RANDSTR_MAX + 1) - RANDSTR_MIN) + RANDSTR_MIN

char* garbage( // pointer to the buffer
char* buf, // that will contain the garbage
size_t count, // including the last null character.
const
char* chrs // character range used
) {

// static & 0 IMPORTANT! Thread-safe?
static unsigned int seed = 0;


// (A)
// If the seed is 0 (not created yet), we generate a seed.

if (0 == seed) {
time_t timeval = time((time_t*)NULL);

// Check if time is available on system.

if (timeval == (time_t)-1)
return (NULL);
seed = (unsigned) timeval; // Not thread safe!

// Seed the random number generation algo,
// so we can call rand().
srand(seed); // Not thread safe!
}


// (B)
// If chrs is an empty string, choose our own range
// of characters from ASCII 32 to ASCII 126.

char* eosbuf = buf;
if (NULL == chrs || '\0' == *chrs) {
// generate only the required number of characters
while (1 < count--)
*eosbuf++ = (char)(rand()%RANDSTR_RANGE);
}
else { // We are using a user-specified range.
const
char* eoschrs = chrs;

// Get a pointer
// to point to the last character '\0' of the range.

while (*eoschrs)
++eoschrs;

// Generate the random string. MAX == length of range.
while (1 < count--)
*eosbuf++ = *(chrs + (rand()%(size_t)(eoschrs - chrs)));
}


// (C)
// Important: terminate the string!

*eosbuf = (char)'\0';


// (D)
// Return pointer to the buffer.

return buf;
}

srand() and rand() modify a global "state" variable.
Suppose that two threads are running this code
and the first thread executes srand() but,
before it can complete the random string,
it is interrupted by the second thread
which rests the global state variable by executing srand() again
before the first thread takes control again
and completes the random string
with the same sequence of characters with which it began.

This may or may not be a serious problem for your application.

The ANSI/ISO C standards do *not* guarantee thread safety
even if you avoid static and global variables
(and function libraries that are not thread safe).
Compilers are allowed to use static or global variable
to store temporary intermediate results but, in fact,
all viable standard compliant implementations will emit
thread safe code if you avoid static and global variables
and function libraries that are not thread safe.

Also, be aware that some people use the term "thread safe"
when they are actually talking about library functions
that invoke mutual exclusion explicitly.
Such libraries are more correctly described as "threaded"
and typically use some thread library to implement mutual exclusion.
 
P

pete

Jonathan said:
Greetings everyone,

Here is a random string generator I wrote for an application
and I'm wondering about the thread-safety of this function.
I was told using static and global variables cause
potential problems for thread-safety. So far, I'm only confused.
I need a proper explanation for the concept so I can understand how
to write thread-safe functions in the future.

static and global variables cause potential problems for thread-safety.

Replace the static duration object
with an automatic object delcared in main,
and pass it's value or address to all the functions
which will call the prng().

seed = prng(seed);

or if your prng takes an array of seeds

number = prng(unsigned *array);
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,581
Members
45,056
Latest member
GlycogenSupporthealth

Latest Threads

Top