a gift for the mortensens

frank

Ben said:
i = (int) ((double) rand () / ((double) RAND_MAX + 1) * N);

Here be dragons. As 64 bit integers get more and more common we are
edging towards a time when this will routinely fail because
(double)RAND_MAX + 1 can be equal to RAND_MAX if RAND_MAX is big
enough. This does not happen with a 32-bit RNG and normal IEEE
double-precision numbers, but if RAND_MAX is big enough (and a signed
64-bit int is big enough) the +1 has no effect on (double)RAND_MAX.

To get a floating-point number in [0, 1) I have taken to writing:

nextafter((double)rand() / RAND_MAX, 0)

nextafter is a C99 function that gives the next representable number
after the first argument in the direction of the second. There are
probably better ways to do this, but the best of all would be a
floating-point random function in C. Such a function could rely on
the internal representation of a floating point number to give a
properly uniform distribution. Many C libraries include such a
function as an extension.

Is gcc one of them?

dan@dan-desktop:~/source$ gcc -std=c99 -Wall -Wextra mort3.c -o out; ./out
mort3.c: In function ‘main’:
mort3.c:12: warning: unused variable ‘c’
/tmp/ccAPcgbE.o: In function `main':
mort3.c:(.text+0x58): undefined reference to `nextafter'
collect2: ld returned 1 exit status
bash: ./out: No such file or directory
dan@dan-desktop:~/source$ cat mort3.c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <math.h>

#define N 26

int
main (void)
{
  int i;
  char c;

  srand(time(NULL));
  printf ("RAND_MAX is %d\n", RAND_MAX);
  i = nextafter((double)rand() / RAND_MAX, 0);
  printf ("i is %d\n", i);

  return 0;
}

// gcc -std=c99 -Wall -Wextra mort3.c -o out; ./out
 
frank

Keith said:
As I already pointed out and you acknowledged, the second "i is" was a
typo for "RAND_MAX is". i never takes on the value 2147483647, except
perhaps by coincidence. As printed, the first statement is true, the
second is false.

Now that I think about it, you are absolutely correct. i assumes one
value in the program and hence cannot be two. I think I was injecting
less belief in the "i is " part than a literal one. A longer version of
it may have been better to read "The integer I'm looking for is ".
Nope. ASCII is a 7-bit code with codes 0-127 (typically stored in an
8-bit byte). EBCDIC is an 8-bit code, mostly inconsistent with ASCII.

Google is your friend.

I swear I've seen it differently, but I doubt you'd write that if it
weren't demonstrable.
Historical reasons, mostly. The point is that, on many modern
implementations, very likely including the one you're using, plain
char is a signed type.

Try printing the values of CHAR_MIN, CHAR_MAX, SCHAR_MIN, SCHAR_MAX,
and UCHAR_MAX (defined in <limits.h>).

For something like this, I don't want to write a program; I want to look
at limits.h for my implementation. I've probably asked you three times
for this over the past couple years, but I keep getting sent back to go
in a lot of ways, as I work up my linux install for the 4th time. What
is the name of the newsgroup specific to gcc?
 
frank

Keith said:
frank said:
Keith said:
[snipped and reordered for thematic reasons]
None of the three casts in your program are necessary, and IMHO your
code would be improved by dropping them.

srand(time(NULL);;
...
c = i;
This seems to work (with a right paren added and semi-colon removed):

Oops, typo on my part.

I think "we" should update the FAQ to replace your expression with the
one that Steve Summit had. People like me, who read his collection
while sitting in a parking lot, are always grateful for his
contribution, but useless casts make code unreadable.
[...]
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 26

int
main (void)
{
  int i;
  char c;

  srand(time(NULL));
  printf ("RAND_MAX is %d\n", RAND_MAX);
  i = (int) ((double) rand () / ((double) RAND_MAX + 1) * N);
  printf ("i is %d\n", i);

  return 0;
}

// gcc -std=c99 -Wall -Wextra mort2.c -o out; ./out
dan@dan-desktop:~/source$

So none of those casts were doing anything for me? If so, I say we
replace this part of the FAQ.

No, that's not what I said. None of the casts in your previous code
were necessary. I obviously wasn't commenting on code you hadn't
posted yet.

Yeah, I didn't do the best editing here.
In your new code:

i = (int) ((double) rand () / ((double) RAND_MAX + 1) * N);

the cast to int is unnecessary, since the result is being assigned to
an int object. The other two casts are necessary and appropriate,
since in their absence the int values wouldn't be converted to double.

Still curious about this.
Indentation?
[52 lines deleted]
It took less than a minute.

Great. Though I'm not quite sure why you felt the need to tell us, in
great detail, how you did it. Just posting properly indented code is
more than enough.

http://clc-wiki.net/wiki/clc-wiki:Policies#codeformat

This is topical, according to the wiki. I didn't explain the details
but will now take the opportunity to do so. My OS (Ubuntu) registered
that I wanted indent and gave me the line that I needed to make it
happen from the command line. Then I hit the up arrow twice to find the
command that had failed, and voila.
 
Barry Schwarz

snip


No, that's not what I said. None of the casts in your previous code
were necessary. I obviously wasn't commenting on code you hadn't
posted yet.

In your new code:

i = (int) ((double) rand () / ((double) RAND_MAX + 1) * N);

the cast to int is unnecessary, since the result is being assigned to
an int object. The other two casts are necessary and appropriate,
since in their absence the int values wouldn't be converted to double.

Only the second cast to double is necessary. Once RAND_MAX is
converted to double, all the remaining values in the expression must
also be converted. While the first cast to double might achieve the
same effect, the denominator would be evaluated as an int first and
could overflow before being converted to double.
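
To make the difference concrete, here is a small sketch comparing the expression with no conversion at all against the version that converts only the denominator. The function names are made up for illustration, and r and rmax are parameters standing in for a rand() result and RAND_MAX so that the behaviour can be checked with known values.

```c
/* r stands in for a rand() result, rmax for RAND_MAX (assumptions made
   so the functions can be exercised with fixed inputs). */

/* No conversion: r / rmax is integer division, which yields 0 for
   every r below rmax and 1 when r == rmax. */
int scale_no_cast(int r, int rmax, int n)
{
    return r / rmax * n;
}

/* Only the denominator is converted: the addition is done in double,
   so rmax + 1 cannot overflow as an int, and r is converted
   implicitly for the division. */
int scale_cast_denominator(int r, int rmax, int n)
{
    return r / ((double)rmax + 1) * n;
}
```

With rmax = 2147483647 and n = 26, the first function returns 0 for almost every r and 26 (out of range) when r equals rmax; the second returns values in 0..25.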
 
Barry Schwarz

They are not simultaneously, but sequentially true.

I thought they were the same for the first 128 elements, and that ascii
filled out 129-256, while ebcdic was size 128.

The character '1' on an ASCII system is 0x31. On an EBCDIC system it
is 0xF1. ASCII 'A' is 0x41, EBCDIC is 0xC1. Worse, on an ASCII
system, 'J'-'I' is 1; in EBCDIC it is 8.

About the only character in common is '\0' which is 0x00 on both.
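
Because the codes differ, and in EBCDIC the letters are not even contiguous, adding an offset to 'a' is only safe where the letters are known to be consecutive. A sketch of a lookup that does not depend on the character set, with letter_for as a hypothetical name:

```c
#include <stddef.h>

/* Hypothetical helper: map 0..25 to 'a'..'z' by table lookup, so the
   result does not rely on the letters having consecutive codes. */
char letter_for(int i)
{
    static const char alphabet[] = "abcdefghijklmnopqrstuvwxyz";

    if (i < 0 || (size_t)i >= sizeof alphabet - 1)
        return '?';   /* out of range: the caller passed a bad index */
    return alphabet[i];
}
```

The '?' returned for a bad index is an arbitrary choice for this sketch; a real program might prefer to report the error differently.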
 
Keith Thompson

frank said:
Ben Bacarisse wrote: [...]
To get a floating-point number in [0, 1) I have taken to writing:

nextafter((double)rand() / RAND_MAX, 0)

nextafter is a C99 function that gives the next representable number
after the first argument in the direction of the second. There are
probably better ways to do this, but the best of all would be a
floating-point random function in C. Such a function could rely on
the internal representation of a floating point number to give a
properly uniform distribution. Many C libraries include such a
function as an extension.

Is gcc one of them?

No, since gcc is a compiler not a library. (glibc is the library
most commonly associated with gcc, but in fact gcc is generally
used with whatever library exists on the system.)

[...]
 
Keith Thompson

frank said:
Keith said:
frank said:
Barry Schwarz wrote:
[...]
i is 1337295409
i is 2147483647
One of these statements must be false.
They are not simultaneously, but sequentially true.

As I already pointed out and you acknowledged, the second "i is" was a
typo for "RAND_MAX is". i never takes on the value 2147483647, except
perhaps by coincidence. As printed, the first statement is true, the
second is false.

Now that I think about it, you are absolutely correct. i assumes one
value in the program and hence cannot be two. I think I was injecting
less belief in the "i is " part than a literal one. A longer version
of it may have been better to read "The integer I'm looking for is ".

So you were using "i" to refer generically to whatever integer you're
looking at the moment, while your program declares an integer object
named "i". That's, um, a very interesting way of looking at things.

Really, given that you wanted to display the values of i and RAND_MAX,
the only sensible thing to write would be:

printf("i is %d\n", i);
printf("RAND_MAX is %d\n", RAND_MAX);

or some variation.

[...]
For something like this, I don't want to write a program; I want to
look at limits.h for my implementation.

Ok, nobody's stopping you. But why? The standard headers aren't
generally written to be particularly human-readable. On my system,
for example, I see the following in /usr/include/limits.h:

/* Minimum and maximum values a `char' can hold. */
# ifdef __CHAR_UNSIGNED__
# define CHAR_MIN 0
# define CHAR_MAX UCHAR_MAX
# else
# define CHAR_MIN SCHAR_MIN
# define CHAR_MAX SCHAR_MAX
# endif

I can guess where __CHAR_UNSIGNED__ would be defined, but I'm not sure
(plain char is signed on my system).

For that matter, the standard headers aren't necessarily even
implemented as source files.

Writing and executing a program is the only reliable way to display
the values to which these macros expand.
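
Such a program can be very short. The following is only a sketch; print_char_limits and plain_char_is_signed are hypothetical names, and the output depends entirely on the implementation it is compiled for.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: print the char-related limits of this
   implementation. Returns the number of characters written, or a
   negative value on error. */
int print_char_limits(FILE *out)
{
    return fprintf(out,
                   "CHAR_MIN  = %d\nCHAR_MAX  = %d\n"
                   "SCHAR_MIN = %d\nSCHAR_MAX = %d\n"
                   "UCHAR_MAX = %u\n",
                   CHAR_MIN, CHAR_MAX, SCHAR_MIN, SCHAR_MAX,
                   (unsigned)UCHAR_MAX);
}

/* Nonzero if plain char is a signed type on this implementation. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

Calling print_char_limits(stdout) from main displays the values the macros actually expand to, whatever the header looks like.
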
I've probably asked you three
times for this over the past couple years, but I keep getting sent
back to go in a lot of ways, as I work up my linux install for the 4th
time. What is the name of the newsgroup specific to gcc?

You're probably looking for gnu.gcc.help.
 
Keith Thompson

frank said:
Keith said:
frank said:
Keith Thompson wrote:
[snipped and reordered for thematic reasons]

None of the three casts in your program are necessary, and IMHO your
code would be improved by dropping them.

srand(time(NULL);;
...
c = i;
This seems to work (with a right paren added and semi-colon removed):

Oops, typo on my part.

I think "we" should update the FAQ to replace your expression with the
one that Steve Summit had.

I think you meant that the other way around.
People like me, who read his collection
while sitting in a parking lot, are always grateful for his
contribution, but useless casts make code unreadable.

What useless casts? There were several in the earlier code you
posted; there aren't very many in the FAQ. In question 13.16, we see:

(int)((double)rand() / ((double)RAND_MAX + 1) * N)

and

(int)(drand48() * N)

The double casts *are* necessary. The int casts may or may not be,
depending on what's done with the result.

[...]
Still curious about this.

About what? I don't know what you're asking.
Indentation?
[52 lines deleted]
It took less than a minute.

Great. Though I'm not quite sure why you felt the need to tell us, in
great detail, how you did it. Just posting properly indented code is
more than enough.

http://clc-wiki.net/wiki/clc-wiki:Policies#codeformat

This is topical, according to the wiki.
[...]

My remark was more about verbosity than about topicality, but arguing
the point any further would just be ironic.
 
Keith Thompson

Richard Heathfield said:
Not even the second cast is necessary.

i = (rand() / (RAND_MAX + 1.0)) * N;

Nicely done!

In cases like this, though, I think I'd argue that using casts or not
is largely a matter of taste. I find that I'm a little uncomfortable
with the way the constant 1.0 imposes its type on the rest of the
expression, bubbling up through multiple levels of the tree.

And I suppose that's inconsistent with most of what I've said about
using casts where implicit conversions would do the same job. Oh,
well.
 
Ben Bacarisse

frank said:
Ben Bacarisse wrote:

Is gcc one of them?

glibc (rather than gcc) includes drand48. It is a POSIX function.
Someone else posted about this as well (sorry I forget who).

<snip>
 
Ben Bacarisse

Keith Thompson said:
frank writes:

What useless casts? There were several in the earlier code you
posted; there aren't very many in the FAQ. In question 13.16, we see:

(int)((double)rand() / ((double)RAND_MAX + 1) * N)

and

(int)(drand48() * N)

The double casts *are* necessary. The int casts may or may not be,
depending on what's done with the result.

Surely only one of the casts to double is needed? I'm not sure it's
clearer to omit one, but I don't think they are both needed. I have
also seen the expression with neither cast and 1 replaced by 1.0.

<snip>
 
Ben Bacarisse

re: (int)((double)rand() / ((double)RAND_MAX + 1) * N)
Surely only one of the casts to double is needed? I'm not sure it's
clearer to omit one, but I don't think they are both needed. I have
also seen the expression with neither cast and 1 replaced by 1.0.

Sorry. I see that got covered in another sub-thread.
 
Nick Keighley

Nick Keighley wrote:
I intend to write a couple C utilities [...].
Where I'm stuck right now is that I can't seem to find source
for invoking pseudo-random behaviour.  So here's my first attempt:
try the FAQ http://c-faq.com/lib/index.html
particularly questions 13.15 and 13.16
what do you think this does?

Wouldn't this be a demotion?  

well it stuffs an int into a char. The cast is unnecessary.
Mapping onto a smaller set like modular arithmetic.
 It *should* be able to produce any char in the set, and
somewhat equiprobably.

but it may produce some stuff that isn't a valid character. char might
be 8 bits and characters might be 7 bits (less likely these days).
Plus you might have problems with signed/unsigned chars.
I think they've switched numbers on the on-line ones:

Q: How can I split up a string into whitespace-separated fields?
How can I duplicate the process by which main() is handed argc and argv?

actually they *can't* be in the range 0..25 in C because 0 is reserved
for the nul character.
Use
FAQ [13.16] to get yourself a uniform distribution of numbers in the
range 0..25. Then use Keith's idea or add 'a' (which will
work for ASCII).

<snip>
 
Ersek, Laszlo

glibc (rather than gcc) includes drand48. It is a POSIX function.
Someone else posted about this as well (sorry I forget who).

It may have been me (too):

http://groups.google.com/group/comp.lang.c/msg/8a13f0c02a4769d2

As written there, drand48() is specifically not a POSIX function, it's
an XSI (X/Open System Interfaces / Single UNIX Specification) function.
It's (theoretically) possible to conform to POSIX without conforming to
the SUS.

In my understanding, the POSIX documents and the SUS documents were
different sets of specifications before SUSv3. They were merged starting
with SUSv3, and from that point on, put crudely, SUS \ XSI = POSIX.
Since drand48() is marked XSI, it's possible for an implementation to
conform to POSIX without providing drand48().

Personally, I'd call it without a second thought: if I was programming
for SUSv2 / UNIX 98 (a distinct set of specs from the then-current POSIX
specs), the interface is required; if I was programming for SUSv3 / UNIX
03, or SUSv4, I'd just document the XSI-dependency of the code. I
believe this would be no limitation at all in practice; see the list of
UNIX 98 / UNIX 03 certified products:

http://www.opengroup.org/openbrand/register/

No GNU/Linux based product there, but that's the least worry: the
development of glibc seems to drive the SUS, so to say.

http://www.reddit.com/r/programming...oggit_we_talk_a_lot_about_programming/c0fvdwn

Cheers,
lacos
http://lacos.hu/
 
Flash Gordon

frank said:
Keith said:
frank said:
Keith Thompson wrote:
[snipped and reordered for thematic reasons]

None of the three casts in your program are necessary, and IMHO your
code would be improved by dropping them.

srand(time(NULL);;
...
c = i;
This seems to work (with a right paren added and semi-colon removed):

Oops, typo on my part.

I think "we" should update the FAQ to replace your expression with the
one that Steve Summit had.

The FAQ is maintained by Steve Summit, you will have to email him if you
think something should be changed.
People like me, who read his collection
while sitting in a parking lot, are always grateful for his
contribution, but useless casts make code unreadable.

I would be inclined to look more carefully... it is entirely possible
that what Steve has in the FAQ is correct.


<snip>

Those are the policies for the Wiki (and are open to change if someone
wants them changed).

The link for discussions about topical views about comp.lang.c is this
http://clc-wiki.net/wiki/Intro_to_clc
 
frank

frank said:
Keith said:
Keith Thompson wrote:
[snipped and reordered for thematic reasons]
None of the three casts in your program are necessary, and IMHO your
code would be improved by dropping them.
    srand(time(NULL);;
    ...
    c = i;
This seems to work (with a right paren added and semi-colon removed):
Oops, typo on my part.
I think "we" should update the FAQ to replace your expression with the
one that Steve Summit had.

The FAQ is maintained by Steve Summit, you will have to email him if you
think something should be changed.
 People like me, who read his collection
while sitting in a parking lot, are always grateful for his
contribution, but useless casts make code unreadable.

I would be inclined to look more carefully... it is entirely possible
that what Steve has in the FAQ is correct.

This is topical, according to the wiki.

<snip>

Those are the policies for the Wiki (and are open to change if someone
wants them changed).

The link for discussions about topical views about comp.lang.c is this
http://clc-wiki.net/wiki/Intro_to_clc

Thx, Flash, I found this as I was just poking around there:

# Compiler-specific questions, such as installation issues and
locations of header files. Ask about these in compiler-specific
newsgroups, such as gnu.gcc.help, comp.os.msdos.djgpp (x86 version of
the free gcc C compiler), comp.compilers.lcc (the LCC family of C
compilers including LCC-Win32).


1.4 is a good resource for me as I seem to need to repopulate my
newsreader every 6 months.
 
Phil Carmody

Keith Thompson said:
Nicely done!

In cases like this, though, I think I'd argue that using casts or not
is largely a matter of taste. I find that I'm a little uncomfortable
with the way the constant 1.0 imposes its type on the rest of the
expression, bubbling up through multiple levels of the tree.

And I suppose that's inconsistent with most of what I've said about
using casts where implicit conversions would do the same job. Oh,
well.

After years of dithering between different techniques, I now almost
always use Richard's. I think nothing says 'use floating point' more
concisely than actually sticking floating point numbers in the
expression. Try it - you might like it ;-)

Phil
 
Phil Carmody

Ben Bacarisse said:
Here be dragons. As 64 bit integers get more and more common we are
edging towards a time when this will routinely fail because
(double)RAND_MAX + 1 can be equal to RAND_MAX if RAND_MAX is big
enough. This does not happen with a 32-bit RNG and normal IEEE
double-precision numbers, but if RAND_MAX is big enough (and a signed
64-bit int is big enough) the +1 has no effect on (double)RAND_MAX.

I'd like to see which real-world value of RAND_MAX and which
rounding mode could make that happen. Typically, RAND_MAX will
be odd, and adding 1 will cause a cascade of carries and leave
a result which requires fewer bits of precision to represent
accurately.

Phil
 
Ben Bacarisse

Phil Carmody said:
I'd like to see which real-world value of RAND_MAX and which
rounding mode could make that happen.

I don't know of real world systems with 64-bit RAND_MAX. My warning
was for the future of this common idiom. Of course, if you switch
rand() for a 64-bit RNG (like KISS recently posted by George
Marsaglia) you get the same problem if you use the idiom above.
Typically, RAND_MAX will
be odd, and adding 1 will cause a cascade of carries and leave
a result which requires fewer bits of precision to represent
accurately.

That doesn't help, I don't think. Using my gcc, it is true that a
large RAND_MAX gets rounded up when converted to double, but so do a
few of the returned values from rand(). Adding 1 does not have the
effect of making the result of the division < 1.0 because the +1 has
no effect.

The probability of getting a large rand() from a 64-bit generator is
very low, but it will happen one day! Here is the worst-case example
on my system. 512 returns from rand() can cause problems. You can
fix it with long double but since that may be no wider than double,
that option can just mask the problem until the code moves to another
machine.

#include <stdio.h>
#include <limits.h>

#define RAND_MAX LLONG_MAX
#define N 10

long long int rand(void) { return RAND_MAX-511; }

int main(void)
{
    int n = (rand() / (RAND_MAX + 1.0)) * N;
    if (n == N)
        puts("Don't use n as an array index!");
    return 0;
}
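
The rounding behind this example can be checked in isolation. The sketch below assumes IEEE 754 double precision, the default round-to-nearest mode, and a 64-bit long long; the function names are hypothetical. Intermediate results are stored in volatile objects so that any extra precision held in registers is discarded.

```c
#include <limits.h>

/* Nonzero if adding 1 to (double)LLONG_MAX changes nothing, i.e. the
   "+ 1" in the idiom is lost to rounding. */
int plus_one_is_lost(void)
{
    volatile double max = (double)LLONG_MAX;   /* rounds up to 2^63 */
    volatile double next = max + 1.0;          /* 2^63 + 1 is not representable */

    return max == next;
}

/* Nonzero if r, standing in for a 64-bit rand() result, divides to
   exactly 1.0, so that scaling by n would give n instead of n - 1. */
int ratio_reaches_one(long long r)
{
    volatile double ratio = r / (LLONG_MAX + 1.0);

    return ratio == 1.0;
}
```

Under those assumptions, the 512 values from LLONG_MAX - 511 up to LLONG_MAX all convert to 2^63 and reach a ratio of 1.0, while LLONG_MAX - 512 rounds down and stays below it.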
 
frank

Keith said:
Nicely done!

In cases like this, though, I think I'd argue that using casts or not
is largely a matter of taste. I find that I'm a little uncomfortable
with the way the constant 1.0 imposes its type on the rest of the
expression, bubbling up through multiple levels of the tree.

And I suppose that's inconsistent with most of what I've said about
using casts where implicit conversions would do the same job. Oh,
well.

I missed this. (A guy's gotta sleep some time.) Ben thinks that what's
happening here is a little too nuanced for me, but I think I can get my
head around it and bring the necessary machinery to bear:

dan@dan-desktop:~/source$ gcc -std=c99 -Wall -Wextra mort5.c -o out; ./out
1222567701 = 01001000110111101110011100010101
14 = 00000000000000000000000000001110
dan@dan-desktop:~/source$ cat mort5.c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <math.h>
#include <limits.h>

#define STRING "%13d = %s\n"
#define E_TYPE int
#define N 26

typedef E_TYPE e_type;

void bitstr (char *str, const void *obj, size_t n);

int
main (void)
{
  int i, j;
  e_type e;
  char ebits[CHAR_BIT * sizeof e + 1];

  srand (time (NULL));
  j = rand ();
  e = j;
  bitstr (ebits, &e, sizeof e);
  printf (STRING, e, ebits);
  i = (j / (RAND_MAX + 1.0)) * N;
  e = i;
  bitstr (ebits, &e, sizeof e);
  printf (STRING, e, ebits);

  return 0;
}

void
bitstr (char *str, const void *obj, size_t n)
{
  unsigned mask;
  const unsigned char *const byte = obj;

  while (n-- != 0)
    {
      mask = ((unsigned char) -1 >> 1) + 1;

      do
        {
          *str++ = (char) (mask & byte[n] ? '1' : '0');
          mask >>= 1;
        }
      while (mask != 0);
    }
  *str = '\0';
}


// gcc -std=c99 -Wall -Wextra mort5.c -o out; ./out
dan@dan-desktop:~/source$

I've seen now several versions of
i = (j / (RAND_MAX + 1.0)) * N;
What is happening to 1222567701 / (RAND_MAX + 1.0) * 26?
Of what type is (RAND_MAX + 1.0) ?

Thanks again for your thoughtful comments.
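
One way to see the answer to both questions is to split the expression into steps. Since 1.0 has type double, the usual arithmetic conversions make RAND_MAX + 1.0 a double; the division and the multiplication are then done in double as well, and only the final assignment converts back to int, discarding the fraction. A sketch with hypothetical names, taking j and the RAND_MAX value as parameters so the numbers from the run above can be reproduced:

```c
/* Hypothetical helper: the same computation as
   i = (j / (RAND_MAX + 1.0)) * N, with one conversion per step. */
int scale_stepwise(int j, int rand_max, int n)
{
    double denom = rand_max + 1.0;   /* rand_max converted to double, then added */
    double ratio = j / denom;        /* j converted to double for the division */
    double scaled = ratio * n;       /* n converted to double for the product */

    return (int)scaled;              /* conversion to int truncates toward zero */
}
```

With j = 1222567701, rand_max = 2147483647 and n = 26, the ratio is roughly 0.5693 and the scaled value roughly 14.80, which truncates to the 14 shown in the output.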
 
