32/64 bit cc differences


Ben Bacarisse

glen herrmannsfeldt said:
(snip, I wrote)



And, of course, using floating point there isn't any bias...

When the floats are used to make an integer selection (i.e. you replace
int_rand() % max with floor(float_rand() * max)) the bias remains.
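A small counting sketch may make this concrete. It assumes a hypothetical generator with only n equally likely outputs k = 0..n-1 (so float_rand() would return k/n) and tallies how many outputs land in each of max buckets under either mapping. The function names and the tiny n are illustrative only.

```c
/* Tally buckets for floor((k / n) * max), k = 0 .. n-1.
   counts[] must have room for max entries. */
void tally_scaled(unsigned n, unsigned max, unsigned counts[])
{
    for (unsigned b = 0; b < max; b++)
        counts[b] = 0;
    for (unsigned k = 0; k < n; k++) {
        double u = (double)k / (double)n;   /* in [0, 1) */
        counts[(unsigned)(u * max)]++;      /* truncation equals floor for u >= 0 */
    }
}

/* Tally buckets for k % max, k = 0 .. n-1. */
void tally_modulo(unsigned n, unsigned max, unsigned counts[])
{
    for (unsigned b = 0; b < max; b++)
        counts[b] = 0;
    for (unsigned k = 0; k < n; k++)
        counts[k % max]++;
}
```

With n = 8 and max = 3, both tallies come out as 3, 3, 2: eight equally likely outcomes cannot be split evenly three ways, whichever mapping is used.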
 

Keith Thompson

Ben Bacarisse said:
No. What matters is the types intmax_t and uintmax_t and they can't be
anything like as small as 16 bits. See 6.10.1p4.

<snip>

And C90 had a similar rule, with preprocessor expressions evaluated in
type long or unsigned long. 2147483647 isn't a problem for any
conforming compiler. (And if you're using a non-conforming compiler,
you've got more problems than we can help you with here.)
 

JohnF

Keith Thompson said:
JohnF said:
Keith Thompson said:
[...]
The code's supposed to be portable, and not care as long as int>=32.

Then it's only mostly portable; the standard allows int to be as
narrow as 16 bits.

Yeah, but that's pretty much deprecated/archaic, at least for
general purpose computers. I usually just try to follow K&R 2nd ed
for "portable" syntax, whereas "portable semantics" gets trickier,
and I usually just try to figure "anything that can go wrong will".

Most modern *hosted* implementations make int 32 bits, but there's
nothing deprecated or archaic about 16-bit int (at least as far as
the C standard is concerned).

POSIX requires at least 32 bits, so if your program already depends on
POSIX features, you can safely make that assumption. Otherwise, you can
certainly assume 32-bit or wider int if you want to, but I personally
would take care to make that assumption explicit, so if someone tries to
compile my code with a fully conforming implementation that happens to
have 16-bit int the problem will be detected early.
#include <limits.h>
#if INT_MAX < 2147483647
#error This code requires at least 32-bit int
#endif

I wasn't really trying to make any big point. Nowadays I just think,
like you say wrt posix, it's reasonable not to worry about <32-bit
architectures, at least for general purpose programs not intended
to be embedded anywhere, etc. Portability across platforms has so
many pitfalls that you can't reasonably worry about every conceivable
one, but have to "choose your battles".
For example, one thing I dislike about mswin (in addition to what
you mention below) is that stdout isn't default binary mode, but
outputs two chars, crlf, for your one \n. Code that cares, and which
is intended to run on both win and unix, gets messy dealing with that.
The one place it wanted strictly >32 I used long long (despite
obnoxious -Wall warnings about it). Anyway, I found the problem,
explained in subsequent followup, kind of along the lines you're
suggesting, but a rounding problem.

I'd probably use int64_t and friends. But what warnings do you get when
you use long long? You can likely get rid of any such warnings by
telling your compiler to conform to C99 or later.
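For illustration, a minimal sketch of the <stdint.h> route, assuming a C99 (or later) compiler that provides int64_t. The function names widen_product and format_i64 are made up for this example and are not from fm.c.

```c
#include <inttypes.h>   /* PRId64 */
#include <stdint.h>     /* int32_t, int64_t */
#include <stdio.h>

/* Multiply two 32-bit values without overflow by widening first. */
int64_t widen_product(int32_t a, int32_t b)
{
    return (int64_t)a * (int64_t)b;
}

/* Format a 64-bit value without hard-coding "%lld";
   returns whatever snprintf returns. */
int format_i64(char *buf, size_t size, int64_t v)
{
    return snprintf(buf, size, "%" PRId64, v);
}
```

The PRId64 macro sidesteps the question of which length modifier matches the 64-bit type on a given platform.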

That might be preferable to LL. All three compilers
    64-bit: cc --version   cc (Debian 4.3.2-1.1) 4.3.2
    32-bit: cc --version   cc (NetBSD nb2 20110806) 4.5.3
            cc --version   cc (GCC) 4.7.1
issue similar -pedantic -Wall warnings. Explicitly, from 4.7.1,
fm.c: In function 'rseeder':
fm.c:865:6: warning: ISO C90 does not support 'long long' [-Wlong-long]
fm.c:866:11: warning: use of C99 long long integer constant [-Wlong-long]
fm.c:877:20: warning: ISO C90 does not support 'long long' [-Wlong-long]
fm.c:878:11: warning: ISO C90 does not support 'long long' [-Wlong-long]
fm.c:880:25: warning: use of C99 long long integer constant [-Wlong-long]
fm.c:880:30: warning: use of C99 long long integer constant [-Wlong-long]
fm.c:880:37: warning: use of C99 long long integer constant [-Wlong-long]
fm.c:892:3: warning: ISO C90 does not support the 'll' gnu_printf length modifier [-Wformat]
But that whole function ought to be re-algorithmized anyway,
so my concern is pretty minimal.

Note how the warning is phrased: "ISO C90 does not support 'long long'".
The long long type has been a standard C feature since the 1999 standard
(and a common extension before that). Failure to support long long is
not merely deprecated, it's completely non-standard. If you're willing
to assume that int is at least 32 bits, you should be even more willing
to assume that long long exists.

And <stdint.h> also did not exist in C90; both it and long long were
introduced by C99.

Just invoke your compiler with options to tell it to use a more modern
version of the language.

gcc in particular uses "-std=gnu89" by default, which is C89/C90 with
GNU extensions. IMHO this is unfortunate, and it's time for gcc to
support C99 by default. But it probably doesn't make much sense to rely
on gcc's default anyway.

If you need your code to be portable to Microsoft's compiler, you might
have a problem; I don't remember whether it supports long long, but I
know it doesn't support C99 or C11.

Thanks for suggestions, but note above remark,
But that whole function [that uses long long] ought to be
re-algorithmized anyway, so my concern is pretty minimal.
And, as per a previous followup, the whole "obnoxious warnings"
remark was intended to be humorous, and I'd actually prefer
continuing to see the warnings, just to remind me that I should
get around to fixing the algorithm (it's the one that seeds
the rng with a hash-like number derived from your key -- I should
choose a better hash).
 

JohnF

Ike Naar said:
int iran1 ( int ilo, int ihi ) { /* you want int rn from ilo to ihi */
long ran1(/*some args go here*/), /*original rng from Numerical Recipes*/
iran = ran1(/*args*/), /* integer result from rng */
range = ihi-ilo+1, /* ihi-ilo+1 */
IM = 2147483647, /* ran1()'s actual range is 1...IM */

Isn't ran1()'s actual range [1..2147483646] ?
imax = IM - (IM%range); /* force iran's max to a multiple of range */
while ( iran >= imax ) iran=ran1(/*args*/); /*discard out-of-range iran*/
return ( ilo + (iran%range) ); } /* back with random ilo <= i <= ihi */

Yes, actual max is IM-1, which code accommodates with >= in while().
I'll debug the comments later. But the whole bias problem solved by
all this is minuscule when range<<IM, which is pretty much always
the case. You can just comment out that while() and forget the whole
thing.
 

John Forkosh

Ben Bacarisse said:
They may be numerical tests based on the floating point value. It will
make almost no difference to a numerical test if the bottom bit of the
int (just before the final divide) cycles 0,1,1,0,1,1,0,... (for
example) but it will make a big difference if you make binary choices by
using ran1(...) & 1. Eric S suggests that this sort of thing does not
happen with the PRNG you use, but I'd not seen that post when I wrote.


Assuming that the floats are well distributed, is not quite the same as
assuming that the ints have all the right properties so a test or two
would not go aims.

Actually, I think your "amiss" went amiss :).
More to the point, I'm now using that iran1() function in preceding
followup, which (if you're not easily finding it) is,
"...the solution I've now coded was based on Eric's preceding
discussion.
It's pseudocoded below from the real code in forkosh.com/fm.zip,"
int iran1 ( int ilo, int ihi ) { /*you want an int rn from ilo to ihi*/
long ran1(/*some args go here*/), /*original rng from Numerical Recipes*/
iran = ran1(/*args*/), /* integer result from rng */
range = ihi-ilo+1, /* the range you want is ihi-ilo+1 */
IM = 2147483647, /* ran1()'s actual range is 1...IM */
imax = IM - (IM%range); /* force ran1's max to a multiple of range*/
while ( iran >= imax ) iran=ran1(/*args*/); /*reject out-of-range iran*/
return ( ilo + (iran%range) ); } /* back with random ilo <= i <= ihi */

So it's using mod arithmetic rather than &. But for the one instance
where a binary choice is needed, I do call iran1(0,1), meaning it
eventually does an iran%2, which is pretty much identical to iran&1.
Of course, I could instead do iran1(0,999)/500 to get 0 or 1.
That would be easy. Trying to come up with a valid test suite
would be harder than I care to contemplate. And if it reveals an
unwanted regularity in those ints, now what?...I have to go get
a whole different rng and start all over with it. Big pain.
But I will change that iran1(0,1). Thanks,
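To illustrate the kind of low-bit regularity Ben describes, here is a sketch that uses a different generator from ran1(): a linear congruential step modulo 2^32. The multiplier and increment are a commonly quoted "quick and dirty" pair and are an assumption of this example. With an odd multiplier and an odd increment, the lowest bit simply alternates, while higher bits have much longer periods. Nothing here shows whether ran1()'s prime-modulus output has a comparable weakness.

```c
#include <stdint.h>

/* One step of x -> a*x + c (mod 2^32); the cast keeps the low 32 bits. */
uint32_t lcg32_next(uint32_t x)
{
    return (uint32_t)(1664525u * x + 1013904223u);
}

/* Binary choice from the lowest bit: alternates on every step for this LCG. */
unsigned coin_low_bit(uint32_t x)
{
    return x & 1u;
}

/* Binary choice from the highest bit instead. */
unsigned coin_high_bit(uint32_t x)
{
    return x >> 31;
}
```

Taking the choice from the high end, which is roughly what iran1(0,999)/500 does, avoids depending on the weakest bit of a generator like this one.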
 

JohnF

J. Clarke said:
(e-mail address removed) says...

I'm no expert but one thing I learned <mumble> years ago was to make
sure that the problem you're chasing really is the problem you _think_
you're chasing. You've got three different versions of the compiler
with two of them giving one behavior and the third, oldest one giving a
different behavior, which you are attributing to 64 bit vs 32-bit. It
could also be the result of some change made to the more recent releases
of the compiler and I would want to rule that out rather than assuming
that it's a 32- vs 64- bit issue.

Problem found and fixed, as per earlier followups.
Turned out to be slightly different float behavior.
But you could be right that it wasn't a 64-bit issue,
per se. And I'd tried cc -m32-bit, as per previous
followups, but compiler barfed at that switch (not
sure why, man cc wasn't on that box). So I couldn't
try to get a finer-grained understanding of problem.
 

Ike Naar

Ike Naar said:
int iran1 ( int ilo, int ihi ) { /* you want int rn from ilo to ihi */
long ran1(/*some args go here*/), /*original rng from Numerical Recipes*/
iran = ran1(/*args*/), /* integer result from rng */
range = ihi-ilo+1, /* ihi-ilo+1 */
IM = 2147483647, /* ran1()'s actual range is 1...IM */

Isn't ran1()'s actual range [1..2147483646] ?
imax = IM - (IM%range); /* force iran's max to a multiple of range */
while ( iran >= imax ) iran=ran1(/*args*/); /*discard out-of-range iran*/
return ( ilo + (iran%range) ); } /* back with random ilo <= i <= ihi */

Yes, actual max is IM-1, which code accommodates with >= in while().
I'll debug the comments later. But the whole bias problem solved by
all this is minuscule when range<<IM, which is pretty much always
the case. You can just comment out that while() and forget the whole
thing.

There's still a bias:
The result from ran1() is in the range [1..2147483646]
Take, for example, [ilo..ihi] = [0..1],
then range = 2 and imax = 2147483646
After discarding out-of-range values of iran, we
end up with iran in the range [1..imax-1] = [1..2147483645].

There are 1073741822 numbers in that range that are 0 (mod 2),
the lowest number being 2, the highest number being 2147483644.
There are 1073741823 numbers in that range that are 1 (mod 2),
the lowest number being 1, the highest number being 2147483645.
So the outcome 0 has a smaller probability than the outcome 1.
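The counts above can be checked mechanically, and one possible repair is to shift the raw value down to start at 0 before rejecting. The sketch below is not the code from fm.zip: demo_raw() is a hypothetical stand-in for ran1() (a plain Park-Miller step using long long, so it assumes C99), and count_residue() and iran_unbiased() are names invented here.

```c
#define IM 2147483647L          /* modulus; raw values lie in 1 .. IM-1 */

static long demo_state = 1;     /* hypothetical stand-in for ran1()'s state */

/* Stand-in for ran1(): one Park-Miller step, result in 1 .. IM-1. */
static long demo_raw(void)
{
    demo_state = (long)((16807LL * demo_state) % IM);
    return demo_state;
}

/* How many integers x in [lo, hi] (with lo >= 0) satisfy x % m == r? */
long count_residue(long lo, long hi, long m, long r)
{
    long upto_hi  = (hi >= r)     ? (hi - r) / m + 1     : 0;
    long below_lo = (lo - 1 >= r) ? (lo - 1 - r) / m + 1 : 0;
    return upto_hi - below_lo;
}

/* Rejection sampling over the zero-based values 0 .. IM-2. */
int iran_unbiased(int ilo, int ihi)
{
    long n     = IM - 1;                 /* number of distinct raw values */
    long range = (long)ihi - ilo + 1;
    long limit = n - (n % range);        /* largest multiple of range <= n */
    long r;
    do {
        r = demo_raw() - 1;              /* shift 1 .. IM-1 down to 0 .. IM-2 */
    } while (r >= limit);
    return ilo + (int)(r % range);
}
```

For range = 2, count_residue(1, 2147483645, 2, 0) gives 1073741822 and count_residue(1, 2147483645, 2, 1) gives 1073741823, matching the figures above. Over the shifted interval [0..2147483645] both residues occur 1073741823 times.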
 

JohnF

Ike Naar said:
Ike Naar said:
int iran1 ( int ilo, int ihi ) { /* you want int rn from ilo to ihi */
long ran1(/*some args go here*/), /*original rng from Numerical Recipes*/
iran = ran1(/*args*/), /* integer result from rng */
range = ihi-ilo+1, /* ihi-ilo+1 */
IM = 2147483647, /* ran1()'s actual range is 1...IM */

Isn't ran1()'s actual range [1..2147483646] ?

imax = IM - (IM%range); /* force iran's max to a multiple of range */
while ( iran >= imax ) iran=ran1(/*args*/); /*discard out-of-range iran*/
return ( ilo + (iran%range) ); } /* back with random ilo <= i <= ihi */

Yes, actual max is IM-1, which code accommodates with >= in while().
I'll debug the comments later. But the whole bias problem solved by
all this is minuscule when range<<IM, which is pretty much always
the case. You can just comment out that while() and forget the whole
thing.

There's still a bias:
The result from ran1() is in the range [1..2147483646]
Take, for example, [ilo..ihi] = [0..1],
then range = 2 and imax = 2147483646
After discarding out-of-range values of iran, we
end up with iran in the range [1..imax-1] = [1..2147483645].

There are 1073741822 numbers in that range that are 0 (mod 2),
the lowest number being 2, the highest number being 2147483644.
There are 1073741823 numbers in that range that are 1 (mod 2),
the lowest number being 1, the highest number being 2147483645.
So the outcome 0 has a smaller probability than the outcome 1.

Ah, yes. Shh, don't breathe a word to anybody,
but right now, as we speak, I'm submitting my patent
application for my new algorithm that takes an
integer odd number of items, and separates them
into two equal-sized piles.
Can you say "internet billionaire"?
 

Keith Thompson

JohnF said:
For example, one thing I dislike about mswin (in addition to what
you mention below) is that stdout isn't default binary mode, but
outputs two chars, crlf, for your one \n. Code that cares, and which
is intended to run on both win and unix, gets messy dealing with that.
[...]

stdout is a text stream in *all* C implementations.

The difference is in the way Windows and, say, UNIX represent
text files. In UNIX, the end of a line is indicated by a single
linefeed ('\n') character; in Windows, it's marked by a carriage
return followed by a linefeed ('\r', '\n').

For text streams, C translates a single newline character to the local
system's end-of-line representation on output, and vice versa on input.

The point of this is to make it *easier* to write portable code that
deals with text files. For example, you can write a single line to
stdout like this:

printf("Hello, world\n");

rather than:

if (running_on_windows) {
printf("Hello, world\r\n"); /* unnecessary */
}
else {
printf("Hello, world\n");
}

Things do become a bit more difficult if you have to deal with "foreign"
text files, but that's pretty much unavoidable.

And if you want to read and write binary files, just use a binary
stream; stdout isn't intended to deal with binary files.
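As a minimal sketch of the binary-stream route (the helper names and the file name used in any call are made up for illustration), the "b" in the fopen() mode string is what requests a binary stream:

```c
#include <stdio.h>

/* Write n bytes to path through a binary stream; returns 0 on success. */
int write_binary_file(const char *path, const unsigned char *data, size_t n)
{
    FILE *fp = fopen(path, "wb");
    size_t written;
    if (fp == NULL)
        return -1;
    written = fwrite(data, 1, n, fp);
    if (fclose(fp) != 0 || written != n)
        return -1;
    return 0;
}

/* Count the bytes in a file by reading it back in binary mode; -1 on error. */
long binary_file_size(const char *path)
{
    FILE *fp = fopen(path, "rb");
    long count = 0;
    if (fp == NULL)
        return -1;
    while (fgetc(fp) != EOF)
        count++;
    fclose(fp);
    return count;
}
```

Bytes such as '\n' written this way should come back unchanged and in the same number, since no end-of-line translation is applied to a binary stream.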
 

JohnF

Keith Thompson said:
JohnF said:
For example, one thing I dislike about mswin (in addition to what
you mention below) is that stdout isn't default binary mode, but
outputs two chars, crlf, for your one \n. Code that cares, and which
is intended to run on both win and unix, gets messy dealing with that.
[...]

stdout is a text stream in all C implementations.
For text streams, C translates a single newline character to the local
system's end-of-line representation on output, and vice versa on input.
And if you want to read and write binary files, just use a binary
stream; stdout isn't intended to deal with binary files.

Thanks for the info. Here's the problem that I've encountered.
Lots of my programs are cgi's that emit binary files, typically
gifs, used in html as, e.g.,
<img src="/cgi-bin/myprog.cgi?instructions and/or data for image">
In this case, myprog >>has to<<, as I understand it, emit to stdout.
Is that right? If so, I need to put stdout in "binary mode"
(that's what windows calls it, the typical win C command being
something like setmode(fileno(stdout),O_BINARY)).
Got a fix for, or insight into, dealing with that without
messy #ifdef stuff? Thanks,
 

JohnF

JohnF said:
Keith Thompson said:
JohnF said:
For example, one thing I dislike about mswin (in addition to what
you mention below) is that stdout isn't default binary mode, but
outputs two chars, crlf, for your one \n. Code that cares, and which
is intended to run on both win and unix, gets messy dealing with that.
[...]

stdout is a text stream in all C implementations.
For text streams, C translates a single newline character to the local
system's end-of-line representation on output, and vice versa on input.
And if you want to read and write binary files, just use a binary
stream; stdout isn't intended to deal with binary files.

Thanks for the info. Here's the problem that I've encountered.
Lots of my programs are cgi's that emit binary files, typically
gifs, used in html as, e.g.,
<img src="/cgi-bin/myprog.cgi?instructions and/or data for image">
In this case, myprog >>has to<<, as I understand it, emit to stdout.
Is that right? If so, I need to put stdout in "binary mode"
(that's what windows calls it, the typical win C command being
something like setmode(fileno(stdout),O_BINARY)).
Got a fix for, or insight into, dealing with that without
messy #ifdef stuff? Thanks,

Sorry for following myself up:
I should have mentioned that several "intended-to-be-portable"
fixes that I've tried, in particular freopen("CON","wb",stdout)
and stdout=fdopen(STDOUT_FILENO,"wb"), don't work or don't work
portably, for one reason or another (tales of woe elided:)
So I'm asking for a pretty much known-to-portably-work fix.
 

James Kuyper

Sorry for following myself up:
I should have mentioned that several "intended-to-be-portable"
fixes that I've tried, in particular freopen("CON","wb",stdout)
and stdout=fdopen(STDOUT_FILENO,"wb"), don't work or don't work
portably, for one reason or another (tales of woe elided:)

For freopen(), "It is implementation-defined which changes of mode are
permitted (if any), and under what circumstances.", so you can't count
on that to work.

fdopen() is a POSIX function; I've no idea whether the function with
that name that you're trying to use on a mswin system is supposed to
conform fully to POSIX specifications for that function. More
importantly, stdout is only required to be an expression of the type
"pointer to FILE"; it's not required to be the name of a pointer
variable that you can assign to. For instance, an implementation of
<stdio.h> could have:

extern FILE __std_streams[];
#define stdout (&__std_streams[0])
#define stdin (&__std_streams[1])
#define stderr (&__std_streams[2])

You could get around that problem, at least, by assigning the value
returned by fdopen() in your own pointer, rather than trying to store it
in stdout.
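A rough sketch of that workaround, assuming a POSIX system (fdopen(), dup() and STDOUT_FILENO are POSIX, not ISO C). The name open_own_stdout is invented here, and whether the "b" has any effect on a non-POSIX fdopen() is not something this example establishes.

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX declarations */
#include <stdio.h>
#include <unistd.h>

/* Return a separate FILE* that writes to the same place as stdout,
   leaving the stdout expression itself untouched. NULL on failure. */
FILE *open_own_stdout(void)
{
    int fd = dup(STDOUT_FILENO);   /* duplicate so fclose() won't close fd 1 */
    FILE *fp;
    if (fd == -1)
        return NULL;
    fp = fdopen(fd, "wb");
    if (fp == NULL)
        close(fd);
    return fp;
}
```

Output written through the returned pointer is buffered separately from stdout, so mixing the two without fflush() may reorder data.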
So I'm asking for a pretty much known-to-portably-work fix.

I can't help you with that. The last time I did any CGI work was more
than a decade ago, and the output was pure text, so the fact that stdout
is in text mode wasn't a problem. Also, it was on a unix-like system
where there's no difference between text mode and binary mode.
 

Stephen Sprunk

Problem found and fixed, as per earlier followups. Turned out to be
slightly different float behavior. But you could be right that it
wasn't a 64-bit issue, per se. And I'd tried cc -m32-bit, as per
previous followups, but compiler barfed at that switch

Shouldn't that be "-m32"?

http://gcc.gnu.org/onlinedocs/gcc-4...2d64-Options.html#i386-and-x86_002d64-Options
(not sure why, man cc wasn't on that box). So I couldn't try to get a
finer-grained understanding of problem.

Just Google "man gcc"; that's available nearly everywhere.

S
 

Keith Thompson

JohnF said:
Keith Thompson said:
JohnF said:
For example, one thing I dislike about mswin (in addition to what
you mention below) is that stdout isn't default binary mode, but
outputs two chars, crlf, for your one \n. Code that cares, and which
is intended to run on both win and unix, gets messy dealing with that.
[...]

stdout is a text stream in all C implementations.
For text streams, C translates a single newline character to the local
system's end-of-line representation on output, and vice versa on input.
And if you want to read and write binary files, just use a binary
stream; stdout isn't intended to deal with binary files.

Thanks for the info. Here's the problem that I've encountered.
Lots of my programs are cgi's that emit binary files, typically
gifs, used in html as, e.g.,
<img src="/cgi-bin/myprog.cgi?instructions and/or data for image">
In this case, myprog >>has to<<, as I understand it, emit to stdout.
Is that right? If so, I need to put stdout in "binary mode"
(that's what windows calls it, the typical win C command being
something like setmode(fileno(stdout),O_BINARY)).
Got a fix for, or insight into, dealing with that without
messy #ifdef stuff? Thanks,

I really don't know.

stdout is *intended* for text output. On Unix-like systems, you
happen to be able to write binary data to a text stream without
any problems (because the end-of-line translation doesn't need to
do anything), but C in general doesn't guarantee that will work.

Normally, you'd write binary data by opening a file (not stdout)
in binary mode and writing to it.

If CGI imposes a requirement to write binary data to stdout,
then I'm sure there's a solution; I just have no idea what
it is. Try Googling and/or posting in another forum. There's a
comp.infosystems.www.authoring.cgi newsgroup, but I have no idea
whether it's still active. I've found stackoverflow.com to be a
good site for this kind of question. But first check for existing
answers; you're unlikely to be the first person to run into this.
 

Andrew Cooper

Problem found and fixed, as per earlier followups.
Turned out to be slightly different float behavior.
But you could be right that it wasn't a 64-bit issue,
per se. And I'd tried cc -m32-bit, as per previous
followups, but compiler barfed at that switch (not
sure why, man cc wasn't on that box). So I couldn't
try to get a finer-grained understanding of problem.

I know I am a little late to the thread here, but can explain your problem.

I don't know whether you have updated the code on your website, but the
code (as gotten 10 mins ago) won't work.

When compiled as 32bit, ran1() uses x87 FPU instructions, but when
compiled as 64bit, ran1() uses SSE instructions.

The 32bit code keeps its intermediate values on the x87 register stack,
causing rounding to occur at 80 bits worth of precision, which is
different to the SSE code (which appears to be rounding at 64 bits, but
frankly it's late and SSE instructions look far too similar for their own
good)
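One way to see which situation a given build is in, sketched under the assumption of a C99 compiler: <float.h> defines FLT_EVAL_METHOD, which reports the precision used for intermediate results. As far as I know, x87 builds typically report 2 and SSE builds 0, but treat that as an expectation rather than a guarantee. The helper names are made up.

```c
#include <float.h>

/* FLT_EVAL_METHOD: 0 = evaluate in the operand's own type,
   1 = float and double evaluated as double,
   2 = everything evaluated as long double,
   negative = indeterminable or implementation-defined. */
int eval_method(void)
{
    return FLT_EVAL_METHOD;
}

/* Storing through a volatile double forces any extra intermediate
   precision to be discarded at this point. */
double round_to_double(double x)
{
    volatile double stored = x;
    return stored;
}
```

Rounding at chosen points can narrow the gap between builds, though it does not by itself make a float-based generator bit-for-bit reproducible.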


Basically, avoid any form of floating point calculations at all,
especially if you are expecting something deterministic. You (like 99
out of every 100 programmers, myself included) do not know how to use
them correctly.

~Andrew
 

Andrew Cooper

I know I am a little late to the thread here, but can explain your problem.

I don't know whether you have updated the code on your website, but the
code (as gotten 10 mins ago) won't work.

When compiled as 32bit, ran1() uses x87 FPU instructions, but when
compiled as 64bit, ran1() uses SSE instructions.

The 32bit code keeps its intermediate values on the x87 register stack,
causing rounding to occur at 80 bits worth of precision, which is
different to the SSE code (which appears to be rounding at 64 bits, but
frankly it's late and SSE instructions look far too similar for their own
good)


Basically, avoid any form of floating point calculations at all,
especially if you are expecting something deterministic. You (like 99
out of every 100 programmers, myself included) do not know how to use
them correctly.

~Andrew

And in addition, using an identical binary, the chances are very good
that you would get a different stream of random numbers on Intel vs AMD
hardware, and you would get different random numbers from running the
set of instructions under different operating systems on identical
hardware. C itself does not provide you with an ability to set the
x87/SSE general control registers.

~Andrew
 

JohnF

Keith Thompson said:
If CGI imposes a requirement to write binary data to stdout,
then I'm sure there's a solution;

Oh, yeah, a well-known one that I summarized above (snipped here),
setmode(fileno(stdout),O_BINARY). But that only exists on windows,
so you need some #ifdef's to handle the non-portability.

Recall that I'd only mentioned this as a portability issue
because I had actually been tripped up by it, whereas 32-bit ints
hadn't ever been any problem for me (not since about 1989, anyway,
when I actually did have that problem, when requested to port
one of my programs from VAX to msdos pc).
But first check for existing
answers; you're unlikely to be the first person to run into this.

setmode() is indeed the existing answer everybody uses,
as far as I know about, but there's no #ifdef-less portable
solution I'm aware of.
 

JohnF

Robert Wessel said:
I don't believe there is a portable fix. The usual thing under
Windows in a C CGI script is to use _setmode() to change stdout to
binary. Bury it in your platform adaptation layer, and avoid the
#ifdefs.

Thanks, Robert. But no such layer besides the #ifdef's.
I had thought about writing my own dummy setmode() that
does nothing, compiled only on unix, so I could call it
regardless of platform. That would minimize #ifdef's.
But some windows compilers call the func setmode() and
the constant O_BINARY, whereas others call it _setmode()
and _O_BINARY. Go figure. So I have to check that, via
additional #ifdef's. Just a big annoying mess, but not
a real problem, except that it's not a standard, so I
don't know when things will change, breaking that code.
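For what it's worth, the "adaptation layer" idea can be as small as one function, so the #ifdef lives in a single place. This is only a sketch: it assumes the Microsoft-style spellings (_setmode, _fileno, _O_BINARY) whenever _WIN32 is defined, and does nothing elsewhere. The name stdout_set_binary is invented, and toolchains that use the un-prefixed names would need another branch.

```c
#include <stdio.h>

#if defined(_WIN32)
#include <fcntl.h>   /* _O_BINARY */
#include <io.h>      /* _setmode */
#endif

/* Put stdout into binary mode where that distinction exists.
   Returns 0 on success, -1 on failure. */
int stdout_set_binary(void)
{
#if defined(_WIN32)
    /* Assumed Microsoft-style names; some compilers spell these
       setmode() and O_BINARY instead. */
    return (_setmode(_fileno(stdout), _O_BINARY) == -1) ? -1 : 0;
#else
    /* Text and binary streams behave the same on Unix-like systems. */
    return 0;
#endif
}
```

A CGI program would call stdout_set_binary() once, before writing any image bytes, and the rest of the code stays free of platform checks.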
 

JohnF

James Kuyper said:
[...] several "intended-to-be-portable"
fixes that I've tried, in particular freopen("CON","wb",stdout)
and stdout=fdopen(STDOUT_FILENO,"wb"), don't work or don't work
portably, for one reason or another (tales of woe elided:)

For freopen(), "It is implementation-defined which changes of mode are
permitted (if any), and under what circumstances.", so you can't count
on that to work.

fdopen() is a POSIX function; I've no idea whether the function with
that name that you're trying to use on a mswin system is supposed to
conform fully to POSIX specifications for that function. More
importantly, stdout is only required to be an expression of the type
"pointer to FILE"; it's not required to be the name of a pointer
variable that you can assign to. For instance, an implementation of
<stdio.h> could have:

extern FILE __std_streams[];
#define stdout (&__std_streams[0])
#define stdin (&__std_streams[1])
#define stderr (&__std_streams[2])

You could get around that problem, at least, by assigning the value
returned by fdopen() in your own pointer, rather than trying to store it
in stdout.

Thanks for the above info and suggestion, James.
I'll play around with it a little more to see if something
more portable than setmode() works, at least on the free
djgpp and mingw compilers.
 

JohnF

Andrew Cooper said:
And in addition, using an identical binary, the chances are very good
that you would get a different stream of random numbers on Intel vs AMD
hardware, and you would get different random numbers from running the
set of instructions under different operating systems on identical
hardware. C itself does not provide you with an ability to set the
x87/SSE general control registers. ~Andrew

I believe your "as gotten 10 mins ago" code is current.
But if the following remark seems wrong, maybe download again.
I agree that the float result called "temp" in ran1() won't
be portable. But it's not used anywhere, any more. Instead,
there's now that "static long iran;" near the top of the module
that's the only result actually used now. And that's a
completely integer calculation.
If you look real, real carefully, you'll see you can
invoke it as fm -r 0 -etc, which will revert to the original
rng usage. That's there just so stuff which was previously
encrypted can still be decrypted. And then, yeah, in that
case you better not try to decrypt with a 64-bit executable
if you encrypted with a 32-bit one. I guess that's what
all that gpl stuff about no "warranty of merchantability"
is all about:)
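For readers who haven't looked at the source: an all-integer update of this kind can be sketched with Schrage's method, which, as I recall, is how Numerical Recipes' ran1() advances its state (constants 16807, 2147483647, 127773, 2836). This is an illustration and not the code in fm.zip. minstd_wide() is a made-up cross-check that assumes long long is available.

```c
#define IA 16807L
#define IM 2147483647L
#define IQ 127773L      /* IM / IA */
#define IR 2836L        /* IM % IA */

/* Compute (IA * seed) % IM without overflowing 32 bits (Schrage's method).
   seed must be in 1 .. IM-1; the result is in the same interval. */
long minstd_schrage(long seed)
{
    long k = seed / IQ;
    seed = IA * (seed - k * IQ) - IR * k;
    if (seed < 0)
        seed += IM;
    return seed;
}

/* The same step computed directly in 64-bit arithmetic, for comparison. */
long minstd_wide(long seed)
{
    return (long)((16807LL * seed) % 2147483647LL);
}
```

Since only integer operations are involved, the sequence should not depend on whether the build uses x87 or SSE for floating point.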
 
