32/64 bit cc differences


JohnF

I'm getting a tiny-cum-microscopic, but nevertheless fatal,
difference in the behavior of the exact same C code compiled
on one 64-bit linux machine...
o dreamhost.com
uname -a Linux mothman 2.6.32.8-grsec-2.1.14-modsign-xeon-64 #2 SMP
Sat Mar 13 00:42:43 PST 2010 x86_64 GNU/Linux
cc --version cc (Debian 4.3.2-1.1) 4.3.2
versus two other 32-bit linuxes...
o panix.com
uname -a NetBSD panix3.panix.com 6.1.2 NetBSD 6.1.2 (PANIX-USER) #0:
Wed Oct 30 05:25:05 EDT 2013 i386
cc --version cc (NetBSD nb2 20110806) 4.5.3
o my own local box running slackware 14.0 32-bit
cc --version cc (GCC) 4.7.1

The code is an en/de-cryption utility forkosh.com/fm.zip,
which is way too many lines to ask anybody to look at.
But my own debugging is failing to identify where the
difference creeps in, and googling failed to suggest
where to look more deeply.

Firstly, both executables "work", i.e., if you encrypt and
then decrypt, you get back the exact same original file.
But if you encrypt using the 32-bit executable, scp the
encrypted file to the 64-bit machine (md5's match) and then
decrypt, the result is exactly the same length and almost
identical except for about one byte in a thousand that doesn't
match. Vice versa (encrypt on 64-bit, decrypt on 32) gives
the same behavior. (By the way, the 32-vs-64-bit encrypted files
are also ~one-in-a-thousand different, so both stages exhibit
this small problem.)
And I tried cc -m32 on the 64-bit machine, but there's
some stubs32.h that it's missing. So instead, I cc -static
on my own box, and that executable does work on the 64-bit
machine when run against files encrypted on either 32-bit box.
So the problem doesn't seem to be the 64-bit os, but rather
the cc executable, though I'm not 100% sure.

What I'm really finding weird is that ~one-byte-in-a-thousand
diff. The program uses several streams of random numbers
(generated by its own code) to xor bytes, permute bits, etc.
The slightest problem would garble up the data beyond belief.
Moreover, it's got a verbose flag, and I can see the streams
are identical. And everywhere else I've thought to look
seems okay, too, as far as I can tell.
So I'm asking about weird-ish 32/64-bit cc differences
that might give rise to this kind of behavior. Presumably,
there's some subtle bug that I'm failing to see in the code,
and which the output isn't helping me to zero in on. Thanks,
 

glen herrmannsfeldt

JohnF said:
I'm getting a tiny-cum-microscopic, but nevertheless fatal,
difference in the behavior of the exact same C code compiled
on one 64-bit linux machine...
o dreamhost.com
(snip)

Firstly, both executables "work", i.e., if you encrypt and
then decrypt, you get back the exact same original file.
But if you encrypt using the 32-bit executable, scp the
encrypted file to the 64-bit machine (md5's match) and then
decrypt, the result is exactly the same length and almost
identical except for about one byte in a thousand that doesn't
match. Vice versa (encrypt on 64-bit, decrypt on 32) gives
the same behavior. (By the way, the 32-vs-64-bit encrypted files
are also ~one-in-a-thousand different, so both stages exhibit
this small problem.)

Some years ago, I wrote a JBIG2 coder. I could debug it
by sending the output, in the form of a PDF file, into
Adobe Reader.

As part of debugging, first I encoded a file of all zero bits.
(Well, the data part. It was a TIFF file with a bitmap image,
so all the image bits were zero.)

In many cases, as you note (and I snipped) once it goes wrong
it turns into random bits, but with the simpler file it tended
to go less wrong. It turns out that, at least in JBIG2, there
are enough things that don't happen very often that it can work
for a while before failing.

Then I made the data progressively more complicated.
(All ones, alternating 1/0 bytes, bits, various other simple
patterns.)

In your case, I would encrypt a file of all zeros, then
decrypt it while checking the output. As soon as a non-zero
byte comes out, stop. (Maybe put an if() inside to check
for it.)
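
Something like this inside the decrypt loop, right where each output
byte is produced (the names outbyte and offset are placeholders, not
from your code; needs <stdio.h> and <stdlib.h>):

    if (outbyte != 0) {   /* first corrupted byte of the all-zeros file */
        fprintf(stderr, "first nonzero byte at offset %ld\n", (long)offset);
        abort();          /* stop here so the state is still inspectable */
    }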

Does the 64 bit system use 64 bit int? That could be enough
to confuse things. Or 64 bit long, where the program expects
a 32 bit long (or unsigned long)?
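
A throwaway program, compiled with the same cc on each box, settles
that in a minute:

    #include <stdio.h>
    int main(void)
    {
        printf("int=%lu long=%lu long long=%lu ptr=%lu\n",
               (unsigned long)sizeof(int), (unsigned long)sizeof(long),
               (unsigned long)sizeof(long long), (unsigned long)sizeof(void *));
        return 0;
    }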

-- glen
 

BartC

JohnF said:
I'm getting a tiny-cum-microscopic, but nevertheless fatal,

Fatal? I thought you said they worked.
difference in the behavior of the exact same C code compiled
on one 64-bit linux machine...
What I'm really finding weird is that ~one-byte-in-a-thousand
diff. The program uses several streams of random numbers
(generated by its own code) to xor bytes, permute bits, etc.
The slightest problem would garble up the data beyond belief.

Start with a known input file (eg. of all zeros as suggested), and of the
minimum size to show the problem (just over 1000 bytes for example).

Create a test to pick up the first difference, and make sure this difference
is repeatable.
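
cmp will find it for you (file names here are just examples):

    cmp -l original.dat decrypted.dat | head

With -l it lists the offset and the two differing byte values for every
mismatch, so you get the position of the first bad byte and a feel for
the spacing of the rest.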

Then try varying things: choose one stream of random numbers at a time for
example (disable the rest, or make them return zero), until files are the
same, then you can zero in on the bit that causes the problem. (Although the
difference might have moved to later in the file, or the problem is with the
interaction between particular bitpatterns from two streams.)

You might try also using, temporarily, specific widths for all integers,
instead of relying on whatever width 'int' happens to map to. However, if
int is 64-bits on one machine, then I think all intermediate results will be
widened to 64-bits, whatever you do.)
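
Something along these lines, using C99 <stdint.h> (the function is only
an illustration, not anything from your program):

    #include <stdint.h>

    static uint32_t mix32(uint32_t x)
    {
        /* do the arithmetic in an explicit width and reduce mod 2^32,
           so ILP32 and LP64 builds compute the identical value */
        return (uint32_t)((uint64_t)x * 2654435761u + 12345u);
    }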

Print lots of diagnostics too, and compare them between the programs (ie.
all intermediate results). (I don't know how much output is likely for 1K of
output, or just enable output as you approach the point where you expect a
mismatch.) At the point where the output is different, some of those results
must be different too.

(You say you do this with 'verbose' output, but there must be a difference
at the point where the output changes, unless the problem is something to do
with file output itself: are you saying that you actually print each byte
that is written to the output, that this output is the same on both
machines, but when comparing the written files, that 1000th byte is
different?)
 

Siri Cruz

JohnF said:
I'm getting a tiny-cum-microscopic, but nevertheless fatal,
difference in the behavior of the exact same C code compiled
on one 64-bit linux machine...

Problems I found when I tried compiling from 32 bit macosx to 64 bit:

(1) varargs calls where integer and pointer constants like 0, 1, 2 were
still compiled into 32 bit values, but va_arg(argv, long int) now expected
a 64 bit argument (see the sketch below the list).

(2) Uninitialised variables that had been zeroed were now random values.

(3) Linking with differently compiled objects.

(4) Exposing bad assumptions about alignment and byte order from casts and
unions.
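
A sketch of the trap in (1), with a made-up helper name:

    #include <stdarg.h>
    #include <stdio.h>

    static long last_long(int n, ...)      /* expects n long arguments */
    {
        va_list ap; long v = 0;
        va_start(ap, n);
        while (n-- > 0)
            v = va_arg(ap, long);          /* reads a long-sized slot  */
        va_end(ap);
        return v;
    }

    int main(void)
    {
        /* 2 is an int: harmless when long==int (32 bit), undefined
           behaviour when long is 64 bit */
        printf("%ld\n", last_long(2, 1L, 2));
        return 0;
    }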

Also going from ppc to i386:

(5) Procedure frame allocation changed to block frame.

Also 32 bit and 64 bit were different macosx version (10.6 vs 10.8) with
differences besides word size, like randomised stack addresses that broke the
garbage collector.
 

JohnF

glen herrmannsfeldt said:
Some years ago, I wrote a JBIG2 coder. I could debug it
by sending the output, in the form of a PDF file, into
Adobe Reader.

As part of debugging, first I encoded a file of all zero bits.
(Well, the data part. It was a TIFF file with a bitmap image,
so all the image bits were zero.)

In many cases, as you note (and I snipped) once it goes wrong
it turns into random bits, but with the simpler file it tended
to go less wrong. It turns out that, at least in JBIG2, there
are enough things that don't happen very often that it can work
for a while before failing.

Then I made the data progressively more complicated.
(All ones, alternating 1/0 bytes, bits, various other simple
patterns.)

In your case, I would encrypt a file of all zeros, then
decrypt it while checking the output. As soon as a non-zero
byte comes out, stop. (Maybe put an if() inside to check
for it.)

Thanks, Glen. It's your "as soon as" that's a problem in my case.
I'd already tried a binary zero file, and various other test cases.
For encrypting, the program fread()'s the input file in randomly-sized
blocks, processing each block separately. It first adds a random number
of noise bytes (which can be set to 0 and which I tried),
then randomly xor or xnor's each data byte with a random byte from
your key, and finally randomly permutes the bits of the entire block
(typically a many-thousand-bit permutation). Decrypting reverses
the procedure.
The "as soon as" problem is because blocks are big garbled
messes until every step is performed on the entire block.
You can't get byte after byte of decrypted stuff. But you can
turn off some things, which I did try, but which failed to identify
the source of the problem (as far as I could tell). You can also
turn off everything by giving it no key, in which case it just
copies infile->outfile. And, yeah, that works on both 32- and 64-bit
compiled executables (that would've been too easy:).
Of course, I can modify the program to turn off stuff more
"slowly", and I'm pretty sure that'll eventually zero in on the
problem. But it's a big pain to do that properly, e.g., the random
number streams have to be kept in sync reversibly, so that
encrypted stuff can be decrypted.
So I'm trying to figure out smart guesses what to try first,
before wasting lots of time trying things that I should know
(if I were smarter) can't be problems. And I posted the question
hoping someone had tripped over similar problems. It's not
so much "try a file with all binary zeroes" that I need to hear;
it's more like "these are the operations/constructions/etc that
are likely to go unnoticed on 32-bit and then behave differently
on 64-bit". See below...
Does the 64 bit system use 64 bit int? That could be enough
to confuse things. Or 64 bit long, where the program expects
a 32 bit long (or unsigned long)? -- glen

Yeah, that's the kind of thing that could "go unnoticed and
then behave differently". And there are indeed a couple of
long long's (and %lld debugging printf's), in the program's rng,
but I've checked all that very, very carefully.
What I had also been thinking was that my xor/xnor's were
getting fouled up by strangely different endian-ness behavior,
i.e., which byte of i contains c in unsigned char c; int i = (int)c;
for 32- versus 64-bit, and how does that affect ^,~ operations?
But I checked that with a little offline test program,
and think (maybe 85% confidence) it's okay. And I'd think
it should mess up way more often if not okay.
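The little test was along these lines (just the shape of it, not
the fm code itself):

    #include <stdio.h>
    int main(void)
    {
        unsigned char c = 0xA5;
        int i = (int)c;           /* value-preserving on any int width */
        printf("i=%d  i^0xFF=%d  (unsigned char)~i=%d\n",
               i, i ^ 0xFF, (unsigned char)~i);
        return 0;
    }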

The code's supposed to be quite portable, and was originally
checked on intel linux, netbsd, freebsd, ms windows, and on
VAX and alpha OpenVMS. It always generated identical encrypted
files...until now.
 

Ben Bacarisse

JohnF said:
The code is an en/de-cryption utility forkosh.com/fm.zip,
which is way too many lines to ask anybody to look at.

Well, it's only about 1100 lines in one file, and about 740 that are not
comment lines. Unfortunately it has inconsistent (and odd) spacing and
indentation. It made my head hurt reading it!

There are signs that it's not written by someone who knows C well --
some odd idioms, unnecessary casts, not using prototype declarations,
using signed types for bit operations... These things mean it's going
to be harder work than one might hope. You'll need to decide if the
pay-off is worth it.
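
By signed types for bit operations I mean things like this
(illustration only, not lines from your file):

    int b  = -1;          /* all bits set on two's complement          */
    int hi = b >> 1;      /* implementation-defined result             */
    char c = 0x80;        /* plain char may be signed: typically -128  */
    int i  = c;           /* sign-extends, so i is not 0x80            */

    unsigned int ub  = (unsigned int)b;
    unsigned int uhi = ub >> 1;   /* well defined everywhere */

None of these is necessarily your bug, but they make the code harder to
reason about when the widths change underneath it.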

What I'm really finding weird is that ~one-byte-in-a-thousand
diff. The program uses several streams of random numbers
(generated by its own code) to xor bytes, permute bits, etc.
The slightest problem would garble up the data beyond belief.

Not if the error is at a late stage in the decryption -- say some sort
of final permutation. The program works in blocks, so if it has a block
size of 1000 bytes, you might be seeing an entirely predictable error
that occurs every time.

<snip>
 

JohnF

BartC said:
Fatal? I thought you said they worked.
They're separately consistent -- each correctly decrypts what
it encrypts, but not what the other encrypts.
Start with a known input file (eg. of all zeros as suggested), and of the
minimum size to show the problem (just over 1000 bytes for example).

Create a test to pick up the first difference, and make sure this difference
is repeatable.

Sure, already done.
Then try varying things: choose one stream of random numbers at a time for
example (disable the rest, or make them return zero), until files are the
same, then you can zero in on the bit that causes the problem.

Yes, this has to eventually work. But, as pointed out to Glen,
I already tried turning off the things that were easy to turn off,
without finding the problem. What I'm looking for now, before messing
with lots of code, is suggestions about esoteric things that might
behave differently on 32- vs 64-bit. That'll suggest what code
to mess with first. Let me cut-and-paste from followup to Glen
one such thing I'd tried before posting,
"What I had also been thinking was that my xor/xnor's were
getting fouled up by strangely different endian-ness behavior,
i.e., which byte of i contains c in unsigned char c; int i = (int)c;
for 32- versus 64-bit, and how does that affect ^,~ operations?
But I checked that with a little offline test program,
and think (maybe 85% confidence) it's okay. And I'd think
it should mess up way more often if not okay."
So that's the kind of thing I'm guessing must be wrong,
but haven't been able to think of anything further along
those lines which would suggest how to best proceed.
(Although the
difference might have moved to later in the file, or the problem is with the
interaction between particular bitpatterns from two streams.)

You might try also using, temporarily, specific widths for all integers,
instead of relying on whatever width 'int' happens to map to. However, if
int is 64-bits on one machine, then I think all intermediate results will be
widened to 64-bits, whatever you do.)

Yeah, I didn't mention that I'd also tried -m32-bit on the 64-bit box,
but that compiler (--version is cc (Debian 4.3.2-1.1) 4.3.2) said
cc1: error: unrecognized command line option "-m32-bit"
and man cc isn't on that machine.
Print lots of diagnostics too, and compare them between the programs (ie.
all intermediate results). (I don't know how much output is likely for 1K of
output, or just enable output as you approach the point where you expect a
mismatch.) At the point where the output is different, some of those results
must be different too.

Yes, something must be different. But, as in followup to Glen, "approach"
isn't quite appropriate since an entire block must be completely
processed before any of the bytes within it make sense.
And, yeah, your implied guess that there's too many internal/intermediate
calculations to output them all is correct. As usual, it's got to
be a kind of "binary search" where and what to printf.
(You say you do this with 'verbose' output, but there must be a difference
at the point where the output changes, unless the problem is something to do
with file output itself: are you saying that you actually print each byte
that is written to the output, that this output is the same on both
machines, but when comparing the written files, that 1000th byte is
different?)

Yeah, exactly, I prepared a suite of test input files (including
several all binary zeroes files of different lengths), and it's roughly
(but not exactly) every 1000th byte of output that's different. Weirdly,
a test input file with 2000 0's versus one with 3000 0's has the first
output mess-up at different places. And in both cases, the program's
first random block size was 6159 bytes (because I used the same key),
so it read both files all at once. That is, the problem isn't at
block boundaries (which would've been way too easy:).
 

JohnF

Richard Damon said:
My guess would be that somewhere in the code "Undefined" or
"Implementation" defined behavior is happening, which the two compilers
are implementing differently.

Yeah, that's almost certainly what's happening. I'm calling
it a "bug" (in my program) because the code was meant to be
portable, and was intentionally written to work on any architecture,
relying only on standard C semantics. But I apparently didn't write
it intentionally enough, and missed something.
It might be an integer overflow
But not that, I don't think.
or something similar.
Yeah, that's the one.
 

JohnF

Siri Cruz said:
Problems I found when I tried compiling from 32 bit macosx to 64 bit:

Thanks, Siri, these are exactly the kinds of things I was
hoping to hear about. Somewhere or other I've made that
kind of mistake.
(1) varargs calls where integer and pointer constants like 0, 1, 2 were
still compiled into 32 bit values, but va_arg(argv, long int) now
expected a 64 bit argument.
No variable argument lists, so not that mistake.
(2) Uninitialised variables that had been zeroed were now random values.
I don't think I leave any uninitialized variables hanging,
but could be. In that case it's not even a 32- vs 64-bit problem,
per se, just that the 64-bit compiler isn't zeroing memory like the others.
(3) Linking with differently compiled objects.
Just one module containing all funcs.
(4) Exposing bad assumptions about alignment and byte order from casts
and unions.
No unions, but some kind of alignment issue might be possible.
Also going from ppc to i386:

(5) Procedure frame allocation changed to block frame.

Also 32 bit and 64 bit were different macosx version (10.6 vs 10.8) with
differences besides word size, like randomised stack addresses that broke
the garbage collector.
Those not problems, but I will double-check your preceding suggestions.
Thanks again,
 

JohnF

Ben Bacarisse said:
Well, it's only about 1100 lines in one file, and about 740 that are not
comment lines. Unfortunately it has inconsistent (and odd) spacing and
indentation. It made my head hurt reading it!

Thanks, Ben, and sorry about that. I pretty much figured >100 lines
was too much to ask anybody to read.
There are signs that it's not written by someone who knows C well --
some odd idioms, unnecessary casts, not using prototype declarations,
using signed types for bit operations... These things mean it's going
to be harder work than one might hope. You'll need to decide if the
pay-off is worth it.

From the "Revision history" comment near top, note that first version
was written in 1988, ported from an even earlier program.
Your "signed types for bit operations" is indeed the one problem
I'd actually worried about (not that the others aren't problems,
but unlikely to be causing odd observed behavior), and which
I actually checked with small standalone test program, as noted
in preceding followup. But that particular problem is easy enough
to fix (I know exactly what you're referring to in xcryptstr),
and I'll do that (and kick myself if it fixes the problem).
Not if the error is at a late stage in the decryption -- say some sort
of final permutation.

Permutation will mess up all-or-nothing, as far as I can tell.
It's xcryptstr with the xor/xnor's that (as far as I can tell, again)
is the likely culprit along this line of thought. But I think
I checked that, and I think the error frequency would be greater
if it were messing up. But I could be twice wrong (that would
be the "kicking" part:).
The program works in blocks, so if it has a block
size of 1000 bytes, you might be seeing an entirely predictable error
that occurs every time.

As pointed out in preceding followup, I'd eliminated the possibility
of block boundary errors. But it had indeed crossed my mind as the
one obvious explanation for the observed byte error distribution.

P.S. I'm amazed you read the code so fast and in enough detail
with enough understanding to pick up the info for your remarks.
Scary. Thanks again,
 

Ben Bacarisse

JohnF said:
I don't think I leave any uninitialized variables hanging,
but could be. In that case it's not even a 32- vs 64-bit problem,
per se, just that the 64-bit compiler isn't zeroing memory like the
others.

There is a very useful program called valgrind. You will want to strew
scented petals before the feet of those who wrote it.

However, it does not report the use of any uninitialised data, so its
verdict here is negative. Of course, that was only one example run. I'd
be tempted to use it for every run until the issue is fixed.

<snip>
 

Ben Bacarisse

JohnF said:
Thanks, Ben, and sorry about that. I pretty much figured >100 lines
was too much to ask anybody to read.


From the "Revision history" comment near top, note that first version
was written in 1988, ported from an even earlier program.
Your "signed types for bit operations" is indeed the one problem
I'd actually worried about (not that the others aren't problems,
but unlikely to be causing odd observed behavior), and which
I actually checked with small standalone test program, as noted
in preceding followup. But that particular problem is easy enough
to fix (I know exactly what you're referring to in xcryptstr),
and I'll do that (and kick myself if it fixes the problem).

Actually I was thinking about the bit getting and setting macros myself.
Also, check that the shift amount is never equal to the bit width. The
fact that this is not defined (even for unsigned types) surprises some
people. One way to do this is to add an assert to them.
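
Something of roughly this shape, assuming the macros look the way I
expect (GETBIT is a made-up name):

    #include <assert.h>
    #include <limits.h>

    #define GETBIT(word, n) \
        ( assert((n) >= 0 && (n) < (int)(sizeof(unsigned long) * CHAR_BIT)), \
          ((unsigned long)(word) >> (n)) & 1UL )

The assert disappears when you compile with -DNDEBUG, and until then it
catches a shift by the full width immediately rather than letting it
produce whatever the hardware feels like.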

The other thing I'd do is to check all the casts and remove those that
are not needed. They can mask interesting and useful compiler warnings.
(You are asking for lots of warnings, I hope.)
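
With gcc that means at least something like

    cc -Wall -Wextra -Wconversion -Wshadow -O2 -o fm fm.c

(the file name is a guess). -Wconversion in particular tends to point
straight at the int/long/size_t spots that matter on a 64-bit port.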

P.S. I'm amazed you read the code so fast and in enough detail
with enough understanding to pick up the info for your remarks.
Scary. Thanks again,

I once had a job which was almost entirely porting C code from one
system to another very peculiar one. The result is that some (sadly not
all) portability problems leap out at me.
 

BartC

JohnF said:
Yeah, I didn't mention that I'd also tried -m32-bit on the 64-bit box,
but that compiler (--version is cc (Debian 4.3.2-1.1) 4.3.2) said
cc1: error: unrecognized command line option "-m32-bit"
and man cc isn't on that machine.

How big are int, long and long long on your machines?

I seem to remember that gcc 'long' was 32-bit under Windows, and 64-bit
under Linux. (If you expect long to be 64-bits, and ever want to run under
Windows, you might want to upgrade the type.)

(I've looked at your code; I've no idea what the problem might be, but if
you're porting from an int=32, long=64 implementation to an int=64, long=64
one, I'm surprised you haven't got bigger differences. Certainly my own code
would have a load of problems!)
 

James Kuyper

For encrypting, the program fread()'s the input file in randomly-sized
blocks, processing each block separately. It first adds a random number
of noise bytes (which can be set to 0 and which I tried),
then randomly xor or xnor's each data byte with a random byte from
your key, and finally randomly permutes the bits of the entire block
(typically a many-thousand-bit permutation). Decrypting reverses
the procedure. ....
Of course, I can modify the program to turn off stuff more
"slowly", and I'm pretty sure that'll eventually zero in on the
problem. But it's a big pain to do that properly, e.g., the random
number streams have to be kept in sync reversibly, so that
encrypted stuff can be decrypted.

The random numbers in your program make it difficult to debug, because
the behavior can be different each time it's run. For debugging
purposes, you should modify the code to use a fixed seed for your random
number generator, so that it generates exactly the same random numbers
each time it is run. Make sure that the particular seed you use is one
that will reproduce the problem!
One possible issue is that the random number generators on the two
systems might be different. You might need to write your own, just to
make sure it generates the same sequence every time, on both systems. It
doesn't have to be a very sophisticated one, so long as it does
reproducibly duplicate the problem you're seeing.
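
Even something this small will do for debugging; the fixed width
guarantees both builds produce identical streams (the constants are the
well-known Numerical Recipes LCG, nothing from the program under test):

    #include <stdint.h>

    static uint32_t dbg_state = 12345u;    /* fixed seed */

    static uint32_t dbg_rand(void)
    {
        /* wraps mod 2^32 on every platform because of the fixed width */
        dbg_state = dbg_state * 1664525u + 1013904223u;
        return dbg_state;
    }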
 

Aleksandar Kuktin

Firstly, both executables "work", i.e., if you encrypt and then decrypt,
you get back the exact same original file.
But if you encrypt using the 32-bit executable, scp the encrypted file
to the 64-bit machine (md5's match) and then decrypt, the result is
exactly the same length and almost identical except for about one byte
in a thousand that doesn't match. Vice versa (encrypt on 64-bit, decrypt
on 32) gives the same behavior. (By the way, the 32-vs-64-bit encrypted
files are also ~one-in-a-thousand different, so both stages exhibit this
small problem.)

Wait, I don't understand this.

Correct me if I'm wrong:
1. same input file;
2. output of 32-bit executable and of 64-bit executable are bit-for-bit
identical;
3. output of executable, when decrypted by same executable that outputted
it, is bit-for-bit identical to the input;
4. output of 32-bit executable, when decrypted on 64-bit produces wrong
output;
5. output of 64-bit executable, when decrypted on 32-bit produces wrong
output.

Obviously, this can not be.
 

James Kuyper

Wait, I don't understand this.

Correct me if I'm wrong:
1. same input file;
2. output of 32-bit executable and of 64-bit executable are bit-for-bit
identical;
3. output of executable, when decrypted by same executable that outputted
it, is bit-for-bit identical to the input;
4. output of 32-bit executable, when decrypted on 64-bit produces wrong
output;
5. output of 64-bit executable, when decrypted on 32-bit produces wrong
output.

Obviously, this can not be.

My understanding of what he's saying (I could be mistaken) is that
input => 32-bit executable in encrypt mode => encrypted1 => 32-bit
executable in decrypt mode => output1

input => 64-bit executable in encrypt mode => encrypted2 => 64-bit
executable in decrypt mode => output1

output1 is identical to input, for both runs, but encrypted1 differs
from encrypted2. As a result:

input => 32-bit executable in encrypt mode => encrypted1 => 64-bit
executable in decrypt mode => output2

input => 64-bit executable in encrypt mode => encrypted2 => 32-bit
executable in decrypt mode => output3

Neither output2 nor output3 is the same as output1 - but only one byte
in a thousand is different.
 

Keith Thompson

Aleksandar Kuktin said:
Wait, I don't understand this.

Correct me if I'm wrong:
1. same input file;
2. output of 32-bit executable and of 64-bit executable are bit-for-bit
identical;

I don't believe that's what he said.

On both 32-bit and 64-bit systems, encrypting a file *and then
decrypting it* yields an identical copy of the original file.
But the content of the encrypted file can vary (probably even from
one run to the next, since the program makes heavy use of random
numbers).

So a 32-bit encrypted file and a 64-bit encrypted file, from the
same input, will differ -- and decrypting a 32-bit encrypted file
using the 64-bit decryption program, or vice versa, yields output
that's *slightly* different from the original input.

JohnF, have I summarized the behavior correctly?

You say "about one byte in a thousand". Is there any pattern to
where the incorrect bytes appear in the output? For example, if
the last byte of each 1024-byte block were incorrect, that would
be interesting. If it's seemingly random and varies from one run
to the next, that would also be interesting.
 

Stephen Sprunk

The code's supposed to be quite portable, and was originally
checked on intel linux, netbsd, freebsd, ms windows, and on
VAX and alpha OpenVMS. It always generated identical encrypted
files...until now.

Are all of those systems ILP32, or is the Alpha one ILP64? (I've never
used OpenVMS.) Have you tried it on Win64, which is IL32LLP64, or just
Win32?

Linux/x86-64 is I32LP64, so if that's the first such system you've
ported to, that may indicate where the problem lies. If your code works
on ILP64 systems, type problems seem unlikely, but you could still have
int/long mismatches that wouldn't show up there--or on IL32LLP64.
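
The classic symptom (purely illustrative, not from your code) is
anything that quietly relies on unsigned long wrapping at 2^32:

    unsigned long x = 0xFFFFFFFFUL;
    x = x + 1;     /* 0 on ILP32, 0x100000000 on I32LP64 */

That assumption is invisible on any ILP32 system and only shows up once
long grows to 64 bits.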

Also, you're using three different versions of GCC; it's possible that
one of them has a bug (or "feature") that's triggered by some UB in your
code and results in slight output changes. I'd highly suggest using the
same version on all your systems, if possible, to eliminate that as a
potential source of differences.

Aside: Why roll your own encryption algorithm rather than use a proven,
off-the-shelf algorithm, e.g. AES? Depending on OpenSSL's libcrypto is
pretty standard these days; no sense reinventing the square wheel.

S
 

osmium

James Kuyper said:
The random numbers in your program make it difficult to debug, because
the behavior can be different each time it's run. For debugging
purposes, you should modify the code to use a fixed seed for your random
number generator, so that it generates exactly the same random numbers
each time it is run. Make sure that the particular seed you use is one
that will reproduce the problem!
One possible issue is that the random number generators on the two
systems might be different. You might need to write your own, just to
make sure it generates the same sequence every time, on both systems. It
doesn't have to be a very sophisticated one, so long as it does
reproducibly duplicate the problem you're seeing.

As a bystander to this thread, that sounds like a Really Really Good Idea.
 

Stephen Sprunk

You say "about one byte in a thousand". Is there any pattern to
where the incorrect bytes appear in the output? For example, if
the last byte of each 1024-byte block were incorrect, that would
be interesting. If it's seemingly random and varies from one run
to the next, that would also be interesting.

My first thought when seeing "one byte error every $blocksize" is that
there's an off-by-one error reading/modifying some buffer, but I can't
think of how porting between 32- and 64-bit variants of essentially the
same implementation would trigger such; it should always be there.

S
 
