Running the same script on the same data on two different machines --> different results

  • Thread starter Christopher Brewster
Christopher Brewster

I am running the same script on the same data on two different
machines (the folder is synchronised with Dropbox), and I get two
different results. All the script does is count words in different
files and perform a simple set operation on the word lists.
The laptop is a MacBook Pro (2 1/2 years old) running OS X 10.5.5 with
Python 2.5.1.
The desktop is an iMac (brand new), also running OS X 10.5.5 with
Python 2.5.1.

I have tried running the script on an Ubuntu server with Python 2.5.2,
and the results matched my laptop's output.
How can I find out the cause of this anomaly? What tests can I
perform?

Thank you,

Christopher Brewster
Aston University
 
Steve Holden

OK, as a university denizen you are presumably a smart type. Do you
*really* think this is an adequate problem description for debugging?

You might drop lucky, but more information couldn't possibly hurt. We
*try* to be mindreaders, but it would help to know whether you are
talking about string handling or floating-point computations, for example.

If the latter then it's probably because one machine is based on PowerPC
architecture and the other is a more recent Intel-architecture Mac.

regards
Steve
 
Christopher Brewster

Thanks for the suggestion, but they are both Intel machines.
There is no floating point, just simple additions.

No matter how smart you are, if you do not do this sort of thing
often, you do not know exactly what sort of information to provide or
what questions to ask.
So that is exactly my question: what are the right questions?
What information do I need to provide to try to solve this?

Christopher
 
Philip Semanchuk

No idea what Dropbox is, but it is a potential point of failure.
Ensure it is doing its job. Programmatically ensure that the source
files are exactly the same before you start your Python program.

Then try your program on different source files. If the problem shows
up on some source files and not on others, try to figure out the
pattern that relates the files.

Or take your "problem" data file and chop it in half by deleting the
lines from the first half of the file. See if the problem still
occurs. If not, try using the other half of the file instead. By using a
binary search like this, maybe you can isolate the problem data to a
very small portion, making visual detection of the problem easier.
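
The halving idea above can be sketched roughly as follows. This is only an illustration: `isolate_problem_lines` and `shows_problem` are hypothetical names, and in practice "the problem shows up" would mean running the script on both machines with the reduced file and checking whether the outputs still differ. The made-up predicate below just looks for non-ASCII characters.

```python
def isolate_problem_lines(lines, shows_problem):
    """Shrink `lines` by halving while the problem can still be reproduced.

    `shows_problem` is a hypothetical callable: it takes a list of lines and
    returns True when the discrepancy is still visible on that input.
    """
    current = list(lines)
    while len(current) > 1:
        middle = len(current) // 2
        first_half, second_half = current[:middle], current[middle:]
        if shows_problem(first_half):
            current = first_half
        elif shows_problem(second_half):
            current = second_half
        else:
            # The problem needs lines from both halves; stop shrinking here.
            break
    return current


# Example with a made-up predicate: the "problem" is any non-ASCII character.
sample = ["plain text", "more text", "caf\u00e9 au lait", "the end"]
suspect = isolate_problem_lines(
    sample, lambda chunk: any(ord(ch) > 127 for line in chunk for ch in line)
)
print(suspect)
```

If neither half reproduces the problem on its own, the loop stops and returns the smallest chunk that still did, which is then the part to inspect by eye.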

Until you get more info, this is just generic debugging and isn't
specific to Python.

Good luck
Philip
 
Steven D'Aprano

Try eliminating files and see if you can narrow the problem down to a
single file.
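
One possible way to narrow things down to a single file is to record a result per file on each machine and compare the two records. The sketch below assumes a "word" is any whitespace-separated token, which may not match the original script; `per_file_counts` and `differing_files` are names invented for this example.

```python
import os


def count_words(path):
    """Count whitespace-separated tokens in one file (an assumed definition of "word")."""
    with open(path, "rb") as handle:
        return len(handle.read().split())


def per_file_counts(folder):
    """Map each file name directly inside `folder` to its word count."""
    counts = {}
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if os.path.isfile(path):
            counts[name] = count_words(path)
    return counts


def differing_files(counts_a, counts_b):
    """Return names whose counts differ, or that appear on one machine only."""
    names = set(counts_a) | set(counts_b)
    return sorted(n for n in names if counts_a.get(n) != counts_b.get(n))
```

Run `per_file_counts` on each machine, save the output, and pass both dictionaries to `differing_files`; any name it returns is a file worth looking at more closely.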

Make sure the files really are synchronized. Try comparing their md5
checksums.
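
A minimal sketch of the checksum comparison, using only the standard `hashlib` and `os` modules. The function names and the manifest layout are assumptions made for this example. Run it on both machines against the synchronised folder and diff the two outputs.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in binary chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_manifest(folder):
    """Return sorted (relative_path, md5) pairs for every file under `folder`."""
    entries = []
    for root, _dirs, files in os.walk(folder):
        for name in files:
            path = os.path.join(root, name)
            entries.append((os.path.relpath(path, folder), md5_of_file(path)))
    return sorted(entries)


if __name__ == "__main__":
    import sys

    # Usage: python manifest.py /path/to/folder > manifest.txt
    for relative_path, checksum in checksum_manifest(sys.argv[1]):
        print("%s  %s" % (checksum, relative_path))
```

Files are opened in binary mode so that line-ending or encoding differences show up as different checksums instead of being hidden.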

Create a batch of test files, copy them from one machine to the other,
and then confirm that the script calculates the same result.

Lastly, make sure that both machines really are using the same script!
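
To check that both machines run the same script under the same interpreter settings, something like the following could be run on each machine and the output compared. `environment_report` is a hypothetical helper; which settings actually matter for the original script is a guess, so treat the list as a starting point.

```python
import hashlib
import locale
import platform
import sys


def environment_report(script_path):
    """Collect interpreter details and a checksum of the script itself."""
    with open(script_path, "rb") as handle:
        script_md5 = hashlib.md5(handle.read()).hexdigest()
    return {
        "python_version": sys.version,
        "executable": sys.executable,
        "platform": platform.platform(),
        "default_encoding": sys.getdefaultencoding(),
        "filesystem_encoding": sys.getfilesystemencoding(),
        "preferred_encoding": locale.getpreferredencoding(False),
        "script_md5": script_md5,
    }


if __name__ == "__main__":
    # Usage: python report.py /path/to/your_script.py
    for key, value in sorted(environment_report(sys.argv[1]).items()):
        print("%s: %s" % (key, value))
```

A differing `script_md5` means the two machines are not running the same script; a differing encoding or version line points at the environment instead.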

And if you do find the cause, please let us know... I'm intrigued.
 
John Machin

1. "same data" versus "different files": are you using "different" in
the same sense as in "different machines" and "different results"? How
do you know the data is the same?

2. Either show us your script, or tell us (with a reasonable degree of
precision):
* how do you define a "word"
* what is a "word list"
* what is "a simple set operation on the word lists"
* does the script use any of: random module, current date/time,
iteration over dictionaries while updating them, etc

3. (a) Which of the two sets of results is correct? (b) What is your
basis for answering (a)?
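
To illustrate why the definition of a "word" matters, here is a small sketch comparing two plausible definitions on a made-up sentence. Neither is claimed to be what the original script does. The results are sorted before use because set iteration order is not guaranteed, so unsorted output is a poor basis for comparing two runs.

```python
import re

text = "Don't panic: the well-known answer is 42. don't!"

# Definition A: anything separated by whitespace, punctuation and case included.
words_by_whitespace = set(text.split())

# Definition B: lowercased runs of letters, digits, and underscores only.
words_by_regex = set(re.findall(r"\w+", text.lower()))

# A simple set operation on the two word lists, sorted for stable output.
only_in_a = sorted(words_by_whitespace - words_by_regex)
only_in_b = sorted(words_by_regex - words_by_whitespace)

print(len(words_by_whitespace), len(words_by_regex))
print(only_in_a)
print(only_in_b)
```

The two definitions disagree on both the number of distinct words and the result of the set difference, which is why the questions above need precise answers before anyone can say which machine is "correct".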
 