change of random state when pyc created??

Discussion in 'Python' started by Alan Isaac, May 5, 2007.

  1. Alan Isaac

    Alan Isaac Guest

    This may seem very strange, but it is true.
    If I delete a .pyc file, my program executes with a different state!

    In a single directory I have
    module1 and module2.

    module1 imports random and MyClass from module2.
    module2 does not import random.

    module1 sets a seed like this::

    if __name__ == "__main__":
        random.seed(314)
        main()

    I execute module1.py from the (Windows) shell.
    I get a result, let's call it result1.
    I execute it again. I get another result, say result2.
    Running it again and again, I get result2.

    Now I delete module2.pyc.
    I execute module1.py from the shell.
    I get result1.
    I execute it again; I get result2.
    From then on I get result2,
    unless I delete module2.pyc again,
    in which case I once again get result1.

    Can someone explain this to me?

    Thank you,
    Alan Isaac
     
    Alan Isaac, May 5, 2007
    #1

  2. Alan Isaac

    Dustan Guest

    On May 4, 10:48 pm, "Alan Isaac" <> wrote:
    > This may seem very strange, but it is true.
    > If I delete a .pyc file, my program executes with a different state!
    >
    > In a single directory I have
    > module1 and module2.
    >
    > module1 imports random and MyClass from module2.
    > module2 does not import random.
    >
    > module1 sets a seed like this::
    >
    > if __name__ == "__main__":
    > random.seed(314)
    > main()
    >
    > I execute module1.py from the (Windows) shell.
    > I get a result, let's call it result1.
    > I execute it again. I get another result, say result2.
    > Running it again and again, I get result2.
    >
    > Now I delete module2.pyc.
    > I execute module1.py from the shell.
    > I get result1.
    > I execute it again; I get result2.
    > From then on I get result2,
    > unless I delete module.pyc again,
    > in which case I once again get result1.
    >
    > Can someone explain this to me?
    >
    > Thank you,
    > Alan Isaac


    I can't imagine why that would be, and I was unable to reproduce that
    behavior, using Microsoft Windows XP and Python 2.5:

    <module1.py>
    import module2
    import random

    def main():
        for i in range(10):
            print module2.aRandom()

    if __name__ == '__main__':
        random.seed(314)
        main()
    </module1.py>

    <module2.py>
    import random
    print "module2 imported"

    def aRandom():
        return random.randrange(1000000)
    </module2.py>


    C:\Documents and Settings\DUSTAN\Desktop\apackage>module1.py
    module2 imported
    196431
    111465
    2638
    628136
    234231
    207699
    546775
    449804
    633844
    179171

    C:\Documents and Settings\DUSTAN\Desktop\apackage>module1.py
    module2 imported
    196431
    111465
    2638
    628136
    234231
    207699
    546775
    449804
    633844
    179171

    C:\Documents and Settings\DUSTAN\Desktop\apackage>module1.py
    module2 imported
    196431
    111465
    2638
    628136
    234231
    207699
    546775
    449804
    633844
    179171

    C:\Documents and Settings\DUSTAN\Desktop\apackage>module1.py
    module2 imported
    196431
    111465
    2638
    628136
    234231
    207699
    546775
    449804
    633844
    179171

    I deleted module2.pyc right before that last call.
     
    Dustan, May 5, 2007
    #2

  3. Alan Isaac

    Alan Isaac Guest

    I have documented this behavior
    on two completely different systems
    (Win 2000 and Win XP SP2), using Python 2.5.1.

    It involves two modules,
    as described before.
    If it should not happen, there is a bug.
    I am looking for potential explanation,
    since I realize that finding bugs is unlikely.

    Alan Isaac
     
    Alan Isaac, May 6, 2007
    #3
  4. Alan Isaac

    John Machin Guest

    On May 6, 9:00 am, "Alan Isaac" <> wrote:
    > I have documented this behavior
    > on two completely different systems
    > (Win 2000 and Win XP SP2), using Python 2.5.1.


    You can't say that you have "documented" the behaviour when you
    haven't published files that exhibit the alleged behaviour.
     
    John Machin, May 6, 2007
    #4
  5. Alan Isaac

    Alan Isaac Guest

    "John Machin" <> wrote in message
    news:...
    > You can't say that you have "documented" the behaviour when you
    > haven't published files that exhibit the alleged behaviour.


    Fine. I have "observed" this behavior.
    The files are not appropriate for posting.
    I do not yet have a "minimum" case.
    But surely I am not the first to notice this!
    Alan Isaac
    PS I'll send you the files off list.
     
    Alan Isaac, May 6, 2007
    #5
  6. Alan Isaac

    John Machin Guest

    On May 5, 1:48 pm, "Alan Isaac" <> wrote:
    > This may seem very strange, but it is true.
    > If I delete a .pyc file, my program executes with a different state!
    >
    > In a single directory I have
    > module1 and module2.
    >
    > module1 imports random and MyClass from module2.


    That's rather ambiguous. Do you mean
    (a) module1 imports random and (MyClass from module2)
    or
    (b) module1 imports (random and MyClass) from module2

    > module2 does not import random.


    This statement would *appear* to rule out option (b) but appearances
    can be deceptive :)

    It's a bit of a worry that you call the first file "module1" and not
    "the_script". Does module2 import module1, directly or indirectly?

    >
    > module1 sets a seed like this::
    >
    > if __name__ == "__main__":
    > random.seed(314)
    > main()
    >
    > I execute module1.py from the (Windows) shell.
    > I get a result, let's call it result1.
    > I execute it again. I get another result, say result2.
    > Running it again and again, I get result2.


    Stop right there. Never mind what happens when you delete module2.pyc.
    Should you not expect to get the same result each time? Is that not
    the point of setting a constant seed each time you run the script?
    ====>>> Problem 1.

    >
    > Now I delete module2.pyc.
    > I execute module1.py from the shell.
    > I get result1.
    > I execute it again; I get result2.
    > From then on I get result2,
    > unless I delete module.pyc again,
    > in which case I once again get result1.
    >
    > Can someone explain this to me?
    >
    > Thank you,
    > Alan Isaac


    Compiling module2 is causing code to be executed that probably
    shouldn't be executed. ===>>> Problem 2.

    With all due respect to your powers of description :) no, it can't be
    explained properly, without seeing the contents of the source files. I
    strongly suggest that if you continue to experience Problem1 and/or
    Problem 2, you cut your two files down to the bare minima and post
    them here.

    Meanwhile, my deja-vu detector is kicking in ...

    uh-huh (1), from 25 April:
    ===
    %%%%% test2.py %%%%%%%%%%%%%
    from random import seed
    seed(314)
    class Trivial:
        pass
    ===
    Is module2 (still) doing that?
    Is module1 importing itself (directly or indirectly)?
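
    One way to avoid the situation asked about here, where importing a module re-seeds the shared generator as a side effect, is to give the module its own random.Random instance. What follows is only a sketch in Python 3 syntax; make_rng and global_draw_after are hypothetical helpers made up for the illustration, not anything from the files under discussion, and the re-seeding "at import time" is simulated with a plain function call:

```python
import random


def make_rng(seed):
    """Return a private generator; the module-level generator is untouched."""
    return random.Random(seed)


def global_draw_after(action):
    """Seed the shared generator, run `action`, then draw one number from it."""
    random.seed(314)
    action()
    return random.random()


# Reference value: nothing happens between seeding and drawing.
baseline = global_draw_after(lambda: None)

# Drawing from a private generator does not advance the shared one...
undisturbed = global_draw_after(lambda: make_rng(1).random())

# ...whereas re-seeding the shared generator (as a module-level
# seed() call would do when the module is imported) changes what follows.
disturbed = global_draw_after(lambda: random.seed(1))

print(undisturbed == baseline)
print(disturbed == baseline)
```

    The design choice is simply that a module which keeps its randomness in its own object cannot change what the importing script sees after its own random.seed() call.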

    uh-huh (2), the long thread about relative imports allegedly being
    broken ...

    It appears to me that you need to divorce the two concepts "module"
    and "script" in your mind.

    Modules when executed should produce only exportables: classes,
    functions, NAMED_CONSTANTS, etc. It is OK to do things like process
    the easier-to-create
    _ds = """\
    foo 1
    bar 42
    zot 666"""
    into the easier-to-use
    USEFUL_DICT = {'foo': 1, 'bar': 42, 'zot': 666}
    but not to change global state.
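
    The conversion described above could look roughly like this. It is a sketch in Python 3 syntax; the helper name _parse_table is an assumption made for the illustration, and only _ds and USEFUL_DICT come from the text:

```python
# Private, easy-to-write source data (taken from the example above).
_ds = """\
foo 1
bar 42
zot 666"""


def _parse_table(text):
    """Turn 'name value' lines into a dict mapping name -> int."""
    table = {}
    for line in text.splitlines():
        name, value = line.split()
        table[name] = int(value)
    return table


# Built once when the module is imported. Nothing outside this
# module is modified, so importing it has no side effects.
USEFUL_DICT = _parse_table(_ds)
print(USEFUL_DICT)
```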

    Scripts which use functions etc from a module or package should be
    independent of the module/package such that they don't need anything
    more complicated than simple importing of the module/package. The
    notion of inspecting the script's path to derive the module/package
    path and then stuffing that into sys.path is mind boggling. Are
    module1/script1 and module2 parts of a package?

    Here's a suggestion for how you should structure scripts:

    def main():
        # All productive code is inside a function to take advantage
        # of access to locals being faster than access to globals
        import mymodule
        mymodule.do_something()

    if __name__ == "__main__":
        main()
    else:
        raise Exception("Attempt to import script containing nothing importable")

    and your modules should *start* with:

    if __name__ == "__main__":
        raise Exception("Attempt to execute hopefully-pure module as a script")

    HTH,
    John
     
    John Machin, May 6, 2007
    #6
  7. Alan Isaac

    Alan Isaac Guest

    "John Machin" <> wrote in message
    news:...
    > (a) module1 imports random and (MyClass from module2)


    Right.

    > It's a bit of a worry that you call the first file "module1" and not
    > "the_script". Does module2 import module1, directly or indirectly?


    No.
    I call a module any file meant to be imported by others.
    Many of my modules include a "main" function,
    which allows the module to be executed as a script.
    I do not think this is unusual, even as terminology.


    > Should you not expect to get the same result each time? Is that not
    > the point of setting a constant seed each time you run the script?


    Yes. That is the problem.
    If I delete module2.pyc,
    I do not get the same result.

    > With all due respect to your powers of description :) no, it can't be
    > explained properly, without seeing the contents of the source files.


    I sent them to you.
    What behavior did you see?
    > from random import seed
    > seed(314)
    > class Trivial:
    > pass
    > ===
    > Is module2 ... doing that?
    > Is module1 importing itself (directly or indirectly)?


    No.

    Separate issue
    ==============

    > Here's a suggestion for how you should structure scripts:
    >
    > def main():
    > # All productive code is inside a function to take advantage
    > # of access to locals being faster than access to globals
    > import mymodule
    > mymodule.do_something()
    > if __name__ == "__main__":
    > main()
    > else:
    > raise Exception("Attempt to import script containing nothing
    > importable")
    >
    > and your modules should *start* with:
    > if __name__ == "__main__":
    > raise Exception("Attempt to execute hopefully-pure module as a
    > script")


    I'm not going to call this a bad practice, since it has clear virtues.
    I will say that it does not seem to be a common practice, although that
    may be my lack of exposure to others' code. And it still does not
    address the common need of playing with a "package in progress"
    or a "package under consideration" without installing it.

    Cheers,
    Alan Isaac
     
    Alan Isaac, May 6, 2007
    #7
  8. Alan Isaac

    Dustan Guest

    On May 5, 6:30 pm, "Alan Isaac" <> wrote:
    > "John Machin" <> wrote in message
    >
    > news:...
    >
    > > You can't say that you have "documented" the behaviour when you
    > > haven't published files that exhibit the alleged behaviour.

    >
    > Fine. I have "observed" this behavior.
    > The files are not appropriate for posting.
    > I do not yet have a "minimum" case.
    > But surely I am not the first to notice this!
    > Alan Isaac
    > PS I'll send you the files off list.


    I got the files and tested them, and indeed got different results
    depending on whether or not there was a pyc file. I haven't looked at
    the source files in great detail yet, but I will. I would certainly
    agree that there's a bug going on here; we just need to narrow down
    the problem (ie come up with a "minimum" case).
     
    Dustan, May 6, 2007
    #8
  9. On Sun, 06 May 2007 00:20:04 +0000, Alan Isaac wrote:

    >> Should you not expect to get the same result each time? Is that not
    >> the point of setting a constant seed each time you run the script?

    >
    > Yes. That is the problem.
    > If I delete module2.pyc,
    > I do not get the same result.



    I think you have missed what John Machin is pointing out. According to
    your original description, you get different results even if you DON'T
    delete module2.pyc.

    According to your original post, you get the _same_ behaviour the first
    time you run the script, regardless of the pyc file being deleted or not.
    You wrote:

     
    Steven D'Aprano, May 6, 2007
    #9
  10. Alan Isaac

    Dustan Guest

    On May 6, 1:00 am, Steven D'Aprano
    <> wrote:
    > On Sun, 06 May 2007 00:20:04 +0000, Alan Isaac wrote:
    > >> Should you not expect to get the same result each time? Is that not
    > >> the point of setting a constant seed each time you run the script?

    >
    > > Yes. That is the problem.
    > > If I delete module2.pyc,
    > > I do not get the same result.

    >
    > I think you have missed what John Machin is pointing out. According to
    > your original description, you get different results even if you DON'T
    > delete module2.pyc.
    >
    > According to your original post, you get the _same_ behaviour the first
    > time you run the script, regardless of the pyc file being deleted or not.
    >
    > You wrote:
    >
    >
     
    Dustan, May 6, 2007
    #10
  11. Alan Isaac

    John Machin Guest

    On May 6, 9:41 pm, Dustan <> wrote:
    > On May 6, 1:00 am, Steven D'Aprano
    >
    >
    >
    > <> wrote:
    > > On Sun, 06 May 2007 00:20:04 +0000, Alan Isaac wrote:
    > > >> Should you not expect to get the same result each time? Is that not
    > > >> the point of setting a constant seed each time you run the script?

    >
    > > > Yes. That is the problem.
    > > > If I delete module2.pyc,
    > > > I do not get the same result.

    >
    > > I think you have missed what John Machin is pointing out. According to
    > > your original description, you get different results even if you DON'T
    > > delete module2.pyc.

    >
    > > According to your original post, you get the _same_ behaviour the first
    > > time you run the script, regardless of the pyc file being deleted or not.

    >
    > > You wrote:

    >
    > >
     
    John Machin, May 6, 2007
    #11
  12. Alan Isaac

    Alan Isaac Guest

    "Steven D'Aprano" <> wrote in message
    news:p...
    > If you want to send me the modules, I will have a look at them as well.
    > Many eyes make for shallow bugs...


    Dustan and John Machin have confirmed the
    apparent bug, and I have sent you the files.
    Explanation welcome!!

    Cheers,
    Alan Isaac
     
    Alan Isaac, May 8, 2007
    #12
  13. On Tue, 08 May 2007 02:12:27 +0000, Alan Isaac wrote:

    > "Steven D'Aprano" <> wrote in
    > message
    > news:p...
    >> If you want to send me the modules, I will have a look at them as well.
    >> Many eyes make for shallow bugs...

    >
    > Dustan and John Machin have confirmed the apparent bug, and I have sent
    > you the files. Explanation welcome!!


    My testing suggests the bug is *not* to do with pyc files at all. I'm
    getting different results when running the files, even when the directory
    is read-only (and therefore no pyc files can be created).

    My results suggest that setting the seed to the same value does NOT give
    identical results, *even though* the random number generator is giving
    the same results.

    So I think we can discount the issue being anything to do with either
    the .pyc files or the random number generator.
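
    The claim that the generator itself is reproducible can be checked in isolation from the rest of the program. A minimal sketch in Python 3 syntax follows; the function name draws is made up for the illustration, and the seed 314 is the one used earlier in the thread:

```python
import random


def draws(seed, count=5):
    """Return `count` numbers from a fresh generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.randrange(1000000) for _ in range(count)]


first = draws(314)
second = draws(314)

# Same seed, same sequence: any difference between runs of the
# full program must therefore come from somewhere else.
print(first == second)
```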


    --
    Steven.
     
    Steven D'Aprano, May 8, 2007
    #13
  14. Alan Isaac

    Alan Isaac Guest

    "Steven D'Aprano" <> wrote in message
    news:p...
    > My testing suggests the bug is *not* to do with pyc files at all. I'm
    > getting different results when running the files, even when the directory
    > is read-only (and therefore no pyc files can be created).
    >
    > My results suggest that setting the seed to the same value does NOT give
    > identical results, *even though* the random number generator is giving
    > the same results.
    >
    > So I think we can discount the issue being anything to do with either
    > the .pyc files or the random number generator.



    I do not know how Python handles your use of a readonly directory.
    What I have seen is:

    - when a test1.pyc file is present, I always get the
    same outcome (result1)
    - when a test1.pyc file is NOT present, I always get
    the same outcome (result2)
    - the two outcomes are different (result1 != result2)

    Do you see something different than this if you run the
    test as I suggested? If not, how can it not involve the
    .pyc file (in some sense)?

    Cheers,
    Alan Isaac
     
    Alan Isaac, May 8, 2007
    #14
  15. On Tue, 08 May 2007 14:59:27 -0300, Alan Isaac <> wrote:

    > What I have seen is:
    >
    > - when a test1.pyc file is present, I always get the
    > same outcome (result1)
    > - when a test1.pyc file is NOT present, I always get
    > the same outcome (result2)
    > - the two outcomes are different (result1 != result2)


    I've logged all Random calls (it appears to be only one shuffle call,
    after the initial seed) and in both cases they get the same numbers. So
    the program always starts with the same "shuffled" values.

    Perhaps there is a tiny discrepancy in the marshal representation of some
    floating point values. When there is no .pyc, Python parses the literal
    from source; when a .pyc is found, Python loads the value from there; they
    could be slightly different.
    I'll investigate further... tomorrow.
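
    A quick way to probe this hypothesis is to push a few floats through marshal and compare them with the originals. This is only a sketch in Python 3 syntax, with sample values picked arbitrarily; a current interpreter is not Python 2.5, so an exact round trip here says nothing certain about the .pyc format of that release:

```python
import marshal

# Arbitrary sample values chosen for this illustration.
samples = [0.1, 1.0 / 3.0, 2.5e-17, 314.0, 1e308]

for value in samples:
    restored = marshal.loads(marshal.dumps(value))
    # repr() is printed so a difference in the last digits would be visible.
    print(repr(value), repr(restored), value == restored)

# True only if every value survived the round trip unchanged.
all_exact = all(marshal.loads(marshal.dumps(v)) == v for v in samples)
print(all_exact)
```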

    --
    Gabriel Genellina
     
    Gabriel Genellina, May 9, 2007
    #15
  16. Alan Isaac

    Peter Otten Guest

    Alan Isaac wrote:

    > This may seem very strange, but it is true.
    > If I delete a .pyc file, my program executes with a different state!


    > Can someone explain this to me?


    There is nothing wrong with the random module -- you get the same numbers on
    every run. When there is no pyc-file Python uses some RAM to create it and
    therefore your GridPlayer instances are located in different memory
    locations and get different hash values. This in turn affects the order in
    which they occur when you iterate over the GridPlayer.players_played set.

    Here is a minimal example:

    import test # sic

    class T:
        def __init__(self, name):
            self.name = name
        def __repr__(self):
            return "T(name=%r)" % self.name

    if __name__ == "__main__":
        print set(T(i) for i in range(4))

    $ python2.5 test.py
    set([T(name=2), T(name=1), T(name=0), T(name=3)])
    $ python2.5 test.py
    set([T(name=3), T(name=1), T(name=0), T(name=2)])
    $ python2.5 test.py
    set([T(name=3), T(name=1), T(name=0), T(name=2)])
    $ rm test.pyc
    $ python2.5 test.py
    set([T(name=2), T(name=1), T(name=0), T(name=3)])
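
    The same mechanism can be shown without relying on memory addresses by making the hash explicit. This is a sketch in Python 3 syntax; the class Tagged and its fixed_hash parameter are invented for the illustration, and the exact order in which a set iterates is an interpreter implementation detail, so no particular output is promised:

```python
class Tagged:
    """An object whose hash is chosen by the caller, not derived from its address."""

    def __init__(self, name, fixed_hash):
        self.name = name
        self.fixed_hash = fixed_hash

    def __hash__(self):
        return self.fixed_hash

    def __eq__(self, other):
        return isinstance(other, Tagged) and self.name == other.name

    def __repr__(self):
        return "Tagged(name=%r)" % self.name


def iteration_order(hashes):
    """Build a set of three objects with the given hashes and list their names."""
    items = set(Tagged(name, h) for name, h in zip("abc", hashes))
    return [item.name for item in items]


# Same names, same insertion order, different hash values.
order_one = iteration_order([1, 2, 3])
order_two = iteration_order([3, 2, 1])
print(order_one)
print(order_two)
```

    On CPython the two printed orders will typically differ even though the contents are the same, which is the effect the default address-based hash produces when objects land at different memory locations.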

    Peter
     
    Peter Otten, May 9, 2007
    #16
  17. Alan Isaac

    Alan Isaac Guest

    "Peter Otten" <> wrote in message
    news:f1rt61$kfg$03$-online.com...
    > Alan Isaac wrote:
    > There is nothing wrong with the random module -- you get the same numbers
    > on every run. When there is no pyc-file Python uses some RAM to create it and
    > therefore your GridPlayer instances are located in different memory
    > locations and get different hash values. This in turn affects the order in
    > which they occur when you iterate over the GridPlayer.players_played set.


    Thanks!!
    This also explains Steven's results.

    If I sort the set before iterating over it,
    the "anomaly" disappears.
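
    The fix described here, sorting before iterating, might be sketched as follows. This is Python 3 syntax and not the actual program: Player is a stand-in for the GridPlayer class mentioned above, and play_order and the name attribute are assumptions made for the illustration:

```python
import random


class Player:
    """Stand-in for GridPlayer; only a name is assumed."""

    def __init__(self, name):
        self.name = name


def play_order(players, seed):
    """Shuffle the players reproducibly, independent of set iteration order."""
    rng = random.Random(seed)
    # Sort on a stable attribute first, so the shuffle always starts
    # from the same sequence however the set happens to iterate.
    ordered = sorted(players, key=lambda p: p.name)
    rng.shuffle(ordered)
    return [p.name for p in ordered]


# Two separately built sets: the objects live at different addresses,
# so their default hashes (and possibly their iteration orders) differ.
run_one = play_order({Player(n) for n in "abcdef"}, seed=314)
run_two = play_order({Player(n) for n in "fedcba"}, seed=314)
print(run_one == run_two)
```

    The sort key has to be something that does not change between runs (a name, an index); sorting on id() or on the default hash would bring the original problem back.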

    This means that currently the use of sets
    (and, I assume, dictionaries) as iterators
    compromises replicability. Is that a fair
    statement?

    For me (and apparently for a few others)
    this was a very subtle problem. Is there
    a warning anywhere in the docs? Should
    there be?

    Thanks again!!

    Alan Isaac
     
    Alan Isaac, May 9, 2007
    #17
  18. Alan Isaac wrote:

    >
    > "Peter Otten" <> wrote in message
    > news:f1rt61$kfg$03$-online.com...
    >> Alan Isaac wrote:
    >> There is nothing wrong with the random module -- you get the same numbers
    >> on every run. When there is no pyc-file Python uses some RAM to create it
    >> and therefore your GridPlayer instances are located in different memory
    >> locations and get different hash values. This in turn affects the order
    >> in which they occur when you iterate over the GridPlayer.players_played
    >> set.

    >
    > Thanks!!
    > This also explains Steven's results.
    >
    > If I sort the set before iterating over it,
    > the "anomaly" disappears.
    >
    > This means that currently the use of sets
    > (and, I assume, dictionaries) as iterators
    > compromises replicability. Is that a fair
    > statement?


    Yes.

    > For me (and apparently for a few others)
    > this was a very subtle problem. Is there
    > a warning anywhere in the docs? Should
    > there be?


    Not really, but that depends on what you know about the concept of sets and
    maps as collections of course.

    The contract for sets and dicts doesn't imply any order whatsoever. Which is
    essentially the reason why

    set(xrange(10))[0]

    doesn't exist, and why there have been quite a few calls for an ordered
    dictionary as part of the standard library.
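
    In Python 3 syntax (range instead of xrange), the point about indexing can be checked directly. A small sketch:

```python
numbers = set(range(10))

try:
    numbers[0]
except TypeError as exc:
    # Sets have no positions, so indexing is rejected outright.
    outcome = "TypeError: %s" % exc
else:
    outcome = "indexing unexpectedly worked"

print(outcome)

# If a definite element is needed, impose an order explicitly.
smallest = sorted(numbers)[0]
print(smallest)
```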

    Diez
     
    Diez B. Roggisch, May 9, 2007
    #18
  19. Alan Isaac

    Alan G Isaac Guest

    Diez B. Roggisch wrote:
    > Not really, but that depends on what you know about the concept of sets and
    > maps as collections of course.
    >
    > The contract for sets and dicts doesn't imply any order whatsoever. Which is
    > essentially the reason why
    >
    > set(xrange(10))[0]
    >
    > doesn't exist, and quite a few times cries for an ordered dictionary as part
    > of the standard libraries was made.



    It seems to me that you are missing the point,
    but maybe I am missing your point.

    The question of whether a set or dict guarantees
    some order seems quite different from the question
    of whether rerunning an **unchanged program** yields the
    **unchanged results**. The latter question is the question
    of replicability.
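
    The replicability question can be made concrete on later interpreters. Since Python 3.3, string hashes are randomized per process by default, and the PYTHONHASHSEED environment variable pins them. The sketch below (Python 3.7 or later; SNIPPET and run_with_hash_seed are names made up for the illustration) reruns one unchanged program under different hash seeds:

```python
import os
import subprocess
import sys

# A tiny program whose output depends on set iteration order.
SNIPPET = "print(list(set(['north', 'south', 'east', 'west'])))"


def run_with_hash_seed(seed):
    """Run SNIPPET in a fresh interpreter with PYTHONHASHSEED set to `seed`."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Same program, same hash seed: the order is repeated.
same_a = run_with_hash_seed(0)
same_b = run_with_hash_seed(0)
print(same_a == same_b)

# Same program, other hash seeds: the order may or may not change.
for seed in (1, 2, 3):
    print(seed, run_with_hash_seed(seed))
```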

    Again I point out that some sophisticated users
    (among which I am not numbering myself) did not
    see into the source of this "anomaly". This
    suggests that an explicit warning is warranted.

    Cheers,
    Alan Isaac

    PS I know ordered dicts are under discussion;
    what about ordered sets?
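
    For readers on much later interpreters than the 2.5.1 discussed here: from Python 3.7 onward the language guarantees that dicts preserve insertion order, so a dict can serve as a simple ordered set. A sketch, with ordered_unique being a hypothetical helper name:

```python
def ordered_unique(items):
    """Return the distinct items in first-seen order.

    Relies on dicts preserving insertion order, which the language
    guarantees from Python 3.7 onward.
    """
    return list(dict.fromkeys(items))


players = ["carol", "alice", "bob", "alice", "carol"]
unique_players = ordered_unique(players)
print(unique_players)
```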
     
    Alan G Isaac, May 9, 2007
    #19
  20. Alan Isaac

    Robert Kern Guest

    Alan G Isaac wrote:
    > Diez B. Roggisch wrote:
    >> Not really, but that depends on what you know about the concept of sets and
    >> maps as collections of course.
    >>
    >> The contract for sets and dicts doesn't imply any order whatsoever. Which is
    >> essentially the reason why
    >>
    >> set(xrange(10))[0]
    >>
    >> doesn't exist, and quite a few times cries for an ordered dictionary as part
    >> of the standard libraries was made.

    >
    > It seems to me that you are missing the point,
    > but maybe I am missing your point.
    >
    > The question of whether a set or dict guarantees
    > some order seems quite different from the question
    > of whether rerunning an **unchanged program** yields the
    > **unchanged results**. The latter question is the question
    > of replicability.
    >
    > Again I point out that some sophisticated users
    > (among which I am not numbering myself) did not
    > see into the source of this "anomaly". This
    > suggests that an explicit warning is warranted.


    http://docs.python.org/lib/typesmapping.html
    """
    Keys and values are listed in an arbitrary order which is non-random, varies
    across Python implementations, and depends on the dictionary's history of
    insertions and deletions.
    """

    The sets documentation is a bit less explicit, though.

    http://docs.python.org/lib/types-set.html
    """
    Like other collections, sets support x in set, len(set), and for x in set. Being
    an unordered collection, sets do not record element position or order of insertion.
    """

    --
    Robert Kern

    "I have come to believe that the whole world is an enigma, a harmless enigma
    that is made terrible by our own mad attempt to interpret it as though it had
    an underlying truth."
    -- Umberto Eco
     
    Robert Kern, May 9, 2007
    #20
