cPickle - sharing pickled objects between scripts and imports

Discussion in 'Python' started by Rotwang, Jun 23, 2012.

  1. Rotwang

    Rotwang Guest

    Hi all, I have a module that saves and loads data using cPickle, and
    I've encountered a problem. Sometimes I want to import the module and
    use it in the interactive Python interpreter, whereas sometimes I want
    to run it as a script. But objects that have been pickled by running the
    module as a script can't be correctly unpickled by the imported module
    and vice-versa, since how they get pickled depends on whether the
    module's __name__ is '__main__' or 'mymodule' (say). I've tried to get
    around this by adding the following to the module, before any calls to
    cPickle.load:

    if __name__ == '__main__':
        import __main__
        def load(f):
            p = cPickle.Unpickler(f)
            def fg(m, c):
                if m == 'mymodule':
                    return getattr(__main__, c)
                else:
                    m = __import__(m, fromlist = [c])
                    return getattr(m, c)
            p.find_global = fg
            return p.load()
    else:
        def load(f):
            p = cPickle.Unpickler(f)
            def fg(m, c):
                if m == '__main__':
                    return globals()[c]
                else:
                    m = __import__(m, fromlist = [c])
                    return getattr(m, c)
            p.find_global = fg
            return p.load()
    cPickle.load = load
    del load


    It seems to work as far as I can tell, but I'll be grateful if anyone
    knows of any circumstances where it would fail, or can suggest something
    less hacky. Also, do cPickle.Pickler instances have some attribute
    corresponding to find_global that lets one determine how instances get
    pickled? I couldn't find anything about this in the docs.
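
For readers on Python 3, where cPickle no longer exists, roughly the same remapping can be sketched by subclassing pickle.Unpickler and overriding find_class, which plays the role find_global plays above. This is only a sketch under assumptions: the names RemappingUnpickler, module_aliases, demo_old_name and demo_new_name are made up for illustration, and the throwaway module stands in for mymodule.

```python
import io
import pickle
import sys
import types


class RemappingUnpickler(pickle.Unpickler):
    """Unpickler that redirects lookups from one module name to another."""

    def __init__(self, file, module_aliases):
        super().__init__(file)
        # Hypothetical mapping, e.g. {"__main__": "mymodule"}
        self._module_aliases = module_aliases

    def find_class(self, module, name):
        # Swap the recorded module name for its alias, if one is configured.
        module = self._module_aliases.get(module, module)
        return super().find_class(module, name)


def load(file, module_aliases):
    return RemappingUnpickler(file, module_aliases).load()


# Demo: a throwaway module stands in for mymodule.
demo = types.ModuleType("demo_new_name")
exec("class Thing(object):\n    pass\n", demo.__dict__)
sys.modules["demo_new_name"] = demo

# Protocol 0 stores module names as newline-terminated text, so the name
# can be rewritten to simulate a pickle written under another module name.
data = pickle.dumps(demo.Thing(), protocol=0)
data = data.replace(b"demo_new_name", b"demo_old_name")
restored = load(io.BytesIO(data), {"demo_old_name": "demo_new_name"})
```

Unlike the snippet above, this leaves the module-level load function of the pickle module alone, so other code that unpickles data is not affected.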


    --
    Hate music? Then you'll hate this:

    http://tinyurl.com/psymix
    Rotwang, Jun 23, 2012
    #1

  2. Peter Otten

    Peter Otten Guest

    Rotwang wrote:

    > Hi all, I have a module that saves and loads data using cPickle, and
    > I've encountered a problem. Sometimes I want to import the module and
    > use it in the interactive Python interpreter, whereas sometimes I want
    > to run it as a script. But objects that have been pickled by running the
    > module as a script can't be correctly unpickled by the imported module
    > and vice-versa, since how they get pickled depends on whether the
    > module's __name__ is '__main__' or 'mymodule' (say). I've tried to get
    > around this by adding the following to the module, before any calls to
    > cPickle.load:
    >
    > if __name__ == '__main__':
    >     import __main__
    >     def load(f):
    >         p = cPickle.Unpickler(f)
    >         def fg(m, c):
    >             if m == 'mymodule':
    >                 return getattr(__main__, c)
    >             else:
    >                 m = __import__(m, fromlist = [c])
    >                 return getattr(m, c)
    >         p.find_global = fg
    >         return p.load()
    > else:
    >     def load(f):
    >         p = cPickle.Unpickler(f)
    >         def fg(m, c):
    >             if m == '__main__':
    >                 return globals()[c]
    >             else:
    >                 m = __import__(m, fromlist = [c])
    >                 return getattr(m, c)
    >         p.find_global = fg
    >         return p.load()
    > cPickle.load = load
    > del load
    >
    >
    > It seems to work as far as I can tell, but I'll be grateful if anyone
    > knows of any circumstances where it would fail, or can suggest something
    > less hacky. Also, do cPickle.Pickler instances have some attribute
    > corresponding to find_global that lets one determine how instances get
    > pickled? I couldn't find anything about this in the docs.


    if __name__ == "__main__":
        from mymodule import *

    But I think it would be cleaner to move the classes you want to pickle into
    another module and import that either from your main script or the
    interpreter. That may also spare you some fun with unexpected isinstance()
    results.
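
A minimal sketch of why the separate-module layout helps, under stated assumptions: the module name models_demo and the class Record are hypothetical, and the module is built in memory here only so the example is self-contained. In a real project it would be an ordinary file on the import path.

```python
import pickle
import sys
import types

# Stand-in for a separate file that holds the picklable classes.
models = types.ModuleType("models_demo")
exec(
    "class Record(object):\n"
    "    def __init__(self, value):\n"
    "        self.value = value\n",
    models.__dict__,
)
sys.modules["models_demo"] = models

record = models.Record(42)
data = pickle.dumps(record)

# The pickle names the class by the module it was defined in, which is
# the same whether the caller is a script or the interactive interpreter.
restored = pickle.loads(data)
recorded_module = type(restored).__module__
```

Because the class never lives in __main__, the module name written into the pickle does not depend on how the program was started.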
    Peter Otten, Jun 23, 2012
    #2

  3. Dave Angel

    Dave Angel Guest

    On 06/23/2012 12:13 PM, Peter Otten wrote:
    > Rotwang wrote:
    >
    >> Hi all, I have a module that saves and loads data using cPickle, and
    >> I've encountered a problem. Sometimes I want to import the module and
    >> use it in the interactive Python interpreter, whereas sometimes I want
    >> to run it as a script. But objects that have been pickled by running the
    >> module as a script can't be correctly unpickled by the imported module
    >> and vice-versa, since how they get pickled depends on whether the
    >> module's __name__ is '__main__' or 'mymodule' (say). I've tried to get
    >> around this by adding the following to the module, before any calls to
    >> cPickle.load:
    >>
    >> if __name__ == '__main__':
    >>     import __main__
    >>     def load(f):
    >>         p = cPickle.Unpickler(f)
    >>         def fg(m, c):
    >>             if m == 'mymodule':
    >>                 return getattr(__main__, c)
    >>             else:
    >>                 m = __import__(m, fromlist = [c])
    >>                 return getattr(m, c)
    >>         p.find_global = fg
    >>         return p.load()
    >> else:
    >>     def load(f):
    >>         p = cPickle.Unpickler(f)
    >>         def fg(m, c):
    >>             if m == '__main__':
    >>                 return globals()[c]
    >>             else:
    >>                 m = __import__(m, fromlist = [c])
    >>                 return getattr(m, c)
    >>         p.find_global = fg
    >>         return p.load()
    >> cPickle.load = load
    >> del load
    >>
    >>
    >> It seems to work as far as I can tell, but I'll be grateful if anyone
    >> knows of any circumstances where it would fail, or can suggest something
    >> less hacky. Also, do cPickle.Pickler instances have some attribute
    >> corresponding to find_global that lets one determine how instances get
    >> pickled? I couldn't find anything about this in the docs.

    > if __name__ == "__main__":
    >     from mymodule import *
    >
    > But I think it would be cleaner to move the classes you want to pickle into
    > another module and import that either from your main script or the
    > interpreter. That may also spare you some fun with unexpected isinstance()
    > results.
    >
    >




    I would second the choice to just move the code to a separately loaded
    module, and let your script simply consist of an import and a call into
    that module.

    It can be very dangerous to have the same module imported two different
    ways (as __main__ and as mymodule), so I'd avoid anything that came
    close to that notion.

    Your original problem is probably that you have classes with two leading
    underscores, which causes the names to be mangled with the module name.
    You could simply remove one of the underscores for all such names, and
    see if the pickle problem goes away.




    --

    DaveA
    Dave Angel, Jun 23, 2012
    #3
  4. Rotwang

    Rotwang Guest

    On 23/06/2012 17:13, Peter Otten wrote:
    > Rotwang wrote:
    >
    >> Hi all, I have a module that saves and loads data using cPickle, and
    >> I've encountered a problem. Sometimes I want to import the module and
    >> use it in the interactive Python interpreter, whereas sometimes I want
    >> to run it as a script. But objects that have been pickled by running the
    >> module as a script can't be correctly unpickled by the imported module
    >> and vice-versa, since how they get pickled depends on whether the
    >> module's __name__ is '__main__' or 'mymodule' (say). I've tried to get
    >> around this by adding the following to the module, before any calls to
    >> cPickle.load:
    >>
    >> if __name__ == '__main__':
    >>     import __main__
    >>     def load(f):
    >>         p = cPickle.Unpickler(f)
    >>         def fg(m, c):
    >>             if m == 'mymodule':
    >>                 return getattr(__main__, c)
    >>             else:
    >>                 m = __import__(m, fromlist = [c])
    >>                 return getattr(m, c)
    >>         p.find_global = fg
    >>         return p.load()
    >> else:
    >>     def load(f):
    >>         p = cPickle.Unpickler(f)
    >>         def fg(m, c):
    >>             if m == '__main__':
    >>                 return globals()[c]
    >>             else:
    >>                 m = __import__(m, fromlist = [c])
    >>                 return getattr(m, c)
    >>         p.find_global = fg
    >>         return p.load()
    >> cPickle.load = load
    >> del load
    >>
    >>
    >> It seems to work as far as I can tell, but I'll be grateful if anyone
    >> knows of any circumstances where it would fail, or can suggest something
    >> less hacky. Also, do cPickle.Pickler instances have some attribute
    >> corresponding to find_global that lets one determine how instances get
    >> pickled? I couldn't find anything about this in the docs.

    >
    > if __name__ == "__main__":
    >     from mymodule import *
    >
    > But I think it would be cleaner to move the classes you want to pickle into
    > another module and import that either from your main script or the
    > interpreter. That may also spare you some fun with unexpected isinstance()
    > results.


    Thanks.

    --
    Hate music? Then you'll hate this:

    http://tinyurl.com/psymix
    Rotwang, Jun 23, 2012
    #4
  5. Rotwang

    Rotwang Guest

    On 23/06/2012 18:31, Dave Angel wrote:
    > On 06/23/2012 12:13 PM, Peter Otten wrote:
    >> Rotwang wrote:
    >>
    >>> Hi all, I have a module that saves and loads data using cPickle, and
    >>> I've encountered a problem. Sometimes I want to import the module and
    >>> use it in the interactive Python interpreter, whereas sometimes I want
    >>> to run it as a script. But objects that have been pickled by running the
    >>> module as a script can't be correctly unpickled by the imported module
    >>> and vice-versa, since how they get pickled depends on whether the
    >>> module's __name__ is '__main__' or 'mymodule' (say). I've tried to get
    >>> around this by adding the following to the module, before any calls to
    >>> cPickle.load:
    >>>
    >>> if __name__ == '__main__':
    >>>     import __main__
    >>>     def load(f):
    >>>         p = cPickle.Unpickler(f)
    >>>         def fg(m, c):
    >>>             if m == 'mymodule':
    >>>                 return getattr(__main__, c)
    >>>             else:
    >>>                 m = __import__(m, fromlist = [c])
    >>>                 return getattr(m, c)
    >>>         p.find_global = fg
    >>>         return p.load()
    >>> else:
    >>>     def load(f):
    >>>         p = cPickle.Unpickler(f)
    >>>         def fg(m, c):
    >>>             if m == '__main__':
    >>>                 return globals()[c]
    >>>             else:
    >>>                 m = __import__(m, fromlist = [c])
    >>>                 return getattr(m, c)
    >>>         p.find_global = fg
    >>>         return p.load()
    >>> cPickle.load = load
    >>> del load
    >>>
    >>>
    >>> It seems to work as far as I can tell, but I'll be grateful if anyone
    >>> knows of any circumstances where it would fail, or can suggest something
    >>> less hacky. Also, do cPickle.Pickler instances have some attribute
    >>> corresponding to find_global that lets one determine how instances get
    >>> pickled? I couldn't find anything about this in the docs.

    >> if __name__ == "__main__":
    >>     from mymodule import *
    >>
    >> But I think it would be cleaner to move the classes you want to pickle into
    >> another module and import that either from your main script or the
    >> interpreter. That may also spare you some fun with unexpected isinstance()
    >> results.
    >>
    >>

    >
    >
    >
    > I would second the choice to just move the code to a separately loaded
    > module, and let your script simply consist of an import and a call into
    > that module.
    >
    > It can be very dangerous to have the same module imported two different
    > ways (as __main__ and as mymodule), so I'd avoid anything that came
    > close to that notion.


    OK, thanks.


    > Your original problem is probably that you have classes with two leading
    > underscores, which causes the names to be mangled with the module name.
    > You could simply remove one of the underscores for all such names, and
    > see if the pickle problem goes away.


    No, I don't have any such classes. The problem is that if the object was
    pickled by the module run as a script and then unpickled by the imported
    module, the unpickler looks in __main__ rather than mymodule for the
    object's class, and doesn't find it. Conversely if the object was
    pickled by the imported module and then unpickled by the module run as a
    script then the unpickler reloads the module and makes objects
    referenced by the original object into instances of
    mymodule.oneofmyclasses, whereas (for reasons unknown to me) the object
    itself is an instance of __main__.anotheroneofmyclasses. This means that
    any method of anotheroneofmyclasses that calls isinstance(attribute,
    oneofmyclasses) doesn't work the way it should.
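
The isinstance() mismatch described above can be reproduced in miniature. This is a sketch, not the original module: the source string, the class name Item and the module name script_copy (standing in for __main__) are all assumptions made for illustration.

```python
import types

# One piece of source code, executed twice under two module names,
# much as happens when a file is both run as a script and imported.
SOURCE = "class Item(object):\n    pass\n"


def load_source_as(module_name):
    module = types.ModuleType(module_name)
    exec(SOURCE, module.__dict__)
    return module


run_as_script = load_source_as("script_copy")  # stands in for __main__
imported = load_source_as("mymodule")

item = run_as_script.Item()

# The two Item classes come from identical source but are distinct
# objects, so an instance of one is not an instance of the other.
is_same_class = run_as_script.Item is imported.Item
passes_isinstance = isinstance(item, imported.Item)
```

Both flags come out False, which is why a method that checks isinstance(attribute, oneofmyclasses) can fail once objects from the two copies are mixed.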

    --
    Hate music? Then you'll hate this:

    http://tinyurl.com/psymix
    Rotwang, Jun 23, 2012
    #5
  6. Steven D'Aprano

    Steven D'Aprano Guest

    On Sat, 23 Jun 2012 19:14:43 +0100, Rotwang wrote:

    > The problem is that if the object was
    > pickled by the module run as a script and then unpickled by the imported
    > module, the unpickler looks in __main__ rather than mymodule for the
    > object's class, and doesn't find it.


    Possibly the solution is as simple as aliasing your module and __main__.
    Untested:

    # When running as a script
    import __main__
    sys['mymodule'] = __main__


    # When running interactively
    import mymodule
    __main__ = mymodule


    or some variation thereof.

    Note that a full solution to this problem actually requires you to deal
    with three cases:

    1) interactive interpreter, __main__ normally would be the interpreter
    global scope

    2) running as a script, __main__ is your script

    3) imported into another module which is running as a script, __main__
    would be that module.

    In the last case, monkey-patching __main__ may very well break that
    script.
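
A sketch of the script-side aliasing, assuming the untested snippet above meant sys.modules rather than sys itself. The helper name alias_main_as and the module name mymodule_alias_demo are hypothetical, and this covers only case 2; the caveats about the other two cases still apply.

```python
import sys


def alias_main_as(module_name):
    """Register the running script (__main__) under module_name as well."""
    # setdefault leaves an already-imported module_name untouched, so an
    # existing import is never silently replaced.
    return sys.modules.setdefault(module_name, sys.modules["__main__"])


aliased = alias_main_as("mymodule_alias_demo")
```

After this call, a lookup of the aliased name through sys.modules returns the __main__ module instead of triggering a second import of the same file.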


    --
    Steven
    Steven D'Aprano, Jun 24, 2012
    #6
  7. Rotwang

    Rotwang Guest

    On 24/06/2012 00:17, Steven D'Aprano wrote:
    > On Sat, 23 Jun 2012 19:14:43 +0100, Rotwang wrote:
    >
    >> The problem is that if the object was
    >> pickled by the module run as a script and then unpickled by the imported
    >> module, the unpickler looks in __main__ rather than mymodule for the
    >> object's class, and doesn't find it.

    >
    > Possibly the solution is as simple as aliasing your module and __main__.
    > Untested:
    >
    > # When running as a script
    > import __main__
    > sys['mymodule'] = __main__


    ??? What is "sys" here?


    > # When running interactively
    > import mymodule
    > __main__ = mymodule
    >
    >
    > or some variation thereof.
    >
    > Note that a full solution to this problem actually requires you to deal
    > with three cases:
    >
    > 1) interactive interpreter, __main__ normally would be the interpreter
    > global scope
    >
    > 2) running as a script, __main__ is your script
    >
    > 3) imported into another module which is running as a script, __main__
    > would be that module.


    I had not thought of that.


    > In the last case, monkey-patching __main__ may very well break that
    > script.


    My original solution will also cause problems in this case. Thanks.

    --
    Hate music? Then you'll hate this:

    http://tinyurl.com/psymix
    Rotwang, Jun 25, 2012
    #7
