cPickle - sharing pickled objects between scripts and imports

Rotwang

Hi all, I have a module that saves and loads data using cPickle, and
I've encountered a problem. Sometimes I want to import the module and
use it in the interactive Python interpreter, whereas sometimes I want
to run it as a script. But objects that have been pickled by running the
module as a script can't be correctly unpickled by the imported module
and vice-versa, since how they get pickled depends on whether the
module's __name__ is '__main__' or 'mymodule' (say). I've tried to get
around this by adding the following to the module, before any calls to
cPickle.load:

if __name__ == '__main__':
    import __main__
    def load(f):
        p = cPickle.Unpickler(f)
        def fg(m, c):
            if m == 'mymodule':
                return getattr(__main__, c)
            else:
                m = __import__(m, fromlist = [c])
                return getattr(m, c)
        p.find_global = fg
        return p.load()
else:
    def load(f):
        p = cPickle.Unpickler(f)
        def fg(m, c):
            if m == '__main__':
                return globals()[c]
            else:
                m = __import__(m, fromlist = [c])
                return getattr(m, c)
        p.find_global = fg
        return p.load()
cPickle.load = load
del load


It seems to work as far as I can tell, but I'll be grateful if anyone
knows of any circumstances where it would fail, or can suggest something
less hacky. Also, do cPickle.Pickler instances have some attribute
corresponding to find_global that lets one determine how instances get
pickled? I couldn't find anything about this in the docs.
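
For readers on Python 3, where cPickle no longer exists, the same remapping idea can be sketched by subclassing pickle.Unpickler and overriding find_class, which is the Python 3 counterpart of find_global. This is only a sketch under assumptions: RemappingUnpickler, the aliases argument, make_module and the two throwaway modules script_side and import_side are hypothetical names invented for the demo, standing in for __main__ and mymodule.

```python
import io
import pickle
import sys
import types


class RemappingUnpickler(pickle.Unpickler):
    """Unpickler that looks classes up under an aliased module name."""

    def __init__(self, file, aliases):
        super().__init__(file)
        self.aliases = aliases  # hypothetical mapping, e.g. {"__main__": "mymodule"}

    def find_class(self, module, name):
        # Swap the recorded module name for its alias, then defer to pickle.
        return super().find_class(self.aliases.get(module, module), name)


def load(f, aliases):
    return RemappingUnpickler(f, aliases).load()


def make_module(name):
    """Build a throwaway module holding its own Point class (demo only)."""
    mod = types.ModuleType(name)
    source = (
        "class Point:\n"
        "    def __init__(self, x, y):\n"
        "        self.x, self.y = x, y\n"
    )
    exec(source, mod.__dict__)
    sys.modules[name] = mod
    return mod


script_side = make_module("script_side")  # stands in for __main__
import_side = make_module("import_side")  # stands in for mymodule

data = pickle.dumps(script_side.Point(1, 2))
obj = load(io.BytesIO(data), {"script_side": "import_side"})
print(type(obj).__module__, obj.x, obj.y)
```

Unlike assigning to cPickle.load, this leaves the library's own load function untouched, so other code in the same process is not affected by the remapping.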
 
Peter Otten

if __name__ == "__main__":
    from mymodule import *

But I think it would be cleaner to move the classes you want to pickle into
another module and import that either from your main script or the
interpreter. That may also spare you some fun with unexpected isinstance()
results.
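
The reason a separate module helps can be sketched as follows: a pickle records the name of the module in which the class was defined, so a class living in its own module is recorded under that module's name no matter which script does the pickling. In this sketch, written for Python 3's pickle, the module name shapes and the class Circle are hypothetical, and the module is built in memory only so that the example is self-contained; in practice it would be an ordinary file such as shapes.py.

```python
import pickle
import sys
import types

# Hypothetical stand-in for a separate module file, e.g. shapes.py.
shapes = types.ModuleType("shapes")
source = (
    "class Circle:\n"
    "    def __init__(self, r):\n"
    "        self.r = r\n"
)
exec(source, shapes.__dict__)
sys.modules["shapes"] = shapes

payload = pickle.dumps(shapes.Circle(2.0))

# The pickle names the defining module, not the script that did the pickling.
print(shapes.Circle.__module__)
print(b"shapes" in payload)

restored = pickle.loads(payload)
print(isinstance(restored, shapes.Circle))
```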
 
Dave Angel

I would second the choice to just move the code to a separately loaded
module, and let your script simply consist of an import and a call into
that module.

It can be very dangerous to have the same module imported two different
ways (as __main__ and as mymodule), so I'd avoid anything that came
close to that notion.

Your original problem is probably that you have classes with two leading
underscores, which causes the names to be mangled with the module name.
You could simply remove one of the underscores for all such names, and
see if the pickle problem goes away.
 
Rotwang

Peter Otten said:
if __name__ == "__main__":
    from mymodule import *

But I think it would be cleaner to move the classes you want to pickle into
another module and import that either from your main script or the
interpreter. That may also spare you some fun with unexpected isinstance()
results.

Thanks.
 
Rotwang

Dave Angel said:
I would second the choice to just move the code to a separately loaded
module, and let your script simply consist of an import and a call into
that module.

It can be very dangerous to have the same module imported two different
ways (as __main__ and as mymodule), so I'd avoid anything that came
close to that notion.

OK, thanks.

Dave Angel said:
Your original problem is probably that you have classes with two leading
underscores, which causes the names to be mangled with the module name.
You could simply remove one of the underscores for all such names, and
see if the pickle problem goes away.

No, I don't have any such classes. The problem is that if the object was
pickled by the module run as a script and then unpickled by the imported
module, the unpickler looks in __main__ rather than mymodule for the
object's class, and doesn't find it. Conversely if the object was
pickled by the imported module and then unpickled by the module run as a
script then the unpickler reloads the module and makes objects
referenced by the original object into instances of
mymodule.oneofmyclasses, whereas (for reasons unknown to me) the object
itself is an instance of __main__.anotheroneofmyclasses. This means that
any method of anotheroneofmyclasses that calls isinstance(attribute,
oneofmyclasses) doesn't work the way it should.
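
The isinstance() failure described above can be reproduced in miniature. The sketch below executes the same class source twice under two module names, which is roughly what happens when one file is both run as a script and imported. The names SOURCE, load_copy and Node are hypothetical, and the two copies are deliberately not registered in sys.modules, so the real __main__ is left alone.

```python
import types

SOURCE = "class Node:\n    pass\n"


def load_copy(name):
    """Execute the same source as a fresh module object called `name`."""
    mod = types.ModuleType(name)
    exec(SOURCE, mod.__dict__)
    return mod


as_script = load_copy("__main__")   # the copy Python runs as a script
as_import = load_copy("mymodule")   # the copy created by `import mymodule`

node = as_script.Node()

# Same source text, but two distinct class objects.
print(as_script.Node is as_import.Node)
print(isinstance(node, as_import.Node))
```

Both checks report that the classes are unrelated, which is why an object graph mixing instances from the two copies behaves inconsistently.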
 
Steven D'Aprano

Rotwang said:
The problem is that if the object was
pickled by the module run as a script and then unpickled by the imported
module, the unpickler looks in __main__ rather than mymodule for the
object's class, and doesn't find it.

Possibly the solution is as simple as aliasing your module and __main__.
Untested:

# When running as a script
import __main__
sys['mymodule'] = __main__


# When running interactively
import mymodule
__main__ = mymodule


or some variation thereof.
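
One way to read the script-side suggestion, assuming that the module registry sys.modules is what was meant, is sketched here. The helper alias_main and the placeholder name mymodule_alias are hypothetical. Two caveats apply: classes defined in the script still carry __module__ == '__main__', so pickles written by the script are unchanged and the alias only helps when loading pickles that name the module; and a plain assignment such as __main__ = mymodule only rebinds a local name without touching sys.modules.

```python
import sys


def alias_main(name):
    """Make `import <name>` return the module currently running as __main__."""
    # setdefault avoids clobbering a module already imported under that name.
    sys.modules.setdefault(name, sys.modules["__main__"])
    return sys.modules[name]


# "mymodule_alias" is a placeholder name for this demo.
aliased = alias_main("mymodule_alias")
print(aliased is sys.modules["__main__"])
```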

Note that a full solution to this problem actually requires you to deal
with three cases:

1) interactive interpreter, __main__ normally would be the interpreter
global scope

2) running as a script, __main__ is your script

3) imported into another module which is running as a script, __main__
would be that module.

In the last case, monkey-patching __main__ may very well break that
script.
 
Rotwang

Steven D'Aprano said:
Possibly the solution is as simple as aliasing your module and __main__.
Untested:

# When running as a script
import __main__
sys['mymodule'] = __main__

??? What is "sys" here?

Steven D'Aprano said:
# When running interactively
import mymodule
__main__ = mymodule


or some variation thereof.

Note that a full solution to this problem actually requires you to deal
with three cases:

1) interactive interpreter, __main__ normally would be the interpreter
global scope

2) running as a script, __main__ is your script

3) imported into another module which is running as a script, __main__
would be that module.

I had not thought of that.

Steven D'Aprano said:
In the last case, monkey-patching __main__ may very well break that
script.

My original solution will also cause problems in this case. Thanks.
 
