Mixing protocols in pickles?

Erik Max Francis

Is there any prohibition against mixing different protocols within the
same pickle? I don't see anything about this in the Python Library
Reference and, after all, the pickle.dump function takes a protocol
argument each time it's called. (This is in Python 2.3.3.)

I have a pickle containing two objects: a tag string and a (large)
object containing many children. The identifying string is there so
that you can unpickle it and decide whether you really want to unpickle
the whole data object.
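
A minimal sketch of that tag-first idea, assuming an in-memory buffer in place of the real file. The names read_tag and load_if_wanted are hypothetical and are not part of the module described here; the point is only that the first pickle.load reads the tag and leaves the stream positioned at the start of the second pickle.

```python
import io
import pickle

def read_tag(stream):
    """Load only the first pickle (the tag) from the stream."""
    return pickle.load(stream)

def load_if_wanted(stream, wanted_prefix):
    """Return (tag, data); data is None if the tag does not match."""
    tag = read_tag(stream)
    if not tag.startswith(wanted_prefix):
        # Skip the expensive second load entirely.
        return tag, None
    return tag, pickle.load(stream)

# Write a tag followed by a larger object into one binary stream.
buffer = io.BytesIO()
pickle.dump("default.botec:0.1x1", buffer, 1)
pickle.dump({"children": list(range(1000))}, buffer, 1)

# Peek at the tag and decline to load the rest.
buffer.seek(0)
tag, data = load_if_wanted(buffer, "other.botec")
```

Here data comes back as None because the tag does not start with the requested prefix, so the large object is never unpickled.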

So I thought it would be clever to write the tag string with protocol 0
so it would show up in a file viewer as plain text and then write the
rest of the data with protocol 1 (or 2; it doesn't use new-style
classes, though). I open the file in binary mode and then dump the tag
string in protocol 0, then the (big) instance data in protocol 1. When
loading time comes around, they're just both loaded in the same order:

def load(filename=DEFAULT_FILENAME):
    try:
        inputFile = gzip.GzipFile(filename + COMPRESSED_EXTENSION, 'rb')
    except IOError:
        inputFile = file(filename, 'rb')
    tag = pickle.load(inputFile)
    if DEBUG:
        print >> sys.stderr, "Tag: %s" % tag
    system = pickle.load(inputFile)
    inputFile.close()
    return system

def save(system, tag, filename=DEFAULT_FILENAME, protocol=1,
         compressed=False):
    if compressed:
        outputFile = gzip.GzipFile(filename + COMPRESSED_EXTENSION,
                                   'wb')
    else:
        outputFile = file(filename, 'wb')
    pickle.dump(tag, outputFile, 0) # write the tag in text
    pickle.dump(system, outputFile, protocol)
    outputFile.close()

This works fine on Unix, but on Windows it generates the (utterly
puzzling) error:

C:\My Documents\botec-0.1x1>python -i ./botex.py
Tag: default.botec:20040110:11175:ebec37a7632cc7176ff359a3754750ec:0.1x1
Traceback (most recent call last):
  File "./botex.py", line 70, in ?
    SYSTEM = init()
  File "./botex.py", line 15, in init
    return load()
  File "C:\My Documents\botec-0.1x1\botec.py", line 1666, in load
    system = pickle.load(inputFile)
ImportError: No Module named copy_reg

... which is made extra puzzling because you can type `import copy_reg' at that
prompt and it will import fine. Googling for this error comes up with a
few scattered complaints but nothing coherent about the cause. When I
modify the dumping routine to use the same protocol (1) for both dumps,
the problem goes away and everything works fine.

So I guess there are a few questions: Why is the error so obscure and
seemingly incorrect? Is mixing multiple protocols within the same
pickle really not allowed, or is this truly a bug (or, say, a
Windows-specific problem, because protocol 0 requires that the pickle
file be opened in text mode)?
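
For anyone who wants to try this without the gzip and file handling, here is a minimal sketch of the mixed-protocol round trip. It assumes an in-memory io.BytesIO buffer in place of the real file, uses made-up sample data, and is written so that it also runs on current Python. It does not reproduce the Windows failure; it only isolates the two dumps and two loads, so if it passes on the machine that shows the error, the protocol mix by itself is presumably not the whole story.

```python
import io
import pickle

def save(system, tag, stream, protocol=1):
    # Tag in protocol 0 so it is readable as text, data in a binary protocol.
    pickle.dump(tag, stream, 0)
    pickle.dump(system, stream, protocol)

def load(stream):
    # Load the two pickles back in the order they were written.
    tag = pickle.load(stream)
    system = pickle.load(stream)
    return tag, system

# Sample data (an assumption, standing in for the real system object).
buffer = io.BytesIO()
save({"children": [1, 2, 3]}, "default.botec:0.1x1", buffer)

raw = buffer.getvalue()  # the bytes that would end up in the file
buffer.seek(0)
tag, system = load(buffer)
```

The tag text appears verbatim in raw, which is the effect wanted from protocol 0, and both objects come back unchanged from the binary stream.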
 
