A bug in cPickle?


Victor Kryukov

Hello list,

The following behavior is completely unexpected. Is it a bug or a by-
design feature?

Regards,
Victor.

-----------------

from pickle import dumps
from cPickle import dumps as cdumps

print dumps('1001799')==dumps(str(1001799))
print cdumps('1001799')==cdumps(str(1001799))
True
False

vicbook:~ victor$ python
Python 2.5 (r25:51918, Sep 19 2006, 08:49:13)
[GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
vicbook:~ victor$ uname -a
Darwin vicbook 8.9.1 Darwin Kernel Version 8.9.1: Thu Feb 22 20:55:00
PST 2007; root:xnu-792.18.15~1/RELEASE_I386 i386 i386
vicbook:~ victor$
 

Chris Cioffi

Victor Kryukov wrote:

The following behavior is completely unexpected. Is it a bug or a by-
design feature?

from pickle import dumps
from cPickle import dumps as cdumps

print dumps('1001799')==dumps(str(1001799))
print cdumps('1001799')==cdumps(str(1001799))

True
False


Python 2.4 gives the same behavior on Windows:

ActivePython 2.4.3 Build 12 (ActiveState Software Inc.) based on
Python 2.4.3 (#69, Apr 11 2006, 15:32:42) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
S'1001799'
p1
.. S'1001799'
.. S'1001799'
p0
..
S'1001799'
p0
..

This does seem odd, at the very least.

Chris
 

infidel

ActivePython 2.5.1.1 as well:

PythonWin 2.5.1 (r251:54863, May 1 2007, 17:47:05) [MSC v.1310 32 bit
(Intel)] on win32.
Portions Copyright 1994-2006 Mark Hammond - see 'Help/About PythonWin'
for further copyright information.
S'10'
p0
.. S'10'
p0
.. S'10'
p1
..
S'10'
..
 

Nick Craig-Wood

Victor Kryukov said:
The following behavior is completely unexpected. Is it a bug or a by-
design feature?

from pickle import dumps
from cPickle import dumps as cdumps

print dumps('1001799')==dumps(str(1001799))
print cdumps('1001799')==cdumps(str(1001799))

True
False

Does it matter since it is decoded properly?
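
A minimal sketch of that point, written for Python 3 (where cPickle no longer exists as a separate module). The three byte strings are hand-copied from the protocol-0 output shown earlier in the thread; attributing each one to pickle.py or cPickle in the comments follows the posts above and is not re-verified here.

```python
import pickle

# Protocol-0 streams for the same string, differing only in memo opcodes.
streams = [
    b"S'1001799'\np0\n.",  # as reported for pickle.py: PUT index 0
    b"S'1001799'\np1\n.",  # as reported for cPickle with a literal: PUT index 1
    b"S'1001799'\n.",      # as reported for cPickle with str(...): no PUT
]

# The byte streams are pairwise different ...
all_distinct = len(set(streams)) == len(streams)

# ... but every one of them decodes to the same value.
decoded = [pickle.loads(stream) for stream in streams]
all_equal = all(value == "1001799" for value in decoded)

print(all_distinct, all_equal)
```

If the goal is to check that two objects pickle "the same", comparing the unpickled values is more robust than comparing the raw streams.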
 

Facundo Batista

Victor Kryukov wrote:

The following behavior is completely unexpected. Is it a bug or a by-
design feature?

...

from pickle import dumps
from cPickle import dumps as cdumps

print dumps('1001799')==dumps(str(1001799))
print cdumps('1001799')==cdumps(str(1001799))

It's a feature, the behaviour is described in the documentation:

"""
Since the pickle data format is actually a tiny stack-oriented
programming language, and some freedom is taken in the encodings of
certain objects, it is possible that the two modules produce different
data streams for the same input objects. However it is guaranteed that
they will always be able to read each other's data streams
"""
True

Regards,
 

Guest

This does seem odd, at the very least.

The difference between the pn codes comes from this comment in cPickle.c:

/* Make sure memo keys are positive! */
/* XXX Why?
 * XXX And does "positive" really mean non-negative?
 * XXX pickle.py starts with PUT index 0, not 1. This makes for
 * XXX gratuitous differences between the pickling modules.
 */
p++;

The second difference (where sometimes p1 is written and sometimes not)
comes from this block in put:

if (ob->ob_refcnt < 2 || self->fast)
    return 0;

Here, a reference to the object is only marshalled if the object has
more than one reference to it. The string literal does; the dynamically
computed string does not. If there is only one reference to an object,
there is no need to store it in the memo, as it can't possibly be
referenced later on.
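
The role of the memo can be made visible with the standard pickletools module. The sketch below is written for Python 3, so it does not exercise cPickle's refcount check itself; as a stand-in it uses pickletools.optimize, which drops PUT opcodes that no later GET refers to. The mechanism differs, but the resulting kind of stream difference is the same. The helper name opcode_names is made up for this illustration.

```python
import pickle
import pickletools


def opcode_names(data):
    """Return the opcode names of a pickle stream, in order."""
    return [opcode.name for opcode, arg, pos in pickletools.genops(data)]


s = str(1001799)

# Pickled on its own: a PUT for s would never be read back by a GET,
# so optimize() removes it.
single = pickletools.optimize(pickle.dumps(s, 0))

# Pickled twice inside one list: the second occurrence is written as a
# GET from the memo, so the PUT for s has to stay in the stream.
shared = pickletools.optimize(pickle.dumps([s, s], 0))

print(opcode_names(single))
print(opcode_names(shared))
```

Both streams still load back to the original values; only the memo bookkeeping differs.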

Regards,
Martin
 
