marshal bug?

Anurag

I have been chasing a problem in my code for hours, and it boils down
to this:
import marshal
marshal.dumps(str(123)) != marshal.dumps(str("123"))

Can someone please tell me why, given that
str(123) == str("123")

Or are they different?

It also means that, given s = str(123),
marshal.dumps(s) != marshal.dumps(marshal.loads(marshal.dumps(s)))

rgds
Anurag
 
Gary Herron

Any string in Python can be "interned" or not, the difference being
how/where the value is stored internally. The marshal module includes
such information in its output. What you are doing is probably
considered a misuse of the marshal module. I'd suggest using the pickle
(or cPickle) modules instead.

Here's the relevant part of the manual for marshal:

version:
Indicates the format that the module uses. Version 0 is the
historical format, version 1 (added in Python 2.4) shares interned
strings and version 2 (added in Python 2.5) uses a binary format for
floating point numbers. The current version is 2.

Gary Herron.
 
David

I have been chasing a problem in my code for hours, and it boils down
to this:
import marshal
marshal.dumps(str(123)) != marshal.dumps(str("123"))

I'm not sure why, but marshal does dump the two differently, i.e.:
's\x03\x00\x00\x00123'
't\x03\x00\x00\x00123'

Somehow, marshal detects a difference between these two and changes the
first character in its output. From the documentation:

"Details of the format are undocumented on purpose; it may change
between Python versions (although it rarely does)".

So you could check the C code to find out what the "s" and "t" mean,
but your assumptions on the format for your app could become incorrect
with new Python versions, or with different marshalling backends.
 
Anurag

Thanks for the reply.

It seems the problem is due to
"""
Any string in Python can be "interned" or not, the difference being
how/where the value is stored internally. The marshal module includes
such information in its output. What you are doing is probably
considered a misuse of the marshal module. I'd suggest using the pickle
(or cPickle) modules instead.
"""
as mentioned by Gary Herron

Now, is there an easy way to bypass it (hack around it)?
I tried various options, but all fail, e.g.
i = 123; marshal.dumps("%d" % 123) != marshal.dumps("%d" % i)

So maybe I have to change all my code to use pickle, which also
consumes more memory per string.
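
A possible workaround for the string case only, going by the manual excerpt quoted earlier in the thread: format version 0 predates the sharing of interned strings, so passing 0 as the second argument to marshal.dumps should make equal strings serialize identically. This is a sketch under that assumption, not a guarantee. The helper name dumps_without_interning is made up for illustration, the format is undocumented and may change between Python versions, and version 0 also stores floats in the older non-binary form.

```python
import marshal


def dumps_without_interning(value):
    # Assumption: format version 0 does not record whether a string is
    # interned, so equal strings should produce equal bytes.
    return marshal.dumps(value, 0)


i = 123
# The comparison from the question, repeated with the version 0 format.
same = dumps_without_interning("%d" % 123) == dumps_without_interning("%d" % i)
```

This only addresses the interning difference. Values that compare equal but have different types (an int and a float, say) would still serialize differently.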
 
David

I'm not sure why, but marshal does dump the two differently, i.e.:
's\x03\x00\x00\x00123'
't\x03\x00\x00\x00123'

I've just checked the source [1].

's' refers to a regular string, 't' refers to an interned[2] string.

In other words, the difference between the two marshal dumps is an
internal Python implementation detail. You probably shouldn't be
comparing marshal dumps in your app in the first place.

[1] http://coverage.livinglogic.de/Python/marshal.c.html. (Actually, I
checked the downloaded bz2, but this is the only URL for marshal.c I
could find)
[2] http://mindprod.com/jgloss/interned.html
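
To make the interned/regular distinction concrete, here is a small sketch. The function name intern_demo is hypothetical. It builds the same string twice at run time and checks that the two compare equal, then checks that interning both yields one shared object. It assumes sys.intern on Python 3 and falls back to the intern builtin on Python 2.

```python
import sys

try:
    intern_str = sys.intern      # Python 3
except AttributeError:
    intern_str = intern          # Python 2 builtin


def intern_demo(n):
    # Two strings built at run time: equal in value, but not
    # necessarily the same object in memory.
    a = str(n)
    b = str(n)
    values_equal = (a == b)
    # Interning maps equal strings onto a single shared object.
    shared_after_intern = intern_str(a) is intern_str(b)
    return values_equal, shared_after_intern


result = intern_demo(123)
```

Equality (==) looks only at the characters. Whether a string is interned is a storage detail, and that detail is what leaked into the marshal output discussed above.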
 
Gabriel Genellina

Now, is there an easy way to bypass it (hack around it)?
I tried various options, but all fail, e.g.
i = 123; marshal.dumps("%d" % 123) != marshal.dumps("%d" % i)

You can't. Don't use marshal to compare objects. You appear to assume that
two objects that compare equal, should have the same marshalled
representation, and that is simply not true.

py> [1, 2.0, 3+0j, "4"] == [1.0, 2+0j, 3, u"4"]
True

So maybe I have to change all my code to use pickle, which also
consumes more memory per string.

Neither marshal nor pickle guarantees that equal objects have equal
representations, so in this regard pickle won't help either.
Maybe if you explain what you really want to do someone could suggest a
solution.
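
A minimal sketch of the point above, using an int and a float that compare equal: their serialized bytes differ under both marshal and pickle, while comparing the loaded values gives the expected answer. The helper name equal_after_loading is made up for illustration and assumes both blobs came from marshal.dumps.

```python
import marshal
import pickle


def equal_after_loading(blob_a, blob_b):
    # Compare the deserialized values rather than the raw bytes.
    return marshal.loads(blob_a) == marshal.loads(blob_b)


int_blob = marshal.dumps(1)
float_blob = marshal.dumps(1.0)

# 1 == 1.0 is True, yet the serialized forms differ in both modules.
marshal_bytes_differ = int_blob != float_blob
pickle_bytes_differ = pickle.dumps(1) != pickle.dumps(1.0)

# Comparing after loading reflects value equality.
values_equal = equal_after_loading(int_blob, float_blob)
```

If the goal is to decide whether two stored values are the same, comparing the loaded objects sidesteps the representation question entirely.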
 
