Securing 'pickle'

Ben Finney

David McNab said:
I'm writing a web app framework which stores pickles in client
cookies.

Sounds like a waste of bandwidth, in addition to the security concerns
you raise.

Why not store the pickles on the server, and set a session cookie to
refer to them? That way, you only send a short session ID instead of
the whole pickle, and messing with the cookie doesn't alter the pickles.
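
A minimal sketch of the server-side approach described above, assuming a single process and an in-memory dictionary. The names `_sessions`, `create_session` and `load_session` are illustrative and not taken from any framework. Only the random session ID would travel in the cookie:

```python
import pickle
import secrets

# Hypothetical in-memory store, keyed by session ID.
_sessions = {}

def create_session(data):
    """Store pickled data on the server and return an opaque session ID."""
    session_id = secrets.token_urlsafe(32)
    _sessions[session_id] = pickle.dumps(data)
    return session_id

def load_session(session_id):
    """Return the stored data, or None if the ID is unknown."""
    blob = _sessions.get(session_id)
    if blob is None:
        return None
    # The pickle never left the server, so the client could not alter it.
    return pickle.loads(blob)

sid = create_session({"user": "david", "score": 1.5})
print(load_session(sid))
print(load_session(sid + "x"))  # a modified ID simply finds nothing
```

A multi-process or load-balanced deployment would need shared storage in place of the dictionary, which this sketch does not address.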

(Mmm, all this talk of food...)
 
David McNab

Hi,

I'm writing a web app framework which stores pickles in client cookies.

The obvious security risk is that some 5cr1p7 X1ddi35 will inevitably try
tampering with the cookie and malforming it in an attempt to get the
server-side python code to run arbitrary code, or something similarly
undesirable.

To protect against this, I've subclassed pickle.Unpickler, and added
overrides of the methods load_global, load_inst, load_obj and find_class.

My override methods simply raise exceptions unconditionally, which causes
any unpickle to fail if the pickle tries to unpack anything even
resembling code or an object.

I did this in preference to using the reputable 'bencode' module from
BitTorrent, because bencode doesn't support floats.

My question: have I done enough, or are there still ways in which my
hobbled unpickler could be subverted by a malformed cookie?

Cheers
David
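
For readers on current Python, a rough equivalent of this kind of hobbled unpickler can be sketched by overriding `find_class`, the hook that the `pickle` documentation describes for restricting globals. The class and function names here are illustrative, and the sketch does not by itself answer whether this restriction is enough:

```python
import io
import pickle

class NoGlobalsUnpickler(pickle.Unpickler):
    """Unpickler that refuses every global lookup (classes, functions)."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(
            "global %s.%s is forbidden" % (module, name))

def restricted_loads(data):
    """Unpickle bytes using the restricted unpickler."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()

# Plain data types, including floats, still round-trip.
print(restricted_loads(pickle.dumps({"pi": 3.14, "items": [1, "two"]})))

# A pickle that refers to a global (here the built-in len) is rejected.
try:
    restricted_loads(pickle.dumps(len))
except pickle.UnpicklingError as exc:
    print("rejected:", exc)
```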
 
Erik Max Francis

Paul said:
Because now you need a mechanism to store the session info on the
server, and you might want it to work across multiple load-balanced
servers that fail over to one another, etc.

That's far superior to presenting the opportunity for exploits in the
first place, in my opinion. Depending on the contents of that cookie,
what you suggest may not be a problem at all (depending on how critical
the data contained therein is).
 
Ian Bicking

David McNab said:
I'm writing a web app framework which stores pickles in client cookies.

The obvious security risk is that some 5cr1p7 X1ddi35 will inevitably try
tampering with the cookie and malforming it in an attempt to get the
server-side python code to run arbitrary code, or something similarly
undesirable.

To protect against this, I've subclassed pickle.Unpickler, and added
overrides of the methods load_global, load_inst, load_obj and find_class.

A much easier way to secure your pickle is to sign it, like:

import md5
from pickle import dumps

cookie = dumps(object)
secret = 'really secret!'
hasher = md5.new()
hasher.update(secret)
hasher.update(cookie)
# 16-byte MD5 digest of secret + pickled data
cookie_signature = hasher.digest()

You may then wish to base64 encode both (.encode('base64')), pop them
into one value, and you're off. Though I suppose at that point you may
be hitting the maximum size of a cookie. Hidden fields will work
nicely, though.

Decoding and verifying is an exercise left to the reader.

Ian
 
Paul Rubin

Erik Max Francis said:
That's far superior to presenting the opportunity for exploits in the
first place, in my opinion. Depending on the contents of that cookie,
what you suggest may not be a problem at all (depending on how critical
the data contained therein is).

I'm not sure what you're saying here. My suggestion is to
authenticate the cookies with a cryptographic checksum and verify the
authentication before deserializing the cookies. That's probably the
simplest approach. Keeping session info on a multi-process server (or
worse, a multi-server network) needs some kind of concurrent storage
mechanism. I don't see a robust, secure, low-overhead way to do that
with out-of-the-box Python. Any suggestions?
 
Ian Bicking

Erik Max Francis said:
That's far superior to presenting the opportunity for exploits in the
first place, in my opinion. Depending on the contents of that cookie,
what you suggest may not be a problem at all (depending on how critical
the data contained therein is).

Security isn't a big deal -- or rather, securing cookies isn't a big
deal. I think reliability will be a bigger problem. Cookies can cause
problems even when you are just storing a simple session ID. If you
start storing more information you're likely to run up against other
problems -- cookies can be hard to dispose of, who knows where they'll
get chopped off to preserve storage (it happens quickly), and IE has a
bug where you can't redirect and set a cookie at the same time, which
can really drive you crazy if you don't know about it.

Hidden fields are a much better way of keeping information on the
client. They tend to make for more navigable pages too. But if you
really want session, not transaction data, then you just need to figure
out server-side sessions. The biggest advantage of a web application is
that it runs in a controlled environment (the server) and you should
take advantage of that.

Ian
 
Paul Rubin

Ian Bicking said:
A much easier way to secure your pickle is to sign it, like:

cookie = dumps(object)
secret = 'really secret!'
hasher = md5.new()
hasher.update(secret)
hasher.update(cookie)
cookie_signature = hasher.digest()

That method is vulnerable to an "appending" attack against md5. I'll
spare the gory details, but you should call md5 through the HMAC
module to make the signature instead of using md5 directly. HMAC is
designed to stop that attack.

You may then wish to base64 encode both (.encode('base64')), pop them
into one value, and you're off. Though I suppose at that point you may
be hitting the maximum size of a cookie. Hidden fields will work
nicely, though.

You could split the session info into several cookies, but in that
situation you should authenticate the whole cookie set with a single
signature. Otherwise someone could paste together several cookies
from separate sessions, and possibly confuse your server.
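
One way to sketch the single-signature idea, assuming Python 3's `hmac` and `hashlib` modules and SHA-256 as the digest (the thread itself discusses MD5). The function names and the length-prefix framing are illustrative choices, not something from the thread:

```python
import hashlib
import hmac

SECRET = b"really secret!"  # placeholder key, as in the thread

def sign_cookie_set(cookies, secret=SECRET):
    """Return one HMAC covering every cookie name and value.

    Each field is length-prefixed so field boundaries cannot be shifted.
    """
    mac = hmac.new(secret, digestmod=hashlib.sha256)
    for name in sorted(cookies):
        for field in (name.encode(), cookies[name]):
            mac.update(len(field).to_bytes(4, "big"))
            mac.update(field)
    return mac.hexdigest()

def verify_cookie_set(cookies, signature, secret=SECRET):
    """Check the whole set against the single signature."""
    expected = sign_cookie_set(cookies, secret)
    return hmac.compare_digest(expected, signature)

session_a = {"part1": b"alice-", "part2": b"cart"}
session_b = {"part1": b"bob-", "part2": b"admin"}
sig_a = sign_cookie_set(session_a)

# Pasting a cookie from another session into the set breaks the signature.
mixed = {"part1": session_a["part1"], "part2": session_b["part2"]}
print(verify_cookie_set(session_a, sig_a))
print(verify_cookie_set(mixed, sig_a))
```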
 
Dave Cole

Ian Bicking said:
A much easier way to secure your pickle is to sign it, like:

cookie = dumps(object)
secret = 'really secret!'
hasher = md5.new()
hasher.update(secret)
hasher.update(cookie)
cookie_signature = hasher.digest()

You may then wish to base64 encode both (.encode('base64')), pop
them into one value, and you're off. Though I suppose at that point
you may be hitting the maximum size of a cookie. Hidden fields
will work nicely, though.

Decoding and verifying is an exercise left to the reader.

That is exactly what Albatross does with pickles sent to the browser.
In case it is interesting to anyone, here is the class that does the
work of signing and checking the sign.

- Dave

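import md5

class PickleSignMixin:
    """Prefix pickled text with an MD5 digest of secret + text."""

    def __init__(self, secret):
        self.__secret = secret

    def pickle_sign(self, text):
        # Prepend the 16-byte digest of secret + text.
        m = md5.new()
        m.update(self.__secret)
        m.update(text)
        text = m.digest() + text
        return text

    def pickle_unsign(self, text):
        # Split off the 16-byte digest and recompute it over the rest.
        digest = text[:16]
        text = text[16:]
        m = md5.new()
        m.update(self.__secret)
        m.update(text)
        if m.digest() == digest:
            return text
        # An empty string signals a failed check.
        return ''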
 
Paul Rubin

Dave Cole said:
I have been googling for information on the "appending" attack against
md5 and cannot find anything that clearly describes it. Do you have
any links handy?

I think RFC 2104 (the HMAC spec) might describe it. Basically, think
about how md5 works. You load the md5 context with the secret key
(say 20 bytes) then your data (say 20 bytes), then some padding to
fill the 64-byte block, and run the compression function:

md5_compress(key + data + 24 bytes of padding)

Call the 24 padding bytes P. They are a 0x80 byte, 15 zero bytes, and
an 8-byte length.

The hash output is just the md5 chaining variables after running the
compression function.

Now look at the 80-byte string (100 bytes once the key is prepended)

E = your data + P (same as above) + 36 bytes of evil stuff

Even without knowing your secret key, if the attacker knows your data
(which may not be secret), and md5(key+data) (which you've included in
the cookie), he can compute the signature of E. It's just the result
of running the compression function on his evil stuff plus appropriate
additional padding, with the chaining variables set to the original
md5 hash that you already sent him.
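
The padding arithmetic above can be checked with a short sketch. It assumes the MD5 padding rule from RFC 1321 (a 0x80 byte, zero bytes up to 56 bytes modulo 64, then the message length in bits as an 8-byte little-endian integer) and the 20-byte key and data from the example. The function name is illustrative. The sketch only computes lengths and does not carry out the attack:

```python
import struct

def md5_padding(message_len):
    """Padding that MD5 appends to a message of message_len bytes."""
    zeros = (55 - message_len) % 64
    return b"\x80" + b"\x00" * zeros + struct.pack("<Q", message_len * 8)

key_len, data_len, evil_len = 20, 20, 36

# Padding for key + data: fills the first 64-byte block.
P = md5_padding(key_len + data_len)
print(len(P))  # 24

# What the server would hash for the forged value: key + data + P + evil.
forged_len = key_len + data_len + len(P) + evil_len
print(forged_len)  # 100

# Padding the attacker adds after the evil bytes to fill the second block.
print(len(md5_padding(forged_len)))  # 28
```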

This is not really a failure of md5, which is supposed to be a message
digest algorithm, not a MAC. Rather, the authentication fails because
md5 is being used in a way it was not intended to be used.

The solution is to use HMAC. See RFC 2104 for details.
 
John J. Lee

Ian Bicking said:
Security isn't a big deal -- or rather, securing cookies isn't a big
deal.

I don't understand. The problem is that pickles can be constructed
that can damage systems when unpickled, is that right? If that's
true, then surely unpickling cookie data is unsafe, because stuff
coming in from the network has to be regarded as malevolent. Are you
saying that web server environments are sufficiently-well bolted down
that no pickle attack will work? But belt-and-braces is the best
policy, isn't it?

and IE has a
bug where you can't redirect and set a cookie at the same time, which
can really drive you crazy if you don't know about it.
[...]

Hah. There's a slight irony there, given that they fought against
restrictions on setting cookies from 'unverified' third parties when
the (more-or-less stillborn) cookie RFCs were being written. So they
argue against that, then end up partially implementing it by
accident...


John
 
Paul Rubin

John J. Lee said:
I don't understand. The problem is that pickles can be constructed
that can damage systems when unpickled, is that right? If that's
true, then surely unpickling cookie data is unsafe, because stuff
coming in from the network has to be regarded as malevolent. Are you
saying that web server environments are sufficiently-well bolted down
that no pickle attack will work? But belt-and-braces is the best
policy, isn't it?

The point is that you can use cryptographic signatures to make sure
any cookie you receive is one that the server actually sent, before
deciding to unpickle it. That means if the attacker constructs a
malicious cookie, you never unpickle it.
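
A minimal sketch of that verify-before-unpickle flow, assuming Python 3 with the standard `hmac`, `hashlib`, `base64` and `pickle` modules. The function names, the SHA-256 digest and the tag-then-payload layout are assumptions made for illustration:

```python
import base64
import hashlib
import hmac
import pickle

SECRET = b"really secret!"  # placeholder key, as in the thread
TAG_LEN = hashlib.sha256().digest_size

def encode_cookie(obj, secret=SECRET):
    """Pickle obj, prepend an HMAC tag, and base64-encode the result."""
    payload = pickle.dumps(obj)
    tag = hmac.new(secret, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(tag + payload).decode("ascii")

def decode_cookie(value, secret=SECRET):
    """Return the object, or None if the cookie fails verification."""
    try:
        raw = base64.urlsafe_b64decode(value.encode("ascii"))
    except ValueError:
        return None
    tag, payload = raw[:TAG_LEN], raw[TAG_LEN:]
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None  # rejected before pickle.loads is ever called
    return pickle.loads(payload)

cookie = encode_cookie({"user": "david", "balance": 12.5})
print(decode_cookie(cookie))
```

Anyone holding the key can still produce a cookie that unpickles to anything, so the sketch is only as strong as the secrecy of `SECRET`.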
 
Nagy László Zsolt

John J. Lee said:
I don't understand. The problem is that pickles can be constructed
that can damage systems when unpickled, is that right? If that's
true, then surely unpickling cookie data is unsafe, because stuff
coming in from the network has to be regarded as malevolent. Are you
saying that web server environments are sufficiently-well bolted down
that no pickle attack will work? But belt-and-braces is the best
policy, isn't it?

I'm sorry, I just caught this thread and I don't know your problem very
well. I'm using this extensively (see below). It is unable to pickle
class instances, but you won't accidentally run constructors. I think it
is safe for sending and receiving data. I have not tried to publish this
yet, but it would be great to know whether it is safe, so please
comment.

Laci 1.0

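import cStringIO
import cPickle

def dumps(obj):
    # Pickle obj into an in-memory buffer using binary protocol 1.
    f = cStringIO.StringIO()
    p = cPickle.Pickler(f, 1)
    p.dump(obj)
    return f.getvalue()

def loads(s):
    f = cStringIO.StringIO(s)
    p = cPickle.Unpickler(f)
    # With find_global set to None, cPickle refuses to load any global.
    p.find_global = None
    return p.load()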
 
Alan Kennedy

Paul said:
My suggestion is to
authenticate the cookies with a cryptographic checksum and verify the
authentication before deserializing the cookies. That's probably the
simplest approach. Keeping session info on a multi-process server (or
worse, a multi-server network) needs some kind of concurrent storage
mechanism.

Paul,

Do you mean transmit the checksum to the client with the cookie? And
check that they match when the cookie and checksum come back?

Or is the checksum stored on the server, in some form of lookup
dictionary keyed by some user session identifier?

regards,
 
Paul Rubin

Alan Kennedy said:
Do you mean transmit the checksum to the client with the cookie? And
check that they match when the cookie and checksum come back?

Yes. See other posts in the thread for sample code.

Or is the checksum stored on the server, in some form of lookup
dictionary keyed by some user session identifier?

If you have a convenient way to do that, it's best to just send a
session number in the cookie, and keep all the session data on the
server. Then you don't ever have to unpickle anything.
 
Nagy László Zsolt

Alan Kennedy said:
Paul,

Do you mean transmit the checksum to the client with the cookie? And
check that they match when the cookie and checksum come back?

Or is the checksum stored on the server, in some form of lookup
dictionary keyed by some user session identifier?

I think he wanted to write a digital signature instead. Right?

Laci 1.0
 
Paul Rubin

Nagy László Zsolt said:
I think he wanted to write a digital signature instead. Right?

I used "cryptographic checksum" in a broad sense. More specifically
the suggestion is to use a secret-key authentication code like HMAC-MD5.
 
Ian Bicking

John J. Lee said:
I don't understand. The problem is that pickles can be constructed
that can damage systems when unpickled, is that right? If that's
true, then surely unpickling cookie data is unsafe, because stuff
coming in from the network has to be regarded as malevolent. Are you
saying that web server environments are sufficiently-well bolted down
that no pickle attack will work? But belt-and-braces is the best
policy, isn't it?

I should have said "securing cookies isn't hard", so that's not the
reason not to use them (though you shouldn't just use plain-vanilla
cookies).

Ian
 
Paul Rubin

Ian Bicking said:
I should have said "securing cookies isn't hard", so that's not the
reason not to use them (though you shouldn't just use plain-vanilla
cookies).

The signature scheme we've discussed does cause some configuration
hassle. There has to be a host-specific secret key and it has to be
kept secret. If it leaks to an attacker, the attacker can then create
malicious cookies. So the scheme has to be used with care.
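
As an illustration of keeping the key out of the source code, the sketch below reads it from an environment variable. The variable name `COOKIE_SECRET`, the hex encoding and the 32-byte minimum are assumptions, not something the thread specifies:

```python
import os
import secrets

def load_secret_key(env_var="COOKIE_SECRET"):
    """Read a hex-encoded signing key from the environment.

    COOKIE_SECRET is a hypothetical variable name used for this sketch.
    """
    value = os.environ.get(env_var)
    if not value:
        raise RuntimeError("%s is not set" % env_var)
    key = bytes.fromhex(value)
    if len(key) < 32:
        raise RuntimeError("%s must hold at least 32 bytes" % env_var)
    return key

# One-time generation of a fresh random key, for example when
# provisioning a host; the hex string is what would be stored.
new_key_hex = secrets.token_hex(32)
print(len(bytes.fromhex(new_key_hex)))  # 32
```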
 
