Parsing JavaScript to prevent maliciousness?

Mongoose Sir mongoose · Aug 16, 2009

Hello,

I'm working on a site that is implementing similar functionality to _A
Certain Large Social Networking Site_'s Apps feature.

Application developers will be able to write apps in a hybrid HTML /
"FooML" / JavaScript syntax.

This will get parsed by my servers (as the man in the middle) and then
shoved back to the user's browser as HTML.

Now, my normal inclination is just to dive in and start coding away =)

But I figured one of the smart people here might have some good pointers
on where to start.

The tricky problems, as I see them:

* Allowing access to some JavaScript functionality while stripping out
malicious calls (document.cookie?)
* Also: how to deal with Base64 / eval / other tomfoolery that attackers
might attempt
* Parsing custom tags like <foo:username />, <foo:friend_list count="4"
/>.

The last one seems similar enough to parsing HTML trees, so hopefully
there's something in Ruby-land that can help with this.

Any suggestions / links / pointers would be greatly appreciated!!

- Sean

P.S. If anyone is interested in working with me on some kind of open
source library that could handle this kind of thing in a
website/domain-agnostic way, feel free to hit me up.

pharrington · Aug 16, 2009

> The tricky problems, as I see them:
>
> * Allowing access to some JavaScript functionality while stripping out
> malicious calls (document.cookie?)
> * Also: how to deal with Base64 / eval / other tomfoolery that attackers
> might attempt

Does a Ruby JavaScript parser exist? A quick Google search brings up
http://idontsmoke.co.uk/2005/rbnarcissus/, though I don't know how well it
actually works. Either way, "stripping out malicious calls" is the
opposite of the correct approach (attackers *will* outsmart you,
100% of the time); instead, create a whitelist of acceptable
JavaScript and reject everything that doesn't match your criteria.
Perhaps it might even be easier to create your own language that users
can use, and translate that into JS?
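
To make the whitelist idea concrete, here is a minimal sketch in plain Ruby. It is only an illustration under stated assumptions: the `app` object and its `showMessage` / `setTitle` methods are hypothetical, and the check works on tokens rather than on a real parse tree, so it is far stricter (and far cruder) than what an AST-based pass would do. The point is the direction of the test: anything not explicitly allowed is rejected.

```ruby
require 'set'
require 'strscan'

# Hypothetical allow-list: the only words an app script may contain.
ALLOWED_WORDS = Set.new(%w[app showMessage setTitle if else true false null])

# Punctuation an app may use. Brackets, backticks and backslashes are
# deliberately absent, because they allow names to be built at runtime.
ALLOWED_PUNCTUATION = "(){}.,;=+-*/<>!&|?:".chars.to_set

# Returns every token in +source+ that is not on the allow-list.
# An empty result means the script passed this (very strict) check.
def violations(source)
  scanner = StringScanner.new(source)
  found = []
  until scanner.eos?
    if scanner.skip(/\s+/)
      next
    elsif scanner.skip(/"(?:[^"\\\n]|\\.)*"|'(?:[^'\\\n]|\\.)*'/)
      next # complete string literal: treated as data, not code
    elsif scanner.skip(/\d+(?:\.\d+)?/)
      next # numeric literal
    elsif (word = scanner.scan(/[A-Za-z_$][A-Za-z0-9_$]*/))
      found << word unless ALLOWED_WORDS.include?(word)
    else
      char = scanner.getch
      found << char unless ALLOWED_PUNCTUATION.include?(char)
    end
  end
  found.uniq
end

violations("app.showMessage('You just won a prize!');") # => []
violations("app.setTitle(document.cookie);")            # => ["document", "cookie"]
violations("eval(atob('ZG9j'));")                       # => ["eval", "atob"]
```

Note that `eval`, `atob` and `document` are rejected without ever being named in the code; they simply are not on the list. Local variables, comments and regex literals are not handled here and would need a real parser.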

> * Parsing custom tags like <foo:username />, <foo:friend_list count="4"
> />.
>
> The last one seems similar enough to parsing HTML trees, so hopefully
> there's something in Ruby-land that can help with this.

This seems like the standard Hpricot/Nokogiri parsing affair; are
either of those not suiting your needs?
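
For illustration only, here is a rough sketch of what expanding the custom tags could look like. It uses a regular expression in place of a real parser such as Hpricot or Nokogiri, purely to stay self-contained; the handler table, the `viewer` hash and the `expand_foo_tags` name are all hypothetical, not part of any library.

```ruby
require 'cgi'

# Hypothetical handlers: tag name => lambda(attributes, viewer) returning HTML.
TAG_HANDLERS = {
  'username' => lambda { |_attrs, viewer| CGI.escapeHTML(viewer[:name]) },
  'friend_list' => lambda { |attrs, viewer|
    # Clamp the requested count so a bad attribute cannot raise or explode.
    count = attrs.fetch('count', '10').to_i.clamp(0, 50)
    items = viewer[:friends].first(count).map { |f| "<li>#{CGI.escapeHTML(f)}</li>" }
    "<ul>#{items.join}</ul>"
  }
}.freeze

# Matches self-closing tags such as <foo:friend_list count="4" />.
FOO_TAG = %r{<foo:(\w+)((?:\s+\w+="[^"]*")*)\s*/>}

# Replaces each known <foo:... /> tag with the HTML its handler returns.
def expand_foo_tags(markup, viewer)
  markup.gsub(FOO_TAG) do
    name, raw_attrs = $1, $2
    handler = TAG_HANDLERS[name]
    next '' unless handler # unknown tags are dropped, not passed through
    attrs = raw_attrs.scan(/(\w+)="([^"]*)"/).to_h
    handler.call(attrs, viewer)
  end
end

viewer = { name: 'Sean', friends: %w[Ann Bob Cy Di Ed] }
puts expand_foo_tags('Hi <foo:username />! <foo:friend_list count="2" />', viewer)
# prints: Hi Sean! <ul><li>Ann</li><li>Bob</li></ul>
```

The same whitelist principle applies here: only tags with a registered handler produce output, and everything the handlers emit is escaped.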

Aaron Patterson · Aug 16, 2009

> Does a Ruby JavaScript parser exist? A quick Google search brings up
> http://idontsmoke.co.uk/2005/rbnarcissus/, though I don't know how well it
> actually works. Either way, "stripping out malicious calls" is the
> opposite of the correct approach (attackers *will* outsmart you,
> 100% of the time); instead, create a whitelist of acceptable
> JavaScript and reject everything that doesn't match your criteria.
> Perhaps it might even be easier to create your own language that users
> can use, and translate that into JS?

Yes, there are a couple of JavaScript parsers out there:

RKelly (it's pure Ruby):
http://github.com/tenderlove/rkelly

And Johnson (uses SpiderMonkey's parse tree):
http://github.com/jbarnette/johnson

Both support AST manipulation as well as turning the AST back into
JavaScript. Either of them should be easy enough to work with, but
properly sanitizing JavaScript sounds hard!

Fabian Streitel · Aug 16, 2009

Yep, sounds quite dangerous to me as well...

Another security problem might come from allowing users to manipulate
the DOM (which I guess is one of the features you plan on implementing,
since without that there isn't really much you can do in JS except maybe
some alerts).

I'd definitely forbid that:
1. They could inject arbitrary text into the website, including spam,
links etc., and start phishing attacks, and
2. since browsers execute every bit of JavaScript they find, they could
just inject a string containing <script> tags, execute any JS they want,
steal user sessions, forward private data, etc.

IMHO it's already hard enough protecting the webapp from attacks from
the outside, but you also introduce an attack vector from the inside.

Greetz!

Mongoose Sir mongoose · Aug 16, 2009

@pharrington - thanks for the pointer on Hpricot/Nokogiri. I'm familiar
with Hpricot but will have to take a look at Nokogiri.

Aaron - Thanks. I'll take a look at those. Think I'm getting in over
my head here, but should be fun times.

Fabian -

The whole point of the website is to allow third-party developers to
display HTML inside of a little content area within the site. (Not
unlike a certain large social networking site's Apps feature.)

One approach I've seen is namespacing all element IDs with some kind of
application ID.

So,

$('#foo-alert').html('You just won a prize!');
...would have to become
$('#app_1234567_foo-alert').html('You just won a prize!');
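
As a rough sketch of that rewrite on the server side (an assumption on my part, not how any particular site does it), a naive version could look like the following. The `namespace_ids` name and the `app_<id>_` prefix format are only illustrative, and a regular expression only catches literal selectors: something like `$('#' + name)` would slip straight past it, so a real implementation would have to work on the parse tree or wrap `$` itself.

```ruby
# Matches a jQuery-style ID selector given as a string literal,
# e.g. $('#foo-alert') or $("#foo-alert").
ID_SELECTOR = /\$\((['"])\#([A-Za-z][\w\-]*)\1\)/

# Prefixes every literal ID selector in +source+ with the app's namespace.
def namespace_ids(source, app_id)
  prefix = "app_#{app_id}_"
  source.gsub(ID_SELECTOR) do
    quote, id = $1, $2
    "$(" + quote + "#" + prefix + id + quote + ")"
  end
end

puts namespace_ids("$('#foo-alert').html('You just won a prize!');", 1234567)
# prints: $('#app_1234567_foo-alert').html('You just won a prize!');
```

The prefix is always added, even if the developer already typed one, so an app cannot address another app's elements by guessing its prefix.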

If they broke out of their content area and started manipulating the DOM
on other parts of the page, this wouldn't even be the end of the world.
(they'd eventually get caught & banned)

I'm more concerned about malicious things they could do to the end-user,
e.g. cookie theft.

It sounds like a whitelist is the reasonable approach here.

Cheers,
- Sean

Tony Arcieri · Aug 17, 2009

For what it's worth, we're using Johnson for something similar. The intent
isn't so much to prevent maliciousness as to let multiple scripts from
different third-party developers run in the same environment without
worrying about clashing variable or function names.

We previously used RKelly but moved to Johnson because it let us
actually test the compiled scripts by executing them.

Fabian Streitel · Aug 17, 2009

> If they broke out of their content area and started manipulating the DOM
> on other parts of the page, this wouldn't even be the end of the world.
> (they'd eventually get caught & banned)
>
> I'm more concerned about malicious things they could do to the end-user,
> e.g. cookie theft.

But if you let them manipulate the DOM, how are you going to prevent script
injection? That's all you need to steal cookies. And if the attacker is sly,
he will conceal the injection, delaying getting caught until he has
collected enough valid sessions.

I don't know what mysterious site you're talking about, since I'm not into
social network stuff, but I'd sure like to know how they manage that
problem...

Greetz!
