For user-submitted content: Textile or inspected HTML?

Dorren

I know that one reason to use another markup language, such as wiki
syntax or Textile, is to prevent JavaScript injection. But for users
who don't know wiki syntax or Textile, I'm thinking about just letting
them enter plain HTML, parsing the content, and rejecting all
questionable tags and attributes, so that only predefined (safe) tags
such as bold or italic are allowed.
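
A minimal sketch of the whitelist idea described above, using only the Ruby standard library. It swaps real HTML parsing for a simpler "escape everything, then restore bare whitelisted tags" technique. The tag list and the name `sanitize_html` are assumptions made for illustration, not part of any library mentioned in this thread.

```ruby
require "cgi"

# Hypothetical whitelist; the tag names are an assumption for illustration.
ALLOWED_TAGS = %w[b i em strong].freeze

# Escape all markup, then restore only bare whitelisted tags.
# Tags that carry attributes never match the pattern, so they stay escaped.
def sanitize_html(input)
  escaped = CGI.escapeHTML(input)
  escaped.gsub(%r{&lt;(/?)([a-zA-Z]+)&gt;}) do |match|
    closing = Regexp.last_match(1)
    name = Regexp.last_match(2).downcase
    ALLOWED_TAGS.include?(name) ? "<#{closing}#{name}>" : match
  end
end

puts sanitize_html("<b>hi</b> <script>alert(1)</script>")
# prints: <b>hi</b> &lt;script&gt;alert(1)&lt;/script&gt;
```

This sketch does not balance tags, so a stray closing tag such as `</b>` can survive on its own. A production filter would likely need a real parser for that.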

Is using HTML for markup less secure than using non-HTML markup?
What's the main reason people use non-HTML markup?
 
Phillip Gawlowski

Dorren said:
I know that one reason to use another markup language, such as wiki
syntax or Textile, is to prevent JavaScript injection. But for users
who don't know wiki syntax or Textile, I'm thinking about just letting
them enter plain HTML, parsing the content, and rejecting all
questionable tags and attributes, so that only predefined (safe) tags
such as bold or italic are allowed.

Wiki-like syntax is easy to learn (Textile is one such syntax: markup
that is not HTML), and it saves you the hassle of sanitizing the input.
With HTML you'll have to handle a lot of special cases caused by browser
incompatibilities. IE6, for example, accepts "javas\ncript" (the word
with a line break in the middle) as valid, which to a computer doing a
string comparison obviously isn't the same as "javascript".

Dorren said:
Is using HTML for markup less secure than using non-HTML markup?
What's the main reason people use non-HTML markup?

Yes, raw HTML input is less secure, mainly because of JavaScript
exploits, and it is also harder for humans to read.
If you can avoid HTML input, do so.
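
To illustrate the kind of special case mentioned in this post (a line break hidden inside "javascript"), here is a hedged sketch of a URL check that strips whitespace and control characters before looking at the scheme. The scheme list and the name `safe_url?` are assumptions for illustration. Entity-encoded schemes would have to be decoded first, which this sketch does not attempt.

```ruby
# Hypothetical allowlist of URL schemes; adjust for your application.
SAFE_SCHEMES = %w[http https mailto].freeze

# Returns true when the URL is relative or uses an allowed scheme.
# Whitespace and control characters are removed first, so that
# "java\nscript:" is compared as "javascript:".
def safe_url?(url)
  normalized = url.gsub(/[\s\x00-\x1f]/, "").downcase
  scheme = normalized[/\A([a-z][a-z0-9+.\-]*):/, 1]
  scheme.nil? || SAFE_SCHEMES.include?(scheme)
end

safe_url?("java\nscript:alert(1)")   # => false
safe_url?("http://example.com/")     # => true
```

Checking against a list of allowed schemes, rather than a list of forbidden ones, means an unfamiliar scheme is rejected by default.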


Shameless plug:
ClothRed aims to convert HTML into Textile, and should be able to
serve as a sanitizer in the (hopefully) not too distant future:
http://clothred.rubyforge.org

(P.S.: I came up with this library out of a need similar to yours.)


--
Phillip "CynicalRyan" Gawlowski
http://cynicalryan.110mb.com/

Rule of Open-Source Programming #13:

Your first release can always be improved upon.
 
John Joyce

There are simple means to counter injection.
You do still need to sanitize and validate user input, though.
All unknown or unmonitored input must be considered untrusted.
Even with another markup language, there could be attacks in waiting.
The fact is, you have to do this stuff anyway.
You should also limit user input: there must be some upper bound on
the size of the input.
You have to care about SQL injection, cross-site scripting (XSS)
attacks, all of it.
There is no shortcut on security.
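
As a rough sketch of two of the points above (an upper bound on input size, and treating all input as untrusted before display), assuming a hypothetical limit and hypothetical names `MAX_COMMENT_BYTES`, `InvalidInput`, and `prepare_comment`. SQL injection is a separate concern, usually handled with parameterized queries, and is not covered here.

```ruby
require "cgi"

# Hypothetical upper bound; pick one that suits your application.
MAX_COMMENT_BYTES = 10_000

# Raised when submitted content fails validation.
class InvalidInput < StandardError; end

# Validates presence, size, and encoding, then returns HTML-escaped text.
def prepare_comment(raw)
  raise InvalidInput, "input is missing" if raw.nil?
  raise InvalidInput, "input too large" if raw.bytesize > MAX_COMMENT_BYTES
  raise InvalidInput, "invalid encoding" unless raw.valid_encoding?
  raise InvalidInput, "input is empty" if raw.strip.empty?
  CGI.escapeHTML(raw)
end

puts prepare_comment("<script>x</script>")
# prints: &lt;script&gt;x&lt;/script&gt;
```

Checking the size before anything else keeps the more expensive checks from running on oversized input.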
 
