The following is a simplified example of a page that a user has created on a site. They create it by filling out a form and then receive a URL for the page; the HTML below is the page they created.
In the example, I’m taking the value of a hidden input field and then putting it into the DOM as is. That results in an alert, simulating an XSS attack.
What’s the best way to prevent things like this? The value of #sourceinput was entered earlier, either by the user now viewing the page or by a different user, and that input wasn’t filtered to remove tags. (The actual case involves the jquery.tooltip.js plugin and its bodyHandler callback: on mouseover, the bodyHandler callback gets the hidden input’s value and displays it to the user.)
One way to deal with this would be to strip tags on input; I control what goes in the hidden textfield so that would seem to solve it.
Another way would be to strip tags in JavaScript, but some of the suggested approaches don’t seem to be 100% effective:
Strip HTML from Text JavaScript
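To illustrate the concern, here is a minimal sketch of a regex-based stripper. The function name `stripTagsNaive` and the pattern are assumptions for illustration, not code taken from the linked answers. It handles well-formed tags, but it also mangles ordinary text and lets an unterminated tag through untouched:

```javascript
// Hypothetical naive stripper: removes anything that looks like <...>.
function stripTagsNaive(input) {
  return String(input).replace(/<[^>]*>/g, '');
}

// Well-formed tags are removed as expected.
const stripped = stripTagsNaive("<script>alert('hi');</script>");
// stripped === "alert('hi');"

// Ordinary text that happens to contain comparison signs is mangled.
const mangled = stripTagsNaive('1 < 2 and 3 > 2');
// mangled === '1  2'

// An unterminated tag has no closing ">", so the pattern never matches
// and the fragment passes through unchanged.
const leftover = stripTagsNaive('<img src=x onerror=alert(1)');
// leftover === '<img src=x onerror=alert(1)'
```

A fragment like the last one may still turn into markup if other text containing ">" is appended after it, which is one reason stripping alone is hard to trust.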
Is there some sort of best practice that I’m missing, or are those two the best ways?
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title></title>
<script type="text/javascript" src="https://www.google.com/jsapi"></script>
<script>google.load("jquery", "1.7.1");</script>
<script>
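$(document).ready(function() {
    // The value comes straight from user input and has not been sanitized.
    var badHTML = $('#sourceinput').val();
    // Unsafe: .html() parses the string as markup, so the script runs.
    $('#destinationdiv').html( badHTML );
    // Safe alternative: .text() inserts the string as plain text.
    //$('#destinationdiv').text( badHTML );
});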
</script>
</head>
<body>
<input type="hidden" id="sourceinput" value="<script>alert('hi');</script>" />
<div id="destinationdiv" style="width:10px;height:10px;background-color:red;"></div>
</body>
</html>
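The commented-out .text() line in the example points at the simplest fix when the value only needs to be shown as text. A rough plain-DOM counterpart, assuming a hypothetical helper named `showAsText` (not part of jQuery or the tooltip plugin), might look like this:

```javascript
// Hypothetical helper: display untrusted input as text, never as markup.
function showAsText(element, untrusted) {
  // Assigning textContent stores the string as plain text; the browser
  // does not parse it as HTML, so embedded tags are not executed.
  element.textContent = String(untrusted);
  return element;
}

// In the page above this would stand in for the .html() call, e.g.:
//   showAsText(document.getElementById('destinationdiv'), badHTML);
```

This only helps where the code controls how the value reaches the DOM. If a plugin callback expects an HTML string, the value has to be escaped instead.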
UPDATE: The solution I’m going with for now has three parts:
- When the page the user has created is saved, I run PHP’s strip_tags() on their input. These are just short text strings like titles and blurbs, so few users will expect they can enter HTML. That might not be appropriate for other situations.
- When the page the user created is displayed, instead of putting what the user had entered in an input value attribute, I put their input inside a div.
- I take the value out of that div using .text() (not .html()). I then run that through the underscore function (see below).
Testing this out – including simulating skipping the first step – seems to work. At least I’m hoping there isn’t something I missed.
Here’s the escape function used by Underscore.js, if you don’t want to use the entire Underscore library of functions:
Used like
It’s written well and is known to work, so I’d advise against rolling your own.
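For readers who want to see the idea behind such an escape function, here is a minimal sketch. It is an illustrative approximation, not Underscore's actual source: the name `escapeHtml` and the character table are assumptions, and the library's own `_.escape` remains the safer choice in practice.

```javascript
// Characters that are significant in HTML text and attribute values,
// mapped to their entity equivalents (assumed table for illustration).
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#x27;'
};

// Hypothetical helper: returns a string that is safe to insert as HTML text.
function escapeHtml(value) {
  // Treat null/undefined as empty and coerce everything else to a string.
  const text = value == null ? '' : String(value);
  return text.replace(/[&<>"']/g, function (ch) {
    return HTML_ESCAPES[ch];
  });
}

const escaped = escapeHtml("<script>alert('hi');</script>");
// escaped === "&lt;script&gt;alert(&#x27;hi&#x27;);&lt;/script&gt;"
```

With the page from the question, the escaped string could then be passed to .html() (or returned from a bodyHandler callback), and it would be displayed as literal text rather than executed.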