I have a site where users can register with a username of their choosing, submit large blocks of text, and add comments. Currently, to avert XSS, I run strip_tags on the data on input to the database, and I only output the data in the body, never in an attribute.
I’m currently making changes to the site, one of which is to make a user page which is loaded when someone clicks on the username (a link). This would look like:
<a href="example.com/user/<?php echo $username; ?>">...</a>
I’m worried that for the $username variable, someone could insert
<a href="example.com/user/user" onClick="javascript:alert('XSS');">...</a>
I’ve read a bunch of the other SO posts on this, but none gave a black-and-white answer. If I use the following on all text on output, in addition to strip_tags on input:
echo htmlspecialchars($string, ENT_QUOTES, 'UTF-8');
is that going to be enough to stop all XSS attacks, including those using the inline javascript: syntax?
Also, is there any way to remove actual html tags without removing things like “Me > you”?
Thanks!
Escaping depends on the context. If the value goes into a URL, use URL encoding (%xx), and also check that the full URL does not start with “javascript:”. The “javascript:” prefix in your onclick attribute is not required: onclick is a JavaScript event handler, so any JavaScript inside it will run.
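As a rough sketch of the URL case, assuming the username ends up as a single path segment as in the question: the helper names userPageHref and isSafeLinkUrl are hypothetical, and the scheme check uses an allowlist (http/https only) rather than just rejecting “javascript:”, which is my own choice and stricter than what is described above.

```php
<?php
// Hypothetical helper: builds the href value for a user page link.
function userPageHref(string $username): string
{
    // rawurlencode() percent-encodes everything except letters, digits
    // and -_.~ so quotes and spaces cannot break out of the path segment.
    $path = '/user/' . rawurlencode($username);

    // The result is placed inside an HTML attribute, so escape for that
    // context as well.
    return htmlspecialchars($path, ENT_QUOTES, 'UTF-8');
}

// Hypothetical check for a full, user-supplied URL. Assumption: only
// absolute http/https links are wanted, so relative URLs are rejected too.
function isSafeLinkUrl(string $url): bool
{
    $scheme = parse_url(trim($url), PHP_URL_SCHEME);

    return is_string($scheme)
        && in_array(strtolower($scheme), ['http', 'https'], true);
}
```

In a template this would be used as `<a href="<?php echo userPageHref($username); ?>">...</a>`. A username such as `user" onClick="alert(1)` then ends up as inert percent-encoded text inside the href instead of a new attribute.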
See the OWASP XSS Prevention Cheat Sheet for how to escape in each context.
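For the plain HTML body context, here is a minimal sketch of escaping on output with the htmlspecialchars call from the question. The wrapper name escapeHtml is hypothetical. It also suggests one way to keep text like “Me > you” intact: escape on output instead of stripping tags, so the characters are displayed literally rather than removed.

```php
<?php
// Hypothetical wrapper around the htmlspecialchars() call from the question.
function escapeHtml(string $text): string
{
    // Converts &, <, >, " and ' to HTML entities, so the browser renders
    // them as literal characters instead of parsing them as markup.
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

echo escapeHtml('Me > you <script>alert(1)</script>'), PHP_EOL;
// prints: Me &gt; you &lt;script&gt;alert(1)&lt;/script&gt;
```

The browser shows that output as the original text, angle brackets included, without executing anything. This only covers the body context; attribute, URL, and script contexts each need their own handling.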