I am looking to secure my code against XSS attacks, yet all of the examples I have been reading deal with direct user input validation (such as in a contact form or a login).
I’m a bit confused as to whether I need to protect my code if there is no way to input anything directly (i.e., my website only reads from a database and does not write to it). I still think I need to, because I class my database as an external source, and the data in the variables being echoed still comes from elsewhere.
Am I right in thinking that any data read still constitutes user input and should be treated accordingly? Also, if I then added a contact form, would I need to then validate/sanitise/escape every piece of information pulled from my database in every page, or only deal with it at the form itself?
Forget the term “user input” and think in terms of “unknown strings”. Any string whose contents you do not know for a fact is potentially dangerous or disruptive in the right context.
It’s also important to remember there is no single solution for all cases. For example these all may require different types of sanitizing or escaping:
    <a href="$unknown">
    <p>$unknown</p>
    <script>var B = $unknown;</script>
    SELECT * from $unknown
    .myClass { color:$unknown; }

In general you should (if possible) avoid using unknown data in HTML attributes, CSS, or JavaScript, because those are places where it can get complicated. For most cases, simply escaping the HTML characters is all you need to do.
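To make the difference between contexts concrete, here is a minimal sketch in Python (chosen only for illustration; the question does not name a language). The `unknown` value and the `/search?q=` URL are made up for the example. It shows the same string escaped three different ways using standard-library helpers. It is not a complete defence: a whole URL placed in `href`, for instance, would also need its scheme checked.

```python
import html
import json
from urllib.parse import quote

# Hypothetical untrusted value; it could just as well come from a database row.
unknown = '"><script>alert(1)</script>'

# HTML element content: escape &, < and > (html.escape also escapes quotes by default).
as_text = "<p>" + html.escape(unknown) + "</p>"

# URL query component inside an href attribute: percent-encode the value first,
# then HTML-escape the result because it sits inside an attribute.
as_href = '<a href="/search?q=' + html.escape(quote(unknown, safe="")) + '">search</a>'

# JavaScript inside <script>: JSON-encode the string, then replace "<" so the
# value cannot contain "</script>" and close the block early.
as_js = "<script>var B = " + json.dumps(unknown).replace("<", "\\u003c") + ";</script>"

print(as_text)
print(as_href)
print(as_js)
```

Note that HTML escaping alone would not have been the right choice for the `<script>` case, and percent-encoding alone would not have been right for the paragraph; each context gets its own encoder.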
The key word here is context, which is one reason why you want to escape on output rather than “sanitize” on input. The same data could be used in different contexts and require different measures of escaping or filtering.
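As a rough illustration of escaping at output, the sketch below stores a value unchanged and only encodes it when it is rendered. It uses Python with an in-memory SQLite database. The `comments` table and the `render_comment_html` and `render_comment_json` helpers are hypothetical names invented for this example.

```python
import html
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (body TEXT)")

# Store the data exactly as received. The parameterized query protects the
# SQL context; no HTML escaping happens here.
raw = "Tom & Jerry <3 <script>alert(1)</script>"
conn.execute("INSERT INTO comments (body) VALUES (?)", (raw,))

def render_comment_html(body: str) -> str:
    """Escape for HTML element content at the moment of output."""
    return "<p>" + html.escape(body) + "</p>"

def render_comment_json(body: str) -> str:
    """Encode for a JSON API response; HTML escaping would corrupt the data here."""
    return json.dumps({"body": body})

# Data read back from the database is still an unknown string.
(stored,) = conn.execute("SELECT body FROM comments").fetchone()
page = render_comment_html(stored)
api = render_comment_json(stored)

print(page)
print(api)
```

Because the stored value is untouched, the HTML page and the JSON response can each apply the encoding their own context needs, and every read from the database is treated as unknown regardless of how the row got there.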
I highly suggest using OWASP as a resource to learn about XSS and security in general: https://www.owasp.org/index.php/Cross-site_Scripting_(XSS)