Every text field in this project is just normal plain text, without UBB or HTML tags.
However, it must be Unicode, so an array of supported characters isn’t optimal, I guess.
I know there are a lot of dedicated classes for XSS detection; however, don’t all XSS attacks include these 2 characters:
The char <
<
%3C
&lt;
&#60;
&60;
&#x3C;
PA==
The char ;
;
%3B
&#59;
&#x3B;
Ow==
&59;
If I check all user input (GET, POST, cookie) for the characters in the code blocks above, then everything should be 100% safe?
The project isn’t running on MySQL; it’s using Cassandra, so SQL injection shouldn’t be a problem.
I’m sure I’m forgetting something, but I don’t know what…
Or is it really so easy to build 100% safe apps when the user input is plain text?
Edit:
Both lists are a little longer; I found one for the first char here:
http://ha.ckers.org/xss.html
<
%3C
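&lt
&lt;
&LT
&LT;
&#60
&#060
&#0060
&#00060
&#000060
&#0000060
&#60;
&#060;
&#0060;
&#00060;
&#000060;
&#0000060;
&#x3c
&#x03c
&#x003c
&#x0003c
&#x00003c
&#x000003c
&#x3c;
&#x03c;
&#x003c;
&#x0003c;
&#x00003c;
&#x000003c;
&#X3c
&#X03c
&#X003c
&#X0003c
&#X00003c
&#X000003c
&#X3c;
&#X03c;
&#X003c;
&#X0003c;
&#X00003c;
&#X000003c;
&#x3C
&#x03C
&#x003C
&#x0003C
&#x00003C
&#x000003C
&#x3C;
&#x03C;
&#x003C;
&#x0003C;
&#x00003C;
&#x000003C;
&#X3C
&#X03C
&#X003C
&#X0003C
&#X00003C
&#X000003C
&#X3C;
&#X03C;
&#X003C;
&#X0003C;
&#X00003C;
&#X000003C;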
\x3c
\x3C
\u003c
\u003C
There is no point trying to blanket-forbid “evil” characters in input, and even less point trying to forbid versions encoded in various different forms. You will false-positive and block valid input, whilst not protecting yourself from every possible form of injection hole. I’m not sure what kind of attack you’re trying to prevent by banning Ow== but not, say, & or ".
The correct way to stop HTML-injection is to call htmlspecialchars() on any text string being output into an HTML page. The correct way to stop URL-component-injection is to call rawurlencode() on text strings being output into a URL. The correct way to stop SQL-injection is to call the relevant DB escaping function (e.g. mysql_real_escape_string()) on any text being output into an SQL string literal.
And so on, for every different form of escaping you might come across. The point is, this is an output-level function that has to be applied as and when you put text into a new context, using the right function for the type of context you have. It’s not something you can do once at the input stage and then forget about, because you don’t know at the input stage whether the text you’re handling is going to end up in an SQL literal, an HTML page, a JavaScript string literal, a URL parameter, or what.
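To make the "escape per output context" idea concrete, here is a minimal sketch. The helper names (escapeHtml, escapeUrlComponent, escapeJsString) and the sample input are made up for illustration; only htmlspecialchars(), rawurlencode() and json_encode() are real PHP functions. Using json_encode() with the JSON_HEX_* flags for the JavaScript string case is one common approach and my own assumption, not something prescribed above. Note that the sample input contains neither < nor ; so a blacklist on those two characters would let it straight through.

```php
<?php
// Hypothetical user input: no "<" and no ";" anywhere in it.
$input = '" onmouseover="alert(1)';

// HTML context (element content or a quoted attribute value).
// ENT_QUOTES escapes single quotes as well as double quotes.
function escapeHtml(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// URL component context (one path segment or one query-string value).
function escapeUrlComponent(string $text): string
{
    return rawurlencode($text);
}

// JavaScript string literal context inside a <script> block.
// The JSON_HEX_* flags emit <, >, &, ' and " as \uXXXX escapes.
function escapeJsString(string $text): string
{
    return json_encode(
        $text,
        JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT
    );
}

// The same text, escaped differently depending on where it lands.
echo '<input value="' . escapeHtml($input) . '">', "\n";
echo '<a href="/search?q=' . escapeUrlComponent($input) . '">search</a>', "\n";
echo '<script>var q = ' . escapeJsString($input) . ';</script>', "\n";
```

Output without any escaping would have closed the value attribute and injected an onmouseover handler. With escapeHtml() the quotes come out as &quot; and the text stays inside the attribute, while a legitimate value such as 1 < 2 is still accepted and simply rendered as 1 &lt; 2.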
That’s not to say input-stage validation is useless; you will want it to make sure a submitted field that’s supposed to be a number does actually look like a number, or a date a date, or whatever. But input validation is not a solution to output-escaping problems like the HTML-injection issue that causes most XSS. To make that work, you’d have to ban pretty much all punctuation, which would be pretty user-hostile.
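For the kind of input-stage validation that is worth doing, something along these lines could work. This is only a sketch: parseIntField() and parseDateField() are hypothetical helpers, and the YYYY-MM-DD date format is an assumption picked for the example. filter_var() with FILTER_VALIDATE_INT and DateTimeImmutable::createFromFormat() are standard PHP.

```php
<?php
// Returns the integer value, or null when the field is not a plain
// integer inside the allowed range.
function parseIntField(string $raw, int $min, int $max): ?int
{
    $value = filter_var($raw, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => $min, 'max_range' => $max],
    ]);
    // Compare against false explicitly so that a valid 0 is kept.
    return $value === false ? null : $value;
}

// Returns a DateTimeImmutable for a strict YYYY-MM-DD date, or null.
function parseDateField(string $raw): ?DateTimeImmutable
{
    $date = DateTimeImmutable::createFromFormat('!Y-m-d', $raw);
    // The round-trip comparison rejects overflowing dates such as
    // 2024-02-31, which createFromFormat() would roll into March.
    if ($date === false || $date->format('Y-m-d') !== $raw) {
        return null;
    }
    return $date;
}
```

This checks that a field has the shape it is supposed to have and nothing more. A free-text field like a name or a comment has no such shape to check, so it still needs htmlspecialchars() (or whichever escaping function fits) at the point where it is output.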