I am trying to put together a checklist of things I need to keep in mind when creating forms. I know I need to filter input content. I am already filtering for errant HTML and scripts, escaping for MySQL, limiting to data types (phone numbers are 10+ digits with trailing extension digits, email has to be a valid email address, strings cannot contain HTML or code, etc.), and enforcing word or character limits (names max out at 4 words separated by whitespace, etc.). But what else should I be doing, and what are good ways of doing it?
This validation will take place on the server, but I am looking for best practices across platforms. The data will come in via POST, so I don't have to worry too much about mucking about with the URL. Also, the form presentation, including hinting and JS input masking, is handled, and pretty much all the client-side work is in place.
Validation, reduced to its simplest terms, means only accepting what you want.
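A minimal sketch of that allow-list idea in Python, since no server language is named here. The field names (`phone`, `username`), the `ALLOWED_FIELDS` table, and the `validate` helper are all hypothetical; the two patterns are taken from the phone and username examples in this answer. Anything that is not explicitly allowed is rejected, including fields the form never asked for.

```python
import re

# Hypothetical allow-list: field name -> pattern the WHOLE value must match.
ALLOWED_FIELDS = {
    "phone": re.compile(r"[0-9]{7,20}"),
    "username": re.compile(r"[a-zA-Z0-9][a-zA-Z0-9_]{0,14}"),
}


def validate(form):
    """Return a dict of field -> error; an empty dict means the form is acceptable."""
    errors = {}
    # Reject anything we did not ask for.
    for name in form:
        if name not in ALLOWED_FIELDS:
            errors[name] = "unexpected field"
    # Require every expected field and check its format.
    for name, pattern in ALLOWED_FIELDS.items():
        value = form.get(name)
        if value is None:
            errors[name] = "missing"
        # fullmatch must consume the entire string, so a valid prefix
        # followed by junk (e.g. "5551234<script>") is still rejected.
        elif pattern.fullmatch(value) is None:
            errors[name] = "invalid format"
    return errors
```

The point of using `fullmatch` rather than a plain search is that an unanchored pattern would happily accept a value that merely contains a valid substring.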
For example, if your telephone field should contain only digits (in no particular phone number format) and be no longer than 20 digits, you can check it against a regular expression to make sure it is what you want to accept, such as:
^([0-9]{7,20})$
Another example is Twitter. It only accepts usernames of up to 15 characters, made up of letters, digits, and underscores. So the validation regex might be something like:
^([a-zA-Z0-9]{1})([a-zA-Z0-9_]{0,14})$
Form validation can also take the form of security checks, such as honeypots, form validity (timing) checks, and so on.
Form honeypot: a hidden field that humans leave empty but bots tend to fill in, which helps prevent automated or spam submissions of your form.
Form validity: check the time elapsed between the form loading and its submission. If it is too short, the form was probably submitted by a bot. If it is too long, the data might be old and expired.
CAPTCHA: another level of bot prevention, i.e. human-only form validation.
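The honeypot and timing checks can be sketched server-side roughly as follows, in Python. Everything named here is an assumption for illustration: the hidden field `website`, the hidden field `form_token`, the `SECRET_KEY`, and the two thresholds. The token is a timestamp signed with an HMAC so a client cannot simply forge an older render time.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"replace-with-a-real-secret"  # assumption: a server-side secret
MIN_SECONDS = 3        # assumed threshold: faster than this looks automated
MAX_SECONDS = 60 * 60  # assumed threshold: older than this is treated as expired


def _sign(issued):
    return hmac.new(SECRET_KEY, issued.encode(), hashlib.sha256).hexdigest()


def issue_token(now=None):
    """Create a signed 'form rendered at' token to embed in a hidden field."""
    issued = str(int(time.time() if now is None else now))
    return f"{issued}:{_sign(issued)}"


def check_submission(form, now=None):
    """Return None if the submission passes, else a short rejection reason."""
    # Honeypot: hypothetical field hidden from humans via CSS; bots fill it in.
    if form.get("website", ""):
        return "honeypot filled"
    # Timing: verify the signature before trusting the timestamp.
    issued, _, sig = form.get("form_token", "").partition(":")
    if not hmac.compare_digest(sig.encode(), _sign(issued).encode()):
        return "bad token"
    elapsed = (time.time() if now is None else now) - int(issued)
    if elapsed < MIN_SECONDS:
        return "submitted too fast"
    if elapsed > MAX_SECONDS:
        return "form expired"
    return None
```

In use, `issue_token()` would be called when rendering the form and `check_submission(request_fields)` when handling the POST. The thresholds are guesses and would need tuning per form; a long form legitimately takes more than an hour for some users.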