a bit of a PHP / MySQL newbie here…
I’ve been building a PHP-based site that uses a MySQL database for storing user information, like their display names, usernames, and passwords.
I’ve been learning about escaping, prepared statements and the like, and how to prevent SQL injections like “bobby'); drop table users--”.
I’m using PDO prepared statements to get user input from forms, in order to register them into the DB. However, I need to know a few things:
- Since I am using prepared statements, for display names, usernames,
  passwords, etc, is it okay for me to allow special characters like
  @, #, $, or even ‘single’ or “double” quotes? And what about spaces,
  international characters, characters with accents, or things like ♥?
  And when I ask if it’s “okay” to allow these characters, I’m
  wondering if there are any further security risks that may arise
  from allowing quotes or parentheses in people’s usernames, or things
  like html tags for bold or italics?
- If it is okay to allow most special characters, but not some: are
  there any specific “dangerous” characters (in the scope of MySQL)
  which I absolutely need to make illegal? (I feel like quotes may fit
  this agenda, but I’m getting mixed signals on that.)
- If I were to allow characters outside of the typical “alphanumeric
  and underscore” range, are there any pitfalls I may experience later
  (in MySQL, SQL, or PHP) from allowing strange characters? Will I
  need to somehow make html tags appear as strings, rather than actual
  tags, when displaying people’s usernames? Or would I need to escape
  quotes in people’s usernames whenever I wanted to query with them?
  Or does none of this matter since I’ll be using prepared statements
  with PDO?
- Do charsets like utf8 or utf16 come in anywhere, in making it so I
  can accept the widest range of display names and usernames, while
  still making sure those alphabets can be rendered on my website?
- I know that there are some Cyrillic letters that look identical to
  English ones. I used to copy these straight out of MS Word and use
  them in my usernames. I realize that these can be used to
  perceptually-impersonate other members, simply by swapping out an
  English “a” for a Cyrillic “a”. Usernames with ♥ in them may be hard
  to search for if someone isn’t well-versed in alt-code. Should this
  be a concern? What is your opinion on this?
Thanks in advance to whoever can give me some insight on this.
First let me say that I really like your style. It appears that most people don’t take the time to think these things through, and just slap together queries with no data sanitization at all. So congratulations on being diligent. 🙂
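That said, with PDO prepared statements you shouldn’t have to worry about quotes messing up your queries, especially if you bind your variables with bindParam(), which gives you stricter parameter control: you can declare each parameter’s data type and, optionally, a length. Special characters won’t mess up your query either, because bound values are passed to the database as data rather than spliced into the SQL text (and when PDO emulates prepared statements, it quotes the values for you). So no need to worry about that.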
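To make that concrete, here is a minimal sketch of binding awkward input with bindParam(). It is only an illustration under some assumptions: the users table and its columns are made up, and an in-memory SQLite database stands in for MySQL so the snippet runs without a server (with MySQL, only the DSN and credentials passed to the PDO constructor should need to change).

```php
<?php
// Assumption: SQLite in memory stands in for MySQL so this runs anywhere.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical table, for illustration only.
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, display_name TEXT)');

// Input full of characters that would break a hand-concatenated query.
$username    = "bobby'); drop table users--";
$displayName = 'Bobby "♥" O\'Tables';

// The SQL text contains only placeholders; the values are bound separately.
$insert = $pdo->prepare(
    'INSERT INTO users (username, display_name) VALUES (:username, :display_name)'
);
$insert->bindParam(':username', $username, PDO::PARAM_STR);
$insert->bindParam(':display_name', $displayName, PDO::PARAM_STR);
$insert->execute();

// Querying by that same username needs no manual escaping either.
$select = $pdo->prepare('SELECT display_name FROM users WHERE username = :username');
$select->bindParam(':username', $username, PDO::PARAM_STR);
$select->execute();
$stored = $select->fetchColumn();

// The table still exists and holds exactly the one row inserted above.
$userCount = (int) $pdo->query('SELECT COUNT(*) FROM users')->fetchColumn();
```

The value comes back exactly as it went in, quotes and heart included, which is the point: the database never interprets bound values as SQL.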
As for making HTML appear as text instead of actual HTML, a very useful function is htmlspecialchars(), which converts the characters that are special in HTML (&, <, > and quotes) into HTML entities. With the optional ENT_QUOTES flag it converts both kinds of quotes, so " becomes &quot; and ' becomes &#039;. htmlspecialchars() also takes an optional encoding argument, so you can tell it which character set your strings use (UTF-8, for example).
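As a rough sketch of that advice, this is how a display name containing tags and quotes could be escaped just before it is printed into a page. The sample name and the surrounding markup are invented for the example.

```php
<?php
// Hypothetical display name containing markup and both kinds of quotes.
$displayName = '<b>Bob</b> "the" O\'Neil & co';

// ENT_QUOTES converts single as well as double quotes; the third argument
// tells htmlspecialchars() that the string is UTF-8.
$safe = htmlspecialchars($displayName, ENT_QUOTES, 'UTF-8');

// The browser now shows the tags as literal text instead of rendering them.
echo '<p>Welcome, ' . $safe . '</p>', PHP_EOL;
```

Note that the name is stored in the database as typed and only escaped at output time, so the same stored value can still be used for searches or other output formats.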