I was planning to store user comments in a database and was wondering how to store the raw text that users will provide, since it could contain anything. What is a common/good practice regarding this? Do I need to parse the text, or is there a special storage type that I should be using? It seemed like a lot of overhead to be parsing user comments that could potentially be quite long, and I don’t want to be tampering with the intended meaning of a message. It also seemed strange to be treating a comment/forum post in the same manner as, say, a username/password and sanitizing it.
I am using sqlite3 and some scripts to query the db and was planning on implementing something along the lines of:
page_id  post_number  username  content
-------  -----------  --------  -------
1        1            user_23   blah there's blah "quote" blah;':".,-=.
But, of course, if I were to just expand my content param into my INSERT query, there are going to be all kinds of problems with ' " etc.
How should I be handling the content in this table? Should it even be in the table like this? What data type should I be using?
The data type to use depends on the overall size of the input.
TEXT would be my first choice.
For more on SQLite3 datatypes, see
http://www.sqlite.org/datatype3.html
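As a rough sketch of what a TEXT column could look like for the table in the question, here is a small Python example using the standard sqlite3 module. The table name `comments`, the in-memory database, and the composite primary key are assumptions made for illustration; they are not taken from the question.

```python
import sqlite3

# In-memory database used only for this illustration.
conn = sqlite3.connect(":memory:")

# Hypothetical schema mirroring the columns shown in the question.
conn.execute(
    """
    CREATE TABLE comments (
        page_id     INTEGER NOT NULL,
        post_number INTEGER NOT NULL,
        username    TEXT    NOT NULL,
        content     TEXT    NOT NULL,
        PRIMARY KEY (page_id, post_number)
    )
    """
)

# A fairly long comment (100,000 characters) to show that TEXT is not
# limited to short strings.
long_comment = "blah " * 20000
conn.execute(
    "INSERT INTO comments VALUES (?, ?, ?, ?)",
    (1, 1, "user_23", long_comment),
)

# Ask SQLite which storage class it used and how many characters it kept.
storage_class, length = conn.execute(
    "SELECT typeof(content), length(content) FROM comments"
).fetchone()
print(storage_class, length)
```

The comment is stored with the `text` storage class and comes back at its full length, so a single TEXT column should be enough for typical forum posts.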
I would recommend that you:
- Sanitize the input, only allowing necessary markup.
- Encode the content so it’s safe to insert into the database.
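One common way to make content safe to insert, sketched here in Python, is to let the database driver bind the value through `?` placeholders instead of building the INSERT string by hand. This is parameter binding rather than manual encoding, so the text is stored exactly as typed. The helper names `save_comment` and `render_comment` and the table name `comments` are hypothetical, and escaping with `html.escape` at display time is only one possible way to handle markup; it assumes the comments end up on an HTML page.

```python
import html
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments ("
    "page_id INTEGER, post_number INTEGER, username TEXT, content TEXT)"
)

def save_comment(conn, page_id, post_number, username, content):
    """Store the comment unchanged; the ? placeholders handle quoting."""
    conn.execute(
        "INSERT INTO comments (page_id, post_number, username, content) "
        "VALUES (?, ?, ?, ?)",
        (page_id, post_number, username, content),
    )
    conn.commit()

def render_comment(content):
    """Escape HTML special characters when the comment is displayed."""
    return html.escape(content)

# The awkward example text from the question, quotes and all.
raw = "blah there's blah \"quote\" blah;':\".,-=."
save_comment(conn, 1, 1, "user_23", raw)

stored = conn.execute(
    "SELECT content FROM comments WHERE page_id = ? AND post_number = ?",
    (1, 1),
).fetchone()[0]

rendered = render_comment("<script>alert('x')</script>")
print(stored == raw)
print(rendered)
```

With this approach the stored text is byte-for-byte what the user wrote, so the intended meaning is not altered, and any escaping happens only when the comment is rendered.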
If security is a serious concern, this processing should be done server-side. It won’t hurt to move some of this processing to JavaScript, which will reduce the work done at the server, though the server should still check the input so that it catches users trying to circumvent the client-side checks.
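A minimal sketch of such a server-side check in Python follows. The function name `validate_comment` and the 10,000-character limit are assumptions chosen for illustration; the real rules would depend on the site.

```python
# Hypothetical upper bound on comment length; pick whatever suits the site.
MAX_COMMENT_LENGTH = 10000

def validate_comment(content):
    """Re-check a comment on the server, regardless of client-side checks."""
    if not isinstance(content, str):
        raise ValueError("comment must be text")
    if not content.strip():
        raise ValueError("comment is empty")
    if len(content) > MAX_COMMENT_LENGTH:
        raise ValueError("comment is too long")
    # Return the text unchanged so the user's wording is preserved.
    return content

def is_valid(content):
    """Convenience wrapper that reports validity as a boolean."""
    try:
        validate_comment(content)
        return True
    except ValueError:
        return False

ok_result = is_valid("blah there's blah \"quote\" blah")
empty_result = is_valid("   ")
too_long_result = is_valid("x" * (MAX_COMMENT_LENGTH + 1))
print(ok_result, empty_result, too_long_result)
```

Because the check runs on the server, it still applies when someone bypasses the JavaScript and posts to the endpoint directly.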