I’m trying to optimize my PHP and MySQL, but my understanding of SQL databases is shoddy at best. I’m creating a website (mostly for learning purposes) which allows users to make different kinds of posts (image/video/text/link).
Here are the basics of what I’m storing:
- Auto – int (key index)
- User ID – varchar
- Post id – varchar
- Post Type – varchar (YouTube, vimeo, image, text, link)
- File Name – varchar (original image name or link title)
- Source – varchar (external link or name of file + ext)
- Title – varchar (post title picked by user)
- Message – text (user’s actual post)
- Date – int (unix timestamp)
I have other data relevant to the post stored in other tables, which I grab with the post id (like user information), but I really doubt whether this is how I should be storing the information. I do use PDO, but I’m afraid this format might just be extremely slow.
Would there be any sense in storing the post information in another format? I don’t want excessively large tables, so from a performance standpoint should I store some information as a blob/binary/xml/json?
I can’t seem to find any good resources on PHP/MySQL optimization. Most information I come across tends to be 5-10 years old, content you have to pay for, too low-level, or just straight documentation which can’t hold my attention for more than half an hour.
What you have seems okay, but you have missed the important bit about indexes and keys.
Firstly, I am assuming that your primary key will be field 1. Okay, no problems there, but make sure that you also stick an index on userID, PostID, Date and probably a composite on UserID, Date.
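As a rough sketch of what that could look like in MySQL: the table name, column names and column lengths below are my own guesses based on the field list in the question, so adjust them to whatever you actually use.

```sql
-- Hypothetical table and column names, derived from the field list in the question.
CREATE TABLE posts (
    id        INT UNSIGNED NOT NULL AUTO_INCREMENT,  -- "Auto" column, primary key
    user_id   VARCHAR(32)  NOT NULL,
    post_id   VARCHAR(32)  NOT NULL,
    post_type VARCHAR(16)  NOT NULL,                 -- YouTube, vimeo, image, text, link
    file_name VARCHAR(255) NULL,
    source    VARCHAR(255) NULL,
    title     VARCHAR(255) NOT NULL,
    message   TEXT         NULL,
    created   INT UNSIGNED NOT NULL,                 -- unix timestamp ("Date")
    PRIMARY KEY (id),
    KEY idx_user_id (user_id),                       -- single-column indexes
    KEY idx_post_id (post_id),
    KEY idx_created (created),
    KEY idx_user_created (user_id, created)          -- composite on user + date
) ENGINE=InnoDB;
```

If the table already exists, the same indexes can be added afterwards with `ALTER TABLE ... ADD INDEX` instead of recreating it.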
Secondly, are you planning on having search functions on these? In that case you may need to enable full text searches.
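A minimal sketch of full text search, assuming MySQL 5.6 or later (where InnoDB supports FULLTEXT indexes). The table and column names are made up for the example.

```sql
-- Hypothetical table, kept small for illustration.
CREATE TABLE posts_search (
    id      INT UNSIGNED NOT NULL AUTO_INCREMENT,
    title   VARCHAR(255) NOT NULL,
    message TEXT         NULL,
    PRIMARY KEY (id),
    FULLTEXT KEY ft_title_message (title, message)
) ENGINE=InnoDB;

-- The column list in MATCH() has to match the columns of the FULLTEXT index.
SELECT id, title
FROM posts_search
WHERE MATCH(title, message) AGAINST ('holiday photos' IN NATURAL LANGUAGE MODE);
```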
Don’t muck around trying to store data in JSON or other such things. Store it plain and simple. The last thing you want to be doing is pulling a field out of the database just to see what is inside it. If your database can’t work it out, it is bad design.
On that note, there isn’t anything wrong with large tables. As long as they are indexed nicely, a small table or large table will make very little difference in terms of accessing it (short of huge badly written SQL joins), so worry about simplicity to be able to get the data back from it.
Edit: A primary key is a lovely way to identify a row by a unique column of some sort. So, if you want to delete a row, in your example, you might run
delete from yourTable where ID=6
and you know that this will only delete one row, as only one row can have ID=6.
On the other hand, a plain index serves a different purpose to a primary key, in that it is like a cheat-sheet for the database to know where certain information is inside the table. For example, if you have an index on the UserID column, when you pass a userID in a query, the database won’t have to look through the entire table; it looks at the index and knows the location of all the rows for that user.
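If you want to see whether a query actually uses an index, MySQL’s EXPLAIN will tell you. This is only a sketch with a made-up table and made-up names.

```sql
-- Hypothetical table and index names, for illustration only.
CREATE TABLE demo_posts (
    id      INT UNSIGNED NOT NULL AUTO_INCREMENT,
    user_id VARCHAR(32)  NOT NULL,
    title   VARCHAR(255) NOT NULL,
    PRIMARY KEY (id)
) ENGINE=InnoDB;

-- Plain (non-unique) index on the column used in the WHERE clause.
CREATE INDEX idx_user_id ON demo_posts (user_id);

-- Ask the optimizer how it plans to run the query.
EXPLAIN SELECT id, title FROM demo_posts WHERE user_id = 'abc123';
```

The `key` column of the EXPLAIN output shows which index, if any, the optimizer chose. It is worth checking this against a table with realistic data, since the optimizer’s choice can depend on how many rows there are.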
A composite index takes this one step further. If you know that you will constantly query data by both UserID and ContentType, you can add a composite index (meaning one index covering BOTH fields), which allows the database to return only the rows you ask for in a query using both those columns without having to sift through the entire table, or even through all of a user’s posts, to find the right content type.
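Something along these lines, again with a made-up table and names (I’m using post_type here for what the question calls Post Type):

```sql
-- Hypothetical table and index names, for illustration only.
CREATE TABLE demo_typed_posts (
    id        INT UNSIGNED NOT NULL AUTO_INCREMENT,
    user_id   VARCHAR(32)  NOT NULL,
    post_type VARCHAR(16)  NOT NULL,
    title     VARCHAR(255) NOT NULL,
    PRIMARY KEY (id)
) ENGINE=InnoDB;

-- One index covering both columns, user first and then type.
CREATE INDEX idx_user_type ON demo_typed_posts (user_id, post_type);

-- A query filtering on both columns can be answered from the composite index.
SELECT id, title
FROM demo_typed_posts
WHERE user_id = 'abc123'
  AND post_type = 'image';
```

Column order matters here. MySQL can use this index for queries that filter on user_id alone (the leftmost column), but a query that filters only on post_type generally cannot use it for the lookup, so put the column you always filter on first.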
Now, indexes take up some extra space on the server, so keep that in mind, but if your tables grow to be larger (which is perfectly fine) the improved efficiency is staggering.