I’m wondering what’s better for my server (speed, etc.), considering CPU, bandwidth and disk space usage.
Currently my server is about to explode: too many MySQL/PHP requests, and so on. That’s why I’m optimizing my application (discussed in this question: Best way to scale data, decrease loading time, make my webhost happy).
Now, what’s the best solution to decrease CPU, bandwidth and disk space usage?
1) Fetch a single big record from a table (100.000+ records, let’s say 20 KB/record) and handle the fetch with PHP => only 1 request, but the result may cause a heavy server load?
2) Fetch multiple small records from a table (1.000.000+ records, let’s say 1 KB/record) => significantly more MySQL requests needed to get the same result as in method 1
Method 1 will cause the database to grow to many GB (10+). With method 2 the database will be smaller, but I’m not sure how running a lot of queries will affect the performance of my application.
Does returning a mysql_result() from a table of 1.000.000+ records take more time because it needs to scan all the rows for a specific record?
I hope you can tell me which method is better for decreasing CPU, bandwidth and disk space usage!
Edit
I currently have one table: facebook_id, friends_json.
In friends_json, the uid AND name of every friend of this facebook_id user is stored. Using this method, every record is about 10kb. Once this record is requested, I don’t have to do extra requests to fetch the name of a friend: this is already included in the friends_json.
My question is whether it is better to store only the friends’ uids in friends_json, so that for each friend I have to run a query against another table (friends_names) to fetch that friend’s name (and, if it is not available there, request it from Facebook). This second method saves disk space, but I have to do a large number of requests before I can show the user a result.
The goal is that I have to compare the list of friends in my database with a current list of friends. If a user deleted his/her Facebook profile, I can’t request the corresponding name anymore, that’s why I have to save the names in my database.
Since the question is not clear enough (or I can’t understand it correctly), I will assume that you have one table with two columns, facebook_id and friends_json, and that you are requesting all the friends of friends. This is the worst case I can think of. Still, all you have to do is run two simple queries:
None of the queries above needs to scan the whole table (and that’s the worst case).
If you can give more info about your table structure and your goal (what you want to retrieve from that data), we can help more.
Edit: Nothing can save your server if you have to do a table scan on every hit.
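The answer's original queries are not shown here, so the following is only a sketch of what two index-only lookups could look like. It uses Python with an in-memory SQLite database as a stand-in for PHP/MySQL. The table name `friends`, the sample rows and the helper `friends_of_friends` are assumptions for illustration; only the columns facebook_id and friends_json come from the question.

```python
import json
import sqlite3

# Hypothetical table name "friends"; the two columns come from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE friends (facebook_id INTEGER PRIMARY KEY, friends_json TEXT)"
)
sample = {
    1: [{"uid": 2, "name": "Bob"}, {"uid": 3, "name": "Carol"}],
    2: [{"uid": 1, "name": "Alice"}, {"uid": 4, "name": "Dave"}],
    3: [{"uid": 1, "name": "Alice"}],
}
conn.executemany(
    "INSERT INTO friends VALUES (?, ?)",
    [(uid, json.dumps(friends)) for uid, friends in sample.items()],
)


def friends_of_friends(conn, facebook_id):
    """Return {friend_uid: that friend's decoded friend list} using two queries."""
    # Query 1: primary-key lookup for the user's own friend list.
    row = conn.execute(
        "SELECT friends_json FROM friends WHERE facebook_id = ?", (facebook_id,)
    ).fetchone()
    if row is None:
        return {}
    friend_ids = [friend["uid"] for friend in json.loads(row[0])]
    if not friend_ids:
        return {}
    # Query 2: one IN (...) lookup on the primary key for all friends at once.
    placeholders = ",".join("?" * len(friend_ids))
    cursor = conn.execute(
        "SELECT facebook_id, friends_json FROM friends "
        f"WHERE facebook_id IN ({placeholders})",
        friend_ids,
    )
    return {fid: json.loads(blob) for fid, blob in cursor}


result = friends_of_friends(conn, 1)
```

Both statements filter on the primary key, so neither should need a full table scan regardless of how many rows the table holds.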
Edit:
As long as you get the result by hitting an index, the size of the table or of the row won’t affect performance as much as you think. A join just to get names, when the uids are kept denormalized in the JSON, is not the way to go. Either you keep a “users” table with “uid, name” columns and a friendship table with “uid1, uid2”, or you keep denormalized data that includes both uid and name.
As for comparing the new and old friend lists, you should do that in PHP anyway, using uids (not names): get the friend list from Facebook, compare it with the current friend list, find the differences and apply them to the database. In this case you shouldn’t need a table scan at any point in your application.
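The comparison step could be sketched as follows. This is written in Python rather than PHP, and the function name `diff_friends` and the sample data are made up for illustration. It assumes both lists have already been loaded into dictionaries keyed by uid.

```python
def diff_friends(stored, current):
    """Compare two friend lists given as {uid: name} dictionaries.

    Returns (added, removed): friends present only in the current list,
    and friends present only in the stored list.
    """
    added = {uid: current[uid] for uid in current.keys() - stored.keys()}
    removed = {uid: stored[uid] for uid in stored.keys() - current.keys()}
    return added, removed


# Hypothetical data: what the database holds vs. what Facebook returns now.
stored = {1: "Alice", 2: "Bob", 3: "Carol"}
current = {2: "Bob", 3: "Carol", 4: "Dave"}

added, removed = diff_friends(stored, current)
```

Because the removed entries are taken from the stored list, their names are still available even if the profile can no longer be requested from Facebook.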
Here is the normal way to do it (without json):
fb_users table : uid, name, is_app_user (PK: uid)
fb_friends table : uid1, uid2 (PK: uid1, uid2)
Get friends SQL query:
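The original query is missing here, so this is one possible version, run through Python's sqlite3 module so it can be tried without a MySQL server. It assumes a friendship may be stored in either direction, which is why it looks at both uid1 and uid2. The extra index on uid2 and the sample rows are assumptions, not part of the answer's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE fb_users (uid INTEGER PRIMARY KEY, name TEXT, is_app_user INTEGER);
    CREATE TABLE fb_friends (uid1 INTEGER, uid2 INTEGER, PRIMARY KEY (uid1, uid2));
    -- Assumed extra index so that lookups by uid2 do not scan the table.
    CREATE INDEX idx_fb_friends_uid2 ON fb_friends (uid2);
    INSERT INTO fb_users VALUES (1, 'Alice', 1), (2, 'Bob', 0), (3, 'Carol', 0);
    INSERT INTO fb_friends VALUES (1, 2), (1, 3), (2, 3);
    """
)

# Friends are the users on the other side of any row that mentions :uid.
GET_FRIENDS = """
SELECT u.uid AS uid, u.name AS name
  FROM fb_friends f JOIN fb_users u ON u.uid = f.uid2
 WHERE f.uid1 = :uid
UNION
SELECT u.uid AS uid, u.name AS name
  FROM fb_friends f JOIN fb_users u ON u.uid = f.uid1
 WHERE f.uid2 = :uid
ORDER BY uid
"""


def get_friends(conn, uid):
    """Return a list of (uid, name) tuples for all friends of the given user."""
    return conn.execute(GET_FRIENDS, {"uid": uid}).fetchall()


friends_of_bob = get_friends(conn, 2)
```

If every friendship were stored in both directions instead, the first half of the UNION alone would be enough.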
To add users, there is a neat trick, used most of the time, that also refreshes the name on every insert so that name changes are picked up:
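The statement itself is missing from the text. A common way to get this behaviour is an upsert, which in MySQL is written `INSERT ... ON DUPLICATE KEY UPDATE`; whether that is exactly what the answer used is an assumption. The sketch below shows the same idea with SQLite's `ON CONFLICT ... DO UPDATE` clause (SQLite 3.24 or newer) from Python. The helper name `save_user` and the sample names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fb_users ("
    "uid INTEGER PRIMARY KEY, name TEXT, is_app_user INTEGER DEFAULT 0)"
)

# Insert a new user, or overwrite only the name if the uid already exists.
UPSERT_USER = """
INSERT INTO fb_users (uid, name) VALUES (?, ?)
ON CONFLICT(uid) DO UPDATE SET name = excluded.name
"""


def save_user(conn, uid, name):
    """Create the user row or refresh its name in a single statement."""
    conn.execute(UPSERT_USER, (uid, name))


save_user(conn, 7, "Ann Smith")
conn.execute("UPDATE fb_users SET is_app_user = 1 WHERE uid = 7")
save_user(conn, 7, "Ann Jones")  # same uid, changed name

rows = conn.execute("SELECT uid, name, is_app_user FROM fb_users").fetchall()
```

Only the name column is touched on conflict, so other columns such as is_app_user keep their values.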
To add friends, you can use a trick as well, so you don’t have to worry about having both A B and B A at the same time:
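The original statement is not shown, so this is only a guess at the kind of trick meant: always store the smaller uid in uid1 and ignore duplicates, so a pair can exist only once. In MySQL this could be done with `INSERT IGNORE` together with `LEAST()` and `GREATEST()`; the sketch below does the ordering in Python and uses SQLite's `INSERT OR IGNORE`. The helper name `add_friendship` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fb_friends (uid1 INTEGER, uid2 INTEGER, PRIMARY KEY (uid1, uid2))"
)


def add_friendship(conn, a, b):
    """Store the pair with the smaller uid first; duplicates are skipped."""
    low, high = min(a, b), max(a, b)
    # The primary key (uid1, uid2) rejects a repeated pair, and OR IGNORE
    # turns that rejection into a silent no-op.
    conn.execute(
        "INSERT OR IGNORE INTO fb_friends (uid1, uid2) VALUES (?, ?)", (low, high)
    )


add_friendship(conn, 5, 2)
add_friendship(conn, 2, 5)  # same friendship, other direction

pairs = conn.execute("SELECT uid1, uid2 FROM fb_friends").fetchall()
```

With this layout, a query for one user's friends has to check both uid1 and uid2.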
These are just tricks in case you decide to keep your data relational, but I would suggest keeping it denormalized anyway. Your JSON method is what is used in most cases, and don’t worry about space too much: data size is usually not what blocks servers. The way you request data (code) and the way you grab it (SQL queries) is where you should tune.