I'm having scaling issues with queries across my database: the query I use for a user with 2k friends takes a very long time to run for a user with 10k friends.
Typically, once a user's friend count passes a certain threshold, I end up having to use STRAIGHT_JOIN for the query, but that means writing a conditional statement that first checks how many friends the user has and goes from there. The more data there is, the slower the query gets.
Is there a better way to scale queries in MySQL so they run at the same speed no matter how much data is generated, or am I living in a fantasy world?
EDIT: Query listed below:
SELECT photos.photo_id, COUNT(DISTINCT photo_views.ip_address) AS total_count
FROM photos
INNER JOIN friends ON friends.friend_id = photos.user_id
INNER JOIN photo_views ON photos.photo_id = photo_views.photo_id
WHERE friends.user_id = 1 AND friends.approved = 1
AND photos.created_at >= DATE_SUB(NOW(), INTERVAL 30 DAY)
GROUP BY photos.photo_id
ORDER BY total_count DESC
Indexes on photos are:
[user_id, created_at, photo_id], [user_id], [photo_id] - PRIMARY
Indexes on friends are:
[user_id, approved], [friend_id], [user_id, friend_id] - PRIMARY
Index on photo_views is:
[photo_id] - PRIMARY
I would start with the friends table FIRST as your qualifier, so the query doesn't use photos as its basis, which would cover everyone. Since you are doing an INNER JOIN, I would go with STRAIGHT_JOIN, such as:
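A minimal sketch of such a query, assuming the table and column names from the question. The aliases (f, p, pv) and the placement of the date filter inside the join condition are my own reading of the description, not a confirmed original:

```sql
-- STRAIGHT_JOIN makes MySQL read the tables in the order they are listed,
-- so friends is the driving table instead of photos.
SELECT STRAIGHT_JOIN
       p.photo_id,
       COUNT(DISTINCT pv.ip_address) AS total_count
FROM friends f
  -- only photos belonging to those friends, limited to the last 30 days
  JOIN photos p
    ON p.user_id = f.friend_id
   AND p.created_at >= DATE_SUB(NOW(), INTERVAL 30 DAY)
  -- views for each qualifying photo
  JOIN photo_views pv
    ON pv.photo_id = p.photo_id
WHERE f.user_id = 1
  AND f.approved = 1
GROUP BY p.photo_id
ORDER BY total_count DESC;
```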
This way, you are starting ONLY with the approved friends of the one user. From there, join to those friends' photos with the date qualification applied, and from that, count the hits per photo from the photo_views table.