I’m trying to optimize an SQL query. Can you help me?
Basically, each user has friends through a friendships table, and each user has many feed_events through a user_feed_events table.
I’m trying to list the feed_events of the friends of a given user. Shouldn’t be impossible, right? 🙂
As you can see below, the performance of the query depends on how many friends a user has. Right now the query takes almost 7 seconds for a user with 150 friends.
UPDATE: here is how my friendships table is built:
create_table "friendships", :force => true do |t|
  t.integer  "user_id",     :null => false
  t.integer  "friend_id",   :null => false
  t.datetime "created_at"
  t.datetime "accepted_at"
end

add_index "friendships", ["friend_id"], :name => "index_friendships_on_friend_id"
add_index "friendships", ["user_id"], :name => "index_friendships_on_user_id"
First I ask Rails for the ids of the user's friends, joined into a comma-separated string, then I interpolate that string into the real query.
friends_id = current_user.friends.collect {|f| f.id}.join(",")
sql = "
  SELECT
    DISTINCT feed_events.id,
    feed_events.event_type,
    feed_events.type_id,
    feed_events.data,
    feed_events.created_at,
    feed_events.updated_at,
    user_feed_events.user_id
  FROM feed_events
  LEFT JOIN user_feed_events
    ON feed_events.id = user_feed_events.feed_event_id
  WHERE user_feed_events.user_id IN (#{friends_id})
  ORDER BY feed_events.created_at DESC"
Then I actually execute the query (paginating it, with 30 results per page):
@events = FeedEvent.paginate_by_sql(sql, :page => params[:page], :per_page => 30)
UPDATE #2: here is the EXPLAIN ANALYZE output:
SQL> EXPLAIN ANALYZE (SELECT DISTINCT feed_events.id, feed_events.event_type, feed_events.type_id, feed_events.data, feed_events.created_at, feed_events.updated_at, user_feed_events.user_id FROM user_feed_events INNER JOIN feed_events ON feed_events.id = user_feed_events.feed_event_id WHERE user_feed_events.user_id IN (1,7,9,8,14,15,20,35,40,39,41,42,57,84,98,109,121,74,129,64,137,77,172,182,206,201,284,31,94,232,311,168,30,114,50,174,419,403,438,464,423,513,351,349,385,622,751,359,809,838,844,962,831,786,896,1001,992,998,990,256,67,623,957,1226,1060,1009,1490,132,1467,1672,619,1459,1466,993,1599,1365,607,1381,1714,1154,2032,2230,2240,2354,598,2345,1804,634,1900,2652,1975,2164,1759,3288,1004,3487,3507,3542,3566,514,3787,3137,3803,3090,4012,855,17,2026,1463,335,1000,935,5,12,10,13,19,18,16,22,34,27,29,59,126,90,46,23,63,291,134,229,107,439,521) ORDER BY feed_events.created_at DESC)
QUERY PLAN
----------
Unique (cost=6090.87..6162.93 rows=18014 width=389) (actual time=1641.210..1733.010 rows=29691 loops=1)
  -> Sort (cost=6090.87..6099.88 rows=18014 width=389) (actual time=1641.206..1670.882 rows=29694 loops=1)
       Sort Key: feed_events.created_at, feed_events.id, feed_events.event_type, feed_events.type_id, feed_events.data, feed_events.updated_at, user_feed_events.user_id
       Sort Method: quicksort Memory: 17755kB
       -> Hash Join (cost=3931.63..5836.21 rows=18014 width=389) (actual time=258.541..361.345 rows=29694 loops=1)
            Hash Cond: (user_feed_events.feed_event_id = feed_events.id)
            -> Bitmap Heap Scan on user_feed_events (cost=926.64..2745.66 rows=18014 width=8) (actual time=6.930..42.367 rows=29694 loops=1)
                 Recheck Cond: (user_id = ANY ('{1,7,9,8,14,15,20,35,40,39,41,42,57,84,98,109,121,74,129,64,137,77,172,182,206,201,284,31,94,232,311,168,30,114,50,174,419,403,438,464,423,513,351,349,385,622,751,359,809,838,844,962,831,786,896,1001,992,998,990,256,67,623,957,1226,1060,1009,1490,132,1467,1672,619,1459,1466,993,1599,1365,607,1381,1714,1154,2032,2230,2240,2354,598,2345,1804,634,1900,2652,1975,2164,1759,3288,1004,3487,3507,3542,3566,514,3787,3137,3803,3090,4012,855,17,2026,1463,335,1000,935,5,12,10,13,19,18,16,22,34,27,29,59,126,90,46,23,63,291,134,229,107,439,521}'::integer[]))
                 -> Bitmap Index Scan on index_user_feed_events_on_user_id (cost=0.00..925.74 rows=18014 width=0) (actual time=6.836..6.836 rows=29694 loops=1)
                      Index Cond: (user_id = ANY ('{1,7,9,8,14,15,20,35,40,39,41,42,57,84,98,109,121,74,129,64,137,77,172,182,206,201,284,31,94,232,311,168,30,114,50,174,419,403,438,464,423,513,351,349,385,622,751,359,809,838,844,962,831,786,896,1001,992,998,990,256,67,623,957,1226,1060,1009,1490,132,1467,1672,619,1459,1466,993,1599,1365,607,1381,1714,1154,2032,2230,2240,2354,598,2345,1804,634,1900,2652,1975,2164,1759,3288,1004,3487,3507,3542,3566,514,3787,3137,3803,3090,4012,855,17,2026,1463,335,1000,935,5,12,10,13,19,18,16,22,34,27,29,59,126,90,46,23,63,291,134,229,107,439,521}'::integer[]))
            -> Hash (cost=2848.84..2848.84 rows=44614 width=385) (actual time=251.490..251.490 rows=44663 loops=1)
                 -> Seq Scan on feed_events (cost=0.00..2848.84 rows=44614 width=385) (actual time=0.035..77.044 rows=44663 loops=1)
Total runtime: 1780.200 ms
SQL>
UPDATE #3: The problem is that my Rails application uses the has_many_friends plugin (https://github.com/swemoney/has_many_friends) to manage friendships. It works like this: say I am user_id 6 and I request friendship from user_id 10. When user 10 accepts my friendship, a new row is added to the table with user_id = 6 and friend_id = 10. If user 10 requests friendship from me instead, the row is user_id = 10 and friend_id = 6.
This means that to find friends_by_me I need to search on “user_id = 6”, and to find friends_for_me I need to search on “friend_id = 6”. To find all of my friends I need to search both columns. This makes writing joins very complicated!
How would you handle this?
The only alternative I can think of is:
"(SELECT
    DISTINCT feed_events.id,
    feed_events.event_type,
    feed_events.type_id,
    feed_events.data,
    feed_events.created_at,
    feed_events.updated_at,
    user_feed_events.user_id
  FROM feed_events
  INNER JOIN user_feed_events
    ON feed_events.id = user_feed_events.feed_event_id
  INNER JOIN friendships
    ON user_feed_events.user_id = friendships.user_id
  WHERE friendships.user_id = 6
    AND friendships.accepted_at IS NOT NULL)
UNION DISTINCT
(SELECT
    DISTINCT additional_feed_events.id,
    additional_feed_events.event_type,
    additional_feed_events.type_id,
    additional_feed_events.data,
    additional_feed_events.created_at,
    additional_feed_events.updated_at,
    user_feed_events.user_id
  FROM feed_events AS additional_feed_events
  INNER JOIN user_feed_events
    ON additional_feed_events.id = user_feed_events.feed_event_id
  INNER JOIN friendships
    ON user_feed_events.user_id = friendships.friend_id
  WHERE friendships.friend_id = 6
    AND friendships.accepted_at IS NOT NULL)
ORDER BY feed_events.created_at DESC"
But at the moment it is not working, and I’m also not sure it is the right way to do it!
Thanks,
Augusto
Why do you use the IN list? Why don’t you start from the selected user? Also, I think your left outer join is not needed:
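A minimal sketch of a query that starts from the selected user instead of an IN list, assuming the friendships schema from the question, where a friendship can be stored in either direction. The helper name friends_feed_sql and the alias friend_user_id are hypothetical, and DISTINCT is dropped on the assumption that user_feed_events holds no duplicate (user_id, feed_event_id) rows:

```ruby
# Hypothetical helper (not from the original answer): builds a feed query that
# starts from one user and resolves the friends inside the database.
def friends_feed_sql(user_id)
  # Integer() raises ArgumentError on non-numeric input, so only a plain
  # integer is ever interpolated into the SQL string.
  uid = Integer(user_id)

  <<-SQL
    SELECT fe.id, fe.event_type, fe.type_id, fe.data,
           fe.created_at, fe.updated_at, ufe.user_id
    FROM (
      -- friendships requested by the selected user
      SELECT friend_id AS friend_user_id
      FROM friendships
      WHERE user_id = #{uid} AND accepted_at IS NOT NULL
      UNION
      -- friendships requested by the other side
      SELECT user_id AS friend_user_id
      FROM friendships
      WHERE friend_id = #{uid} AND accepted_at IS NOT NULL
    ) AS f
    INNER JOIN user_feed_events ufe ON ufe.user_id = f.friend_user_id
    INNER JOIN feed_events fe ON fe.id = ufe.feed_event_id
    ORDER BY fe.created_at DESC
  SQL
end
```

The resulting string could then be handed to FeedEvent.paginate_by_sql in the same way as the string built in the question. UNION (rather than UNION ALL) keeps a friend from appearing twice if both directions happen to be stored.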
If you want to stay with your original statement and just optimize it, then use this:
This removes the unnecessary LEFT JOIN.
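A sketch of what the original statement might look like with an INNER JOIN, assuming the table and column names from the question. The WHERE clause on user_feed_events.user_id already discards any row the LEFT JOIN would have padded with NULLs, so the result set is the same. The helper name feed_events_sql is hypothetical, and the nil return for an empty list is an added safeguard rather than part of the original suggestion:

```ruby
# Hypothetical helper: rebuilds the question's statement with INNER JOIN in
# place of LEFT JOIN, keeping the IN list of friend ids.
def feed_events_sql(friend_ids)
  # Integer() raises ArgumentError on anything that is not a whole number.
  ids = friend_ids.map { |id| Integer(id) }
  # IN () with no values is a syntax error in PostgreSQL, so return nil and
  # let the caller skip the query for users without friends.
  return nil if ids.empty?

  <<-SQL
    SELECT DISTINCT
      feed_events.id,
      feed_events.event_type,
      feed_events.type_id,
      feed_events.data,
      feed_events.created_at,
      feed_events.updated_at,
      user_feed_events.user_id
    FROM user_feed_events
    INNER JOIN feed_events
      ON feed_events.id = user_feed_events.feed_event_id
    WHERE user_feed_events.user_id IN (#{ids.join(",")})
    ORDER BY feed_events.created_at DESC
  SQL
end
```

In the controller from the question this could be called with current_user.friends.collect { |f| f.id } and the result passed to paginate_by_sql when it is not nil.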
Furthermore, please make sure that you have created indexes on the columns used as foreign keys.