I have one table that keeps track of the links each user has clicked, and another table with the links themselves. Here is each table's structure:
Links:
id | link | value | date_added
Clicked:
user_id | link_id | date_clicked
Right now this is the code I am using to run my search, and it works. I just want to know if there is a more efficient way of doing it, since the clicked links table is going to get very large very fast.
$history_query = mysql_query("SELECT * FROM clicked_links WHERE user_id = '$id'") or die(mysql_error());
$history_array = array();
while ($h = mysql_fetch_array($history_query)) {
    $history_array[] = $h['link_id'];
}
$clicked = implode(',', $history_array);
$link_query = mysql_query("SELECT * FROM chip_links WHERE id NOT IN ($clicked) ORDER BY value DESC") or die(mysql_error());
while ($r = mysql_fetch_array($link_query)) {
    echo "<div id='claim{$r['id']}' style='text-align: center; font-weight: bold; font-size: 18px; float: left; width: 183px;'>
    <a href='{$r['link']}' id='{$r['id']}' class='collect' target='_blank'>
    Claim {$r['value']} points!
    </a>
    </div>";
}
It will be more efficient to run a single query to get the resultset, rather than running separate queries.
You don’t need to return all the link_id values, put them in an array, put the array into a string, push that string into another query, and shuffle it back to the database; the database already has that.
A single query with a NOT IN subquery will return a resultset equivalent to your current $link_query, without the need for the $history_query or $history_array.
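A minimal sketch of such a query, assuming the table and column names from your code (chip_links, clicked_links). The literal 42 is a hypothetical stand-in for the user id ($id in the PHP code), which should be bound or escaped rather than interpolated directly:

```sql
-- 42 is a placeholder for the user id ($id in the PHP code).
SELECT l.*
  FROM chip_links l
 WHERE l.id NOT IN (
         SELECT c.link_id
           FROM clicked_links c
          WHERE c.user_id = 42
            -- add "AND c.link_id IS NOT NULL" here if link_id is nullable
       )
 ORDER BY l.value DESC;
```

Unlike the two-step PHP version, this also behaves sensibly when the user has not clicked anything yet: the subquery returns no rows and every link is returned.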
If you don’t have some sort of guarantee that link_id in the clicked_links table is NOT NULL, you’ll want to include a link_id IS NOT NULL predicate in that subquery, because the query won’t return any rows if a link_id value is NULL. (This is a well-known and avoidable issue with NOT IN (subquery) constructs.)
It’s likely that MySQL will optimize that into a (hopefully more efficient but) equivalent NOT EXISTS correlated subquery.
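Written out by hand, the NOT EXISTS form might look like the sketch below. It assumes the same table names as above, and 42 is again a hypothetical stand-in for the user id:

```sql
-- Correlated subquery: for each link, check that this user has no click row for it.
SELECT l.*
  FROM chip_links l
 WHERE NOT EXISTS (
         SELECT 1
           FROM clicked_links c
          WHERE c.user_id = 42      -- placeholder for $id
            AND c.link_id = l.id    -- correlation to the outer row
       )
 ORDER BY l.value DESC;
```

This form does not have the NULL problem, because a NULL link_id simply never satisfies c.link_id = l.id.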
For best performance, though, you probably want to use the anti-join pattern.
The LEFT JOIN operation basically looks for matching rows, and the IS NULL predicate will throw out rows that match, so what you get back is rows from chip_links where there is no “matching” row from clicked_links.
The MySQL optimizer usually generates the most efficient plan with a query of this form.
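A sketch of the anti-join, under the same assumptions as before (table names taken from your code, 42 as a hypothetical stand-in for the user id):

```sql
-- LEFT JOIN keeps every link; unmatched links get NULLs in the c.* columns.
SELECT l.*
  FROM chip_links l
  LEFT
  JOIN clicked_links c
    ON c.link_id = l.id
   AND c.user_id = 42          -- placeholder for $id; must be in ON, not WHERE
 WHERE c.link_id IS NULL       -- keep only links with no matching click row
 ORDER BY l.value DESC;
```

The user_id condition belongs in the ON clause. Moving it to the WHERE clause would discard the unmatched rows (where c.user_id is NULL) before the IS NULL test could keep them.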
For good performance on large sets, you’ll also likely want indexes on both tables.
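One possible pair of index definitions is sketched here. The index names are hypothetical, the column choice is an assumption based on the queries in this thread, and it presumes link is a VARCHAR short enough to be indexed. The second index only covers the query if you select id, link and value rather than every column:

```sql
-- Lets the lookup of one user's clicked links be answered from the index alone.
CREATE INDEX clicked_links_ix1 ON clicked_links (user_id, link_id);

-- Leading on value lets MySQL read rows in ORDER BY value DESC order;
-- id and link are included so those columns can be read from the index.
CREATE INDEX chip_links_ix1 ON chip_links (value, id, link);
```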
That should allow the query to be satisfied entirely from the indexes, without the need for a sort operation. (The EXPLAIN output will include “Using index”, and will not include “Using filesort”.)