I have a query which has some sub queries (inner selects), i’m trying work out which is better for performance, one larger query or lots of smaller queries, i’m finding it difficult to try and time the differences as it changes all the time on my server.
I use the query below to return 10 results at a time to display on my website, using pagination (offset and limit).
SELECT adverts.*, breed.breed, breed.type, sellers.profile_name, sellers.logo, users.user_level ,
round( sqrt( ( ( (adverts.latitude - '51.558430') * (adverts.latitude - '51.558430') ) * 69.1 * 69.1 ) + ( (adverts.longitude - '-0.0069345') * (adverts.longitude - '-0.0069345') * 53 * 53 ) ), 1 ) as distance,
( SELECT advert_images.image_name FROM advert_images WHERE advert_images.advert_id = adverts.advert_id AND advert_images.main = 1 LIMIT 1) as imagename,
( SELECT count(advert_images.advert_id) from advert_images WHERE advert_images.advert_id = adverts.advert_id ) AS num_photos
FROM adverts
LEFT JOIN breed ON adverts.breed_id = breed.breed_id
LEFT JOIN sellers ON (adverts.user_id = sellers.user_id)
LEFT JOIN users ON (adverts.user_id = users.user_id)
WHERE (adverts.status = 1) AND (adverts.approved = 1)
AND (adverts.latitude BETWEEN 51.2692837281 AND 51.8475762719) AND (adverts.longitude BETWEEN -0.472015213613 AND 0.458146213613)
having (distance <= '20')
ORDER BY distance ASC
LIMIT 0,10
Would it be better to remove the 2 inner selects below from the main query, and then in my php loop, call the 2 selects 10 times, once for each record in the loop?
( SELECT advert_images.image_name FROM advert_images WHERE advert_images.advert_id = adverts.advert_id AND advert_images.main = 1 LIMIT 1) as imagename,
( SELECT count(advert_images.advert_id) from advert_images WHERE advert_images.advert_id = adverts.advert_id ) AS num_photos
Avoid subqueries
As I understand your inner selects, they serve two purposes: find any name of an associated image, and count the number of associated images. You probably might achieve both using a left join instead of an inner select:
I haven’t tried this, but perhaps the MySQL engine is smart enough to only perform that part of the query for the rows you’re actually returning.
Note that there are no guarantees at all about which image name this query will return for a given set of images. If you want reproducible results, you should use some aggregating function there, e.g.
MIN(advert_images.image_name)to select the lexicographically first image.Separate select but no loop
If the above does not work, i.e. the query will still examine the
advert_imagestable for all rows of the computed result, then you’re probably really better off by performing a second query. You might however try avoiding theforloop, and instead fetch all those rows in a single query:The ten parameters in this query correspond to the ten rows of the result you’re currently generating. Note that an advert without an associated photo will not be included in that result at all. So make sure to default
num_photosto zero andimagenametoNULLin your code.Temporary table
Another way to achieve what you’re attempting to do would be to use an explicit temporary in-memory table: first you select the results you’re interested in, and then you retrieve all the associated information.
This will ensure that all the other tables are only queried for the results you’re currently working on. After all, there is little magic about the
advert_imagestable except the fact that you might want multiple rows from it.Subquery as join factor
Building on the approach from the preceding paragraph, you can even avoid managing a temporary table, and use a subquery in its stead:
Again you determine the relevant rows by using data from the
advertstable only, and join only the required rows from other tables. Most likely, that intermediate result will internally be stored in a temporary table, but that is up to the SQL server to decide.