I have a query with a long chain of CTEs that ends with:
SELECT RegionName, AreaName, CityName, SubCityName, StreetName
FROM tDictionaryStreets
UNION ALL
SELECT RegionName, AreaName, CityName, SubCityName, StreetName
FROM tDictionaryRegions
The execution time of this query is 1450 ms. When I execute the two SELECTs separately, they take much less time. For the query
SELECT RegionName, AreaName, CityName, SubCityName, StreetName
FROM tDictionaryStreets
the execution time is 106 ms. And for the query
SELECT RegionName, AreaName, CityName, SubCityName, StreetName
FROM tDictionaryRegions
it’s 20 ms.
Why does UNION ALL increase the execution time more than tenfold? What can I do to reduce it?
Thank you for your help.
UPDATED
The whole query (I shortened it, but the problem is still present) is:
WITH tFoundRegions AS
(
SELECT KladrItemName FROM dbo.tBuiltKladrItemsWithQuants
WHERE UserID = @UserID AND (indeces & 1) > 0
),
tFoundAreas AS
(
SELECT KladrItemName FROM dbo.tBuiltKladrItemsWithQuants
WHERE UserID = @UserID AND (indeces & 2) > 0
),
tFoundCities AS
(
SELECT KladrItemName FROM dbo.tBuiltKladrItemsWithQuants
WHERE UserID = @UserID AND (indeces & 4) > 0
),
tFoundSubCities AS
(
SELECT KladrItemName FROM dbo.tBuiltKladrItemsWithQuants
WHERE UserID = @UserID AND (indeces & 8) > 0
),
tFoundStreets AS
(
SELECT KladrItemName FROM dbo.tBuiltKladrItemsWithQuants
WHERE UserID = @UserID AND (indeces & 16) > 0
),
tDictionaryStreets AS
(
SELECT DISTINCT
CASE WHEN RegionName IN (SELECT KladrItemName FROM tFoundRegions) THEN RegionName ELSE NULL END RegionName
, CASE WHEN AreaName IN (SELECT KladrItemName FROM tFoundAreas) THEN AreaName ELSE NULL END AreaName
, CASE WHEN CityName IN (SELECT KladrItemName FROM tFoundCities) THEN CityName ELSE NULL END CityName
, CASE WHEN SubCityName IN (SELECT KladrItemName FROM tFoundSubCities) THEN SubCityName ELSE NULL END SubCityName
, StreetName
FROM StreetNames
WHERE StreetName IN (SELECT KladrItemName FROM tFoundStreets)
),
tMissingSubCities AS
(
SELECT KladrItemName FROM tFoundSubCities
WHERE KladrItemName NOT IN (SELECT SubCityName FROM tDictionaryStreets)
),
tDictionarySubCities AS
(
SELECT DISTINCT
CASE WHEN RegionName IN (SELECT KladrItemName FROM tFoundRegions) THEN RegionName ELSE NULL END RegionName
, CASE WHEN AreaName IN (SELECT KladrItemName FROM tFoundAreas) THEN AreaName ELSE NULL END AreaName
, CASE WHEN CityName IN (SELECT KladrItemName FROM tFoundCities) THEN CityName ELSE NULL END CityName
, SubCityName
, NULL StreetName
FROM SubCityNames
WHERE SubCityName IN (SELECT KladrItemName FROM tMissingSubCities)
)
SELECT RegionName, AreaName, CityName, SubCityName, StreetName
FROM tDictionaryStreets
UNION ALL
SELECT RegionName, AreaName, CityName, SubCityName, StreetName
FROM tDictionarySubCities
Make sure you clear the execution plan cache and the data cache between test runs.
For example, if you run the UNION ALL query first and then run the 2 SELECTs separately, the data will already be cached in memory, so the later runs will be much faster. That gives the false impression that the second approach is quicker when it may not be.
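One common way to get a cold start before each timed run is sketched below. It assumes a development or test server where you have sufficient permissions; these commands flush caches for the whole instance, so they should not be run on a production system.

```sql
-- Test servers only: both DBCC commands affect the entire instance.
CHECKPOINT;             -- write dirty pages to disk so they can be dropped from the buffer pool
DBCC DROPCLEANBUFFERS;  -- empty the data cache (buffer pool)
DBCC FREEPROCCACHE;     -- empty the plan cache, forcing plans to be recompiled
```

Run the batch before the UNION ALL query and again before each of the separate SELECTs, so that every measurement starts from the same state.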
If you used a UNION, it may well be slower because it has to apply a DISTINCT, but UNION ALL doesn’t have to do that, so it should be no slower than the two SELECTs run separately.
Update:
Have a look at the execution plans and compare them to see if there is any difference. You can view the execution plan by clicking the “Include Actual Execution Plan” button in SSMS before running the query.
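Alongside the graphical plan, the time and I/O statistics can help pinpoint where the difference comes from. A minimal sketch, assuming the queries are run from an SSMS query window (the figures appear on the Messages tab):

```sql
SET STATISTICS TIME ON;  -- report parse/compile and execution times per statement
SET STATISTICS IO ON;    -- report logical and physical reads per table

-- Run the UNION ALL query here, then each SELECT on its own,
-- and compare the reads reported for each table.

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

If the same table shows far more reads in the combined query than in the separate runs, the CTEs are probably being evaluated more than once.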
Update 2:
Based on the full CTEs given, I think I’d be looking at optimising those – I don’t think the UNION ALL is actually the problem.
IMHO, the best thing to try is to work through the CTEs one by one and optimise each one individually, so that they perform better when you combine them all in the main query.
e.g. for tDictionaryStreets, how about trying this:
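A join-based version of tDictionaryStreets could look roughly like the sketch below. It is meant to slot into the existing WITH chain in place of the current definition and is an untested assumption about the shape of the rewrite, not a verified drop-in: it returns the matched KladrItemName instead of the original column value, which is the same value under an ordinary equality match.

```sql
tDictionaryStreets AS
(
    SELECT DISTINCT
          r.KladrItemName  AS RegionName   -- NULL when the region was not found
        , a.KladrItemName  AS AreaName     -- NULL when the area was not found
        , c.KladrItemName  AS CityName     -- NULL when the city was not found
        , sc.KladrItemName AS SubCityName  -- NULL when the sub-city was not found
        , s.StreetName
    FROM StreetNames AS s
    -- INNER JOIN replaces: WHERE StreetName IN (SELECT ... FROM tFoundStreets)
    INNER JOIN tFoundStreets   AS fs ON fs.KladrItemName = s.StreetName
    -- Each LEFT JOIN replaces one CASE WHEN ... IN (SELECT ...) expression
    LEFT  JOIN tFoundRegions   AS r  ON r.KladrItemName  = s.RegionName
    LEFT  JOIN tFoundAreas     AS a  ON a.KladrItemName  = s.AreaName
    LEFT  JOIN tFoundCities    AS c  ON c.KladrItemName  = s.CityName
    LEFT  JOIN tFoundSubCities AS sc ON sc.KladrItemName = s.SubCityName
),
```

The DISTINCT is kept so that duplicate KladrItemName values in any of the found-item CTEs do not multiply the output rows.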
At the very least, KladrItemName should be indexed on each table.
Try reworking tDictionarySubCities in the same kind of way with joins too.