I have this query in MS SQL which is acting very weird (at least from my perspective).
I’ve user defined function called: dbo.NajblizszaDataWyceny(3, ‘2010-02-05’) which is simple check for TOP 1 entry in one table joined with couple others. The query itself takes like milliseconds so it’s not a big problem, but i show the function anyway.
CREATE FUNCTION [dbo].[NajblizszaDataWyceny] (@idPortfela INT, @dataWaluty DATETIME)
RETURNS DATETIME
AS BEGIN
RETURN (
SELECT TOP 1 [WycenaData]
FROM [BazaZarzadzanie].[dbo].[Wycena] t1
LEFT JOIN [BazaZarzadzanie].[dbo].[KlienciPortfeleKonta] t3
ON t1.[KlienciPortfeleKontaID] = t3.[KlienciPortfeleKontaID]
LEFT JOIN [BazaZarzadzanie].[dbo].[KlienciPortfele] t4
ON t3.[PortfelID] = t4.[PortfelID]
WHERE [WycenaData] <= @dataWaluty AND [t3].[PortfelID] = @idPortfela
ORDER BY [WycenaData] DESC)
END
When i use this function in following way:
DECLARE @dataWyceny DATETIME
SET @dataWyceny = dbo.NajblizszaDataWyceny(3, '2010-02-05')
SELECT t1.[KlienciPortfeleKontaID],
t4.[PortfelIdentyfikator] AS 'UmowaNr',
t5.[KlienciRachunkiNumer],
[WycenaData],
t2.[InISIN] AS 'InstrumentISIN',
t2.[InNazwa] AS 'InstrumentNazwa',
[WycenaWartosc]
FROM [BazaZarzadzanie].[dbo].[Wycena] t1
LEFT JOIN [BazaZarzadzanie].[dbo].[Instrumenty] t2
ON t1.[InID] = t2.[InID]
LEFT JOIN [BazaZarzadzanie].[dbo].[KlienciPortfeleKonta] t3
ON t1.[KlienciPortfeleKontaID] = t3.[KlienciPortfeleKontaID]
LEFT JOIN [BazaZarzadzanie].[dbo].[KlienciPortfele] t4
ON t3.[PortfelID] = t4.[PortfelID]
LEFT JOIN [BazaZarzadzanie].[dbo].[KlienciRachunki] t5
ON t3.[KlienciRachunkiID] = t5.[KlienciRachunkiID]
LEFT JOIN [BazaZarzadzanie].[dbo].[WycenaTyp] t6
ON t1.[WycenaTyp] = t6.[WycenaTyp]
WHERE WycenaData = @dataWyceny AND t3.[PortfelID] = 3
ORDER BY t5.[KlienciRachunkiNumer],
WycenaData
it takes 1 second to run. But when i put the user function directly in WHERE so it looks like:
SELECT t1.[KlienciPortfeleKontaID],
t4.[PortfelIdentyfikator] AS 'UmowaNr',
t5.[KlienciRachunkiNumer],
[WycenaData],
t2.[InISIN] AS 'InstrumentISIN',
t2.[InNazwa] AS 'InstrumentNazwa',
[WycenaWartosc]
FROM [BazaZarzadzanie].[dbo].[Wycena] t1
LEFT JOIN [BazaZarzadzanie].[dbo].[Instrumenty] t2
ON t1.[InID] = t2.[InID]
LEFT JOIN [BazaZarzadzanie].[dbo].[KlienciPortfeleKonta] t3
ON t1.[KlienciPortfeleKontaID] = t3.[KlienciPortfeleKontaID]
LEFT JOIN [BazaZarzadzanie].[dbo].[KlienciPortfele] t4
ON t3.[PortfelID] = t4.[PortfelID]
LEFT JOIN [BazaZarzadzanie].[dbo].[KlienciRachunki] t5
ON t3.[KlienciRachunkiID] = t5.[KlienciRachunkiID]
LEFT JOIN [BazaZarzadzanie].[dbo].[WycenaTyp] t6
ON t1.[WycenaTyp] = t6.[WycenaTyp]
WHERE WycenaData = dbo.NajblizszaDataWyceny(3, '2010-02-05') AND t3.[PortfelID] = 3
ORDER BY t5.[KlienciRachunkiNumer],
WycenaData
It takes 1.5minute to finish. Can anyone explain why this is happening?
Functions are not assumed to be pure in SQL Server which means that the query optimiser will not cache the results of a function and re-use it; the function will be called every time it is referenced. This is even true for simple functions which just return numbers (as we found out to our cost on a project where we were using functions to emulate constants…).
So in the first version the function is called once when you call it, and the result is cached manually and re-used in the query. However in the second version the function will be called for each row when the
WHEREclause tries to match the row. If you have a lot of rows, then a few milliseconds per row starts to add up.(Note also, that your queries are semantically different. In the first query you’re saying “where things are the same as the result of the function I evaluated at the start” and in the second one you’re saying “where things are the same as the result of the function I’m evaluating at this specific instance in time as I consider the row”. As your function uses a
SELECTstatement then – depending on the transaction isolation level – it could possibly return different results for different rows if the underlying data changes.)