A client has reported repeated instances of very strange behaviour when executing a stored procedure.
They have code which runs off a cached transposition of a volatile dataset. A stored proc was written to reprocess the dataset on demand if:
1. The dataset has changed since the last reprocessing
2. The dataset has then been unchanged for 5 minutes
(The second condition stops massive repeated recalculation during times of change.)
This worked fine for a couple of weeks: the SP took 1-2 seconds to complete the reprocessing, and it only did so when required. Then…
- The SP suddenly “stopped working” (it just kept running and never returned)
- We changed the SP in a subtle way and it worked again
- A few days later it stopped working again
- Someone then said “we’ve seen this before, just recompile the SP”
- With no change to the code we recompiled the SP, and it worked
- A few days later it stopped working again
This has now repeated many, many times. The SP suddenly “stops working”: it never returns and the client times out. (We tried running it through Management Studio and cancelled the query after 15 minutes.)
Yet every time we recompile the SP, it suddenly works again.
I haven’t yet tried WITH RECOMPILE on the appropriate EXEC statements, but I don’t particularly want to do that anyway. The SP gets called hundreds of times an hour and normally does nothing (it only reprocesses the data a few times a day). If possible I want to avoid the overhead of recompiling what is a relatively complicated SP just to avoid something which “shouldn’t” happen…
- Has anyone experienced this before?
- Does anyone have any suggestions on how to overcome it?
Cheers,
Dems.
EDIT:
The pseudo-code would be as follows:
- read “a” from table_x
- read “b” from table_x
- If (a < b) return
- BEGIN TRANSACTION
- DELETE table_y
- INSERT INTO table_y <3 selects unioned together>
- UPDATE table_x
- COMMIT TRANSACTION
The selects are “not pretty”, but when executed in-line they run in no time, even when the SP refuses to complete. The profiler shows that it is the INSERT at which the SP “stalls”.
There are no parameters to the SP, and sp_lock shows nothing blocking the process.
As others have said, something about the way the data or the source table statistics are changing is causing the cached query plan to go stale.
WITH RECOMPILE will probably be the quickest fix. Use SET STATISTICS TIME ON to find out what the recompilation cost actually is before dismissing it out of hand. If that’s still not an acceptable solution, the best option is probably to try to refactor the insert statement.
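As a rough sketch of how the recompilation cost could be measured in Management Studio: the procedure name dbo.usp_ReprocessDataset below is a placeholder, since the post does not name the SP.

```sql
-- dbo.usp_ReprocessDataset is a hypothetical name; substitute the real SP.
SET STATISTICS TIME ON;

-- Force a fresh plan for this one call only; the cached plan is not used.
EXEC dbo.usp_ReprocessDataset WITH RECOMPILE;

SET STATISTICS TIME OFF;
```

The Messages tab reports "SQL Server parse and compile time" separately from the execution times, so the compile figures give an idea of what WITH RECOMPILE would add to each call.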
You don’t say whether you’re using UNION or UNION ALL in your insert statement. I’ve seen INSERT INTO with UNION produce some bizarre query plans, particularly on pre-SP2 versions of SQL 2005.
- Raj’s suggestion of dropping and recreating the target table with SELECT INTO is one way to go.
- You could also try selecting each of the three source queries into their own temporary table, then UNION those temp tables together in the insert.
- Alternatively, you could try a combination of these suggestions: put the results of the union into a temporary table with SELECT INTO, then insert from that into the target table.
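The temporary-table suggestion (staging each source query separately, then combining them in the insert) might look roughly like this. All object names other than table_y are made up for illustration: source_a, source_b, source_c, key_col and val_col stand in for the real queries and columns, which the question does not show.

```sql
-- Hypothetical sources and columns; replace with the three real SELECTs.
-- Stage each source query on its own, outside the transaction.
SELECT key_col, val_col INTO #part_a FROM dbo.source_a;
SELECT key_col, val_col INTO #part_b FROM dbo.source_b;
SELECT key_col, val_col INTO #part_c FROM dbo.source_c;

BEGIN TRANSACTION;

DELETE FROM dbo.table_y;

-- UNION ALL assumes duplicate rows across the three sets are acceptable
-- or impossible; use UNION instead if duplicates must be removed.
INSERT INTO dbo.table_y (key_col, val_col)
SELECT key_col, val_col FROM #part_a
UNION ALL
SELECT key_col, val_col FROM #part_b
UNION ALL
SELECT key_col, val_col FROM #part_c;

COMMIT TRANSACTION;

-- Clean up the staging tables.
DROP TABLE #part_a;
DROP TABLE #part_b;
DROP TABLE #part_c;
```

Staging before BEGIN TRANSACTION keeps the locks on table_y short. The UPDATE of table_x from the original pseudo-code would sit inside the same transaction and is left out here.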
I’ve seen all of these approaches resolve performance problems in similar scenarios; testing will reveal which gives the best results with the data you have.