I’ve got an ASP.NET 2.0 web app backed by a pretty complex SQL Server database (lots of tables and lots of joins happening in lots of queries). My logs show that one day, the server-side load time for one particular page type jumped significantly. It had usually been below 100ms, occasionally reaching around 200ms; just over a week ago it went above 600ms and has stayed there ever since. The previous time we migrated new code to production was almost two weeks before this happened, and no large volume of new data was put into the system on that date.
I looked at this misbehaving page type in our test environment (which is less beefy than production, of course) and saw it averaging around 450ms, which is higher than I want but not as high as production. That’s weird; I’d expect the test environment to run slower than production, since they’ve got essentially the same data set.
I narrowed it down to one database call (one line of C#) that was taking close to 200ms to call a stored procedure and assemble the results into .NET objects (the vast majority of that time is spent in the DB). I pulled up that sproc, copied its body and had SQL Server Management Studio show me the estimated execution plan. It told me there was a missing nonclustered index, which I created; that brought the time for the database call in question below 100ms. Some more investigation showed that there is no other significant performance sink on that page type.
So I’ve improved things a little, and I’m tempted to create that index in production and see if page load time decreases significantly. But I still have questions that bug me and that I would prefer to answer before I mess around in production:
- Why would performance drop all of a sudden? If it was simply a missing index, I would expect it to have been a factor all along (slowly decreasing performance as more data is added).
- Why would test be performing better than production if it’s got the same data set?
I know no one can tell me about my app, but I’m hoping for some insight into what kinds of things can change like that and be installation-specific.
EDIT: I added the index to production and it brought the page load time back down to around 100ms. I still don’t quite understand what happened; maybe it’ll click someday when I learn something seemingly unrelated about databases and SQL.
How an execution plan in MSSQL will perform is hard to predict. It could be that, without the index, execution time grows with the square of the size of the table (for example, if one table has to be scanned once for every row of another). That would make it look like it works, works, works and then suddenly gets painfully slow.
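To make the "square of the size" point concrete, here is a toy sketch in Python. It is not how SQL Server actually executes anything; it only counts the work done by a naive nested-loop join (every row of one table scans the whole other table) versus a join that uses a lookup structure built once, standing in for an index. The table shapes and all function names are made up for illustration.

```python
# Toy model only: hypothetical "orders" and "customers" tables as lists of tuples.

def join_without_index(orders, customers):
    """Nested-loop join: scan every customer row for every order row."""
    comparisons = 0
    matches = []
    for order_id, customer_id in orders:
        for cust_id, name in customers:
            comparisons += 1
            if cust_id == customer_id:
                matches.append((order_id, name))
    return matches, comparisons


def join_with_index(orders, customers):
    """Same join, but with a dict built once acting as a stand-in for an index."""
    index = dict(customers)  # customer id -> name
    lookups = 0
    matches = []
    for order_id, customer_id in orders:
        lookups += 1
        name = index.get(customer_id)
        if name is not None:
            matches.append((order_id, name))
    return matches, lookups


def make_tables(n):
    """Build n customers and n orders, one order per customer."""
    customers = [(i, f"customer-{i}") for i in range(n)]
    orders = [(i, i % n) for i in range(n)]
    return orders, customers


if __name__ == "__main__":
    for n in (100, 200, 400):
        orders, customers = make_tables(n)
        _, scans = join_without_index(orders, customers)
        _, lookups = join_with_index(orders, customers)
        print(f"rows={n}: without index {scans} comparisons, with index {lookups} lookups")
```

In this model, doubling the number of rows quadruples the comparisons for the unindexed join, while the indexed version only doubles. That is the kind of curve that stays unnoticeable for a long time and then becomes painful.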
If Management Studio indicates a missing index, then you should add it (or another, even smarter index) to the production environment.
Still, 100ms for a stored procedure called from an ASP.NET page is quite long. If the page is used a lot, try optimizing the stored procedure.