I’ve been writing a lot of one-off SQL queries to return exactly what a certain page needs and no more.
I could reuse existing queries and issue a number of SQL requests linear in the number of records on the page. As an example, I have a query to return People and a query to return Job Details for a person. To return a list of people with their job details I could query once for people and then once for each person to retrieve their job details. I’ve found that in most cases that solution returns things in a reasonable amount of time, but I don’t know how well it will scale in my environment. Instead I’ve been writing queries to join people + job details, or people + salary history, etc.
I’m looking at my models and I see how I could shave off maybe 30% of my code if I were to re-use existing queries. This is a big temptation. Is it a bad thing to go for reuse over efficiency in general or does it all come down to the specific situation? Should I first do it the easy way and then optimize later, or is it best to get the code knocked out while everything is fresh in my mind? Thoughts, experiences?
Architectural decisions such as this one always depend on the exact circumstances. But as a general rule, looping over one dataset in order to retrieve additional data for each row should be avoided. A join normally performs much better, especially if there is network latency between the web server and the DB server, because every extra query costs another round trip.
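To make the difference concrete, here is a minimal sketch of both approaches using Python's built-in sqlite3 module. The table and column names (people, job_details, person_id, title) are made up for illustration and are not taken from the question. The loop version issues one query per person; the join version issues a single query.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE job_details (person_id INTEGER, title TEXT);
    INSERT INTO people VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO job_details VALUES (1, 'Engineer'), (2, 'Analyst');
""")

def people_with_jobs_loop(conn):
    """Reuse two simple queries: 1 query for people + 1 query per person."""
    rows = []
    people = conn.execute("SELECT id, name FROM people ORDER BY id").fetchall()
    for person_id, name in people:
        job = conn.execute(
            "SELECT title FROM job_details WHERE person_id = ?",
            (person_id,),
        ).fetchone()
        rows.append((name, job[0] if job else None))
    return rows

def people_with_jobs_join(conn):
    """One specialized query: a single round trip regardless of row count."""
    return conn.execute("""
        SELECT p.name, j.title
        FROM people AS p
        LEFT JOIN job_details AS j ON j.person_id = p.id
        ORDER BY p.id
    """).fetchall()

loop_rows = people_with_jobs_loop(conn)
join_rows = people_with_jobs_join(conn)
```

With an in-memory database both versions are fast; the gap only becomes visible once each query has to cross a network to reach the DB server.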
If you still want to have separate queries, you should at least construct the fetch of the second entity through a WHERE IN (...) construct so that you fetch all rows at once.
Code duplication is always bad, and SQL queries tend to be very similar, yet not exactly the same. I think that using an ORM tool, even one as basic as Linq-to-SQL, helps a lot in reducing the overhead of specialized queries.
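For the WHERE IN (...) suggestion, a rough sketch of what that could look like, again with Python's sqlite3 module and made-up table names (people, job_details). The helper job_details_for is hypothetical. It keeps the two queries separate, so the existing "people" query can be reused, but fetches the job details for every person on the page in one request instead of one request per person.

```python
import sqlite3
from collections import defaultdict

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE job_details (person_id INTEGER, title TEXT);
    INSERT INTO people VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO job_details VALUES (1, 'Engineer'), (2, 'Analyst');
""")

def job_details_for(conn, person_ids):
    """Fetch job titles for many people with a single WHERE ... IN query."""
    person_ids = list(person_ids)
    if not person_ids:
        return {}  # "IN ()" is not valid SQL, so skip the query entirely
    # One "?" per id keeps the values parameterized instead of
    # concatenating them into the SQL string.
    placeholders = ", ".join("?" for _ in person_ids)
    query = (
        "SELECT person_id, title FROM job_details "
        f"WHERE person_id IN ({placeholders}) ORDER BY person_id"
    )
    grouped = defaultdict(list)
    for person_id, title in conn.execute(query, person_ids):
        grouped[person_id].append(title)
    return dict(grouped)

# Query 1: the existing, reusable people query.
people = conn.execute("SELECT id, name FROM people ORDER BY id").fetchall()
# Query 2: all job details for those people in one go.
jobs = job_details_for(conn, [person_id for person_id, _ in people])
```

This makes two queries in total no matter how many people are on the page. Databases generally cap the number of bound parameters per statement, so very large id lists would need to be split into batches.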