Alright SQL Server Gurus, fire up your analyzers.
- I have a list of titles in application memory (250 or so).
- I have a database table “books” with more than a million records; one of its columns is “title”, of type nvarchar.
- The “books” table has another column called “ISBN”.
- books.title is not a primary key and is not unique, but it is indexed.
So I’d like to know which is more efficient:
WITH titles AS (select 'Catcher and the Rye' as Title
union all select 'Harry Potter ...'
...
union all select 'The World Is Flat')
select ISBN from books join titles on books.title = titles.title;
OR:
select ISBN from books where title in ('Catcher and the Rye','Harry Potter',...,'The World Is Flat');
OR:
???
I hope you have ISBN as an INCLUDE column on the title index, to avoid key lookups.
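For illustration, a covering index along those lines might look like the sketch below. The index name IX_books_title is made up; the table and column names are the ones from the question. If an index on title already exists, it would normally be rebuilt with the extra column rather than duplicated.

```sql
-- Hypothetical index name; books, title and ISBN come from the question.
-- INCLUDE stores ISBN at the leaf level of the title index, so the query
-- can be answered from the index alone, without key lookups.
CREATE NONCLUSTERED INDEX IX_books_title
    ON books (title)
    INCLUDE (ISBN);
```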
Now, IN vs JOIN vs EXISTS is a common question here. The CTE is irrelevant except for readability. Personally, I’d use EXISTS, because with JOIN you’ll get duplicates for books with the same title, which folk often forget.
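A minimal sketch of the EXISTS form, assuming SQL Server 2008 or later for the VALUES table constructor (on older versions the select ... union all list from the question would do the same job). The aliases b and t are arbitrary, and only two titles are shown.

```sql
-- Only two titles shown; the real list would hold all 250 or so.
SELECT b.ISBN
FROM books AS b
WHERE EXISTS (
    SELECT 1
    FROM (VALUES (N'Catcher and the Rye'),
                 (N'The World Is Flat')) AS t (title)
    WHERE t.title = b.title
);
```

With EXISTS, each row of books is returned at most once, however many times its title appears in the search list.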
However, one construct I’d consider is one that forces “intermediate materialisation” of my list of search titles. This also applies to an EXISTS or CTE solution. It is likely to help the optimiser considerably.
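One trick sometimes used in SQL Server for this is a derived table with TOP and ORDER BY. The sketch below shows that general idea and is not necessarily the exact construct meant above; it assumes SQL Server 2008 or later for the VALUES constructor, and the aliases are arbitrary. The optimiser is not obliged to honour it.

```sql
-- Sketch only: TOP with ORDER BY inside the derived table is a commonly
-- used trick to encourage the optimiser to evaluate the title list first.
-- It is not a documented guarantee.
SELECT b.ISBN
FROM books AS b
WHERE EXISTS (
    SELECT 1
    FROM (
        SELECT TOP (2000000000) t.title
        FROM (VALUES (N'Catcher and the Rye'),
                     (N'The World Is Flat')) AS t (title)
        ORDER BY t.title
    ) AS m
    WHERE m.title = b.title
);
```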
Edit: a temp table is really the better option, as Steve mentioned in his comment.
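A rough sketch of the temp table approach, under a few assumptions: the name #titles is made up, nvarchar(400) is a guessed length that should match the real definition of books.title, and the multi-row INSERT needs SQL Server 2008 or later.

```sql
-- Hypothetical temp table. nvarchar(400) is an assumed length; match it to
-- books.title. COLLATE DATABASE_DEFAULT avoids collation conflicts when
-- tempdb and the user database use different collations.
CREATE TABLE #titles (
    title nvarchar(400) COLLATE DATABASE_DEFAULT NOT NULL PRIMARY KEY
);

INSERT INTO #titles (title)
VALUES (N'Catcher and the Rye'),
       (N'The World Is Flat');   -- remaining titles omitted

SELECT b.ISBN
FROM books AS b
WHERE EXISTS (SELECT 1 FROM #titles AS t WHERE t.title = b.title);

DROP TABLE #titles;
```

The primary key gives the optimiser an index and statistics on the search list, but it also means the insert fails if the list contains the same title twice, so the list would need de-duplicating first.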