I am a little confused about how SQL Server achieves fewer reads and better performance when using common table expressions and ROW_NUMBER. Why doesn’t the table actualized in the expression have to perform all the reads that a normal query would have to perform in order to sort for ROW_NUMBER?
The CTE is not (necessarily) “actualized”. It will not inevitably copy all rows elsewhere and then perform further operations on the copy (though it may do so if the optimizer decides that is better).
If we take this simple query:
and look at its plan, we’ll see something like this:
Here, the records are scanned (in id order, as the table is clustered on id), assigned the ROW_NUMBER (this is what Sequence Project does) and passed on to TOP, which just halts execution when a certain threshold is reached (110 records in our case).
Those 110 records are passed to Filter, which only passes the records with rn greater than 100.
The query itself scans only 110 records, in 3 pages.
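To make the early stop concrete, here is a rough Python sketch of the same operator chain built from lazy generators. It is only an analogy, not how SQL Server is implemented: the function names, the row counter and the 1,000,000-row table size are all assumptions made for illustration, and a real engine reads pages rather than individual rows.

```python
from itertools import islice

def clustered_index_scan(n_rows, counter):
    """Yield ids in clustered-key order, counting every row actually read."""
    for row_id in range(1, n_rows + 1):
        counter["rows_read"] += 1
        yield row_id

def sequence_project(rows):
    """Attach a running row number (rn) to each row, like ROW_NUMBER()."""
    for rn, row_id in enumerate(rows, start=1):
        yield row_id, rn

def top(rows, limit):
    """Stop pulling from upstream once `limit` rows have been produced."""
    return islice(rows, limit)

def row_filter(rows, lower):
    """Pass on only the rows whose rn is greater than `lower`."""
    return (row for row in rows if row[1] > lower)

def paginate(n_rows, lower=100, upper=110):
    """Run scan -> sequence project -> top -> filter and report rows read."""
    counter = {"rows_read": 0}
    scan = clustered_index_scan(n_rows, counter)
    plan = row_filter(top(sequence_project(scan), upper), lower)
    return list(plan), counter["rows_read"]

# Hypothetical table size; only the first 110 rows are ever pulled.
page, rows_read = paginate(1_000_000)
print(len(page), rows_read)
```

Because every stage pulls rows on demand, the scan is never asked for row 111: the top stage stops requesting rows, so nothing upstream does any further work.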
Now let’s see the unpaginated query:
This one is pretty simple: read everything from the table and spit it out.
However, a query that looks simple is not necessarily cheap to run. The table is quite large, and it takes a great many reads to return all the records:
So, in a nutshell, the pagination query just knows when to stop.