Today I read some comments and ran an experiment. I imagined a system that stores some coordinates.
Here is the situation:
I have two tables, the first is:
CREATE TABLE Points
(
    ID int IDENTITY(1,1) PRIMARY KEY,
    X int,
    Y int,
    Name varchar(20),
    Created datetime
)
It just stores coordinates (1 million rows). The second one is a helper table storing, let’s say, frequently used points for a select (around 1,100 rows):
CREATE TABLE PointSearchHelper
(
    X int,
    Y int
)
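To reproduce the setup, the two tables could be filled roughly as follows. This is only a sketch: the post does not show how the data was generated, so the coordinate range (0 to 9999), the name pattern, and the use of sys.all_objects as a row source are all assumptions.

```sql
-- Assumed test data: 1,000,000 points with pseudo-random coordinates.
-- sys.all_objects cross-joined with itself is used only as a cheap row source.
;WITH n AS (
    SELECT TOP (1000000)
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS i
    FROM sys.all_objects a
    CROSS JOIN sys.all_objects b
)
INSERT INTO Points (X, Y, Name, Created)
SELECT ABS(CHECKSUM(NEWID()) % 10000),   -- modulo first, so ABS cannot overflow
       ABS(CHECKSUM(NEWID()) % 10000),
       'P' + CAST(i AS varchar(10)),
       GETDATE()
FROM n;

-- Assumed helper data: 1,100 coordinate pairs picked at random from Points.
INSERT INTO PointSearchHelper (X, Y)
SELECT TOP (1100) X, Y
FROM Points
ORDER BY NEWID();
```

With random data, several points can share the same (X, Y) pair, so the join may return slightly more rows than the helper table holds.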
So far, so good.
I would like to run a simple select:
SELECT p.* FROM Points p
INNER JOIN PointSearchHelper h
ON p.X = h.X AND p.Y = h.Y
When I run the script, it returns the 1,100 rows in around 280 ms on average.
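The post does not say how the timing was taken. One common way to get comparable numbers in SQL Server, shown here as an assumption rather than as the method actually used, is to switch on the time and I/O statistics around the query:

```sql
-- Report CPU/elapsed time and logical reads in the Messages tab.
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

SELECT p.* FROM Points p
INNER JOIN PointSearchHelper h
ON p.X = h.X AND p.Y = h.Y;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The logical reads reported for Points are often a steadier basis for comparing the query before and after an index change than the elapsed time alone.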
When I check the execution plan, I see that SQL Server 2008 R2 recommends an index (who would have thought? 😉):
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[Points] ([X], [Y])
INCLUDE ([ID], [Name], [Created])
This is effectively a full index on the table: it contains every column. Its size is “huge”, considering that I am now storing the data twice!
So the query is now much faster! It takes around 75 ms(!). A very big improvement, BUT I need almost double the space for it.
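To put a number on the "almost double" space, the size of each index on the table can be listed. The query below is a sketch that assumes the table lives in the dbo schema and that the login has VIEW DATABASE STATE permission; the clustered index row is the table data itself.

```sql
-- Space used per index on dbo.Points, in KB (one page = 8 KB).
SELECT i.name      AS index_name,
       i.type_desc AS index_type,
       SUM(ps.used_page_count) * 8 AS used_kb
FROM sys.dm_db_partition_stats ps
INNER JOIN sys.indexes i
    ON i.object_id = ps.object_id
   AND i.index_id  = ps.index_id
WHERE ps.object_id = OBJECT_ID(N'dbo.Points')
GROUP BY i.name, i.type_desc;
```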
My question is simple: is there any way to tell SQL Server how to store the values in the columns, or any other trick to avoid storing the data twice?
UPDATE:
In other words: is there any trick to avoid the “full index” while keeping the same performance?
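Change your PointSearchHelper table to store the ID of each point (in a points_id column) rather than the x, y coordinates.
When you do the join, do it on points_id instead. This should reduce space and increase performance: the join can then use the primary key of Points, which is already indexed, so no second copy of the data is needed.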
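The code for this answer did not make it into the post, so the following is only a sketch of what the change might look like. The column name points_id comes from the text above; the primary key and the foreign key on the helper table are assumptions added for illustration.

```sql
-- Assumes the old coordinate-based PointSearchHelper has been dropped first.
CREATE TABLE PointSearchHelper
(
    points_id int NOT NULL PRIMARY KEY
        REFERENCES Points (ID)   -- assumed constraint, not stated in the answer
);

-- Join on the ID, so the lookup goes through the primary key of Points.
SELECT p.*
FROM Points p
INNER JOIN PointSearchHelper h
    ON p.ID = h.points_id;
```

Since ID is the PRIMARY KEY of Points, SQL Server creates it as the clustered index by default, so each of the roughly 1,100 lookups can seek straight to the full row.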
PS. I’m having the weirdest problem. Adding an open paren to the code is causing an error in loading the answer.