When I entered my current (employer’s) company a new database schema was designed and will be the base of a lot of the future tools that are/will be created.
With my limited SQL knowledge I think the table is rather well designed.
My only concern is that almost every table has a multy-part primary key. Every table has at least a CustomerId and key of it’s own. While these are indeed defining for a certain record, I have the feeling that multiple keys (we’re talking quadruple here) are very inefficient.
Today I was seeing some unimaginable CPU usage over a simple, repeated query that joins two tables, selects a single string field from the first and distincts them.
select distinct(f.FIELDNAME) as fieldName
from foo f
inner join bar b
on f.id = b.fId
where b.cId = @id;
Checking the execution plan (I’m no EP Hero) I noticed that there are three major CPU points. The distinct (as expected) and two seeks over the indeces.
I would personally think that the indices seek should be extremely fast, but they take up 18% of the cost each. Is this normal? Is it due to the (quadruple) clustered indexes?
–UPDATE–
The query is used for creating a Lucene index. It’s a one-time processing that happens about weekly (sounds contradictive, I know). I can’t re-use any results here as far as I see.
Could you please run the following queries and post their output:
This will help to estimate which indexes best suit your needs.
Meanwhile make sure you have the following indexes:
and rewrite your query:
This query should use an index on
f.FIELDNAMEto build theDISTINCTlist and the index onbarto filter out the non-existent values.