I have a stored procedure that takes 1 to 4 parameters. It must return the rows where the most columns match or, if there are no matching records, the default rows (the ones where those columns are NULL).
Each Sequence value must appear only once in the result.
Example table with data:
Client_Id  Project_ID  Phase  Task  Employee  Sequence
---------  ----------  -----  ----  --------  --------
NULL       NULL        NULL   NULL  Chris     1
NULL       NULL        NULL   NULL  Bob       100
500        NULL        NULL   NULL  Joe       1
500        2           NULL   NULL  Max       1
So the results for Client 100, any project, phase or task would simply be the default NULL records of Chris and Bob. For Client 500 the results would be Joe and Bob. For Client 500, Project 2 the result would be Max and Bob.
Right now I am doing this query by checking the task first then joining it with a query by phase and checking that no rows overlap and doing the same for project then client. It seems incredibly inefficient and there has to be a smarter way about this. Any thoughts?
EDIT – Some query examples. I first check for the case where everything matches:
insert into #TempTracking
select p.employee, p.sequence
from invoices i, projects p
where i.client_id = p.client_id
and i.project_no = p.project_no
and i.phase = p.phase
and i.task = p.task
Then I make the queries less and less specific and check that the sequence does not already exist:
insert into #TempTracking
select p.employee, p.sequence
from invoices i, projects p
where (i.client_id = p.client_id or i.client_id is null)
and (i.project_no = p.project_no or i.project_no is null)
and (i.phase = p.phase or i.phase is null)
and (i.task = p.task or i.task is null)
and NOT EXISTS ( SELECT * FROM #TempTracking t WHERE t.sequence = p.sequence )
“Most of the columns match” is very vague, but I assume you mean that if a search parameter is NULL, or if the value in the table is NULL, then the record could still be included.
If you want the most matching row, or all rows when nothing matches, then you need to count how many of the four columns match for each row and keep only the rows with the highest count (the query starts to get very long).
Note: This approach returns all records in the table when the number of matching fields is zero.
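As a rough sketch of the match-count idea (not necessarily the exact query intended above), the following uses Python's built-in sqlite3 module so it can be tried without a SQL Server instance. The table name projects and its columns are borrowed from the question's queries, and most_matching is a hypothetical helper name. The SQL itself should carry over to T-SQL with little change, using @client-style parameters.

```python
import sqlite3

# Assumption: an in-memory SQLite table stands in for the real table.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE projects (
        client_id INTEGER, project_no INTEGER, phase INTEGER, task INTEGER,
        employee TEXT, sequence INTEGER
    )
""")
conn.executemany(
    "INSERT INTO projects VALUES (?, ?, ?, ?, ?, ?)",
    [
        (None, None, None, None, "Chris", 1),
        (None, None, None, None, "Bob", 100),
        (500, None, None, None, "Joe", 1),
        (500, 2, None, None, "Max", 1),
    ],
)

# Score each row by how many columns equal the supplied parameters,
# then keep only the rows that reach the highest score.
# A comparison with NULL is never true, so NULLs simply score 0.
MATCH_COUNT_SQL = """
WITH scored AS (
    SELECT employee, sequence,
           CASE WHEN client_id  = :client  THEN 1 ELSE 0 END
         + CASE WHEN project_no = :project THEN 1 ELSE 0 END
         + CASE WHEN phase      = :phase   THEN 1 ELSE 0 END
         + CASE WHEN task       = :task    THEN 1 ELSE 0 END AS match_count
    FROM projects
)
SELECT employee, sequence, match_count
FROM scored
WHERE match_count = (SELECT MAX(match_count) FROM scored)
ORDER BY employee
"""

def most_matching(client=None, project=None, phase=None, task=None):
    """Hypothetical helper: return the rows with the highest match count."""
    params = {"client": client, "project": project, "phase": phase, "task": task}
    return conn.execute(MATCH_COUNT_SQL, params).fetchall()
```

With the sample data, most_matching(client=500, project=2) returns only Max, while most_matching(client=100) returns all four rows because every row scores zero, which is the behaviour the note describes.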
I’m sure there are ways this could be cleaned up to be less verbose, for example by inserting all matching rows into a temp table along with the number of columns that match (MatchCount), thereby reducing the query considerably.
Now, since you want unique Sequences and the highest matching row or rows to be returned, the result you’re looking for is, for each Sequence, the row with the highest MatchCount.
Or something very close to that. I don’t have a test table made up at the moment, so I can’t kick it around and see.
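A minimal sketch of that last idea, again using Python's sqlite3 so it can be run as-is. It assumes a slightly stricter rule than the one above: a row only qualifies if every non-NULL column equals the corresponding parameter, which is what the expected results in the question seem to imply. The projects table and the best_per_sequence helper are illustrative names, and a CTE plays the role of the temp table with MatchCount.

```python
import sqlite3

# Assumption: an in-memory SQLite table stands in for the real table.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE projects (
        client_id INTEGER, project_no INTEGER, phase INTEGER, task INTEGER,
        employee TEXT, sequence INTEGER
    )
""")
conn.executemany(
    "INSERT INTO projects VALUES (?, ?, ?, ?, ?, ?)",
    [
        (None, None, None, None, "Chris", 1),
        (None, None, None, None, "Bob", 100),
        (500, None, None, None, "Joe", 1),
        (500, 2, None, None, "Max", 1),
    ],
)

# candidates: rows whose non-NULL columns all equal the parameters.
# For such rows the match count is just the number of non-NULL columns.
# The outer query keeps, per sequence, the candidate(s) with the highest count.
BEST_PER_SEQUENCE_SQL = """
WITH candidates AS (
    SELECT employee, sequence,
           CASE WHEN client_id  IS NOT NULL THEN 1 ELSE 0 END
         + CASE WHEN project_no IS NOT NULL THEN 1 ELSE 0 END
         + CASE WHEN phase      IS NOT NULL THEN 1 ELSE 0 END
         + CASE WHEN task       IS NOT NULL THEN 1 ELSE 0 END AS match_count
    FROM projects
    WHERE (client_id  IS NULL OR client_id  = :client)
      AND (project_no IS NULL OR project_no = :project)
      AND (phase      IS NULL OR phase      = :phase)
      AND (task       IS NULL OR task       = :task)
)
SELECT c.employee, c.sequence
FROM candidates c
WHERE c.match_count = (
    SELECT MAX(c2.match_count)
    FROM candidates c2
    WHERE c2.sequence = c.sequence
)
ORDER BY c.sequence
"""

def best_per_sequence(client=None, project=None, phase=None, task=None):
    """Hypothetical helper: most specific qualifying row for each sequence."""
    params = {"client": client, "project": project, "phase": phase, "task": task}
    return conn.execute(BEST_PER_SEQUENCE_SQL, params).fetchall()
```

If two rows share a sequence and are equally specific, both come back. On SQL Server, ROW_NUMBER() OVER (PARTITION BY sequence ORDER BY match_count DESC) would be one way to force a single row per sequence.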