Preliminaries:
Our application can read data from an attached client SQL Server 2005 or 2008 database but make no changes to it, apart from using temp tables. We can create tables in our own database on their server.
The solution must work in SQL Server 2005.
The Schema:
Here is a simplified idea of the schema.
Group – Defines characteristics of a group of locations
Location – Defines characteristics of one geographic location. It links to the Group table.
GroupCondition – Links to a Group. It defines measures that apply to a subset of locations belonging to that group.
GroupConditionCriteria – Links to GroupCondition table. It names attributes, values, relational operators and boolean operators for a single phrase in a where clause. The named attributes are all fields of the Location table. There is a sequence number. Multiple rows in the GroupConditionCriteria must be strung together in proper sequence to form a full filter condition. This filter condition is implicitly restricted to those Locations that are part of the group associated with the GroupCondition. Location records that satisfy the filter criteria are “Included” and those that do not are “Excluded”.
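To make the criteria rows concrete, here is a minimal sketch of how the example condition used later in this post might be stored. Only the table name and CONDITIONID come from the real schema; every other column name, type and length is a hypothetical stand-in.

```sql
-- Hypothetical layout: one row per phrase of the filter condition.
-- All column names except CONDITIONID are assumptions for illustration.
CREATE TABLE #GroupConditionCriteria
(
    CONDITIONID    int           NOT NULL, -- links to GroupCondition
    SeqNo          int           NOT NULL, -- position of the phrase in the filter
    OpenParens     varchar(10)   NOT NULL, -- '(' characters before the phrase, or ''
    AttributeName  nvarchar(128) NOT NULL, -- a column of Location, e.g. STATECODE
    RelOperator    varchar(2)    NOT NULL, -- relational operator, e.g. = or <>
    AttributeValue nvarchar(100) NOT NULL, -- e.g. TX
    CloseParens    varchar(10)   NOT NULL, -- ')' characters after the phrase, or ''
    BoolOperator   varchar(3)    NOT NULL  -- AND, OR, or '' on the last phrase
);

-- SQL Server 2005 has no multi-row VALUES, so one INSERT per row.
INSERT INTO #GroupConditionCriteria VALUES (38, 1, '(', N'STATECODE', '=', N'TX', '', 'AND');
INSERT INTO #GroupConditionCriteria VALUES (38, 2, '', N'COUNTY', '=', N'Harris County', ')', 'OR');
INSERT INTO #GroupConditionCriteria VALUES (38, 3, '', N'STATECODE', '=', N'FL', '', '');

SELECT * FROM #GroupConditionCriteria ORDER BY CONDITIONID, SeqNo;
```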
The Goal:
Many of our existing queries get attributes from the location table. We would like to join to something (table, temp table, query, CTE, openquery, UDF, etc.) that will give us the GroupCondition information for those Locations that are “Included”. (A location could be included in more than one rule, but that is a separate issue.)
The schema for what I want is:
CREATE TABLE #LocationConditions
(
    [PolicyID] int NOT NULL,
    [LocID] int NOT NULL,
    [CONDITIONID] int NOT NULL,
    [Satisfies Condition] bit NOT NULL,
    [Included] smallint NOT NULL
)
PolicyID identifies the group, LocID identifies the Location, CONDITIONID identifies the GroupCondition, [Satisfies Condition] is 1 if the filter includes the location record. (Included is derived from a different rule table with forced overrides of the filter condition. Not important for this discussion.)
Size of Problem:
My best effort so far can create such a table, but it is slow. For the current database I am testing, there are 50,000 locations affected (either included or excluded) by potentially matching rules (GroupConditions). The execution time is 4 minutes. If we do a periodic refresh and use a permanent table, this could be workable, but I am hoping for something faster.
What I tried:
I used a series of CTEs, one of which is recursive, to concatenate the several parts of the filter condition into one large filter condition. As an example of such a condition:
(STATECODE = 'TX' AND COUNTY = 'Harris County') OR STATECODE = 'FL'
There can be from one to five fields mentioned in the filter condition, and any number of parentheses used to group them. The operators that are supported are lt, le, gt, ge, =, <>, AND and OR.
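For comparison, the concatenation step can also be done without recursion. The sketch below uses the FOR XML PATH string-aggregation idiom, which works in SQL Server 2005. It is not the recursive CTE I actually used, and it assumes each criteria row has already been rendered to a text phrase; the @Criteria table variable and its SeqNo and Phrase columns are hypothetical.

```sql
-- Hypothetical input: one pre-rendered phrase per criteria row.
DECLARE @Criteria TABLE
(
    CONDITIONID int           NOT NULL,
    SeqNo       int           NOT NULL,
    Phrase      nvarchar(200) NOT NULL
);

INSERT INTO @Criteria VALUES (38, 1, N'(STATECODE = ''TX'' AND');
INSERT INTO @Criteria VALUES (38, 2, N'COUNTY = ''Harris County'') OR');
INSERT INTO @Criteria VALUES (38, 3, N'STATECODE = ''FL''');

-- One output row per condition. The correlated subquery joins the phrases
-- in SeqNo order, each prefixed with a space; STUFF removes the leading space.
-- TYPE plus .value() keeps characters such as < and > from being XML-escaped.
SELECT c.CONDITIONID,
       STUFF((SELECT N' ' + c2.Phrase
              FROM @Criteria c2
              WHERE c2.CONDITIONID = c.CONDITIONID
              ORDER BY c2.SeqNo
              FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'),
             1, 1, N'') AS FilterCondition
FROM @Criteria c
GROUP BY c.CONDITIONID;
```

For the three sample rows this should rebuild the example condition shown above as a single string for condition 38.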
Once I have the condition, it is still a text string, so I create an insert statement (that will have to be executed dynamically):
INSERT INTO #LocationConditions
SELECT
    1896,
    390063,
    38,
    CASE WHEN (STATECODE = 'TX' AND COUNTY = 'Harris County') OR STATECODE = 'FL' THEN 1
         ELSE 0
    END,
    1
FROM Location loc
WHERE loc.LocID = 390063
I first add the insert statements to their own temp table, called #InsertStatements, then loop through them with a cursor. I execute each insert using EXEC.
CREATE TABLE #InsertStatements
(
    [Insert Statement] nvarchar(4000) NOT NULL
)
-- Skipping over lots of complicated CTEs that add rows to #InsertStatements
DECLARE @InsertCmd nvarchar(4000)
DECLARE InsertCursor CURSOR FAST_FORWARD
FOR
    SELECT [Insert Statement]
    FROM #InsertStatements
OPEN InsertCursor
FETCH NEXT FROM InsertCursor
INTO @InsertCmd
WHILE @@FETCH_STATUS = 0
BEGIN
    --PRINT @InsertCmd
    EXEC(@InsertCmd)
    FETCH NEXT FROM InsertCursor
    INTO @InsertCmd
END
CLOSE InsertCursor
DEALLOCATE InsertCursor
SELECT *
FROM #LocationConditions
ORDER BY PolicyID, LocID
As you can imagine, executing 50,000 dynamic SQL inserts is slow. How can I speed this up?
You have to insert each row individually? You can’t use
? You didn’t show how you were creating your insert statements, so I can’t tell if it’s dependent on each row or not.
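If the filter text depends only on the GroupCondition and not on the individual location, one direction is to run a single dynamic INSERT ... SELECT per condition instead of one per location. What follows is only a sketch under stated assumptions, not a tested solution: #ConditionFilters and its columns are hypothetical, Location is assumed to carry a PolicyID column linking it to its group, and #LocationConditions is assumed to exist already as defined in the question.

```sql
-- Hypothetical staging table: one row per GroupCondition, holding the
-- concatenated filter text produced by the earlier CTE step.
CREATE TABLE #ConditionFilters
(
    PolicyID        int            NOT NULL,
    CONDITIONID     int            NOT NULL,
    FilterCondition nvarchar(3000) NOT NULL
);

DECLARE @Sql nvarchar(max);
DECLARE @PolicyID int, @ConditionID int, @Filter nvarchar(3000);

DECLARE ConditionCursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT PolicyID, CONDITIONID, FilterCondition
    FROM #ConditionFilters;

OPEN ConditionCursor;
FETCH NEXT FROM ConditionCursor INTO @PolicyID, @ConditionID, @Filter;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- One statement evaluates the filter for every location in the group.
    -- Assumption: Location.PolicyID exists and identifies the group.
    -- The trailing 1 is the same placeholder for [Included] as in the question.
    SET @Sql = N'INSERT INTO #LocationConditions
                 SELECT loc.PolicyID, loc.LocID, @ConditionID,
                        CASE WHEN ' + @Filter + N' THEN 1 ELSE 0 END,
                        1
                 FROM Location loc
                 WHERE loc.PolicyID = @PolicyID;';

    -- The IDs are passed as parameters; only the filter text is concatenated.
    EXEC sp_executesql @Sql,
         N'@PolicyID int, @ConditionID int',
         @PolicyID, @ConditionID;

    FETCH NEXT FROM ConditionCursor INTO @PolicyID, @ConditionID, @Filter;
END

CLOSE ConditionCursor;
DEALLOCATE ConditionCursor;
```

The number of dynamic executions then scales with the number of GroupConditions rather than the number of locations. The filter text is still concatenated into the statement, so this only makes sense if the criteria rows come from a trusted source.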