I have a table with 200,000 rows. I have created a view that removes slices of data from this table based on several criteria that fit my definition of a duplicate record. The code is below, and I was wondering if anyone could suggest a faster, more efficient way of writing this query. It currently takes about 20 seconds to execute, but I was hoping for a couple of seconds at most, if not less. I am using SQL Server 2005. I am very much a beginner with SQL, and I appreciate any help.
WITH dsm_hardware_basic_cte AS
(
SELECT TOP 100 PERCENT
dbo.dsm_hardware_basic.[UUID]
,dbo.dsm_hardware_basic.[Name]
,dbo.dsm_hardware_basic.[LastAgentExecution]
,dbo.dsm_hardware_basic.[MaxUserRegistration]
,REPLACE(RIGHT([MaxUserRegistration], CHARINDEX('/', REVERSE([MaxUserRegistration])) - 1),'_ADMIN','') AS [MaxUserUsername]
,dbo.dsm_hardware_basic.[LastUserRegistration]
,REPLACE(RIGHT([LastUserRegistration], CHARINDEX('/', REVERSE([LastUserRegistration])) - 1),'_ADMIN','') AS [LastUserUsername]
,dbo.dsm_hardware_basic.[IPAddress]
,dbo.dsm_hardware_basic.[HostName]
,dbo.dsm_hardware_basic.[MACAddress]
FROM dbo.dsm_hardware_basic
)
SELECT TOP 100 PERCENT
dsm_hardware_basic_cte.[UUID]
,dsm_hardware_basic_cte.[Name]
,dsm_hardware_basic_cte.[LastAgentExecution]
,dsm_hardware_basic_cte.[MaxUserRegistration]
,dsm_hardware_basic_cte.[LastUserRegistration]
,dsm_hardware_basic_cte.[IPAddress]
,dsm_hardware_basic_cte.[HostName]
,dsm_hardware_basic_cte.[MACAddress]
FROM dsm_hardware_basic_cte
INNER JOIN
(
SELECT [UUID]
,ROW_NUMBER() OVER (PARTITION BY [Name], [MACAddress] ORDER BY [LastAgentExecution] DESC) AS [NameMACRowNum]
FROM dsm_hardware_basic_cte
) AS duplicate_NameMAC_filtered
ON duplicate_NameMAC_filtered.[UUID] = dsm_hardware_basic_cte.[UUID]
AND duplicate_NameMAC_filtered.[NameMACRowNum] = 1
INNER JOIN
(
SELECT [UUID]
,ROW_NUMBER() OVER (PARTITION BY [Name], [HostName] ORDER BY [LastAgentExecution] DESC) AS [NameHostNameRowNum]
FROM dsm_hardware_basic_cte
) AS duplicate_NameHostName_filtered
ON duplicate_NameHostName_filtered.[UUID] = dsm_hardware_basic_cte.[UUID]
AND duplicate_NameHostName_filtered.[NameHostNameRowNum] = 1
INNER JOIN
(
SELECT [UUID]
,ROW_NUMBER() OVER (PARTITION BY [HostName], [MACAddress] ORDER BY [LastAgentExecution] DESC) AS [HostNameMACRowNum]
FROM dsm_hardware_basic_cte
) AS duplicate_HostNameMAC_filtered
ON duplicate_HostNameMAC_filtered.[UUID] = dsm_hardware_basic_cte.[UUID]
AND duplicate_HostNameMAC_filtered.[HostNameMACRowNum] = 1
INNER JOIN
(
SELECT [UUID]
,ROW_NUMBER() OVER (PARTITION BY [HostName], [IPAddress] ORDER BY [LastAgentExecution] DESC) AS [HostNameIPAddressRowNum]
FROM dsm_hardware_basic_cte
) AS duplicate_HostNameIPAddress_filtered
ON duplicate_HostNameIPAddress_filtered.[UUID] = dsm_hardware_basic_cte.[UUID]
AND duplicate_HostNameIPAddress_filtered.[HostNameIPAddressRowNum] = 1
INNER JOIN
(
SELECT [UUID]
,ROW_NUMBER() OVER (PARTITION BY [Name], [MaxUserUsername] ORDER BY [LastAgentExecution] DESC) AS [NameMaxUserRowNum]
FROM dsm_hardware_basic_cte
) AS duplicate_NameMaxUser_filtered
ON duplicate_NameMaxUser_filtered.[UUID] = dsm_hardware_basic_cte.[UUID]
AND duplicate_NameMaxUser_filtered.[NameMaxUserRowNum] = 1
INNER JOIN
(
SELECT [UUID]
,ROW_NUMBER() OVER (PARTITION BY [Name], [LastUserUsername] ORDER BY [LastAgentExecution] DESC) AS [NameLastUserRowNum]
FROM dsm_hardware_basic_cte
) AS duplicate_NameLastUser_filtered
ON duplicate_NameLastUser_filtered.[UUID] = dsm_hardware_basic_cte.[UUID]
AND duplicate_NameLastUser_filtered.[NameLastUserRowNum] = 1
I don’t know exactly what your needs are, but I’d try rewriting the query like this:
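Roughly along these lines. This is an untested sketch, and it assumes that [UUID] uniquely identifies a row in dbo.dsm_hardware_basic (your joins on [UUID] already rely on that). The second CTE name, ranked, is just a name I made up. The idea is to compute all six row numbers in one pass over the CTE and then keep only the rows that are number 1 in every partition, instead of joining the CTE to itself six times:

```sql
WITH dsm_hardware_basic_cte AS
(
    -- Same derived columns as in your query; TOP 100 PERCENT is dropped
    -- because it has no effect without an ORDER BY.
    SELECT
         [UUID]
        ,[Name]
        ,[LastAgentExecution]
        ,[MaxUserRegistration]
        ,REPLACE(RIGHT([MaxUserRegistration], CHARINDEX('/', REVERSE([MaxUserRegistration])) - 1), '_ADMIN', '') AS [MaxUserUsername]
        ,[LastUserRegistration]
        ,REPLACE(RIGHT([LastUserRegistration], CHARINDEX('/', REVERSE([LastUserRegistration])) - 1), '_ADMIN', '') AS [LastUserUsername]
        ,[IPAddress]
        ,[HostName]
        ,[MACAddress]
    FROM dbo.dsm_hardware_basic
),
ranked AS  -- hypothetical name: one row per source row, plus six rankings
(
    SELECT
         [UUID]
        ,[Name]
        ,[LastAgentExecution]
        ,[MaxUserRegistration]
        ,[LastUserRegistration]
        ,[IPAddress]
        ,[HostName]
        ,[MACAddress]
        ,ROW_NUMBER() OVER (PARTITION BY [Name], [MACAddress]       ORDER BY [LastAgentExecution] DESC) AS [NameMACRowNum]
        ,ROW_NUMBER() OVER (PARTITION BY [Name], [HostName]         ORDER BY [LastAgentExecution] DESC) AS [NameHostNameRowNum]
        ,ROW_NUMBER() OVER (PARTITION BY [HostName], [MACAddress]   ORDER BY [LastAgentExecution] DESC) AS [HostNameMACRowNum]
        ,ROW_NUMBER() OVER (PARTITION BY [HostName], [IPAddress]    ORDER BY [LastAgentExecution] DESC) AS [HostNameIPAddressRowNum]
        ,ROW_NUMBER() OVER (PARTITION BY [Name], [MaxUserUsername]  ORDER BY [LastAgentExecution] DESC) AS [NameMaxUserRowNum]
        ,ROW_NUMBER() OVER (PARTITION BY [Name], [LastUserUsername] ORDER BY [LastAgentExecution] DESC) AS [NameLastUserRowNum]
    FROM dsm_hardware_basic_cte
)
SELECT
     [UUID]
    ,[Name]
    ,[LastAgentExecution]
    ,[MaxUserRegistration]
    ,[LastUserRegistration]
    ,[IPAddress]
    ,[HostName]
    ,[MACAddress]
FROM ranked
-- A row survives only if it is the most recent one in every partition,
-- which is what the six INNER JOINs with "... RowNum = 1" express.
WHERE [NameMACRowNum] = 1
  AND [NameHostNameRowNum] = 1
  AND [HostNameMACRowNum] = 1
  AND [HostNameIPAddressRowNum] = 1
  AND [NameMaxUserRowNum] = 1
  AND [NameLastUserRowNum] = 1
```

The row numbers have to be filtered in an outer query because SQL Server does not allow window functions such as ROW_NUMBER() directly in a WHERE clause.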
I think that your query and mine are logically equivalent. The optimizer might be smart enough to have reduced your query to mine, but give it a spin and see! A couple of notes: