I am working with SQL 2000. I have gotten to a point where I

Asked: June 4, 2026

I am working with SQL 2000. I have gotten to a point where I can remove all of the unwanted duplicates based on a complicated set of criteria, but the query now takes hours to complete, whereas it used to take only about 3.5 minutes to get the data with the duplicates included.

For clarity:
I can have a duplicate rpt.Name field as long as either the rpt.HostName or the rpt.SystemSerialNumber field is also different. Also, I have to determine which entry to keep based on the time stamps in four different columns, because some of those columns have missing time stamps.

Any help is greatly appreciated!

SELECT 
rpt.[Name],
rpt.LastAgentExecution,
rpt.GroupName,
rpt.PackageName,
rpt.PackageVersion,
rpt.ProcedureName,
rpt.HostName,
rpt.SystemSerialNumber,
rpt.JobCreationTime,
rpt.JobActivationTime,
rpt.[Job Completion Time]
FROM DSM_StandardGroupMembersProcedureActivityViewExt rpt
WHERE
(
  (
      rpt.GroupName = 'Adobe Acrobat 7 Deploy'
   OR rpt.GroupName = 'Adobe Acrobat 8 Deploy'
  )
  AND
  (
      (rpt.PackageName = 'Adobe Acrobat 7' AND rpt.PackageVersion = '-1.0')
   OR (rpt.PackageName = 'Adobe Acrobat 8' AND rpt.PackageVersion = '-3.0')
  )
)
AND NOT EXISTS
(
  SELECT *
  FROM   DSM_StandardGroupMembersProcedureActivityViewExt rpt_dupe
  WHERE
  (
    (
     rpt.GroupName = 'Adobe Acrobat 7 Deploy'
      OR rpt.GroupName = 'Adobe Acrobat 8 Deploy'
    )
    AND
    (
     (rpt.PackageName = 'Adobe Acrobat 7' AND rpt.PackageVersion = '-1.0')
      OR (rpt.PackageName = 'Adobe Acrobat 8' AND rpt.PackageVersion = '-3.0')
    )
    AND
    (
      (rpt_dupe.[Name] = rpt.[Name])
      AND
      (
       (rpt_dupe.SystemSerialNumber = rpt.SystemSerialNumber)
    OR (rpt_dupe.HostName = rpt.HostName)
      )
      AND
      (
       (rpt_dupe.LastAgentExecution    < rpt.LastAgentExecution)
    OR (rpt_dupe.JobActivationTime     < rpt.JobActivationTime)
    OR (rpt_dupe.JobCreationTime       < rpt.JobCreationTime)
    OR (rpt_dupe.[Job Completion Time] < rpt.[Job Completion Time])
      )
    )
  )
)

1 Answer

Editorial Team · Answer 1 · 2026-06-04T00:38:46+00:00

The likely cause of the slowdown is the not exists clause.

One suggestion is to rewrite this as a left outer join:

 from <big query> left outer join
      <dups query>
      on <all the fields that constitute a match>
 where <dups query>.<some field> is null

I’ve found that not exists and not in often optimize poorly.
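Applied to the query in the question, the left outer join pattern might look like the sketch below. It keeps the same matching conditions as the original not exists subquery, so rpt_dupe is deliberately not filtered by group or package (the original subquery only repeats those filters on rpt). Whether this actually runs faster on SQL 2000 is an assumption that needs to be checked against the execution plan.

```sql
-- Sketch only: anti-join form of the NOT EXISTS filter from the question.
SELECT rpt.[Name], rpt.LastAgentExecution, rpt.GroupName, rpt.PackageName,
       rpt.PackageVersion, rpt.ProcedureName, rpt.HostName, rpt.SystemSerialNumber,
       rpt.JobCreationTime, rpt.JobActivationTime, rpt.[Job Completion Time]
FROM DSM_StandardGroupMembersProcedureActivityViewExt rpt
LEFT OUTER JOIN DSM_StandardGroupMembersProcedureActivityViewExt rpt_dupe
       -- same "is an earlier duplicate" conditions as the original subquery
       ON  rpt_dupe.[Name] = rpt.[Name]
       AND (   rpt_dupe.SystemSerialNumber = rpt.SystemSerialNumber
            OR rpt_dupe.HostName           = rpt.HostName)
       AND (   rpt_dupe.LastAgentExecution    < rpt.LastAgentExecution
            OR rpt_dupe.JobActivationTime     < rpt.JobActivationTime
            OR rpt_dupe.JobCreationTime       < rpt.JobCreationTime
            OR rpt_dupe.[Job Completion Time] < rpt.[Job Completion Time])
WHERE rpt.GroupName IN ('Adobe Acrobat 7 Deploy', 'Adobe Acrobat 8 Deploy')
  AND (   (rpt.PackageName = 'Adobe Acrobat 7' AND rpt.PackageVersion = '-1.0')
       OR (rpt.PackageName = 'Adobe Acrobat 8' AND rpt.PackageVersion = '-3.0'))
  -- keep only rows for which no matching rpt_dupe row was found
  AND rpt_dupe.[Name] IS NULL
```

Testing rpt_dupe.[Name] for null is safe here because any matched row must satisfy rpt_dupe.[Name] = rpt.[Name], so a null can only come from the outer join finding no match.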

Another suggestion is to change this query to a more direct implementation:

with t as (
    SELECT rpt.[Name], rpt.LastAgentExecution, rpt.GroupName, rpt.PackageName,
           rpt.PackageVersion, rpt.ProcedureName, rpt.HostName, rpt.SystemSerialNumber, 
            rpt.JobCreationTime, rpt.JobActivationTime, rpt.[Job Completion Time]
    FROM DSM_StandardGroupMembersProcedureActivityViewExt rpt
    WHERE rpt.GroupName in ('Adobe Acrobat 7 Deploy', 'Adobe Acrobat 8 Deploy') AND
          ((rpt.PackageName = 'Adobe Acrobat 7' AND rpt.PackageVersion = '-1.0') OR
           (rpt.PackageName = 'Adobe Acrobat 8' AND rpt.PackageVersion = '-3.0')
          )
 )
 select t.*
 from t join
      (select name, ..., max(id) as id
       from t
       group by name, ...
      ) tsum
      on t.id = tsum.id

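That is, summarize the table by the columns that you want to be distinct, and choose one row from each group. Here, I assume there is an “id” field that uniquely identifies each row; it would also have to be included in the select list of t. You might have to use a combination of fields, such as name and date. Without an id, this is more challenging. Note that the with clause (a common table expression) requires SQL Server 2005 or later; on SQL 2000, the filtered rows can instead be stored in a temporary table or repeated as a derived table. In more recent versions of SQL Server, you can also use row_number().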
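For SQL 2000 specifically, one possible shape of the same idea is sketched below, using a temporary table in place of the with clause. The names #t and id are hypothetical, and the id is generated with the IDENTITY function in SELECT INTO because the question does not mention an existing key. The group by columns are also an assumption: grouping by Name, HostName and SystemSerialNumber only treats rows as duplicates when all three match, which is simpler than the rule in the question, and max(id) keeps an arbitrary row rather than one chosen by time stamp.

```sql
-- Sketch for SQL Server 2000, which has no WITH clause.
-- #t and id are hypothetical names introduced for illustration.

-- Step 1: run the expensive filter once and give every row a unique id.
SELECT IDENTITY(int, 1, 1) AS id,
       rpt.[Name], rpt.LastAgentExecution, rpt.GroupName, rpt.PackageName,
       rpt.PackageVersion, rpt.ProcedureName, rpt.HostName, rpt.SystemSerialNumber,
       rpt.JobCreationTime, rpt.JobActivationTime, rpt.[Job Completion Time]
INTO   #t
FROM   DSM_StandardGroupMembersProcedureActivityViewExt rpt
WHERE  rpt.GroupName IN ('Adobe Acrobat 7 Deploy', 'Adobe Acrobat 8 Deploy')
  AND  (   (rpt.PackageName = 'Adobe Acrobat 7' AND rpt.PackageVersion = '-1.0')
        OR (rpt.PackageName = 'Adobe Acrobat 8' AND rpt.PackageVersion = '-3.0'))

-- Step 2: keep one row per group (assumed grouping columns, arbitrary row).
SELECT t.[Name], t.LastAgentExecution, t.GroupName, t.PackageName,
       t.PackageVersion, t.ProcedureName, t.HostName, t.SystemSerialNumber,
       t.JobCreationTime, t.JobActivationTime, t.[Job Completion Time]
FROM   #t t
JOIN   (SELECT MAX(id) AS id
        FROM   #t
        GROUP BY [Name], HostName, SystemSerialNumber
       ) tsum
       ON t.id = tsum.id

-- Step 3: clean up the temporary table.
DROP TABLE #t
```

Because the view is read only once, the second step works on the already filtered rows, which is where most of the saving would be expected to come from.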
