I have a job setup that currently selects records from a table that does not contain a unique index. I realize this could be solved by just putting an index on the table and the relevant columns but, in this scenario for testing purposes, I need to remove the index and then do a select which will also remove duplicates based on 2 columns:
SELECT DISTINCT [author], [pubDate], [dateadded]
FROM [Feeds].[dbo].[socialPosts]
WHERE CAST(FLOOR(CAST(dateadded AS float)) AS datetime) >
DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE() - 2), 0)
AND CAST(FLOOR(CAST(dateadded AS float)) AS datetime) <
DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()), 0)
This selects all records from the day before and I want to dedupe the records based on author and pubdate. This could be a post select or done prior but the idea is to find out if it can be done within a select.
You can use a
GROUP BYand any aggregate function on thedateaddedcolumn to get uniqueauthor, pubdateresults.