I have a SQL Server (2008 R2) database that stores file metadata in a table. Each file has its own row, and each row stores an MD5 hash calculated for that file. I want to list every file whose MD5 value occurs more than once in the table, so I can identify files that have been duplicated over time and decide which copies to delete. I have a rather messy query with several inner joins that worked on my MySQL server a few years ago, but my attempts to adapt it to SQL Server haven't worked yet. Does anyone know of an easier way to do this? Below is the modified MySQL query I was trying. Thanks.
select [IGCSlidesDB].[dbo].[FilePath]
     , [IGCSlidesDB].[dbo].[FileSize]
     , [IGCSlidesDB].[dbo].[MD5]
from [IGCSlidesDB].[dbo].[MD5Tool]
inner join (
    select [IGCSlidesDB].[dbo].[FilePath],
           [IGCSlidesDB].[dbo].[FileSize],
           [IGCSlidesDB].[dbo].[MD5]
    from [IGCSlidesDB].[dbo].[MD5Tool]
    group by [MD5]
    having count(*) > 1
) as t2 on ([IGCSlidesDB].[dbo].[MD5Tool].[MD5] = [t2].[MD5])
order by [IGCSlidesDB].[dbo].[MD5Tool].[FilePath];
Try this:
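A minimal sketch, assuming the table and column names from the question (MD5Tool with FilePath, FileSize and MD5) are correct. Two things in the query from the question stand out. First, SQL Server reads a three-part name such as [IGCSlidesDB].[dbo].[FilePath] as database.schema.object, not as a column, so the columns are better referenced through a table alias. Second, SQL Server does not allow non-aggregated columns in the SELECT list of a GROUP BY query unless they also appear in the GROUP BY clause, which MySQL tolerated. The aliases m and d below are just illustrative names.

```sql
-- List every row whose MD5 value appears more than once in the table.
SELECT m.FilePath,
       m.FileSize,
       m.MD5
FROM [IGCSlidesDB].[dbo].[MD5Tool] AS m
INNER JOIN (
    -- Only the grouped column is selected here, so GROUP BY is valid.
    SELECT MD5
    FROM [IGCSlidesDB].[dbo].[MD5Tool]
    GROUP BY MD5
    HAVING COUNT(*) > 1
) AS d
    ON m.MD5 = d.MD5
-- Sort by hash first so that duplicates sit next to each other.
ORDER BY m.MD5, m.FilePath;
```

Ordering by MD5 first keeps each group of duplicates together, which should make it easier to decide which copy to delete. Rows with a NULL MD5 will not be matched by the join, so they are left out of the result.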