I need to find and remove duplicate .pst files and eventually extract the unique emails. Currently, I am using PowerShell to recurse through folders, pick out only the .pst files, and export specific metadata to a .csv file. It has been suggested that I import the .csv into SQL to do comparisons (name, dates on the files, etc.). After that, I’m stuck.
What language or program would be best suited to keep the files I need and delete the rest? I’m mostly working in VB.NET (could attempt C#) and PowerShell.
I’ll assume that you have imported the .csv into an SQL database. Let’s say the table name is psts.
First, find out how many records have the same email address by grouping on that column and counting the rows in each group.
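A minimal sketch of that count, assuming the table has a column called email_address. The column name is hypothetical; substitute whichever column you actually want to compare, such as the file name or a date.

```sql
-- Assumption: psts has a column named email_address (hypothetical name).
-- Returns one row per distinct value with the number of records sharing it.
SELECT email_address,
       COUNT(*) AS record_count
FROM psts
GROUP BY email_address
ORDER BY record_count DESC;
```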
Next, you don’t want to see the values that occur only once, so keep only the groups whose count is greater than one.
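One way to express that filter is a HAVING clause on the grouped result. As before, email_address is an assumed column name, not something taken from your .csv.

```sql
-- Assumption: psts has a column named email_address (hypothetical name).
-- HAVING filters after grouping, so only values that appear
-- more than once are returned.
SELECT email_address,
       COUNT(*) AS record_count
FROM psts
GROUP BY email_address
HAVING COUNT(*) > 1
ORDER BY record_count DESC;
```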
Finally, to get a list of those records, match the duplicated values back against the full table.
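A sketch of that last step, joining the table to the set of duplicated values so every column of each matching record is returned. The column email_address and the aliases p and d are assumptions for illustration.

```sql
-- Assumption: psts has a column named email_address (hypothetical name).
-- The derived table d holds the values that occur more than once;
-- joining it back to psts lists every record carrying one of them.
SELECT p.*
FROM psts AS p
JOIN (
    SELECT email_address
    FROM psts
    GROUP BY email_address
    HAVING COUNT(*) > 1
) AS d
    ON d.email_address = p.email_address
ORDER BY p.email_address;
```

Sorting by the compared column puts the duplicates next to each other, which makes it easier to decide which record in each group to keep. Rows where the column is NULL are not matched by this join, so they would need to be checked separately.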