I have a directory with several thousand text files that I need to process. Some of these files are identical, while others are identical except for a timestamp that differs by a few seconds or milliseconds. I need some way to automate deleting the duplicates so that only one copy of each file is kept.
I’m thinking of something like:
while there are files in the directory still
{
    get file                    // e.g., file0001
    while (file == file + 1)    // e.g., file0001 == file0002 using 'fc' command
    {
        delete file + 1
    }
    move file to another directory
}
Is something like this even possible in Microsoft Windows Server 2003’s DOS?
Of course it is. Everything is possible in batch. 😀
This batch file doesn’t actually delete any files. It just echoes the result of each comparison. When it reports two files as identical, you can delete either one of them.
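As a rough illustration of that comparison step (not the original script), a single check could look something like the sketch below. It assumes the two file names are passed as arguments and relies on fc setting ERRORLEVEL to 0 when the files match, 1 when they differ and 2 when a file cannot be read. The name CompareTwo.bat is made up for this example.

```bat
@echo off
rem CompareTwo.bat: hypothetical helper, not part of the original answer.
rem Usage: CompareTwo file1 file2
rem fc /b does a byte-by-byte comparison; its output is discarded and
rem only the exit code is used.
fc /b "%~1" "%~2" >nul 2>&1
if errorlevel 2 (
    echo ERROR: could not compare "%~1" and "%~2"
) else if errorlevel 1 (
    echo DIFFERENT: "%~1" and "%~2"
) else (
    echo SAME: "%~1" and "%~2"
)
```

Note that a byte comparison like this reports files whose timestamps differ by even a millisecond as DIFFERENT, so it only catches exact duplicates.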
Save the code as CleanDuplicates.bat and start the program with CleanDuplicates {Folder}.
Provided AS IS, without any guarantees! I don’t want you knocking on my door because your entire server is messed up. 😉
The code actually calls itself recursively. This could probably be done in a different way but hey, it works. It also starts itself again in a new cmd instance, because that makes cleaning up easier. I tested the script on Windows Vista Business, but it should work on Server 2003 as well. Hey, it even has a help function. 😉
It contains two loops that each return every file, so every pair of files gets compared in both directions. When you implement the actual deleting, it may therefore report that some files don’t exist, because they were already deleted in an earlier iteration.
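One way to deal with that, sketched here under my own assumptions rather than taken from the original script, is to check that both files still exist before comparing them. The sketch uses two plain nested for loops instead of recursion, takes the folder as its first argument, and only echoes matches. Because nothing is deleted, each duplicate pair shows up twice in this dry run (once as A, B and once as B, A).

```bat
@echo off
setlocal
rem Sketch only: dry run that lists identical files in the given folder.
if "%~1"=="" (
    echo Usage: %~nx0 {Folder}
    exit /b 1
)
pushd "%~1" || exit /b 1
for %%A in (*) do (
    for %%B in (*) do (
        rem Skip comparing a file with itself and skip files that are already gone.
        if /i not "%%A"=="%%B" if exist "%%A" if exist "%%B" (
            rem fc returns 0 only for identical files, so the echo runs only on a match.
            fc /b "%%A" "%%B" >nul 2>&1 && echo SAME: "%%A" "%%B"
        )
    )
)
popd
endlocal
```

To delete for real, the echo could be swapped for del "%%B". The "if exist" guard then keeps the second pairing from failing on a file that is already gone, and the first file of each duplicate group is the one that survives. Try it on a copy of the folder first.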