I found a tool to find duplicate files, and now I’m ready to delete the duplicates. I stared at the format of its output file for a bit and came up with this script:
#!/usr/bin/env ruby
contents = File.open('fdupes-result', 'rb') { |f| f.read }
duplicates = contents.split("\n\n")
duplicates.each do |set|
  list = set.split("\n").reverse
  list.drop(1).each do |filename|
    # print "rm #{filename}"
    %x[ rm #{filename} ]
  end
end
The commented-out print statement is what I used to test the script non-destructively, and it seemed to work great. But when I added the %x command, I started getting this error:
sh: 1: Syntax error: "(" unexpected
I didn’t know what was happening, until I realized that the file has names like /media/LilGalactus/music/Robert Johnson - Complete Recordings/Robert Johnson - The Complete Recordings (Disc2of2)[EAC-FLAC](oan)/s.gif — i.e., my data is full of unescaped spaces, brackets, and the like. For about 30 seconds I considered escaping the characters by using vim commands like :s/\(/\\\(/g and I think the limitations of that method are pretty obvious. Any ideas?
Instead of escaping them, just add double quotes like this:
%x[ rm "#{filename}" ]
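One caveat: inside double quotes the shell still interprets ", $, backticks, and backslashes, so a filename containing any of those will still break the command. A way around quoting altogether is to not go through a shell at all. Here is a minimal sketch of that idea, assuming the result file has the layout the question describes (groups separated by blank lines, one path per line). The helper names redundant_paths and delete_paths are made up for illustration; File.delete and Shellwords.escape are from Ruby's core and standard library.

```ruby
#!/usr/bin/env ruby
require 'shellwords'

# Hypothetical helper: takes the text of the result file and returns every
# path except one per group. Like the original script, it reverses each
# group and drops the first entry, so the last listed file is the one kept.
def redundant_paths(fdupes_output)
  fdupes_output.split("\n\n").flat_map do |group|
    paths = group.split("\n").reject(&:empty?)
    paths.reverse.drop(1)
  end
end

# Hypothetical helper: deletes the given paths without invoking a shell,
# so spaces, brackets, quotes and dollar signs need no escaping.
# With dry_run: true it only prints an equivalent, shell-safe rm command.
def delete_paths(paths, dry_run: true)
  paths.each do |path|
    if dry_run
      puts "rm -- #{Shellwords.escape(path)}"
    else
      File.delete(path)
    end
  end
end
```

To use it, something like delete_paths(redundant_paths(File.read('fdupes-result'))) prints what would be removed, and passing dry_run: false actually deletes. If you really do want to call rm, the multi-argument form system('rm', '--', filename) also bypasses the shell.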