I’ve got two text files, each with several hundred lines. Some of the lines exist in both files, and I want to remove those so that they exist in only one of the files. Basically, I want to reduce them to get a unique set of lines. The catch is that I can’t sort them (they are stripped-down dumps of my Chromium history).
What is the easiest way to do this?
I tried WinDiff, but that gave incorrect results. I figure that I could knock together a PHP script in a while, but am hoping that there is an easier way (preferably a command-line tool).
Well, I ended up writing a PHP script after all.
I read each file into a string, then exploded the strings into arrays using \r\n as the delimiter. I then iterated through the arrays to remove any elements that also exist in the other array, and finally dumped them back out to a file.
The only problem was that when I tried to refactor the stripping routine into a function, I found that passing the array that gets changed (elements removed) by reference slowed it down to the point of needing to be Ctrl-C'd, so I just passed by value and returned the new array (counterintuitive). Also, using unset to delete the elements was slow no matter what, so I just set the element to an empty string and skipped those during the dump.
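For anyone who wants the same result without the slowdown, here is a minimal sketch of the approach described above, written in Python rather than PHP. It is not my original script: the function names (remove_shared_lines, dedupe_files) and the file arguments are made up for illustration, and it swaps the element-by-element removal for a set lookup, which avoids modifying an array while looping over it. Line order is preserved, so nothing gets sorted.

```python
from pathlib import Path


def remove_shared_lines(keep_lines, strip_lines):
    """Return strip_lines minus every line that also appears in keep_lines.

    The original order of strip_lines is preserved; nothing is sorted.
    """
    seen = set(keep_lines)  # set membership checks are fast
    return [line for line in strip_lines if line not in seen]


def dedupe_files(keep_path, strip_path, out_path):
    """Write to out_path the lines of strip_path that are not in keep_path.

    All three path arguments are hypothetical names for this sketch.
    """
    # splitlines() handles both \r\n and \n line endings
    keep_lines = Path(keep_path).read_text(encoding="utf-8").splitlines()
    strip_lines = Path(strip_path).read_text(encoding="utf-8").splitlines()

    result = remove_shared_lines(keep_lines, strip_lines)

    # Build a new list instead of deleting in place, then dump it out
    Path(out_path).write_text(
        "".join(line + "\n" for line in result), encoding="utf-8"
    )
    return result
```

Calling dedupe_files("a.txt", "b.txt", "b-unique.txt") would leave a.txt untouched and write only the lines unique to b.txt, in their original order. Duplicates inside b.txt itself are kept, since the description above only removes lines shared between the two files.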