I have a plain text file with words separated by commas, for example:
word1, word2, word3, word2, word4, word5, word3, word6, word7, word3
I want to delete the duplicates so that it becomes:
word1, word2, word3, word4, word5, word6, word7
Any ideas? I think egrep can help me, but I’m not sure exactly how to use it.
Assuming that the words are one per line, and the file is already sorted:
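A minimal sketch for this case, using uniq, which only collapses identical lines that are next to each other. The file names words.txt and unique.txt are placeholders, and the sample data is made up for illustration:

```shell
# Sample input (hypothetical): one word per line, already sorted
printf '%s\n' word1 word2 word2 word3 word3 > words.txt

# uniq drops repeated adjacent lines, so sorted input is required
uniq words.txt > unique.txt
cat unique.txt
```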
If the file’s not sorted:
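One way to handle unsorted input is to sort first. The sketch below uses sort -u, which should behave like piping sort into uniq; words.txt and unique.txt are again placeholder names with made-up sample data:

```shell
# Sample input (hypothetical): one word per line, in no particular order
printf '%s\n' word3 word1 word2 word1 word3 > words.txt

# sort -u sorts and keeps one copy of each line (like: sort words.txt | uniq)
sort -u words.txt > unique.txt
cat unique.txt
```

Note that the result comes out in sorted order, not in the order the words first appeared.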
If they’re not one per line, and you don’t mind them being one per line:
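A possible form of this, assuming tr is used to turn every run of whitespace into a single newline before sorting. The file names and sample line are placeholders:

```shell
# Sample input (hypothetical): comma-separated words on a single line
printf '%s\n' 'word1, word2, word3, word2, word4' > words.txt

# tr -s maps each whitespace character to a newline and squeezes repeats,
# giving one token per line; sort -u then removes the duplicates
tr -s '[:space:]' '\n' < words.txt | sort -u > unique.txt
cat unique.txt
```

The commas stay attached to the words here, so "word2," and a trailing "word2" without a comma would count as different entries.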
That doesn’t remove punctuation, though, so maybe you want:
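A sketch of one way to do that, assuming punctuation is treated as a separator in the same way as whitespace. The file names and sample line are placeholders:

```shell
# Sample input (hypothetical): comma-separated words on a single line
printf '%s\n' 'word1, word2, word3, word2, word4' > words.txt

# Treat whitespace and punctuation alike: each becomes a newline,
# runs of newlines are squeezed, then duplicates are removed
tr -s '[:space:][:punct:]' '\n' < words.txt | sort -u > unique.txt
cat unique.txt
```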
But that removes the hyphen from hyphenated words. See “man tr” for more options.