I have two files, large_input and subset_input, and their contents could be:
large_input
1
34
65
7643
hello
we
subset_input
65
we
hello
34
In this case the sort command is not very helpful; otherwise, running sort | uniq on both files followed by diff would have been very useful.
Question
In such a scenario, where the data cannot be sorted (because of its contents), what's the best way to find
large_input – subset_input, which would be:
1
7643
works for me,
output:
Some shells don’t support <(sort fileX), so you might have to presort the files in place, like sort -o file1 file1; sort -o file2 file2; ...
The sed expressions strip the diff markup from the output. To see what it is doing, first remove the sed completely, then add back one section (delimited by semicolons) at a time.
I hope this helps.
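The command this answer refers to is not preserved above, so the following is only a minimal sketch of the same idea, not the original command. It assumes the sample data from the question, sorts into copies instead of in place (the names large_sorted and subset_sorted are made up for illustration), and uses a sed script whose sections are my own guess at a reasonable cleanup of diff's normal output format.

```shell
# Work in a scratch directory so no real files are touched (hypothetical setup).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data taken from the question.
printf '%s\n' 1 34 65 7643 hello we > large_input
printf '%s\n' 65 we hello 34 > subset_input

# Presort into copies. In bash, diff <(sort large_input) <(sort subset_input)
# would avoid these intermediate files.
sort large_input > large_sorted
sort subset_input > subset_sorted

# Step 1: raw diff output, with change commands such as "1d0" and "< " markers.
# diff exits with status 1 when the files differ, which is expected here.
diff large_sorted subset_sorted || true

# Step 2: the same diff, cleaned up by sed. Each section between semicolons
# removes one kind of markup:
#   /^[0-9]/d   drop change commands like "1d0" or "4d3"
#   /^---$/d    drop the separator inside changed hunks
#   /^> /d      drop lines that exist only in subset_input
#   s/^< //     strip the marker from lines that exist only in large_input
result=$(diff large_sorted subset_sorted | sed '/^[0-9]/d; /^---$/d; /^> /d; s/^< //')
printf '%s\n' "$result"
```

For the sample data, step 2 prints 1 and 7643, each on its own line. Adding the sed sections back one at a time, as suggested above, shows which part of the diff output each one removes.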