# comm -12 /tmp/src /tmp/txt | wc -l
10338
# join /tmp/src /tmp/txt | wc -l
10355
Both files are single columns of alphanumeric strings, and both are sorted. Shouldn't the two counts be the same?
Updated following @Kevin's answer below:
cat /tmp/txt | sed 's/^[:space:]*//' > /tmp/stxt
cat /tmp/src | sed 's/^[:space:]*//' > /tmp/ssrc
and the result:
# join /tmp/ssrc /tmp/stxt | wc -l
516
# comm -12 /tmp/ssrc /tmp/stxt | wc -l
513
On manual inspection of the diffs … the results differ because of some whitespace that the sed command did not remove.
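A possible reason the sed left whitespace behind: a POSIX character class such as [:space:] only has its special meaning inside a bracket expression, so the pattern needs to be written [[:space:]]. With single brackets it is read as a set of the literal characters ':', 's', 'p', 'a', 'c' and 'e', and some sed versions reject it with an error instead. A minimal sketch of the corrected form, using a made-up sample line rather than the real data:

```shell
# Made-up sample line with leading and trailing blanks (not from /tmp/src).
line='   spade42  '

# [[:space:]] is the POSIX class placed inside a bracket expression.
# The first expression strips leading whitespace, the second strips trailing.
stripped=$(printf '%s\n' "$line" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')

printf '[%s]\n' "$stripped"   # prints [spade42]
```

Stripping both ends matters here because trailing whitespace is enough to make two otherwise equal lines differ for comm.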
I haven't used either extensively, but from a quick look at the man pages and some test input, it seems that when the two files differ, comm prints the lines from both files while join prints only the matching lines. The -12 took care of that. You could store the output of the two commands in files and diff them to see how they differ.
Edit:
By default, join compares only the first whitespace-separated field of each line, while comm compares the whole line. Any whitespace within a line can therefore make the two outputs differ.
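The difference can be reproduced with two small files. This is only a sketch: the file names and contents are made up for illustration, and LC_ALL=C is set as an assumption so that sort order is the same for both tools.

```shell
# Use byte-wise collation so comm and join agree on what "sorted" means.
export LC_ALL=C

# Hypothetical sample files, already sorted; "abd" carries a second field.
dir=$(mktemp -d)
printf 'abc\nabd x\nabe\n' > "$dir/a"
printf 'abc\nabd y\nabf\n' > "$dir/b"

# comm -12 compares whole lines: only "abc" is identical in both files.
comm_count=$(comm -12 "$dir/a" "$dir/b" | wc -l)

# join compares only the first field: "abc" and "abd" both match.
join_count=$(join "$dir/a" "$dir/b" | wc -l)

echo "comm: $comm_count  join: $join_count"
rm -r "$dir"
```

Here "abd x" and "abd y" are different lines for comm but share the same join field, so join reports one more match than comm.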