This isn’t working like I expect, despite all research. I must be missing something…
File 1…
# cat file1.csv
1 123 JohnDoe
1 456 BobDylan
1 789 BillyJean
File 2…
# cat file2.csv
111 123 DaddyDoe
222 456 DaddyDylan
666 777 Stranger
555 789 DaddyJean
444 888 Stranger
333 999 Stranger
I am trying to join on the second field of both files. When I perform a left outer join and include only fields from the first file, everything seems dandy.
# join -1 2 -2 2 -a 1 -o 1.2 1.3 file1.csv file2.csv
123 JohnDoe
456 BobDylan
789 BillyJean
But as soon as I include a field from the second file, it all goes wack.
# join -1 2 -2 2 -a 1 -o 1.2 1.3 2.3 file1.csv file2.csv
DaddyDoeoe
DaddyDylann
789 BillyJean DaddyJean
The last line looks perfect! What’s up with the others? Any idea? Thanks in advance!
EDIT: Here is my attempt with actual CSVs.
# cat file1.csv
1,123,JohnDoe
1,456,BobDylan
1,789,BillyJean
# cat file2.csv
111,123,DaddyDoe
222,456,DaddyDylan
666,777,Stranger
555,789,DaddyJean
444,888,Stranger
333,999,Stranger
# join -t, -1 2 -2 2 -a 1 -o 1.2 1.3 2.3 file1.csv file2.csv
,DaddyDoeoe
,DaddyDylann
789,BillyJean,DaddyJean
You used the -a option. In addition, the odd overwriting behavior indicates that you have embedded carriage returns (\r). I would examine those files closely with cat -v or a text editor that doesn't try to be "smart" about Windows files.
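To make the carriage-return check concrete, here is a minimal sketch. It assumes the inputs have Windows (CRLF) line endings on every line except the last line of file1.csv, which is only a guess that happens to match the symptom in the question. The scratch directory and the .clean file names are made up for illustration, and -o 1.2,1.3,2.3 is the comma-separated form of the same field list.

```shell
#!/bin/sh
# Scratch directory so the real files are left untouched.
tmp=$(mktemp -d)

# Hypothetical recreation of the inputs: CRLF line endings,
# except on the last line of file1.csv (an assumption).
printf '1 123 JohnDoe\r\n1 456 BobDylan\r\n1 789 BillyJean\n' > "$tmp/file1.csv"
printf '111 123 DaddyDoe\r\n222 456 DaddyDylan\r\n666 777 Stranger\r\n555 789 DaddyJean\r\n444 888 Stranger\r\n333 999 Stranger\r\n' > "$tmp/file2.csv"

# cat -v prints each carriage return as ^M, which makes it visible.
cat -v "$tmp/file1.csv"

# Strip every carriage return into clean copies.
tr -d '\r' < "$tmp/file1.csv" > "$tmp/file1.clean"
tr -d '\r' < "$tmp/file2.csv" > "$tmp/file2.clean"

# Same join as in the question, run on the cleaned copies.
fixed=$(join -1 2 -2 2 -a 1 -o 1.2,1.3,2.3 "$tmp/file1.clean" "$tmp/file2.clean")
printf '%s\n' "$fixed"
```

If the cat -v output shows ^M at the end of the first two lines, the carriage returns are what sends the cursor back to the start of the line and makes the third field overwrite the first two on the terminal. After tr -d '\r', all three lines of the join should print in full.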