I would like to match two files based on one column and combine the matching lines. But one of the files (file1.txt) has the same entry more than once. As an example:
file1.txt
chr:123 a
chr:123 b
chr:456 a
file2.txt
chr:123 aa
chr:456 bb
I would like to match the lines on the first column and append the second column of file2.txt to every matching line of file1.txt.
The final output should look like:
chr:123 a aa
chr:123 b aa
chr:456 a bb
I tried intersect() in R but couldn’t figure out how to combine the matching lines when file1.txt has the same entry more than once.
At the moment I am using two for loops, but the files are very big and it takes a long time.
Is there a quicker way to do this in Perl or R?
Try this:
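Here is a minimal Perl sketch. It assumes both files are whitespace-separated, the key is the first column, and each key appears only once in file2.txt (if a key repeats there, the last occurrence wins). The sub names load_lookup and join_on_first_column are just illustrative. The idea is to read file2.txt into a hash once and then stream through file1.txt, so each line of file1.txt costs a single hash lookup instead of a scan over file2.txt:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read the lookup file (file2.txt) into a hash: first column => rest of the line.
# Assumption: each key occurs once here; a repeated key overwrites the earlier one.
sub load_lookup {
    my ($path) = @_;
    my %value_for;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    while (my $line = <$fh>) {
        chomp $line;
        my ($key, $value) = split ' ', $line, 2;
        next unless defined $value;    # skip blank or one-column lines
        $value_for{$key} = $value;
    }
    close $fh;
    return \%value_for;
}

# Stream through file1.txt and print each line whose key is in the lookup,
# with the looked-up value appended. Repeated keys in file1.txt are fine
# because every line is handled independently.
sub join_on_first_column {
    my ($path, $value_for, $out) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    while (my $line = <$fh>) {
        chomp $line;
        my ($key) = split ' ', $line;
        next unless defined $key && exists $value_for->{$key};
        print {$out} "$line $value_for->{$key}\n";
    }
    close $fh;
    return;
}

# Runs only when exactly two file names are given on the command line.
if (@ARGV == 2) {
    my ($file1, $file2) = @ARGV;
    join_on_first_column($file1, load_lookup($file2), \*STDOUT);
}
```

Saved under a name of your choice (merge.pl is only an example), it can be run as: perl merge.pl file1.txt file2.txt > merged.txt. On the sample files from the question it prints the three lines shown as the desired output. Lines of file1.txt whose key is missing from file2.txt are skipped, and only file2.txt has to fit in memory.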