I have 3 TSV files containing different data about my employees. I can join the data on the employees' last name and first name, which appear in each file.
I would like to gather all the data for each employee in only one spreadsheet.
(I can’t just copy and paste the columns, because some employees are missing from file number 2, for example, but appear in file number 3.)
I am a beginner, but I think a script could do this: for each employee (one row), gather as much data as possible from the three files and write it to a new TSV file.
Edit.
Example of what I have (in reality I have approximately 300 rows in each file, and some employees are not in all files).
file 1
john hudson 03/03 male
mary kate 34/04 female
harry loup 01/01 male
file 2
harry loup 1200$
file3
mary kate atlanta
What I want :
column1 column2 column3 column4 column5 column6
john hudson 03/03 male
mary kate 34/04 female atlanta
harry loup 01/01 male 1200$
It would help me a lot!
Use this Python script:
The script loads each file into a dictionary (the first column is used as the key).
Then the script iterates through the values of the first column in the first file and
writes the corresponding values from the dictionaries (that were created from the other files).
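A minimal sketch of the approach described above might look like the following. It is an illustration under stated assumptions, not necessarily the exact script referred to: the function names `load_table` and `merge_files` are hypothetical, the files are assumed to be tab-separated without a header row, and the key is assumed to be the first two columns (first name and last name, as in the question) rather than only the first one. Set `key_cols=1` to key on the first column alone.

```python
import csv


def load_table(path, key_cols=2):
    """Read a TSV file into a dict mapping the key columns to the remaining columns."""
    table = {}
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f, delimiter="\t"):
            if not row:
                continue  # skip blank lines
            table[tuple(row[:key_cols])] = row[key_cols:]
    return table


def merge_files(first_path, other_paths, out_path, key_cols=2):
    """Append to each row of the first file the matching values from the other files.

    Assumption: the first file lists every employee that should appear in the output.
    """
    others = [load_table(p, key_cols) for p in other_paths]
    # Number of data columns each other file contributes, used to pad missing entries
    # so that columns stay aligned in the output.
    widths = [max((len(v) for v in t.values()), default=0) for t in others]

    with open(first_path, newline="", encoding="utf-8") as src, \
         open(out_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst, delimiter="\t")
        for row in csv.reader(src, delimiter="\t"):
            if not row:
                continue
            key = tuple(row[:key_cols])
            merged = list(row)
            for table, width in zip(others, widths):
                values = table.get(key, [])
                # Pad with empty cells when the employee is absent from this file.
                merged.extend(values + [""] * (width - len(values)))
            writer.writerow(merged)
```

Because the loop follows the first file, an employee who appears only in the second or third file would be left out of the result. If that can happen in your data, the keys of all files would need to be combined first.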
Example of usage: