I have two CSV files which use @ to divide each column. The first file (file1.csv) has two columns:
cat @ eats fish
spider @ eats insects
The second file (file2.csv) has four columns:
info @ cat @ info @ info
info @ spider @ info @ info
info @ rabbit @ info @ info
I need to add the second column of the first file as a new column in the second file, wherever the first column of the first file matches the second column of the second file. For the files above, the result would be:
info @ cat @ info @ info @ eats fish
info @ spider @ info @ info @ eats insects
info @ rabbit @ info @ info @
Because the first file contains no information about rabbits, an empty column is added to the last row of the second file.
Here is what I know how to do so far:
while read line can be used to cycle through the rows in the second file, e.g.:
while read line
do
(commands)
done < file2.csv
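The data from particular columns can be accessed with awk -F " *@ *" '{print $n}', where n is the column number. The separator pattern " *@ *" also strips the spaces around @, so the extracted values can be compared directly.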
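As a small illustration of the column extraction just described, here is a sketch run on one sample row from file2.csv. The separator pattern ' *@ *' is an assumption chosen so that the blanks around @ do not end up in the extracted value; the brackets in the output are only there to make stray whitespace visible.

```shell
# One sample row from file2.csv
line='info @ cat @ info @ info'

# Split on "@" together with any surrounding spaces and keep column 2
columntwo=$(printf '%s\n' "$line" | awk -F ' *@ *' '{print $2}')

# Brackets show that no leading or trailing blanks remain
printf '[%s]\n' "$columntwo"   # prints [cat]
```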
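while read -r line
do
  # Key of the current file2.csv row (second column)
  columntwo=$(echo "$line" | awk -F ' *@ *' '{print $2}')
  while read -r entry
  do
    # Key of the current file1.csv row (first column)
    columnone=$(echo "$entry" | awk -F ' *@ *' '{print $1}')
    if [ "$columnone" = "$columntwo" ]
    then
      (commands)
    fi
  done < file1.csv
done < file2.csv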
My approach seems inefficient, and I am not sure how to add the data from the second column of file1.csv to a new column in file2.csv.
- Items in column 1 of file1.csv and column 2 of file2.csv are unique to those files. There are no duplicate entries within those files.
- The resulting file should have exactly 5 columns in every line, even if some columns are empty.
- The file contains a lot of characters from various languages in UTF-8.
- There is white space around @, but if this causes problems with the script, I can delete this.
How can the data from the first file be added to the data in the second file?
And a nice, clean awk solution:
A nice one-liner. Not a short one, but not the longest I’ve seen. Note that file2 and file1 are switched. Again, as a script with explanation:
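A minimal sketch of what such a script could look like, written to match the argument order used in the call (file2.csv first, then file1.csv). It is not necessarily the exact code of this answer: the array names row, key and extra, the separator pattern " *@ *" and the shebang path /usr/bin/awk are all assumptions made for illustration.

```shell
# Write the sketch to join.awk (quoted heredoc, so the shell expands nothing).
cat > join.awk <<'EOF'
#!/usr/bin/awk -f
# The shebang path is an assumption; adjust it to where awk lives on your system.
BEGIN {
    FS  = " *@ *"    # split on @ and drop the blanks around it
    OFS = " @ "      # put " @ " back between output columns
}
# First file on the command line (file2.csv): remember each row and its key.
NR == FNR {
    row[FNR] = $0
    key[FNR] = $2
    n = FNR
    next
}
# Second file (file1.csv): map column 1 to column 2.
{
    extra[$1] = $2
}
# Print the rows of file2.csv in their original order with the new column
# appended; a key that is missing from file1.csv gives an empty column.
END {
    for (i = 1; i <= n; i++)
        print row[i], extra[key[i]]
}
EOF
chmod +x join.awk
```

Because file2.csv is read first, all of its rows are held in memory until file1.csv has been read. The sketch also assumes file2.csv is not empty, since the NR == FNR test is what tells the two files apart.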
Call as awk -f join.awk file2.csv file1.csv, or make it executable and run ./join.awk file2.csv file1.csv.
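For a quick check against the sample data from the question, the same lookup can also be written inline. The sketch below is a variant, not the code above: it takes the files in the opposite order (file1.csv first), so only the small lookup table is kept in memory and file2.csv is streamed. The array name extra and the output file name merged.csv are made up for the example.

```shell
# Recreate the two sample files from the question
printf '%s\n' 'cat @ eats fish' 'spider @ eats insects' > file1.csv
printf '%s\n' 'info @ cat @ info @ info' \
              'info @ spider @ info @ info' \
              'info @ rabbit @ info @ info' > file2.csv

# While reading file1.csv (NR == FNR), store column 2 under column 1;
# for every row of file2.csv, append the stored value (empty if none).
awk -F ' *@ *' -v OFS=' @ ' '
    NR == FNR { extra[$1] = $2; next }
    { print $0, extra[$2] }
' file1.csv file2.csv > merged.csv

cat merged.csv
```

With the sample files this gives the three rows shown in the question. The rabbit row ends in " @ " followed by nothing, so every line has exactly 5 columns.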