I receive a zip file that, when unzipped, has several subfolders, each containing a CSV called report.csv:
the_folder/
1234/report.csv
abcd/report.csv
jklm/report.csv
5678/report.csv
Each CSV has columns and content like:
almonds, biscuits, cookies, dog_biscuits
123, 321, 333, 444
555, 666, 777, 888
444, 551, 555, 999 (and so on for 75 lines or so)
I want to combine them into a single CSV file. I’d been using exec in a PHP file to do this:
exec("cat /path/the_folder/*/report.csv > /path/combined.csv");
And then using sed to remove the duplicate “almonds, biscuits, cookies, dog_biscuits” header rows.
Now I need to take the names of the subfolders and put them into the lines of combined.csv.
So a column (“subfolder_name”) will be added to the CSV, holding the name of the folder each line came from. Something like:
almonds, biscuits, cookies, dog_biscuits, subfolder_name
123, 321, 333, 444, 1234
555, 666, 777, 888, 1234
444, 551, 555, 999, abcd
333, 333, 111, 222, abcd
111, 222, 444, 333, abcd (and so on for 300 lines or so)
What is the smartest, simplest, most efficient way to go about doing this?
You can try
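One possible approach is a single awk pass, which can replace both the cat and the sed step. This is only a minimal sketch under some assumptions: the function name combine_reports is made up for illustration, every report.csv starts with the same header row, and the fields contain no embedded newlines. awk's built-in FILENAME variable holds the path of the file currently being read, so the subfolder name is its second-to-last path component.

```shell
# combine_reports SRC_DIR OUT_FILE  (hypothetical helper name)
# Merges SRC_DIR/*/report.csv into OUT_FILE, keeping a single header row
# and appending the name of the subfolder each data row came from.
combine_reports() {
    src_dir=$1
    out_file=$2
    awk '
        # Strip a trailing carriage return, in case a file has Windows line endings.
        { sub(/\r$/, "") }

        # The first line of every file is its header row.
        # Print it once, with the new column name, and skip the repeats.
        FNR == 1 {
            if (!header_done) {
                print $0 ", subfolder_name"
                header_done = 1
            }
            next
        }

        # Non-empty data row: append the parent directory of the current file.
        NF {
            n = split(FILENAME, parts, "/")
            print $0 ", " parts[n - 1]
        }
    ' "$src_dir"/*/report.csv > "$out_file"
}
```

With the paths from the question, the call would be combine_reports /path/the_folder /path/combined.csv. The function could be saved in a script and run through the same exec call you already have. Rows come out in the order the shell expands the glob, so they are grouped by subfolder.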
I hope it helps
Thanks