I have an input file as follows. I need to split it into multiple files based on columns 2, 3 and 5. The file has more columns, but I have used the cut command to keep only the required ones.
12,Accounts,India,free,Internal
13,Finance,China,used,Internal
16,Finance,China,free,Internal
12,HR,India,free,External
19,HR,China,used,Internal
33,Finance,Japan,free,Internal
39,Accounts,US,used,External
14,Accounts,Japan,used,External
11,Finance,India,used,External
11,HR,US,used,External
10,HR,India,used,External
Output files:
Accounts_India_Internal --
12,Accounts,India,free,Internal
Finance_China_Internal --
13,Finance,China,used,Internal
16,Finance,China,free,Internal
HR_India_External --
12,HR,India,free,External
10,HR,India,used,External
HR_China_Internal --
19,HR,China,used,Internal
and so on.
Please let me know how to achieve this.
As of now, I am thinking of sorting the file on these columns (2, 3, 5) and then looping over each record to create the files. If a file does not exist, create it and add the record. Otherwise, open the existing file and append the record.
Is it possible to do this using shell scripting (bash)?
If you simply want to split the file based on fields 2, 3 and 5, you can do that quickly with awk by appending each line to a file whose name is made up of fields 2, 3 and 5.
Example:
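A minimal sketch of such an awk command, run against the sample data from the question. The file name input.csv and the scratch directory are assumptions made for illustration. The parentheses around the file name expression are there because some awk implementations parse an unparenthesized concatenation after >> differently.

```shell
# Work in a scratch directory so the demo starts with no output files (assumption).
cd "$(mktemp -d)"

# Sample data from the question, saved under the hypothetical name input.csv.
cat > input.csv <<'EOF'
12,Accounts,India,free,Internal
13,Finance,China,used,Internal
16,Finance,China,free,Internal
12,HR,India,free,External
19,HR,China,used,Internal
33,Finance,Japan,free,Internal
39,Accounts,US,used,External
14,Accounts,Japan,used,External
11,Finance,India,used,External
11,HR,US,used,External
10,HR,India,used,External
EOF

# Split on commas and append each line to a file named field2_field3_field5.
awk -F, '{ print >> ($2 "_" $3 "_" $5) }' input.csv

# Show one of the resulting files.
cat HR_India_External
```

With this sample data the command creates nine output files. HR_India_External holds the two lines 12,HR,India,free,External and 10,HR,India,used,External, in their original order.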
If you do want the output sorted, you can first run the file through sort, which orders the lines on fields 2, 3 and 5 before passing them on to the awk command.
Do note that we're appending to the files, so if you repeat the command without deleting the output files, you'll end up with duplicate data in them. To address this, as well as to include your additional requirements (using the first line as a header for all new files) as mentioned in the chat, see this solution.
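The sorted variant mentioned above could be sketched roughly as follows. Again, input.csv and the scratch directory are hypothetical, and the sample is a shortened, shuffled subset of the question's data. Starting in an empty directory also sidesteps the duplicate-data issue for this demo.

```shell
# Fresh scratch directory, so appending cannot duplicate earlier output (assumption).
cd "$(mktemp -d)"

# A few shuffled lines from the question's data, under the hypothetical name input.csv.
printf '%s\n' \
  '12,HR,India,free,External' \
  '13,Finance,China,used,Internal' \
  '10,HR,India,used,External' \
  '16,Finance,China,free,Internal' > input.csv

# Sort on fields 2, 3 and 5 (comma-separated), then let awk append each line
# to the file named after those three fields.
sort -t, -k2,2 -k3,3 -k5,5 input.csv |
  awk -F, '{ print >> ($2 "_" $3 "_" $5) }'

# List the files that were created.
ls
```

Each -kN,N key restricts the comparison to a single field. Sorting is not needed for the split itself to be correct, since awk picks the target file line by line.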