I have shortened the problem; my actual data is much longer than this.
I have a file like:
aa, bb, cc, dd, ee, 4
ff, gg, hh, ii, jj, 5
kk, ll, mm, nn, oo, 3
pp, qq, rr, ss, tt, 2
uu, vv, ww, xx, yy, 5
aa, bb, cc, dd, ee, 2
Now I want to use awk to select the lines that share the same number in the last column and redirect them into a new file; which file a line goes to depends on the number in its last column.
E.g. t2.txt, t3.txt, t4.txt and t5.txt will hold the lines whose last number is 2, 3, 4 and 5 respectively.
in t2.txt:
pp, qq, rr, ss, tt, 2
aa, bb, cc, dd, ee, 2
in t3.txt:
kk, ll, mm, nn, oo, 3
in t4.txt:
aa, bb, cc, dd, ee, 4
in t5.txt:
ff, gg, hh, ii, jj, 5
uu, vv, ww, xx, yy, 5
I guess I need something like this:
BEGIN {FS=","}
{
for (n=2; n<=5; n++)
if ($6 ~/\$n/) {print > "t\$n.txt"}
}
But I just don’t know how to make it work.
This bash script does what I want, but the problem is that each time it extracts the lines for a specific number, it has to read in all the lines again. How can I read the file ONLY ONE TIME and still produce the files for all numbers?
#!/bin/bash
for num in {2..5}; do
gawk --assign FS="," "\$6 ~/${num}/" infile >> t${num}.txt
done
I got the answer; with the following it works:
but any further explanation would be welcome.
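For reference, here is a minimal sketch of one single-pass approach. It is not necessarily the solution referred to above, and it assumes the input file is named infile (as in the bash script) and that the last field is always the bare number with no trailing whitespace. The idea is that awk does not expand $n inside "..." or /.../ the way the shell does, so the output file name has to be built by string concatenation. It also splits on a comma followed by optional spaces, because with FS="," alone the last field would be " 4" (with a leading space) and the file name would come out as "t 4.txt".

```shell
#!/bin/bash
# Run this in an empty scratch directory: it creates infile and t2.txt ... t5.txt.
cat > infile <<'EOF'
aa, bb, cc, dd, ee, 4
ff, gg, hh, ii, jj, 5
kk, ll, mm, nn, oo, 3
pp, qq, rr, ss, tt, 2
uu, vv, ww, xx, yy, 5
aa, bb, cc, dd, ee, 2
EOF

# Single pass over the file.
# -F', *' : fields are separated by a comma plus any following spaces,
#           so $NF (the last field) is just the number.
# "t" $NF ".txt" : strings placed side by side are concatenated in awk.
# ">" truncates each file the first time it is opened in this run and
# appends to it for every later line, so no ">>" is needed.
awk -F', *' '{ print > ("t" $NF ".txt") }' infile

# Show what ended up in each output file.
head t?.txt
```

Two side notes. The pattern `$6 ~ /2/` in the loop version is a regular-expression match, so it would also select a line ending in 12 or 20; the sketch above avoids this by using the field value itself as part of the file name. And if the real data has a large number of distinct values, some awk implementations may run out of open files, in which case calling close() on files that are no longer needed is the usual workaround.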