In AWK, is it possible to specify “ranges” of fields?
Example. Given a tab-separated file “foo” with 100 fields per line, I want to print only the fields 32 to 57 for each line, and save the result in a file “bar”. What I do now:
awk 'BEGIN{OFS="\t"}{print $32, $33, $34, $35, $36, $37, $38, $39, $40, $41, $42, $43, $44, $45, $46, $47, $48, $49, $50, $51, $52, $53, $54, $55, $56, $57}' foo > bar
The problem with this is that it is tedious to type and prone to errors.
Is there some syntactic form that lets me say the same thing more concisely and with less room for error (like “$32..$57”)?
You can do it in awk by using RE intervals. For example, to print fields 3-6 of the records in this file:
would be:
I’m creating an RE segment f to represent every field plus its succeeding field separator (for convenience). I then use that in gensub() to delete 2 of those (i.e. the first 2 fields), remember the next 4 for later reference as \3, and delete whatever comes after them. For your tab-separated file, where you want to print fields 32-57 (i.e. the 26 fields after the first 31), you’d use:
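A minimal sketch of both commands, assuming GNU awk is installed as `gawk`. The shell function names and the nine-field sample line in the usage comment are made up for illustration. The grouping is arranged so that the remembered fields land in \3, as described above:

```shell
# Fields 3-6 of space-separated input.
# f = one field plus its trailing separator; group 3 holds the 4 fields we keep.
fields_3_to_6() {
  gawk -v f='([^ ]+ )' '{ print gensub("(" f "{2})(" f "{4}).*", "\\3", 1) }'
}

# Fields 32-57 of tab-separated input: skip 31 fields, keep the next 26.
fields_32_to_57() {
  gawk -v f='([^\t]+\t)' '{ print gensub("(" f "{31})(" f "{26}).*", "\\3", 1) }'
}

# Usage:
#   printf '1 2 3 4 5 6 7 8 9\n' | fields_3_to_6
#   fields_32_to_57 < foo > bar
```

Because f includes the separator after each field, the output keeps a trailing separator after the last printed field. As written, `+` also assumes that no field is empty; a line that does not match the pattern is printed unchanged.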
The above uses GNU awk for its gensub() function. With other awks you’d use sub() or match() and substr().
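If GNU awk is not available, a plain loop over the field numbers is a simpler route. Note that this is a different technique from the regex approach above: it uses no sub(), match() or substr(), only features found in any POSIX awk. The names `field_range`, `first` and `last` are hypothetical, chosen for this sketch:

```shell
# Print fields first..last of every tab-separated record.
field_range() {
  awk -v first="$1" -v last="$2" '
    BEGIN { FS = OFS = "\t" }
    {
      out = ""; sep = ""
      for (i = first; i <= last; i++) {
        out = out sep $i      # join the selected fields with OFS
        sep = OFS             # no separator before the first one
      }
      print out
    }'
}

# Small demonstration with a made-up five-field line: prints b, c and d.
printf 'a\tb\tc\td\te\n' | field_range 2 4
```

For the file in the question this would be called as `field_range 32 57 < foo > bar`. Unlike the regex version, it leaves no trailing separator and copes with empty fields.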
EDIT: Here’s how to write a function to do the job:
Just set FS as appropriate. Note that this needs a tweak for the default FS if your input can start with spaces and/or have multiple spaces between fields, and that it only works if your FS is a single character.
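One way such a function might look, again assuming GNU awk as `gawk`. The names `subflds` and `print_subfields`, and the variables `first` and `last`, are hypothetical. FS is set to a tab here and must stay a single character, as noted above:

```shell
# Wrap the gensub() idea in an awk function that takes start and end field numbers.
print_subfields() {
  gawk -F'\t' -v first="$1" -v last="$2" '
    # Return fields s..e of $0, each followed by FS. f is a local variable.
    function subflds(s, e,    f) {
      f = "([^" FS "]+" FS ")"    # one field plus its trailing separator
      return gensub("(" f "{" (s - 1) "})(" f "{" (e - s + 1) "}).*", "\\3", 1)
    }
    { print subflds(first, last) }'
}

# Usage:
#   print_subfields 32 57 < foo > bar
```

The regex is built from FS at run time, so changing `-F` is enough to handle another single-character separator, provided that character has no special meaning inside a regex.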