So I have about 1000 files that are multiple columns, but I’m only interested

Asked: June 6, 2026 (2026-06-06T04:55:51+00:00)

So I have about 1000 files, each with multiple columns, but I’m only interested in some stats for two of those columns. If $4 were something like a star’s spectral class (i.e. a unique value) and $5 in each of these files were a result, such as seen, unseen, or unknown, is there a recommended way to grep or awk out the stats across the 1000 or so files, so that I get something like:

Type O, #verified, #not-verified, #property-j, ...
Type B, ...
Type A, ...
.
.
.
Type i,

Where, in each file, you’d see something like:

$1, $2, $3, Spectral Type, Result
foo, foo, foo, A, verified
foo, foo, foo, G, verified
foo, foo, foo, A, unknown
foo, foo, foo, F, verified
foo, foo, foo, G, verified
foo, foo, foo, K, verified
foo, foo, foo, K, seen

1 Answer

Editorial Team · Answer 1 · 2026-06-06T04:55:54+00:00

If your question is “How do I generate output of the form ‘Type $4, $5’, where $4 and $5 are the 4th and 5th columns of the input, respectively?”, one solution is:

for i in list of input file; do
  # Default field splitting is on whitespace, so $4 keeps its trailing comma.
  awk '{print "Type "$4, $5}' "$i" > "$i.result"
done

This gives the output that it seems you want, but because it splits fields on whitespace, it relies on no column containing whitespace itself. If there may be whitespace, you can split on commas instead:

 awk '{printf( "Type %s, %s\n", $4, $5 )}' FS=, "$i" > "$i.result"

but you may want to trim the extra whitespace that this will generate. Please note that although in the example I have hardcoded the list of input files to be the 4 files named “list”, “of”, “input”, and “file”, I do not expect you to type the names in. Instead, you should generate them in some fashion; I have merely demonstrated one (of many!) methods of iterating over a set of files. It seems that the heart of this question is the portion dealing with awk, not the iteration.
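As a rough sketch of the trimming and the file iteration together, something like the following might work. The function names type_result and summarize_dir are made up for illustration, and the *.txt pattern is only an assumption about how the input files are named.

```shell
# Hypothetical helper: print "Type <col4>, <col5>" for one comma-separated
# file, trimming the whitespace that surrounds each field.
type_result() {
  awk -F',' '{
    type = $4; result = $5
    gsub(/^[ \t]+|[ \t]+$/, "", type)     # strip leading/trailing blanks
    gsub(/^[ \t]+|[ \t]+$/, "", result)
    printf "Type %s, %s\n", type, result
  }' "$1"
}

# Hypothetical driver: write a .result file next to every input in a directory.
# The *.txt pattern is an assumption; adjust it to match the real file names.
summarize_dir() {
  for f in "$1"/*.txt; do
    [ -e "$f" ] || continue               # skip if the glob matched nothing
    type_result "$f" > "$f.result"
  done
}
```

Running summarize_dir on the directory that holds the inputs would then replace the hardcoded file list in the loop above.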

A second reading of the question indicates that you have exactly one row per input file and you want to summarize the results in a single file. In that case, just do:

cat list of all files | awk '{print "Type "$4, $5}'
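If the goal is instead a count of each result per spectral type across all files, as the sample output in the question suggests, the commands above do not compute that. A minimal sketch of one way to tally it with awk follows. It assumes the files are comma-separated and that each one starts with a header row like the one shown in the question; tally_types is a hypothetical name, and the output is one line per (type, result) pair rather than one line per type.

```shell
# Hypothetical helper: count (Spectral Type, Result) pairs across all files
# given as arguments. Assumes comma-separated input with a one-line header
# in every file; drop the FNR == 1 rule if the files have no header.
tally_types() {
  awk -F',' '
    FNR == 1 { next }                      # skip the header row of each file
    NF < 5   { next }                      # ignore blank or short lines
    {
      type = $4; result = $5
      gsub(/^[ \t]+|[ \t]+$/, "", type)    # trim surrounding whitespace
      gsub(/^[ \t]+|[ \t]+$/, "", result)
      count[type, result]++
    }
    END {
      for (key in count) {
        split(key, part, SUBSEP)           # recover type and result from key
        printf "Type %s, %s, %d\n", part[1], part[2], count[key]
      }
    }
  ' "$@" | LC_ALL=C sort                   # awk array order is unspecified
}
```

Called with all the input files as arguments, this prints lines such as "Type G, verified, 2". Collapsing those into one row per type would take a second pass over the sorted output.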
