I have a fairly simple task I need to automate for an analysis. I have found similar questions on this forum, but none applied to a plain text file, and as I am a Python newbie I am not sure how to adapt those solutions to my needs. So I’d appreciate any help.
I have a series of files in this format:
11 5012 1000 10036040.000000 1.089555 4.529811 0.150000
11 5013 1000 10038040.000000 1.089783 4.340549 0.150000
11 5014 1000 10039040.000000 1.090000 4.733367 0.150000
11 5015 1000 10044040.000000 1.090217 4.601943 0.150000
11 5016 1000 10044040.000000 1.090435 5.048237 0.150000
11 5017 1000 10046040.000000 1.090652 1.280908 0.050000
Each file is named “data1-1”, “data1-2”, “data1-3”, etc.
The data is separated by single spaces and there is no header.
I would like a script that goes through each file, finds the row with the maximum value in column 5 (counting from zero, i.e. the sixth field, e.g. the value 5.048237 above), and writes that row to a new output file.
In the end I need one output file that contains the row with the maximum value in that column from each of the input files. So if there were 5 input files, the output file would have 5 rows.
I hope this is clear. Any help is really appreciated!
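To make the request concrete, here is a minimal sketch of the task described above, not a definitive solution. It assumes the input files sit in the current directory and match the pattern `data1-*`, that the column of interest is index 5 counting from zero (the one holding 5.048237 in the sample), and that ties are resolved by keeping the first row found. The function names and the output file name `max_rows.txt` are made up for illustration.

```python
import glob


def max_row(path, col=5):
    """Return the line in `path` with the largest value in field `col` (0-based).

    Returns None if the file has no line with enough fields.
    """
    best_line, best_val = None, float("-inf")
    with open(path) as fh:
        for line in fh:
            fields = line.split()
            if len(fields) <= col:
                continue  # skip blank or short lines
            val = float(fields[col])
            if val > best_val:  # strict '>' keeps the first row on ties
                best_line, best_val = line.rstrip("\n"), val
    return best_line


def collect_max_rows(pattern="data1-*", out_path="max_rows.txt", col=5):
    """Write one max-value row per matching input file to `out_path`."""
    # sorted() is lexicographic, so "data1-10" comes before "data1-2"
    paths = sorted(glob.glob(pattern))
    with open(out_path, "w") as out:
        for path in paths:
            row = max_row(path, col)
            if row is not None:
                out.write(row + "\n")
```

Calling `collect_max_rows()` from the directory that holds the data files would then produce a single output file with one row per input file.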