I asked previously how to correct errors in count data using awk, where the

Asked: June 9, 2026

I asked previously how to correct errors in count data using awk, where the first column of my data is a number used to identify the sub-arena that’s being measured, and the second column is the count data from that sub-arena. The counting is automated and the program makes errors (indicated below with #), where it will occasionally ‘miscount’ because the animals that are being counted have moved outside the range of the specific sub-arena.

1       0
1       2
1       6
1       7
1       7
1       8
1       7 #
1       7 #
1       9
2       0
2       0
2       1
2       4
2       3 #
2       3 #
2       4
2       4
2       6

I’d like to correct the above like so:

The code that was kindly suggested didn’t include a for loop for correcting the data within each arena (there are 20 arenas per file). I’ve been trying to figure this out but am having an incredibly hard time, getting syntax errors sometimes and illegal statement errors other times. I’d appreciate any hints as to why the following won’t work (sorry I’m such a newbie; this is one of the many iterations that I’ve tried, and none of them are pretty):

awk 'i=1; i<=20; i++; $1=i {NR > 1 && $2 < p {$2 = p} {p = $2} 1}' infile > outfile

1 Answer

Editorial Team · Answer 1 · 2026-06-09T02:00:42+00:00

Rather than counting the lines, why not have another variable that tracks the arena number (the first column) and reset p whenever the arena number increases:

awk '$1 > l { l = $1; p = 0 } $2 < p { $2 = p } { p = $2 } 1' input-file

First, the first field ($1) is compared to the value of the variable l (which is unset at the start and so compares as 0). If $1 is greater, l is set to $1 and p is reset to 0. Then the second field ($2) is compared to p and, if it is smaller, set to p. Finally, p is set to the value of the (possibly changed) $2. The final 1 is an always-true pattern with no action, so awk applies its default action and prints the line; without it, the command would do all the processing but print nothing.
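
As a minimal sketch of the same idea, the variant below resets p whenever the arena number changes at all, rather than only when it increases, so it should also cope with arena numbers that are not in ascending order. The file names sample.txt and corrected.txt and the variable name arena are made up for illustration, and the sample rows are a shortened excerpt of the data in the question with the # markers left out.

```shell
# Sample input: arena number in column 1, count in column 2
# (a shortened excerpt of the question's data, without the # markers).
printf '%s\n' '1 7' '1 8' '1 7' '1 9' '2 4' '2 3' '2 4' > sample.txt

# Reset the running maximum p whenever the arena number changes,
# then raise any count that dips below that maximum.
awk '$1 != arena { arena = $1; p = 0 }   # new arena: start over
     $2 < p      { $2 = p }              # dip below the running maximum: correct it
                 { p = $2 }              # remember the (possibly corrected) count
     1' sample.txt > corrected.txt       # 1 = print every line

cat corrected.txt
```

Note that assigning to $2 makes awk rebuild the line using the output field separator, which is a single space by default. If the real files are tab-separated, adding -v OFS='\t' to the awk command should keep the corrected lines tab-separated as well.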
