I have a large dataset that looks like this: 5 6 5 6 3

Asked: June 14, 2026

I have a large dataset that looks like this:

5 6 5 6 3 5
2 5 3 7 1 6
4 8 1 8 6 9
1 5 2 9 4 5

For every line, I want to subtract the first field from the second, the third from the fourth, and so on, depending on the number of fields (always even). Then I want to report the lines for which the difference for every pair exceeds a certain limit (say 2). I should also be able to report the next best lines, i.e., lines in which exactly one pairwise comparison fails to meet the limit but all other pairs meet it.

From the above example, if I set the limit to 2, my output should contain the following.
Best lines:

2 5 3 7 1 6    # because (5-2), (7-3), (6-1) are all > 2
4 8 1 8 6 9    # because (8-4), (8-1), (9-6) are all > 2

Next best line(s):

1 5 2 9 4 5    # because except (5-4), both (5-1) and (9-2) are > 2

My current approach is to read every line, save each field as a variable, and do the subtraction.
But I don’t know how to proceed from there.

Thanks,

1 Answer

Editorial Team · Answer 1 · 2026-06-14T05:09:59+00:00

This prints the “best” lines to a file named “best” and the “next best” lines to a file named “nextbest”:

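awk '
{
        # Count the pairs whose difference does not exceed the threshold.
        fail_count=0
        for (i=1; i<NF; i+=2){
                if ( ($(i+1) - $i) <= threshold )
                        fail_count++
        }
        # No failing pair: best line. Exactly one failing pair: next best line.
        if (fail_count == 0)
                print $0 > "best"
        else if (fail_count == 1)
                print $0 > "nextbest"
}
' threshold=2 inputfile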

Pretty straightforward stuff.

Loop through the fields two at a time.
If (next field - current field) does not exceed the threshold, increment fail_count.
If that line’s fail_count is zero, it belongs to the “best” lines.
Else, if that line’s fail_count is one, it belongs to the “next best” lines.
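
As a quick check, here is a minimal sketch that runs the same logic on the four sample lines from the question. The file names inputfile, best and nextbest come from the answer above. The temporary working directory and the use of -v to set threshold (instead of the trailing threshold=2 operand) are my own choices for the demo, not part of the original answer. Either form works here; -v additionally makes the variable visible in a BEGIN block, which this script does not need.

```shell
# Work in a throwaway directory so the demo does not overwrite existing files
# (assumption: mktemp is available, as on most Linux and BSD systems).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data copied from the question.
cat > inputfile <<'EOF'
5 6 5 6 3 5
2 5 3 7 1 6
4 8 1 8 6 9
1 5 2 9 4 5
EOF

awk -v threshold=2 '
{
    # Count the pairs (field i, field i+1) whose difference is <= threshold.
    fail_count = 0
    for (i = 1; i < NF; i += 2)
        if (($(i+1) - $i) <= threshold)
            fail_count++

    if (fail_count == 0)
        print > "best"        # every pair exceeds the threshold
    else if (fail_count == 1)
        print > "nextbest"    # exactly one pair fails
}' inputfile

# Show the two result files.
echo "best:";     cat best
echo "nextbest:"; cat nextbest
```

With the sample data, best should contain the lines 2 5 3 7 1 6 and 4 8 1 8 6 9, and nextbest should contain 1 5 2 9 4 5. The first line (5 6 5 6 3 5) fails all three comparisons, so it is written to neither file. Note that a difference exactly equal to the threshold counts as a failure, because the question asks for differences that exceed the limit.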
