I have a text file with words and positive numbers, separated by some whitespace, e.g.
A dog has a ball number 49 number 34 number A
Cats number 58
...
I want to sum up all the numbers that occur directly after the string “number”. If whatever follows “number” is not a number, it should simply be ignored.
For example, in this case the answer would be 49+34+58, which is 141.
Awk reads the file line by line. For every line, the blocks marked by {} are executed. Blocks can be guarded by a condition: a regular expression, …, and BEGIN and END, which run once before the first line is read and once after the last line has been processed, respectively. This means that awk executes the first block for every line (because it is unguarded).
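To make the block structure concrete, here is a small sketch with one BEGIN block, one block guarded by a regular expression, one unguarded block and one END block. The function name count_lines and the counters lines and dogs are made up for this illustration.

```shell
# count_lines is a hypothetical helper name used only for this illustration.
count_lines() {
  awk '
    BEGIN { print "start" }      # runs once, before any input is read
    /dog/ { dogs++ }             # guarded: runs only for lines matching the regex
          { lines++ }            # unguarded: runs for every input line
    END   { printf "%d lines, %d matching dog\n", lines, dogs }  # runs once, after all input
  ' "$@"
}

printf 'A dog has a ball\nCats number 58\n' | count_lines
# start
# 2 lines, 1 matching dog
```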
Furthermore, awk does not really have a type system: values are essentially strings. But you can do arithmetic on them, in which case they are converted to numbers automatically. A string that does not start with a number evaluates to 0 in arithmetic.
This means: "asdf" + 1 = 1; 2 + 4 = 6; "asdf" + 0 = 0.
Variables don’t have to be declared – and default to the empty string, which has the numerical value of ‘0’.
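These conversion rules can be checked directly in a BEGIN block, which needs no input file. The following is only a small demonstration; coercion_demo and unset_var are names invented for it.

```shell
# coercion_demo is a hypothetical helper name used only for this illustration.
coercion_demo() {
  awk 'BEGIN {
    print "asdf" + 1          # a non-numeric string counts as 0
    print "2" + 4             # a numeric string is converted to a number
    print "12abc" + 0         # only the leading numeric part is used
    print unset_var + 0       # an undeclared variable is 0 in arithmetic ...
    print "[" unset_var "]"   # ... and the empty string when used as text
  }'
}

coercion_demo
# 1
# 6
# 12
# 0
# []
```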
The next awesomeness of awk is that it automatically splits the current input line into fields. The field separator can be specified, but defaults to whitespace. The individual fields can be accessed by $1, $2, …, $NF, where NF is the number of fields. $0 is the contents of the full input line.
And there you have it: you loop over all ‘fields’ of the current line. The numerical values of all fields (which are 0 for strings) are accumulated in the variable s. After reading everything (END), the sum is printed.
EDIT: this might conveniently work, but does not really answer the question, because it does not consider ‘number’. Sorry.
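The first attempt described above, which sums every field on every line, can be sketched roughly as follows. This is a reconstruction from the description rather than a verbatim snippet, and sum_all_fields is a made-up name. The second call shows why it does not answer the question: numbers that do not follow “number” are counted as well.

```shell
# sum_all_fields is a hypothetical helper name used only for this illustration.
sum_all_fields() {
  awk '
    { for (i = 1; i <= NF; i++) s += $i }   # words add 0, numbers add their value
    END { print s + 0 }                     # printed once, after all input
  ' "$@"
}

printf 'A dog has a ball number 49 number 34 number A\nCats number 58\n' | sum_all_fields
# 141
printf '10 A dog has a ball number 49 number 34 number A\nCats 1000 number 58\n' | sum_all_fields
# 1151
```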
A fix:
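One possible version, as a minimal sketch: a field is only added if the field before it is exactly number and the field itself consists of digits. Treating "a number" as a run of digits is an assumption; decimals or signs would need a different pattern. The name sum_after_number is made up for this example.

```shell
# sum_after_number is a hypothetical helper name used only for this sketch.
sum_after_number() {
  awk '
    {
      # Stop at NF-1 so that $(i + 1) always exists.
      for (i = 1; i < NF; i++) {
        # Assumption: a number is a run of digits (a non-negative integer).
        if ($i == "number" && $(i + 1) ~ /^[0-9]+$/) {
          s += $(i + 1)
        }
      }
    }
    END { print s + 0 }   # "+ 0" prints 0 instead of an empty line if nothing matched
  ' "$@"
}

printf '10 A dog has a ball number 49 number 34 number A\nCats 1000 number 58\n' | sum_after_number
# 141
```

The function also accepts file names as arguments, since "$@" is passed straight through to awk.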
That way, the result is also 141 for input like:
10 A dog has a ball number 49 number 34 number A
Cats 1000 number 58