Is there such a thing as parallel AWK? Can anyone kindly explain? I have been using AWK for simple tasks such as printing columns and merging large data files, but not for calculations. I was wondering whether one can run AWK in parallel using all the nodes and CPUs in my computer or on the network. But how? And what is the primary aim of using parallel AWK?
Thank you for your input.
After posting the question, I found out that Parallel AWK does exist. You can read more about it here: http://www.parallel-awk.org/
The problem with a parallel awk implementation is that awk's semantics explicitly assume that input is processed in order. For example, a program that prints `NR` in front of every line gives you output akin to `cat -n`. The difficulty with processing this in parallel is that `NR` is the total number of lines processed so far across all input, not just the number of lines read from the current file (that is `FNR`), so each worker would need to know how many lines every earlier worker had already consumed.

Also, there are more complicated tricks involving commands like `getline`, which cannot be parallelized (for example, a script can be short-circuited to emulate the GNU `nextfile` extension).
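
To make the `NR` problem concrete, here is a minimal sketch. The file names and sample data are made up for illustration, and the loop only stands in for what a naive parallel split would do (one awk process per chunk); it is not how any particular parallel awk works.

```shell
# Two small sample files (hypothetical names and contents).
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/part1.txt"
printf 'c\nd\n' > "$tmp/part2.txt"

# One awk process over both files:
# NR keeps counting across files, FNR restarts for each file.
sequential=$(awk '{ print NR, FNR, $0 }' "$tmp/part1.txt" "$tmp/part2.txt")

# One awk process per file, outputs concatenated afterwards:
# every process starts with its own NR, so the global numbering is lost.
per_file=$(for f in "$tmp/part1.txt" "$tmp/part2.txt"; do
  awk '{ print NR, FNR, $0 }' "$f"
done)

printf 'single process:\n%s\n\none process per file:\n%s\n' "$sequential" "$per_file"
rm -rf "$tmp"
```

In the single-process run the first column goes 1, 2, 3, 4; in the per-file run it goes 1, 2, 1, 2. Any script that depends on `NR` (or on state carried from one line to the next) would therefore give different results once the input is split.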