I am trying to parse some CSV files using awk. I am new to shell scripting and awk.
The CSV file I am working on looks something like this:
fnName,minAccessTime,maxAccessTime
getInfo,300,600
getStage,600,800
getStage,600,800
getInfo,250,620
getInfo,200,700
getStage,700,1000
getInfo,280,600
I need to find the average AccessTimes of the different functions.
I have been working with awk and have been able to get the average times provided the exact column numbers are specified like $2, $3 etc.
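For context, a hard-coded version of this kind might look like the sketch below. It is only an illustration, not my exact code: the helper name `avg_col2` is made up, and the column is fixed at `$2`.

```shell
# avg_col2 is a hypothetical helper name; it reads CSV data on stdin.
avg_col2() {
    awk -F, '
        # Skip the header line; the column number $2 is hard-coded here.
        NR > 1 { sum[$1] += $2; n[$1]++ }
        # Print one average per function name (iteration order is unspecified).
        END    { for (f in sum) printf "%s %.2f\n", f, sum[f] / n[f] }
    '
}

avg_col2 <<'EOF'
fnName,minAccessTime,maxAccessTime
getInfo,300,600
getStage,600,800
getStage,600,800
getInfo,250,620
getInfo,200,700
getStage,700,1000
getInfo,280,600
EOF
```

Switching to maxAccessTime means editing `$2` to `$3` by hand, which is what I want to avoid.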
However, I need a general script: if I pass “minAccessTime” as a command-line argument, the script should print the average of that column (instead of my explicitly specifying $2 or $3 in the awk code).
I have been googling this and have tried suggestions from various forums, but none of them seem to work.
Can someone tell me how to do this? It would be a great help!
Thanks in advance!
This `awk` script should give you all that you want. It first evaluates which column you’re interested in by using the name passed in as the `COLM` variable and checking against the first line. It converts this into an index (it’s left as the default 0 if it couldn’t find the column).

It then basically runs through all other lines in your input file. On all these other lines (assuming you’ve specified a valid column), it updates the count, sum, minimum and maximum for both the overall data plus each individual function name.

The former is stored in `count`, `sum`, `min` and `max`. The latter are stored in associative arrays with similar names (with `_arr` appended).

Then, once all records are read, the `END` section outputs the information.

Storing that script into `qq.awk` and placing your sample data into `qq.in`, then running:

generates the following output, which I’m relatively certain will give you every possible piece of information you need:
For `maxAccessTime`, you get:

And, for `xyzzy` (a non-existent column), you’ll see:
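
For something concrete to run, here is a minimal sketch of a script along the lines described above. Treat it as a reconstruction under assumptions rather than the exact script: the names `COLM`, `count`, `sum`, `min`, `max`, the `_arr` suffix, `qq.awk` and `qq.in` come from the description, while `cidx`, `v`, `name`, the output wording and the `awk -F, -v COLM=...` invocation are my own choices.

```shell
# Save the sketch as qq.awk (quoted heredoc, so the shell leaves $1 etc. alone).
cat > qq.awk <<'EOF'
# Header line: turn the column name in COLM into an index (stays 0 if absent).
NR == 1 {
    for (i = 1; i <= NF; i++)
        if ($i == COLM)
            cidx = i
    next
}

# Data lines: update overall and per-function count, sum, min and max.
cidx > 0 {
    v = $cidx + 0
    if (count == 0 || v < min) min = v
    if (count == 0 || v > max) max = v
    count++
    sum += v

    if (!($1 in count_arr) || v < min_arr[$1]) min_arr[$1] = v
    if (!($1 in count_arr) || v > max_arr[$1]) max_arr[$1] = v
    count_arr[$1]++
    sum_arr[$1] += v
}

# After all records are read, print the results.
END {
    if (cidx == 0) {
        printf "Column %s does not exist\n", COLM
        exit
    }
    if (count == 0) {
        print "No data rows"
        exit
    }
    printf "Overall: count=%d avg=%.2f min=%g max=%g\n", count, sum / count, min, max
    for (name in count_arr)
        printf "%s: count=%d avg=%.2f min=%g max=%g\n", name, count_arr[name], sum_arr[name] / count_arr[name], min_arr[name], max_arr[name]
}
EOF

# Sample data from the question.
cat > qq.in <<'EOF'
fnName,minAccessTime,maxAccessTime
getInfo,300,600
getStage,600,800
getStage,600,800
getInfo,250,620
getInfo,200,700
getStage,700,1000
getInfo,280,600
EOF

# -F, splits fields on commas; -v passes the column name in as COLM.
awk -F, -v COLM=minAccessTime -f qq.awk qq.in
awk -F, -v COLM=maxAccessTime -f qq.awk qq.in
awk -F, -v COLM=xyzzy -f qq.awk qq.in
```

With the sample data, the `minAccessTime` run of this sketch should report an average of 257.50 for getInfo and 633.33 for getStage, and the `xyzzy` run should print only the "does not exist" message. The order of the per-function lines is not guaranteed, since awk does not define the iteration order of `for (name in array)`.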