I have a log file in Apache log format. Example line:
fj5020.inktomisearch.com - - [01/Oct/2006:06:35:59 -0700] "GET /example/When/200x/2005/04/27/A380 HTTP/1.0" 200 4776 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
where 4776 is the size of the served page in bytes. I’d like to output the top 10 URLs by served traffic. I’m stuck on summing the sizes for each unique page (the size of a given page can also vary between requests). Any ideas how to do it in Bash and/or AWK?
Does this work for you?
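One possible approach, as a minimal sketch rather than a definitive solution: let AWK accumulate the byte count per URL in an associative array, then sort the totals and keep the first ten. This assumes the log lines look like the example above, so that splitting on whitespace puts the request path in field 7 and the byte count in field 10. The function name `top_urls_by_bytes` and the file name `access.log` are placeholders, not anything from the original post.

```shell
#!/bin/sh
# Hypothetical helper: prints "total_bytes url" for the 10 URLs with the most traffic.
# Assumes whitespace-split fields: $7 = request path, $10 = response size in bytes.
top_urls_by_bytes() {
    awk '
        # Add this request size to the running total for its URL.
        # A size logged as "-" (no body sent) counts as 0 in numeric context.
        { bytes[$7] += $10 }
        # After the last line, emit one "total url" pair per unique URL.
        # %.0f keeps large totals from being printed in exponent notation.
        END { for (url in bytes) printf "%.0f %s\n", bytes[url], url }
    ' "$@" |
        sort -rn |   # largest totals first
        head -n 10   # keep the top 10
}
```

Usage would be something like `top_urls_by_bytes access.log`. Note that lines with a malformed request string (for example one containing extra spaces) would shift the fields, so the field numbers may need adjusting for other log layouts.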