I have a Apache access.log file, which is around 35GB in size. Grepping through it is not an option any more, without waiting a great deal.
I wanted to split it in many small files, by using date as splitting criteria.
Date is in format [15/Oct/2011:12:02:02 +0000]. Any idea how could I do it using only bash scripting, standard text manipulation programs (grep, awk, sed, and likes), piping and redirection?
Input file name is access.log. I’d like output files to have format such as access.apache.15_Oct_2011.log (that would do the trick, although not nice when sorting.)
One way using
awk:This will output files like:
Against a 150 MB log file, the answer by chepner took 70 seconds on an 3.4 GHz 8 Core Xeon E31270, while this method took 5 seconds.
Original inspiration: “How to split existing apache logfile by month?“