Our application receives log files via email, so the lines are often broken up by the email client. Once I’ve read in the body of the email, I have a string variable $log in the following format:
Fri Aug 26 11:52:30 2011 OpenVPN 2.1.4 i686-pc-mingw32 [SSL] [LZO2]
PKCS11] built Fri Aug 26 11:52:30 2011 NOTE: OpenVPN 2.1 requires '--script-security 2'
or higher to call user-defined scripts or executables Fri Aug 26 11:52:30 2011
Control Channel Authentication: using 'ta.key' as a OpenVPN static key file
Fri Aug 26 11:52:30 2011 Outgoing Control Channel Authentication: Using 160
bit message hash 'SHA1' for HMAC authentication Fri Aug 26 11:52:30
2011 Incoming Control Channel Authentication: Using 160 bit message hash 'SHA1'
for HMAC authentication Fri Aug 26 11:52:30 2011 LZO compression initialized
Fri Aug 26 11:52:30 2011 Control Channel MTU parms [ L:1558 D:166 EF:66 EB:0
ET:0 EL:0 ] Fri Aug 26 11:52:30 2011 Socket Buffers: R=[8192->8192] S=[8192->8192]
As shown above, the date does not always start on a new line, and a line break can even fall inside the date itself. I’d like to generate an array containing the dates and log messages so that I can output a table with these fields in their own columns. I understand that I need a regex to match the date field, but how do I go about building the array?
I’m going to replace my answer with an entirely new version, since the example log file has changed a lot. Since the log seems to be broken across lines just about anywhere, this approach, which now includes a bit of regexp, works:
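For illustration, here is a minimal sketch of one way to do this in PHP. It is not necessarily the code originally posted here. It assumes that every timestamp looks like `Fri Aug 26 11:52:30 2011`, that the email client only breaks lines where a space used to be, and that no log message contains text shaped like a timestamp. The function name `parse_log` and the array keys `date` and `message` are made up for this example.

```php
<?php
// Hypothetical helper (not from the original post): split a re-wrapped log
// string into rows of ['date' => ..., 'message' => ...].
function parse_log(string $log): array
{
    // Line breaks can fall anywhere, even inside a timestamp, so collapse
    // every run of whitespace (including newlines) into a single space.
    $flat = trim(preg_replace('/\s+/', ' ', $log));

    // Assumed timestamp shape: "Fri Aug 26 11:52:30 2011".
    $date = '[A-Z][a-z]{2} [A-Z][a-z]{2} \d{1,2} \d{2}:\d{2}:\d{2} \d{4}';

    // Capture a timestamp, then lazily capture everything up to the next
    // timestamp (lookahead) or the end of the string.
    $pattern = '/(' . $date . ') (.*?)(?= ' . $date . '|$)/';
    preg_match_all($pattern, $flat, $matches, PREG_SET_ORDER);

    $rows = [];
    foreach ($matches as $m) {
        $rows[] = ['date' => $m[1], 'message' => $m[2]];
    }
    return $rows;
}

// Usage with a few lines taken from the question.
$sample = <<<'LOG'
Fri Aug 26 11:52:30 2011 Outgoing Control Channel Authentication: Using 160
bit message hash 'SHA1' for HMAC authentication Fri Aug 26 11:52:30
2011 Incoming Control Channel Authentication: Using 160 bit message hash 'SHA1'
for HMAC authentication Fri Aug 26 11:52:30 2011 LZO compression initialized
LOG;

$rows = parse_log($sample);

// Print one row per log entry, with the date and message in separate columns.
foreach ($rows as $row) {
    printf("%-24s | %s\n", $row['date'], $row['message']);
}
```

Flattening the whitespace first is what lets a single pattern cope with a timestamp that has been split across two lines. Each element of `$rows` can then be written out as one table row.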
It produces the expected: