I’m trying to extract some information from a web server log, but it’s not very structured, so I’m running into trouble. I’m trying to match:
Example 1 :
2011-11-29 11:30:23,685 DEBUG [my.fully.qualified.package.Service] Added Action Item: M= 2 Success
Example 2 :
2011-11-29 11:30:23,685 DEBUG [my.fully.qualified.package.Service] Added Action Item: M=10 Success
This regex works for example 1 :
(\d\d\d\d-\d\d-\d\d)\s[\d|:]+,\d+\s([A-Z]+)\s\[(.+)\]\s.+:\sM=\s(\d).+
Where the first group is the date, the second is the log level, the third is the class name, and the fourth is the value of M.
You may have noticed that in example 1 there is a space between M= and the digit, while in example 2 there is not; that’s why this regex fails on example 2.
I did try something like M=[\s|d]+, but it matched more characters than I asked for. Does anyone have a suggestion for how to match both of these examples with one regex?
You want
M=\s*(\d+)
That will allow zero or more whitespace characters immediately after the =, but not any whitespace after the digits.
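As a quick check, here is a minimal Python sketch that drops this fragment into the regex from the question and runs it against both example lines. The question does not say which regex engine is in use, so Python's re module is an assumption, and the names LOG_PATTERN and parse_line are made up for illustration.

```python
import re

# The question's regex with its tail "M=\s(\d).+" replaced by "M=\s*(\d+).+".
# Everything before "M=" is copied unchanged from the question.
LOG_PATTERN = re.compile(
    r"(\d\d\d\d-\d\d-\d\d)\s[\d|:]+,\d+\s([A-Z]+)\s\[(.+)\]\s.+:\sM=\s*(\d+).+"
)


def parse_line(line):
    """Return (date, level, class_name, m_value) or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groups() if match else None


example_1 = ("2011-11-29 11:30:23,685 DEBUG "
             "[my.fully.qualified.package.Service] Added Action Item: M= 2 Success")
example_2 = ("2011-11-29 11:30:23,685 DEBUG "
             "[my.fully.qualified.package.Service] Added Action Item: M=10 Success")

for line in (example_1, example_2):
    print(parse_line(line))
```

The \s* absorbs the optional space, and \d+ (rather than a single \d) captures every digit, so the fourth group comes back as 2 for the first line and 10 for the second.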