Hey all, this is a follow-up to this question: Regex syntax highlighting question
I’m not sure what the procedure is when a question is spawned off an answer to a previous question, so if this is the wrong way to go about it let me know.
Basically, I was unclear in my previous question. I have been messing around in http://www.rubular.com/ trying to get my RegEx to work, but to no avail. The problem lies in the messages I’m trying to parse, which are pretty irregular and have lots of messy nested quotes. Here is an example message, which is pretty much the worst-case scenario:
2011/03/04 10:27:17 [STUFF] subject=STUFF, message={ANNOYINGFIELD1="STUFF HEADER=(STUFF,STUFF,STUFF) FIELD=STUFF FIELD=0 FIELD= FIELD=84HDH.1 FIELD=9.6 FIELD="more stuff here" FIELD=- FIELD=NO FIELD="-" ANNOYINGFIELD2="A WHOLE BUNCH OF STUFF""}
As you can see, the confusing parts (for me, at least) are ANNOYINGFIELD1, whose quote encompasses the whole rest of the message (and I don’t want to color it, because the things inside need to be colored); HEADER, which throws an awesome parenthesis curveball; and ANNOYINGFIELD2, which is similar to the first, but which I actually do want to be colored (that is, fields with quoted strings INSIDE ANNOYINGFIELD1 should be colored). To further clarify, I want the end result to be something like this (I don’t have to stick to this, because I don’t know what RegEx is capable of, but something close).
(Bold will take the place of color 1, and Italics color 2)
2011/03/04 10:27:17 [STUFF] subject=STUFF, message={ANNOYINGFIELD1="STUFF HEADER=(STUFF,STUFF,STUFF) FIELD=STUFF FIELD=0 FIELD= FIELD=84HDH.1 FIELD=9.6 FIELD="inside needs to be italics, editor giving me problems" FIELD=- FIELD=NO FIELD="-" ANNOYINGFIELD2="more italics""}
Since this was confusing just to write, please let me know if I need to clarify anything.
EDIT
I’ve been modifying some of the suggestions from my first attempt at asking the question, and this is REALLY close: ((\S+)=((?:\x22[^\x22]+\x22|[^>\s]+)))\s The only things it’s messing up on are fields with no value (i.e., FIELD1= FIELD2= are not receiving color) and an occasional edge case where the last field has quotes, so it looks like this: (FIELD1="stuff stuff stuff""}). Any thoughts?
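For anyone who wants to reproduce the no-value problem outside of Rubular, here is a minimal sketch in Python (my choice for illustration; the editor's regex engine may behave slightly differently). The sample string is a made-up, shortened line, not a real log message.

```python
import re

# Interim pattern from the post; \x22 is the double-quote character.
INTERIM = re.compile(r'((\S+)=((?:\x22[^\x22]+\x22|[^>\s]+)))\s')

# Hypothetical shortened sample with two empty-valued fields.
sample = 'FIELD1= FIELD2= FIELD3=9.6 FIELD4="more stuff here" '

# Group 1 is the whole name=value text that would be colored.
matched = [m.group(1) for m in INTERIM.finditer(sample)]
print(matched)
```

FIELD1= and FIELD2= never show up in the list, because both value alternatives require at least one character after the equals sign.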
Figured it out! I just needed a | at the end of the value group (an empty alternative) to grab the no-value cases… It still doesn’t handle that one edge case, but this is a good enough compromise for me!
Final RegEx:
((\S+)=((?:\x22[^\x22]+\x22|[^>\s]+)|))\s
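As a rough check of what the final pattern actually grabs, here is a small Python sketch that runs it over the example message from above. Python is only an assumption for illustration (the real target is the editor's highlighter), and the variable names are mine.

```python
import re

# Final pattern from the post; the trailing "|" inside group 3 allows an empty value.
FINAL = re.compile(r'((\S+)=((?:\x22[^\x22]+\x22|[^>\s]+)|))\s')

# The worst-case example message from the question, split only for readability.
message = (
    '2011/03/04 10:27:17 [STUFF] subject=STUFF, '
    'message={ANNOYINGFIELD1="STUFF HEADER=(STUFF,STUFF,STUFF) '
    'FIELD=STUFF FIELD=0 FIELD= FIELD=84HDH.1 FIELD=9.6 '
    'FIELD="more stuff here" FIELD=- FIELD=NO FIELD="-" '
    'ANNOYINGFIELD2="A WHOLE BUNCH OF STUFF""}'
)

# Group 2 is the field name, group 3 is the value (possibly empty).
pairs = [(m.group(2), m.group(3)) for m in FINAL.finditer(message)]
for name, value in pairs:
    print(f'{name!r:30} -> {value!r}')
```

The empty field now comes back as ('FIELD', ''), and quoted values such as "more stuff here" are captured with their quotes. The last pair is ('ANNOYINGFIELD2', '"A'), which is the unhandled edge case: the doubled quote before } means the closing quote is not followed by whitespace, so the quoted alternative fails and the match falls back to the unquoted one. The pattern also picks up message={ANNOYINGFIELD1 as a single field name, since \S+ runs across the brace.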