I’m using a long regular expression which is pretty hard to grok if you didn’t write it in the previous 5 minutes –
/([^\s]+)\s*[^\[]+\[([^\]]+)\]\s*"([^\s]+)\s*([^\s]+)\s*([^"]+)"\s*([^\s]+)\s*([^\s]+)\s*"([^"]+)"\s*"([^"]+)"/
Is there a commonly adopted way of formatting long regular expressions in code that makes for better readability?
I thought of putting each capture group on its own line, e.g.
/([^\s]+)
\s*[^\[]+\[([^\]]+)
\]\s*"([^\s]+)
\s*([^\s]+)
\s*([^"]+)
"\s*([^\s]+)
\s*([^\s]+)
\s*"([^"]+)
"\s*"([^"]+)"/
This would be excellent if I could put comments line by line on each section of the regex, but Ruby won’t let me.
I’m more interested in the general question of what to do with big regex than in better ways to parse text… this particular case was just part of an exercise I set myself while learning a bit of Ruby.
Just use the x flag (extended mode, which makes the regex ignore whitespace in the pattern). You can then also put a comment on each line.
See: http://codepad.org/PDSxQUQf
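As a rough sketch of how that might look for the regex in the question: the pattern below is the same one, split one capture group per line. The variable names (`log_line`, `sample`), the sample input, and the field labels in the comments are assumptions; the labels guess that the input is an Apache-style access log line, which the question doesn't actually say.

```ruby
# Same pattern as in the question, written with the x (extended) flag so that
# whitespace and "# ..." comments inside the literal are ignored by the engine.
# Field labels are guesses, assuming an Apache-style access log line.
log_line = /
  ([^\s]+)              # 1: remote host
  \s*[^\[]+\[([^\]]+)   # 2: timestamp between the square brackets
  \]\s*"([^\s]+)        # 3: request method
  \s*([^\s]+)           # 4: request path
  \s*([^"]+)            # 5: protocol, up to the closing quote
  "\s*([^\s]+)          # 6: status code
  \s*([^\s]+)           # 7: response size
  \s*"([^"]+)           # 8: referrer
  "\s*"([^"]+)"         # 9: user agent
/x

# Hypothetical sample line, only for illustration.
sample = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] ' \
         '"GET /apache_pb.gif HTTP/1.0" 200 2326 ' \
         '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'

match = log_line.match(sample)
puts match.captures.inspect if match
```

One thing to watch for: with the x flag a literal space in the pattern no longer matches anything, so it has to be written as `\s` or as a backslash-escaped space, and a literal `#` has to be escaped as well.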