Clearly, many problems look as though a simple regular expression will solve them, yet prove very hard to solve with regex.
So how does someone who is not a regex expert know whether learning regex is the right way to solve a given problem?
(See “Regex to parse C# source code to find all strings” for why I am asking this question.)
This seems to sum it up well:
Some people, when confronted with a problem, think “I know, I’ll use
regular expressions.”
Now they have two problems…
(I have just changed the title of the question to make it more specific, since some of the problems with regex in C# are solved in Perl and JScript, for example the way the two levels of quoting make a regex so unreadable.)
Don’t try to use regex to parse hierarchical text such as program source (or nested XML). Regular expressions are provably not powerful enough for that: for example, given a string of parentheses, they can’t determine whether the parentheses are balanced.
Use parser generators (or similar technologies) for that.
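To make the balanced-parentheses point concrete, here is a minimal sketch in Python (chosen only for illustration, the same idea applies in C#). The names `is_balanced`, `TWO_LEVELS` and `regex_says_balanced` are made up for this example. A pattern written for a fixed nesting depth breaks as soon as the input nests one level deeper, while a plain counter handles any depth:

```python
import re


def is_balanced(text: str) -> bool:
    """Return True if every '(' in text is closed by a later ')'."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' appeared before its '('
                return False
    return depth == 0


# Hypothetical pattern that hard-codes at most two levels of nesting.
TWO_LEVELS = re.compile(r"(?:\((?:\(\))*\))*")


def regex_says_balanced(text: str) -> bool:
    """Check balance with the fixed-depth pattern above."""
    return TWO_LEVELS.fullmatch(text) is not None


for sample in ["()()", "(())", "((()))", "(()"]:
    print(sample, is_balanced(sample), regex_says_balanced(sample))
```

The two functions agree until the third level of nesting: `((()))` is balanced, but the fixed-depth pattern rejects it. You can always add another level to the pattern, but some input will always nest one level deeper than whatever you wrote.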
Also, I wouldn’t recommend using regex to validate data that follows a strict formal standard, such as e-mail addresses.
Such formats are harder than you’d expect, and you’ll end up with either an inaccurate regex or a very long one.
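As a rough illustration of the "inaccurate" case, here is the kind of short pattern people typically write for e-mail addresses. The pattern and the name `naive_is_email` are made up for this sketch (Python again, only for illustration) and are not taken from any library:

```python
import re

# Hypothetical "good enough" pattern: letters, digits, dot, underscore
# and hyphen, then '@', then a domain ending in a dot and 2+ letters.
NAIVE_EMAIL = re.compile(r"^[A-Za-z0-9._-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")


def naive_is_email(address: str) -> bool:
    """Return True if the address matches the naive pattern."""
    return NAIVE_EMAIL.match(address) is not None


for address in ["alice@example.com", "user+tag@example.com", "a..b@example.com"]:
    print(address, naive_is_email(address))
```

It accepts the ordinary address, but rejects `user+tag@example.com`, because `+` is a legal character in the local part and the pattern does not allow it. It also accepts `a..b@example.com`, although the address syntax in RFC 5322 does not allow two consecutive dots in an unquoted local part. Each fix for a case like this makes the pattern longer, which is the trade-off described above.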