As stated in the title, is there a way, using regular expressions, to match a text pattern only where it appears outside of quotes? Ideally, given the following examples, I would want to match the comma that is outside the quotes, but not the one inside them.
This is some text, followed by ‘text, in quotes!’
or
This is some text, followed by ‘text, in quotes’ with more ‘text, in quotes!’
Additionally, it would be nice if the expression respected nested quotes, as in the following example. However, if this is not technically feasible with regular expressions, then it would simply be nice to know that.
The programmer looked up from his desk, ‘This can’t be good,’ he exclaimed, ‘the system is saying ‘File not found!’’
I have found some expressions for matching something that would be in the quotes, but nothing quite for something outside of the quotes.
This can be done with modern regexes due to the massive number of hacks to regex engines that exist, but let me be the one to post the ‘Don’t Do This With Regular Expressions’ answer.
This is not a job for regular expressions. This is a job for a full-blown parser. As an example of something you can’t do with (classical) regular expressions, consider this:
No (classical) regex can determine whether those parentheses are matched properly, but doing so without a regex is trivial:
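A minimal sketch of that kind of check, written in Python purely as an illustration. The function name `parens_balanced` and the sample strings are made up for this example; the idea is just to keep a depth counter while walking the string.

```python
def parens_balanced(text):
    """Return True if every '(' in text has a matching ')' after it."""
    depth = 0
    for ch in text:
        if ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
            # A ')' with no '(' left to close means the string is unbalanced
            if depth < 0:
                return False
    # Anything still open at the end is unbalanced as well
    return depth == 0


# Hypothetical sample inputs, for illustration only
for sample in ["((()))", "(()", ")("]:
    print(repr(sample), parens_balanced(sample))
```

A single counter is enough here because there is only one kind of bracket; several bracket types would need a stack instead.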
See how simple it was to write some non-regex code to do the job for you?
EDIT: Okay, back from seeing Adventureland. 🙂 Try this (written in Perl, commented to help you understand what I’m doing if you don’t know Perl):
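For readers who don't use Perl, one way to approach this can be sketched in Python. This is an illustration under assumptions, not the Perl from this answer: it assumes straight double quotes (`"`), no nesting, and no escaped quotes, and `commas_outside_quotes` is a name invented for the sketch. The idea is to split the string on quoted segments while keeping them, then look for commas only in the pieces that were not quoted.

```python
import re

# One capturing group, so re.split keeps the quoted segments in its result
QUOTED = re.compile(r'("[^"]*")')


def commas_outside_quotes(text):
    """Return the indexes of commas that are not inside a "..." segment."""
    positions = []
    offset = 0
    # The result alternates: even indexes are unquoted text,
    # odd indexes are the quoted segments captured by the group.
    for i, segment in enumerate(QUOTED.split(text)):
        if i % 2 == 0:
            positions.extend(
                offset + j for j, ch in enumerate(segment) if ch == ','
            )
        offset += len(segment)
    return positions


sample = 'This is some text, followed by "text, in quotes!"'
print(commas_outside_quotes(sample))  # [17]
```

Only the comma after "some text" is reported; the one inside the quoted segment is skipped because it never appears in an even-indexed piece.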
Another way to do it:
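A different approach, again sketched in Python as an illustration rather than as the code from this answer: let a single pattern match either a whole quoted string or a comma, and act only on the matches that turn out to be commas. Because the quoted alternative comes first, any comma inside quotes gets swallowed by it. The names `TOKEN`, `commas_outside_quotes_scan`, and `replace_outside_commas` are invented for this sketch, and it assumes straight double quotes with no nesting or escapes.

```python
import re

# Either a complete quoted string, or a bare comma
TOKEN = re.compile(r'"[^"]*"|,')


def commas_outside_quotes_scan(text):
    """Return the indexes of commas that lie outside "..." segments."""
    return [m.start() for m in TOKEN.finditer(text) if m.group() == ',']


def replace_outside_commas(text, replacement):
    """Replace commas outside quotes; quoted strings are copied unchanged."""
    return TOKEN.sub(
        lambda m: replacement if m.group() == ',' else m.group(), text
    )


sample = 'This is some text, followed by "text, in quotes!"'
print(commas_outside_quotes_scan(sample))
print(replace_outside_commas(sample, ';'))
```

This version avoids the bookkeeping of offsets and also gives a replacement function almost for free, at the cost of a slightly less obvious pattern.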
(I give two because, in another language, one solution may be easier to implement than the other, not just because There’s More Than One Way To Do It™.)
Of course, as your problems grow in complexity, there will arise certain benefits of constructing a full-blown parser, but that’s a different horse. For now, this will suffice.