I want to write a regular expression which can match the following specification for string literals. For the last 10 hours, I've gone crazy formulating various regular expressions, none of which seem to work. Finally I've boiled it down to this one:
([^"]|(\\[.\n]))*\"
Basically, the requirements are the following:
- A string literal has to be matched, so I'm matching everything up to the last ". In between there could be a \", which should not end the string.
- We should also be able to escape anything, including a \n, with a '\'.
- Only an unescaped '"' character can end the match, nothing else.
Some sample strings which I need to match correctly are the following:
- \a\b\"\n" => I should match the following characters: '\', 'a', '\', 'b', '\', '"', '\', 'n', '"'
- \"this is still inside the string" => should match the whole text including the last '"'
- 'm about to escape to a newline \'\n'" => There's a \n character in this string, but the string should still match everything from the starting 'm' to the ending '"'.
Could someone please help me formulate such a regex? In my opinion the regex I've provided should do the job, but it's failing for no apparent reason.
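Your regular expression is almost right; you just need to be aware that inside a character class the period . is just a literal . and not "any character except newline". So the escape part has to use the period outside the brackets, for example \\(.|\n) in place of \\[.\n].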
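To check this against the sample strings from the question, here is a minimal sketch in Python. It assumes a backtracking engine such as Python's `re`, so behaviour may differ in a longest-match tool like a lexer generator. It also makes one change the answer above does not mention: the first character class excludes the backslash, so that an escape is always consumed as a pair. The names `STRING_BODY` and `match_literal` are made up for illustration.

```python
import re

# Assumed fix: '.' is moved out of the character class, and the backslash is
# excluded from the first class so a backslash always takes the next character with it.
STRING_BODY = re.compile(r'(?:[^"\\]|\\(?:.|\n))*"')

def match_literal(text):
    """Return the prefix up to and including the closing quote, or None."""
    m = STRING_BODY.match(text)
    return m.group(0) if m else None

samples = [
    r'\a\b\"\n"',                              # escaped quote inside the string
    r'\"this is still inside the string"',     # string starts with an escaped quote
    'm about to escape to a newline \\\n"',    # backslash followed by a real newline
]

for s in samples:
    # Each sample should be matched in full, up to its final unescaped quote.
    print(match_literal(s) == s)
```

With the pattern from the question, the first sample stops at the escaped quote in this engine, because `[^"]` is happy to consume the backslash on its own.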