I’m trying to build a .NET regex to match SQL Server constant strings… but not Unicode strings.
Here’s a bit of SQL:
select * from SomeTable where SomeKey = 'abc''def' and AnotherField = n'another''value'
Note that within a string two single quotes escapes a single quote.
The regex should match ‘abc”def’ but not n’another”value’.
I have a regex now that manages to locate a string, but it also matches the Unicode string (starting just after the N):
'('{2})*([^']*)('{2})*([^']*)('{2})*'
Thanks!
This pattern will do most of what you are looking to do:
The upside is that it should accurately match any number of escaped quotes. (
SomeKey = 'abc''''def'''will matchabc''''def''.)The downside is it also matches Unicode strings, although it captures the leading
nto identify it as a Unicode string. When you process the regular expression, you can ignore matches where the match group “unicode” was successful.The pattern creates the following groups for each match:
If you are using .NET regular expressions, you could add
(?(unicode)(?<-value>))to the end of the pattern to suppress matching the value, although the pattern as a whole would still match.Edit
Having thought about it some more, the following pattern should do exactly what you wanted; it will not match Unicode strings at all. The above approach might still be more readable, however.