I am working in VB.Net and trying to use Regex.Replace to format a string that I use to query SQL. What I'm going for is to cut out "--" comments. I've found that in most cases the code below works for what I need.
string = Regex.Replace(command, "--.*\n", "")
and
string = Regex.Replace(command, "--.*$", "")
However, I have run into a problem. If my query contains a string literal that includes the double dash, it doesn't work: the replace cuts out everything from the double dash to the end of the line. It makes sense to me why, but I can't figure out the regular expression I need to match on.
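To make the failure concrete, here is a minimal sketch that runs the first pattern against a query with a double dash inside a string literal. The module name, the sample query, and the table and column names are made up for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module CommentStripDemo
    Sub Main()
        ' Hypothetical query: the first "--" sits inside a string literal,
        ' the second one starts a real comment.
        Dim command As String = "SELECT * FROM Items WHERE Code = 'A--1' -- look up one item" & vbLf & "ORDER BY Code"

        ' Pattern from the question: remove from the first "--" through the line feed.
        Dim stripped As String = Regex.Replace(command, "--.*\n", "")

        ' The match starts at the "--" inside 'A--1', so the rest of the literal,
        ' the real comment, and the line break are all removed.
        Console.WriteLine(stripped)
        ' Prints: SELECT * FROM Items WHERE Code = 'AORDER BY Code
    End Sub
End Module
```

The result is no longer valid SQL: the string literal is left unterminated and the two lines are joined.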
Logically, I need to match on a string that starts with "--" and is not preceded by "'" and not followed by "'", with any number of characters in between. But I'm not sure how to express that in a regular expression. I have tried variations of:
string = Regex.Replace(cmd, "[^('.*)]--.*\n[^(.*')]", "")
I know this is obviously wrong. I have looked at a couple of online resources, including http://www.codeproject.com/KB/dotnet/regextutorial.aspx, but due to my lack of understanding I can't seem to figure this one out.
I think you meant "match on a string that starts with `--` and is not preceded by `'` and not followed by `'` with any number of characters in between". If so, then this is what you are looking for:
Of course, it means you can’t have apostrophes in your comments… and would be exceedingly easy to hack if someone wanted to (you aren’t thinking of using this to protect against injection attacks, are you? ARE YOU!??! 😀 )
I can break down the expression if you’d like, but it’s essentially the same as my modified quote above!
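For what it's worth, a different technique avoids the apostrophe limitation: match string literals as well as comments, and put the literals back unchanged. The following is only a sketch of that alternative, not the expression discussed in this answer. It assumes single-quoted literals in which `''` is an escaped quote, and it ignores block comments and quoted identifiers. The module name and sample query are hypothetical.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module StripCommentsSketch
    Sub Main()
        ' Hypothetical query with "--" inside a literal and an apostrophe inside a comment.
        Dim command As String = "SELECT * FROM Items WHERE Code = 'A--1' -- it's one item" & vbLf & "ORDER BY Code"

        ' Alternative 1 (captured as group 1): a single-quoted literal, with '' as an escaped quote.
        ' Alternative 2: "--" and everything up to, but not including, the line break.
        Dim pattern As String = "('(?:[^']|'')*')|--[^\r\n]*"

        ' "$1" puts a matched literal back unchanged; for a comment, group 1 is
        ' empty, so the comment is replaced with nothing.
        Dim stripped As String = Regex.Replace(command, pattern, "$1")

        Console.WriteLine(stripped)
        ' Prints two lines (the first keeps its trailing space):
        ' SELECT * FROM Items WHERE Code = 'A--1' 
        ' ORDER BY Code
    End Sub
End Module
```

Because the engine scans left to right, whichever construct starts first wins: a `--` inside a literal is consumed as part of the literal, and an apostrophe inside a comment is consumed as part of the comment. Like the expression above, this is a formatting aid only and offers no protection against injection.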
EDIT:
I modified the expression a little, so it does not consume any carriage return, only the comment itself… the expression says:
[…] `.` (which means everything in this case) if you cannot match the expression that follows. Thus the `.*?` in `.*?--` (when applied against the string `abc--`) will consume `a`, then check to see if the `--` can be matched and fail; it will then consume `ab`, but stop again to see if the `--` can be matched and fail; once it consumes `abc` and the `--` can be matched (success), it will finally consume the entire `abc--`.
`.*` without the `?` will match `abc--` with the `.*`, then try to match the end of the string with `--` and fail; it will then backtrack until it can match the `--`.
"Anything" does not by default include newlines (carriage-return/line-feed), which is needed for this to work properly (there is a switch that will allow `.` to match newlines, and it will break this expression).
A good resource, where I've learned 90% of what I know about regex, is Regular-Expressions.info
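The lazy and greedy behaviour described above can be checked with a short sketch. It uses a longer sample string than `abc--` so that the two quantifiers produce visibly different matches; the module name and sample strings are made up for illustration. One .NET-specific detail worth knowing: `.` excludes only the line feed (`\n`), so with Windows-style line endings a trailing carriage return is still matched by `.*`. The switch that lets `.` match newlines is `RegexOptions.Singleline`.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module QuantifierDemo
    Sub Main()
        Dim sample As String = "abc--def--ghi"

        ' Lazy: .*? grows one character at a time and stops at the first "--".
        Console.WriteLine(Regex.Match(sample, ".*?--").Value)   ' abc--

        ' Greedy: .* takes the whole string, then backtracks to the last "--".
        Console.WriteLine(Regex.Match(sample, ".*--").Value)    ' abc--def--

        Dim twoLines As String = "a -- one" & vbLf & "b"

        ' By default "." stops at the line feed, so the match is just "-- one".
        Console.WriteLine(Regex.Match(twoLines, "--.*").Value.Length)   ' 6

        ' With Singleline, "." also matches the line feed and the match runs into the next line.
        Console.WriteLine(Regex.Match(twoLines, "--.*", RegexOptions.Singleline).Value.Length)   ' 8
    End Sub
End Module
```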
Tread carefully and good luck!