I have a string which is in the following format. When there is a match, the static Regex.Match method runs pretty quickly. Often, however, there will be a string that does not match; while running some scenarios I found that it goes into backtracking and Regex.Match never seems to finish. That happened specifically when the fields were not in order and some of the fields were missing. I have to use regex and was wondering if someone has any tips. Also, I am only retrieving a few group values, for instance 7.
Okay, my data looks like the above. When it matches exactly it runs fine, for example a hundred of these is no problem and I am satisfied with that. When the format is different, for instance some of the fields (say the last four) are missing or some of the fields are ordered differently, Regex.Match just runs forever. In that case, if the format is not the same as my static string, I just want to end the process.
I don’t have your data, so I can’t test a failing case.
Update
Thanks for the sample, I see the problem now. Basically, the regex has overlapping sets that are optionally matched. This is the subexpression
\s*([^}]*?). When overlapping character classes like these are combined, it can be a recipe for catastrophic backtracking: [^}] also matches whitespace, so the same run of spaces can be divided between \s* and [^}]*? in many ways, and the engine tries them all before giving up. In this case the regex is awash with whitespace references. The solution is to force certain parts not to give anything back to the backtracker when backtracking into an optional part brings no benefit (and only hurts). Making a section atomic has the effect of making it behave like a literal: once it has matched, it is never re-entered. In this case the trimming sections are causing the problem; remove them from backtracking and the problem goes away.
But, in order to trim correctly, the \s* expression needs to be altered.
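To show the idea on a much smaller scale, here is a rough C# sketch. It does not use the regex from the question: the single-field pattern, the class name AtomicTrimDemo and the method TryExtractField are all made up for illustration. The trimming whitespace sits in atomic groups, and the capture is rewritten so that it can neither start nor end with whitespace, which is one possible way (an assumption on my part) to keep the trimming correct once \s* can no longer give characters back.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AtomicTrimDemo
{
    // Hypothetical single-field pattern (NOT the regex from the question):
    //   \{                          opening brace
    //   (?>\s*)                     leading whitespace, atomic: never given back
    //   (?>([^\s}]+(?:\s+[^\s}]+)*)) group 1: non-space runs separated by
    //                               whitespace, so it starts and ends on a
    //                               non-space character; also atomic
    //   (?>\s*)                     trailing whitespace, atomic
    //   \}                          closing brace
    private static readonly Regex Field = new Regex(
        @"\{(?>\s*)(?>([^\s}]+(?:\s+[^\s}]+)*))(?>\s*)\}");

    // Returns true and the trimmed field text when the input contains a
    // well-formed field; returns false otherwise.
    public static bool TryExtractField(string input, out string value)
    {
        Match m = Field.Match(input);
        value = m.Success ? m.Groups[1].Value : null;
        return m.Success;
    }

    public static void Main()
    {
        string value;

        // Well-formed field: the capture comes back already trimmed.
        if (TryExtractField("{   hello world   }", out value))
            Console.WriteLine(value);          // prints "hello world"

        // Closing brace missing: the atomic groups give nothing back,
        // so the attempt fails without retrying other whitespace splits.
        if (!TryExtractField("{   hello world   ", out value))
            Console.WriteLine("(no match)");   // prints "(no match)"
    }
}
```

Because \s and [^\s}] cannot match the same character, there is only one way to divide the text between them, so wrapping them in (?>...) does not change what matches; it only removes the pointless retries on a failing input.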
@Alan Moore mentions there are no possessive quantifiers in .NET.
He’s right, so use the ‘Atomic Grouping’ regex below.
Atomic Grouping version:
Possessive quantifiers version:
Expanded with group numbers:
(this may be a bit painful to look at)