In answer to the following question:
How to convert MatchCollection to string array
Given the two LINQ expressions:
var arr = Regex.Matches(strText, @"\b[A-Za-z-']+\b")
    .OfType<Match>() // OfType
    .Select(m => m.Groups[0].Value)
    .ToArray();
and
var arr = Regex.Matches(strText, @"\b[A-Za-z-']+\b")
    .Cast<Match>() // Cast
    .Select(m => m.Groups[0].Value)
    .ToArray();
OfType<> was benchmarked by user Alex to be slightly faster (and confirmed by myself).
This seems counterintuitive to me, as I’d have thought OfType<> would have to do both an ‘is’ comparison, and a cast (T).
Any enlightenment would be appreciated as to why this is the case 🙂
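To make the intuition concrete, here is a simplified sketch of roughly what the two operators do per element. SimpleCast and SimpleOfType are hypothetical stand-ins written for illustration, not the actual framework source, which may contain extra fast paths.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class SimplifiedLinq
{
    // Hypothetical stand-in for Cast<T>: an unconditional cast per element.
    // Throws InvalidCastException if an element is not a T.
    public static IEnumerable<T> SimpleCast<T>(IEnumerable source)
    {
        foreach (object item in source)
            yield return (T)item;
    }

    // Hypothetical stand-in for OfType<T>: a type test per element,
    // silently skipping anything that is not a T (including null).
    public static IEnumerable<T> SimpleOfType<T>(IEnumerable source)
    {
        foreach (object item in source)
            if (item is T typed)
                yield return typed;
    }

    public static void Main()
    {
        object[] mixed = { "a", 1, null, "b" };

        // The int and the null are filtered out.
        Console.WriteLine(string.Join(",", SimpleOfType<string>(mixed))); // a,b

        try
        {
            // Enumeration fails as soon as it reaches the int.
            SimpleCast<string>(mixed).ToList();
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("SimpleCast threw on the int element");
        }
    }
}
```

In this sketch the only per-element difference is the type test, which is why one would expect OfType to be no faster than Cast when every element already has the right type.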
My benchmarking does not agree with your benchmarking.
I ran an identical benchmark to Alex’s and got the opposite result. I then tweaked the benchmark somewhat and again observed Cast being faster than OfType.
There’s not much in it, but I believe that Cast does have the edge, as it should because its iterator is simpler. (No ‘is’ check.)
Edit: Actually, after some further tweaking I managed to get Cast to be 50x faster than OfType.
Below is the code of the benchmark that gives the biggest discrepancy I’ve found so far:
Tweaks I’ve made:
On my machine this results in ~350ms for Cast and ~18000ms for OfType.
I think the biggest difference is that we’re no longer timing how long MatchCollection takes to find the next match. (Or, in my code, how long int.ToString() takes.) This drastically improves the signal-to-noise ratio.
Edit: As sixlettervariables pointed out, the reason for this massive difference is that Cast will short-circuit and not bother casting individual items if it can cast the whole IEnumerable. When I switched from using Regex.Matches to an array to avoid measuring the regex processing time, I also switched to using something castable to IEnumerable<string> and thus activated this short-circuiting. When I altered my benchmark to disable this short-circuiting, I got a slight advantage to Cast rather than a massive one.
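The short-circuit can be observed without timing anything. A minimal sketch, assuming the behaviour described above (Cast<T> hands back the source itself when it already implements IEnumerable<T>); the class and method names are made up for illustration:

```csharp
using System;
using System.Collections;
using System.Linq;

public static class CastShortCircuitDemo
{
    // True when Cast<string> returned the very same object it was given,
    // i.e. no per-element iterator was created.
    public static bool CastReturnsSource(IEnumerable source) =>
        ReferenceEquals(source.Cast<string>(), source);

    // Same check for OfType<string>.
    public static bool OfTypeReturnsSource(IEnumerable source) =>
        ReferenceEquals(source.OfType<string>(), source);

    public static void Main()
    {
        // A string[] already implements IEnumerable<string>.
        IEnumerable typed = new[] { "a", "b", "c" };
        // An object[] that happens to hold strings does not.
        IEnumerable untyped = new object[] { "a", "b", "c" };

        Console.WriteLine(CastReturnsSource(typed));   // True
        Console.WriteLine(CastReturnsSource(untyped)); // False
        // OfType<string> must still drop nulls, so it cannot hand back the source.
        Console.WriteLine(OfTypeReturnsSource(typed)); // False
    }
}
```

Feeding the benchmark an object[] (or any source that is not already an IEnumerable<T> of the target type) is one way to keep the comparison on the per-element path for both operators.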