I’m looking for a regular expression library in .NET that supports lazy evaluation.
Note: I’m specifically looking for lazy evaluation (i.e., the library, instead of immediately returning all matches in a document, only consumes as much of the document as necessary to determine the next match per request), NOT support for lazy quantifiers – though if it also supports lazy quantifiers, I wouldn’t object!
Specific details: I want to be able to run regexes against very large documents with potentially hundreds of thousands of regex matches, and iterate across the results using IEnumerable<> semantics, without having to take the up-front cost of finding all matches.
Ideally FOSS in C#, but the only requirement is usability from a .NET 3.5 app.
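The Match class’s NextMatch method should meet your needs: it returns a new Match for the next result, resuming the search at the position where the previous match ended. A quick look at it in Reflector confirms this behavior.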
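Check out the MSDN reference for NextMatch for an example of its usage. Briefly, the flow is: call Regex.Match to get the first match, process it while its Success property is true, and call NextMatch on it to advance to the next one.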
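A minimal sketch of that flow, wrapped in an iterator so the results can be consumed with IEnumerable<> semantics as the question asks. The LazyRegex class and its EnumerateMatches method are hypothetical names invented for this illustration, not part of the framework; only Regex.Match, Match.Success and Match.NextMatch are real API calls.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class LazyRegex
{
    // Hypothetical helper (not a framework API): yields matches one at a time.
    public static IEnumerable<Match> EnumerateMatches(Regex regex, string input)
    {
        // Regex.Match finds only the first match; each NextMatch call
        // resumes scanning after the previous match, so no work is done
        // for matches the caller never asks for.
        for (Match m = regex.Match(input); m.Success; m = m.NextMatch())
        {
            yield return m;
        }
    }
}

static class Program
{
    static void Main()
    {
        Regex digits = new Regex(@"\d+");
        foreach (Match m in LazyRegex.EnumerateMatches(digits, "a1 b22 c333"))
        {
            Console.WriteLine(m.Value);
            // Breaking out early means the rest of the input is never scanned.
            if (m.Value == "22") break;
        }
    }
}
```

Note that only the matching is deferred here: the input is still a string, so the whole document has to be in memory before the first call.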