I have several regexes (several thousand, actually), and I must check whether a string matches any of them. Testing them one at a time is not very efficient, so I would like to merge them all into a single regex.
For example, if I have these regexes:
- ‘foo *bar’
- ‘foo *zip’
- ‘zap *bar’
I would like to obtain something like ‘foo *(bar|zip)|zap *bar’.
Is there some algorithm, library or tool to do this?
You can just concatenate the regexes using or (|), adding anchors for the beginning/end of the string. Most good regex libraries optimize their finite state automata after they build them from your regex. PCRE does that, for instance.
This step usually takes care of your optimization problem, i.e. the library applies most of the transformations you would otherwise have to do “by hand”.
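As a rough illustration of the concatenation approach, here is a minimal Python sketch using the three patterns from the question. The names `patterns`, `combined` and `matches_any` are made up for this example. Each pattern is wrapped in a non-capturing group so that an or inside one pattern cannot spill into its neighbours, and `fullmatch` plays the role of the begin/end anchors.

```python
import re

# Patterns taken from the question; in practice this list would hold thousands.
patterns = ["foo *bar", "foo *zip", "zap *bar"]

# Wrap every pattern in (?:...) before joining with |, so each alternative
# stays self-contained even if it contains its own alternation.
combined = re.compile("|".join(f"(?:{p})" for p in patterns))


def matches_any(s):
    """Return True if the whole string matches at least one pattern."""
    # fullmatch requires the match to span the entire string,
    # which is equivalent to anchoring at the beginning and the end.
    return combined.fullmatch(s) is not None


if __name__ == "__main__":
    for candidate in ["foo bar", "foo   zip", "zap bar", "foo zap"]:
        print(candidate, matches_any(candidate))
```

This assumes the individual patterns do not rely on numbered backreferences or on inline flags, since both would change meaning once the patterns are merged. How much the merged pattern actually speeds things up depends on the engine, so it is worth measuring with your real pattern set.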