So I’ve been learning a bit about Boost.Spirit to replace the use of regular expressions in a lot of my code. The main reason is pure speed. I’ve found Boost.Spirit to be up to 50 times faster than PCRE for some relatively simple tasks.
One thing that is a big bottleneck in one of my apps is taking some HTML, finding all “img” tags, and extracting the “src” attribute.
This is my current regex:
(?i:<img\s[^\>]*src\s*=\s*[""']([^<][^""']+)[^\>]*\s*/*>)
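For reference, here is a minimal sketch of how a pattern like this might be applied from C++. It assumes std::regex as a stand-in for PCRE, so the inline (?i:...) group is replaced by the icase flag, the doubled quote in the character classes is collapsed to a single one, and the redundant backslash before > is dropped. The function name extract_img_srcs is made up for illustration and is not part of the original code.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the src value of every <img> tag found in html.
// Assumption: std::regex (ECMAScript grammar) replaces PCRE here, so the
// (?i:...) group from the original pattern becomes the icase flag.
std::vector<std::string> extract_img_srcs(const std::string& html) {
    static const std::regex img_re(
        R"re(<img\s[^>]*src\s*=\s*["']([^<][^"']+)[^>]*\s*/*>)re",
        std::regex::icase | std::regex::optimize);

    std::vector<std::string> srcs;
    // Walk over every non-overlapping match; capture group 1 is the src value.
    for (std::sregex_iterator it(html.begin(), html.end(), img_re), end;
         it != end; ++it) {
        srcs.push_back((*it)[1].str());
    }
    return srcs;
}
```

Calling extract_img_srcs on a page's HTML yields the captured src values in document order. Like the pattern itself, it requires a src value of at least two characters and does not try to handle unquoted attributes.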
I’ve been playing around with it trying to get something to work in Spirit, but so far I’ve come up empty. Any tips on how to create a set of Spirit rules that will accomplish the same thing as this regex would be awesome.
Out of curiosity I redid my regex sample based on Boost Xpressive, using statically compiled regexes:
Interestingly, there is no discernible speed difference when using the dynamic regular expression; on the whole, however, the Xpressive version performs better than the Boost Regex version (by roughly 10%).
The relevant code is as follows: (full code at https://gist.github.com/c16725584493b021ba5b)