I was playing around with Boost.Regex to parse strings for words and numbers. This is what I have so far:
    #include <iostream>
    #include <string>
    #include <boost/foreach.hpp>
    #include <boost/regex.hpp>
    #include <boost/range.hpp>

    using namespace std;
    using namespace boost;

    int main()
    {
        // Group 1 is the whole token: either a lowercase word or a number.
        regex re
        (
            "("
                "([a-z]+)|"
                "(-?[0-9]+(\\.[0-9]+)?)"
            ")"
        );

        string s = "here is a\t list of Words. and some 1239.32 numbers to 3323 parse.";

        sregex_iterator m1(s.begin(), s.end(), re), m2;

        // Print each token with its position and length in s.
        BOOST_FOREACH (const match_results<string::const_iterator>& what, make_iterator_range(m1, m2))
        {
            cout << ':' << what[1].str() << ':' << what.position(1) << ':' << what.length(1) << endl;
        }

        return 0;
    }
Is there a way to tell regex to parse from a stream rather than a string? It seems like it should be possible to use any iterator.
Boost.IOStreams has a regex_filter allowing one to perform the equivalent of a regex_replace on a stream. However, looking at the implementation, it seems to ‘cheat’ in that it simply loads the whole stream into a buffer and then calls Boost.Regex on that buffer.
Searching a stream's contents with a regex without loading it entirely into memory can be done with the "partial match" support of Boost.Regex (the match_partial flag). See the example at the end of the Boost.Regex documentation page on partial matches.