I need to convert several million dates stored as wide strings into boost dates.
The following code works. However, it generates a horrible compiler warning and does not seem efficient.
Is there a better way?
    #include "boost/date_time/gregorian/gregorian.hpp"
    using namespace boost::gregorian;
    #include <algorithm>
    #include <iostream>
    #include <string>
    using namespace std;

    wstring ws( L"2008/01/01" );

    // copy the wide characters one by one into a narrow string
    string temp( ws.length(), '\0' );
    copy( ws.begin(), ws.end(), temp.begin() );
    date d1( from_simple_string( temp ) );

    cout << d1;
The better way turns out to be the standard C++ library locale, which is a collection of facets. A facet is a service that lets the stream operators handle a particular choice of date or time representation, or just about anything else. All these choices, each handled by its own facet, are gathered together in a locale.
This solution was pointed out to me by litb who gave me enough help to use facets in my production code, making it terser and faster. Thanks.
There is an excellent tutorial on locales and facets by Nathan Myers who designed facets. He has a light style which makes his tutorial easy to read, though this is advanced stuff and your brain may hurt after the first read through – mine did. I suggest you go there now. For anyone who just wants the practicalities of converting wide character strings to boost dates, the rest of this post describes the minimum necessary to make it work.
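To make the locale and facet relationship concrete before getting to boost, here is a minimal sketch that uses only the standard library. The comma_decimal facet and the three helper names are made up for this illustration and do not come from litb's answer. The facet is heap-allocated because a locale takes ownership of facets constructed with the default reference count.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet for illustration: makes wide streams treat ',' as the decimal point.
struct comma_decimal : std::numpunct<wchar_t> {
protected:
    wchar_t do_decimal_point() const override { return L','; }
};

// Builds a new locale: the classic "C" locale with its numpunct facet replaced.
std::locale make_comma_locale() {
    return std::locale(std::locale::classic(), new comma_decimal);
}

// Looks up the numpunct facet held by a locale and asks it for its decimal point.
wchar_t decimal_point_of(const std::locale& loc) {
    return std::use_facet<std::numpunct<wchar_t>>(loc).decimal_point();
}

// Parses text such as L"3,5" through a stream imbued with the custom locale.
double parse_with_comma(const std::wstring& text) {
    std::wistringstream in(text);
    in.imbue(make_comma_locale());
    double value = 0.0;
    in >> value;
    return value;
}
```

The stream code itself never mentions commas; the choice lives entirely in the facet, which is the same division of labour the date facets rely on.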
litb first offered the following simple solution that removes the compiler warning. ( The solution was edited before I got around to accepting it. ) This looks like it does the same thing, converting wide characters one by one, but it avoids mucking around with temp strings and therefore is much clearer, I think. I really like that the compiler warning is gone.
    #include "boost/date_time/gregorian/gregorian.hpp"
    using namespace boost::gregorian;
    #include <iostream>
    #include <string>
    using namespace std;

    wstring ws( L"2008/01/01" );

    // build the narrow string directly from the wide string's iterator range
    date d1( from_simple_string( string( ws.begin(), ws.end() ) ) );

    cout << d1;
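The range constructor above converts each wchar_t to char implicitly, which is fine for digits and slashes but silently mangles anything wider. A minimal sketch of a stricter variant, assuming the date strings are expected to be plain ASCII; narrow_ascii is a hypothetical helper name, not part of boost:

```cpp
#include <stdexcept>
#include <string>

// Narrows a wide string to a byte string, rejecting anything outside 7-bit ASCII
// instead of truncating it silently.
std::string narrow_ascii(const std::wstring& wide) {
    std::string narrow;
    narrow.reserve(wide.size());
    for (wchar_t wc : wide) {
        // The cast to unsigned long also catches negative values where wchar_t is signed.
        if (static_cast<unsigned long>(wc) > 0x7FUL) {
            throw std::invalid_argument("non-ASCII character in date string");
        }
        narrow.push_back(static_cast<char>(wc));  // explicit cast documents the narrowing
    }
    return narrow;
}
```

The result can be handed to from_simple_string exactly like the temporary string in the snippets above.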
litb went on to suggest using ‘facets’, which I had never heard of before. They seem to do the job, producing incredibly terse code inside the loop, at the cost of a prologue where the locale is set up.
    wstring ws( L"2008/01/01" );

    // construct a locale to collect all the particulars of the 'greek' style
    locale greek_locale;

    // construct a facet to handle greek dates - wide characters in 2008/12/31 format
    wdate_input_facet greek_date_facet( L"%Y/%m/%d" );

    // add facet to locale
    greek_locale = locale( greek_locale, &greek_date_facet );

    // construct stringstream to use greek locale
    std::wstringstream greek_ss;
    greek_ss.imbue( greek_locale );

    date d2;
    greek_ss << ws;
    greek_ss >> d2;

    cout << d2;
This, it turns out, is also more efficient:
    clock_t start, finish;
    double duration;

    // 1st method: copy into a temporary narrow string
    start = clock();
    for( int k = 0; k < 100000; k++ ) {
        string temp( ws.length(), '\0' );
        copy( ws.begin(), ws.end(), temp.begin() );
        date d1( from_simple_string( temp ) );
    }
    finish = clock();
    duration = (double)(finish - start) / CLOCKS_PER_SEC;
    cout << "1st method: " << duration << endl;

    // 2nd method: construct the narrow string from the iterator range
    start = clock();
    for( int k = 0; k < 100000; k++ ) {
        date d1( from_simple_string( string( ws.begin(), ws.end() ) ) );
    }
    finish = clock();
    duration = (double)(finish - start) / CLOCKS_PER_SEC;
    cout << "2nd method: " << duration << endl;

    // 3rd method: wide stringstream with the date facet
    start = clock();
    for( int k = 0; k < 100000; k++ ) {
        greek_ss << ws;
        greek_ss >> d2;
        greek_ss.clear();
    }
    finish = clock();
    duration = (double)(finish - start) / CLOCKS_PER_SEC;
    cout << "3rd method: " << duration << endl;
Produces the following output:
    1st method: 2.453
    2nd method: 2.422
    3rd method: 1.968
OK, this is now in the production code and passing regression tests. It looks like this:
    // .. construct greek locale and stringstream

    // ... loop over input extracting date strings

        // convert range to boost dates
        date d1;
        greek_ss << sd1;
        greek_ss >> d1;
        if( greek_ss.fail() ) {
            // input is garbled
            wcout << L"do not understand " << sl << endl;
            exit(1);
        }
        greek_ss.clear();

    // finish processing and end loop
I have one final question about this. Adding the facet to the locale seems to require two invocations of the locale copy constructor:

    // add facet to locale
    greek_locale = locale( greek_locale, &greek_date_facet );
Why is there not an add( facet* ) method? ( _Addfac() is complex, undocumented and deprecated )
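One plausible reason is that a std::locale is immutable once constructed, so a facet can only be "added" by building a new locale from an old one plus the facet. The separate default construction and assignment can at least be folded into a single construction. A small sketch using only the standard library; marker_facet and with_marker are hypothetical names introduced for the illustration:

```cpp
#include <cstddef>
#include <locale>

// Hypothetical facet type: a facet derives from std::locale::facet
// and exposes a public static id member.
class marker_facet : public std::locale::facet {
public:
    static std::locale::id id;
    explicit marker_facet(std::size_t refs = 0) : std::locale::facet(refs) {}
};
std::locale::id marker_facet::id;

// Returns a copy of base that also carries a marker_facet.
// base itself is left untouched; the new locale owns the heap-allocated facet.
std::locale with_marker(const std::locale& base) {
    return std::locale(base, new marker_facet);
}
```

std::has_facet<marker_facet> reports false for the base locale and true for the locale returned by with_marker, which shows that the original was not modified.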
efotinis found a good way using from_stream.
I’ve looked into the manual of date_time and found it supports facets. You could also go with that.
I’ve looked up how date facets work:
The boost::date_time::date_input_facet template implements a facet.
Facets are derived from std::locale::facet, and every one has a unique id.
When you construct a std::locale using the form I showed, you give it an existing locale and a pointer to a facet. The given facet will replace any existing facet of the same type in the given locale (so it would replace any other date_input_facet used).
You can use std::has_facet<Facet>(some_locale) to check whether the given locale has some given facet type.
You can use std::use_facet<Facet>(some_locale).some_member... to access a facet held by a locale.
The below is essentially done by operator>> by boost::date_time: