Windows XP SP3. Core 2 Duo 2.0 GHz.
I’m finding boost::lexical_cast performance to be extremely slow, and I want to find ways to speed up the code. Using /O2 optimizations on Visual C++ 2008 and comparing with Java 1.6 and Python 2.6.2, I see the following results.
Integer casting:
c++:
std::string s ;
for(int i = 0; i < 10000000; ++i)
{
s = boost::lexical_cast<std::string>(i);
}
java:
String s = new String();
for(int i = 0; i < 10000000; ++i)
{
s = new Integer(i).toString();
}
python:
for i in xrange(1,10000000):
    s = str(i)
The times I’m seeing are
c++: 6700 milliseconds
java: 1178 milliseconds
python: 6702 milliseconds
C++ is as slow as Python and almost 6 times slower than Java.
Double casting:
c++:
std::string s ;
for(int i = 0; i < 10000000; ++i)
{
double d = i*1.0;
s = boost::lexical_cast<std::string>(d);
}
java:
String s = new String();
for(int i = 0; i < 10000000; ++i)
{
double d = i*1.0;
s = new Double(d).toString();
}
python:
for i in xrange(1,10000000):
    d = i*1.0
    s = str(d)
The times I’m seeing are
c++: 56129 milliseconds
java: 2852 milliseconds
python: 30780 milliseconds
So for doubles, C++ is actually about half the speed of Python and 20 times slower than the Java solution! Any ideas on improving the boost::lexical_cast performance? Does this stem from a poor stringstream implementation, or can we expect a general 10x decrease in performance from using the Boost libraries?
Edit 2012-04-11
rve quite rightly commented about lexical_cast’s performance, providing a link:
http://www.boost.org/doc/libs/1_49_0/doc/html/boost_lexical_cast/performance.html
I don’t have access right now to boost 1.49, but I do remember making my code faster on an older version. So I guess:
Original answer
Just to add info on Barry’s and Motti’s excellent answers:
Some background
Please remember Boost is written by the best C++ developers on this planet, and reviewed by the same best developers. If lexical_cast was so wrong, someone would have hacked the library either with criticism or with code.
I guess you missed the point of lexical_cast’s real value…
Comparing apples and oranges.
In Java, you are casting an integer into a Java String. You’ll note I’m not talking about an array of characters, or a user defined string. You’ll note, too, I’m not talking about your user-defined integer. I’m talking about strict Java Integer and strict Java String.
In Python, you are more or less doing the same.
As said by other posts, you are, in essence, using the Java and Python equivalents of sprintf (or the less standard itoa).
In C++, you are using a very powerful cast. Not powerful in the sense of raw speed performance (if you want speed, perhaps sprintf would be better suited), but powerful in the sense of extensibility.
Comparing apples.
If you want to compare a Java Integer.toString method, then you should compare it with either C sprintf or C++ ostream facilities.
The C++ stream solution would be 6 times faster (on my g++) than lexical_cast, and quite less extensible:
The C sprintf solution would be 8 times faster (on my g++) than lexical_cast but a lot less safe:
Both solutions are either as fast or faster than your Java solution (according to your data).
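The snippets for those two conversions are not shown above, so here is a minimal sketch of what each approach could look like. The function names int_to_string_stream and int_to_string_printf are made up for illustration, and I use the bounded snprintf (C++11) rather than plain sprintf, which is an assumption on my part and not necessarily what was measured:

```cpp
#include <cassert>
#include <cstdio>
#include <sstream>
#include <string>

// Stream-based conversion: type-safe, but every call constructs a stream.
// (Hypothetical helper name, for illustration only.)
std::string int_to_string_stream(int value)
{
    std::ostringstream oss;
    oss << value;
    return oss.str();
}

// printf-family conversion: usually faster, but the format string and the
// buffer size are the caller's responsibility, hence "less safe".
// (Hypothetical helper name, for illustration only.)
std::string int_to_string_printf(int value)
{
    char buffer[16]; // assumes a 32-bit int: sign + 10 digits + '\0' fits
    const int written = std::snprintf(buffer, sizeof buffer, "%d", value);
    assert(written > 0 && written < static_cast<int>(sizeof buffer));
    return std::string(buffer, static_cast<std::size_t>(written));
}
```

Both helpers only handle int. That is exactly the trade-off: each is a specific solution to a specific problem, where lexical_cast is a generic one.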
Comparing oranges.
If you want to compare a C++ lexical_cast, then you should compare it with this Java pseudo code:
Source and Target being of whatever type you want, including built-in types like boolean or int, which is possible in C++ because of templates.
Extensibility? Is that a dirty word?
No, but it has a well known cost: When written by the same coder, general solutions to specific problems are usually slower than specific solutions written for their specific problems.
In the current case, in a naive viewpoint, lexical_cast will use the stream facilities to convert from a type A into a string stream, and then from this string stream into a type B.
This means that as long as your object can be output into a stream, and input from a stream, you’ll be able to use lexical_cast on it, without touching any single line of code.
So, what are the uses of lexical_cast?
The main uses of lexical casting are:
Point 2 is very important here, because it means we have one and only one interface/function to cast a value of a type into an equal or similar value of another type.
This is the real point you missed, and this is the point that costs in performance terms.
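To make the naive "type A into a string stream, then string stream into type B" model concrete, here is a minimal sketch. naive_lexical_cast and Point are hypothetical names of mine, and this is not Boost's actual implementation (which, among other things, also checks that the whole input was consumed):

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the naive model described above:
// write Source into a stream, then read Target back out of it.
template <typename Target, typename Source>
Target naive_lexical_cast(const Source& source)
{
    std::stringstream stream;
    Target target{};
    if (!(stream << source) || !(stream >> target))
        throw std::runtime_error("naive_lexical_cast: conversion failed");
    return target;
}

// A user-defined type only has to provide stream operators to take part.
struct Point { int x; int y; };

std::ostream& operator<<(std::ostream& os, const Point& p)
{
    return os << p.x << ',' << p.y;
}

std::istream& operator>>(std::istream& is, Point& p)
{
    char comma = 0;
    if (is >> p.x >> comma >> p.y && comma != ',')
        is.setstate(std::ios::failbit); // wrong separator: report failure
    return is;
}

// Usage sketch: the same function handles built-in and user-defined types.
void example()
{
    assert(naive_lexical_cast<std::string>(42) == "42");
    assert(naive_lexical_cast<int>(std::string("17")) == 17);
    const Point p = naive_lexical_cast<Point>(std::string("3,4"));
    assert(p.x == 3 && p.y == 4);
    assert(naive_lexical_cast<std::string>(p) == "3,4");
}
```

The cast itself never mentions Point. That single interface is the extensibility you are paying for with the stream round trip.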
But it’s so slooooooowwww!
If you want raw speed performance, remember you’re dealing with C++, and that you have a lot of facilities to handle conversion efficiently, and still keep the lexical_cast ease-of-use feature.
It took me some minutes to look at the lexical_cast source, and come up with a viable solution. Add to your C++ code the following code:
By enabling this specialization of lexical_cast for strings and ints (by defining the macro SPECIALIZE_BOOST_LEXICAL_CAST_FOR_STRING_AND_INT), my code went 5 times faster on my g++ compiler, which means, according to your data, its performance should be similar to Java’s.
And it took me 10 minutes of looking at boost code, and writing a remotely efficient and correct 32-bit version. And with some work, it could probably go faster and safer (if we had direct write access to the std::string internal buffer, we could avoid a temporary external buffer, for example).
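The specialization itself is not reproduced above, so the following only illustrates the general idea: keep one generic cast, and add a fast path for int to std::string that writes digits into a temporary buffer. my_cast is a hypothetical stand-in that I specialize here instead of touching boost::lexical_cast, and the digit loop is my own assumption, not the code that was measured:

```cpp
#include <cassert>
#include <climits>
#include <sstream>
#include <string>

// Generic fallback (hypothetical name): goes through a stream.
template <typename Target, typename Source>
Target my_cast(const Source& source)
{
    std::stringstream stream;
    Target target{};
    stream << source;
    stream >> target;
    return target;
}

static_assert(sizeof(int) * CHAR_BIT <= 32,
              "the buffer below is sized for ints of at most 32 bits");

// Fast path: int -> std::string without any stream.
template <>
inline std::string my_cast<std::string, int>(const int& value)
{
    char buffer[12]; // temporary external buffer: sign + 10 digits, with room to spare
    char* const end = buffer + sizeof buffer;
    char* pos = end;
    // Work on the unsigned magnitude so INT_MIN does not overflow on negation.
    unsigned int magnitude = value < 0
        ? 0u - static_cast<unsigned int>(value)
        : static_cast<unsigned int>(value);
    do {
        *--pos = static_cast<char>('0' + magnitude % 10u); // fill from the right
        magnitude /= 10u;
    } while (magnitude != 0u);
    if (value < 0)
        *--pos = '-';
    return std::string(pos, end);
}

// Usage sketch: callers keep the same interface and get the fast path for int.
void example()
{
    assert(my_cast<std::string>(12345) == "12345");
    assert(my_cast<std::string>(-42) == "-42");
    assert(my_cast<int>(std::string("7")) == 7); // still uses the generic fallback
}
```

The call site does not change: my_cast<std::string>(i) picks the specialization automatically, so the ease of use stays and only the int case pays for less machinery.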