I’m trying to optimize the following. The code bellow does this :
If a = 0.775 and I need precision 2 dp then a => 0.78
Basically, if the last digit is 5, it rounds upwards the next digit, otherwise it doesn’t.
My problem was that 0.45 doesnt round to 0.5 with 1 decimalpoint, as the value is saved as 0.44999999343…. and setprecision rounds it to 0.4.
Thats why setprecision is forced to be higher setprecision(p+10) and then if it really ends in a 5, add the small amount in order to round up correctly.
Once done, it compares a with string b and returns the result. The problem is, this function is called a few billion times, making the program craw. Any better ideas on how to rewrite / optimize this and what functions in the code are so heavy on the machine?
bool match(double a,string b,int p) { //p = precision no greater than 7dp
double t[] = {0.2, 0.02, 0.002, 0.0002, 0.00002, 0.000002, 0.0000002, 0.00000002};
stringstream buff;
string temp;
buff << setprecision(p+10) << setiosflags(ios_base::fixed) << a; // 10 decimal precision
buff >> temp;
if(temp[temp.size()-10] == '5') a += t[p]; // help to round upwards
ostringstream test;
test << setprecision(p) << setiosflags(ios_base::fixed) << a;
temp = test.str();
if(b.compare(temp) == 0) return true;
return false;
}
I wrote an integer square root subroutine with nothing more than a couple dozen lines of ASM, with no API calls whatsoever – and it still could only do about 50 million SqRoots/second (this was about five years ago …).
The point I’m making is that if you’re going for billions of calls, even today’s technology is going to choke.
But if you really want to make an effort to speed it up, remove as many API usages as humanly possible. This may require you to perform API tasks manually, instead of letting the libraries do it for you. Specifically, remove any type of stream operation. Those are slower than dirt in this context. You may really have to improvise there.
The only thing left to do after that is to replace as many lines of C++ as you can with custom ASM – but you’ll have to be a perfectionist about it. Make sure you are taking full advantage of every CPU cycle and register – as well as every byte of CPU cache and stack space.
You may consider using integer values instead of floating-points, as these are far more ASM-friendly and much more efficient. You’d have to multiply the number by 10^7 (or 10^p, depending on how you decide to form your logic) to move the decimal all the way over to the right. Then you could safely convert the floating-point into a basic integer.
You’ll have to rely on the computer hardware to do the rest.
<--Microsoft Specific-->I’ll also add that C++ identifiers (including static ones, as Donnie DeBoer mentioned) are directly accessible from ASM blocks nested into your C++ code. This makes inline ASM a breeze.
<--End Microsoft Specific-->