I often read that unique_ptr should be preferred over shared_ptr in most situations, because unique_ptr is non-copyable and has move semantics, whereas shared_ptr adds overhead from copying and ref-counting.
But when I test unique_ptr in some situations, it appears to be noticeably slower (on access) than its counterpart.
For example, under gcc 4.5:
Edit: the print method doesn't actually print anything.
#include <iostream>
#include <string>
#include <memory>
#include <chrono>
#include <utility>
#include <vector>

class Print {
public:
    void print() {}  // intentionally empty: only the call through the pointer is timed
};

void test()
{
    typedef std::vector<std::shared_ptr<Print>> sh_vec;
    typedef std::vector<std::unique_ptr<Print>> u_vec;
    sh_vec shvec;
    u_vec uvec;
    // can't use initializer_list with unique_ptr
    for (int var = 0; var < 100; ++var) {
        std::shared_ptr<Print> p(new Print());
        shvec.push_back(p);
        std::unique_ptr<Print> p1(new Print());
        uvec.push_back(std::move(p1));
    }

    //-------------test shared_ptr-------------------------
    auto time_sh_1 = std::chrono::system_clock::now();
    for (auto var = 0; var < 1000; ++var)
    {
        for (auto it = shvec.begin(), end = shvec.end(); it != end; ++it)
        {
            (*it)->print();
        }
    }
    auto time_sh_2 = std::chrono::system_clock::now();
    // convert explicitly: the clock's native tick is not guaranteed to be a microsecond
    std::cout << "test shared_ptr : "
              << std::chrono::duration_cast<std::chrono::microseconds>(time_sh_2 - time_sh_1).count()
              << " microseconds." << std::endl;

    //-------------test unique_ptr-------------------------
    auto time_u_1 = std::chrono::system_clock::now();
    for (auto var = 0; var < 1000; ++var)
    {
        for (auto it = uvec.begin(), end = uvec.end(); it != end; ++it)
        {
            (*it)->print();
        }
    }
    auto time_u_2 = std::chrono::system_clock::now();
    std::cout << "test unique_ptr : "
              << std::chrono::duration_cast<std::chrono::microseconds>(time_u_2 - time_u_1).count()
              << " microseconds." << std::endl;
}

int main()
{
    test();
}
On average I get (g++ -O0):
- shared_ptr: 1480 microseconds
- unique_ptr: 3350 microseconds
Where does the difference come from? Can it be explained?
All you did in the timed blocks is access them. That won’t involve any additional overhead at all. The increased time probably comes from the console output scrolling. You can never, ever do I/O in a timed benchmark.
And if you want to test the overhead of ref counting, then actually do some ref counting. How is the increased time for construction, destruction, assignment and other mutating operations of shared_ptr going to factor into your time at all if you never mutate a shared_ptr?
Edit: If there's no I/O, then where are the compiler optimizations? They should have nuked the whole thing. Even ideone junked the lot.