Is there a function (SSEx intrinsics are OK) that fills memory with a specified int32_t value? For instance, when the value is 0xAABBCC00, the resulting memory should look like:
AABBCC00AABBCC00AABBCC00AABBCC00AABBCC00
AABBCC00AABBCC00AABBCC00AABBCC00AABBCC00
AABBCC00AABBCC00AABBCC00AABBCC00AABBCC00
AABBCC00AABBCC00AABBCC00AABBCC00AABBCC00
...
I could use std::fill or a simple for loop, but neither is fast enough.
The vector is resized only once, at the start of the program, so that is not an issue. The bottleneck is filling the memory.
Simplified code:
struct X
{
    typedef std::vector<int32_t> int_vec_t;
    int_vec_t buffer;
    X() : buffer( 5000000 ) { /* some more action */ }
    ~X() { /* some code here */ }
    // the following function is called 25 times per second
    const int_vec_t& process( int32_t background, const SOME_DATA& data );
};

const X::int_vec_t& X::process( int32_t background, const SOME_DATA& data )
{
    // the following line takes 30% of the total time of process()
    std::fill( buffer.begin(), buffer.end(), background );
    // some processing
    // ...
    return buffer;
}
Thanks to everyone for your answers. I've checked wj32's solution, but it takes about the same time as std::fill does. My current solution, which uses memcpy, runs 4 times faster than std::fill (in Visual Studio 2008).
In production code one needs to check whether buffer.size() is divisible by 4 and handle the case where it is not.
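For illustration, here is a minimal sketch of what a memcpy-based fill could look like. It is not necessarily the exact code measured above: the function name fill_by_doubling and the seed size of 64 elements are assumptions made for this example. The idea is to fill a short prefix the ordinary way and then keep doubling the filled region with memcpy.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not from the original post): fills `buffer` with
// `value` by seeding a short prefix with std::fill and then repeatedly
// doubling the already-filled region with memcpy.
void fill_by_doubling(std::vector<int32_t>& buffer, int32_t value)
{
    const std::size_t n = buffer.size();
    if (n == 0)
        return;

    // Seed: fill a short prefix the usual way. 64 elements is an arbitrary
    // choice for this sketch, not a tuned value.
    std::size_t filled = std::min<std::size_t>(n, 64);
    std::fill(buffer.begin(), buffer.begin() + filled, value);

    int32_t* p = buffer.data();
    while (filled < n) {
        // Copy no more elements than are already filled, so the source
        // [0, count) and destination [filled, filled + count) never overlap.
        const std::size_t count = std::min(filled, n - filled);
        std::memcpy(p + filled, p, count * sizeof(int32_t));
        filled += count;
    }
}
```

Because the last copy is clamped to the remaining length, this version needs no special case for sizes that are not divisible by 4. Whether it actually beats std::fill depends on the compiler and standard library, so it should be measured on the target build.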