I ask this question out of curiosity rather than difficulty, as I always learn from you, even on unrelated topics.
So, consider the following method, written in C++ and linked with g++. This method works fine, as everything is initialized to the correct size.
extern "C"
{
void retrieveObject( int id, char * buffer )
{
Object::Object obj;
extractObject( id, obj );
memcpy( buffer, &obj, sizeof(obj) );
}
}
// Prototype of extractObject
const bool extractObject( const int& id, Object::Object& obj ) const;
Now, I would like to avoid declaring a local Object and calling memcpy.
I tried to replace retrieveObject with something like:
void retrieveObject( int id, char * buffer )
{
// Also tried dynamic_cast and C-Style cast
extractObject( id, *(reinterpret_cast<Object::Object *>(buffer)) );
}
It compiles and links successfully, but crashes right away. Given that my buffer is large enough to hold an Object, does C++ need to call the constructor to “shape” the memory? Is there another way to get rid of the local variable and the memcpy?
I hope I was clear enough for you to answer, thank you in advance.
In your first effort…
…you still had the compiler create the local variable obj, which guarantees correct alignment. In the second effort…
…you’re promising the compiler that the buffer points to a byte that’s aligned appropriately for an Object::Object. But will it be? Probably not, given your run-time crash. Generally, a char* can point at any byte, whereas more complex objects are usually aligned to the word size or to the largest alignment required by any of their data members. Reading or writing the ints, doubles, pointers etc. inside Object::Object may only work when the memory is properly aligned. It depends on your CPU, but on UNIX/Linux a misaligned access can raise a signal such as SIGBUS or SIGSEGV.
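A quick way to see whether a given buffer honours that promise is to compare its address against alignof. The following is only a minimal sketch: Sample is a hypothetical stand-in for Object::Object, isAlignedFor is a helper name made up for this illustration, and treating the pointer's integer value as its address is an assumption that holds on mainstream platforms.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for Object::Object: the double member
// gives the struct an alignment requirement larger than 1.
struct Sample {
    double weight;
    int count;
};

// Returns true if p could legally be the address of a T,
// as far as alignment is concerned.
template <typename T>
bool isAlignedFor(const void* p) {
    assert(p != nullptr);
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}
```

With storage declared as alignas(Sample) char storage[sizeof(Sample) + 1], isAlignedFor<Sample>(storage) is true and isAlignedFor<Sample>(storage + 1) is false. A plain char* handed in by a caller carries no such guarantee either way.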
To explain this, let’s consider a simple CPU/memory architecture. Say the memory allows, in any given operation, 4 bytes (a 32-bit architecture) to be read from addresses 0-3, 4-7, or 8-11 etc., but you can’t read 4-byte chunks at addresses 1-4, 2-5, 3-6, 5-8…. That sounds strange, but it’s actually quite a common limitation for memory, so just accept it and consider the consequences. Suppose we want to read a 4-byte number from memory. If it’s at one of those multiple-of-4 addresses, we can get it in one memory read. Otherwise we have to read twice: once from the 4-byte area containing part of the data and once from the 4-byte area containing the rest, then throw away the bits we don’t want and reassemble the remainder in the proper places to get the 32-bit value into the CPU register/memory.
That’s too slow, so languages typically take care to put values where the memory can access them in one operation. Even the CPUs are designed with this expectation, as they often have instructions that operate on values in memory directly, without explicitly loading them into registers (i.e. the load is an implementation detail beneath even the level of assembly/machine code). On CPUs that enforce alignment, code that asks the CPU to operate on data that’s not aligned like this typically results in the CPU generating an interrupt, which the OS might manifest as a signal.
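When a 4-byte value really does have to live at an arbitrary address, the portable way to touch it is to go through memcpy and let the compiler pick whatever sequence of reads the target needs. This is a sketch rather than anything from the question's code; loadU32 and storeU32 are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reads a 32-bit value from any address, aligned or not.
// memcpy works byte-wise, so no aligned-access promise is made.
std::uint32_t loadU32(const unsigned char* bytes) {
    assert(bytes != nullptr);
    std::uint32_t value;
    std::memcpy(&value, bytes, sizeof value);
    return value;
}

// Writes a 32-bit value to any address, aligned or not.
void storeU32(unsigned char* bytes, std::uint32_t value) {
    assert(bytes != nullptr);
    std::memcpy(bytes, &value, sizeof value);
}
```

Calling storeU32(mem + 1, v) followed by loadU32(mem + 1) round-trips v even though mem + 1 is not a multiple of 4, which is exactly the case where dereferencing a cast std::uint32_t* could fault.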
That said, the other caveats about the safety of using this on non-POD data are also valid.
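Putting the alignment and non-POD caveats together, one possible shape for a memcpy-free version is sketched below. Everything here is an assumption made for illustration: Record stands in for Object::Object, extractRecord stands in for extractObject, and the type is required to be trivially copyable so that a byte-wise fallback is legitimate.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <new>
#include <type_traits>

// Hypothetical stand-in for Object::Object.
struct Record {
    int id;
    double value;
};

// The byte-wise fallback below is only valid for trivially copyable types.
static_assert(std::is_trivially_copyable<Record>::value,
              "Record must be trivially copyable to be moved with memcpy");

// Hypothetical stand-in for extractObject: fills 'out' from an id.
bool extractRecord(int id, Record& out) {
    out.id = id;
    out.value = id * 1.5;
    return true;
}

void retrieveRecord(int id, char* buffer) {
    assert(buffer != nullptr);
    const bool aligned =
        reinterpret_cast<std::uintptr_t>(buffer) % alignof(Record) == 0;
    if (aligned) {
        // Placement new starts the object's lifetime inside the buffer,
        // so the extraction can write into it directly.
        Record* rec = new (buffer) Record;
        extractRecord(id, *rec);
    } else {
        // Misaligned buffer: fall back to a local object plus memcpy.
        Record tmp;
        extractRecord(id, tmp);
        std::memcpy(buffer, &tmp, sizeof tmp);
    }
}
```

If the caller can guarantee alignment (for example by declaring the buffer with alignas), the fallback branch is never taken. If it cannot, the original local-variable-plus-memcpy version remains the safe choice.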