Say you are writing compilers for different architectures.
The architectures have different endianness.
You have memory read and write instructions.
Take the example of a store instruction, where you want to store the value 0xAABBCCDD.
Now, while writing the assembly for this, do you write two different instructions for the
different architectures, e.g.
For the little endian: st (reg), 0xDDCCBBAA
For the big endian: st (reg), 0xAABBCCDD
Or do you write the same instruction, say st (reg), 0xAABBCCDD, for both architectures and let the processor interpret it in a way that takes care of the endianness of the system?
The reason I ask is that I don’t know what a binary translator would do when it has to translate code between architectures of different endianness. If in Architecture A you see the line st (reg), XY, do you convert it into st (reg), YX for Architecture B? If that is the case, then what happens to memory reads?
I would like to know how to take care of endianness, considering memory reads and writes in binary translation.
I’m not sure I understand your question fully, but it sounds like you want to translate some assembly-language code or a disassembled binary?
Every assembler I’ve ever worked with handles the endianness of constants in the sane way. That is to say, if you want to store 0xAABBCCDD, you would write the constant as 0xAABBCCDD on either architecture, e.g. st (reg), 0xAABBCCDD in your notation.
And the assembler will swizzle the constant if necessary for the appropriate opcode. Where endianness becomes a concern is where you want to store multiple single-byte values using that one operation, something like writing a short null-terminated string "123" to memory using the same opcode. You have to swizzle that constant in your assembly code to get it output to memory in the right order for little- vs. big-endian systems: 0x31323300 on a big-endian target versus 0x00333231 on a little-endian one.
The safe way is to just store the bytes, one at a time, in the order you want them.
But that takes four instructions instead of one.
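To make the difference concrete, here is a rough Python sketch that models a 32-bit store as "turn a value into four bytes, lowest address first". The helpers store_word and load_word are made up for illustration and are not part of any assembler or translator; the sketch only assumes a 4-byte word and the two common byte orders.

```python
def store_word(value: int, byteorder: str) -> bytes:
    """Model a 32-bit store: return the four bytes as they land in memory,
    lowest address first. byteorder is "little" or "big"."""
    return value.to_bytes(4, byteorder)


def load_word(memory: bytes, byteorder: str) -> int:
    """Model a 32-bit load from four bytes of memory."""
    return int.from_bytes(memory, byteorder)


# Numeric constant: the source text is the same for both targets;
# only the byte layout in memory differs.
VALUE = 0xAABBCCDD
little = store_word(VALUE, "little")  # bytes DD CC BB AA
big = store_word(VALUE, "big")        # bytes AA BB CC DD

# A load on the same machine undoes that machine's layout, so the value
# that ends up in the register is identical on both targets.
assert load_word(little, "little") == load_word(big, "big") == VALUE

# Packed bytes: to get "123\0" into memory in that order with ONE word
# store, the constant itself has to differ per target.
text = b"123\x00"
const_for_big = int.from_bytes(text, "big")        # 0x31323300
const_for_little = int.from_bytes(text, "little")  # 0x00333231

# Byte-at-a-time stores: four stores, but identical code on either target.
memory = bytearray(4)
for offset, byte in enumerate(text):
    memory[offset] = byte
```

The same reasoning suggests why reads are usually not a problem for word-sized values: a word load reverses whatever layout the matching word store produced. Trouble only starts when code mixes access sizes, for example storing a word and then reading its individual bytes.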