I have the following code to decode the bytes 0x66 0x5b 0xc3 (pop ebx / ret) with distorm64 (the code was taken from this example):
// Holds the result of the decoding.
_DecodeResult res;
// Decoded instruction information.
_DecodedInst decodedInstructions[MAX_INSTRUCTIONS];
// next is used for instruction's offset synchronization.
// decodedInstructionsCount holds the count of filled instructions' array by the decoder.
unsigned int decodedInstructionsCount = 0, i, next;
// Default decoding mode is 32 bits, could be set by command line.
_DecodeType dt;
if (!x64)
    dt = Decode32Bits;
else
    dt = Decode64Bits;
// Default offset for buffer is 0, could be set in command line.
_OffsetType offset = 0;
char* errch = NULL;
char tempBuf[500];
// Decode the buffer at given offset (virtual address).
while (1)
{
    // If you get an unresolved external symbol linker error for the following line,
    // change the SUPPORT_64BIT_OFFSET in distorm.h.
    res = distorm_decode(offset, (const unsigned char*)byteCodeBuffer, byteCodeBufferSize, dt, decodedInstructions, MAX_INSTRUCTIONS, &decodedInstructionsCount);
    if (res == DECRES_INPUTERR)
    {
        // Null buffer? Decode type not 16/32/64?
        printf("Input error, halting!");
        return EXIT_FAILURE;
    }
    for (i = 0; i < decodedInstructionsCount; i++)
    {
#ifdef SUPPORT_64BIT_OFFSET
        sprintf_s(tempBuf, 500, "%0*I64x (%02d) %-24s %s%s%s\n", dt != Decode64Bits ? 8 : 16, decodedInstructions[i].offset, decodedInstructions[i].size, (char*)decodedInstructions[i].instructionHex.p, (char*)decodedInstructions[i].mnemonic.p, decodedInstructions[i].operands.length != 0 ? " " : "", (char*)decodedInstructions[i].operands.p);
        outputText.append(tempBuf);
#else
        printf("%08x (%02d) %-24s %s%s%s\n", decodedInstructions[i].offset, decodedInstructions[i].size, (char*)decodedInstructions[i].instructionHex.p, (char*)decodedInstructions[i].mnemonic.p, decodedInstructions[i].operands.length != 0 ? " " : "", (char*)decodedInstructions[i].operands.p);
#endif
    }
    if (res == DECRES_SUCCESS) break; // All instructions were decoded.
    else if (decodedInstructionsCount == 0) break;
    // Synchronize:
    next = (unsigned long)(decodedInstructions[decodedInstructionsCount-1].offset - offset);
    next += decodedInstructions[decodedInstructionsCount-1].size;
    // Advance ptr and recalc offset.
    byteCodeBuffer += next;
    byteCodeBufferSize -= next;
    offset += next;
}
return EXIT_SUCCESS;
The result is:
00000000 (02) 665b POP BX
00000002 (01) c3 RET
which is wrong since the register isn’t BX but EBX.
If I assemble the “pop bx / ret” sequence with nasm, I get 0x5b 0xc3, and distorm translates it into
00000000 (01) 5b POP EBX
00000001 (01) c3 RET
which is equally wrong (not EBX, but BX should be returned!)
Where am I going wrong? Is it a distorm64 bug or what?
66 5b is POP BX when the processor is in 32-bit mode. (In 64-bit mode it is still POP BX; the form that cannot be encoded there is the 32-bit POP EBX, since 64-bit mode only pushes and pops 16-bit or 64-bit registers.) If you are disassembling 16-bit code with a 32-bit disassembler, then you can expect wrong results.
Note that the 66 prefix “toggles” the 32/16-bit flag for one instruction, so if you have 32-bit code, 66 turns the next instruction to a 16-bit one, and if you have 16-bit code, it turns it into a 32-bit instruction.
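So I can only assume there is some confusion as to what mode your code is in, and that the disassembler is interpreting something that is 16-bit code as 32-bit code, or something like that. One likely cause: nasm assembles for 16-bit mode by default when producing flat binary output, so unless the source says BITS 32, “pop ebx” comes out as 66 5b and “pop bx” as a bare 5b, which is exactly what you are seeing.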
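To make the prefix rule concrete, here is a minimal sketch in C of a toy decoder that handles nothing but an optional 66 prefix followed by a POP of a general register (opcodes 58 to 5F). The function name decode_pop and the two register tables are hypothetical names made up for this illustration; this is not distorm's API, and it only covers 16-bit and 32-bit default modes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Register names indexed by the low three bits of the POP opcode. */
static const char *const REG16[8] = {"AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"};
static const char *const REG32[8] = {"EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"};

/*
 * Hypothetical toy decoder (illustration only, not distorm).
 * default_bits is the mode the code was assembled for: 16 or 32.
 * Writes e.g. "POP BX" into out and returns the number of bytes consumed,
 * or 0 if the bytes are not an (optionally 66-prefixed) POP reg.
 */
static size_t decode_pop(const unsigned char *code, size_t len,
                         int default_bits, char *out, size_t out_size)
{
    size_t i = 0;
    int bits = default_bits;

    assert(default_bits == 16 || default_bits == 32);

    /* The 66 prefix flips the operand size for this one instruction. */
    if (i < len && code[i] == 0x66) {
        bits = (default_bits == 16) ? 32 : 16;
        i++;
    }

    /* POP r16/r32 is encoded as 58+r. */
    if (i >= len || code[i] < 0x58 || code[i] > 0x5F)
        return 0;

    const char *reg = (bits == 16 ? REG16 : REG32)[code[i] & 7];
    snprintf(out, out_size, "POP %s", reg);
    return i + 1;
}
```

Called on 66 5b, this yields POP BX with default_bits set to 32 and POP EBX with default_bits set to 16; on a bare 5b it yields POP EBX and POP BX respectively. Both outputs in the question are therefore what a 32-bit decoder should print if the bytes were assembled as 16-bit code.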