I’ve been working on a custom PE binfmt handler for Ubuntu Linux 12.04, Intel x86_64 architecture (if this sounds familiar, I’ve posted a few questions related to this project already). I’ll apologize in advance if the amount of information I’m giving is overkill.
The binfmt handler is pretty standard; I read in the PE headers and sections and then write those sections into userspace memory at the addresses specified in the section table. Then, when everything is ready, I call
start_thread(regs, entry_addr, current->mm->start_stack);
exactly like the built-in Linux handlers do; in my case, regs = 0xcf7dffb4, entry_addr = 0x401000, and start_stack = 0xbffff59b.
I have the following code, in Intel x86 assembly:
push ebp
mov ebp, esp
mov eax, 4
add eax, 5
pop ebp
ret
I assemble this program with fasm into a PE-format executable (math1.exe) and install my binfmt handler with insmod. If I debug this program in gdb, I see:
(gdb) set disassembly-flavor intel
(gdb) x/6i 0x401000
0x401000: push ebp
0x401001: mov ebp,esp
0x401003: mov eax,0x4
0x401008: add eax,0x5
0x40100b: pop ebp
0x40100c: ret
so I know the code is loaded to the correct address. Then:
(gdb) run
Starting program: /media/sf_Sandbox/math1.exe
Program received signal SIGSEGV, Segmentation fault.
0x0040100c in ?? ()
When I do a register dump:
(gdb) info registers
eax 0x9 9
ecx 0x81394e8 135501032
edx 0x8137808 135493640
ebx 0x8139548 135501128
esp 0xbfffe59b 0xbfffe59b
ebp 0x0 0x0
esi 0x81394e8 135501032
edi 0x2f7ff4 3112948
eip 0x40100c 0x40100c
...other registers...
You can see that the code did execute because eax = 0x9, as it should. On the surface, though, I can’t find any reason for this to segfault at the ret instruction. Examining dmesg, I found
math1.exe[1864] general protection ip:40100c sp:bffff5bd error:0
but I’ve found very little documentation on what might be causing this. I know the problem isn’t the code itself, because the same code compiled with the same assembler to ELF format runs with no trouble whatsoever.
My current theories about this problem are:
- I don’t really mess with the stack pointer in my handler. The built-in Linux handlers (for ELF, a.out, and flat formats, to name three) have a function create_*_tables() that processes the argc, argv, and envp arguments. I didn’t include this function at first because the test program doesn’t take any input, but implementing the create_flat_tables() function (from the flat handler) doesn’t solve the problem so far. (I know that blindly pasting and calling functions from other modules is a bad idea, but the a.out and flat versions of that function are essentially identical, so it doesn’t seem to be very dependent on the executable format; I thought I’d give it a try.)
- I found this article about the chain of function calls that occurs prior to and following the execution of main(). The objdump of math1.exe contains only the assembly code given above, but the objdump of the same program after assembly to ELF format (which yields a *.o file) and linking with gcc (to get an ELF binary) contains the other functions mentioned in the article (_start(), __libc_start_main(), etc.). Perhaps those functions are more mandatory on the Linux platform than I previously thought.
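For reference, the kind of table a create_*_tables() function leaves at the initial stack pointer can be modeled in user space roughly as below. This is only a sketch under assumptions: build_entry_table() is a hypothetical name, the layout shown is the ELF-style one (argc, then the argv pointers, then the envp pointers, each list NULL-terminated), and the a.out and flat handlers may differ in detail. The point it illustrates is that, in this layout, the word at the stack pointer on entry is argc and not a return address.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical user-space model (not kernel code): fill `table` with the
 * words an ELF-style loader leaves at the initial stack pointer:
 *   table[0]        = argc
 *   next argc words = argv pointers, followed by a NULL word
 *   next words      = envp pointers, followed by a NULL word
 * Returns the number of words written, or 0 if `cap` is too small.
 */
size_t build_entry_table(uintptr_t *table, size_t cap,
                         int argc, char *const argv[], char *const envp[])
{
    /* Count the environment strings; envp is NULL-terminated. */
    size_t envc = 0;
    while (envp[envc] != NULL)
        envc++;

    /* argc word + argv pointers + NULL + envp pointers + NULL */
    size_t need = 1 + (size_t)argc + 1 + envc + 1;
    if (need > cap)
        return 0;

    size_t i = 0;
    table[i++] = (uintptr_t)argc;         /* what a bare `ret` would pop */
    for (int a = 0; a < argc; a++)
        table[i++] = (uintptr_t)argv[a];
    table[i++] = 0;                       /* argv terminator */
    for (size_t e = 0; e < envc; e++)
        table[i++] = (uintptr_t)envp[e];
    table[i++] = 0;                       /* envp terminator */
    return i;
}
```

With one argument and two environment strings, the model writes six words, and the first of them is the integer 1 rather than anything that looks like a code address.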
I’m looking for any explanations/suggestions/further troubleshooting steps I could take. Thanks in advance!
You do need to implement at least one aspect of the _start and __libc_start_main sequence: calling the _exit syscall. You can’t just execute a “ret” from the frame that was created by execve and expect that to cause the process to terminate cleanly. Returning from main to exit the process is a C feature, and your program isn’t C.
My memory is a little fuzzy on the syscall interface but I believe it goes like this:
- syscall number in %eax (for _exit, the macro __NR_exit)
- first arg in %ebx
- additional args to %ecx, %edx, … I’m not sure of the order. But _exit() only takes one arg and I’m pretty sure it goes in %ebx.
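To see the exit syscall doing the job that a bare “ret” cannot, here is a minimal sketch in C rather than assembly, so it can be tried on any Linux machine. The helper name exit_status_via_raw_syscall() is made up for illustration. It forks a child that terminates through the raw system call (via the libc syscall() wrapper, not exit() or a return from main) and reports the status the parent sees. The comment shows what I believe is the equivalent 32-bit x86 sequence; treat that as an assumption to check against your kernel headers.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>  /* SYS_exit */
#include <sys/types.h>    /* pid_t */
#include <sys/wait.h>     /* waitpid, WIFEXITED, WEXITSTATUS */
#include <unistd.h>       /* fork, syscall */

/*
 * Hypothetical helper (not from the original post): fork a child that
 * terminates through the raw exit system call, and return the exit
 * status the parent observes, or -1 on error.
 *
 * On 32-bit x86 the child's call corresponds roughly to:
 *     mov eax, 1        ; __NR_exit on i386
 *     mov ebx, status   ; first (and only) argument
 *     int 0x80
 */
int exit_status_via_raw_syscall(int status)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: leave via the system call itself; this does not return. */
        syscall(SYS_exit, status);
        for (;;) { }  /* not reached */
    }

    /* Parent: collect the child's termination status. */
    int wstatus = 0;
    if (waitpid(pid, &wstatus, 0) < 0)
        return -1;
    return WIFEXITED(wstatus) ? WEXITSTATUS(wstatus) : -1;
}
```

Calling exit_status_via_raw_syscall(9) from main should return 9, which mirrors what math1.exe would report if it ended with the exit syscall (status in %ebx) instead of ret.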