I was reading a Codebreakers Journal article on self-modifying code and there was this code snippet:
void Demo(int (*_printf) (const char *,...))
{
_printf("Hello, OSIX!\n");
return;
}
int main(int argc, char* argv[])
{
char buff[1000];
int (*_printf) (const char *,...);
int (*_main) (int, char **);
void (*_Demo) (int (*) (const char *,...));
_printf=printf;
int func_len = (unsigned int) _main - (unsigned int) _Demo;
for (int a=0; a<func_len; a++)
buff[a] = ((char *) _Demo)[a];
_Demo = (void (*) (int (*) (const char *,...))) &buff[0];
_Demo(_printf);
return 0;
}
This code supposedly executed Demo() on the stack. I understand most of the code, but the part where they assign ‘func_len’ confuses me. As far as I can tell, they’re subtracting one random pointer address from another random pointer address.
Someone care to explain?
The code relies on knowing how the compiler lays out functions in memory, which may not hold with other compilers.
The func_len line, once corrected to include the '-' that was originally missing, determines the length of the function Demo by subtracting the address in _Demo (which is supposed to contain the start address of Demo()) from the address in _main (which is supposed to contain the start address of main()). This is presumed to be the length of the function Demo, which is then copied byte-wise into the buffer buff. The address of buff is then coerced into a function pointer and the function is then called.

However, since neither _Demo nor _main is actually initialized, the code is buggy in the extreme. Also, it is not clear that an unsigned int is big enough to hold pointers accurately; the cast should probably be to a uintptr_t from <stdint.h> or <inttypes.h>.

This works if the bugs are fixed, if the assumptions about the code layout are correct, if the code is position-independent code, and if there are no protections against executing data space. It is unreliable, non-portable and not recommended. But it does illustrate, if it works, that code and data are very similar.
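For what it's worth, here is a minimal sketch of how the length calculation and the byte-wise copy might be written a bit more defensively, using uintptr_t instead of unsigned int. The names demo_add, demo_end, span_between, copy_span and copy_demo are made up for illustration and are not from the article. It still assumes the compiler places the two functions back to back in source order, which nothing guarantees, so it checks the result and gives up rather than trusting it. It deliberately stops short of executing the copied bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ASSUMPTION: the compiler emits demo_add and demo_end back to back,
   in source order. Nothing in the language guarantees this. */
int demo_add(int a, int b) { return a + b; }
void demo_end(void) { }

/* Hypothetical helper: byte distance from start to end,
   or 0 when end does not lie after start. */
size_t span_between(uintptr_t start, uintptr_t end)
{
    return end > start ? (size_t)(end - start) : 0;
}

/* Hypothetical helper: copy the bytes in [start, end) into buf.
   Returns the number of bytes copied, or 0 if the span is empty
   or larger than cap. */
size_t copy_span(unsigned char *buf, size_t cap, uintptr_t start, uintptr_t end)
{
    size_t len = span_between(start, end);
    assert(buf != NULL);
    if (len == 0 || len > cap)
        return 0;
    memcpy(buf, (const void *)start, len);
    return len;
}

/* Apply the (unreliable) layout assumption to the two functions above.
   Converting a function pointer to uintptr_t is implementation-defined. */
size_t copy_demo(unsigned char *buf, size_t cap)
{
    return copy_span(buf, cap, (uintptr_t)demo_add, (uintptr_t)demo_end);
}
```

Calling copy_demo(buf, sizeof buf) returns the number of bytes copied, or 0 if the layout assumption did not hold or the span did not fit. Actually running the copy would additionally need memory that the system allows to be executed, plus position-independent code, which is exactly where the original snippet tends to fall over.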
I remember pulling a similar stunt between two processes, copying a function from one program into shared memory, and then having the other program execute that function from shared memory. It was about a quarter of a century ago, but the technique was similar and ‘worked’ for the machine it was tried on. I’ve never needed to use the technique since, thank goodness!