I have written the following code. Can you explain what the assembly tells me here?
typedef struct
{
int abcd[5];
} hh;
void main()
{
printf("%d", ((hh*)0)+1);
}
Assembly:
.file "aa.c"
.section ".rodata"
.align 8
.LLC0:
.asciz "%d\n"
.section ".text"
.align 4
.global main
.type main, #function
.proc 020
main:
save %sp, -112, %sp
sethi %hi(.LLC0), %g1
or %g1, %lo(.LLC0), %o0
mov 20, %o1
call printf, 0
nop
return %i7+8
nop
.size main, .-main
.ident "GCC: (GNU) 4.2.1"
Oh wow, SPARC assembly language, I haven’t seen that in years.
I guess we go line by line? I’m going to skip some of the uninteresting boilerplate.
The first few lines (.section ".rodata" through .asciz) hold the string constant you used in printf (so obvious, I know!). The important things to notice are that it’s in the .rodata section (sections are divisions of the eventual executable image; this one is for “read-only data” and will in fact be immutable at runtime) and that it’s been given the label .LLC0. Labels that begin with a dot are private to the object file. Later, the compiler will refer to that label when it wants to load the address of the string constant.

.text is the section for actual machine code.

The lines from .align 4 through main: are the boilerplate header for defining the global function named main, which at the assembly level is no different from any other function (in C, that is; not necessarily so in C++). I don’t remember what .proc 020 does.

The save instruction saves the previous register window and adjusts the stack pointer downward (here by 112 bytes). If you don’t know what a register window is, you need to read the architecture manual: http://sparc.org/wp-content/uploads/2014/01/v8.pdf.gz (V8 is the last 32-bit iteration of SPARC, V9 is the first 64-bit one. This appears to be 32-bit code.)
The sethi/or pair has the net effect of loading the address of .LLC0 (that’s your string constant) into register %o0, which is the first outgoing argument register. (The arguments to this function are in the incoming argument registers.)

The mov instruction loads the immediate constant 20 into %o1, the second outgoing argument register. This is the value computed by ((hh *)0)+1. It’s 20 because your hh struct is 20 bytes long (five 4-byte ints) and you asked for the second one within the array starting at address zero.

Incidentally, computing an offset from a pointer is only well-defined in C when there is actually a sufficiently large array at the address of the base pointer; ((hh *)0) is a null pointer, so there isn’t an array there, so the expression ((hh *)0)+1 technically has undefined behavior. GCC 4.2.1, targeting hosted SPARC, happens to have interpreted it as “pretend there is an arbitrarily large array of hhs at address zero and compute the expected offset for array member 1”, but other (especially newer) compilers may do something completely different.

The call instruction calls printf. I don’t remember what the zero is for. The call instruction has a delay slot (again, read the architecture manual) which is filled in with a do-nothing instruction, nop.

The return instruction jumps to the address in register
%i7 plus eight. This has the effect of returning from the current function. return also has a delay slot, which is filled in with another nop. In V8 code you would see ret here, with a restore instruction in its delay slot, matching the save at the top of the function, so that main’s caller gets its register window back. There is no separate restore in this listing because return is a V9 instruction that combines the jump with the window semantics of restore, so the register window is still popped. (That also means this is 32-bit code using V9 instructions, not pure V8.) Discussion in the comments talks about main possibly not needing to pop the register window, and/or your having declared main as void main() (which is not guaranteed to work with any C implementation, unless its documentation specifically says so, and is always bad style), but neither explanation is needed: the window push and pop are balanced.