I have some code that gives me relocation errors when compiling. Below is an example that illustrates the problem:
      program main
        common/baz/a,b,c
        real a,b,c
        b = 0.0
        call foo()
        print*, b
      end

      subroutine foo()
        common/baz/a,b,c
        real a,b,c

        integer, parameter :: nx = 450
        integer, parameter :: ny = 144
        integer, parameter :: nz = 144
        integer, parameter :: nf = 23*3
        real :: bar(nf,nx*ny*nz)

        !real, allocatable,dimension(:,:) :: bar
        !allocate(bar(nf,nx*ny*nz))

        bar = 1.0
        b = bar(12,32*138*42)

        return
      end
Compiling this with gfortran -O3 -g -o test test.f, I get the following error:
relocation truncated to fit: R_X86_64_PC32 against symbol `baz_' defined in COMMON section in /tmp/ccIkj6tt.o
But it works if I use gfortran -O3 -mcmodel=medium -g -o test test.f. Also note that it works if I make the array allocatable and allocate it within the subroutine.
My question is: what exactly does -mcmodel=medium do? I was under the impression that the two versions of the code (the one with allocatable arrays and the one without) were more or less equivalent …
Since bar is quite large, the compiler generates a static allocation instead of an automatic allocation on the stack. Static arrays are created with the .comm assembly directive, which creates an allocation in the so-called COMMON section. Symbols from that section are gathered, same-named symbols are merged (reduced to one symbol request with size equal to the largest size requested), and what is left is mapped to the BSS (uninitialised data) section in most executable formats. With ELF executables the .bss section is located in the data segment, just before the data segment part of the heap (there is another heap part, managed by anonymous memory mappings, which does not reside in the data segment).

With the
small memory model, 32-bit addressing instructions are used to address symbols on x86_64. This makes the code smaller and also faster. Some assembly output when using the small memory model:

This uses a 32-bit move instruction (5 bytes long) to put the value of the
bar.1535 symbol (this value equals the address of the symbol location) into the lower 32 bits of the RBX register (the upper 32 bits get zeroed). The bar.1535 symbol itself is allocated using the .comm directive. Memory for the baz COMMON block is allocated afterwards. Because bar.1535 is very large, baz_ ends up more than 2 GiB from the start of the .bss section. This poses a problem in the second movl instruction, since an offset from RIP that does not fit in a signed 32-bit value would be needed to address the b variable, into which the value of EAX has to be moved. This is only detected at link time. The assembler itself does not know the appropriate offset since it doesn't know what the value of the instruction pointer (RIP) would be (it depends on the absolute virtual address where the code is loaded, and this is determined by the linker), so it simply puts an offset of 0 and then creates a relocation request of type R_X86_64_PC32. It instructs the linker to patch the value of 0 with the real offset value. But the linker cannot do that, since the offset value would not fit inside a signed 32-bit integer, and hence it bails out.

With the
medium memory model in place, things look like this:

First, a 64-bit immediate move instruction (10 bytes long) is used to put the 64-bit value which represents the address of
bar.1535 into register R10. Memory for the bar.1535 symbol is allocated using the .largecomm directive and thus it ends up in the .lbss section of the ELF executable. .lbss is used to store symbols which might not fit in the first 2 GiB (and hence should not be addressed using 32-bit instructions or RIP-relative addressing), while smaller things go to .bss (baz_ is still allocated using .comm and not .largecomm). Since the .lbss section is placed after the .bss section in the ELF linker script, baz_ does not end up out of reach of 32-bit RIP-relative addressing.

All addressing modes are described in the System V ABI: AMD64 Architecture Processor Supplement. It is heavy technical reading, but a must-read for anybody who really wants to understand how 64-bit code works on most x86_64 Unixes.
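
As a quick sanity check of the sizes involved, the following sketch works out how many bytes bar occupies and compares that with the largest value a signed 32-bit offset can hold. It assumes the default REAL kind is 4 bytes (the gfortran default unless a flag such as -fdefault-real-8 is used). The program name barsize and the names nbytes and lim are made up for this illustration.

```fortran
      program barsize
      use iso_fortran_env, only: int64
      implicit none
      ! Same dimensions as in the question, held as 64-bit integers
      ! so that the product does not overflow.
      integer(int64), parameter :: nx = 450, ny = 144, nz = 144
      integer(int64), parameter :: nf = 23*3
      ! Assumption: default REAL takes 4 bytes per element.
      integer(int64), parameter :: nbytes = 4_int64*nf*nx*ny*nz
      ! Largest value that fits in a signed 32-bit field.
      integer(int64), parameter :: lim = 2_int64**31 - 1

      print *, 'bar size in bytes:   ', nbytes
      print *, 'signed 32-bit limit: ', lim
      print *, 'fits in 32 bits:     ', nbytes <= lim
      end program barsize
```

Under that assumption the array needs 2,575,411,200 bytes, which is more than 2,147,483,647, so a symbol placed after it in .bss, as baz_ is here, lies beyond the range of a signed 32-bit offset.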
When an ALLOCATABLE array is used instead, gfortran allocates heap memory (most likely implemented as an anonymous memory map given the large size of the allocation):

This is basically
RDI = malloc(2575411200). From then on, elements of bar are accessed by using positive offsets from the value stored in RDI:

For locations that are more than 2 GiB from the start of
bar, a more elaborate method is used. E.g. to implement b = bar(12,144*144*450) gfortran emits:

This code is not affected by the memory model since nothing is assumed about the address where the dynamic allocation would be made. Also, since the array is not passed around, no descriptor is being built. If you add another function that takes an assumed-shape array and pass
bar to it, a descriptor for bar is created as an automatic variable (i.e. on the stack of foo). If the array is made static with the SAVE attribute, the descriptor is placed in the .bss section:

The first move prepares the argument of a function call (in my sample case
call boo(bar), where boo has an interface that declares it as taking an assumed-shape array). It moves the address of the array descriptor of bar into EDI. This is a 32-bit immediate move, so the descriptor is expected to be in the first 2 GiB. Indeed, it is allocated in .bss in both the small and medium memory models like this: