The LLVM language specifies integer types as iN, where N is the bit-width of the integer and ranges from 1 to 2^23-1 (according to http://llvm.org/docs/LangRef.html#integer-type).
I have 2 questions:
- When compiling a C program down to LLVM IR level, what types may be lowered to i1, i2, i3, etc.? It seems like the types i8, i16, i32, i64 must be enough, so I was wondering what all the other nearly 8 million integer types are for.
- Is it true that both signed and unsigned integer types are lowered to i32? What is the reason for that, and why does it not apply to something like 32-bit float (which is represented as f32 in LLVM)?
First of all, be aware that both arbitrary-width integers and the removal of the signed/unsigned distinction were changes introduced in LLVM 2.0. Earlier versions had only a few integer types, each with a signed and an unsigned variant.
Now, to your questions:
LLVM, though designed with C/C++ in mind, is not specific to these languages. Having more possible integer types gives you more flexibility. You don’t have to use these types, of course – and I’m guessing that, as you’ve mentioned, any C/C++ frontend to LLVM (i.e. Clang) would probably only generate i1, i8, i16, i32 and i64.
Edit: apparently I’m mistaken and Clang does use some other integer types as well, see Jens’s comment below.
Yes, LLVM does not make a distinction between signed and unsigned integer types, so both will be lowered to i32. The operations on the unsigned integer, though, will be translated according to the original type; e.g. a division between unsigned integers will be udiv, while a division between signed integers will be sdiv. Because integers are represented as two’s complement, though, many operations (e.g. add) don’t care about signed/unsigned and so only have a single version.
As for why no distinction was made in LLVM between signed and unsigned, read the details on this enhancement request. In short, having both signed and unsigned versions led to a large IR bloat and was detrimental to some optimizations, so it was dropped.
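To illustrate the udiv/sdiv point in C terms, here is a minimal sketch. The helper names (as_signed, div_as_unsigned, div_as_signed, add_bits) are made up for this example, and which IR instruction a given frontend actually emits is an assumption here, not something the sketch can show. What it does show is that the same 32 bits divide differently depending on how you read them, while addition yields the same bits either way.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret 32 bits as a signed value.
   memcpy avoids relying on implementation-defined conversions. */
int32_t as_signed(uint32_t bits) {
    int32_t value;
    memcpy(&value, &bits, sizeof value);
    return value;
}

/* Unsigned division: a frontend would be expected to use udiv here. */
uint32_t div_as_unsigned(uint32_t bits, uint32_t divisor) {
    return bits / divisor;
}

/* Signed division on the same bits: expected to become sdiv.
   The caller must avoid divisor == 0 and INT32_MIN / -1. */
int32_t div_as_signed(uint32_t bits, int32_t divisor) {
    return as_signed(bits) / divisor;
}

/* Addition on the raw bits: in two's complement the resulting bit
   pattern is the same whether the operands are read as signed or
   unsigned, so one add instruction is enough. */
uint32_t add_bits(uint32_t a, uint32_t b) {
    return a + b;
}
```

For example, the bit pattern 0xFFFFFFF8 is 4294967288 when read as unsigned and -8 when read as signed, so dividing it by 2 gives 0x7FFFFFFC in one case and -4 in the other. Adding 1 to 0xFFFFFFFE gives 0xFFFFFFFF in both readings (4294967295, or -1).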
Finally, you ask about why no f32: the answer is that I don’t know, maybe it was deemed to be less useful than arbitrarily-sized integers. However, notice that f32 is not really descriptive. If you want arbitrary floating-point types you need to at least specify the size of the base number and the size of the exponent, something like f23e8 instead of float and f52e11 instead of double. That’s a bit cumbersome if you ask me, though I guess float and double could have been made synonymous with those.
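To make the "23 plus 8" split concrete, here is a small C sketch that pulls a float apart into its fields. It assumes float is an IEEE 754 binary32 value (true on mainstream platforms, but not guaranteed by the C standard), and the function names are invented for this example.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float is IEEE 754 binary32 (1 sign bit, 8 exponent
   bits, 23 fraction bits). Copy its bytes into a 32-bit integer. */
uint32_t float_bits(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Top bit: the sign. */
uint32_t float_sign(float x) {
    return float_bits(x) >> 31;
}

/* Next 8 bits: the biased exponent (bias 127). */
uint32_t float_exponent_field(float x) {
    return (float_bits(x) >> 23) & 0xFFu;
}

/* Low 23 bits: the fraction, i.e. the "base number" part. */
uint32_t float_fraction_field(float x) {
    return float_bits(x) & 0x7FFFFFu;
}
```

Under that assumption, 1.0f has exponent field 127 and fraction 0, and 1.5f has the same exponent with only the top fraction bit set (0x400000). A bare width such as 32 does not say where the boundary between those two fields lies, which is the point made above.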