As far as I’ve understood, when a program (written in C for example) is compiled, it is first translated into assembly language and then into machine language. Why can’t (isn’t) the “assembly language step” be skipped?
Share
Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.
Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.
Lost your password? Please enter your email address. You will receive a link and will create a new password via email.
Please briefly explain why you feel this question should be reported.
Please briefly explain why you feel this answer should be reported.
Please briefly explain why you feel this user should be reported.
It is easier for the compiler developers.
It is possible to write a compiler that reads C and writes object code. However, this requires the compiler writer to write all the computations that encode instructions. Instruction encodings are intricate on some machines. Additionally, there are fields to fill in that depend on other interactions, such as how far away a branch target is, which depends on what instructions are between the branch and the target.
Additionally, part of the way a compiler is written is with patterns that say things like “To increment an object x, issue an increment instruction.” In order to write object code directly, you have to encode all the instructions you want to write into those patterns. That means your patterns must have some sort of language for describing instructions.
Well, we already have a language for that: assembly language. So it is simply easier to write your patterns in ways like “To increment an object x, issue
inc x.”Modern compilers have many layers. There is a front end that reads C text (or other languages) and turns it into a language internal to the compiler. There is an optimizer that operates on the internal language (or a representation of it) and tries to improve the code. There is a back end that turns the internal language into assembly language. There is an assembler that turns the assembly into object code. And there is a linker that links object code into an executable file.
As with many complex tasks, it is simply easier for human minds to work with a complex task when it is separated into nice pieces. This reduces bugs and improves the time it takes to work with software. It also makes software flexible, because we can change the front end to support a new language (e.g., Java instead of C) or change the back end to support a new processor (change from Intel assembly to PowerPC assembly). And changing one optimizer improves all the compilers, for Java and C and Intel and PowerPC.
The gcc command that we use to compile is actually just a driver that calls other programs that perform the front-end processing, the optimization, the assembly, and the linking. You can also call most of these phases separately, or use a switch to tell gcc to show you the commands it is using.
Additionally, GCC has a feature that allows developers to insert assembly language directly intermixed with the C code. This compels GCC to include an assembler.