I learned processor architecture 3 years ago.
To this day, I haven't been able to figure out why the execute stage is located before the memory stage when an instruction is processed sequentially.
When executing the instruction [mov (%eax), %ebx], doesn't it need to access memory?
Thanks!
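Let’s recall the classic RISC pipeline, which is the one usually studied: http://en.wikipedia.org/wiki/Classic_RISC_pipeline. Here are its stages:
IF: instruction fetch
ID: instruction decode
EX: execute
MEM: memory access
WB: register write back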
In RISC, only loads and stores work with memory. For a memory access instruction, the EX stage computes the address in memory (it takes the address from the register file, then scales it or adds an offset). The address is then passed to the MEM stage.
Your example, mov (%eax), %ebx, is actually a load from memory without any additional computation, and it can be represented even in the RISC pipeline:
IF: get the instruction from instruction memory.
ID: decode the instruction, pass the “eax” register to the ALU as an operand, and remember “ebx” as the output for WB (in the control unit).
EX: compute “eax+0” in the ALU and pass the result to the next stage, MEM (as an address in memory).
MEM: take the address from the EX stage (from the ALU), go to memory and take the value (this stage can take several ticks to reach memory, blocking the pipeline). Pass the value to WB.
WB: take the value from MEM and pass it back to the register file. The control unit should set the register file into the mode “writing” + “EBX selected”.
The situation is more complex for a true CISC instruction, e.g. add (%eax), %ebx (load word T from memory at [%eax], then store T+%ebx to %ebx). This instruction needs both address computation and an addition in the ALU. This can’t be easily represented in the simplest RISC (MIPS) pipelines.
The first x86 CPU (8086) was not pipelined; it executed only a single instruction at any moment. But since the 80386 there has been a pipeline with 6 stages, which is more complex than in RISC. There is a presentation about its pipeline, comparing it with MIPS: http://www.academic.marist.edu/~jzbv/architecture/Projects/projects2004/INTEL%20X86%20PIPELINING.ppt
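The five-stage walk-through of mov (%eax), %ebx given earlier in this answer can be sketched as a small Python model. This is only a toy illustration of why EX (address computation) has to come before MEM; the names run_load, regs, and memory are hypothetical, and nothing here models timing or a real CPU.

```python
def run_load(regs, memory, base, dest, offset=0):
    """Trace a toy load, dest = memory[regs[base] + offset], through IF/ID/EX/MEM/WB."""
    trace = []

    # IF: fetch the instruction (represented here as a plain tuple).
    instr = ("load", base, dest, offset)
    trace.append("IF")

    # ID: decode, read the base register, remember the destination for WB.
    _, src_reg, dst_reg, imm = instr
    operand = regs[src_reg]
    trace.append("ID")

    # EX: the ALU computes the effective address (for the mov example, eax + 0).
    address = operand + imm
    trace.append("EX")

    # MEM: use the address produced by EX to read memory.
    value = memory[address]
    trace.append("MEM")

    # WB: write the loaded value back to the register file.
    regs[dst_reg] = value
    trace.append("WB")
    return trace


# Hypothetical register and memory contents, chosen only for the demonstration.
regs = {"eax": 0x1000, "ebx": 0}
memory = {0x1000: 42}
trace = run_load(regs, memory, base="eax", dest="ebx")
print(trace, regs["ebx"])
```

Note that MEM cannot run until EX has produced the address, which is the reason for the stage order. An instruction like add (%eax), %ebx would need a second ALU pass after MEM, and this fixed stage order has no place for it.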
Slide 17 says:
“[...] mem and EX stages to avoid loads and stalls, but does create stalls for address computation”
In my example, add will be executed in that combined “MEM+EX” stage for several CPU ticks, generating many stalls.
Modern x86 CPUs have a very long pipeline (16 stages is typical), and internally they are RISC-like CPUs. The decoder stages (3 stages or more) break most complex x86 instructions into a series of internal RISC-like micro-operations (sometimes up to 450 micro-operations per instruction are generated with the help of microcode; 2-3 micro-operations is more typical). For complex ALU/MEM operations, there will be a micro-op for address computation, then a micro-op for the memory load, and then a micro-op for the ALU action. The micro-operations have dependencies between them and are scheduled onto different execution ports.
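As a rough sketch of that decomposition, the Python model below splits a toy version of add (%eax), %ebx into three dependent micro-ops (address computation, load, ALU add) and runs them in order. The micro-op names (agen, load, add), the temporaries t_addr and t_val, and the functions decode and execute are all made up for illustration; real decoders, micro-op formats, and schedulers are far more involved, and 32-bit wraparound is not modelled.

```python
def decode(instr):
    """Split a toy 'add (mem), reg' instruction into RISC-like micro-ops."""
    op, base, dest = instr
    if op != "add_mem":
        raise ValueError("only the toy 'add_mem' form is modelled")
    return [
        ("agen", "t_addr", base),      # t_addr = base + 0 (address computation)
        ("load", "t_val", "t_addr"),   # t_val = memory[t_addr]
        ("add", dest, dest, "t_val"),  # dest = dest + t_val (ALU action)
    ]


def execute(uops, regs, memory):
    """Run micro-ops in program order; each one depends on the previous result."""
    for uop in uops:
        kind = uop[0]
        if kind == "agen":
            regs[uop[1]] = regs[uop[2]]
        elif kind == "load":
            regs[uop[1]] = memory[regs[uop[2]]]
        elif kind == "add":
            regs[uop[1]] = regs[uop[2]] + regs[uop[3]]
    return regs


# Hypothetical register and memory contents, chosen only for the demonstration.
regs = {"eax": 0x1000, "ebx": 5}
memory = {0x1000: 37}
uops = decode(("add_mem", "eax", "ebx"))
execute(uops, regs, memory)
print([u[0] for u in uops], regs["ebx"])
```

Each micro-op on its own fits a simple "compute, then access memory, then write back" shape, which is why breaking the CISC instruction apart lets a RISC-like core execute it.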