I have a question about a very low-level thing. We were analyzing how the microprocessor executes simple assembler programs using a logic analyzer, so I have a .law file. This is the code we used (I put the opcodes in the comments):
mov ax, 1000
mov ds, ax
mov bx, 2000
mov ax, 0aa
mov cx, 100
petla
push cx ;51
mov [bx],al ;8807
mov ax,[bx] ;8B07
inc al ;FEC0
pop cx ;59
loop ;here goes address
We typed it into the DEBUG program, assembled it, and watched the output. Here is the image:
http://img805.imageshack.us/img805/241/mikro.png
Now, here is the strange (at least to me) thing:
Data bus:51 - push cx
Data bus:8807 - mov [bx],al
Data bus:0001 - writing to 1EF6A
Data bus:8B07 - mov ax,[bx]
Data bus:AA, address bus:12000 - that is the write of al to [bx] (ds = 1000, bx = 2000)
All of a sudden it writes the value that is in the CX register to some place in memory (I suspect that 1EF6A is the physical address of SS:SP). Is it because of the push cx? If yes, why does it do that after the mov [bx],al, and why does the write to [bx] occur so late?
I was thinking that pushing a value onto the stack should be done immediately after the push instruction.
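For reference, the 12000 on the address bus can be checked with the usual real-mode rule (segment times 16 plus offset). This is only a small Python sketch; the helper name `physical_address` is made up for illustration. The values of SS and SP were not recorded, so the same check cannot confirm 1EF6A.

```python
def physical_address(segment, offset):
    """Real-mode 8086/8088 address: segment * 16 + offset, wrapped to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


# DS = 1000h, BX = 2000h, as loaded by the program above.
print(f"{physical_address(0x1000, 0x2000):05X}")  # prints 12000
```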
(Sorry, I don’t yet have enough rep to comment, so I’m resorting to writing this as an answer.)
@Andna: this is an 8088, right? That’s why memory access in the analyzer trace is byte-at-a-time. So what you’re seeing is the result of the 8088’s prefetch unit, which blindly reads instruction bytes from memory and holds them in a short (4-byte) prefetch queue in the hope that the execution unit will want to use them later.
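To make that concrete, here is a rough Python sketch of a prefetching bus unit feeding an execution unit. It is a toy model, not a cycle-accurate 8088: the names `simulate`, `PROGRAM` and `EXEC_DELAY`, and the delay of 2 bus slots, are assumptions chosen for illustration, and the loop instruction is left out because its operand is not given in the question. Even so, it shows how data writes end up trailing the instruction fetches.

```python
from collections import deque

QUEUE_SIZE = 4   # the 8088 prefetch queue holds 4 bytes
EXEC_DELAY = 2   # ASSUMED bus slots between starting an instruction and its data access

# (mnemonic, opcode bytes, data-bus accesses performed when the instruction executes)
PROGRAM = [
    ("push cx",     [0x51],       ["stack write 1/2", "stack write 2/2"]),
    ("mov [bx],al", [0x88, 0x07], ["write AL to DS:BX"]),
    ("mov ax,[bx]", [0x8B, 0x07], ["read DS:BX 1/2", "read DS:BX 2/2"]),
    ("inc al",      [0xFE, 0xC0], []),
    ("pop cx",      [0x59],       ["stack read 1/2", "stack read 2/2"]),
]


def simulate(program):
    """Return the order of bus transfers in this toy model."""
    code = [b for _, opcode, _ in program for b in opcode]
    queue, pending, trace = deque(), deque(), []
    fetched = 0        # index of the next code byte to prefetch
    next_instr = 0     # index of the next instruction to start
    current = None     # instruction currently executing
    countdown = 0

    while next_instr < len(program) or current is not None or pending:
        # Bus unit: one transfer per slot; data accesses take priority over prefetch.
        if pending:
            trace.append(pending.popleft())
        elif len(queue) < QUEUE_SIZE and fetched < len(code):
            queue.append(code[fetched])
            trace.append(f"fetch {code[fetched]:02X}")
            fetched += 1

        # Execution unit: finish the current instruction, or start the next one
        # once all of its bytes are in the queue and no data access is outstanding.
        if current is not None:
            countdown -= 1
            if countdown == 0:
                pending.extend(current[2])
                current = None
        elif not pending and next_instr < len(program):
            needed = len(program[next_instr][1])
            if len(queue) >= needed:
                for _ in range(needed):
                    queue.popleft()
                current = program[next_instr]
                countdown = EXEC_DELAY
                next_instr += 1
    return trace


for entry in simulate(PROGRAM):
    print(entry)
```

With these assumed timings the first transfers come out as fetch 51, fetch 88, fetch 07, the two stack writes, fetch 8B, fetch 07, and only then the AL write, which is the same ordering as in the analyzer trace. Changing `EXEC_DELAY` shifts where the data accesses land, so treat the exact positions as illustrative only.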
Data operations that result from the instructions that the execution unit actually executes will appear on the bus some time later. That’s why the CX value written to memory doesn’t show up immediately after the push CX instruction is read, and why the AL write doesn’t appear on the bus until after the MOV AX,[BX] instruction has been read. It’s also why, at the end of the loop (which unfortunately is not shown in this trace snapshot), you’ll see the prefetch unit reading instructions that come after the loop instruction. However, the execution unit will not execute those instructions.
You’re correct to worry about possible bad side effects of the prefetch unit’s readahead, but the danger arises only when you’re dealing with a memory location that is written after the prefetch unit has already collected the previous value from that location, and that can only happen when you’re dealing with a memory location just above the current point of program execution. If you’re ever in that situation then you must do something to invalidate the content of the prefetch queue before you try to read that newly-written location. Executing a JMP will do that.
@Zack: there’s no out-of-order execution here, no multicore or multithreading, not even any caching. Just a tiny amount of blind, speculative prefetch. Yes, the prefetch does make following the trace very slightly trickier than, say, an 8085.
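The stale-prefetch hazard and the JMP fix mentioned above can also be sketched in a few lines of Python. Again this is a toy model under stated assumptions: `ToyCpu` and `byte_seen_at_patched_address` are hypothetical names, the byte values 90h and 40h are arbitrary placeholders, and the real queue refill timing is more involved than "top up before every read".

```python
from collections import deque


class ToyCpu:
    """Toy model: memory plus a prefetch queue that holds copies of code bytes."""

    def __init__(self, memory, queue_size=4):
        self.memory = list(memory)
        self.queue = deque()
        self.queue_size = queue_size
        self.fetch_addr = 0          # next address the prefetcher will read

    def prefetch(self):
        # Blindly read ahead until the queue is full or memory runs out.
        while len(self.queue) < self.queue_size and self.fetch_addr < len(self.memory):
            self.queue.append(self.memory[self.fetch_addr])
            self.fetch_addr += 1

    def next_code_byte(self):
        self.prefetch()
        return self.queue.popleft()

    def write(self, addr, value):
        # A data write changes memory only; copies already in the queue stay as they were.
        self.memory[addr] = value

    def jump(self, addr):
        # A jump discards the queue and restarts prefetching at the target.
        self.queue.clear()
        self.fetch_addr = addr


def byte_seen_at_patched_address(flush_with_jump):
    cpu = ToyCpu([0x90] * 6)
    cpu.next_code_byte()             # consume address 0; addresses 1..3 are now queued
    cpu.write(2, 0x40)               # patch a location just ahead of execution
    if flush_with_jump:
        cpu.jump(1)                  # invalidate the queue, refetch from address 1
    cpu.next_code_byte()             # address 1
    return cpu.next_code_byte()      # address 2, the patched location


print(f"{byte_seen_at_patched_address(False):02X}")  # prints 90 (stale copy from the queue)
print(f"{byte_seen_at_patched_address(True):02X}")   # prints 40 (fresh byte after the jump)
```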