There was a problem that asked how to implement a load byte instruction in a single-cycle datapath without changing the data memory, and the solution given was the following.
Diagram of the solution: http://img214.imageshack.us/img214/7107/99897101.jpg
This is actually quite a realistic
question; most memory systems are
entirely word-based, and individual
bytes are typically only dealt with
inside the processor. When you see a
“bus error” on many computers, this
often means that the processor tried
to access a memory address that was
not properly word-aligned, and the
memory system raised an exception.
Anyway, because byte addresses might
not be a multiple of 4, we cannot pass
them to memory directly. However, we
can still get at any byte, because
every byte can be found within some
word, and all word addresses are
multiples of 4. So the first thing we
do is to make sure we get the right
word. If we take the high 30 bits of
the address (i.e., ALUresult[31-2])
and combine them with two 0 bits at
the low end (this is what the “left
shift 2” unit is really doing), we
have the byte address of the word that
contains the desired byte. This is
just the byte’s own address, rounded
down to a multiple of 4. This change
means that lw will now also round
addresses down to multiples of 4, but
that’s OK since non-aligned addresses
wouldn’t work for lw anyway with this
memory unit. OK, now we get the data
word back from memory. How do we get
the byte we want out of it? Well,
note that the byte’s byte-offset
within the word is just given by the
low-order 2 bits of the byte’s
address. So, we simply use those 2
bits to select the appropriate byte
out of the word using a mux. Note the
use of big-endian byte numbering, as
is appropriate for MIPS. Next, we
have to zero-extend the byte to 32
bits (i.e., just combine it with 24
zeros at its high end), because the
problem specifies to do so. Actually,
this was a slight mistake in the
question: in reality, the lbu
instruction zero-extends the byte, but
lb sign-extends it. Oh, well.
Finally, we have to extend the
MemtoReg-controlled mux to accept one
new input: the zero-extended byte for
the lb case. The MemtoReg control
signal must be widened to 2 bits. The
original 0 and 1 cases change to 00
and 01, respectively, and we add a new
case 10 which is only used in the case
of lb.
I don’t quite understand how this works even after reading the explanation, especially the part about left-shifting the ALU result by 2 to get the byte address. How is this possible? If I wanted to load a half-word, would I do one left shift to get the address of the half-word? And what would be a better way to implement load byte and load half-word if we were allowed to modify the data memory? (The question above has the constraint that we can’t modify the data memory.)
The original author seems to be simply adding a byte multiplexer to the 32-bit data being read from the memory. The memory itself only supports a full 32-bit, naturally aligned load (the lw instruction); the additional byte multiplexer and zero extension make load byte possible as well (the lbu instruction).
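The left shift of the ALU result yields the address of a word, NOT of the individual byte. It is there to undo the implicit right shift by two in the signal routing: only ALUresult[31-2] is wired into the shifter, so the two low bits are dropped and then replaced by zeros. The end result is simply that the lower two bits of the ALU result are masked (zeroed) before the address is sent to the memory. Those two LSBs of the ALU value are instead fed downstream of the memory to the byte multiplexer, which picks one of the four bytes out of the returned word. This is what lets a word-only memory read arbitrary bytes.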
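To make the mask-then-select idea concrete, here is a minimal software sketch of that path. It is not the hardware itself, and the names `load_word`, `load_byte_unsigned`, and the list-based `memory` are hypothetical, chosen only for illustration. It assumes big-endian byte numbering, as the quoted explanation does.

```python
def load_word(memory, addr):
    """Model of a word-only memory: it accepts only multiples of 4."""
    assert addr % 4 == 0, "memory only accepts word-aligned addresses"
    return memory[addr // 4]


def load_byte_unsigned(memory, alu_result):
    """Model of lbu built on top of the word-only memory."""
    # Drop the low two bits and append two zeros: this is the net
    # effect of wiring ALUresult[31-2] into the "left shift 2" unit.
    word_addr = (alu_result >> 2) << 2
    # The two low bits bypass the memory and drive the byte mux.
    byte_offset = alu_result & 0x3
    word = load_word(memory, word_addr)
    # Big-endian: offset 0 selects the most significant byte.
    shift = 8 * (3 - byte_offset)
    # Masking to 8 bits also gives the zero-extended 32-bit result.
    return (word >> shift) & 0xFF


# Two example words at byte addresses 0 and 4 (hypothetical data).
memory = [0x11223344, 0xAABBCCDD]
print(hex(load_byte_unsigned(memory, 5)))  # byte at offset 1 of word 1
```

Note that the address handed to `load_word` is always a multiple of 4 regardless of the byte requested, which is why the memory never needs to change.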
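There is no direct support in the logic shown for loading half-words (16 bits), just bytes and full 32-bit words. You would not shift by one to get there: the address sent to memory is still the word address with the two low bits zeroed. You could, however, easily modify the byte-selection logic to support half-words instead of bytes (or even both) using a similar approach, with bit 1 of the ALU result choosing between the upper and lower 16 bits of the word.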
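A half-word variant of the same idea might look like the sketch below. Again this is only a software model under assumptions: the name `load_half_unsigned` and the list-based `memory` are hypothetical, byte numbering is big-endian, and the address is assumed to be half-word aligned (bit 0 is simply ignored here rather than raising an exception).

```python
def load_half_unsigned(memory, alu_result):
    """Model of an unsigned half-word load on a word-only memory."""
    # Same masking as for bytes: the memory still sees a word address.
    word_addr = (alu_result >> 2) << 2
    word = memory[word_addr // 4]
    # Only address bit 1 is needed to pick one of the two halves.
    half_index = (alu_result >> 1) & 0x1
    # Big-endian: index 0 selects the upper 16 bits of the word.
    shift = 16 * (1 - half_index)
    # Masking to 16 bits also gives the zero-extended 32-bit result.
    return (word >> shift) & 0xFFFF


# Two example words at byte addresses 0 and 4 (hypothetical data).
memory = [0x11223344, 0xAABBCCDD]
print(hex(load_half_unsigned(memory, 2)))  # lower half of word 0
```

In hardware terms this corresponds to a 2-to-1 mux on 16-bit halves, selected by one address bit, alongside the existing 4-to-1 byte mux.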