Please refer to the edit portion for my explanation. This is a bit long

Editorial Team

Asked: May 26, 2026 (2026-05-26T15:55:17+00:00)

Please refer to the edit portion for my explanation.

This is a bit long and difficult to illustrate, but I appreciate you taking the time to read it. Please bear with me.

Suppose I have this:

.data
    str1: .asciiz "A"
    str2: .asciiz "1"
    myInt:
          .word 42      # allocate an integer word: 42
    myChar:
          .word 'Q'     # allocate a char word

    .text    
    .align 2
    .globl main

main:
    lw      $t0, myInt          # load myInt into register $t0

    lw      $t3, str1           # load str1 into register $t3

    lw      $t4, str2           #load str2 into register $t4

    la      $a0, str1           # load address str1

    la      $a1, str2           # load address str2

Then in SPIM, the user text segment is

User Text Segment [00400000]..[00440000]
[00400000] 8fa40000  lw $4, 0($29)            ; 183: lw $a0 0($sp) # argc 
[00400004] 27a50004  addiu $5, $29, 4         ; 184: addiu $a1 $sp 4 # argv 
[00400008] 24a60004  addiu $6, $5, 4          ; 185: addiu $a2 $a1 4 # envp 
[0040000c] 00041080  sll $2, $4, 2            ; 186: sll $v0 $a0 2 
[00400010] 00c23021  addu $6, $6, $2          ; 187: addu $a2 $a2 $v0 
[00400014] 0c100009  jal 0x00400024 [main]    ; 188: jal main 
[00400018] 00000000  nop                      ; 189: nop 
[0040001c] 3402000a  ori $2, $0, 10           ; 191: li $v0 10 
[00400020] 0000000c  syscall                  ; 192: syscall # syscall 10 (exit) 
[00400024] 3c011001  lui $1, 4097             ; 23: lw $t0, myInt # load myInt into register $t0 
[00400028] 8c280004  lw $8, 4($1)             
[0040002c] 3c011001  lui $1, 4097             ; 25: lw $t3, str1 # load str1 into register $t3 
[00400030] 8c2b0000  lw $11, 0($1)            
[00400034] 3c011001  lui $1, 4097             ; 27: lw $t4, str2 #load str2 into register $t4 
[00400038] 8c2c0002  lw $12, 2($1)            
[0040003c] 3c041001  lui $4, 4097 [str1]      ; 29: la $a0, str1 # load address str1 
[00400040] 3c011001  lui $1, 4097 [str2]      ; 31: la $a1, str2 # load address str2 
[00400044] 34250002  ori $5, $1, 2 [str2]

I understand that lw with a label operand is a pseudo-instruction, so it has to be broken down into two machine instructions. I understand this part. We use the start address of the data segment as a “base pointer” and access the other data (including the first item) relative to it.
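To make the “base pointer plus offset” idea concrete, here is a minimal Python sketch that lays out the four data items and prints their offsets. It is not SPIM's code. The helper name `layout`, the `(label, size, alignment)` tuples, and the base address 0x10010000 (taken from the listing above) are assumptions made for illustration.

```python
# Hypothetical helper: place .data items one after another, honoring alignment.
DATA_BASE = 0x10010000  # data segment start, as seen in the SPIM listing above


def layout(items, base=DATA_BASE):
    """Return {label: address} for a list of (label, size_in_bytes, alignment)."""
    addresses = {}
    cursor = base
    for label, size, align in items:
        # Round the cursor up to the next multiple of the required alignment.
        cursor = (cursor + align - 1) // align * align
        addresses[label] = cursor
        cursor += size
    return addresses


items = [
    ("str1", 2, 1),    # "A" plus the terminating NUL added by .asciiz
    ("str2", 2, 1),    # "1" plus the terminating NUL
    ("myInt", 4, 4),   # .word is 4 bytes and word-aligned
    ("myChar", 4, 4),
]

for label, addr in layout(items).items():
    print(f"{label}: 0x{addr:08x} (offset {addr - DATA_BASE})")
```

The offsets this produces for str1, str2 and myInt (0, 2 and 4) are the same ones that appear in `lw $11, 0($1)`, `lw $12, 2($1)` and `lw $8, 4($1)` in the text segment.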

I also observed that loading address of str1 and str2 used two different registers: $4 and $1. $4 is $a0.
Why is that?

If I swap the last two instructions, on SPIM I see

...        
[0040003c] 3c011001  lui $1, 4097 [str2]      ; 31: la $a1, str2 # load address str2 
[00400040] 34250002  ori $5, $1, 2 [str2]     
[00400044] 3c041001  lui $4, 4097 [str1]      ; 32: la $a0, str1 # load address str1

So why is load address so strange? Why did str2 use $1?
How can I explain how lui $1, 4097 [str2] and lui $4, 4097 [str1] are different?

PS: Can someone also explain to me why we need the bracket [str2] ?

lui $1, 4097 [str2] only pushes the entry address of the data segment to register $1. That is, 0x10010000.

Thank you very much!

EDIT

I rewrote the entire script to simplify the situation.

Script: http://pastebin.com/BHh89iqt
Text Segment: http://pastebin.com/t2eDEs1f

Let me remind you that we write assembly with pseudo-instructions, rather than true MIPS machine instructions. For example, “lw” with a label operand is a pseudo-instruction.

So lw (load word) from a label is broken down into two machine instructions (look at the text segment):

lui $1, 4097             ; 23: lw $t0, myInt # load myInt into register $t0 
lw $8, 4($1)

MIPS instructions are 32 bits wide. Fitting a 6-bit opcode, a 5-bit register number, and a full 32-bit address into one instruction would take 43 bits, which is why the load is broken down into two parts.
A label is a name for the memory address of the data it is attached to.

A real MIPS load has the form lw $rt, offset($rs), where the offset is only 16 bits. So most load pseudo-instructions that take a label are translated this way, using $at ($1) as a temporary register for the upper half of the address.
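The split into two instructions can be sketched in Python as follows. This only models the arithmetic visible in the listing. `split_address` and `expand_lw` are hypothetical names, and the sketch assumes the low half of the address is below 0x8000, so the sign extension of the 16-bit lw offset does not come into play.

```python
def split_address(addr):
    """Split a 32-bit address into its upper and lower 16-bit halves."""
    hi = (addr >> 16) & 0xFFFF
    lo = addr & 0xFFFF
    return hi, lo


def expand_lw(rt, addr):
    """Model 'lw $rt, label' as lui into $1 ($at) followed by a real lw."""
    hi, lo = split_address(addr)
    # Assumption: lo < 0x8000, so the offset is not treated as negative.
    assert lo < 0x8000, "sketch does not handle sign-extended offsets"
    return [f"lui $1, {hi}", f"lw ${rt}, {lo}($1)"]


# myInt lives at 0x10010004 in the listing; $t0 is register 8.
for line in expand_lw(8, 0x10010004):
    print(line)
```

With myInt at 0x10010004, the upper half is 0x1001 (4097 in decimal) and the lower half is 4, which is where the `4097` and the `4($1)` in the listing come from.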

For lw it’s pretty easy. For la (load address) it’s a bit trickier.
Pay attention to the last four load address instructions. SPIM translates them into this:

[0040003c] 3c041001  lui $4, 4097 [str1]      ; 27: la $a0, str1 # load address str2 
[00400040] 3c011001  lui $1, 4097 [str2]      ; 28: la $a0, str2 # load address str1 
[00400044] 34240002  ori $4, $1, 2 [str2]     
[00400048] 3c011001  lui $1, 4097 [str2]      ; 30: la $a0, str2 # load address str2 
[0040004c] 34240002  ori $4, $1, 2 [str2]     
[00400050] 3c041001  lui $4, 4097 [str1]      ; 31: la $a0, str1 # load address str1

$4 refers to $a0. If you look at the instructions, I swapped the first two la instructions, and the result is the last two.
I did this on purpose to illustrate the strange behavior: for str1, lui writes the address straight into $4, but to load the address of str2, SPIM uses $at and then applies an offset.

I couldn’t figure out why last night, but I just realized that this happens because the assembler is smart enough to know that str1 is the first item in the data segment, so its address needs no offset and there is nothing more to add.
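One way to read this behavior: when the lower 16 bits of the address are zero, a single lui into the destination register is enough; otherwise the upper half goes into $at and an ori fills in the lower half. Here is a minimal Python sketch of that rule. It reproduces the output shown above, but it is an assumption about what SPIM does, not its actual source, and `expand_la` is a made-up name.

```python
def expand_la(rd, addr):
    """Model 'la $rd, label' for a 32-bit address."""
    hi = (addr >> 16) & 0xFFFF
    lo = addr & 0xFFFF
    if lo == 0:
        # lui already leaves zeros in the lower half, so one instruction suffices.
        return [f"lui ${rd}, {hi}"]
    # Otherwise build the upper half in $1 ($at), then OR in the lower half.
    return [f"lui $1, {hi}", f"ori ${rd}, $1, {lo}"]


print(expand_la(4, 0x10010000))  # str1 at offset 0, destination $a0
print(expand_la(5, 0x10010002))  # str2 at offset 2, destination $a1
```

Under this reading, str1 is only special because it happens to sit at an address whose lower half is zero, not because it is a string or because it was loaded first.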

Yet this is also strange because how does the compiler know at what byte to stop printing the string? (if we want to print a string…)

My guess: Null character to terminate print.

Anyhow. I guess this is just a convention that the MIPS uses.

Second Edit

In fact, if you add a new data item above str1, you will see that my explanation is correct.

New script: http://pastebin.com/8DuzFrk0

New Text Segment: http://pastebin.com/YXbvzc4z

I only added myCharB to the top of the data segment.

[0040003c] 3c011001  lui $1, 4097 [str1]      ; 29: la $a0, str1 # load address str2
[00400040] 34240004  ori $4, $1, 4 [str1]
[00400044] 3c011001  lui $1, 4097 [str2]      ; 30: la $a0, str2 # load address str1
[00400048] 34240006  ori $4, $1, 6 [str2]

1 Answer

Editorial Team · Answer 1 · 2026-05-26T15:55:18+00:00

I also observed that loading address of str1 and str2 used two
different registers: $4 and $1. $4 is $a0. Why is that?

Well, who cares? xD It’s an internal SPIM implementation detail, and SPIM is free to use any register as long as it does not break the MIPS ABI. I suggest not relying too much on pseudo-instructions if you need to know which registers have changed or what values they hold. Also, LW is usually not a pseudo-instruction, but the way you’re using it (with a label operand), it is.

Can someone also explain to me why we need the bracket [str2] ?

You don’t need any brackets. That’s just SPIM’s annotation to show the programmer that this instruction is loading the str2 address. It’s not part of the assembly.

lui $1, 4097 [str2] only pushes the entry address of the data segment to
register $1. That is, 0x10010000

Well, actually LUI loads its 16-bit immediate into the upper half-word of $1 and sets the lower half-word to zero. So after lui $1, 4097 the register holds 0x10010000 regardless of what was in it before. When the address has a non-zero lower half, a following ORI (or the offset of the LW) supplies it.
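To check this against the listing, the machine words can be decoded by hand. The Python sketch below splits an I-type instruction word into its fields and computes the value lui leaves in the register, following the MIPS32 definition of lui (immediate in the upper 16 bits, zeros in the lower 16). The function names are made up for illustration.

```python
def decode_i_type(word):
    """Split a 32-bit MIPS I-type instruction word into its four fields."""
    opcode = (word >> 26) & 0x3F  # bits 31..26
    rs = (word >> 21) & 0x1F      # bits 25..21
    rt = (word >> 16) & 0x1F      # bits 20..16 (destination for lui)
    imm = word & 0xFFFF           # bits 15..0
    return opcode, rs, rt, imm


def lui_result(imm):
    """Value written by lui: immediate in the upper half, zeros below."""
    return (imm << 16) & 0xFFFFFFFF


LUI_OPCODE = 0x0F

for word in (0x3C011001, 0x3C041001):  # the two lui words from the listing
    opcode, rs, rt, imm = decode_i_type(word)
    assert opcode == LUI_OPCODE
    print(f"0x{word:08x}: lui ${rt}, {imm} -> ${rt} = 0x{lui_result(imm):08x}")
```

The two words differ only in the rt field (1 versus 4), so both instructions compute the same value, 0x10010000, and only the destination register changes.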

Yet this is also strange because how does the compiler know at what
byte to stop printing the string? (if we want to print a string…)

Null-terminated, as you guessed right.

I guess this is just a convention that the MIPS uses.

This is way older than MIPS. MIPS doesn’t define anything about this, and neither does any other architecture. This is data handling, and it’s defined at a higher layer such as the OS. In this case it is SPIM’s own convention for its syscalls. Anyway, null-terminated strings are pretty common; the C programming language uses them for its strings.
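As a rough illustration of how a print routine finds the end of a string, here is a minimal Python sketch that scans a byte buffer until it meets a zero byte. The buffer stands in for the start of the data segment from the question, and `read_cstring` is a hypothetical helper, not a SPIM function.

```python
def read_cstring(memory, start):
    """Return the ASCII text from 'start' up to, but not including, the NUL byte."""
    end = start
    while memory[end] != 0:
        end += 1
    return memory[start:end].decode("ascii")


# What .asciiz "A" followed by .asciiz "1" puts in memory: each string plus a NUL.
data_segment = b"A\x00" + b"1\x00"

print(read_cstring(data_segment, 0))  # string that starts at offset 0 (str1)
print(read_cstring(data_segment, 2))  # string that starts at offset 2 (str2)
```

The address passed in only says where the string starts; the terminating zero byte that .asciiz appends is what says where it stops.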
