I’m slightly confused as to why clang is emitting different code for the following

Question

0

Asked: May 29, 20262026-05-29T07:04:26+00:00 2026-05-29T07:04:26+00:00

I’m slightly confused as to why clang is emitting different code for the following

0

I’m slightly confused as to why clang is emitting different code for the following two method:

@interface ClassA : NSObject
@end

@implementation ClassA
+ (ClassA*)giveMeAnObject1 {
    return [[ClassA alloc] init];
}
+ (id)giveMeAnObject2 {
    return [[ClassA alloc] init];
}
@end

If we look at the ARMv7 emitted then we see this, at O3, with ARC enabled:

        .align  2
        .code   16
        .thumb_func     "+[ClassA giveMeAnObject1]"
"+[ClassA giveMeAnObject1]":
        push    {r7, lr}
        movw    r1, :lower16:(L_OBJC_SELECTOR_REFERENCES_-(LPC0_0+4))
        mov     r7, sp
        movt    r1, :upper16:(L_OBJC_SELECTOR_REFERENCES_-(LPC0_0+4))
        movw    r0, :lower16:(L_OBJC_CLASSLIST_REFERENCES_$_-(LPC0_1+4))
        movt    r0, :upper16:(L_OBJC_CLASSLIST_REFERENCES_$_-(LPC0_1+4))
LPC0_0:
        add     r1, pc
LPC0_1:
        add     r0, pc
        ldr     r1, [r1]
        ldr     r0, [r0]
        blx     _objc_msgSend
        movw    r1, :lower16:(L_OBJC_SELECTOR_REFERENCES_2-(LPC0_2+4))
        movt    r1, :upper16:(L_OBJC_SELECTOR_REFERENCES_2-(LPC0_2+4))
LPC0_2:
        add     r1, pc
        ldr     r1, [r1]
        blx     _objc_msgSend
        pop.w   {r7, lr}
        b.w     _objc_autorelease

        .align  2
        .code   16
        .thumb_func     "+[ClassA giveMeAnObject2]"
"+[ClassA giveMeAnObject2]":
        push    {r7, lr}
        movw    r1, :lower16:(L_OBJC_SELECTOR_REFERENCES_-(LPC2_0+4))
        mov     r7, sp
        movt    r1, :upper16:(L_OBJC_SELECTOR_REFERENCES_-(LPC2_0+4))
        movw    r0, :lower16:(L_OBJC_CLASSLIST_REFERENCES_$_-(LPC2_1+4))
        movt    r0, :upper16:(L_OBJC_CLASSLIST_REFERENCES_$_-(LPC2_1+4))
LPC2_0:
        add     r1, pc
LPC2_1:
        add     r0, pc
        ldr     r1, [r1]
        ldr     r0, [r0]
        blx     _objc_msgSend
        movw    r1, :lower16:(L_OBJC_SELECTOR_REFERENCES_2-(LPC2_2+4))
        movt    r1, :upper16:(L_OBJC_SELECTOR_REFERENCES_2-(LPC2_2+4))
LPC2_2:
        add     r1, pc
        ldr     r1, [r1]
        blx     _objc_msgSend
        pop.w   {r7, lr}
        b.w     _objc_autoreleaseReturnValue

The only difference is the tail call to objc_autoreleaseReturnValue vs objc_autorelease. I would expect both to call objc_autoreleaseReturnValue to be honest. In-fact the first method not using objc_autoreleaseReturnValue means that it will potentially be slower than the second because there will definitely be an autorelease then a retain by the caller, rather than the faster bypass of this redundant call that ARC can do if it’s supported in the runtime.

The LLVM which is emitted gives some kind of reason why it’s like that:

define internal %1* @"\01+[ClassA giveMeAnObject1]"(i8* nocapture %self, i8* nocapture %_cmd) {
  %1 = load %struct._class_t** @"\01L_OBJC_CLASSLIST_REFERENCES_$_", align 4
  %2 = load i8** @"\01L_OBJC_SELECTOR_REFERENCES_", align 4
  %3 = bitcast %struct._class_t* %1 to i8*
  %4 = tail call i8* bitcast (i8* (i8*, i8*, ...)* @objc_msgSend to i8* (i8*, i8*)*)(i8* %3, i8* %2)
  %5 = load i8** @"\01L_OBJC_SELECTOR_REFERENCES_2", align 4
  %6 = tail call i8* bitcast (i8* (i8*, i8*, ...)* @objc_msgSend to i8* (i8*, i8*)*)(i8* %4, i8* %5)
  %7 = tail call i8* @objc_autorelease(i8* %6) nounwind
  %8 = bitcast i8* %6 to %1*
  ret %1* %8
}

define internal i8* @"\01+[ClassA giveMeAnObject2]"(i8* nocapture %self, i8* nocapture %_cmd) {
  %1 = load %struct._class_t** @"\01L_OBJC_CLASSLIST_REFERENCES_$_", align 4
  %2 = load i8** @"\01L_OBJC_SELECTOR_REFERENCES_", align 4
  %3 = bitcast %struct._class_t* %1 to i8*
  %4 = tail call i8* bitcast (i8* (i8*, i8*, ...)* @objc_msgSend to i8* (i8*, i8*)*)(i8* %3, i8* %2)
  %5 = load i8** @"\01L_OBJC_SELECTOR_REFERENCES_2", align 4
  %6 = tail call i8* bitcast (i8* (i8*, i8*, ...)* @objc_msgSend to i8* (i8*, i8*)*)(i8* %4, i8* %5)
  %7 = tail call i8* @objc_autoreleaseReturnValue(i8* %6) nounwind
  ret i8* %6
}

But I’m struggling to see why it’s decided to compile these two method differently. Can anyone shed some light onto it?

Update:

Even weirder is these other methods:

+ (ClassA*)giveMeAnObject3 {
    ClassA *a = [[ClassA alloc] init];
    return a;
}

+ (id)giveMeAnObject4 {
    ClassA *a = [[ClassA alloc] init];
    return a;
}

These compile to:

        .align  2
        .code   16
        .thumb_func     "+[ClassA giveMeAnObject3]"
"+[ClassA giveMeAnObject3]":
        push    {r4, r7, lr}
        movw    r1, :lower16:(L_OBJC_SELECTOR_REFERENCES_-(LPC2_0+4))
        add     r7, sp, #4
        movt    r1, :upper16:(L_OBJC_SELECTOR_REFERENCES_-(LPC2_0+4))
        movw    r0, :lower16:(L_OBJC_CLASSLIST_REFERENCES_$_-(LPC2_1+4))
        movt    r0, :upper16:(L_OBJC_CLASSLIST_REFERENCES_$_-(LPC2_1+4))
LPC2_0:
        add     r1, pc
LPC2_1:
        add     r0, pc
        ldr     r1, [r1]
        ldr     r0, [r0]
        blx     _objc_msgSend
        movw    r1, :lower16:(L_OBJC_SELECTOR_REFERENCES_2-(LPC2_2+4))
        movt    r1, :upper16:(L_OBJC_SELECTOR_REFERENCES_2-(LPC2_2+4))
LPC2_2:
        add     r1, pc
        ldr     r1, [r1]
        blx     _objc_msgSend
        blx     _objc_retainAutoreleasedReturnValue
        mov     r4, r0
        mov     r0, r4
        blx     _objc_release
        mov     r0, r4
        pop.w   {r4, r7, lr}
        b.w     _objc_autoreleaseReturnValue

        .align  2
        .code   16
        .thumb_func     "+[ClassA giveMeAnObject4]"
"+[ClassA giveMeAnObject4]":
        push    {r4, r7, lr}
        movw    r1, :lower16:(L_OBJC_SELECTOR_REFERENCES_-(LPC3_0+4))
        add     r7, sp, #4
        movt    r1, :upper16:(L_OBJC_SELECTOR_REFERENCES_-(LPC3_0+4))
        movw    r0, :lower16:(L_OBJC_CLASSLIST_REFERENCES_$_-(LPC3_1+4))
        movt    r0, :upper16:(L_OBJC_CLASSLIST_REFERENCES_$_-(LPC3_1+4))
LPC3_0:
        add     r1, pc
LPC3_1:
        add     r0, pc
        ldr     r1, [r1]
        ldr     r0, [r0]
        blx     _objc_msgSend
        movw    r1, :lower16:(L_OBJC_SELECTOR_REFERENCES_2-(LPC3_2+4))
        movt    r1, :upper16:(L_OBJC_SELECTOR_REFERENCES_2-(LPC3_2+4))
LPC3_2:
        add     r1, pc
        ldr     r1, [r1]
        blx     _objc_msgSend
        blx     _objc_retainAutoreleasedReturnValue
        mov     r4, r0
        mov     r0, r4
        blx     _objc_release
        mov     r0, r4
        pop.w   {r4, r7, lr}
        b.w     _objc_autoreleaseReturnValue

This time, they are identical however there’s a few things which could be optimised even more here:

There’s a redundant mov r4, r0 followed by mov r0, r4.
There’s a retain followed by a release.

Surely, the bottom bit of both of those methods can turn into:

LPC3_2:
        add     r1, pc
        ldr     r1, [r1]
        blx     _objc_msgSend
        pop.w   {r4, r7, lr}
        b.w     _objc_autoreleaseReturnValue

Obviously we could then also omit popping r4 because we don’t actually clobber it any more. Then the method would turn into the exact same as giveMeAnObject2 which is exactly what we’d expect.

Why is clang not being clever and doing this?!

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-05-29T07:04:27+00:00

Editorial Team

2026-05-29T07:04:27+00:00Added an answer on May 29, 2026 at 7:04 am

This appears to be a bug in the optimizer and is being tracked as rdar://problem/10813093.

0

Reply
Share
Share

- Report

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

I’m slightly confused as to why clang is emitting different code for the following

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply