The Archive Base Latest Questions

Editorial Team
Asked: May 19, 2026 (2026-05-19T16:06:12+00:00)


In this compiler output, I’m trying to understand how machine-code encoding of the nopw instruction works:

00000000004004d0 <main>:
  4004d0:       eb fe                   jmp    4004d0 <main>
  4004d2:       66 66 66 66 66 2e 0f    nopw   %cs:0x0(%rax,%rax,1)
  4004d9:       1f 84 00 00 00 00 00

There is some discussion about “nopw” at http://john.freml.in/amd64-nopl. Can anybody explain the meaning of the bytes at 4004d2-4004e0? From looking at the opcode list, it seems that the 66 .. codes are multi-byte expansions. I suspect I'll get a better answer here than I would by spending a few hours trying to grok the opcode list.


That asm output is from the following (insane) code in C, which optimizes down to a simple infinite loop:

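long i = 0;

/* Prototype and return types added so current compilers accept the code;
   the original relied on implicit int and implicit declarations. */
void recurse(void);

int main(void) {
    recurse();
}

void recurse(void) {
    i++;
    recurse();
}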

When compiled with gcc -O2, the compiler recognizes the infinite recursion and turns it into an infinite loop; it does this so well, in fact, that it loops inside main() without ever calling recurse().


Editor’s note: padding functions with NOPs isn’t specific to infinite loops. Here’s a set of functions with a range of lengths of NOPs, on the Godbolt compiler explorer.


1 Answer

  1. Editorial Team
    Added an answer on May 19, 2026 at 4:06 pm

    The 0x66 bytes are an “Operand-Size Override” prefix. Having more than one of these is equivalent to having one.

    The 0x2e is a ‘null prefix’ in 64-bit mode (it’s a CS: segment override otherwise – which is why it shows up in the assembly mnemonic).

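    0x0f 0x1f is a 2-byte opcode for a NOP that takes a ModRM byte.

    0x84 is the ModRM byte (mod=10, reg=000, r/m=100), which in this case encodes an addressing mode that uses 5 more bytes: a SIB byte (0x00, meaning base %rax, index %rax, scale 1) followed by a 32-bit displacement (all zeros). That is the 0x0(%rax,%rax,1) operand in the disassembly.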
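The byte-by-byte breakdown above can be checked mechanically. The following is a minimal sketch in C, not a general x86 decoder: `struct long_nop` and `decode_long_nop` are hypothetical names made up for this illustration, and the function only understands the one shape discussed here (0x66/0x2e prefixes, the 0f 1f opcode, and a ModRM byte with mod=10, r/m=100).

```c
#include <stddef.h>
#include <stdint.h>

/* Fields of the long NOP, filled in by decode_long_nop() (hypothetical helper). */
struct long_nop {
    int prefixes;            /* number of leading 0x66 / 0x2e prefix bytes */
    int mod, reg, rm;        /* the three ModRM fields */
    int scale, index, base;  /* SIB fields; scale is the multiplier (1, 2, 4, 8) */
    int32_t disp;            /* 32-bit displacement, little-endian in memory */
    size_t length;           /* total instruction length in bytes */
};

/* Decodes only "prefixes, 0f 1f, ModRM(mod=10, r/m=100), SIB, disp32".
   Returns 0 on success, -1 for anything else. */
int decode_long_nop(const uint8_t *p, size_t n, struct long_nop *out)
{
    size_t i = 0;

    /* Skip the operand-size (0x66) and CS-override (0x2e) prefixes. */
    while (i < n && (p[i] == 0x66 || p[i] == 0x2e))
        i++;
    out->prefixes = (int)i;

    /* Two-byte NOP opcode followed by a ModRM byte. */
    if (n - i < 3 || p[i] != 0x0f || p[i + 1] != 0x1f)
        return -1;

    uint8_t modrm = p[i + 2];
    out->mod = modrm >> 6;
    out->reg = (modrm >> 3) & 7;
    out->rm  = modrm & 7;
    i += 3;

    /* mod=10 with r/m=100 means: a SIB byte and a disp32 follow. */
    if (out->mod != 2 || out->rm != 4 || n - i < 5)
        return -1;

    uint8_t sib = p[i];
    out->scale = 1 << (sib >> 6);
    out->index = (sib >> 3) & 7;   /* register number 0 is rax */
    out->base  = sib & 7;

    uint32_t d = (uint32_t)p[i + 1]
               | (uint32_t)p[i + 2] << 8
               | (uint32_t)p[i + 3] << 16
               | (uint32_t)p[i + 4] << 24;
    out->disp = (int32_t)d;
    out->length = i + 5;
    return 0;
}
```

Fed the 14 bytes from the question (66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00), this should report six prefixes, ModRM fields 2/0/4, a SIB of scale 1 with index and base both register 0 (rax), a zero displacement, and a total length of 14.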

    Some CPUs are slow to decode instructions with many prefixes (e.g. more than three), so a ModRM byte that specifies a SIB + disp32 is a much better way to use up an extra 5 bytes than five more prefix bytes.

    Agner Fog’s microarchitecture PDF says this about the AMD K8 decoders:

    Each of the instruction decoders can handle three prefixes per clock
    cycle. This means that three instructions with three prefixes each can
    be decoded in the same clock cycle. An instruction with 4 – 6 prefixes
    takes an extra clock cycle to decode.
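To make the quoted rule concrete, here is a small sketch in C that counts the legacy prefix bytes at the start of an instruction and applies only the two cases the quote states (up to three prefixes: no penalty; four to six: one extra cycle). The function names are hypothetical, the prefix list is the standard set of legacy prefixes (REX is deliberately not counted), and this is not a real performance model of any CPU.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if b is one of the x86 legacy prefix bytes, 0 otherwise. */
int is_legacy_prefix(uint8_t b)
{
    switch (b) {
    case 0x66: case 0x67:                       /* operand-size, address-size */
    case 0x26: case 0x2e: case 0x36: case 0x3e: /* ES, CS, SS, DS overrides */
    case 0x64: case 0x65:                       /* FS, GS overrides */
    case 0xf0: case 0xf2: case 0xf3:            /* LOCK, REPNE, REP */
        return 1;
    default:
        return 0;
    }
}

/* Counts the prefix bytes at the start of an instruction. */
size_t count_prefixes(const uint8_t *insn, size_t len)
{
    size_t n = 0;
    while (n < len && is_legacy_prefix(insn[n]))
        n++;
    return n;
}

/* Extra decode cycles on K8 according to the quoted passage only:
   0 for up to three prefixes, 1 for four to six.
   Returns -1 for more than six, which the quote does not cover. */
int k8_extra_decode_cycles(size_t prefixes)
{
    if (prefixes <= 3)
        return 0;
    if (prefixes <= 6)
        return 1;
    return -1;
}
```

For the nopw in the question, the five 0x66 bytes plus the 0x2e give six prefixes, so under this rule it falls in the "extra clock cycle" bucket on K8.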


    Essentially, those bytes are one long NOP instruction that will never get executed anyway. It’s in there to ensure that the next function is aligned on a 16-byte boundary, because the compiler emitted a .p2align 4 directive, so the assembler padded with a NOP. gcc’s default for x86 is -falign-functions=16.

    For NOPs that will be executed, the optimal choice of long-NOP depends on the microarchitecture. For a microarchitecture that chokes on many prefixes, like Intel Silvermont or AMD K8, two NOPs with 3 prefixes each might have decoded faster.
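The amount of padding follows from simple arithmetic on the address where the code ends. A minimal sketch in C, assuming a hypothetical helper name `padding_for_p2align` and taking the alignment as a power-of-two exponent the way .p2align does:

```c
#include <stdint.h>

/* Number of padding bytes needed so that addr becomes a multiple of
   2^p2align (p2align = 4 means a 16-byte boundary). */
uint64_t padding_for_p2align(uint64_t addr, unsigned p2align)
{
    uint64_t align = (uint64_t)1 << p2align;
    /* Distance to the next boundary; the final mask maps "already aligned"
       to 0 instead of a full alignment unit. */
    return (align - (addr & (align - 1))) & (align - 1);
}
```

In the question, the jmp ends at 0x4004d2, so `padding_for_p2align(0x4004d2, 4)` gives 14: exactly the length of the single long NOP, which brings the next function to 0x4004e0.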

    The blog article the question linked to ( http://john.freml.in/amd64-nopl ) explains why the compiler uses a complicated single NOP instruction instead of a bunch of single-byte 0x90 NOP instructions.

    You can find the details on the instruction encoding in AMD’s tech ref documents:

    • http://developer.amd.com/documentation/guides/pages/default.aspx#manuals

    Mainly in the “AMD64 Architecture Programmer’s Manual Volume 3: General Purpose and System Instructions”. I’m sure Intel’s technical references for the x64 architecture will have the same information (and might even be more understandable).

© 2021 The Archive Base. All Rights Reserved