Is there something like a modulo operator or instruction in x86 assembly?
If your modulus / divisor is a known constant, and you care about performance, avoid `div` entirely: division by a constant can be done with a fixed-point multiplicative inverse, which is what optimizing compilers emit. A multiplicative inverse is even possible for loop-invariant values that aren't known until runtime, e.g. see https://libdivide.com/ (but without JIT code-gen, that's less efficient than hard-coding just the steps necessary for one constant).
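As a rough illustration of the multiplicative-inverse idea, this is approximately the kind of sequence a compiler produces for unsigned 32-bit `x / 10` and `x % 10`. The register assignment (input in EDI, quotient in EDX, remainder left in EDI) is an assumption made for this sketch; the constant `0xCCCCCCCD` is ceil(2^35 / 10).

```nasm
; Sketch: unsigned EDI / 10 and EDI % 10 without div (32-bit operand-size).
; Assumed register use: input in EDI; quotient ends up in EDX, remainder in EDI.
    mov   eax, edi
    mov   edx, 0xCCCCCCCD     ; ceil(2^35 / 10)
    mul   edx                 ; EDX:EAX = x * 0xCCCCCCCD
    shr   edx, 3              ; EDX = (x * 0xCCCCCCCD) >> 35 = x / 10
    lea   eax, [edx + edx*4]  ; EAX = quotient * 5
    add   eax, eax            ; EAX = quotient * 10
    sub   edi, eax            ; EDI = x - quotient*10 = x % 10
```

Other divisors need a different constant and shift count (and sometimes an extra fixup step), so it is usually easiest to let a compiler derive them.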
Never use `div` for known powers of 2: it's much slower than `and` for remainder, or right-shift for divide. Look at C compiler output for examples of unsigned or signed division by powers of 2, e.g. on the Godbolt compiler explorer. If you know a runtime input is a power of 2, use `lea eax, [esi-1]` ; `and eax, edi` or something like that to do `x & (y-1)`. Modulo 256 is even more efficient: `movzx eax, cl` has zero latency on recent Intel CPUs (mov-elimination), as long as the two registers are separate.

In the simple/general case: unknown value at runtime
The `DIV` instruction (and its counterpart `IDIV` for signed numbers) gives both the quotient and remainder. For unsigned, remainder and modulus are the same thing. For signed `idiv`, it gives you the remainder (not modulus), which can be negative: e.g. `-5 / 2 = -2 rem -1`. x86 division semantics exactly match C99's `%` operator.

`DIV r32` divides a 64-bit number in `EDX:EAX` by a 32-bit operand (in any register or memory) and stores the quotient in `EAX` and the remainder in `EDX`. It faults on overflow of the quotient.

Unsigned 32-bit example (works in any mode)
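A minimal sketch of such an example, with the dividend 1234 and the divisor 10 chosen arbitrarily for illustration:

```nasm
    mov   eax, 1234      ; dividend, low half
    xor   edx, edx       ; dividend, high half = 0
    mov   ebx, 10        ; divisor: any register or memory operand works
    div   ebx            ; divides EDX:EAX by EBX
    ; EAX = 123 = 1234 / 10  (quotient)
    ; EDX =   4 = 1234 % 10  (remainder)
```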
In 16-bit assembly you can do `div bx` to divide a 32-bit operand in `DX:AX` by `BX`. See Intel's Architectures Software Developer's Manuals for more information.

Normally, always use `xor edx,edx` before unsigned `div` to zero-extend EAX into EDX:EAX. This is how you do "normal" 32-bit / 32-bit => 32-bit division.

For signed division, use `cdq` before `idiv` to sign-extend EAX into EDX:EAX. See also "Why should EDX be 0 before using the DIV instruction?". For other operand-sizes, use `cbw` (AL->AX), `cwd` (AX->DX:AX), `cdq` (EAX->EDX:EAX), or `cqo` (RAX->RDX:RAX) to set the top half to `0` or `-1` according to the sign bit of the low half.

`div` / `idiv` are available in operand-sizes of 8, 16, 32, and (in 64-bit mode) 64-bit. 64-bit operand-size is much slower than 32-bit or smaller on current Intel CPUs, but AMD CPUs only care about the actual magnitude of the numbers, regardless of operand-size.

Note that 8-bit operand-size is special: the implicit inputs/outputs are in AH:AL (aka AX), not DL:AL. See "8086 assembly on DOSBox: Bug with idiv instruction?" for an example.
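To make the sign-extension and 8-bit cases concrete, here is a short sketch. The operand values are arbitrary choices for illustration.

```nasm
; Signed 32-bit: -5 / 2
    mov   eax, -5
    cdq                  ; EDX = 0xFFFFFFFF because EAX is negative
    mov   ecx, 2
    idiv  ecx            ; EAX = -2 (quotient), EDX = -1 (remainder)

; Unsigned 8-bit: 200 / 7. The dividend is AX, not DL:AL.
    mov   eax, 200       ; AL = 200, AH = 0
    mov   cl, 7
    div   cl             ; AL = 28 (quotient), AH = 4 (remainder)
```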
Signed 64-bit division example (requires 64-bit mode)
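A minimal sketch of such an example, dividing INT64_MIN by 10 (both values chosen for illustration):

```nasm
    mov   rax, 0x8000000000000000  ; INT64_MIN = -9223372036854775808
    mov   ecx, 10                  ; writing ECX zero-extends into RCX
    cqo                            ; sign-extend RAX into RDX:RAX (RDX = -1 here)
    idiv  rcx
    ; RAX = -922337203685477580  (quotient, truncated toward zero)
    ; RDX = -8                   (remainder, same sign as the dividend)
```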
Limitations / common mistakes

`div dword 10` is not encodable into machine code (so your assembler will report an error about invalid operands).

Unlike with `mul` / `imul` (where you should normally use the faster 2-operand `imul r32, r/m32` or 3-operand `imul r32, r/m32, imm8/32` forms, which don't waste time writing a high-half result), there is no newer opcode for division by an immediate, or for 32-bit / 32-bit => 32-bit division or remainder without the high-half dividend input.

Division is so slow and (hopefully) rare that they didn't bother to add a way to let you avoid EAX and EDX, or to use an immediate directly.
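A sketch of the usual workaround when the divisor is a constant but you still want a real `div`: keep the constant in a register or in memory. The label `ten` is a hypothetical name, and the dividend is assumed to be in EAX already.

```nasm
section .rodata
ten:    dd 10                 ; hypothetical label holding the constant divisor

section .text
    ; div dword 10            ; rejected by the assembler: no immediate form exists
    xor   edx, edx            ; zero the high half of the dividend
    div   dword [ten]         ; memory operand: EAX = quotient, EDX = remainder

    ; Separate illustration: multiplication does accept an immediate
    ; and does not touch EDX.
    imul  eax, edi, 10        ; EAX = EDI * 10 (low 32 bits only)
```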
`div` and `idiv` will fault if the quotient doesn't fit into one register (AL / AX / EAX / RAX, the same width as the divisor). This includes division by zero, but for unsigned `div` it also happens whenever the high half of the dividend (e.g. EDX) is greater than or equal to the divisor. This is why C compilers just zero-extend or sign-extend instead of splitting up a 32-bit value into DX:AX.
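If the high half of the dividend is not known to be zero, the fault can be avoided with a compare before the `div`. The following is a sketch only: `checked_div` is a hypothetical routine name, and reporting failure through the carry flag is a convention chosen for this example.

```nasm
; checked_div (hypothetical name): unsigned EDX:EAX / ECX with a pre-check.
; On success: CF clear, EAX = quotient, EDX = remainder.
; On failure: CF set, EAX and EDX unmodified (div would have raised #DE).
checked_div:
    cmp   edx, ecx
    jae   .overflow      ; EDX >= ECX: quotient needs more than 32 bits
                         ; (this also catches ECX == 0)
    div   ecx            ; safe: quotient is guaranteed to fit in EAX
    clc                  ; div leaves flags undefined, so set the result flag
    ret
.overflow:
    stc
    ret
```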
That overflow rule is also why `INT_MIN / -1` is C undefined behaviour: it overflows the signed quotient on 2's complement systems like x86. See "Why does integer division by -1 (negative one) result in FPE?" for an example of x86 vs. ARM. x86 `idiv` does indeed fault in this case.

The x86 exception is `#DE` (divide exception). On Unix/Linux systems, the kernel delivers a SIGFPE arithmetic exception signal to processes that cause a #DE exception. (See "On which platforms does integer divide by zero trigger a floating point exception?")

For `div`, using a dividend with `high_half < divisor` is safe, e.g. `0x11:23 / 0x12` is less than `0xff`, so it fits in an 8-bit quotient.

Extended-precision division of a huge number by a small number can be implemented by using the remainder from one chunk as the high-half dividend (EDX) for the next chunk. This is probably why they chose remainder=EDX quotient=EAX instead of the other way around.
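The chunked approach can be sketched for the smallest interesting case: a 64-bit dividend split into two 32-bit chunks. The routine name `div64by32` and its register interface are assumptions made for this example, not an established convention.

```nasm
; div64by32 (hypothetical name)
; In:  EDX:EAX = 64-bit unsigned dividend, ECX = 32-bit divisor (non-zero)
; Out: EDX:EAX = 64-bit quotient, EBX = remainder (EBX is clobbered)
div64by32:
    mov   ebx, eax       ; save the low chunk
    mov   eax, edx       ; start with the high chunk
    xor   edx, edx       ; zero-extend it: dividend is 0:high
    div   ecx            ; EAX = high quotient, EDX = high % divisor
    xchg  eax, ebx       ; EAX = low chunk, EBX = high quotient
    div   ecx            ; (remainder:low) / divisor, cannot fault: EDX < ECX
    xchg  edx, ebx       ; EDX = high quotient, EBX = final remainder
    ret
```

The second `div` needs no setup for EDX because the first one already left the remainder there, which is the convenience the paragraph above describes. Longer numbers repeat the same step once per chunk, from most significant to least significant.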