The Archive Base

Editorial Team
Asked: May 24, 2026

If I have an instruction buffer for x86 is there an easy way to check if an instruction is an SSE instruction without having to check if the opcode is within the ranges for the SSE instructions? By this I mean is there a common instruction prefix or processor state (such as a register) that can be checked?

1 Answer

  1. Editorial Team
    Added an answer on May 24, 2026 at 7:56 pm (updated)

    Depending on how you define "easy", the answer is either yes or no 🙂

    The instruction format is described in chapter 2 of the Intel 64 and IA-32 Architectures Software Developer's Manual, Combined Volumes 2A and 2B: Instruction Set Reference, A-Z. One of the problematic parts is the prefixes. Three of them (66, F2 and F3) act as mandatory prefixes that select a specific SSE instruction, while in front of other opcodes the same bytes mean operand-size override, REPNZ and REPZ respectively.

    To see how the prefixes are used to distinguish between different instructions, consider these 4 forms of adding two xmm registers together (output obtained with objdump -D -b binary -m i386:x86-64:intel --insn-width=12):

    0f 58 c0                                addps  xmm0,xmm0
    66 0f 58 c0                             addpd  xmm0,xmm0
    f3 0f 58 c0                             addss  xmm0,xmm0
    f2 0f 58 c0                             addsd  xmm0,xmm0
    

    As the listing shows, the form without a prefix (addps) adds packed single-precision values. 66 (normally the operand-size override prefix) selects the packed double-precision version (addpd), F3 (normally REPZ) selects the scalar single-precision version (addss), and finally F2 (normally REPNZ) selects the scalar double-precision version (addsd).

    Additionally, prefixes can sometimes be combined, and in 64-bit mode you also have to worry about the REX prefix (pg. 2-9). Here are different versions of roughly the same base instruction with different prefixes in 64-bit mode. I don't know if you care about AVX instructions, but I included some anyway as an example:

    0f 51 ca                                sqrtps xmm1,xmm2
    0f 51 0c 85 0a 00 00 00                 sqrtps xmm1,XMMWORD PTR [rax*4+0xa]
    65 0f 51 0c 85 0a 00 00 00              sqrtps xmm1,XMMWORD PTR gs:[rax*4+0xa]
    67 0f 51 0c 85 0a 00 00 00              sqrtps xmm1,XMMWORD PTR [eax*4+0xa]
    65 67 0f 51 0c 85 0a 00 00 00           sqrtps xmm1,XMMWORD PTR gs:[eax*4+0xa]
    f0 65 67 0f 51 0c 85 0a 00 00 00        lock sqrtps xmm1,XMMWORD PTR gs:[eax*4+0xa]
    c5 fd 51 ca                             vsqrtpd ymm1,ymm2
    c5 fc 51 0c 85 0a 00 00 00              vsqrtps ymm1,YMMWORD PTR [rax*4+0xa]
    65 c5 fc 51 0c 85 0a 00 00 00           vsqrtps ymm1,YMMWORD PTR gs:[rax*4+0xa]
    67 c5 fc 51 0c 85 0a 00 00 00           vsqrtps ymm1,YMMWORD PTR [eax*4+0xa]
    65 67 c5 fc 51 0c 85 0a 00 00 00        vsqrtps ymm1,YMMWORD PTR gs:[eax*4+0xa]
    f0 65 67 c5 fc 51 0c 85 0a 00 00 00     lock vsqrtps ymm1,YMMWORD PTR gs:[eax*4+0xa]
    

    So as far as I can see you will always have to loop over all prefixes to determine if an instruction is an SSE instruction.

    Update:
    An additional complication is the existence of instructions that only differ in their ModRM encoding. Consider:

    df 00 fild              WORD PTR [rax] # Non-SSE instruction: DF /0
    df 08 fisttp            WORD PTR [rax] # SSE instruction: DF /1
    

    To find these and all the other ways they can be encoded it’s easiest to use an opcode map.

    Because I’ve been meaning to look at writing a disassembler anyway I figured it would be a fun challenge to see what it takes. It should find most SSE instructions, though obviously I can’t and won’t guarantee it. I transformed the above opcode map into a series of tests that the code passes (tests.c – too big for inlining). The code tests a series of text strings containing hex digits of the opcode encoding (it stops parsing at the first non-hex digit, the last character in the string signifies whether it is an SSE instruction or not).

    It first scans all prefixes, then uses the opcode tables to test if the instruction matches with extra logic to handle the nested tables needed for multi-byte opcodes and the need to match digits in the following modrm byte.

    ssedetect.c:

    #include <stdio.h> 
    #include <stdlib.h>
    #include <string.h>
    #include <stdint.h>
    #include <ctype.h>
    
    #include "inst_table.h"
    
    enum { PREFIX_66=OP_66_SSE, PREFIX_F2=OP_F2_SSE, PREFIX_F3=OP_F3_SSE  };
    
    static int check_prefixes(int prefixes, int op_type) {
        if (op_type & OP_ALWAYS_SSE) return 1;
        if ((op_type & OP_66_SSE) && (prefixes & PREFIX_66)) return 1;
        if ((op_type & OP_F2_SSE) && (prefixes & PREFIX_F2)) return 1;
        if ((op_type & OP_F3_SSE) && (prefixes & PREFIX_F3)) return 1;
        return 0;
    }
    
    int isInstructionSSE(const uint8_t* code, int length)
    {
        int position = 0;
    
        // read prefixes
        int prefixes = 0;
        while (position < length) {
            uint8_t b = code[position];
    
            if (b == 0x66) {
                prefixes |= PREFIX_66;
                position++;
            } else if (b == 0xF2) { 
                prefixes |= PREFIX_F2;
                position++;
            } else if (b == 0xF3) { 
                prefixes |= PREFIX_F3; 
                position++;
            } else if (b >= 0x40 && b <= 0x4F) {
                // REX prefix (64-bit mode only; in 32-bit code 40-4F are INC/DEC)
                position++;
                break; // opcode must follow REX
            } else if (b == 0x2E || b == 0x3E || b == 0x26 || b == 0x36 || b == 0x64 || b == 0x65 || b == 0x67 || b == 0xF0) {
                // ignored prefix
                position++;    
            } else {
                break;
            }
        }
    
        // read opcode
        const uint16_t* op_table = op;
        int op_length = 0;
        while (position < length) {
            uint8_t b = code[position];
            uint16_t op_type = op_table[b];
            if (op_type & OP_EXTENDED) {
                op_length++;
                position++;
                // hackish
                if (op_length == 1 && b == 0x0F) op_table = op_0F;
                else if (op_length == 2 && b == 0x01) op_table = op_0F_01;
                else if (op_length == 2 && b == 0x38) op_table = op_0F_38;
                else if (op_length == 2 && b == 0x3A) op_table = op_0F_3A;
                else { printf("\n\n%2.2X\n",b); abort(); }
            } else if (op_type & OP_DIGIT) {
                break;
            } else {
                return check_prefixes(prefixes, op_type);
            }
        } 
    
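        // optionally read a digit

        // the opcode loop can also end by running out of bytes (e.g. a buffer
        // holding only prefixes, or a truncated 0F escape); nothing to match then
        if (position >= length) {
            return 0;
        }

        // find the /digit values that make this opcode an SSE instruction
        uint8_t match_digits = (op_table[code[position]] & OP_DIGIT_MASK) >> OP_DIGIT_SHIFT;

        // consume the opcode byte
        op_length++;
        position++;
        if (position >= length) {
            return 0;
        }

        uint8_t digit = (code[position]>>3)&7; // reg part of modrm

        return (match_digits & (1 << digit)) != 0;
    }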
    
    static int read_code(const char* str, uint8_t** code, int* length)
    {
        int size = 1000;
        *length = 0;
        *code = malloc(size);
        if (!*code) {
            printf("out of memory\n");
            return 0;
        }
    
        while (*str) {
            char* endptr;
            unsigned long val = strtoul(str, &endptr, 16);
            if (str == endptr) {
                break;
            } 
    
            if (val > 255) {
                printf("%lX is out of range\n", val);
                goto error;
            }
    
            (*code)[*length] = (uint8_t)val;
    
            if (++*length >= size) {
                printf("needs resize, not implemented\n");
                goto error;
            }
    
            str = endptr;
        }
    
        if (*length == 0) {
            printf("No instruction bytes found\n");
            goto error;
        }
    
        return 1;
    
    error:
        free(*code);
        return 0;
    }
    
    static void test(const char* str)
    {
        uint8_t* code;
        int length;
        if (!read_code(str, &code, &length)) {
            puts(str);
            exit(1);
        }
        char is_sse = isInstructionSSE(code, length) ? 'Y' : 'N';
        char should_be_sse = str[strlen(str)-1];
        free(code);
        if (should_be_sse != is_sse) {
            printf("(%c) %c %s\n", should_be_sse, is_sse, str);
            exit(1);
        }
    }
    
    int main(void)
    {
    #include "tests.c"
        test("48 ba 39 00 00 00 00 00 00 00           # movabs rdx,0x39 N");
        test("48 b8 00 00 00 00 00 00 00 00           # movabs rax,0x0 N");
        test("48 b9 14 00 00 00 00 00 00 00           # movabs rcx,0x14 N");
        test("48 6b c0 0a                             # imul   rax,rax,0xa N");
        test("48 83 ea 30                             # sub    rdx,0x30 N");
        test("48 01 d0                                # add    rax,rdx N");
        test("48 ff c9                                # dec    rcx N");
        test("75 f0                                   # jne    0x1e N");
        test("0f 51 ca                                # sqrtps xmm1,xmm2 Y");
        test("0f 51 0c 85 0a 00 00 00                 # sqrtps xmm1,XMMWORD PTR [rax*4+0xa] Y");
        test("65 0f 51 0c 85 0a 00 00 00              # sqrtps xmm1,XMMWORD PTR gs:[rax*4+0xa] Y");
        test("67 0f 51 0c 85 0a 00 00 00              # sqrtps xmm1,XMMWORD PTR [eax*4+0xa] Y");
        test("65 67 0f 51 0c 85 0a 00 00 00           # sqrtps xmm1,XMMWORD PTR gs:[eax*4+0xa] Y");
        test("f0 65 67 0f 51 0c 85 0a 00 00 00        # lock sqrtps xmm1,XMMWORD PTR gs:[eax*4+0xa] Y");
        test("f0 65 67 f3 43 0f 5c 8c 81 2a 2a 00 00  # lock subss xmm1, [gs:r8d*4+r9d+0x2A2A] Y");
        test("0f 58 c0                                # addps  xmm0,xmm0 Y");
        test("66 0f 58 c0                             # addpd  xmm0,xmm0 Y");
        test("f3 0f 58 c0                             # addss  xmm0,xmm0 Y");
        test("f2 0f 58 c0                             # addsd  xmm0,xmm0 Y");
        test("df 04 25 2c 00 00 00                    # fild   WORD PTR ds:0x2c N");
        test("df 0c 25 2c 00 00 00                    # fisttp WORD PTR ds:0x2c Y");
        test("67 0f ae 10                             # ldmxcsr DWORD PTR [eax] Y");
        test("67 0f ae 18                             # stmxcsr DWORD PTR [eax] Y");
        test("0f ae 00                                # fxsave [rax] N");
        test("0f ae e8                                # lfence Y");
        test("0f ae f0                                # mfence Y");
        test("0f ae f8                                # sfence Y");
        test("67 0f ae 38                             # clflush BYTE PTR [eax] Y");
        test("67 0f 18 00                             # prefetchnta BYTE PTR [eax] Y");
        test("0f 18 0b                                # prefetcht0 BYTE PTR [rbx] Y");
        test("67 0f 18 11                             # prefetcht1 BYTE PTR [ecx] Y");
        test("0f 18 1a                                # prefetcht2 BYTE PTR [rdx] Y");
        test("df 08                                   # fisttp WORD PTR [rax] Y");
        test("df 00                                   # fild   WORD PTR [rax] N");
    
        printf("All tests passed\n");
        return 0;
    }
    

    inst_table.h:

    // Table Element format:
    // Bit: 0 SSE instruction if 66 prefix
    //      1 SSE instruction if F2 prefix
    //      2 SSE instruction if F3 prefix
    //      3 Extended table
    //      4 Instruction is always SSE
    //      5 SSE instruction if ModRM byte matches digit(s) 
    //      6 -----
    //      7 -----
    //      8 SSE if ModRM has reg = 0
    //      9 SSE if ModRM has reg = 1
    //      .
    //      . That is, it matches instructions of the form XX XX /digit
    //      .
    //      15 SSE if modRM has reg = 7 
    
    #define OP_66_SSE      0x0001 // SSE if 66 prefix
    #define OP_F2_SSE      0x0002 // SSE if F2 prefix
    #define OP_F3_SSE      0x0004 // SSE if F3 prefix
    #define OP_EXTENDED    0x0008 // continue with extended table
    #define OP_ALWAYS_SSE  0x0010
    #define OP_DIGIT       0x0020
    
    #define OP_DIGIT_MASK  0xFF00
    #define OP_DIGIT_SHIFT      8
    #define OP_MATCH_DIGIT(d) (OP_DIGIT | (1 << ((d) + OP_DIGIT_SHIFT)))
    
    static const uint16_t op[256] = {
        [0x0F] = OP_EXTENDED,
        [0x90] = OP_F3_SSE,
        [0xDB] = OP_MATCH_DIGIT(1), // DB /1: FISTTP
        [0xDD] = OP_MATCH_DIGIT(1), 
        [0xDF] = OP_MATCH_DIGIT(1),
    };
    
    static const uint16_t op_0F[256] = {
        [0x01] = OP_EXTENDED, 
        [0x10] = OP_ALWAYS_SSE, // 0F 10 MOVUPS, F3 0F 10 MOVSS ... 
        [0x11] = OP_ALWAYS_SSE,
        [0x12] = OP_ALWAYS_SSE,
        [0x13] = OP_ALWAYS_SSE,
        [0x14] = OP_ALWAYS_SSE,
        [0x15] = OP_ALWAYS_SSE,
        [0x16] = OP_ALWAYS_SSE,
        [0x17] = OP_ALWAYS_SSE,
        [0x18] = OP_MATCH_DIGIT(0)|OP_MATCH_DIGIT(1)|OP_MATCH_DIGIT(2)|OP_MATCH_DIGIT(3),
        [0x28] = OP_ALWAYS_SSE,
        [0x29] = OP_ALWAYS_SSE,
        [0x2A] = OP_ALWAYS_SSE,
        [0x2B] = OP_ALWAYS_SSE,
        [0x2C] = OP_ALWAYS_SSE,
        [0x2D] = OP_ALWAYS_SSE,
        [0x2E] = OP_ALWAYS_SSE,
        [0x2F] = OP_ALWAYS_SSE,
        [0x38] = OP_EXTENDED,
        [0x3A] = OP_EXTENDED,
        [0x50] = OP_ALWAYS_SSE,
        [0x51] = OP_ALWAYS_SSE,
        [0x52] = OP_ALWAYS_SSE,
        [0x53] = OP_ALWAYS_SSE,
        [0x54] = OP_ALWAYS_SSE,
        [0x55] = OP_ALWAYS_SSE,
        [0x56] = OP_ALWAYS_SSE,
        [0x57] = OP_ALWAYS_SSE,
        [0x58] = OP_ALWAYS_SSE,
        [0x59] = OP_ALWAYS_SSE,
        [0x5A] = OP_ALWAYS_SSE,
        [0x5B] = OP_ALWAYS_SSE,
        [0x5C] = OP_ALWAYS_SSE,
        [0x5D] = OP_ALWAYS_SSE,
        [0x5E] = OP_ALWAYS_SSE,
        [0x5F] = OP_ALWAYS_SSE,
        [0x60] = OP_66_SSE,
        [0x61] = OP_66_SSE,
        [0x62] = OP_66_SSE,
        [0x63] = OP_66_SSE,
        [0x64] = OP_66_SSE,
        [0x65] = OP_66_SSE,
        [0x66] = OP_66_SSE,
        [0x67] = OP_66_SSE,
        [0x68] = OP_66_SSE,
        [0x69] = OP_66_SSE,
        [0x6A] = OP_66_SSE,
        [0x6B] = OP_66_SSE,
        [0x6C] = OP_66_SSE,
        [0x6D] = OP_66_SSE,
        [0x6E] = OP_66_SSE,
        [0x6F] = OP_66_SSE | OP_F3_SSE,
        [0x70] = OP_ALWAYS_SSE,
        [0x71] = OP_66_SSE,
        [0x72] = OP_66_SSE,
        [0x73] = OP_66_SSE,
        [0x74] = OP_66_SSE,
        [0x75] = OP_66_SSE,
        [0x76] = OP_66_SSE,
        [0x77] = OP_66_SSE,
        [0x78] = OP_66_SSE,
        [0x79] = OP_66_SSE,
        [0x7A] = OP_66_SSE,
        [0x7B] = OP_66_SSE,
        [0x7C] = OP_66_SSE | OP_F2_SSE,
        [0x7D] = OP_66_SSE | OP_F2_SSE,
        [0x7E] = OP_66_SSE | OP_F3_SSE,
        [0x7F] = OP_66_SSE | OP_F3_SSE,
        [0xAE] = OP_MATCH_DIGIT(2)|OP_MATCH_DIGIT(3)|OP_MATCH_DIGIT(5)|OP_MATCH_DIGIT(6)|OP_MATCH_DIGIT(7),
        [0xC2] = OP_ALWAYS_SSE,
        [0xC3] = OP_ALWAYS_SSE,
        [0xC4] = OP_ALWAYS_SSE,
        [0xC5] = OP_ALWAYS_SSE,
        [0xC6] = OP_ALWAYS_SSE,
        [0xD0] = OP_66_SSE | OP_F2_SSE,
        [0xD1] = OP_66_SSE,
        [0xD2] = OP_66_SSE,
        [0xD3] = OP_66_SSE,
        [0xD4] = OP_ALWAYS_SSE,
        [0xD5] = OP_66_SSE,
        [0xD6] = OP_66_SSE | OP_F2_SSE | OP_F3_SSE,
        [0xD7] = OP_ALWAYS_SSE,
        [0xD8] = OP_66_SSE,
        [0xD9] = OP_66_SSE,
        [0xDA] = OP_ALWAYS_SSE,
        [0xDB] = OP_66_SSE,
        [0xDC] = OP_66_SSE,
        [0xDD] = OP_66_SSE,
        [0xDE] = OP_ALWAYS_SSE,
        [0xDF] = OP_66_SSE,
        [0xE0] = OP_ALWAYS_SSE,
        [0xE1] = OP_66_SSE,
        [0xE2] = OP_66_SSE,
        [0xE3] = OP_ALWAYS_SSE,
        [0xE4] = OP_ALWAYS_SSE,
        [0xE5] = OP_66_SSE,
        [0xE6] = OP_66_SSE | OP_F2_SSE | OP_F3_SSE,
        [0xE7] = OP_ALWAYS_SSE,
        [0xE8] = OP_66_SSE,
        [0xE9] = OP_66_SSE,
        [0xEA] = OP_ALWAYS_SSE,
        [0xEB] = OP_66_SSE,
        [0xEC] = OP_66_SSE,
        [0xED] = OP_66_SSE,
        [0xEE] = OP_ALWAYS_SSE,
        [0xEF] = OP_66_SSE,
        [0xF0] = OP_F2_SSE,
        [0xF1] = OP_66_SSE,
        [0xF2] = OP_66_SSE,
        [0xF3] = OP_66_SSE,
        [0xF4] = OP_ALWAYS_SSE,
        [0xF5] = OP_66_SSE,
        [0xF6] = OP_ALWAYS_SSE,
        [0xF7] = OP_ALWAYS_SSE,
        [0xF8] = OP_66_SSE,
        [0xF9] = OP_66_SSE,
        [0xFA] = OP_66_SSE,
        [0xFB] = OP_ALWAYS_SSE,
        [0xFC] = OP_66_SSE,
        [0xFD] = OP_66_SSE,
        [0xFE] = OP_66_SSE,
    };
    
    static const uint16_t op_0F_01[256] = {
        [0xC8] = OP_ALWAYS_SSE, // 0F 01 C8: MONITOR
        [0xC9] = OP_ALWAYS_SSE,
    };
    
    
    static const uint16_t op_0F_38[256] = {
        [0xF0] = OP_F2_SSE, // F2 0F 38 F0: CRC32
        [0xF1] = OP_F2_SSE,
    };
    
    static const uint16_t op_0F_3A[256] = {
        [0x08] = OP_66_SSE, // 66 0F 3A 08: ROUNDPS
        [0x09] = OP_66_SSE,
        [0x0A] = OP_66_SSE,
        [0x0B] = OP_66_SSE,
        [0x0C] = OP_66_SSE,
        [0x0D] = OP_66_SSE,
        [0x0E] = OP_66_SSE,
        [0x0F] = OP_ALWAYS_SSE,
        [0x14] = OP_66_SSE,
        [0x15] = OP_66_SSE,
        [0x16] = OP_66_SSE,
        [0x17] = OP_66_SSE,
        [0x20] = OP_66_SSE,
        [0x21] = OP_66_SSE,
        [0x22] = OP_66_SSE,
        [0x40] = OP_66_SSE,
        [0x41] = OP_66_SSE,
        [0x42] = OP_66_SSE,
        [0x60] = OP_66_SSE,
        [0x61] = OP_66_SSE,
        [0x62] = OP_66_SSE,
        [0x63] = OP_66_SSE,
    };
    