Is there a faster way to store two 32-bit x86 registers in one 128-bit XMM register than this?
movd xmm0, edx
movd xmm1, eax
pshufd xmm0, xmm0, $1
por xmm0, xmm1
So if EAX is 0x12345678 and EDX is 0x87654321, the low 64 bits of xmm0 must be 0x8765432112345678.
With SSE4.1 you can use movd xmm0, eax / pinsrd xmm0, edx, 1 and do it in 2 instructions.
For older CPUs you can use 2 x movd and then punpckldq for a total of 3 instructions.
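As a rough sketch of the three-instruction variant, here is the same idea written with SSE2 intrinsics in C, so it can be compiled and checked. The helper names pack_two_u32 and low_qword are made up for illustration, and using xmm1 as the scratch register in the comments is an assumption. The compiler is free to pick different registers or instructions.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: put lo in bits 31:0 and hi in bits 63:32 of an
 * XMM register; the upper 64 bits end up zero. */
static __m128i pack_two_u32(uint32_t lo, uint32_t hi)
{
    __m128i a = _mm_cvtsi32_si128((int)lo);  /* movd xmm0, eax  (zero-extends) */
    __m128i b = _mm_cvtsi32_si128((int)hi);  /* movd xmm1, edx  (zero-extends) */
    return _mm_unpacklo_epi32(a, b);         /* punpckldq xmm0, xmm1 */
}

/* Hypothetical helper: copy the low 64 bits out so they can be inspected. */
static uint64_t low_qword(__m128i v)
{
    uint64_t out;
    _mm_storel_epi64((__m128i *)&out, v);    /* movq m64, xmm0 */
    return out;
}
```

Calling low_qword(pack_two_u32(0x12345678u, 0x87654321u)) gives 0x8765432112345678, matching the example in the question. On SSE4.1 targets, _mm_insert_epi32(a, (int)hi, 1) corresponds to the pinsrd form and replaces the second movd and the unpack.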