Is there any SSE instruction (up to version 4.2) that automatically fills four XMM registers with the value of the four words of another XMM register?
Example: running the instruction on the word ABCD would fill four XMM registers: AAAA, BBBB, CCCC, and DDDD.
I do not believe there is a single instruction that does this, but four pshufd operations (one for each destination register) will do the job; see http://lists.apple.com/archives/perfoptimization-dev/2007/Feb/msg00002.html (the first code example, after the movd instruction). pshufd operates on 32-bit elements, and the same instruction with a different immediate replicates each of the other parts of the register. I believe the constants to use in the instruction are 0, 85, 170, and 255 (0x00, 0x55, 0xAA, and 0xFF) for the four parts of the register.
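As a minimal sketch of the four-shuffle approach, here is how it might look in C with SSE2 intrinsics, where _mm_shuffle_epi32 is the intrinsic that compiles to pshufd. The helper name broadcast_lanes and the out array are made up for illustration, and the code assumes the "words" in the question are the four 32-bit elements of the register. Each immediate packs four 2-bit source indices, so 0x55 (binary 01 01 01 01) selects element 1 for every destination element, and so on.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; _mm_shuffle_epi32 maps to pshufd */
#include <stdint.h>

/* Hypothetical helper: replicate each 32-bit element of src across its own
 * output register. Element 0 is the lowest 32 bits of the register. */
static void broadcast_lanes(__m128i src, __m128i out[4])
{
    out[0] = _mm_shuffle_epi32(src, 0x00); /*   0: element 0 in all positions */
    out[1] = _mm_shuffle_epi32(src, 0x55); /*  85: element 1 in all positions */
    out[2] = _mm_shuffle_epi32(src, 0xAA); /* 170: element 2 in all positions */
    out[3] = _mm_shuffle_epi32(src, 0xFF); /* 255: element 3 in all positions */
}
```

The immediate has to be a compile-time constant, which is why the four shuffles are written out rather than generated in a loop over a variable.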