NASM 2.05 based x86 Instruction Reference[ch265]
A.5.241 PMULHW, PMULLW: Multiply Packed 16-bit Integers, and Store

PMULHW mm1,mm2/m64            ; 0F E5 /r             [PENT,MMX]
PMULLW mm1,mm2/m64            ; 0F D5 /r             [PENT,MMX]

PMULHW xmm1,xmm2/m128         ; 66 0F E5 /r     [WILLAMETTE,SSE2]
PMULLW xmm1,xmm2/m128         ; 66 0F D5 /r     [WILLAMETTE,SSE2]

PMULxW takes two packed signed 16-bit integer inputs and multiplies the corresponding words of the two inputs, forming a signed doubleword (32-bit) result for each pair.

- PMULHW then stores the top 16 bits of each doubleword in the corresponding word of the destination (first) operand;

- PMULLW stores the bottom 16 bits of each doubleword in the corresponding word of the destination operand.

The bottom 16 bits of the product are the same whether the inputs are treated as signed or unsigned, so the signedness only affects the result of PMULHW.
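The per-word behaviour can be modelled in plain C. The following is only a scalar sketch of a single 16-bit lane, not how the processor implements the instructions; the function names (sign_extend16, pmulhw_lane, pmullw_lane) are hypothetical and chosen for this illustration. Lanes are held as raw 16-bit patterns and sign-extended by hand, so that the model does not rely on implementation-defined integer conversions.

```c
#include <stdint.h>

/* Interpret a raw 16-bit pattern as a signed word and widen it to 32 bits. */
static int32_t sign_extend16(uint16_t v)
{
    return (int32_t)(v ^ 0x8000u) - 0x8000;
}

/* Signed 16x16 -> 32-bit product of one lane, returned as a raw bit pattern.
   The magnitude never exceeds 2^30, so the multiply cannot overflow. */
static uint32_t lane_product(uint16_t a, uint16_t b)
{
    return (uint32_t)(sign_extend16(a) * sign_extend16(b));
}

/* Model of one PMULHW lane: keep the top 16 bits of the doubleword. */
static uint16_t pmulhw_lane(uint16_t a, uint16_t b)
{
    return (uint16_t)(lane_product(a, b) >> 16);
}

/* Model of one PMULLW lane: keep the bottom 16 bits of the doubleword. */
static uint16_t pmullw_lane(uint16_t a, uint16_t b)
{
    return (uint16_t)(lane_product(a, b) & 0xFFFFu);
}
```

For example, with a = 0xFFFE (-2) and b = 0x0003, the doubleword product is 0xFFFFFFFA (-6), so the PMULHW lane is 0xFFFF and the PMULLW lane is 0xFFFA. The MMX forms apply this to four lanes at once, the SSE2 forms to eight.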