A.5.5 ADDPS: ADD Packed Single-Precision FP Values
ADDPS xmm1,xmm2/mem128 ; 0F 58 /r [KATMAI,SSE]
ADDPS adds each of the four pairs of packed single-precision FP values,
storing each sum in the corresponding element of the destination:
    dst[0-31]   := dst[0-31]   + src[0-31],
    dst[32-63]  := dst[32-63]  + src[32-63],
    dst[64-95]  := dst[64-95]  + src[64-95],
    dst[96-127] := dst[96-127] + src[96-127].
The destination is an XMM register. The source operand can be either an
XMM register or a 128-bit memory location.
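
The lane-by-lane behaviour described above can be sketched in portable C.
This is only an illustrative model, not how the instruction is implemented:
the names xmm_model and addps_model are hypothetical, each float is assumed
to be a 32-bit IEEE single, and the sketch does not model the MXCSR rounding
control or exception flags that govern the real instruction.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a 128-bit XMM register: four single-precision
 * lanes, where lane[0] holds bits 0-31 and lane[3] holds bits 96-127. */
typedef struct {
    float lane[4];
} xmm_model;

/* Models "ADDPS dst,src": every lane of dst is replaced by the sum of
 * that lane and the matching lane of src. src is left unchanged. */
static void addps_model(xmm_model *dst, const xmm_model *src)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < 4; i++) {
        dst->lane[i] = dst->lane[i] + src->lane[i];
    }
}
```

For example, with dst holding {1.0, 2.0, 3.0, 4.0} and src holding
{10.0, 20.0, 30.0, 40.0}, calling addps_model(&dst, &src) leaves dst holding
{11.0, 22.0, 33.0, 44.0}. The four additions are independent, so no carry or
other state passes between lanes.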