flat assembler
Message board for the users of flat assembler.
SSE2 add 32 bits parts of register xmm1
baldr 14 Aug 2013, 22:09
Roman,
Did you mean something like a horizontal add? Use phaddd xmm1, xmm0 twice with xmm0 zeroed, and you're all set (it's SSSE3 though; please state the constraints clearly; your reference to pshufd implies SSE2 at least). Should those colors be added component-wise? Or are they colorful shades of grayscale? How about saturation? [EDIT] Oh, I'm sorry, I missed the SSE2 mentioned in the subject. [/EDIT]
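A minimal sketch of the phaddd suggestion above, assuming SSSE3 is available and that the four dwords to be summed are already in xmm1. Using eax as the destination is an illustrative choice, not something stated in the thread. Note that phaddd wraps on overflow; it does not saturate.

```asm
; Sketch only: horizontal sum of the four dwords in xmm1 (needs SSSE3).
; Lane comments read from high dword to low dword.
pxor    xmm0, xmm0      ; xmm0 = 0 0 0 0
phaddd  xmm1, xmm0      ; xmm1 = 0 0 (3+2) (1+0)
phaddd  xmm1, xmm0      ; xmm1 = 0 0 0 (3+2+1+0)
movd    eax, xmm1       ; eax = low dword = total, modulo 2^32
```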
asmdev 14 Aug 2013, 23:17
Code:
    movdqa  xmm0, [mem]         ; 3 2 1 0
    pshufd  xmm1, xmm0, 1110b   ; _ _ 3 2
    paddb   xmm1, xmm0          ; _ _ 3+1 2+0 (change to "paddd" if adding dwords)
    pshufd  xmm0, xmm1, 1       ; _ _ _ 3+1
    paddb   xmm0, xmm1          ; low dword = 3+2+1+0 (change to "paddd" if adding dwords)

Code:
    movdqa  xmm0, [mem]         ; 3 2 1 0
    movdqa  xmm2, [mem+16]
    pshufd  xmm1, xmm0, 1110b   ; _ _ 3 2
    pshufd  xmm3, xmm2, 1110b
    paddb   xmm1, xmm0          ; change to "paddd" if adding dwords
    paddb   xmm3, xmm2
    pshufd  xmm0, xmm1, 1
    pshufd  xmm2, xmm3, 1
    paddb   xmm0, xmm1          ; change to "paddd" if adding dwords
    paddb   xmm2, xmm3

You should consider 4 regular "add" instructions in a row if you are adding dwords and EITHER the memory is cached OR you are using movUps instead of movdqA.
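For comparison, here is a sketch of the dword case in both forms: the SSE2 shuffle-and-add sequence with paddd, and the plain scalar alternative. It assumes mem is a 16-byte aligned label holding four dwords. The choice of eax and edx as result registers is illustrative, and the scalar form is written as one load plus three adds, which is equivalent to four adds into a zeroed register.

```asm
; SSE2 variant with paddd; mem is an assumed 16-byte aligned label.
movdqa  xmm0, [mem]         ; 3 2 1 0
pshufd  xmm1, xmm0, 1110b   ; _ _ 3 2
paddd   xmm1, xmm0          ; _ _ 3+1 2+0
pshufd  xmm0, xmm1, 1       ; _ _ _ 3+1
paddd   xmm0, xmm1          ; low dword = 3+2+1+0
movd    eax, xmm0           ; eax = total, modulo 2^32

; Scalar variant: no alignment requirement on mem.
mov     edx, [mem]
add     edx, [mem+4]
add     edx, [mem+8]
add     edx, [mem+12]       ; edx = same total
```

Which one is faster depends on where the data already lives: if the values are in an XMM register the shuffle version avoids a round trip through memory, while for data sitting in cached memory the scalar adds may well be enough.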
Roman 15 Aug 2013, 13:32
baldr
Yes, exactly: add the four parts of register XMM1 together. It looks like phaddd xmm1, xmm1 is what I need. Thanks.
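A short sketch of the self-sourced form mentioned above, assuming SSSE3. A single phaddd xmm1, xmm1 only produces pairwise sums, so it has to be issued twice to get the full total; the total then ends up in every dword of xmm1 and no zeroed helper register is needed.

```asm
; Sketch only: phaddd with the same register as source and destination.
; Lane comments read from high dword to low dword.
phaddd  xmm1, xmm1      ; xmm1 = (3+2) (1+0) (3+2) (1+0)
phaddd  xmm1, xmm1      ; xmm1 = total in all four dwords
```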