flat assembler
Message board for the users of flat assembler.
Windows > SSE question(s)
ejamesr 26 May 2014, 20:01
I'm not sure exactly how you determine which values to change. But you might consider the very fast pshufb instruction (SSSE3), which can easily set any byte of an xmm register to 0. In the floating-point format, the value 0.0 is represented by all zero bits. So you could use pshufb to very quickly clear the low eight bytes of the xmm register.
For example, assume the following clears the low eight bytes of xmm0:
Code:
    align 16
    Mask db -1,-1,-1,-1,-1,-1,-1,-1,8,9,10,11,12,13,14,15
    ... other data/code ...
When you want to clear the register using Mask, do it like this:
    pshufb xmm0, dqword [Mask]
The variable Mask has a value for each byte, corresponding to each byte of the xmm register. The low four bits say to move the byte from that byte offset of the register into that position. If the high bit is set, no byte will be copied, but that destination value will be cleared to 0. And of course, you would want to put your Mask
- ejamesr
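The per-byte rule described in the post can be sketched as a plain C model. This is a scalar illustration of what pshufb does to each byte, not the instruction itself and not code from the post; the names `pshufb_model` and `CLEAR_LOW8_MASK` are made up for the example, and only the mask bytes are taken from the post.

```c
#include <stdint.h>
#include <string.h>

/* Same byte values as Mask in the post: -1 is 0xFF, so bit 7 is set. */
static const uint8_t CLEAR_LOW8_MASK[16] = {
    0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
    8, 9, 10, 11, 12, 13, 14, 15
};

/* Hypothetical scalar model of "pshufb xmm, m128".
 * reg plays the role of the xmm register (source and destination). */
static void pshufb_model(uint8_t reg[16], const uint8_t mask[16])
{
    uint8_t src[16];
    memcpy(src, reg, sizeof src);          /* every byte reads the old value */
    for (int i = 0; i < 16; i++) {
        if (mask[i] & 0x80)
            reg[i] = 0;                    /* high bit set: byte is cleared */
        else
            reg[i] = src[mask[i] & 0x0F];  /* low four bits: source offset */
    }
}
```

Calling `pshufb_model(reg, CLEAR_LOW8_MASK)` zeroes bytes 0 to 7 and leaves bytes 8 to 15 where they were, because each of those mask bytes names its own position.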
BAiC 27 May 2014, 04:11
1) generate 4 floats that are all equal to 1.0 (store in an xmm register)
2) use the CMPPS instruction (described in the manuals) to compare your vector with the register from (1). The first source register is also the destination, so you'll need to preload a copy into a register first, which makes this code sequence a little messy. The destination will hold a vector mask.
3) 'not' the mask (you might be able to fold the not into 'andnps' or 'pandn', which invert their destination operand before the 'and').
4) 'and' the mask with the original vector.
- Stefan
byte me.
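The four steps can be sketched lane by lane in plain C. This is a scalar model, not SSE code, and it rests on an assumption the post does not state: that the CMPPS predicate is "equal", so lanes equal to 1.0 end up as 0.0 and all other lanes are kept. The function name `clear_lanes_equal_to_one` is invented for the example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of steps 1 to 4 for a 4-float vector.
 * Assumption: the comparison is "equal to 1.0" (cmpeqps). */
static void clear_lanes_equal_to_one(float v[4])
{
    for (int i = 0; i < 4; i++) {
        uint32_t bits;
        memcpy(&bits, &v[i], sizeof bits);   /* raw bits of the lane */

        /* Steps 1 and 2: compare with 1.0; all ones if true, else zero. */
        uint32_t mask = (v[i] == 1.0f) ? 0xFFFFFFFFu : 0u;

        /* Steps 3 and 4 folded together, as andnps does: (NOT mask) AND bits. */
        bits = ~mask & bits;

        memcpy(&v[i], &bits, sizeof bits);   /* write the lane back */
    }
}
```

With the input `{1.0f, 2.5f, -3.0f, 1.0f}` the first and last lanes become 0.0 and the middle two are unchanged. A different predicate (less than, not equal, and so on) only changes the line that builds `mask`.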
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.