flat assembler
Message board for the users of flat assembler.
SSE alternative for FPU::FABS ?
sq4² 06 Aug 2005, 21:40
I need to remove the sign from all scalars in an SSE register.
I searched Google but can't find any alternative to FABS. Does anyone have an idea?
Tomasz Grysztar 06 Aug 2005, 22:10
Perhaps not very elegant:
Code:
    andps xmm0,dqword [mask]
with:
Code:
    align 16
    mask dd 4 dup 7FFFFFFFh
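For context, the mask works because an IEEE 754 single keeps its sign in bit 31, so clearing that bit in each of the four lanes gives the absolute value. If a memory constant is inconvenient, the same mask can be built in a register. A minimal sketch, assuming SSE2 is available (pcmpeqd and psrld on XMM registers are SSE2, not plain SSE) and that xmm1 is free as a scratch register:

```asm
; Assumptions: SSE2 is available; xmm1 is a scratch register.
pcmpeqd xmm1,xmm1       ; set every bit in all four 32-bit lanes
psrld   xmm1,1          ; logical shift right by one -> 7FFFFFFFh per lane
andps   xmm0,xmm1       ; clear the sign bit of all four floats in xmm0
```

This trades the aligned memory operand for two extra instructions; on a plain SSE machine the memory mask above remains the way to go.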
Tomasz Grysztar 06 Aug 2005, 23:05
Again for packed single (four 32-bit fp values):
Code:
    maxps xmm0,dqword [min]
    minps xmm0,dqword [max]
with:
Code:
    align 16    ; maxps/minps need 16-byte aligned memory operands
    min dd -1.0,-1.0,-1.0,-1.0
    max dd 1.0,1.0,1.0,1.0
sq4² 07 Aug 2005, 00:46
And now for the last one:
the last step is to store the four floats as 16-bit integers in another memory location after multiplying them. I have this:
Code:
    !MOV edi, [v_L0]
    !MOV esi, [v_L1]
    !MOV ebx, [v_LD]
    !MOV ecx, 256
    !.While1:
    !movups xmm0,[edi]
    !movups xmm1,[esi]
    !addps xmm0,xmm1
    !mulps xmm0, 65000      ; ---------- ??
    !movups [ebx],xmm0      ; ---------- ??
    !ADD esi,16
    !ADD edi,16
    !ADD ebx,8
    !LOOP .While1
Madis731 08 Aug 2005, 11:03
Shouldn't you first move the constants to XMM registers and do the multiplication after that?
Code:
    MOV edi, [v_L0]
    MOV esi, [v_L1]
    MOV ebx, [v_LD]
    MOV ecx, 256
    .While1:
    movups xmm0,[edi]
    movups xmm1,[esi]
    addps xmm0,xmm1
    mulps xmm0, dqword[sixtyfiveT]
    ;Instead you could try moving 65000.0 and shuffling to other parts 4 times.
    movups [ebx],xmm0
    ADD esi,16
    ADD edi,16
    ADD ebx,8
    LOOP .While1

    v_L0 dd 0
    v_L1 dd 1
    v_LD dd ?
    align 16    ; mulps needs a 16-byte aligned memory operand
    sixtyfiveT dd 65000.0,65000.0,65000.0,65000.0
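The remark in the comment (load 65000.0 once and shuffle it into the other lanes) could look roughly like this. The label `scale` is a hypothetical name, not from the thread. Since movss has no alignment requirement, the single constant needs no `align 16`:

```asm
; Hypothetical label: scale (one 32-bit float). xmm2 is assumed to be free.
; Before the loop:
movss   xmm2,[scale]    ; lane 0 = 65000.0, the upper three lanes are zeroed
shufps  xmm2,xmm2,0     ; copy lane 0 into all four lanes
; Inside the loop, in place of the memory operand:
mulps   xmm0,xmm2       ; register operand, so no alignment constraint

; Data, placed outside the code path:
scale dd 65000.0
```

Keeping the factor in a register also removes one memory read per iteration.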
sq4² 08 Aug 2005, 11:11
This routine works perfectly!
BUT: only sometimes. v_LD points to an ASIO buffer (allocated by the ASIO driver). It looks (at least it sounds) like the buffer should be aligned to 4. The problem is that I have no control over this buffer. Could that be the cause?
Code:
    !movss xmm2,[v_Gain]
    !unpcklps xmm2,xmm2
    !movlhps xmm2,xmm2
    !MOV edi, [v_L0]
    !MOV esi, [v_L1]
    !MOV edx, [v_LD]
    !MOV ecx, 256
    !.While1:
    !movups xmm0,[edi]
    !movups xmm1,[esi]
    !addps xmm0,xmm1
    !mulps xmm0, xmm2
    !cvtps2pi mm0,xmm0
    !movhlps xmm0,xmm0
    !cvtps2pi mm1,xmm0
    !packssdw mm0,mm1
    !movq [edx],mm0
    !ADD esi,16
    !ADD edi,16
    !ADD edx,8
    !LOOP .While1
    !EMMS
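If SSE2 can be assumed (an assumption; the thread itself only uses SSE and MMX), the conversion can stay in XMM registers, which avoids the MMX registers and the closing EMMS. A sketch of the loop body only, with the same register roles as in the code above:

```asm
; Assumption: SSE2 is available. xmm2 holds the gain in all four lanes.
movups   xmm0,[edi]         ; four floats from the first source
movups   xmm1,[esi]         ; four floats from the second source
addps    xmm0,xmm1
mulps    xmm0,xmm2
cvtps2dq xmm0,xmm0          ; four floats -> four signed 32-bit integers
packssdw xmm0,xmm0          ; saturate to 16 bits; low 64 bits hold the four results
movq     qword [edx],xmm0   ; store 8 bytes; no alignment requirement
```

In both versions packssdw saturates to the range -32768..32767, so scaled samples outside that range are clipped rather than wrapped.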
Madis731 08 Aug 2005, 12:49
I think it should be aligned to 16, but I'm not sure that's possible if what you say is true and you have no control over it.
|
MCD 08 Aug 2005, 14:06
Madis731 wrote: I think it should be aligned to 16, but I'm not sure that's possible if what you say is true and you have no control over it.
Sounds like exactly the same problem I had with Delphi 6, which allows MMX/SSE in its inline assembler but only allows aligning data on 1-, 2-, 4- and 8-byte boundaries, NOT on a 16-byte boundary. What a pain of a compiler.
sq4² 08 Aug 2005, 16:15
OK, let's assume it's an alignment problem. I could work with an extra buffer plus (WinAPI) CopyMemory. The question is: how do I allocate memory that is 16-byte aligned?
Tomasz Grysztar 08 Aug 2005, 16:19
If you cannot trust memory allocation, you can allocate 15 bytes more than you need and choose the starting address to be the first aligned one inside the block.
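The adjustment described here can be sketched as follows. `raw` and `SIZE` are hypothetical names, not from the thread; the same two instructions apply to a pointer returned by any allocator:

```asm
; Hypothetical names: SIZE (bytes actually needed), raw (over-allocated block).
SIZE = 4096

lea  eax,[raw+15]       ; the furthest the aligned start could be pushed
and  eax,-16            ; clear the low four bits -> next multiple of 16
; eax now points to SIZE usable, 16-byte aligned bytes inside raw

; Data, placed outside the code path:
raw rb SIZE+15          ; 15 spare bytes cover the worst-case misalignment
```

If the block comes from a heap allocator instead, keep the original pointer around, since that is the one that has to be passed back when freeing.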
|
sq4² 08 Aug 2005, 16:22
Tomasz Grysztar wrote: If you cannot trust memory allocation, you can allocate 15 bytes more than you need and choose the starting address to be the first aligned one inside the block.
I know, but does 16-byte alignment mean that the starting address of a memory block has to be divisible by 16? By the way, should I use FXSAVE/FXRSTOR?
Tomasz Grysztar 08 Aug 2005, 16:24
Yes - and this also means that the lowest four bits of the address have to be 0000.
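That property makes a run-time check cheap. A small sketch, assuming the pointer is in edx and the data in xmm0; the labels `store4`, `.unaligned` and `.done` are hypothetical:

```asm
; Hypothetical labels; edx = destination pointer, xmm0 = data to store.
store4:
        test    edx,15          ; is any of the lowest four bits set?
        jnz     .unaligned      ; yes: the address is not a multiple of 16
        movaps  [edx],xmm0      ; aligned store is safe
        jmp     .done
.unaligned:
        movups  [edx],xmm0      ; unaligned store works for any address
.done:
```

movaps faults on an address that is not 16-byte aligned, while movups accepts any address, which is why the check only matters when the aligned form is wanted.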
|
sq4² 08 Aug 2005, 16:45
One thing is sure: it's not an alignment problem.
Perhaps this is complete nonsense, but MMX/SSE is also used in another thread (this code resides in a DLL; in fact it's a VSTi, see Steinberg). Could that interfere with my code? If so, will a critical section prevent this?
sq4² 08 Aug 2005, 20:00
Guess what: my sound card is acting up. I tried another computer with the same type of sound card, and everything works fine.
Big thanks to you all, and especially Tomasz for answering so fast.
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.