flat assembler
Message board for the users of flat assembler.
sq4²
I need to remove the sign from all scalars in an SSE register.
I searched on Google but can't find an SSE alternative to FABS. Does anyone have an idea?
|||
Tomasz Grysztar
Perhaps not very elegant:
Code:
    andps xmm0,dqword [mask]
with:
Code:
    align 16
    mask dd 4 dup 7FFFFFFFh
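If keeping a 16-byte constant in memory is inconvenient, the same mask can be built in a register. A minimal sketch, assuming SSE2 is available and that xmm1 can be used as scratch (both are assumptions, not stated in the thread). It relies on bit 31 being the sign bit of an IEEE 754 single:

```asm
; Sketch only: builds the 7FFFFFFFh mask in a register instead of loading it.
; Assumes SSE2 (pcmpeqd/psrld on XMM registers) and that xmm1 is free.
pcmpeqd xmm1, xmm1      ; xmm1 = FFFFFFFFh in each of the four dwords
psrld   xmm1, 1         ; logical shift right by 1 -> 7FFFFFFFh per dword
andps   xmm0, xmm1      ; clear bit 31 (the sign bit) of each single
```

The result should be the same as the andps with [mask], without needing any aligned data.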
|||
Tomasz Grysztar
Again for packed single (four 32-bit fp values):
Code:
    maxps xmm0,dqword [min]
    minps xmm0,dqword [max]
with:
Code:
    min dd -1.0,-1.0,-1.0,-1.0
    max dd 1.0,1.0,1.0,1.0
|||
sq4²
And now for the last one:
the final step is to multiply the four floats and store them as 16-bit integers at another memory location. I have this:
    !MOV edi, [v_L0]
    !MOV esi, [v_L1]
    !MOV ebx, [v_LD]
    !MOV ecx, 256
    !.While1:
    !movups xmm0,[edi]
    !movups xmm1,[esi]
    !addps xmm0,xmm1
    !mulps xmm0, 65000 ---------- ??
    !movups [ebx],xmm0 ---------- ??
    !ADD esi,16
    !ADD edi,16
    !ADD ebx,8
    !LOOP .While1
|||
Madis731
Shouldn't you first move the constants into XMM registers and then do the multiplication?
Code:
    MOV edi, [v_L0]
    MOV esi, [v_L1]
    MOV ebx, [v_LD]
    MOV ecx, 256
    .While1:
    movups xmm0,[edi]
    movups xmm1,[esi]
    addps xmm0,xmm1
    mulps xmm0, dqword[sixtyfiveT]
    ;Instead you could try moving 65000.0 and shuffling to other parts 4 times.
    movups [ebx],xmm0
    ADD esi,16
    ADD edi,16
    ADD ebx,8
    LOOP .While1

    v_L0 dd 0
    v_L1 dd 1
    v_LD dd ?
    sixtyfiveT dd 65000.0,65000.0,65000.0,65000.0
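The shuffling idea mentioned in the comment could look roughly like this. It is only a sketch: the label gain is hypothetical, and the broadcast is done once before the loop so that mulps can take a register operand:

```asm
; Sketch only: "gain" is a hypothetical label for one dword constant.
movss   xmm2, [gain]        ; lane 0 = 65000.0, lanes 1..3 = 0
shufps  xmm2, xmm2, 0       ; selector 0 copies lane 0 into all four lanes

; ... later, inside the loop:
mulps   xmm0, xmm2          ; multiply all four singles by the constant

; somewhere in the data area:
gain dd 65000.0
```

Because the constant is loaded with movss, it needs no 16-byte alignment, unlike a dqword memory operand of mulps.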
|||
sq4²
This routine works perfectly!
BUT: only sometimes. v_LD points to an ASIO buffer (allocated by the ASIO driver). It looks (or at least sounds) as if the buffer should be aligned to 4. The problem is that I have no control over this buffer. Could that be the cause?
Code:
    !movss xmm2,[v_Gain]
    !unpcklps xmm2,xmm2
    !movlhps xmm2,xmm2
    !MOV edi, [v_L0]
    !MOV esi, [v_L1]
    !MOV edx, [v_LD]
    !MOV ecx, 256
    !.While1:
    !movups xmm0,[edi]
    !movups xmm1,[esi]
    !addps xmm0,xmm1
    !mulps xmm0, xmm2
    !cvtps2pi mm0,xmm0
    !movhlps xmm0,xmm0
    !cvtps2pi mm1,xmm0
    !packssdw mm0,mm1
    !movq [edx],mm0
    !ADD esi,16
    !ADD edi,16
    !ADD edx,8
    !LOOP .While1
    !EMMS
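For comparison, a possible SSE2-only variant of the conversion and store. It is a sketch under the assumptions that SSE2 is available, that xmm0 already holds the four scaled singles, and that edx points to the destination. It avoids the MMX registers, so no EMMS is needed afterwards:

```asm
; Sketch only, assuming SSE2; replaces the cvtps2pi/packssdw/movq mm sequence.
cvtps2dq xmm0, xmm0     ; four singles -> four signed 32-bit integers (rounding per MXCSR)
packssdw xmm0, xmm0     ; saturate to signed 16-bit; low 64 bits hold the four words
movq     [edx], xmm0    ; store 8 bytes; this store has no alignment requirement
```

Like movups, a movq store to memory accepts any address.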
|||
Madis731
I think it should be aligned to 16, but I'm not sure that's possible if what you say is true - you have no control over it.
|||
MCD
Madis731 wrote:
I think it should be aligned to 16, but I'm not sure that's possible if what you say is true - you have no control over it.
Sounds like exactly the same problem I had with Delphi 6, which allows MMX/SSE in its inline assembler but can only align data on 1-, 2-, 4- and 8-byte boundaries, NOT on a 16-byte boundary. What a pain of a compiler.
_________________
MCD - the inevitable return of the Mad Computer Doggy
|||
sq4²
OK, let's assume it's an alignment problem. I could work with an extra buffer plus (WinAPI) CopyMemory. The question is: how do I allocate memory that is 16-byte aligned?
|||
Tomasz Grysztar
If you cannot trust memory allocation, you can allocate 15 bytes more than you need and choose the starting address to be the first aligned one inside the block.
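A minimal sketch of that approach, assuming the allocator has just returned the block address in eax and that the block was requested 15 bytes larger than needed; raw_ptr and aligned_ptr are hypothetical dword variables:

```asm
; Sketch only: eax = address of a block of (size + 15) bytes.
mov     [raw_ptr], eax      ; keep the original address for freeing the block later
add     eax, 15             ; round up: add 15 ...
and     eax, -16            ; ... then clear the lowest four bits
mov     [aligned_ptr], eax  ; first 16-byte aligned address inside the block
```

The aligned address lies between the original address and the original address plus 15, so the requested size still fits in the block.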
|||
sq4²
Tomasz Grysztar wrote:
If you cannot trust memory allocation, you can allocate 15 bytes more than you need and choose the starting address to be the first aligned one inside the block.
I know, but does 16-byte alignment mean that the starting address of a memory block has to be divisible by 16? Btw, should I use FXSAVE/FXRSTOR?
|||
Tomasz Grysztar
Yes - and this also means that the lowest four bits of the address have to be 0000.
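A quick way to check this at run time, as a sketch only; edx is assumed to hold the address, and the local labels are made up for illustration:

```asm
; Sketch only: choose the store instruction by testing the low four address bits.
test    edx, 15             ; is any of the lowest four bits set?
jnz     .unaligned          ; yes -> not on a 16-byte boundary
movaps  [edx], xmm0         ; aligned store is safe here
jmp     .done
.unaligned:
movups  [edx], xmm0         ; unaligned store works for any address
.done:
```

movups accepts any address, so the check only matters where movaps or a dqword memory operand of an instruction such as andps or mulps is used.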
|||
sq4²
One thing is certain: it's not an alignment problem.
Perhaps this is complete nonsense, but MMX/SSE is also used in another thread (that code resides in a DLL, in fact a VSTi, see Steinberg). Could it be that this interferes with my code? If so, will a critical section prevent this?
|||
sq4²
Guess what: my sound card is acting up. I tried another computer with the same type of sound card, and everything works fine.
Big thanks to you all, and especially to Tomasz for answering so fast.
Copyright © 1999-2020, Tomasz Grysztar. Also on GitHub, YouTube, Twitter.