flat assembler
Message board for the users of flat assembler.

Parallel hypot function

QQ2976501934 17 Sep 2021, 04:06
Input: xmm0 = x, xmm1 = y (packed singles for hypot_ps, packed doubles for hypot_pd)
Output: xmm0 = hypot(x, y), computed per lane
Code:
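hypot_ps:
        push 0x1F80             ; [rsp] = default MXCSR (exceptions masked, round to nearest)
        ; Save MXCSR
        stmxcsr [rsp+4]         ; caller's MXCSR goes in the upper half of the pushed qword
        ; Reset MXCSR
        ldmxcsr [rsp]
        mov eax, 1.0
        pcmpeqd xmm2, xmm2      ; all ones
        psrld xmm2, 1           ; 0x7FFFFFFF per lane: mask that clears the sign bit
        movd xmm3, eax
        andps xmm0, xmm2        ; |x|
        andps xmm1, xmm2        ; |y|
        movaps xmm2, xmm0
        maxps xmm0, xmm1        ; xmm0 = max(|x|, |y|)
        minps xmm1, xmm2        ; xmm1 = min(|x|, |y|)
        pxor xmm2, xmm2
        pshufd xmm3, xmm3, 0    ; broadcast 1.0 to all four lanes
        cmpneqps xmm2, xmm0     ; lane mask: max != 0
        divps xmm1, xmm0        ; r = min / max
        pand xmm1, xmm2         ; force r = 0 where max = 0 (avoids 0/0 = NaN)
        mulps xmm1, xmm1        ; r^2
        addps xmm1, xmm3        ; 1 + r^2
        sqrtps xmm1, xmm1
        mulps xmm0, xmm1        ; max * sqrt(1 + r^2)
        ; Restore MXCSR
        ldmxcsr [rsp+4]
        pop rax                 ; release the scratch slot
        ret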

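hypot_pd:
        push 0x1F80             ; [rsp] = default MXCSR (exceptions masked, round to nearest)
        ; Save MXCSR
        stmxcsr [rsp+4]         ; caller's MXCSR goes in the upper half of the pushed qword
        ; Reset MXCSR
        ldmxcsr [rsp]
        mov rax, 1.0
        pcmpeqd xmm2, xmm2      ; all ones
        psrlq xmm2, 1           ; 0x7FFFFFFFFFFFFFFF per lane: mask that clears the sign bit
        movq xmm3, rax
        andpd xmm0, xmm2        ; |x|
        andpd xmm1, xmm2        ; |y|
        movapd xmm2, xmm0
        maxpd xmm0, xmm1        ; xmm0 = max(|x|, |y|)
        minpd xmm1, xmm2        ; xmm1 = min(|x|, |y|)
        pxor xmm2, xmm2
        pshufd xmm3, xmm3, 0x44 ; broadcast 1.0 to both lanes
        cmpneqpd xmm2, xmm0     ; lane mask: max != 0
        divpd xmm1, xmm0        ; r = min / max
        pand xmm1, xmm2         ; force r = 0 where max = 0 (avoids 0/0 = NaN)
        mulpd xmm1, xmm1        ; r^2
        addpd xmm1, xmm3        ; 1 + r^2
        sqrtpd xmm1, xmm1
        mulpd xmm0, xmm1        ; max * sqrt(1 + r^2)
        ; Restore MXCSR
        ldmxcsr [rsp+4]
        pop rax                 ; release the scratch slot
        ret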
    


Last edited by QQ2976501934 on 17 Sep 2021, 04:49; edited 2 times in total
revolution 17 Sep 2021, 04:16
QQ2976501934 wrote:
Code:
        mov eax, 0x3F800000    
You can use float and double constants directly in fasm.
Code:
  mov eax, 1.0    ; same as 0x3F800000 but more obvious    
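
The equivalence between 1.0 and 0x3F800000 can be checked outside the assembler. The C sketch below reads back the raw bit patterns, assuming the platform uses IEEE 754 binary32 and binary64 for float and double. The helper names float_bits and double_bits are invented for this example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: return the raw bit pattern of a float or double.
   memcpy is used to avoid aliasing problems. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

static uint64_t double_bits(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}
```

Under that assumption float_bits(1.0f) is 0x3F800000, the value loaded into eax in hypot_ps, and double_bits(1.0) is 0x3FF0000000000000, the value loaded into rax in hypot_pd.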
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.