flat assembler
Message board for the users of flat assembler.
Big Red 12 Dec 2006, 05:18
You need to go over:
-> Dot Product
-> Vector Subtract
-> Vector Length

In each of them you load values onto the stack without ever popping them. Use faddp/fstp. Example:

Code:
    dot_product:
        fld     dword [eax]
        fmul    dword [ebx]
        fld     dword [eax+$4]
        fmul    dword [ebx+$4]
        fld     dword [eax+$8]
        fmul    dword [ebx+$8]
        faddp   st1, st0
        faddp   st1, st0
        ; returns in st0
        ret
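The same balancing rule can be sketched for the other two routines. This is only an illustration, not the original poster's code: the labels `vector_subtract` and `vector_length` and the register convention (eax and ebx point to three packed dwords x, y, z, and ecx points to the output vector) are assumptions chosen to match the dot_product example.

```asm
; Sketch only. Assumed convention: eax -> a, ebx -> b, ecx -> result,
; each vector being three packed single-precision floats.

vector_subtract:                    ; result = a - b
        fld     dword [eax]         ; push a.x
        fsub    dword [ebx]         ; st0 = a.x - b.x
        fstp    dword [ecx]         ; store and pop
        fld     dword [eax+$4]      ; push a.y
        fsub    dword [ebx+$4]      ; st0 = a.y - b.y
        fstp    dword [ecx+$4]      ; store and pop
        fld     dword [eax+$8]      ; push a.z
        fsub    dword [ebx+$8]      ; st0 = a.z - b.z
        fstp    dword [ecx+$8]      ; store and pop
        ret                         ; three pushes, three pops: stack unchanged

vector_length:                      ; returns |a| in st0
        fld     dword [eax]         ; st0 = x
        fmul    st0, st0            ; st0 = x*x
        fld     dword [eax+$4]      ; st0 = y, st1 = x*x
        fmul    st0, st0            ; st0 = y*y
        faddp   st1, st0            ; st0 = x*x + y*y (one pop)
        fld     dword [eax+$8]      ; st0 = z, st1 = x*x + y*y
        fmul    st0, st0            ; st0 = z*z
        faddp   st1, st0            ; st0 = x*x + y*y + z*z (one pop)
        fsqrt                       ; st0 = length
        ret                         ; exactly one value left: the result
```

A quick way to check a routine like this is to count pushes (fld) against pops (fstp, faddp): a routine that returns in st0 should end one push ahead, and one that returns through memory should end even.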
RedGhost 12 Dec 2006, 06:38
Doh, I wasn't even considering keeping the FPU stack balanced.
_________________
redghost.ca
Big Red 12 Dec 2006, 09:12
Yeah, I can't stand it. At some point I had written a sort of partial reverse-compiler to generate equations from pure asm FPU code and to detect stack overflow/underflow errors before compilation, but I was making so many FPU stack errors in the process that I figured it wasn't worth it and gave up.
Anyway, here's a bunch of random 3D procs, some along the lines of what you have there. They're missing definitions, but you get the idea; feel free to copy and paste if there's anything interesting. There's also a 3DNow! version of some of them.
RedGhost 12 Dec 2006, 19:18
Thanks for the examples, but I don't think I'll venture into MMX/3DNow! just yet. I will update the first post with what (hopefully) doesn't trash the FPU stack.
I don't know if it's an error with the forums or a mistake, but you seem to have uploaded the same file twice (both download as the 3DNow! version, and the files are identical).
_________________
redghost.ca
r22 13 Dec 2006, 03:37
Most x86 processors (built this century) have the SSE and SSE2 instruction sets.
Doing math with SSE instructions (the XMM registers) is much better:
- Faster execution
- No annoying FPU stack
- Clearer code
- Using iterative approximations for single-precision SIN/COS/TAN functions can also be faster than the FPU opcodes.

Simple example:

Code:
    ;;; C = AX + BX
    ;;; C = X(A+B)
    .data
    X dd -9.0
    A dd 20.0
    B dd 5.0
    C dd 0.0
    .code
        call    Foo
        ret     0
    Foo:
        movss   xmm1, dword [A]
        movss   xmm0, dword [X]
        addss   xmm1, dword [B]
        mulss   xmm0, xmm1
        movss   [C], xmm0
        ret     0
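To show what "no FPU stack" looks like for the thread's own problem, here is a rough scalar SSE version of a 3-component dot product. It is a sketch under assumptions, not code from any poster: the label `dot_product_sse`, the eax/ebx pointer convention (borrowed from the earlier FPU example) and returning the result in xmm0 are all illustrative choices.

```asm
; Sketch only. Assumed convention: eax -> a, ebx -> b (three packed
; single-precision floats each); result is returned in the low dword of xmm0.
; Scalar loads are used so nothing is read past the end of a 12-byte vector.

dot_product_sse:
        movss   xmm0, dword [eax]       ; xmm0 = a.x
        mulss   xmm0, dword [ebx]       ; xmm0 = a.x * b.x
        movss   xmm1, dword [eax+$4]    ; xmm1 = a.y
        mulss   xmm1, dword [ebx+$4]    ; xmm1 = a.y * b.y
        addss   xmm0, xmm1              ; xmm0 = a.x*b.x + a.y*b.y
        movss   xmm1, dword [eax+$8]    ; xmm1 = a.z
        mulss   xmm1, dword [ebx+$8]    ; xmm1 = a.z * b.z
        addss   xmm0, xmm1              ; xmm0 = full dot product
        ret
```

Because XMM registers are addressed by name rather than by stack position, there is nothing to balance: a register left holding a stale value does no harm to the caller beyond being overwritten.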
RedGhost 13 Dec 2006, 19:07
I got a friend of mine to give me the algorithms for computing sine/cosine of a single number to the 6th decimal place, but this type of math is beyond my current knowledge, so fsincos or fsin/fcos is a must. I've updated the main post and everything seems to be working, so I guess it's just optimization time? (Unless I am still trashing the stack in angle_vectors.)
_________________
redghost.ca
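For reference, a minimal sketch of how fsincos behaves with respect to the stack. This is not the angle_vectors code from the first post; the label `sin_cos` and the register convention (eax points to the angle in radians, ebx and ecx point to where sine and cosine are stored) are assumptions made up for the example.

```asm
; Sketch only. Assumed convention: eax -> angle in radians (dword),
; ebx -> where to store the sine, ecx -> where to store the cosine.

sin_cos:
        fld     dword [eax]         ; st0 = angle
        fsincos                     ; st0 = cos(angle), st1 = sin(angle)
                                    ; (replaces st0 with the sine, then
                                    ;  pushes the cosine on top of it)
        fstp    dword [ecx]         ; store the cosine and pop
        fstp    dword [ebx]         ; store the sine and pop
        ret                         ; one fld + one implicit push, two pops
```

The detail that is easy to miss is that fsincos itself pushes one value, so a single load needs two pops afterwards to leave the stack as it was found.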
Big Red 14 Dec 2006, 04:14
Quote:
    I don't know if it's an error with the forums or a mistake but you seem to have uploaded the same file twice (both download as the 3DNow versions, and the files are identical).

Odd, must be because both have the same filename. It clearly shows two different file sizes though... the second upload must have replaced the first. Bug... Anyhow, sorry about that; I have a website/forum curse. I've attached here the one that was originally meant to be there.

Code looks good to me, except maybe the "fst st1"s. I'm not sure how standard that is if a value has not previously been loaded into st1; I suppose you could use "fld st0" instead in most of those places. I could be wrong though. Well, good luck.
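The difference between the two instructions can be sketched as follows: "fst st1" copies st0 into the existing st1 slot without changing the stack depth, while "fld st0" pushes a new copy of the top value. The snippet is an illustration only; the label `x_plus_x_squared` and the use of eax as a pointer to a single dword are assumptions.

```asm
; Sketch only. Assumed convention: eax -> a single-precision value x.
; Computes x + x*x and returns it in st0.

x_plus_x_squared:
        fld     dword [eax]         ; st0 = x
        fld     st0                 ; push a copy: st0 = x, st1 = x
        fmul    st0, st0            ; st0 = x*x, st1 = x
        faddp   st1, st0            ; st0 = x + x*x (one pop)
        ret                         ; two pushes, one pop: result in st0
```

With "fld st0" every duplicate is an explicit push that is later matched by a pop, so the push/pop count stays easy to follow.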
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.