flat assembler
Message board for the users of flat assembler.
Critique this. Algorithm to add vectors

Goplat
There's an "fsincos" instruction that calculates both sine and cosine at the same time; it's faster than doing them separately.


20 Sep 2006, 02:14 

mattst88
Ahh, you are correct. That will save quite a few clocks.
Thanks. More, please.

20 Sep 2006, 02:51 

Octavio
mattst88 wrote: Ahh, you are correct. That will save quite a few clocks.
I don't understand what you are doing. Why do you use trigonometric functions to add vectors? Isn't it enough to sum the components?

20 Sep 2006, 11:00 

Garthower
You can use 3DNow!, or, if your CPU is new enough, SSE and/or SSE2. Even a straightforward translation to these instructions can give a speed gain of 20% to 60%.


20 Sep 2006, 12:36 

LocoDelAssembly
I think the FPU is more precise: it performs the intermediate calculations with 80-bit precision, while SSE offers at most 64 bits. I know that he uses qword memory values, but he does a lot of calculation before storing the final result to memory, so those extra bits can make some difference in the final result compared to the result obtained with SSE (not an extremely big difference, of course).
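A rough way to see the effect of those extra bits from C, as a sketch only: it uses repeated addition rather than the trigonometric code from this thread, and it assumes that `long double` maps to the 80-bit x87 format (true for GCC and Clang on x86 Linux, but not for every compiler; on MSVC `long double` is the same as `double`, so both functions would return the same value). The function names are hypothetical.

```c
#include <assert.h>

/* Add `step` to a running total `n` times, rounding the total
   to 64-bit double after every addition. */
static double sum_double(double step, int n)
{
    assert(n >= 0);              /* sketch precondition */
    double acc = 0.0;
    for (int i = 0; i < n; i++)
        acc += step;
    return acc;
}

/* Same loop, but the running total is a long double (assumed to be
   the 80-bit x87 format); it is rounded to double only once, on return. */
static double sum_extended(double step, int n)
{
    assert(n >= 0);              /* sketch precondition */
    long double acc = 0.0L;
    for (int i = 0; i < n; i++)
        acc += step;
    return (double)acc;
}
```

Calling both with `step = 0.1` and `n = 1000000` should give two results near 100000 that differ only far to the right of the decimal point, which fits the "not an extremely big difference" estimate.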


20 Sep 2006, 13:12 

Garthower
Perhaps; it needs to be checked, since I have not tested it. In any case, the discrepancy between the results obtained with FPU and SSE instructions would be on the order of millionths (if not less). It could make an interesting experiment; I will have to run it soon.


20 Sep 2006, 13:28 

Goplat
Octavio wrote: I don't understand what you are doing, why you use trigonometric functions to add vectors?
Summing the components is how you add vectors that are in rectangular form. In this case the vectors are in polar form, so they have to be converted, added, then converted back.

20 Sep 2006, 16:14 

Madis731
Try a very unusual approach:
NB! Still far from optimal
Code:
    finit
    fldpi
    fmul [a_rad]
    fsincos
    faddp st1,st0
    fmul [a_mag]
    fldpi
    fmul [b_rad]
    fsincos
    fld [b_mag]
    fmul st2,st0
    fmulp st1,st0
    fincstp
    faddp st1,st0
    fld st6
    fdecstp
    fld st2
    fxch st1
    fpatan
    fstp [r_rad]
    fmul st0,st0
    fxch st1 ;Hint! This should be optimized out...
    fmul st0,st0
    faddp st1,st0
    fsqrt
    fstp [r_mag]

21 Sep 2006, 18:11 

Copyright © 1999-2020, Tomasz Grysztar.