flat assembler
Message board for the users of flat assembler.
Goplat 20 Sep 2006, 02:14
There's an "fsincos" instruction that calculates both sine and cosine at the same time; it's faster than doing them separately.
mattst88 20 Sep 2006, 02:51
Ahh, you are correct. That will save quite a few clocks.
Thanks. More please.
Octavio 20 Sep 2006, 11:00
mattst88 wrote: Ahh, you are correct. That will save quite a few clocks.
I don't understand what you are doing. Why do you use trigonometric functions to add vectors? Isn't it enough to sum the components?
Garthower 20 Sep 2006, 12:36
You can use 3DNow!, or, if your CPU is new enough, SSE and/or SSE2. Even a straightforward translation into these instructions can give a speed gain of 20%-60%.
LocoDelAssembly 20 Sep 2006, 13:12
I think the FPU is more precise: he does a lot of intermediate calculations using 80-bit precision, while SSE has up to 64 bits. I know that he uses qword memory values, but he does lots of calculations before storing the final result to memory, so those extra bits can make some difference in the final result compared to the result obtained with SSE (which is not an extremely big difference, of course).
Garthower 20 Sep 2006, 13:28
Perhaps. It needs to be checked, since I have not checked it myself. But in any case, the discrepancy between the results obtained with FPU and SSE instructions amounts to millionths (if not less). It could make an interesting experiment, and I should run it soon.
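The experiment discussed above could be sketched roughly as follows. This is C rather than assembly, and it is not anyone's code from this thread: the function names `sum_in_double`, `sum_in_extended`, and `relative_gap` are made up for illustration. It assumes a compiler where `long double` maps to the 80-bit x87 format and plain `double` arithmetic is done in 64 bits (typical for GCC or Clang on x86-64); on other targets `long double` may have the same width as `double`, in which case the gap is simply zero.

```c
#include <math.h>

/* Sum `term` n times with a double accumulator (64-bit format). */
double sum_in_double(double term, long n)
{
    double acc = 0.0;
    for (long i = 0; i < n; i++)
        acc += term;
    return acc;
}

/* Same sum with a long double accumulator. The result is rounded to
   double only once, at the end, like storing an x87 result to a qword. */
double sum_in_extended(double term, long n)
{
    long double acc = 0.0L;
    for (long i = 0; i < n; i++)
        acc += term;
    return (double)acc;
}

/* Relative difference between the two results (assumes a nonzero sum). */
double relative_gap(double term, long n)
{
    double d = sum_in_double(term, n);
    double e = sum_in_extended(term, n);
    return fabs(d - e) / fabs(e);
}
```

Calling `relative_gap(0.1, 1000000)` gives a rough feel for how much the extra intermediate bits matter after a million additions. Under the assumptions above, the gap should come out far below one part in a million.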
Goplat 20 Sep 2006, 16:14
Octavio wrote: I don't understand what you are doing, why you use trigonometric functions to add vectors?
Summing the components is how you add vectors that are in rectangular form. In this case the vectors are in polar form, so they have to be converted, added, then converted back.
Madis731 21 Sep 2006, 18:11
Try a very unusual approach:
NB! Still far from optimal
Code:
    finit
    fldpi
    fmul [a_rad]
    fsincos
    faddp st1,st0
    fmul [a_mag]
    fldpi
    fmul [b_rad]
    fsincos
    fld [b_mag]
    fmul st2,st0
    fmulp st1,st0
    fincstp
    faddp st1,st0
    fld st6
    fdecstp
    fld st2
    fxch st1
    fpatan
    fstp [r_rad]
    fmul st0,st0
    fxch st1 ;Hint! This should be optimized out...
    fmul st0,st0
    faddp st1,st0
    fsqrt
    fstp [r_mag]
Copyright © 1999-2023, Tomasz Grysztar. Also on GitHub, YouTube.