flat assembler
Message board for the users of flat assembler.

Critique this. Algorithm to add vectors
mattst88

Joined: 12 May 2006
Posts: 260
Location: South Carolina
mattst88 20 Sep 2006, 01:29
Here's some code I wrote for fun to add vectors. It's really one of the first things I've ever written in assembly. Advice, improvements, etc. are wanted.

Code:
format PE

entry main

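main:   finit
fldpi                   ; pi

; Calculate Ax and Ay
fld     [a_mag]         ; a_mag pi
fld     [a_rad]         ; a_rad a_mag pi        (a_rad in units of pi)
fmul    st0,st2         ; a_rad a_mag pi        (a_rad now in radians)
fld     st0             ; a_rad a_rad a_mag pi
fcos                    ; cos(a_rad) a_rad a_mag pi
fmul    st0,st2         ; Ax a_rad a_mag pi
fxch    st2             ; a_mag a_rad Ax pi
fxch    st1             ; a_rad a_mag Ax pi
fsin                    ; sin(a_rad) a_mag Ax pi
fmulp   st1,st0         ; Ay Ax pi

; Calculate Bx and By
fld     [b_mag]         ; b_mag Ay Ax pi
fld     [b_rad]         ; b_rad b_mag Ay Ax pi  (b_rad in units of pi)
fmul    st0,st4         ; b_rad b_mag Ay Ax pi  (b_rad now in radians)
ffree   st4             ; b_rad b_mag Ay Ax     (pi is no longer needed)
fld     st0             ; b_rad b_rad b_mag Ay Ax
fcos                    ; cos(b_rad) b_rad b_mag Ay Ax
fmul    st0,st2         ; Bx b_rad b_mag Ay Ax
fxch    st2             ; b_mag b_rad Bx Ay Ax
fxch    st1             ; b_rad b_mag Bx Ay Ax
fsin                    ; sin(b_rad) b_mag Bx Ay Ax
fmulp   st1,st0         ; By Bx Ay Ax

; Calculate Rx and Ry
faddp   st2,st0         ; Bx Ry Ax              (Ry = Ay + By)
faddp   st2,st0         ; Ry Rx                 (Rx = Ax + Bx)
fxch    st1             ; Rx Ry

; Calculate direction of resultant
fld     st0             ; Rx Rx Ry
fld     st2             ; Ry Rx Rx Ry
fxch    st1             ; Rx Ry Rx Ry
fpatan                  ; atan2(Ry,Rx) Rx Ry    (angle in radians)
fstp    [r_rad]         ; Rx Ry
; resultant direction is stored in memory

; Calculate magnitude of resultant
fmul    st0,st0         ; Rx^2 Ry
fxch    st1             ; Ry Rx^2
fmul    st0,st0         ; Ry^2 Rx^2
faddp   st1,st0         ; Rx^2+Ry^2
fsqrt                   ; R
fstp    [r_mag]         ; resultant magnitude is stored in memory

ret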

a_mag dq 1.0    ; Magnitude of vector A
a_rad dq 1.5    ; Direction of vector A in terms of Pi. 2.0 would be 2.0*Pi
b_mag dq 1.0    ; Magnitude of vector B
b_rad dq 0.0    ; Direction of vector B in terms of Pi. 2.0 would be 2.0*Pi
r_mag dq ?      ; Magnitude of the resultant
r_rad dq ?      ; Direction of the resultant

The input values have to be hard-coded. The result is stored in r_mag and r_rad. I checked the values by fld'ing them back onto the stack and looking at them in OllyDbg.

The comments beside the instructions track what's on the FPU stack after each instruction executes.
Goplat

Joined: 15 Sep 2006
Posts: 181
Goplat 20 Sep 2006, 02:14
There's an "fsincos" instruction that calculates both sine and cosine at the same time; it's faster than doing them separately.
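As a rough sketch of what that could look like for one vector (untested; a_x and a_y are hypothetical output slots that are not in the original listing): fsincos replaces st0 with its sine and then pushes the cosine, so both components fall out of a single angle load.

```asm
format PE
entry main

main:   finit
        fld     [a_mag]         ; a_mag
        fld     [a_rad]         ; a_rad a_mag       (angle in units of pi)
        fldpi                   ; pi a_rad a_mag
        fmulp   st1,st0         ; a_rad a_mag       (angle now in radians)
        fsincos                 ; cos sin a_mag
        fmul    st0,st2         ; Ax sin a_mag
        fstp    [a_x]           ; sin a_mag
        fmulp   st1,st0         ; Ay
        fstp    [a_y]           ; (stack empty)
        ret

a_mag dq 1.0    ; Magnitude of vector A
a_rad dq 1.5    ; Direction of vector A in terms of Pi
a_x   dq ?      ; hypothetical: x component of A
a_y   dq ?      ; hypothetical: y component of A
```

This also removes the fld st0 copy and both fxch shuffles that the separate fcos/fsin version needs per vector.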
mattst88

Joined: 12 May 2006
Posts: 260
Location: South Carolina
mattst88 20 Sep 2006, 02:51
Ahh, you are correct. That will save quite a few clocks.

Thanks

Octavio

Joined: 21 Jun 2003
Posts: 366
Location: Spain
Octavio 20 Sep 2006, 11:00
mattst88 wrote:
Ahh, you are correct. That will save quite a few clocks.

Thanks

I don't understand what you are doing. Why do you use trigonometric functions to add vectors?
Isn't it enough to sum the components?
Garthower

Joined: 21 Apr 2006
Posts: 158
Location: Ukraine
Garthower 20 Sep 2006, 12:36
You can use 3DNow!, or, if your CPU is new enough, SSE and/or SSE2. Even a straightforward translation into these instructions can give a speed gain of 20% to 60%.
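A minimal, untested sketch of the part that maps directly onto SSE2, assuming the vectors are already in rectangular form (SSE and SSE2 have no sine or cosine instructions, so the polar conversion would still need the FPU or an approximation). The labels a_xy, b_xy and r_xy are hypothetical and not from the original listing.

```asm
format PE
entry main

main:   movupd  xmm0,dqword [a_xy]      ; xmm0 = (Ax, Ay)
        movupd  xmm1,dqword [b_xy]      ; xmm1 = (Bx, By)
        addpd   xmm0,xmm1               ; xmm0 = (Rx, Ry)
        movupd  dqword [r_xy],xmm0      ; store both components of the resultant
        mulpd   xmm0,xmm0               ; xmm0 = (Rx^2, Ry^2)
        movapd  xmm1,xmm0
        unpckhpd xmm1,xmm1              ; xmm1 = (Ry^2, Ry^2)
        addsd   xmm0,xmm1               ; low qword = Rx^2 + Ry^2
        sqrtsd  xmm0,xmm0               ; low qword = magnitude
        movsd   [r_mag],xmm0
        ret

a_xy  dq 0.0,-1.0       ; hypothetical: A as (x, y)
b_xy  dq 1.0,0.0        ; hypothetical: B as (x, y)
r_xy  dq ?,?            ; hypothetical: resultant as (x, y)
r_mag dq ?              ; Magnitude of the resultant
```

The unaligned movupd is used because nothing here guarantees 16-byte alignment of the data.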
LocoDelAssembly

Joined: 06 May 2005
Posts: 4624
Location: Argentina
LocoDelAssembly 20 Sep 2006, 13:12
I think the FPU is more precise: it does a lot of intermediate calculation in 80-bit precision, while SSE offers at most 64 bits. I know that he uses qword memory values, but he does a lot of calculation before storing the final result to memory, so those extra bits can make some difference in the final result compared with the result obtained with SSE (not an extremely big difference, of course).
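One way to get a feel for the difference without writing an SSE version is to switch the x87 precision-control field to 53 bits and rerun the same instructions. The sketch below is untested, and its labels (magnitude, x, y, cw, r_ext, r_dbl) are made up for illustration. Precision control narrows only the significand and leaves the exponent range extended, so this approximates SSE2 double arithmetic rather than reproducing it.

```asm
format PE
entry main

main:   finit                   ; control word = 037Fh: 64-bit significand
        call    magnitude
        fstp    [r_ext]         ; intermediates kept at extended precision

        fstcw   [cw]
        and     [cw],0FCFFh     ; clear the precision-control field (bits 8-9)
        or      [cw],0200h      ; 10b = 53-bit significand, as in a double
        fldcw   [cw]
        call    magnitude
        fstp    [r_dbl]         ; intermediates rounded to 53 bits at each step
        ret

magnitude:                      ; leaves sqrt(x^2 + y^2) in st0
        fld     [x]
        fmul    st0,st0
        fld     [y]
        fmul    st0,st0
        faddp   st1,st0
        fsqrt
        ret

x     dq 0.1    ; hypothetical test inputs
y     dq 0.7
cw    dw ?      ; scratch slot for the FPU control word
r_ext dq ?      ; result with extended-precision intermediates
r_dbl dq ?      ; result with 53-bit intermediates
```

Comparing r_ext and r_dbl in a debugger shows whether the extra bits survive the final rounding to a qword for a given input.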
Garthower

Joined: 21 Apr 2006
Posts: 158
Location: Ukraine
Garthower 20 Sep 2006, 13:28
Perhaps; it needs to be checked, since I have not checked it myself. In any case, the discrepancy between the results from the FPU and SSE instructions would be on the order of millionths, if not less. It could make an interesting experiment, and I should run it soon.
Goplat

Joined: 15 Sep 2006
Posts: 181
Goplat 20 Sep 2006, 16:14
Octavio wrote:
I don't understand what you are doing. Why do you use trigonometric functions to add vectors?
Isn't it enough to sum the components?

That's how you add vectors that are in rectangular form. In this case the vectors are in polar form, so they have to be converted, added, then converted back.
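For the values hard-coded in the first post (a_mag = 1.0, a_rad = 1.5, b_mag = 1.0, b_rad = 0.0, angles in units of pi), the three steps work out roughly as follows. The symbols A_x, A_y and so on are just shorthand for this example.

```latex
\begin{aligned}
A_x &= 1\cdot\cos(1.5\pi) = 0, & A_y &= 1\cdot\sin(1.5\pi) = -1 \\
B_x &= 1\cdot\cos(0) = 1,      & B_y &= 1\cdot\sin(0) = 0 \\
R_x &= A_x + B_x = 1,          & R_y &= A_y + B_y = -1 \\
r_{mag} &= \sqrt{R_x^2 + R_y^2} = \sqrt{2} \approx 1.4142, &
r_{rad} &= \operatorname{atan2}(R_y, R_x) = -\tfrac{\pi}{4}
\end{aligned}
```

On the FPU, cos(1.5*pi) comes out as a tiny nonzero value rather than exactly 0, because pi itself is rounded, so the stored results will match these only to within rounding error.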

Joined: 25 Sep 2003
Posts: 2139
Location: Estonia
Try a very unusual approach:
NB! Still far from optimal
Code:
finit
fldpi
fsincos
fmul            [a_mag]
fldpi
fsincos
fld             [b_mag]
fmul            st2,st0
fmulp           st1,st0
fincstp
fld             st6
fdecstp
fld             st2
fxch            st1
fpatan
fmul            st0,st0
fxch            st1 ;Hint! This should be optimized out...
fmul            st0,st0
fsqrt
fstp            [r_mag]

21 Sep 2006, 18:11