flat assembler
Message board for the users of flat assembler.

 Index > Main > nth roots in asm
Author
Tyler

Joined: 19 Nov 2009
Posts: 1216
Location: NC, USA
Tyler 06 Mar 2010, 05:34
Not that I can do them in any language , but how would one go about finding an
nth root in assembly? I doubt I'll ever use it, just a thought that crossed my mind in
math class today.

My 128th post, I wonder when Tomasz will get tired of my spamming [/edit]

Last edited by Tyler on 07 Mar 2010, 03:01; edited 1 time in total
06 Mar 2010, 05:34
LocoDelAssembly

Joined: 06 May 2005
Posts: 4624
Location: Argentina
LocoDelAssembly 06 Mar 2010, 05:56
Take a look to http://board.flatassembler.net/topic.php?t=10488 or maybe http://www.madwizard.org/programming/snippets?id=36 .

Remember that the Nth root of X is also X^(1/N).
06 Mar 2010, 05:56
Tyler

Joined: 19 Nov 2009
Posts: 1216
Location: NC, USA
Tyler 06 Mar 2010, 06:43
I haven't looked at the resources you provided yet, but the reason I knew of X^(1/N) and didn't
see it as the answer is because I would do exponents as a loop that repeated for the
number that is the exponent, each rep, mul-ing the number by the previous total. Like
Code:
```; ax^cx
exponent:
mov ax, bx
@@:
imul ax, bx
dec cx
cmp cx, 0
jne @b
```

Say I want to do 16^(1/2), I can't because I can't do 1/2 repetition.
06 Mar 2010, 06:43
edemko

Joined: 18 Jul 2009
Posts: 549
edemko 06 Mar 2010, 06:51
Quote:

can't do 1/2 repetition.

x^y = 2^z
z = log(2;x^y) = log(2;x)*y
use fpu
06 Mar 2010, 06:51
edemko

Joined: 18 Jul 2009
Posts: 549
edemko 06 Mar 2010, 06:54
Quote:

dec cx
cmp cx, 0
jne @b

equals
Code:
```        loop    @b
```
06 Mar 2010, 06:54
LocoDelAssembly

Joined: 06 May 2005
Posts: 4624
Location: Argentina
LocoDelAssembly 06 Mar 2010, 07:06
If you want an ALU solution, then you could make a dichotomic search. For every number you probe, multiply it by the N times and check if the result is the argument (make sure to check EDX is zero in between to ensure your are getting the real answer). Since some answers are not integer (and others not even real numbers), you'll have to make a decision on what to do with those.

Message to the rest: Since Tyler skipped my links he obviously wants this as an exercise so wait a little before posting code
06 Mar 2010, 07:06
Tyler

Joined: 19 Nov 2009
Posts: 1216
Location: NC, USA
Tyler 07 Mar 2010, 02:04
Is it okay to do algorithms in such a guess-check type way? This may expose my inexperience
(yet again ), but I've always tried to find a definite solution with a fairly definite amount of
repetitions, as opposed to that, which could repeat a large amount of times if the number was big
enough. Is it possible to make a single rep equation, or do I have no choice than to use guess-check.
07 Mar 2010, 02:04
edemko

Joined: 18 Jul 2009
Posts: 549
edemko 07 Mar 2010, 02:10
what is the quess
07 Mar 2010, 02:10
Tyler

Joined: 19 Nov 2009
Posts: 1216
Location: NC, USA
Tyler 07 Mar 2010, 02:19
The first number of the dichotomic search LocoDelAssembly mentioned.
07 Mar 2010, 02:19
edemko

Joined: 18 Jul 2009
Posts: 549
edemko 07 Mar 2010, 02:28
post #7 at
http://wasm.ru/forum/viewtopic.php?id=36161
The executive has its own to float converter which is updated but not included in the file offered. Tomasz's expressi.inc's get_fp_value at line 449 amused me more than the one i've got myself.

We should give a transcription how our aliases are spelled
07 Mar 2010, 02:28
Tyler

Joined: 19 Nov 2009
Posts: 1216
Location: NC, USA
Tyler 07 Mar 2010, 02:58
Oh, I see now, that is a one rep equation like solution. Unfortunately, I have
no knowledge of how to program for the fpu(and barely any for the cpu ), so,
although learning how to do floating point math in asm is on my todo list, at the
present, if it uses anything that isn't stored by pushad(in exception of a few), it's
serfasm wrote:

We should give a transcription how our aliases are spelled

Sorry, I did that by accident because I remember Tomasz's name as the english
Thomas.
07 Mar 2010, 02:58
LocoDelAssembly

Joined: 06 May 2005
Posts: 4624
Location: Argentina
LocoDelAssembly 07 Mar 2010, 03:35
You should take in mind however that although your code is loop-less, the FPU internally has to do some sort of iterative work inside to compute the answer. The instructions FYL2X and F2XM1 take 96 and 69 cycles respectively on a Core2 Duo (family 06_0EH) according to Intel Optimization manual, and it is also noted that it could take longer. (fprem is not included in the manual, but Agner Fog says it takes 16-56 cycles)

For the method I commented, if your input is 32-bit numbers, it would cost you 16 probes max to compute the Nth square root (not 32 because your upper bound for searching would be 65536 to cover for the square root and could obviously be much less for cubic root, fourth root, etc)

PS: But of course the FPU produce a better quality result than this ALU method.
07 Mar 2010, 03:35
windwakr

Joined: 30 Jun 2004
Posts: 827
windwakr 07 Mar 2010, 03:50
I know this is off-topic to his question, but this is the perfect place to ask this:

I've recently been looking into exponents(So I can implement variable exponent powers on Z for my Mandelbrot viewer, which I did, and it is awesome! See it in action: http://www.youtube.com/watch?v=xAgvqG3zEto ) and I came across this way of doing it(pieced it together from here):
Code:
```fld [power]
fld [number]
fyl2x
fld st0
frndint
fsub st1,st0
fxch st1
f2xm1
fld1
fscale
fstp st1
```

Which is faster? This one, or the one in that madwizard link?

_________________
----> * <---- My star, won HERE
07 Mar 2010, 03:50
LocoDelAssembly

Joined: 06 May 2005
Posts: 4624
Location: Argentina
LocoDelAssembly 07 Mar 2010, 04:12
I don't know, I guess someone will have to time them both

BTW, there also exist an SSE version of pow inside msvcrt.dll which is so, so complex and large that I don't know if it is really faster on my computer. Check the link to this forum I posted.
07 Mar 2010, 04:12
edemko

Joined: 18 Jul 2009
Posts: 549
edemko 07 Mar 2010, 07:28
Code:
```        number  dt 0.0
power   dt 0.0
; ...
; x^y = 2^z
; z = log(2;x)*y
fld     [power]               ; y
fld     [number]              ; x
fabs                           ; engine requires x to be > 0
fyl2x                         ; z
fld     st0                   ; z copy
push    \$37f or (11b shl 10)   ; truncating rounding...
fldcw   [esp]                  ; ...applied
pop     dword[esp]             ; restore stack
frndint                       ; z.whole
fsub    st1,st0               ; z.fraction
fxch    st1                   ; z.fraction <-> z.whole
f2xm1                         ; 2^z.fraction -1
fscale                        ; 2^z.fraction * 2^z.whole = x^y
fstp    st1                   ; restore fpu stack
; Tyler, you have got complicated TODO
```

45 bytes, x cycles.
07 Mar 2010, 07:28
 Display posts from previous: All Posts1 Day7 Days2 Weeks1 Month3 Months6 Months1 Year Oldest FirstNewest First

 Jump to: Select a forum Official----------------AssemblyPeripheria General----------------MainTutorials and ExamplesDOSWindowsLinuxUnixMenuetOS Specific----------------MacroinstructionsOS ConstructionIDE DevelopmentProjects and IdeasNon-x86 architecturesHigh Level LanguagesProgramming Language DesignCompiler Internals Other----------------FeedbackHeapTest Area

Forum Rules:
 You cannot post new topics in this forumYou cannot reply to topics in this forumYou cannot edit your posts in this forumYou cannot delete your posts in this forumYou cannot vote in polls in this forumYou cannot attach files in this forumYou can download files in this forum