flat assembler
Message board for the users of flat assembler.

Index > Main > nth roots in asm

Author
Thread Post new topic Reply to topic
Tyler



Joined: 19 Nov 2009
Posts: 1216
Location: NC, USA
Tyler
Not that I can do them in any language Smile, but how would one go about finding an
nth root in assembly? I doubt I'll ever use it, just a thought that crossed my mind in
math class today.

[edit]My 128th post, I wonder when Tomasz will get tired of my spamming Smile[/edit]


Last edited by Tyler on 07 Mar 2010, 03:01; edited 1 time in total
Post 06 Mar 2010, 05:34
View user's profile Send private message Reply with quote
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4633
Location: Argentina
LocoDelAssembly
Take a look to http://board.flatassembler.net/topic.php?t=10488 or maybe http://www.madwizard.org/programming/snippets?id=36 .

Remember that the Nth root of X is also X^(1/N).
Post 06 Mar 2010, 05:56
View user's profile Send private message Reply with quote
Tyler



Joined: 19 Nov 2009
Posts: 1216
Location: NC, USA
Tyler
I haven't looked at the resources you provided yet, but the reason I knew of X^(1/N) and didn't
see it as the answer is because I would do exponents as a loop that repeated for the
number that is the exponent, each rep, mul-ing the number by the previous total. Like
Code:
; ax^cx
exponent:
mov ax, bx
@@:
imul ax, bx
dec cx
cmp cx, 0
jne @b
    

Say I want to do 16^(1/2), I can't because I can't do 1/2 repetition.
Post 06 Mar 2010, 06:43
View user's profile Send private message Reply with quote
edemko



Joined: 18 Jul 2009
Posts: 549
edemko
Quote:

can't do 1/2 repetition.

x^y = 2^z
z = log(2;x^y) = log(2;x)*y
use fpu
Post 06 Mar 2010, 06:51
View user's profile Send private message Reply with quote
edemko



Joined: 18 Jul 2009
Posts: 549
edemko
Quote:

dec cx
cmp cx, 0
jne @b

equals
Code:
        loop    @b
    
Post 06 Mar 2010, 06:54
View user's profile Send private message Reply with quote
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4633
Location: Argentina
LocoDelAssembly
If you want an ALU solution, then you could make a dichotomic search. For every number you probe, multiply it by the N times and check if the result is the argument (make sure to check EDX is zero in between to ensure your are getting the real answer). Since some answers are not integer (and others not even real numbers), you'll have to make a decision on what to do with those.

Message to the rest: Since Tyler skipped my links he obviously wants this as an exercise so wait a little before posting code Smile
Post 06 Mar 2010, 07:06
View user's profile Send private message Reply with quote
Tyler



Joined: 19 Nov 2009
Posts: 1216
Location: NC, USA
Tyler
Is it okay to do algorithms in such a guess-check type way? This may expose my inexperience
(yet again Laughing), but I've always tried to find a definite solution with a fairly definite amount of
repetitions, as opposed to that, which could repeat a large amount of times if the number was big
enough. Is it possible to make a single rep equation, or do I have no choice than to use guess-check.
Post 07 Mar 2010, 02:04
View user's profile Send private message Reply with quote
edemko



Joined: 18 Jul 2009
Posts: 549
edemko
what is the quess
Post 07 Mar 2010, 02:10
View user's profile Send private message Reply with quote
Tyler



Joined: 19 Nov 2009
Posts: 1216
Location: NC, USA
Tyler
The first number of the dichotomic search LocoDelAssembly mentioned.
Post 07 Mar 2010, 02:19
View user's profile Send private message Reply with quote
edemko



Joined: 18 Jul 2009
Posts: 549
edemko
post #7 at
http://wasm.ru/forum/viewtopic.php?id=36161
The executive has its own to float converter which is updated but not included in the file offered. Tomasz's expressi.inc's get_fp_value at line 449 amused me more than the one i've got myself.

We should give a transcription how our aliases are spelled
Post 07 Mar 2010, 02:28
View user's profile Send private message Reply with quote
Tyler



Joined: 19 Nov 2009
Posts: 1216
Location: NC, USA
Tyler
Oh, I see now, that is a one rep equation like solution. Unfortunately, I have
no knowledge of how to program for the fpu(and barely any for the cpu Smile), so,
although learning how to do floating point math in asm is on my todo list, at the
present, if it uses anything that isn't stored by pushad(in exception of a few), it's
above my head.
serfasm wrote:

We should give a transcription how our aliases are spelled

Sorry, I did that by accident because I remember Tomasz's name as the english
Thomas.
Post 07 Mar 2010, 02:58
View user's profile Send private message Reply with quote
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4633
Location: Argentina
LocoDelAssembly
You should take in mind however that although your code is loop-less, the FPU internally has to do some sort of iterative work inside to compute the answer. The instructions FYL2X and F2XM1 take 96 and 69 cycles respectively on a Core2 Duo (family 06_0EH) according to Intel Optimization manual, and it is also noted that it could take longer. (fprem is not included in the manual, but Agner Fog says it takes 16-56 cycles)

For the method I commented, if your input is 32-bit numbers, it would cost you 16 probes max to compute the Nth square root (not 32 because your upper bound for searching would be 65536 to cover for the square root and could obviously be much less for cubic root, fourth root, etc)

PS: But of course the FPU produce a better quality result than this ALU method.
Post 07 Mar 2010, 03:35
View user's profile Send private message Reply with quote
windwakr



Joined: 30 Jun 2004
Posts: 827
Location: Michigan, USA
windwakr
I know this is off-topic to his question, but this is the perfect place to ask this:

I've recently been looking into exponents(So I can implement variable exponent powers on Z for my Mandelbrot viewer, which I did, and it is awesome! See it in action: http://www.youtube.com/watch?v=xAgvqG3zEto ) and I came across this way of doing it(pieced it together from here):
Code:
fld [power]
fld [number]
fyl2x
fld st0
frndint
fsub st1,st0
fxch st1
f2xm1
fld1
faddp
fscale
fstp st1
    


Which is faster? This one, or the one in that madwizard link?

_________________
----> * <---- My star, won HERE
Post 07 Mar 2010, 03:50
View user's profile Send private message Reply with quote
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4633
Location: Argentina
LocoDelAssembly
I don't know, I guess someone will have to time them both Very Happy

BTW, there also exist an SSE version of pow inside msvcrt.dll which is so, so complex and large that I don't know if it is really faster on my computer. Check the link to this forum I posted.
Post 07 Mar 2010, 04:12
View user's profile Send private message Reply with quote
edemko



Joined: 18 Jul 2009
Posts: 549
edemko
Code:
        number  dt 0.0
        power   dt 0.0
                                      ; ...
                                      ; x^y = 2^z
                                      ; z = log(2;x)*y
        fld     [power]               ; y
        fld     [number]              ; x
       fabs                           ; engine requires x to be > 0
        fyl2x                         ; z
        fld     st0                   ; z copy
       push    $37f or (11b shl 10)   ; truncating rounding...
       fldcw   [esp]                  ; ...applied
       pop     dword[esp]             ; restore stack
        frndint                       ; z.whole
        fsub    st1,st0               ; z.fraction
        fxch    st1                   ; z.fraction <-> z.whole
        f2xm1                         ; 2^z.fraction -1
        fld1                          ; 1 loaded
        faddp                         ; 2^z.fraction
        fscale                        ; 2^z.fraction * 2^z.whole = x^y
        fstp    st1                   ; restore fpu stack
                                      ; Tyler, you have got complicated TODO
    

45 bytes, x cycles.
Post 07 Mar 2010, 07:28
View user's profile Send private message Reply with quote
Display posts from previous:
Post new topic Reply to topic

Jump to:  


< Last Thread | Next Thread >
Forum Rules:
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
You cannot attach files in this forum
You can download files in this forum


Copyright © 1999-2020, Tomasz Grysztar. Also on GitHub, YouTube, Twitter.

Website powered by rwasa.