flat assembler
Message board for the users of flat assembler.
![]() |
Author |
|
LocoDelAssembly
Take a look to http://board.flatassembler.net/topic.php?t=10488 or maybe http://www.madwizard.org/programming/snippets?id=36 .
Remember that the Nth root of X is also X^(1/N). |
|||
![]() |
|
Tyler
I haven't looked at the resources you provided yet, but the reason I knew of X^(1/N) and didn't
see it as the answer is because I would do exponents as a loop that repeated for the number that is the exponent, each rep, mul-ing the number by the previous total. Like Code: ; ax^cx exponent: mov ax, bx @@: imul ax, bx dec cx cmp cx, 0 jne @b Say I want to do 16^(1/2), I can't because I can't do 1/2 repetition. |
|||
![]() |
|
edemko
Quote:
x^y = 2^z z = log(2;x^y) = log(2;x)*y use fpu |
|||
![]() |
|
edemko
Quote:
equals Code: loop @b |
|||
![]() |
|
LocoDelAssembly
If you want an ALU solution, then you could make a dichotomic search. For every number you probe, multiply it by the N times and check if the result is the argument (make sure to check EDX is zero in between to ensure your are getting the real answer). Since some answers are not integer (and others not even real numbers), you'll have to make a decision on what to do with those.
Message to the rest: Since Tyler skipped my links he obviously wants this as an exercise so wait a little before posting code ![]() |
|||
![]() |
|
Tyler
Is it okay to do algorithms in such a guess-check type way? This may expose my inexperience
(yet again ![]() repetitions, as opposed to that, which could repeat a large amount of times if the number was big enough. Is it possible to make a single rep equation, or do I have no choice than to use guess-check. |
|||
![]() |
|
edemko
what is the quess
|
|||
![]() |
|
Tyler
The first number of the dichotomic search LocoDelAssembly mentioned.
|
|||
![]() |
|
edemko
post #7 at
http://wasm.ru/forum/viewtopic.php?id=36161 The executive has its own to float converter which is updated but not included in the file offered. Tomasz's expressi.inc's get_fp_value at line 449 amused me more than the one i've got myself. We should give a transcription how our aliases are spelled |
|||
![]() |
|
Tyler
Oh, I see now, that is a one rep equation like solution. Unfortunately, I have
no knowledge of how to program for the fpu(and barely any for the cpu ![]() although learning how to do floating point math in asm is on my todo list, at the present, if it uses anything that isn't stored by pushad(in exception of a few), it's above my head. serfasm wrote:
Sorry, I did that by accident because I remember Tomasz's name as the english Thomas. |
|||
![]() |
|
LocoDelAssembly
You should take in mind however that although your code is loop-less, the FPU internally has to do some sort of iterative work inside to compute the answer. The instructions FYL2X and F2XM1 take 96 and 69 cycles respectively on a Core2 Duo (family 06_0EH) according to Intel Optimization manual, and it is also noted that it could take longer. (fprem is not included in the manual, but Agner Fog says it takes 16-56 cycles)
For the method I commented, if your input is 32-bit numbers, it would cost you 16 probes max to compute the Nth square root (not 32 because your upper bound for searching would be 65536 to cover for the square root and could obviously be much less for cubic root, fourth root, etc) PS: But of course the FPU produce a better quality result than this ALU method. |
|||
![]() |
|
windwakr
I know this is off-topic to his question, but this is the perfect place to ask this:
I've recently been looking into exponents(So I can implement variable exponent powers on Z for my Mandelbrot viewer, which I did, and it is awesome! See it in action: http://www.youtube.com/watch?v=xAgvqG3zEto ) and I came across this way of doing it(pieced it together from here): Code: fld [power] fld [number] fyl2x fld st0 frndint fsub st1,st0 fxch st1 f2xm1 fld1 faddp fscale fstp st1 Which is faster? This one, or the one in that madwizard link? |
|||
![]() |
|
LocoDelAssembly
I don't know, I guess someone will have to time them both
![]() BTW, there also exist an SSE version of pow inside msvcrt.dll which is so, so complex and large that I don't know if it is really faster on my computer. Check the link to this forum I posted. |
|||
![]() |
|
edemko
Code: number dt 0.0 power dt 0.0 ; ... ; x^y = 2^z ; z = log(2;x)*y fld [power] ; y fld [number] ; x fabs ; engine requires x to be > 0 fyl2x ; z fld st0 ; z copy push $37f or (11b shl 10) ; truncating rounding... fldcw [esp] ; ...applied pop dword[esp] ; restore stack frndint ; z.whole fsub st1,st0 ; z.fraction fxch st1 ; z.fraction <-> z.whole f2xm1 ; 2^z.fraction -1 fld1 ; 1 loaded faddp ; 2^z.fraction fscale ; 2^z.fraction * 2^z.whole = x^y fstp st1 ; restore fpu stack ; Tyler, you have got complicated TODO 45 bytes, x cycles. |
|||
![]() |
|
< Last Thread | Next Thread > |
Forum Rules:
|
Copyright © 1999-2020, Tomasz Grysztar. Also on GitHub, YouTube, Twitter.
Website powered by rwasa.