flat assembler
Message board for the users of flat assembler.
Index > Main > Code optimization (AKA C vs. Asm)
tthsqe 04 Jan 2010, 02:48
Do you think there is a better way of computing f(2^n) than the a b c thingy posted before? Might Mathematica use exactly the same algorithm, with the manual code being slower only because of the memory allocation/deallocation and possible type checking on a, b, and c?
Alphonso 04 Jan 2010, 10:18
bitRAKE wrote: The binary powering method only needs 20 iterations for 1024^2, whereas the add method needs over a million iterations (O(log_2 N) versus O(N)).

Well, I've got to thank you guys - I actually found out what Fibonacci numbers are and actually got to do some thinking. Anyway, I discovered this algorithm while playing with the numbers, and for now I'm going to call it the 'Alphonso algorithm', though I would be somewhat surprised if it doesn't already exist. For calculating fib(2^n):

Code:
The 'Alphonso series', where x is an integer > 0 and alph(1) = 3:

    alph(x+1) = alph(x)^2 - 2

Therefore:
    alph(1) = 3
    alph(2) = 3^2 - 2 = 7
    alph(3) = 7^2 - 2 = 47
    alph(4) = 47^2 - 2 = 2207
    alph(5) = 2207^2 - 2 = 4870847
    alph(6) = 4870847^2 - 2 = 23725150497407
    etc...

Where x is an integer > 0 and knowing fib(2) = 1:

    fib(2^(x + 1)) = fib(2^x) * alph(x)

so when
    x = 1: fib(2^(1 + 1)) = fib(2^1) * alph(1), i.e. fib(4)  = fib(2)  * alph(1) = 1   * 3    = 3
    x = 2: fib(2^(2 + 1)) = fib(2^2) * alph(2), i.e. fib(8)  = fib(4)  * alph(2) = 3   * 7    = 21
    x = 3: fib(2^(3 + 1)) = fib(2^3) * alph(3), i.e. fib(16) = fib(8)  * alph(3) = 21  * 47   = 987
    x = 4: fib(2^(4 + 1)) = fib(2^4) * alph(4), i.e. fib(32) = fib(16) * alph(4) = 987 * 2207 = 2178309
    etc...
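A minimal Python sketch of the recurrence above, just to make the iteration concrete (the function name fib_pow2 is mine, not from the thread; Python is used here for its built-in big integers):

Code:
def fib_pow2(n):
    """Return fib(2^n) for n >= 1 via the 'Alphonso series'."""
    fib = 1    # fib(2^1) = fib(2) = 1
    alph = 3   # alph(1) = 3
    for _ in range(n - 1):
        fib *= alph              # fib(2^(x+1)) = fib(2^x) * alph(x)
        alph = alph * alph - 2   # alph(x+1) = alph(x)^2 - 2
    return fib

print(fib_pow2(5))   # fib(32) = 2178309, matching the table above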
bitRAKE 04 Jan 2010, 15:34
tthsqe wrote: Do you think there is a better way of computing f(2^n) than the a b c thingy posted before? Might Mathematica use exactly the same algorithm, with the manual code being slower only because of the memory allocation/deallocation and possible type checking on a, b, and c?

I have no intention of creating some new multiply code - look at GMP (and others) development for that. The iterations consist of three squares and three additions. I don't know: how many additions equal a square on your hardware?

_________________
¯\(°_o)/¯ “languages are not safe - uses can be” Bjarne Stroustrup
r22 04 Jan 2010, 19:45
Squaring a power of 2 is just 1 shift operation. On a "big number" (depending on the implementation) it would be 1 mem-copy/reallocation or 1 update of the exponent.
Squaring a NON power of 2 efficiently is likely done with clever tricks based around the ease of squaring the closest power of 2 plus a polynomial expansion.

Code:
SQUARE( N ) =
    X = ClosestPowerOfTwo( N )
    A = N - X
    RETURN X^2 + 2 * X + A

SQUARE( 513 )
    X = 2 ^ Most Significant Bit = 512
    A = 513 - 512 = 1
    RETURN (X + A)^2 = (512 + 1)^2 = 512^2 + 2 * 512 + 1

To conclude, binary squaring takes...
1 search and lookup for the BSR and closest power of 2
1 shift/mem-move/realloc for X^2
3 adds for + X + X + A

*** EDIT ***
Lest I be crucified: this is of course the BEST CASE scenario, where N is a power of 2 plus 1. For the worst case the cross term becomes 2 * X * A (still just a shift of A) and you would have to recursively call the algorithm for A.

Code:
SQUARE( N ) =
    X = ClosestPowerOfTwo( N )
    A = N - X
    IF A = 1
        RETURN X^2 + 2 * X + A
    ELSE
        RETURN X^2 + 2 * X * A + SQUARE( A )
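A rough Python sketch of that decomposition, with the cross term written as a shift (the function name square is mine; this only illustrates (X + A)^2 = X^2 + 2*X*A + A^2, not how real bignum libraries actually square):

Code:
def square(n):
    """Square a non-negative integer via (X + A)^2 = X^2 + 2*X*A + A^2,
    where X is the largest power of two <= n, so X^2 and 2*X*A are shifts."""
    if n <= 1:
        return n * n
    x_bits = n.bit_length() - 1    # position of the most significant bit (BSR)
    a = n - (1 << x_bits)          # A = N - X
    # X^2 = 1 << (2*x_bits); 2*X*A = A << (x_bits + 1); A^2 is done recursively
    return (1 << (2 * x_bits)) + (a << (x_bits + 1)) + square(a)

assert square(513) == 513 * 513    # the worked example above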
tthsqe 05 Jan 2010, 09:38
Alphonso, great idea with the Lucas numbers (your 'Alphonso series')...
with them we can beat Mathematica! The following consistently beats the built-in function by 5-10% on most inputs: [Edited]

Code:
f[n_Integer] := Block[{f = 1, l = 1, d = IntegerDigits[n, 2]},
  If[Length[d] > 1,                     (* if n has more than one bit *)
    (* then *)
    MapThread[
      If[#1 === 0,                      (* #1 comes from the first list, #2 from the second *)
        {f, l} = {f*l, l^2 - 2*#2};,                      (* #1 is a zero *)
        {f, l} = {l*(f + l)/2 - #2, l*(5*f + l)/2 - #2};  (* else #1 is a one *)
      ] &,
      {d[[2 ;; -2]], (-1)^d[[1 ;; -3]]} (* two lists of bits to thread over *)
    ];
    If[d[[-1]] === 0,
      If[n < 0,                         (* if lsb is 0 return one of these *)
        -f*l,
        f*l
      ],
      l*(f + l)/2 - (-1)^d[[-2]]        (* else lsb is 1 and return this *)
    ],
    (* else *)
    n^2                                 (* else return n^2 *)
  ]
];

The performance evens out as n contains more and more ones, or gets larger in general. For example:
    f(2^26):   5.1 vs Mathematica's timing of 5.6
    f(1000^3): 100 vs what was below
    f(1024^3): 105 vs what was below

Note that it is still not optimal on inputs like 3*n, where the triplication formula can be used to get even better results.

Last edited by tthsqe on 05 Jan 2010, 20:02; edited 3 times in total
Alphonso 05 Jan 2010, 12:14
Thanks tthsqe, I didn't think it would be anything new but it was fun discovering the sequence.
BitRAKE, I didn't mean to suggest you come up with a new method; I was just interested in seeing a 'generic' FASM solution for huge numbers. Sorry if I led you to believe otherwise, and no problem if there isn't any FASM code to be had.
r22 05 Jan 2010, 16:18
I'm interested in the algorithm that tthsqe posted, but I find that math-program's syntax fairly ambiguous. Is there a link to a syntax reference so I can properly interpret what the hell it's doing?
The liberal use of square brackets, commas, and semicolons in the syntax, and the lack of comments, are probably what's throwing me off. Perhaps this is one way math/science types ensure job security. In any case, I would like to be enlightened.
tthsqe 05 Jan 2010, 18:15
r22,
It is an iteration of the equations:

    f(2n)   = f(n) * l(n)
    l(2n)   = l(n)^2 - 2 * (-1)^n
    f(2n+1) = l(n) * (f(n) + l(n)) / 2 - (-1)^n
    l(2n+1) = l(n) * (5 * f(n) + l(n)) / 2 - (-1)^n

Basically, just scan the binary digits msb to lsb and apply the appropriate equations. Of course, at the last step we don't need to compute l. A nice aspect is that it eliminates the need for most of those mundane while and for loops, but I agree that the bracketed syntax, though it may unify the whole language, is not always the most readable. Check out the documentation at http://reference.wolfram.com/mathematica/guide/Mathematica.html.

Note:
* f(-1), f(0) and f(1) need special cases, and f(-n) needs a factor of (-1)^(n-1)
* a[[b]] is the Part function
* MapThread really should be ScanThread, if there were such a function

Edit: previous post has been updated

Last edited by tthsqe on 05 Jan 2010, 18:52; edited 1 time in total
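For reference, a rough Python transcription of that scan (structure and names are mine, assuming exactly the four identities above; it only handles n >= 0, and for simplicity it also updates l on the last bit even though, as noted, that isn't necessary):

Code:
def fib(n):
    """Fibonacci via the doubling identities above, scanning bits msb to lsb.
    f and l hold f(m) and l(m), where m is the prefix of the bits read so far."""
    if n <= 2:
        return (0, 1, 1)[n]
    f, l, m = 1, 1, 1                  # f(1) = 1, l(1) = 1
    for bit in bin(n)[3:]:             # bits after the leading 1, msb to lsb
        s = -1 if m % 2 else 1         # s = (-1)^m
        if bit == '0':
            f, l, m = f * l, l * l - 2 * s, 2 * m
        else:
            f, l, m = l * (f + l) // 2 - s, l * (5 * f + l) // 2 - s, 2 * m + 1
    return f

print(fib(32))   # 2178309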
bitRAKE 05 Jan 2010, 18:17
tthsqe has been using Mathematica long enough to learn the shortcuts - there is a more verbose syntax which executes just as fast (*hint*). The inner If statement is a pure function which performs the first operation if a bit is zero, and the second if it is one. More obviously, {f, l} = {a, b} is shorthand for f = a, l = b. The #n notation within the pure function represents the arguments. Double brackets access elements of arrays. Much of the syntax is C-like.
r22 05 Jan 2010, 19:16
It's much clearer now, I am sufficiently enlightened.
tthsqe 05 Jan 2010, 19:20
Quote: there is a more verbose syntax which executes just as fast (*hint*).
What are you hinting at here?
tthsqe 05 Jan 2010, 19:23
Ah, got it.