flat assembler
Message board for the users of flat assembler.

 Index > Main > BIGINT Factorial
Author
 Thread
bitRAKE

Joined: 21 Jul 2003
Posts: 3205
Location: vpcmipstrm
bitRAKE
So, how big is 52! ? How many bits is it?

Here is a method which grows the factorial on the stack - no need to approximate size and pre-allocate memory. The same technique works in 32-bit or 16-bit.

Code:
```BIGINT.factorial: ;...of R11

push 1 1 1
pop rcx r8

.next:
xor edx,edx
mov r9,rdx
mov r10,rdx
push rcx
jmp .1

.rdx: add [rsp],rax
jmp .C

.0:   add [rsp+rcx*8+8],rax
adc r9,rdx
.1:   mov rax,[rsp+rcx*8]
mul r8
mov [rsp+rcx*8],r9
mov r9,r10 ; zero
loop .0
pop rcx
; CF set when RDX non-zero
jc .rdx
add [rsp],rax
jnc .2
.C:       adc rdx,r9
; increase number size
push rdx
inc ecx
.2:      inc r8
cmp r11,r8
jnc .next

; number is generated and size in qwords is known (RCX)
; allocate space and copy from stack

lea rsp,[rsp+rcx*8]
retn    ```
...in my case, I just used the number on the stack; but it's in reverse order - with the high qwords low in memory.

_________________
¯\(°_o)/¯ unlicense.org
08 Jun 2009, 07:25
Madis731

Joined: 25 Sep 2003
Posts: 2139
Location: Estonia
Madis731
I'm just wondering if one can do this without shuffling so much on stack, but at least we can make it faster (~2000clk vs ~10000clk in my experiments):
Code:
```factorial:
; I couldn't find a "normal" way of doing the stacking
; Now the calling-address is interleaved with data on the stack
;
mov     rax,r11
lea     rcx,[r11-1]
xor     ebx,ebx
.inner:
mul     rcx
test    rax,rax
jz      @f
push    rax
call    .recur
pop     rax
@@:
sub     rcx,1
jne     .inner
ret
.recur:
mov     rax,[rsp-8]
mov     rbx,rdx
mul     rcx
add     rax,rbx
;        adc     qword[rsp-24],0
test    rax,rax
jz      @f
push    rax
call    .recur
pop     rax
@@:
;        mov     [rsp-8],rax
ret
```
08 Jun 2009, 13:12
bitRAKE

Joined: 21 Jul 2003
Posts: 3205
Location: vpcmipstrm
bitRAKE
I have no idea what is at [RSP-8] -- looks like bad mojo.

There are several different methods at:
http://www.luschny.de/math/factorial/index.html

...should be easy to reduce by a factor of 4 just using the Linear Difference method - almost the same code. Just kind of brain-storming still - I haven't done any time testing, yet. Each algorithm will have a performance characteristic, as well as external requirements to consider. For small values...
Code:
```; factorial RCX in interval [0,20]
mov rax,1
jrcxz .0
.n:     mul rcx
loop .n
.0:    ```
...is less bytes than storing the table.

Could use a dispatcher to select the routine based on size of RCX:
Code:
```Dispatcher:
virtual at rbp
.fac FUNCTION
end virtual

@@:      cmp [.fac.Limit],rcx
lea rbp,[rbp+sizeof.FUNCTION]
jc @B
call [.fac.Routine - sizeof.FUNCTION]    ```
08 Jun 2009, 15:22
pal

Joined: 26 Aug 2008
Posts: 227
pal
Hmm, interesting codes.

I know it is only an approximation method, but how would an implementation of Stirling's Approximation (or a derivative of it) compare to the standard method. (It is better for larger factorials then faily small ones like 52!).

Code:
```http://en.wikipedia.org/wiki/Stirling%27s_approximation
http://mathworld.wolfram.com/StirlingsApproximation.html
```
08 Jun 2009, 17:26
bitRAKE

Joined: 21 Jul 2003
Posts: 3205
Location: vpcmipstrm
bitRAKE
If you're talking in terms of speed - it'd obviously be faster if only 64 significant bits were needed of 1000000! to use an approximation. At some point all that is possible is an approximation: if we could by some magic write a decimal digit on each atom in the observable universe, we would be limited to 10^10^80 -- which is smaller than 2^260 !.
08 Jun 2009, 18:14
revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 18071
Location: In your JS exploiting you and your system
revolution
2^260 ~= 1.8526734e+78
08 Jun 2009, 18:23
bitRAKE

Joined: 21 Jul 2003
Posts: 3205
Location: vpcmipstrm
bitRAKE
2^260 = 1852 67342 77970 59126 77713 57601 39006 52565 23197 54650 24902 46313 21344 12661 00742 38976 [GMP].
(Did you see the "!" there?)
08 Jun 2009, 18:39
Madis731

Joined: 25 Sep 2003
Posts: 2139
Location: Estonia
Madis731
Actually I tested WolframAlpha extensively when testing the output of my factorial program. 2^260 is also calculable there, I wonder why you switched tools when "Wolfie" was already open
09 Jun 2009, 06:02
Borsuc

Joined: 29 Dec 2005
Posts: 2465
Location: Bucharest, Romania
Borsuc
Why Wolfram Alpha and not the excellent offline Pari/GP calculator (and free too)?
09 Jun 2009, 20:39
Madis731

Joined: 25 Sep 2003
Posts: 2139
Location: Estonia
Madis731
^o) hmm... maybe PariI/GP and GMP were not advertised that much.

Yet by a quick look I can see that GMP is really hard to use ("pi", "pi()", "sin", "sin(30)" are not recognized). Only thing I could do was the already used "^" and simple arithmetic ("+-*/"). Then again "3.2" nor "3,2" are recgnized Many signs you have to turn to hex ("+" == "%2B") before they start to work. This link http://gmplib.org/#TRY I found, helps a bit, though.

Pari/GP doesn't show me a quick way to get answers to my questions. I would have to download it and then get it to compile and I think it would take at least a day or two before I get my "2^260" answered.
10 Jun 2009, 06:17
r22

Joined: 27 Dec 2004
Posts: 805
r22
I was thinking SSE could help speed up the factorial, but because of the carry and the fact that SSE doesn't have an UNsigned integer multiply it probably wouldn't help much.

Unless you wanted to calculate N partial factorials in parallel then combine them as a last step.

52! = Product[1-13]x * Product[14-26]y * Product[27-39]z * Product[40-52]w

In using SIMD it might even be easier to interleave the products
52! = Prod[1-13](4x - 3) * Prod[1-13](4y-2) * ... * Prod[1-13](4w)
;;For those that don't follow my notation
(1*5*9*13...) * (2*6*10*14...) * (3*7*11*15...) * (4*8*12*16...)
10 Jun 2009, 12:46
Madis731

Joined: 25 Sep 2003
Posts: 2139
Location: Estonia
Madis731
I figured that one out pretty quickly that if you want to know the last 64 bits of the result you can just MUL in a loop.

Btw, there is "PMULUDQ—Multiply Packed Unsigned Doubleword Integers", but like you said there's no carry and all my tries also ended in a dead end.
10 Jun 2009, 19:23
Borsuc

Joined: 29 Dec 2005
Posts: 2465
Location: Bucharest, Romania
Borsuc
Madis731 wrote:
Pari/GP doesn't show me a quick way to get answers to my questions. I would have to download it and then get it to compile and I think it would take at least a day or two before I get my "2^260" answered.
It comes in binary I didn't compile anything and once it's set up, it's portable, you can even put it on your USB stick or something for the rest of your life.

Plus the true power lies in its scripts. Give it a try, and it's free.

_________________
Previously known as The_Grey_Beast
11 Jun 2009, 00:24
bitRAKE

Joined: 21 Jul 2003
Posts: 3205
Location: vpcmipstrm
bitRAKE
Madis731 wrote:
I wonder why you switched tools when "Wolfie" was already open
Some people don't like to use anything other than basic X/HTML in their browser, so GMP is another option.

http://ex-cs.sist.ac.jp/~tkouya/try_mpfr.html

...seems to have a usable interface.
Madis731 wrote:
I figured that one out pretty quickly that if you want to know the last 64 bits of the result you can just MUL in a loop.
They are mostly zero.

_________________
¯\(°_o)/¯ unlicense.org
16 Jun 2009, 05:42
 Display posts from previous: All Posts1 Day7 Days2 Weeks1 Month3 Months6 Months1 Year Oldest FirstNewest First

 Jump to: Select a forum Official----------------AssemblyPeripheria General----------------MainTutorials and ExamplesDOSWindowsLinuxUnixMenuetOS Specific----------------MacroinstructionsOS ConstructionIDE DevelopmentProjects and IdeasNon-x86 architecturesHigh Level LanguagesProgramming Language DesignCompiler Internals Other----------------FeedbackHeapTest Area

Forum Rules:
 You cannot post new topics in this forumYou cannot reply to topics in this forumYou cannot edit your posts in this forumYou cannot delete your posts in this forumYou cannot vote in polls in this forumYou cannot attach files in this forumYou can download files in this forum

Copyright © 1999-2020, Tomasz Grysztar. Also on GitHub, YouTube, Twitter.

Website powered by rwasa.