flat assembler
Message board for the users of flat assembler.

 Index > Main > Working with big numbers Goto page 1, 2  Next
Author
 Thread
vid
Verbosity in development

Joined: 05 Sep 2003
Posts: 7105
Location: Slovakia
vid
hi, i wrote an article how to do basic operations with bignums (big numbers, more than 32 / 64 bits):

http://www.x86asm.net/articles/working-with-big-numbers-using-x86-instructions/
20 Nov 2007, 18:19
asmfan

Joined: 11 Aug 2006
Posts: 392
Location: Russian
asmfan
Quote:

can't be done in any straightforward way.

Mult via addition and div via subtraction)
20 Nov 2007, 18:30
vid
Verbosity in development

Joined: 05 Sep 2003
Posts: 7105
Location: Slovakia
vid
By straightforward way i meant something linear, without looping, some O(N) way. I plan to add example of routine to multiply two big numbers, it should be quite simple, O(N*log(N)) i think.
20 Nov 2007, 19:02
realcr

Joined: 02 Apr 2007
Posts: 39
realcr
Hey vid.

I really like reading your pieces of code , so first great thanks for sharing it.
Asm people like speed , so you probably want to use an efficient algorithm for multiplication , like fast fourier transform for the discrete case.

realcr.
20 Nov 2007, 20:26
vid
Verbosity in development

Joined: 05 Sep 2003
Posts: 7105
Location: Slovakia
vid
Unfortunately i don't know FFT yet. I thought FFT is used for exponentiation, not multiplication.

My purpose in this article was to demonstrate real basics. Like how to work with 64bit numbers on 32bit processor, etc.

If you are really after speed, then use some ready-made library, which has been developed for several years. It will be likely faster than what you would write.
20 Nov 2007, 20:36
Garthower

Joined: 21 Apr 2006
Posts: 158
Location: Ukraine
Garthower
Hi vid!
Some time back I too was engaged in a similar problem. I shall not write the theory, you and have well painted it. Here my code for various operations, can, it will be useful to someone.

Add numbers
Code:
``` ; input: ESI=big a

;        EDI=big b

;        ECX=length in DWORDs

; output:CF

; action:b = b + a

bn_add:       clc                     ; CF=0

__cycle:      lodsd

adc     eax, [edi]

stosd

loop    __cycle

retn
```

[/b]Sub numbers:[/b]
Code:
``` ; input: ESI=big a

;        EDI=big b

;        ECX=length in DWORDs

; output:CF

; action:b = b - a

bn_sub:       clc                     ; CF=0

__cycle:      lodsd

sbb     [edi], eax

lea     edi, [edi+4]

loop    __cycle

retn
```

Compare numbers:
Code:
``` ; input: ESI=big a

;        EDI=big b

;        ECX=length in DWORDs

; output:CF==1 (jb) -- a < b

;        ZF==1 (jz) -- a = b

;        CF==0 (ja) -- a > b

bn_cmp:

__cycle:      mov     eax, [esi+ecx*4-4]

cmp     eax, [edi+ecx*4-4]
loopz  __cycle

__exit:       retn
```

Mul numbers:
Code:
```MulNumbers: ;Num1,Num2,Res,Size
pushad

mov edi,[esp+2Ch]
xor eax,eax
mov ecx,LEN_DWORD*2
rep stosd

mov ebx,[esp+24h]
mov ecx,[esp+28h]
xor esi,esi
align 4
@@First_Cicle:
xor edi,edi
align 4
@@Second_Cicle:
mov eax,[ebx+edi]
mul 4 ptr [ecx+esi]
lea ebp,[esi+edi]
add ebp,[esp+2Ch]
add [ebp],eax
adc [ebp+4],edx
lea ebp,[ebp+8]
jnc @@No_Add
@@Add_Again:
adc 4 ptr [ebp],0
lea ebp,[ebp+4]
jc @@Add_Again
@@No_Add:
add edi,4
cmp edi,[esp+30h]
jb @@Second_Cicle

add esi,4
cmp esi,[esp+30h]
jb @@First_Cicle

popad
ret 10h
```

DIV numbers:
Code:
```DivNumbers: ;Num1:dword,Num2:dword,Res:dword,Rem:dword
pushad

mov ecx,LEN_DWORD
mov edi,[esp+2Ch]
xor eax,eax
push ecx
rep stosd
pop ecx

mov edi,[esp+30h]
rep stosd

mov ebx,LEN_BITS-1
align 4
@@Next_Bit:
mov esi,[esp+2Ch]
clc
mov ecx,LEN_DWORD
align 4
@@Next_Shl:
rcl 4 ptr [esi],1
lea esi,[esi+4]
loop @@Next_Shl

mov eax,[esp+24h]
bt [eax],ebx
mov esi,[esp+30h]
mov ecx,LEN_DWORD
push esi
align 4
@@Next_Rcl:
rcl 4 ptr [esi],1
lea esi,[esi+4]
loop @@Next_Rcl
pop edi

mov esi,[esp+28h]
mov ecx,LEN_DWORD
align 4
@@Next_Cmp:
mov eax,[edi+ecx*4-4]
cmp eax,[esi+ecx*4-4]
jb @@No_Sub
ja @@Sub
loop @@Next_Cmp
@@Sub:
mov ecx,LEN_DWORD
clc
align 4
@@Next_Sub:
lodsd
sbb [edi],eax
lea edi,[edi+4]
loop @@Next_Sub

mov eax,[esp+2Ch]
or 1 ptr [eax],1
@@No_Sub:
dec ebx
jns @@Next_Bit

popad
ret 10h
```
21 Nov 2007, 18:35
hckr83

Joined: 12 Nov 2006
Posts: 86
Location: usa
hckr83
hey, Vid, is English your first language? You talk understandably, but some stuff is kinda awkward to read...like
Quote:
To invert all bits, we have not instruction

would be at least To invert all bits, we have a not instruction...and possibly phrased a bit differently...
21 Nov 2007, 23:51
vid
Verbosity in development

Joined: 05 Sep 2003
Posts: 7105
Location: Slovakia
vid
nope, i just learnt english by using. I always omit "a" and "the" , for some reason.

and i would use "the" in that case
21 Nov 2007, 23:55
MazeGen

Joined: 06 Oct 2003
Posts: 977
Location: Czechoslovakia
MazeGen
hckr83 wrote:
Quote:
To invert all bits, we have not instruction

would be at least To invert all bits, we have a not instruction...and possibly phrased a bit differently...

I'm the editor of x86asm.net. Note that "not" is set up using Curier - that should make the difference. Vid is used to use quotes to make mnemonics different, but I don't like this way.
And yes, there should be additional a...
22 Nov 2007, 08:55
rugxulo

Joined: 09 Aug 2005
Posts: 2341
Location: Usono (aka, USA)
rugxulo
Code:
```; ----------------------------------------------------------------------

AMD Optimizations

(c) 2001 - 2006 Advanced Micro Devices, Inc. All rights reserved.

http://developer.amd.com
/htmlhelp/optimization/wwhelp/wwhimpl/java/html/wwhelp.htm

(below taken from "Integer Optimizations" section):

; ----------------------------------------------------------------------

Efficient 64-Bit Integer Arithmetic in 32-Bit Mode
Optimization

The following section contains a collection of code snippets and
subroutines showing the efficient implementation of 64-bit arithmetic
in 32-bit mode. Note that these are 32-bit recommendations, in 64-bit
mode it is important to use 64-bit integer instructions for best
performance.

Addition, subtraction, negation, and shifting are best handled by inline
code. Multiplication, division, and the computation of remainders are less
common operations and are usually implemented as subroutines. If these
subroutines are used often, the programmer should consider inlining them.
Except for division and remainder calculations, the following code works
for both signed and unsigned integers. The division and remainder code
shown works for unsigned integers, but can easily be extended to handle
signed integers.

64-Bit Addition

; Add ECX:EBX to EDX:EAX, and place sum in EDX:EAX.
add eax, ebx
adc edx, ecx

64-Bit Subtraction

; Subtract ECX:EBX from EDX:EAX and place difference in EDX:EAX.
sub eax, ebx
sbb edx, ecx

64-Bit Negation

; Negate EDX:EAX.
not edx
neg eax
sbb edx, -1   ; Fix: Increment high word if low word was 0.

64-Bit Left Shift

; Shift EDX:EAX left, ??shift count in ECX (count
;  applied modulo 64).
shld edx, eax, cl   ; First apply shift count.
shl  eax, cl        ; ??mod 32 to EDX:EAX
test ecx, 32        ; Need to shift by another 32?
jz   lshift_done    ; No, done.
mov  edx, eax       ; Left shift EDX:EAX
xor  eax, eax       ;  by 32 bits
lshift_done:

64-Bit Right Shift

shrd eax, edx, cl   ; First apply shift count.
shr  edx, cl        ; ??mod 32 to EDX:EAX
test ecx, 32        ; Need to shift by another 32?
jz   rshift_done    ; No, done.
mov  eax, edx       ; Left shift EDX:EAX
xor  edx, edx       ;  by 32 bits.
rshift_done:

64-Bit Multiplication

; _llmul computes the low-order half of the product of its
;  arguments, two 64-bit integers.
;
; In:       [ESP+8]:[ESP+4] = multiplicand
;           [ESP+16]:[ESP+12] = multiplier
; Out:      EDX:EAX = (multiplicand * multiplier) % 2^64
; Destroys: EAX, ECX, EDX, EFlags
_llmul PROC
mov edx, [esp+8]    ; multiplicand_hi
mov ecx, [esp+16]   ; multiplier_hi
or  edx, ecx        ; One operand >= 2^32?
mov edx, [esp+12]   ; multiplier_lo
mov eax, [esp+4]    ; multiplicand_lo
jnz twomul          ; Yes, need two multiplies.
mul edx             ; multiplicand_lo * multiplier_lo
ret                 ; Done, return to caller.
twomul:
imul edx, [esp+8]         ; p3_lo = multiplicand_hi * multiplier_lo
imul ecx, eax             ; p2_lo = multiplier_hi * multiplicand_lo
add  ecx, edx             ; p2_lo + p3_lo
mul  dword ptr [esp+12]   ; p1 = multiplicand_lo * multiplier_lo
add  edx, ecx             ; p1 + p2_lo + p3_lo = result in EDX:EAX
ret                       ; Done, return to caller.
_llmul ENDP

64-Bit Unsigned Division

; _ulldiv divides two unsigned 64-bit integers and returns the quotient.
;
; In:       [ESP+8]:[ESP+4] = dividend
;           [ESP+16]:[ESP+12] = divisor
; Out:      EDX:EAX = quotient of division
; Destroys: EAX, ECX, EDX, EFlags
_ulldiv PROC
push ebx             ; Save EBX as per calling convention.
mov  ecx, [esp+20]   ; divisor_hi
mov  ebx, [esp+16]   ; divisor_lo
mov  edx, [esp+12]   ; dividend_hi
mov  eax, [esp+8]    ; dividend_lo
test ecx, ecx        ; divisor > (2^32 - 1)?
jnz  big_divisor     ; Yes, divisor > 2^32 - 1.
cmp  edx, ebx        ; Only one division needed (ECX = 0)?
jae  two_divs        ; Need two divisions.
div  ebx             ; EAX = quotient_lo
mov  edx, ecx        ; EDX = quotient_hi = 0 (quotient in EDX:EAX)
pop  ebx             ; Restore EBX as per calling convention.
ret                  ; Done, return to caller.
two_divs:
mov  ecx, eax   ; Save dividend_lo in ECX.
mov  eax, edx   ; Get dividend_hi.
xor  edx, edx   ; Zero-extend it into EDX:EAX.
div  ebx        ; quotient_hi in EAX
xchg eax, ecx   ; ECX = quotient_hi, EAX = dividend_lo
div  ebx        ; EAX = quotient_lo
mov  edx, ecx   ; EDX = quotient_hi (quotient in EDX:EAX)
pop  ebx        ; Restore EBX as per calling convention.
ret             ; Done, return to caller.
big_divisor:
push edi                  ; Save EDI as per calling convention.
mov  edi, ecx             ; Save divisor_hi.
shr  edx, 1               ; Shift both divisor and dividend right
rcr  eax, 1               ;  by 1 bit.
ror  edi, 1
rcr  ebx, 1
bsr  ecx, ecx             ; ECX = number of remaining shifts
shrd ebx, edi, cl         ; Scale down divisor and dividend
shrd eax, edx, cl         ;  such that divisor is less than
shr  edx, cl              ;  2^32 (that is, it fits in EBX).
rol  edi, 1               ; Restore original divisor_hi.
div  ebx                  ; Compute quotient.
mov  ebx, [esp+12]        ; dividend_lo
mov  ecx, eax             ; Save quotient.
imul edi, eax             ; quotient * divisor high word (??low only)
mul  dword ptr [esp+20]   ; quotient * divisor low word
add  edx, edi             ; EDX:EAX = quotient * divisor
sub  ebx, eax             ; dividend_lo - (quot.*divisor)_lo
mov  eax, ecx             ; Get quotient.
mov  ecx, [esp+16]        ; dividend_hi
sbb  ecx, edx             ; Subtract (divisor * quot.) from dividend.
sbb  eax, 0               ; Adjust quotient if remainder negative.
xor  edx, edx             ; Clear high word of quot. (EAX<=FFFFFFFFh).
pop  edi                  ; Restore EDI as per calling convention.
pop  ebx                  ; Restore EBX as per calling convention.
ret                       ; Done, return to caller.
_ulldiv ENDP

64-Bit Signed Division

; _lldiv divides two signed 64-bit numbers and delivers the quotient
;
; In:       [ESP+8]:[ESP+4] = dividend
;           [ESP+16]:[ESP+12] = divisor
; Out:      EDX:EAX = quotient of division
; Destroys: EAX, ECX,E DX, EFlags
_lldiv PROC
push ebx    ; Save EBX as per calling convention.
push esi    ; Save ESI as per calling convention.
push edi    ; Save EDI as per calling convention.
mov  ecx, [esp+28]   ; divisor_hi
mov  ebx, [esp+24]   ; divisor_lo
mov  edx, [esp+20]   ; dividend_hi
mov  eax, [esp+16]   ; dividend_lo
mov  esi, ecx        ; divisor_hi
xor  esi, edx        ; divisor_hi ^ dividend_hi
sar  esi, 31         ; (quotient < 0) ? -1 : 0
mov  edi, edx        ; dividend_hi
sar  edi, 31         ; (dividend < 0) ? -1 : 0
xor  eax, edi        ; If (dividend < 0),
xor  edx, edi        ;  compute 1's complement of dividend.
sub  eax, edi        ; If (dividend < 0),
sbb  edx, edi        ;  compute 2's complement of dividend.
mov  edi, ecx        ; divisor_hi
sar  edi, 31         ; (divisor < 0) ? -1 : 0
xor  ebx, edi        ; If (divisor < 0),
xor  ecx, edi        ;  compute 1's complement of divisor.
sub  ebx, edi        ; If (divisor < 0),
sbb  ecx, edi        ;  compute 2's complement of divisor.
jnz  big_divisor     ; divisor > 2^32 - 1
cmp  edx, ebx        ; Only one division needed (ECX = 0)?
jae  two_divs        ; Need two divisions.
div  ebx             ; EAX = quotient_lo
mov  edx, ecx        ; EDX = quotient_hi = 0 (quotient in EDX:EAX)
xor  eax, esi        ; If (quotient < 0),
xor  edx, esi        ;  compute 1's complement of result.
sub  eax, esi        ; If (quotient < 0),
sbb  edx, esi        ;  compute 2's complement of result.
pop  edi             ; Restore EDI as per calling convention.
pop  esi             ; Restore ESI as per calling convention.
pop  ebx             ; Restore EBX as per calling convention.
ret                  ; Done, return to caller.
two_divs:
mov  ecx, eax     ; Save dividend_lo in ECX.
mov  eax, edx     ; Get dividend_hi.
xor  edx, edx     ; Zero-extend it into EDX:EAX.
div  ebx          ; quotient_hi in EAX
xchg eax, ecx     ; ECX = quotient_hi, EAX = dividend_lo
div  ebx          ; EAX = quotient_lo
mov  edx, ecx     ; EDX = quotient_hi (quotient in EDX:EAX)
jmp  make_sign   ; Make quotient signed.
big_divisor:
sub  esp, 12             ; Create three local variables.
mov  [esp], eax          ; dividend_lo
mov  [esp+4], ebx        ; divisor_lo
mov  [esp+8], edx        ; dividend_hi
mov  edi, ecx            ; Save divisor_hi.
shr  edx, 1              ; Shift both
rcr  eax, 1              ;  divisor and
ror  edi, 1              ;  and dividend
rcr  ebx, 1              ;  right by 1 bit.
bsr  ecx, ecx            ; ECX = number of remaining shifts
shrd ebx, edi, cl        ; Scale down divisor and
shrd eax, edx, cl        ;  dividend such that divisor is
shr  edx, cl             ;  less than 2^32 (that is, fits in EBX).
rol  edi, 1              ; Restore original divisor_hi.
div  ebx                 ; Compute quotient.
mov  ebx, [esp]          ; dividend_lo
mov  ecx, eax            ; Save quotient.
imul edi, eax            ; quotient * divisor high word (??low only)
mul  DWORD PTR [esp+4]   ; quotient * divisor low word
add  edx, edi            ; EDX:EAX = quotient * divisor
sub  ebx, eax            ; dividend_lo - (quot.*divisor)_lo
mov  eax, ecx            ; Get quotient.
mov  ecx, [esp+8]        ; dividend_hi
sbb  ecx, edx            ; Subtract (divisor * quot.) from dividend
sbb  eax, 0              ; Adjust quotient if remainder is negative.
xor  edx, edx            ; Clear high word of quotient.
add  esp, 12             ; Remove local variables.
make_sign:
xor eax, esi   ; If (quotient < 0),
xor edx, esi   ;  compute 1's complement of result.
sub eax, esi   ; If (quotient < 0),
sbb edx, esi   ;  compute 2's complement of result.
pop edi        ; Restore EDI as per calling convention.
pop esi        ; Restore ESI as per calling convention.
pop ebx        ; Restore EBX as per calling convention.
ret            ; Done, return to caller.
_lldiv ENDP

64-Bit Unsigned Remainder Computation

; _ullrem divides two unsigned 64-bit integers and returns the remainder.
;
; In:       [ESP+8]:[ESP+4] = dividend
;           [ESP+16]:[ESP+12] = divisor
;
; Out:      EDX:EAX = remainder of division
;
; Destroys: EAX, ECX, EDX, EFlags
_ullrem PROC
push ebx              ; Save EBX as per calling convention.
mov  ecx, [esp+20]    ; divisor_hi
mov  ebx, [esp+16]    ; divisor_lo
mov  edx, [esp+12]    ; dividend_hi
mov  eax, [esp+8]     ; dividend_lo
test ecx, ecx         ; divisor > 2^32 - 1?
jnz  r_big_divisor    ; Yes, divisor > 32^32 - 1.
cmp  edx, ebx         ; Only one division needed (ECX = 0)?
jae  r_two_divs       ; Need two divisions.
div  ebx              ; EAX = quotient_lo
mov  eax, edx         ; EAX = remainder_lo
mov  edx, ecx         ; EDX = remainder_hi = 0
pop  ebx              ; Restore EBX per calling convention.
ret                   ; Done, return to caller.
r_two_divs:
mov ecx, eax   ; Save dividend_lo in ECX.
mov eax, edx   ; Get dividend_hi.
xor edx, edx   ; Zero-extend it into EDX:EAX.
div ebx        ; EAX = quotient_hi, EDX = intermediate remainder
mov eax, ecx   ; EAX = dividend_lo
div ebx        ; EAX = quotient_lo
mov eax, edx   ; EAX = remainder_lo
xor edx, edx   ; EDX = remainder_hi = 0
pop ebx        ; Restore EBX as per calling convention.
ret            ; Done, return to caller.
r_big_divisor:
push edi                  ; Save EDI as per calling convention.
mov  edi, ecx             ; Save divisor_hi.
shr  edx, 1               ; Shift both divisor and dividend right
rcr  eax, 1               ;  by 1 bit.
ror  edi, 1
rcr  ebx, 1
bsr  ecx, ecx             ; ECX = number of remaining shifts
shrd ebx, edi, cl         ; Scale down divisor and dividend such
shrd eax, edx, cl         ;  that divisor is less than 2^32
shr  edx, cl              ;  (that is, it fits in EBX).
rol  edi, 1               ; Restore original divisor (EDI:ESI).
div  ebx                  ; Compute quotient.
mov  ebx, [esp+12]        ; dividend low word
mov  ecx, eax             ; Save quotient.
imul edi, eax             ; quotient * divisor high word (??low only)
mul  DWORD PTR [esp+20]   ; quotient * divisor low word
add  edx, edi             ; EDX:EAX = quotient * divisor
sub  ebx, eax             ; dividend_lo - (quot.*divisor)_lo
mov  ecx, [esp+16]        ; dividend_hi
mov  eax, [esp+20]        ; divisor_lo
sbb  ecx, edx             ; Subtract divisor * quot. from dividend.
sbb  edx, edx             ; (remainder < 0) ? 0xFFFFFFFF : 0
and  eax, edx             ; (remainder < 0) ? divisor_lo : 0
and  edx, [esp+24]        ; (remainder < 0) ? divisor_hi : 0
add  eax, ebx             ; remainder += (remainder < 0) ? divisor : 0
pop  edi                  ; Restore EDI as per calling convention.
pop  ebx                  ; Restore EBX as per calling convention.
ret                       ; Done, return to caller.
_ullrem ENDP

64-Bit Signed Remainder Computation

; _llrem divides two signed 64-bit numbers and returns the remainder.
;
; In:       [ESP+8]:[ESP+4] = dividend
;           [ESP+16]:[ESP+12] = divisor
;
; Out:      EDX:EAX = remainder of division
;
; Destroys: EAX, ECX, EDX, EFlags
push ebx               ; Save EBX as per calling convention.
push esi               ; Save ESI as per calling convention.
push edi               ; Save EDI as per calling convention.
mov  ecx, [esp+28]     ; divisor-hi
mov  ebx, [esp+24]     ; divisor-lo
mov  edx, [esp+20]     ; dividend-hi
mov  eax, [esp+16]     ; dividend-lo
mov  esi, edx          ; sign(remainder) == sign(dividend)
sar  esi, 31           ; (remainder < 0) ? -1 : 0
mov  edi, edx          ; dividend-hi
sar  edi, 31           ; (dividend < 0) ? -1 : 0
xor  eax, edi          ; If (dividend < 0),
xor  edx, edi          ;  compute 1's complement of dividend.
sub  eax, edi          ; If (dividend < 0),
sbb  edx, edi          ;  compute 2's complement of dividend.
mov  edi, ecx          ; divisor-hi
sar  edi, 31           ; (divisor < 0) ? -1 : 0
xor  ebx, edi          ; If (divisor < 0),
xor  ecx, edi          ;  compute 1's complement of divisor.
sub  ebx, edi          ; If (divisor < 0),
sbb  ecx, edi          ;  compute 2's complement of divisor.
jnz  sr_big_divisor    ; divisor > 2^32 - 1
cmp  edx, ebx          ; Only one division needed (ECX = 0)?
jae  sr_two_divs       ; No, need two divisions.
div  ebx               ; EAX = quotient_lo
mov  eax, edx          ; EAX = remainder_lo
mov  edx, ecx          ; EDX = remainder_lo = 0
xor  eax, esi          ; If (remainder < 0),
xor  edx, esi          ;  compute 1's complement of result.
sub  eax, esi          ; If (remainder < 0),
sbb  edx, esi          ;  compute 2's complement of result.
pop  edi               ; Restore EDI as per calling convention.
pop  esi               ; Restore ESI as per calling convention.
pop  ebx               ; Restore EBX as per calling convention.
ret                    ; Done, return to caller.
sr_two_divs:
mov ecx, eax        ; Save dividend_lo in ECX.
mov eax, edx        ; Get dividend_hi.
xor edx, edx        ; Zero-extend it into EDX:EAX.
div ebx             ; EAX = quotient_hi, EDX = intermediate remainder
mov eax, ecx        ; EAX = dividend_lo
div  ebx            ; EAX = quotient_lo
mov  eax, edx       ; remainder_lo
xor  edx, edx       ; remainder_hi = 0
jmp  sr_makesign    ; Make remainder signed.
sr_big_divisor:
sub  esp, 16             ; Create three local variables.
mov  [esp], eax          ; dividend_lo
mov  [esp+4], ebx        ; divisor_lo
mov  [esp+8], edx        ; dividend_hi
mov  [esp+12], ecx       ; divisor_hi
mov  edi, ecx            ; Save divisor_hi.
shr  edx, 1              ; Shift both
rcr  eax, 1              ;  divisor and
ror  edi, 1              ;  and dividend
rcr  ebx, 1              ;  right by 1 bit.
bsr  ecx, ecx            ; ECX = number of remaining shifts
shrd ebx, edi, cl        ; Scale down divisor and
shrd eax, edx, cl        ;  dividend such that divisor is
shr  edx, cl             ;  less than 2^32 (that is, fits in EBX).
rol  edi, 1              ; Restore original divisor_hi.
div  ebx                 ; Compute quotient.
mov  ebx, [esp]          ; dividend_lo
mov  ecx, eax            ; Save quotient.
imul edi, eax            ; quotient * divisor high word (??low only)
mul  DWORD PTR [esp+4]   ; quotient * divisor low word
add  edx, edi            ; EDX:EAX = quotient * divisor
sub  ebx, eax            ; dividend_lo - (quot.*divisor)_lo
mov  ecx, [esp+8]        ; dividend_hi
sbb  ecx, edx            ; Subtract divisor * quot. from dividend.
sbb  eax, eax            ; remainder < 0 ? 0xffffffff : 0
mov  edx, [esp+12]       ; divisor_hi
and  edx, eax            ; remainder < 0 ? divisor_hi : 0
and  eax, [esp+4]        ; remainder < 0 ? divisor_lo : 0
add  eax, ebx            ; remainder_lo
add  edx, ecx            ; remainder_hi
add  esp, 16             ; Remove local variables.
sr_makesign:
xor eax, esi   ; If (remainder < 0),
xor edx, esi   ;  compute 1's complement of result.
sub eax, esi   ; If (remainder < 0),
sbb edx, esi   ;  compute 2's complement of result.
pop edi        ; Restore EDI as per calling convention.
pop esi        ; Restore ESI as per calling convention.
pop ebx        ; Restore EBX as per calling convention.
ret            ; Done, return to caller.
```
05 Dec 2007, 14:10
vid
Verbosity in development

Joined: 05 Sep 2003
Posts: 7105
Location: Slovakia
vid
Thanks. Don't you happen to have direct link to this manual, and name of chapter/section where this is? I'd add it as reference to article.

Things in article are basically done same way, except for NEG and that was already pointed out by bitrake on x86asm board.

Those bignum divisions are really interesting, I'd have to study them someday.
05 Dec 2007, 14:30
rugxulo

Joined: 09 Aug 2005
Posts: 2341
Location: Usono (aka, USA)
rugxulo
Sorry, I posted as close to a direct reference (URL) as I could.

Software Optimization Guide for AMD64 Processors
- Integer Optimizations
-- Efficient 64-Bit Integer Arithmatic in 32-Bit Mode
05 Dec 2007, 14:45
vid
Verbosity in development

Joined: 05 Sep 2003
Posts: 7105
Location: Slovakia
vid
Thanks a lot.

I plan to add bignum multiplication, and with your help even division someday to article.
05 Dec 2007, 14:50
MCD

Joined: 21 Aug 2004
Posts: 602
Location: Germany
MCD
07 Jan 2008, 05:41
vid
Verbosity in development

Joined: 05 Sep 2003
Posts: 7105
Location: Slovakia
vid
MCD:
not sure about math stuff, but my O(n*log(n)) is number of MUL instructions used, probably not what math people intended. My idea is something like this:

c = a*b
(all number consist of 4 dwords,
c0 = first dword of destination, c1 = second, etc...
edx is upper 32bits of previous multiplication)
Code:
```c0 = a0 * b0
c1 = a1 * b0 + edx
c2 = a2 * b0 + edx
c3 = a3 * b0 + edx

c1 += a0 * b1
c2 += a1 * b1 + edx
c3 += a2 * b1 + edx

c2 += a0 * b2
c3 += a1 * b2 + edx

c3 += a0 * b3
```

Isn't this O(n*log(n)) number of MUL instructions?
07 Jan 2008, 11:25
Madis731

Joined: 25 Sep 2003
Posts: 2139
Location: Estonia
Madis731
But if you want this code to scale without loops, you need to scale your code that does the operation.
If you add 128-bit numbers with 32-bit registers you need a loop ECX=4
If you mul 128-bit numbers with 64-bit registers you can do this manually
but MUL128 with R32 is tricky. You still need an algorithm, which means shifting is needed on sub-calculations.
07 Jan 2008, 13:19
vid
Verbosity in development

Joined: 05 Sep 2003
Posts: 7105
Location: Slovakia
vid
of course that it would be done with loops.

but number of MULs is N + (N-1) + (N-2) + (N-3) + ... + 1

isn't this O(n*log(n))? I never really knew big-O notation
07 Jan 2008, 14:13
LocoDelAssembly
Your code has a bug

Joined: 06 May 2005
Posts: 4624
Location: Argentina
LocoDelAssembly
The "N + (N-1) + (N-2) + (N-3) + ... + 1" is sometimes refered as the sum of the first N natural numbers (though, you wrote it in reverse order ).

Doing some maths:
Code:
```1 + 2 + ... + N-1 + N = ((1 + 2 + 3 + ... + N-2 + N-1 + N) + (N + N-1 + N-2 + ... + 3 + 2 + 1))/2
= (1+N + 2+N-1 + 3+N-2 + ... + N-2+3 + N-1+2 + N+1)/2
= (N+1 + N+1 + N+1 + ... + N+1 + N+1 + N+1)/2
= N(N+1)/2    ```

However, while N(N+1)/2 for N=100 is 5050, log(100) is 2 and log2(100) is 6,64.. Not sure if the notation refers to log2 or log10 but non of those matches your calculation of the number of MULs
07 Jan 2008, 14:38
vid
Verbosity in development

Joined: 05 Sep 2003
Posts: 7105
Location: Slovakia
vid
okay, so it is O(N^2)... my bad
I was just quessing anyway...
07 Jan 2008, 14:54
LocoDelAssembly
Your code has a bug

Joined: 06 May 2005
Posts: 4624
Location: Argentina
LocoDelAssembly
Doesn't exists O(N(N+1)/2)? N^2 is ultra big on big Ns, much much bigger than N(N+1)/2
07 Jan 2008, 14:57
 Display posts from previous: All Posts1 Day7 Days2 Weeks1 Month3 Months6 Months1 Year Oldest FirstNewest First

 Jump to: Select a forum Official----------------AssemblyPeripheria General----------------MainTutorials and ExamplesDOSWindowsLinuxUnixMenuetOS Specific----------------MacroinstructionsOS ConstructionIDE DevelopmentProjects and IdeasNon-x86 architecturesHigh Level LanguagesProgramming Language DesignCompiler Internals Other----------------FeedbackHeapTest Area
Goto page 1, 2  Next

Forum Rules:
 You cannot post new topics in this forumYou cannot reply to topics in this forumYou cannot edit your posts in this forumYou cannot delete your posts in this forumYou cannot vote in polls in this forumYou cannot attach files in this forumYou can download files in this forum

Copyright © 1999-2020, Tomasz Grysztar. Also on GitHub, YouTube, Twitter.

Website powered by rwasa.