flat assembler
Message board for the users of flat assembler.
 Home   FAQ   Search   Register 
 Profile   Log in to check your private messages   Log in 
flat assembler > Main > Division by multiplication. How to get the remainder?

Author
Thread Post new topic Reply to topic
zhak



Joined: 12 Apr 2005
Posts: 479
Location: Belarus
Division by multiplication. How to get the remainder?
When substituting unsigned integer division by multiplication (aka magic division)

For example,

Code:
mov ecx10
div rcx


can be substituted with

Code:
mov rdx0xcccccccccccccccd
mul rdx
shr rdx3


how to get the remainder?
I'm trying to adapt the method for base conversion, but cannot figure out how to retrieve the remainder of the division in an elegant way.

Thanks in advance!

I first tried to reuse algorithm from this post:

Code:

mov rsi,10
mov r8,0x199999999999999a

mul r8
mov rdi,rdx
mul rsi
mov rax,rdi



DL holds the remainder

But this algorithm produces incorrect results for very large numbers.
For example, 9223372036854775807 is converted to 9223372036854775808, etc.

If anyone is interested in current code, here's a compilable example:

Code:

  mov rdx9223372036854775807
  mov rcxstrFmt
  call __dec2str
  
  jmp $

struct STRFMT
  flags     rw 1
  width     rw 1
  size      rw 1
  length    rb 1
  precision rb 1
ends

STRFMT_WIDTH          = 0x0001
STRFMT_SIZE           = 0x0002
STRFMT_LENGTH         = 0x0004
STRFMT_PRECISION      = 0x0008
STRFMT_ALIGN_LEFT     = 0x0010
STRFMT_ALIGN_CENTER   = 0x0020
STRFMT_SIGNED_NUMBER  = 0x0100
STRFMT_SIGNED_PLUS    = 0x0200
STRFMT_SIGNED_BLANK   = 0x0400
  
__dec2str:
  push rbp
  push rdi
  push rbx
  push r8
  push r9
  push r10
  mov rbprcx
  test rdxrdx
  jz short .check_padding
  mov raxrdx
  mov r10rdx
  sub rsp56
  xor ecxecx
  lea rdi, [rsp + 48]
  mov r80x199999999999999a
  mov ebx10
  mov word[rdi], cx
  test word[rbp + STRFMT.flags], STRFMT_SIGNED_NUMBER
  jz short @f
  test raxrax
  jns short @f
  neg rax
  @@:
  mul r8
  mov r9rdx
  mul rbx
  mov raxr9
  or r9rdx
  jz short .check_padding
  or dl0x30
  sub rdi2
  mov [rdi], dx
  inc ecx
  jmp short @b
 
 .check_padding:
  test word[rbp + STRFMT.flags], STRFMT_LENGTH
  jz short .check_sign
  mov albyte[rbp + STRFMT.length]
  cmp al20
  jbe short @f
  mov al20
  @@:
  sub eaxecx
  @@:
  jle short .check_sign
  sub rdi2
  mov word[rdi], '0'
  inc ecx
  dec eax
  jmp short @b
  
 .check_sign:
  test word[rbp + STRFMT.flags], STRFMT_SIGNED_NUMBER
  jz short .ret
  test r10r10
  mov ax'-'
  js short .apply_sign
  test word[rbp + STRFMT.flags], STRFMT_SIGNED_PLUS
  mov ax'+'
  jnz short .apply_sign
  test word[rbp + STRFMT.flags], STRFMT_SIGNED_BLANK
  jz short .ret
  mov ax' '
  
 .apply_sign:
  sub rdi2
  mov word[rdi], ax
  
 .ret:
  mov rdxrdi
  xor eaxeax
  add rsp56
  pop r10
  pop r9
  pop r8
  pop rbx
  pop rdi
  pop rbp
  ret

strFmt STRFMT STRFMT_LENGTH,0,0,0,0


Post 12 Jun 2016, 20:22
View user's profile Send private message Reply with quote
system error



Joined: 01 Sep 2013
Posts: 673
zhak, that Agner Fog's version doesn't return the remainder. It returns the most significant digits in EDX after getting rid of the remainder in one pass. 456/10...... returns 45 for the first pass.
Post 12 Jun 2016, 22:12
View user's profile Send private message Reply with quote
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 15303
Location: Bigweld Industries
You have to multiply the quotient by the divisor and subtract to get the remainder.

remainder = dividend - quotient * divisor

where: quotient = floor(dividend / divisor)
Post 12 Jun 2016, 23:24
View user's profile Send private message Visit poster's website Reply with quote
zhak



Joined: 12 Apr 2005
Posts: 479
Location: Belarus
Yeah, I thought about such approach in the first place as most obvious. But hoped there's some other magic trick for this Smile
Post 12 Jun 2016, 23:38
View user's profile Send private message Reply with quote
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 15303
Location: Bigweld Industries
I think that when you need the remainder it is probably just as efficient to use DIV. In the current batch of CPUs it is not too slow. So many people just assume it runs like a snail in molasses but in reality it is okay.
Post 12 Jun 2016, 23:58
View user's profile Send private message Visit poster's website Reply with quote
Mike Gonta



Joined: 26 Dec 2010
Posts: 202

revolution wrote:
You have to multiply the quotient by the divisor and subtract to get the remainder.
remainder = dividend - quotient * divisor
where: quotient = floor(dividend / divisor)


zhak wrote:
Yeah, I thought about such approach in the first place as most obvious. But hoped there's some other magic trick for this Smile

After the initial multiplication rdx contains the quotient*8, simply save this as two temporary's and right shift one
by 2 (divide by 4) to get the quotient*2 and add together to get the quotient*10.

revolution wrote:
I think that when you need the remainder it is probably just as efficient to use DIV. In the current batch of CPUs
it is not too slow. So many people just assume it runs like a snail in molasses but in reality it is okay.

Exactly, and since internally division is done with multiplication and multiplication is done with shifts this technique
is more elegant.
fizzbuzzter

_________________
Mike Gonta
look and see - many look but few see

http://mikegonta.com
Post 13 Jun 2016, 09:04
View user's profile Send private message Reply with quote
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 15303
Location: Bigweld Industries

Mike Gonta wrote:
After the initial multiplication rdx contains the quotient*8, simply save this as two temporary's and right shift one
by 2 (divide by 4) to get the quotient*2 and add together to get the quotient*10.

Before you use the value in RDX the lower three bits must be zeroed before you use it to get the final tens multiple. So you still either do the SHR by 3 and then SHL by 3, or you do AND with 0xfff...ff8. Usually it is easier to SHR by 3 to get the quotient and then two LEAs to do the multiply by 10.
Post 13 Jun 2016, 09:11
View user's profile Send private message Visit poster's website Reply with quote
zhak



Joined: 12 Apr 2005
Posts: 479
Location: Belarus
Thanks for the hints, guys. So I tried both div and mul variants:

First, DIV

Code:

@@:
  mov ebx10
  xor edxedx
  div rbx
  mov rbxrax
  or rbxrdx
  jz short .check_padding
  or dl0x30
  sub rdi2
  mov [rdi], dx
  inc ecx
  jmp short @b




and then MUL

Code:

@@:
  mov rbxrax
  mul r8
  shr rdx3
  mov raxrdx
  lea rdx, [rdx * 8]
  lea rdx, [rdx + rax * 2]
  sub rbxrdx
  mov rdxrax
  or rdxrbx
  jz short .check_padding
  or bl0x30
  sub rdi2
  mov [rdi], bx
  inc ecx
  jmp short @b




converting number 9223372036854775807 (19 iterations). Measured performance with simple RDTSC:

Code:

cpuid
  rdtsc
  push rax
  mov rdx9223372036854775807
  mov rcxstrFmt
  call __dec2str
  cpuid
  rdtsc
  pop rbx
  sub raxrbx



On my Core i7-4790K, DIV gives an average of 1356 ticks, while MUL gives average of 1012 ticks.
Yes, it's not that much, like 18 ticks per iteration. But with huge number of divisions gain would become significant.
Post 13 Jun 2016, 20:58
View user's profile Send private message Reply with quote
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 15303
Location: Bigweld Industries
The timing depends upon what the code is actually doing. Each use case will be different.

But I did notice that you load RBX with the value 10 for the DIV test, whereas you don't load R8 with its constant in the MUL test.

You can also eliminate one instruction in the MUL test:

Code:
  mov rbxrax
  mul r8
  shr rdx3
  lea rax, [rdx * 8]
  lea rax, [rax + rdx * 2]
  sub rbxrax ;RBX=remainder, RDX=quotient

Post 14 Jun 2016, 01:07
View user's profile Send private message Visit poster's website Reply with quote
zhak



Joined: 12 Apr 2005
Posts: 479
Location: Belarus
you're right. thanks. fixed
Post 14 Jun 2016, 06:44
View user's profile Send private message Reply with quote
Display posts from previous:
Post new topic Reply to topic

Jump to:  


< Last Thread | Next Thread >

Forum Rules:
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
You cannot attach files in this forum
You can download files in this forum


Powered by phpBB © 2001-2005 phpBB Group.

Main index   Download   Documentation   Examples   Message board
Copyright © 2004-2016, Tomasz Grysztar.