flat assembler
Message board for the users of flat assembler.

FPU: my first attempt

system error 10 Feb 2014, 11:23
I am trying to parse the fractional part of a floating-point value, but this test code misbehaves. The problem is only in the last fragment (just before finish:): it prints a number that looks to me like an "overflow" or something like that. All the code fragments do basically the same thing, yet the problem always shows up in the last one, whatever the value.

What could be the problem? Could it be related to the stack or to the FPU registers?

Code:
format PE
include 'win32ax.inc'
entry start

section 'code' code readable executable
start:
finit

mov dword[d],2
fild dword[d]   ;2
fild dword[c]   ;1
fdiv st0,st1    ;1/2
fimul [t]
;fadd dword[tot]
fist dword[tot]
cinvoke printf,"%d",[tot]
cinvoke printf,"%s",newline

mov dword[d],4
;fild[tot]
fild dword[d]
fild dword[c]
fdiv st0,st1    ;1/4
fimul [t]
;fadd dword[tot]
fist dword[tot]
cinvoke printf,"%u",[tot]
cinvoke printf,"%s",newline

mov dword[d],8
;fild[tot]
fild dword[d]
fild dword[c]
fdiv st0,st1    ;1/8
fimul [t]
;fadd dword[tot]
fist dword[tot]
cinvoke printf,"%u",[tot]
cinvoke printf,"%s",newline

mov dword[d],16
;fild[tot]
fild dword[d]
fild dword[c]
fdiv st0,st1    ;1/16
fimul [t]
;fadd dword[tot]
fist dword[tot]
cinvoke printf,"%u",[tot]
cinvoke printf,"%s",newline

mov dword[d],32     ;Problem ALWAYS here at this fragment
;fild[tot]
fild dword[d]
fild dword[c]
fdiv st0,st1    ;1/32
fimul [t]
;fadd dword[tot]
fist dword[tot]
cinvoke printf,"%u",[tot]
cinvoke printf,"%s",newline

finish:
invoke system,hold
invoke exit,0

section 'data' data readable writable
c dd 1
d dd 0
t dd 100000
tot dd 0
hold db "pause>nope",0      ; zero-terminated for system()
newline db 0ah,0dh,0        ; zero-terminated for printf "%s"

section 'import' import data readable
     library msvcrt,'msvcrt.dll'
     import msvcrt,\
     printf,'printf',system,'system',exit,'exit'    
Post 10 Feb 2014, 11:23
revolution 10 Feb 2014, 11:33
You are overflowing the FPU stack. Try something like this:
Code:
fild dword[d]
fidivr dword[c]
fimul [t]
fistp dword[tot]    
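
To see why only the last fragment fails: the x87 register stack has eight slots, every fragment in the original code pushes two values with fild, and neither fdiv st0,st1 nor fist pops anything. Four fragments therefore fill the stack, and the first fild of the fifth fragment finds no free register. Below is a minimal Python model of the stack depth only, not of the values. The names STACK_EFFECT and first_overflow are made up for this illustration.

```python
# Toy model of x87 register-stack occupancy; it tracks depth only, not values.
FPU_REGISTERS = 8  # the x87 register stack holds st0..st7

# Net change in stack depth for the instructions used in this thread.
STACK_EFFECT = {
    "fild": +1,    # push an integer converted to floating point
    "fdiv": 0,     # fdiv st0,st1 overwrites st0 and pops nothing
    "fidivr": 0,   # memory operand, result stays in st0
    "fimul": 0,    # memory operand, result stays in st0
    "fist": 0,     # store st0 as an integer, keep it on the stack
    "fistp": -1,   # store st0 as an integer, then pop it
}

ORIGINAL_FRAGMENT = ["fild", "fild", "fdiv", "fimul", "fist"]
SUGGESTED_FRAGMENT = ["fild", "fidivr", "fimul", "fistp"]


def first_overflow(fragment, repeats):
    """Return the 1-based repetition whose push finds the stack full, else None."""
    depth = 0  # finit leaves the stack empty
    for index in range(1, repeats + 1):
        for mnemonic in fragment:
            effect = STACK_EFFECT[mnemonic]
            if effect > 0 and depth == FPU_REGISTERS:
                return index
            depth += effect
    return None


print(first_overflow(ORIGINAL_FRAGMENT, 5))   # 5
print(first_overflow(SUGGESTED_FRAGMENT, 5))  # None
```

In this model the original sequence overflows on the fifth repetition, which matches the fragment marked "Problem ALWAYS here", while the fistp version returns to an empty stack after every fragment.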
Post 10 Feb 2014, 11:33
system error 10 Feb 2014, 11:40
revolution wrote:
You are overflowing the FPU stack. Try something like this:
Code:
fild dword[d]
fidivr dword[c]
fimul [t]
fistp dword[tot]    


Love you revo!

By the way, do you have any "special code" for this? (In other words, I am asking, gently and nicely, for working floating-point conversion code.) :P
Post 10 Feb 2014, 11:40
revolution 10 Feb 2014, 11:42
system error wrote:
By the way, do you have any "special code" for this? (In other words, I am asking, gently and nicely, for working floating-point conversion code.) :P
I don't know what you mean. Conversion from what to what?
Post 10 Feb 2014, 11:42
system error 10 Feb 2014, 11:46
revolution wrote:
system error wrote:
By the way, do you have any "special code" for this? (In other words, I am asking, gently and nicely, for working floating-point conversion code.) :P
I don't know what you mean. Conversion from what to what?
To a string, and back to a float; something like scanf("%f") and printf("%f").
Post 10 Feb 2014, 11:46
cod3b453 10 Feb 2014, 18:04
I don't have code for this, but basically you can treat the whole number as one big integer and then divide by ten to the power of the number of fractional digits. For example, treat 123.45 as if it were 12345 and then divide by 10^2, because there are two digits after the ".". If you want to handle a sign (+/-) or scientific notation such as 1.5E34, it's similar, but you have to check for them and add handling for those.
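
A minimal sketch of that idea, written in Python rather than assembly so the steps are easy to follow. The function name parse_decimal is made up for this illustration, and the sketch assumes plain input such as "-123.45" with no exponent part and no validation.

```python
def parse_decimal(text):
    """Parse a plain decimal string such as '-123.45' (no exponent part)."""
    sign = 1
    if text[0] in "+-":
        if text[0] == "-":
            sign = -1
        text = text[1:]
    # Split at the "." and glue the digits together: "123.45" -> 12345.
    whole, _, fraction = text.partition(".")
    digits = int(whole + fraction)
    # Two fractional digits mean the big integer is 10^2 times too large.
    scale = 10 ** len(fraction)
    return sign * digits / scale


print(parse_decimal("123.45"))  # 123.45
```

The only floating-point operation is the final division, so the digits themselves are collected exactly as an integer first.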

Going the other way is a little trickier; scientific notation is simpler for the general case. Extract the binary exponent x and convert it from base 2 to base 10 using y=floor(x*log_10(2)). Use y to divide the original number by 10^y. You can then extract each digit by repeatedly converting to int, subtracting the int from the value (to leave the fraction) and multiplying by 10 (the "." goes after the first digit in this case). Finally, append "e" and the value of y (this is a signed integer, so it is negative for small numbers) to end up with 1.2345e2. [If the magnitude of the number is moderate, between 2^-64 and 2^64, you can just do the convert to int/subtract/multiply by 10 steps directly to get 123.45, but for other values the output will be a bit long.]
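
The same steps as a rough Python sketch. The name to_scientific is made up here, and two things are assumptions that go beyond the description: the scaled value is held as an exact fraction so the digit loop does not pick up rounding errors, and a small correction loop is added because the estimate y can come out one too small. The sketch handles positive finite values only and truncates rather than rounds the last digit.

```python
import math
from fractions import Fraction


def to_scientific(value, digits=5):
    """Format a positive finite float as d.dddde<y>, truncated to `digits` digits."""
    if not (value > 0 and math.isfinite(value)):
        raise ValueError("this sketch only handles positive finite values")
    _, binary_exponent = math.frexp(value)  # value = m * 2**e with 0.5 <= m < 1
    x = binary_exponent - 1                 # so 2**x <= value < 2**(x + 1)
    y = math.floor(x * math.log10(2))       # decimal exponent estimate
    scaled = Fraction(value) / Fraction(10) ** y  # exact division by 10**y
    # The estimate can be off by one, so nudge it until 1 <= scaled < 10.
    while scaled >= 10:
        scaled /= 10
        y += 1
    while scaled < 1:
        scaled *= 10
        y -= 1
    out = []
    for _ in range(digits):
        digit = int(scaled)                 # convert to int
        out.append(str(digit))
        scaled = (scaled - digit) * 10      # keep the fraction, shift one digit
    return out[0] + "." + "".join(out[1:]) + "e" + str(y)


print(to_scientific(123.45))  # 1.2345e2
```

For 123.45 the binary exponent is 6, so the first estimate is y=1 and the correction loop raises it to 2 before the digits are extracted.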
Post 10 Feb 2014, 18:04
tthsqe 11 Feb 2014, 01:24
If you want ABSOLUTELY ACCURATE conversion between strings and extended precision values, you will need to build a multiprecision multiplier with a precision of at least 96 bits. I use 192 bits, but I realize that this much precision is overkill.

Code:

; prints float in st0 to string at rdi
PrintFloat:
virtual at rsp
.x       dq ?,?,?,?
.d       dq ?
.ex      dq ?
.ii      dq ?
.message dq ?
.rsi     dq ?
.rbp     dq ?
.rbx     dq ?
.r12     dq ?
.r13     dq ?
.s       dq ?
end virtual

                        sub  rsp,8*(65)
                        mov  [.rsi],rsi
                        mov  [.rbp],rbp
                        mov  [.rbx],rbx
                        mov  [.r12],r12
                        mov  [.r13],r13

                        fld  st0
                       fstp  tword[.x]
                      movzx  edx,word[.x+8]
                        mov  ecx,edx
                        and  ecx,0x08000
                         jz  @f
                       fchs
                        mov  al,'-'
                      stosb
                      @@:
                        and  edx,0x07FFF
                         jz  .zero
                        cmp  edx,0x07FFF
                         je  .inf
                        mov  [.message],rdi

                       fld1
                       fld1
                       fadd  st0,st0
                      fdivp  st1,st0

                     fldlg2
                       fild  qword[OutputBits]
                      fmulp  st1,st0
                      faddp  st1,st0
                      fistp  qword[.d]
                        mov  eax,1
                        mov  qword[.ex],rax
                     fldlg2
                        fld  st1
                      fyl2x
                      fistp  qword[.ii]
                        lea  rdi,[.x]
                       call  CVT_f80_F192
                        lea  rdi,[.s]
                        lea  rsi,[.x]
                        mov  rbx,[.ex]
                        add  rbx,[.d]
                        sub  rbx,[.ii]
                       call  POWMUL_F192
                        sub  rax,1
                        sub  rax,rbx
                        add  rdi,[.d]
                 @@:    sub  rdi,1
                        cmp  byte[rdi],'0'
                         je  @b
                        add  rdi,1

                        mov  rcx,rdi
                        lea  rsi,[.s]
                        mov  rdi,[.message]
                        sub  rcx,rsi

                        cmp  rax,16
                         jg  .print_w_exp
                        cmp  rax,-8
                         jl  .print_w_exp
                       test  rax,rax
                         jz  .print_no_exp
                         js  .print_neg_exp
                .print_pos_exp:
                        cmp  eax,ecx
                        lea  eax,[rax+1]
                        jae  @f
                       push  rax
                       push  rcx
                        mov  ecx,eax
                  rep movsb
                        mov  al,'.'
                      stosb
                        pop  rcx
                        pop  rax
                        sub  ecx,eax
                  rep movsb
                        jmp  .done
             @@:        sub  eax,ecx
                  rep movsb
                        mov  ecx,eax
                        mov  al,'0'
                  rep stosb
                        mov  al,'.'
                      stosb
                        jmp  .done

                .print_neg_exp:
                       push  rcx
                       push  rax
                        mov  ax,'0.'
                      stosw
                        pop  rcx
                        not  rcx
                        mov  al,'0'
                  rep stosb
                        pop  rcx
                  rep movsb
                        jmp  .done

                .print_no_exp:
                      movsb
                        sub  ecx,1
                        mov  al,'.'
                      stosb
                  rep movsb
                        jmp  .done

                .print_w_exp:
                      movsb
                        sub  ecx,1
                       push  rax
                        mov  al,'.'
                      stosb
                  rep movsb
                        mov  al,'e'
                      stosb
                        pop  rax
                       call  PrintInteger
                       ; jmp  .done

   .done:
                        mov  rsi,[.rsi]
                        mov  rbp,[.rbp]
                        mov  rbx,[.rbx]
                        mov  r12,[.r12]
                        mov  r13,[.r13]
                        add  rsp,8*65
                        ret


  .zero:                mov  ax,'0.'
                       fstp  st0
                      stosw
                        jmp  .done

 .inf:                  mov  rax,[.x]
                       fstp  st0
                       test  rax,rax
                        jns  .Nan
                        shl  rax,1
                        jnz  .Nan
                        mov  eax,'inf'
                      stosd
                        sub  rdi,1
                        jmp  .done

        .Nan:
                        mov  eax,'NaN'
                      stosd
                        sub  rdi,1
                        jmp  .done



CVT_f80_F192: ; pop fpu stack to rdi
                        sub  rsp,8*5
                       fstp  tword[rsp]
                        xor  eax,eax
                        mov  [rdi+8*0],rax
                        mov  [rdi+8*1],rax
                      movsx  ecx,word[rsp+8]
                        and  ecx,0x7FFF
                        sub  rcx,16383
                        mov  rax,qword[rsp+0]
                        mov  [rdi+8*2],rax
                        mov  [rdi+8*3],rcx
                        add  rsp,8*5
                        ret






POWMUL_F192:    ;  [rdi] = string(round([rsi] * 10^rbx))

virtual at rsp
  .base  dq ?,?,?,?
  .unity dq ?,?,?,?
  .rdi   dq ?
  .rsi   dq ?
  .rbx   dq ?
end virtual
                        sub  rsp,8*13
                        mov  [.rdi],rdi
                        mov  [.rbx],rbx
                        mov  [.rsi],rsi

                     movaps  xmm0,dqword[Bases+32*1+16*0]
                     movaps  xmm1,dqword[Bases+32*1+16*1]
                     movups  dqword[.unity+16*0],xmm0
                     movups  dqword[.unity+16*1],xmm1

                        mov  rax,qword[OutputBase]
                       test  rbx,rbx
                         jz  .zero
                        jns  @f
                        neg  rax
                        neg  rbx
                   @@:
                        shl  rax,5
                     movaps  xmm0,dqword[Bases+rax+16*0]
                     movaps  xmm1,dqword[Bases+rax+16*1]
                     movups  dqword[.base+16*0],xmm0
                     movups  dqword[.base+16*1],xmm1

                       test  rbx,rbx
                         jz  .w3
              .w1:      shr  rbx,1
                        jnc  .w2
                        lea  rdi,[.unity]
                        lea  rsi,[.base]
                       call  MUL_F192
              .w2:     test  rbx,rbx
                         jz  .w3
                        lea  rdi,[.base]
                        lea  rsi,[.base]
                       call  MUL_F192
                        jmp  .w1
              .w3:
        .zero:          lea  rdi,[.unity]
                        mov  rsi,[.rsi]
                       call  MUL_F192

                        ; rax = sign bit
                        ; ecx = exponent

                        sub  rcx,2*64-1
                        jns  .error

                @@:     shr  r13,1
                        rcr  r12,1
                        rcr  r11,1
                        add  rcx,1
                        jnz  @b

                        cmp  r11,rax
                         jb  .store
                         ja  .roundup
                       test  r12,1
                         jz  .store
     .roundup:          add  r12,1
                        adc  r13,0
    .store:
                        mov  rdi,[.rdi]
                        mov  rbp,rsp
        .ComputeDigits: xor  edx,edx
                        mov  rax,r13
                        div  qword[OutputBase]
                        mov  r13,rax
                        mov  rcx,rax
                        mov  rax,r12
                        div  qword[OutputBase]
                        mov  r12,rax
                         or  rcx,rax
                       push  rdx
                        jnz  .ComputeDigits
        .PrintDigits:   pop  rax
                        add  al,'0'
                      stosb
                        cmp  rsp,rbp
                         jb  .PrintDigits

                        mov  rax,rdi
                        sub  rax,[.rdi]
                        mov  rdi,[.rdi]
                        mov  rsi,[.rsi]
                        mov  rbx,[.rbx]

                        add  rsp,8*13
                        ret

.error:                int3


MUL_F192:               mov  rcx,[rsi+8*3]
                        add  rcx,[rdi+8*3]

                        mov  rax,[rsi+8*0]
                        mul  qword[rdi+8*0]
                        mov  r8,rax
                        mov  r9,rdx
                        xor  r10,r10
                        xor  r11,r11
                        xor  r12,r12
                        xor  r13,r13

                        mov  rax,[rsi+8*1]
                        mul  qword[rdi+8*0]
                        add  r9,rax
                        adc  r10,rdx
                        adc  r11,0
                        mov  rax,[rsi+8*0]
                        mul  qword[rdi+8*1]
                        add  r9,rax
                        adc  r10,rdx
                        adc  r11,0

                        mov  rax,[rsi+8*2]
                        mul  qword[rdi+8*0]
                        add  r10,rax
                        adc  r11,rdx
                        adc  r12,0
                        mov  rax,[rsi+8*1]
                        mul  qword[rdi+8*1]
                        add  r10,rax
                        adc  r11,rdx
                        adc  r12,0
                        mov  rax,[rsi+8*0]
                        mul  qword[rdi+8*2]
                        add  r10,rax
                        adc  r11,rdx
                        adc  r12,0

                        mov  rax,[rsi+8*2]
                        mul  qword[rdi+8*1]
                        add  r11,rax
                        adc  r12,rdx
                        adc  r13,0
                        mov  rax,[rsi+8*1]
                        mul  qword[rdi+8*2]
                        add  r11,rax
                        adc  r12,rdx
                        adc  r13,0

                        mov  rax,[rsi+8*2]
                        mul  qword[rdi+8*2]

                        add  r12,rax
                        adc  r13,rdx


                         js  @f
                        shl  r8,1
                        rcl  r9,1
                        rcl  r10,1
                        rcl  r11,1
                        rcl  r12,1
                        rcl  r13,1
                        jmp  .round
                  @@:   add  rcx,1

    .round:             xor  eax,eax
                        bts  rax,63
                        cmp  r10,rax
                         jb  .store
                         ja  .roundup
                       test  r11,1
                         jz  .store
     .roundup:          add  r11,1
                        adc  r12,0
                        adc  r13,0
                        jnc  .store
                        rcr  r13,1
                        rcr  r12,1
                        rcr  r11,1
                        add  rcx,1

    .store:             mov  [rdi+8*0],r11
                        mov  [rdi+8*1],r12
                        mov  [rdi+8*2],r13
                        mov  [rdi+8*3],rcx
                        ret


align 32
dq 0xcccccccccccccccc, 0xcccccccccccccccc, 0xcccccccccccccccc, -4
dq 0x8e38e38e38e38e38, 0x38e38e38e38e38e3, 0xe38e38e38e38e38e, -4
dq 0x0000000000000000, 0x0000000000000000, 0x8000000000000000, -3
dq 0x4924924924924924, 0x2492492492492492, 0x9249249249249249, -3
dq 0xaaaaaaaaaaaaaaaa, 0xaaaaaaaaaaaaaaaa, 0xaaaaaaaaaaaaaaaa, -3
dq 0xcccccccccccccccc, 0xcccccccccccccccc, 0xcccccccccccccccc, -3
dq 0x0000000000000000, 0x0000000000000000, 0x8000000000000000, -2
dq 0xaaaaaaaaaaaaaaaa, 0xaaaaaaaaaaaaaaaa, 0xaaaaaaaaaaaaaaaa, -2
dq 0x0000000000000000, 0x0000000000000000, 0x8000000000000000, -1
dq 0x0000000000000000, 0x0000000000000000, 0x8000000000000000, 0
Bases: dq ?,?,?,?
dq 0x0000000000000000, 0x0000000000000000, 0x8000000000000000, 0
dq 0x0000000000000000, 0x0000000000000000, 0x8000000000000000, 1
dq 0x0000000000000000, 0x0000000000000000, 0xc000000000000000, 1
dq 0x0000000000000000, 0x0000000000000000, 0x8000000000000000, 2
dq 0x0000000000000000, 0x0000000000000000, 0xa000000000000000, 2
dq 0x0000000000000000, 0x0000000000000000, 0xc000000000000000, 2
dq 0x0000000000000000, 0x0000000000000000, 0xe000000000000000, 2
dq 0x0000000000000000, 0x0000000000000000, 0x8000000000000000, 3
dq 0x0000000000000000, 0x0000000000000000, 0x9000000000000000, 3
dq 0x0000000000000000, 0x0000000000000000, 0xa000000000000000, 3


PrintInteger:       ; rax: number
                       push  rbp rcx rdx
                        mov  rbp,rsp
                       test  rax,rax
                        jns  .l1
                        mov  byte[rdi],'-'
                        add  rdi,1
                        neg  rax
                .l1:    xor  edx,edx
                        div  qword[OutputBase]
                       push  rdx
                       test  rax,rax
                        jnz  .l1
                .l2:    pop  rax
                        add  al,'0'
                      stosb
                        cmp  rsp,rbp
                         jb  .l2
                        pop  rdx rcx rbp
                        ret                                                                
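
As a reading aid, the .w1/.w2/.w3 loop in POWMUL_F192 looks like exponentiation by squaring: the exponent in rbx is consumed one bit at a time, .unity collects the product and .base is squared between bits, with a reciprocal entry from the Bases table used when the exponent is negative. Here is a small Python model of that control flow. The name scale_by_power_of_ten is made up, and exact fractions stand in for the 192-bit values that MUL_F192 rounds, so this is a sketch of the loop structure and not of the precision behaviour.

```python
from fractions import Fraction


def scale_by_power_of_ten(value, n):
    """Return value * 10**n using square-and-multiply on the exponent bits."""
    # A negative exponent switches to the reciprocal base, like the
    # negative-index rows of the Bases table.
    base = Fraction(10) if n >= 0 else Fraction(1, 10)
    n = abs(n)
    unity = Fraction(1)
    while n:
        if n & 1:          # low bit set: fold the current power into the product
            unity *= base
        n >>= 1
        if n:              # more bits left: square the base for the next bit
            base *= base
    return unity * value   # final multiply by the input value


print(scale_by_power_of_ten(Fraction(3), 5))   # 300000
print(scale_by_power_of_ten(Fraction(3), -2))  # 3/100
```

The loop needs only about log2(n) squarings, so even large decimal exponents cost a handful of multiplications.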
          
Post 11 Feb 2014, 01:24
system error 12 Feb 2014, 06:02
cod3b453 wrote:
I don't have code for this, but basically you can treat the whole number as one big integer and then divide by ten to the power of the number of fractional digits. For example, treat 123.45 as if it were 12345 and then divide by 10^2, because there are two digits after the ".". If you want to handle a sign (+/-) or scientific notation such as 1.5E34, it's similar, but you have to check for them and add handling for those.

Going the other way is a little trickier; scientific notation is simpler for the general case. Extract the binary exponent x and convert it from base 2 to base 10 using y=floor(x*log_10(2)). Use y to divide the original number by 10^y. You can then extract each digit by repeatedly converting to int, subtracting the int from the value (to leave the fraction) and multiplying by 10 (the "." goes after the first digit in this case). Finally, append "e" and the value of y (this is a signed integer, so it is negative for small numbers) to end up with 1.2345e2. [If the magnitude of the number is moderate, between 2^-64 and 2^64, you can just do the convert to int/subtract/multiply by 10 steps directly to get 123.45, but for other values the output will be a bit long.]


Thanks for the explanation.
Post 12 Feb 2014, 06:02
system error 12 Feb 2014, 06:08
tthsqe wrote:
If you want ABSOLUTELY ACCURATE conversion between strings and extended precision values, you will need to build a multiprecision multiplier with a precision of at least 96 bits. I use 192 bits, but I realize that this much precision is overkill.

Code:

; prints float in st0 to string at rdi
PrintFloat:
virtual at rsp
.x       dq ?,?,?,?
.d       dq ?
.ex      dq ?
.ii      dq ?
.message dq ?
.rsi     dq ?
.rbp     dq ?
.rbx     dq ?
.r12     dq ?
.r13     dq ?
.s       dq ?
end virtual

                        sub  rsp,8*(65)
                        mov  [.rsi],rsi
                        mov  [.rbp],rbp
                        mov  [.rbx],rbx
                        mov  [.r12],r12
                        mov  [.r13],r13

                        fld  st0
                       fstp  tword[.x]
                      movzx  edx,word[.x+8]
                        mov  ecx,edx
                        and  ecx,0x08000
                         jz  @f
                       fchs
                        mov  al,'-'
                      stosb
                      @@:
                        and  edx,0x07FFF
                         jz  .zero
                        cmp  edx,0x07FFF
                         je  .inf
                        mov  [.message],rdi

                       fld1
                       fld1
                       fadd  st0,st0
                      fdivp  st1,st0

                     fldlg2
                       fild  qword[OutputBits]
                      fmulp  st1,st0
                      faddp  st1,st0
                      fistp  qword[.d]
                        mov  eax,1
                        mov  qword[.ex],rax
                     fldlg2
                        fld  st1
                      fyl2x
                      fistp  qword[.ii]
                        lea  rdi,[.x]
                       call  CVT_f80_F192
                        lea  rdi,[.s]
                        lea  rsi,[.x]
                        mov  rbx,[.ex]
                        add  rbx,[.d]
                        sub  rbx,[.ii]
                       call  POWMUL_F192
                        sub  rax,1
                        sub  rax,rbx
                        add  rdi,[.d]
                 @@:    sub  rdi,1
                        cmp  byte[rdi],'0'
                         je  @b
                        add  rdi,1

                        mov  rcx,rdi
                        lea  rsi,[.s]
                        mov  rdi,[.message]
                        sub  rcx,rsi

                        cmp  rax,16
                         jg  .print_w_exp
                        cmp  rax,-8
                         jl  .print_w_exp
                       test  rax,rax
                         jz  .print_no_exp
                         js  .print_neg_exp
                .print_pos_exp:
                        cmp  eax,ecx
                        lea  eax,[rax+1]
                        jae  @f
                       push  rax
                       push  rcx
                        mov  ecx,eax
                  rep movsb
                        mov  al,'.'
                      stosb
                        pop  rcx
                        pop  rax
                        sub  ecx,eax
                  rep movsb
                        jmp  .done
             @@:        sub  eax,ecx
                  rep movsb
                        mov  ecx,eax
                        mov  al,'0'
                  rep stosb
                        mov  al,'.'
                      stosb
                        jmp  .done

                .print_neg_exp:
                       push  rcx
                       push  rax
                        mov  ax,'0.'
                      stosw
                        pop  rcx
                        not  rcx
                        mov  al,'0'
                  rep stosb
                        pop  rcx
                  rep movsb
                        jmp  .done

                .print_no_exp:
                      movsb
                        sub  ecx,1
                        mov  al,'.'
                      stosb
                  rep movsb
                        jmp  .done

                .print_w_exp:
                      movsb
                        sub  ecx,1
                       push  rax
                        mov  al,'.'
                      stosb
                  rep movsb
                        mov  al,'e'
                      stosb
                        pop  rax
                       call  PrintInteger
                       ; jmp  .done

   .done:
                        mov  rsi,[.rsi]
                        mov  rbp,[.rbp]
                        mov  rbx,[.rbx]
                        mov  r12,[.r12]
                        mov  r13,[.r13]
                        add  rsp,8*65
                        ret


  .zero:                mov  ax,'0.'
                       fstp  st0
                      stosw
                        jmp  .done

 .inf:                  mov  rax,[.x]
                       fstp  st0
                       test  rax,rax
                        jns  .Nan
                        shl  rax,1
                        jnz  .Nan
                        mov  eax,'inf'
                      stosd
                        sub  rdi,1
                        jmp  .done

        .Nan:
                        mov  eax,'NaN'
                      stosd
                        sub  rdi,1
                        jmp  .done



CVT_f80_F192: ; pop fpu stack to rdi
                        sub  rsp,8*5
                       fstp  tword[rsp]
                        xor  eax,eax
                        mov  [rdi+8*0],rax
                        mov  [rdi+8*1],rax
                      movsx  ecx,word[rsp+8]
                        and  ecx,0x7FFF
                        sub  rcx,16383
                        mov  rax,qword[rsp+0]
                        mov  [rdi+8*2],rax
                        mov  [rdi+8*3],rcx
                        add  rsp,8*5
                        ret






POWMUL_F192:    ;  [rdi] = string(round([rsi] * 10^rbx))

virtual at rsp
  .base  dq ?,?,?,?
  .unity dq ?,?,?,?
  .rdi   dq ?
  .rsi   dq ?
  .rbx   dq ?
end virtual
                        sub  rsp,8*13
                        mov  [.rdi],rdi
                        mov  [.rbx],rbx
                        mov  [.rsi],rsi

                     movaps  xmm0,dqword[Bases+32*1+16*0]
                     movaps  xmm1,dqword[Bases+32*1+16*1]
                     movups  dqword[.unity+16*0],xmm0
                     movups  dqword[.unity+16*1],xmm1

                        mov  rax,qword[OutputBase]
                       test  rbx,rbx
                         jz  .zero
                        jns  @f
                        neg  rax
                        neg  rbx
                   @@:
                        shl  rax,5
                     movaps  xmm0,dqword[Bases+rax+16*0]
                     movaps  xmm1,dqword[Bases+rax+16*1]
                     movups  dqword[.base+16*0],xmm0
                     movups  dqword[.base+16*1],xmm1

                       test  rbx,rbx
                         jz  .w3
              .w1:      shr  rbx,1
                        jnc  .w2
                        lea  rdi,[.unity]
                        lea  rsi,[.base]
                       call  MUL_F192
              .w2:     test  rbx,rbx
                         jz  .w3
                        lea  rdi,[.base]
                        lea  rsi,[.base]
                       call  MUL_F192
                        jmp  .w1
              .w3:
        .zero:          lea  rdi,[.unity]
                        mov  rsi,[.rsi]
                       call  MUL_F192

                        ; rax = sign bit
                        ; ecx = exponent

                        sub  rcx,2*64-1
                        jns  .error

                @@:     shr  r13,1
                        rcr  r12,1
                        rcr  r11,1
                        add  rcx,1
                        jnz  @b

                        cmp  r11,rax
                         jb  .store
                         ja  .roundup
                       test  r12,1
                         jz  .store
     .roundup:          add  r12,1
                        adc  r13,0
    .store:
                        mov  rdi,[.rdi]
                        mov  rbp,rsp
        .ComputeDigits: xor  edx,edx
                        mov  rax,r13
                        div  qword[OutputBase]
                        mov  r13,rax
                        mov  rcx,rax
                        mov  rax,r12
                        div  qword[OutputBase]
                        mov  r12,rax
                         or  rcx,rax
                       push  rdx
                        jnz  .ComputeDigits
        .PrintDigits:   pop  rax
                        add  al,'0'
                      stosb
                        cmp  rsp,rbp
                         jb  .PrintDigits

                        mov  rax,rdi
                        sub  rax,[.rdi]
                        mov  rdi,[.rdi]
                        mov  rsi,[.rsi]
                        mov  rbx,[.rbx]

                        add  rsp,8*13
                        ret

.error:                int3


MUL_F192:               mov  rcx,[rsi+8*3]
                        add  rcx,[rdi+8*3]

                        mov  rax,[rsi+8*0]
                        mul  qword[rdi+8*0]
                        mov  r8,rax
                        mov  r9,rdx
                        xor  r10,r10
                        xor  r11,r11
                        xor  r12,r12
                        xor  r13,r13

                        mov  rax,[rsi+8*1]
                        mul  qword[rdi+8*0]
                        add  r9,rax
                        adc  r10,rdx
                        adc  r11,0
                        mov  rax,[rsi+8*0]
                        mul  qword[rdi+8*1]
                        add  r9,rax
                        adc  r10,rdx
                        adc  r11,0

                        mov  rax,[rsi+8*2]
                        mul  qword[rdi+8*0]
                        add  r10,rax
                        adc  r11,rdx
                        adc  r12,0
                        mov  rax,[rsi+8*1]
                        mul  qword[rdi+8*1]
                        add  r10,rax
                        adc  r11,rdx
                        adc  r12,0
                        mov  rax,[rsi+8*0]
                        mul  qword[rdi+8*2]
                        add  r10,rax
                        adc  r11,rdx
                        adc  r12,0

                        mov  rax,[rsi+8*2]
                        mul  qword[rdi+8*1]
                        add  r11,rax
                        adc  r12,rdx
                        adc  r13,0
                        mov  rax,[rsi+8*1]
                        mul  qword[rdi+8*2]
                        add  r11,rax
                        adc  r12,rdx
                        adc  r13,0

                        mov  rax,[rsi+8*2]
                        mul  qword[rdi+8*2]

                        add  r12,rax
                        adc  r13,rdx


                         js  @f
                        shl  r8,1
                        rcl  r9,1
                        rcl  r10,1
                        rcl  r11,1
                        rcl  r12,1
                        rcl  r13,1
                        jmp  .round
                  @@:   add  rcx,1

    .round:             xor  eax,eax
                        bts  rax,63
                        cmp  r10,rax
                         jb  .store
                         ja  .roundup
                       test  r11,1
                         jz  .store
     .roundup:          add  r11,1
                        adc  r12,0
                        adc  r13,0
                        jnc  .store
                        rcr  r13,1
                        rcr  r12,1
                        rcr  r11,1
                        add  rcx,1

    .store:             mov  [rdi+8*0],r11
                        mov  [rdi+8*1],r12
                        mov  [rdi+8*2],r13
                        mov  [rdi+8*3],rcx
                        ret


align 32
dq 0xcccccccccccccccc, 0xcccccccccccccccc, 0xcccccccccccccccc, -4
dq 0x8e38e38e38e38e38, 0x38e38e38e38e38e3, 0xe38e38e38e38e38e, -4
dq 0x0000000000000000, 0x0000000000000000, 0x8000000000000000, -3
dq 0x4924924924924924, 0x2492492492492492, 0x9249249249249249, -3
dq 0xaaaaaaaaaaaaaaaa, 0xaaaaaaaaaaaaaaaa, 0xaaaaaaaaaaaaaaaa, -3
dq 0xcccccccccccccccc, 0xcccccccccccccccc, 0xcccccccccccccccc, -3
dq 0x0000000000000000, 0x0000000000000000, 0x8000000000000000, -2
dq 0xaaaaaaaaaaaaaaaa, 0xaaaaaaaaaaaaaaaa, 0xaaaaaaaaaaaaaaaa, -2
dq 0x0000000000000000, 0x0000000000000000, 0x8000000000000000, -1
dq 0x0000000000000000, 0x0000000000000000, 0x8000000000000000, 0
Bases: dq ?,?,?,?
dq 0x0000000000000000, 0x0000000000000000, 0x8000000000000000, 0
dq 0x0000000000000000, 0x0000000000000000, 0x8000000000000000, 1
dq 0x0000000000000000, 0x0000000000000000, 0xc000000000000000, 1
dq 0x0000000000000000, 0x0000000000000000, 0x8000000000000000, 2
dq 0x0000000000000000, 0x0000000000000000, 0xa000000000000000, 2
dq 0x0000000000000000, 0x0000000000000000, 0xc000000000000000, 2
dq 0x0000000000000000, 0x0000000000000000, 0xe000000000000000, 2
dq 0x0000000000000000, 0x0000000000000000, 0x8000000000000000, 3
dq 0x0000000000000000, 0x0000000000000000, 0x9000000000000000, 3
dq 0x0000000000000000, 0x0000000000000000, 0xa000000000000000, 3


PrintInteger:       ; rax: number
                       push  rbp rcx rdx
                        mov  rbp,rsp
                       test  rax,rax
                        jns  .l1
                        mov  byte[rdi],'-'
                        add  rdi,1
                        neg  rax
                .l1:    xor  edx,edx
                        div  qword[OutputBase]
                       push  rdx
                       test  rax,rax
                        jnz  .l1
                .l2:    pop  rax
                        add  al,'0'
                      stosb
                        cmp  rsp,rbp
                         jb  .l2
                        pop  rdx rcx rbp
                        ret

Thanks for the great code, although I can't test it on my little Atom PC. I'll credit you whenever I use this code in the future. Two questions:

1) What is the lookup table for?
2) What editor do you use? The code layout seems weird but interesting.
Post 12 Feb 2014, 06:08
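For anyone reading MUL_F192 without a 64-bit machine at hand, here is a rough Python model of what the routine appears to do. It assumes each operand is a 192-bit mantissa with its top bit set plus a signed exponent, which is what the stores at the end of the routine suggest. The names `mul_f192`, `MASK64` and `MASK192` are made up for this sketch, and, like the assembly, it looks only at the highest discarded qword (r10) when rounding.

```python
MASK64 = (1 << 64) - 1
MASK192 = (1 << 192) - 1

def mul_f192(a, b):
    """Multiply two (mantissa, exponent) pairs, 2**191 <= mantissa < 2**192."""
    (ma, ea), (mb, eb) = a, b
    exp = ea + eb                      # rcx in the assembly
    p = ma * mb                        # 384-bit product, r13:...:r8
    if p >> 383:                       # top bit set: product of mantissas is >= 2
        exp += 1
    else:                              # otherwise shift left one bit to normalise
        p <<= 1
    kept = p >> 192                    # r13:r12:r11, the part that is stored
    guard = (p >> 128) & MASK64        # r10, the highest discarded qword
    half = 1 << 63
    if guard > half or (guard == half and kept & 1):
        kept += 1                      # round to nearest, ties to even
        if kept >> 192:                # carry out of 192 bits: renormalise
            kept >>= 1
            exp += 1
    return kept, exp

three = (0xc000000000000000 << 128, 1)   # same encoding as the table row for 3
five = (0xa000000000000000 << 128, 2)    # same encoding as the table row for 5
mant, exp = mul_f192(three, five)
print(hex(mant >> 128), exp)             # top qword and exponent of 3 * 5
```

With exact inputs such as 3 and 5 the result is exact; multiplying by one of the reciprocal entries (for example 1/10) gives a result that is one unit short in the last place, which is where the rounding step starts to matter.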
tthsqe



Joined: 20 May 2009
Posts: 767
tthsqe 13 Feb 2014, 15:49
It seems that I have a habit of posting code that other people can't run. What exactly stops you from running it on the Atom? No 64-bit OS?
I like to have the instructions right-justified and the operands left-justified because it makes the code much easier to read.
I'll post a better version with a parser and some 32-bit versions as well.
You will have to figure out what the table contains. ;)
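One way to check a guess about the table around `Bases` is to decode a few rows numerically. The sketch below assumes each row is three mantissa qwords (least significant first) followed by a signed exponent, and that the value is mantissa / 2^191 * 2^exponent. That reading is an inference from the data, not something stated in the thread, and `decode` and `ROWS` are names invented for this example. The dictionary keys are row positions relative to `Bases`.

```python
from fractions import Fraction

def decode(lo, mid, hi, exp):
    """Read one table row as mantissa / 2**191 * 2**exp (assumed layout)."""
    mantissa = (hi << 128) | (mid << 64) | lo
    return Fraction(mantissa, 1 << 191) * Fraction(2) ** exp

# A few rows copied from the table; key = row index relative to Bases.
ROWS = {
    -10: (0xcccccccccccccccc, 0xcccccccccccccccc, 0xcccccccccccccccc, -4),
    -7: (0x4924924924924924, 0x2492492492492492, 0x9249249249249249, -3),
    -3: (0xaaaaaaaaaaaaaaaa, 0xaaaaaaaaaaaaaaaa, 0xaaaaaaaaaaaaaaaa, -2),
    3: (0, 0, 0xc000000000000000, 1),
    7: (0, 0, 0xe000000000000000, 2),
    10: (0, 0, 0xa000000000000000, 3),
}

for index, row in sorted(ROWS.items()):
    print(index, float(decode(*row)))
```

Under that assumption the rows after `Bases` come out as the integers 1 to 10, and the rows before it as 192-bit approximations of their reciprocals.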
r22



Joined: 27 Dec 2004
Posts: 805
r22 19 Feb 2014, 18:02
Here's some NASM parsing code that might be interesting. It uses SSE (instead of the FPU) to store the mantissa (fractional part) and the characteristic, then adds them together at the end.

Code:
global _start

section .data
    ten     dq  10.0
    one     dq  1.0
    zero    dq  0.0
    negate  dq  8000000000000000h
    result  dd  0.0
    hex     db  '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f'
    length  dd  0
    buffer  db  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

section .text

_start:
    ;;
    ;; read the fp number from stdin
    ;;
    mov     ecx, buffer
    mov     edx, 32
    call    read
    mov     dword [length], eax
    movq    xmm0, qword [zero]
    movq    xmm1, qword [zero]
    movq    xmm2, qword [ten]
    movq    xmm4, qword [ten]
    ;;
    ;; loop through 1 character at a time
    ;;
    mov     ebx, dword [length]
    test    ebx, ebx
    jz      quit        ;; there's no input
    mov     ecx, 0      ;; offset counter
    mov     edx, 0      ;; 0 for before decimal, 1 for after decimal
    mov     edi, 0      ;; 0 for positive, 1 for negative
    mov     esi, buffer
    cmp     byte [esi], '-'
    jne     process
    mov     edi, 1      ;; the number is negative
    inc     ecx
process:
    movzx   eax, byte [esi + ecx]
    cmp     al, '.'     ;; does al contain a decimal point '.'
    jne     next_check
    test    edx, edx    ;; more than 1 decimal error
    jnz     quit
    mov     edx, 1
    jmp     continue_process
next_check:
    sub     eax, '0'    ;; ascii digit to binary
    js      end_process ;; not a digit since eax is negative
    cmp     eax, 10
    jge     end_process ;; not a digit since eax is >= 10
    test    edx, edx    ;; before or after decimal
    jnz     mantissa_process
    mulsd   xmm0, xmm2  ;; result characteristic * 10
   cvtsi2sd xmm3, eax
    addsd   xmm0, xmm3  ;; result characteristic + next digit
    jmp     continue_process
mantissa_process:
    cvtsi2sd    xmm3, eax
    divsd   xmm3, xmm2  ;; next digit / current mantissa power of 10
    addsd   xmm1, xmm3  ;; result mantissa + next fraction
    mulsd   xmm2, xmm4  ;; mantissa power * 10
continue_process:
    inc     ecx
    cmp     ecx, ebx
    jl      process
end_process:
    addsd   xmm0, xmm1  ;; characteristic + mantissa
    test    edi, edi    ;; is the number supposed to be negative ?
    jz      store_result
    movq    xmm3, qword [negate]
    por     xmm0, xmm3  ;; set the sign bit (the value is non-negative here)
store_result:
   cvtsd2ss xmm0, xmm0  ;; double (64bit) to single (32) fp
    movd    eax, xmm0
    mov     dword[result], eax
    ;;
    ;; convert result to hex
    ;;
to_hex:
    mov edi, buffer
    mov esi, hex
    mov ebx, 0
    mov eax, dword [result]
    mov bl, al
    and bl, 0fh
    mov bl, byte [esi + ebx]
    mov byte [edi + 7], bl
    shr eax, 4
    mov bl, al
    and bl, 0fh
    mov bl, byte [esi + ebx]
    mov byte [edi + 6], bl
    shr eax, 4
    mov bl, al
    and bl, 0fh
    mov bl, byte [esi + ebx]
    mov byte [edi + 5], bl
    shr eax, 4
    mov bl, al
    and bl, 0fh
    mov bl, byte [esi + ebx]
    mov byte [edi + 4], bl
    shr eax, 4
    mov bl, al
    and bl, 0fh
    mov bl, byte [esi + ebx]
    mov byte [edi + 3], bl
    shr eax, 4
    mov bl, al
    and bl, 0fh
    mov bl, byte [esi + ebx]
    mov byte [edi + 2], bl
    shr eax, 4
    mov bl, al
    and bl, 0fh
    mov bl, byte [esi + ebx]
    mov byte [edi + 1], bl
    shr eax, 4
    mov bl, al
    and bl, 0fh
    mov bl, byte [esi + ebx]
    mov byte [edi + 0], bl
    ;;
    ;; print result
    ;;
print_dword:
    mov     ecx, buffer
    mov     edx, 8
    call    write
    ;;
    ;; quit
    ;;
quit:
    call    exit

exit:
    mov     eax, 01h    ; exit()
    xor     ebx, ebx    ; exit status 0
    int     80h
read:
    mov     eax, 03h    ; read()
    mov     ebx, 00h    ; stdin
    int     80h
    ret
write:
    mov     eax, 04h    ; write()
    mov     ebx, 01h    ; stdout
    int     80h
    ret
    

I wrote it as an answer to a SO question http://stackoverflow.com/a/20379836
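r22's program prints the parsed value as the eight hex digits of its single-precision bit pattern, most significant digit first. For reference values to compare against, something like the following Python sketch can be used. It is not part of r22's code, and `single_bits_hex` is a name made up here.

```python
import struct

def single_bits_hex(x):
    """Round x to IEEE 754 single precision and return its 32 bits as 8 hex digits."""
    return struct.pack(">f", x).hex()

for text in ("1.5", "-2.0", "0.15625"):
    print(text, single_bits_hex(float(text)))
```

The three inputs are exactly representable in single precision, so they make convenient first test cases: 1.5 should give 3fc00000, -2.0 should give c0000000 and 0.15625 should give 3e200000.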
system error



Joined: 01 Sep 2013
Posts: 670
system error 05 Mar 2014, 18:22
r22 wrote:
Here's some NASM parsing code that might be interesting. It uses SSE (instead of the FPU) to store the mantissa (fractional part) and the characteristic, then adds them together at the end.

I wrote it as an answer to a SO question http://stackoverflow.com/a/20379836


Thanks, I will check it out later. I'm still on Windows.
tthsqe



Joined: 20 May 2009
Posts: 767
tthsqe 06 Mar 2014, 08:06
Again, let me warn you about accuracy. The code that r22 posted will work fine if you don't need accurate parsing of floats, but if you need perfect accuracy then you will have to be more careful.
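To get a feel for the accuracy point, the digit-by-digit method can be compared against a correctly rounded conversion. The Python sketch below mirrors the arithmetic of the SSE loop in double precision (multiply the whole part by 10, divide each fraction digit by a growing power of 10, add the two at the end) and checks it against `float(Fraction(text))`, which rounds the exact decimal value only once. The function names are invented for this example. Unlike the assembly, which quits on a second '.', the sketch simply stops parsing there, and the final narrowing to single precision is left out.

```python
from fractions import Fraction

def naive_parse(text):
    """Digit-by-digit parse in doubles, following the SSE loop's arithmetic."""
    negative = text.startswith("-")
    if negative:
        text = text[1:]
    whole, frac, power, seen_point = 0.0, 0.0, 10.0, False
    for ch in text:
        if ch == "." and not seen_point:
            seen_point = True
        elif "0" <= ch <= "9":
            digit = float(ord(ch) - ord("0"))
            if seen_point:
                frac += digit / power      # each division and addition may round
                power *= 10.0
            else:
                whole = whole * 10.0 + digit
        else:
            break                          # stop at the first unexpected character
    value = whole + frac
    return -value if negative else value

def exact_parse(text):
    """Correctly rounded double: exact rational value, rounded once."""
    return float(Fraction(text))

def mismatches(samples):
    """Inputs for which the naive result is not the correctly rounded double."""
    return [s for s in samples if naive_parse(s) != exact_parse(s)]

samples = ["%d.%03d" % (i // 1000, i % 1000) for i in range(0, 5000, 7)]
bad = mismatches(samples)
print(len(bad), "of", len(samples), "inputs differ; first few:", bad[:5])
```

Any differences are in the last bits of the double, so many of them disappear once the result is narrowed to a 32-bit float. Where every bit has to be right, exact arithmetic in the style of the 192-bit routines earlier in the thread is the safer route.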
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.

Website powered by rwasa.