flat assembler
Message board for the users of flat assembler.
FPU: my first attempt
revolution 10 Feb 2014, 11:33
You are overflowing the FPU stack. Try something like this:
Code:
        fild dword[d]
        fidivr dword[c]
        fimul [t]
        fistp dword[tot]
system error 10 Feb 2014, 11:40
revolution wrote: You are overflowing the FPU stack. Try something like this:
Love you revo! By the way, any "special code" from you regarding this? (aka I am asking for a working floating point conversion code gently and nicely)
revolution 10 Feb 2014, 11:42
system error wrote: By the way, any "special code" from you regarding this? (aka I am asking for a working floating point conversion code gently and nicely)
system error 10 Feb 2014, 11:46
revolution wrote:
cod3b453 10 Feb 2014, 18:04
I don't have code for this, but basically you can treat the whole number as one big integer and then divide by ten to the power of the number of fractional digits. For example, treat 123.45 as if it were 12345 and then divide by 10^2, because there are two digits after the ".". Handling a sign (+/-) or scientific notation such as 1.5E34 is similar, but you have to check for them and add the extra handling.
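A minimal C sketch of the "big integer, then divide" idea described above. The function name parse_decimal is made up for illustration. It assumes IEEE 754 doubles, accepts an optional sign, digits and a single ".", ignores exponent notation, and does not guard against overflow of the 64-bit digit accumulator (roughly 19 digits).

```c
#include <stdint.h>

/* Sketch: "123.45" is read as the integer 12345 and divided by 10^2.
   parse_decimal is a hypothetical name, not code from this thread. */
static double parse_decimal(const char *s)
{
    int negative = 0;
    uint64_t digits = 0;      /* every digit, read as one integer        */
    int frac_digits = 0;      /* how many of them came after the '.'     */
    int seen_point = 0;

    if (*s == '+' || *s == '-')
        negative = (*s++ == '-');

    for (; *s != '\0'; ++s) {
        if (*s == '.' && !seen_point) {
            seen_point = 1;
        } else if (*s >= '0' && *s <= '9') {
            digits = digits * 10 + (uint64_t)(*s - '0');
            if (seen_point)
                ++frac_digits;
        } else {
            break;            /* stop at the first unexpected character  */
        }
    }

    double scale = 1.0;       /* 10^frac_digits, exact up to 10^22       */
    for (int i = 0; i < frac_digits; ++i)
        scale *= 10.0;

    double value = (double)digits / scale;   /* a single rounding step   */
    return negative ? -value : value;
}
```

As long as the digit string fits in 53 bits and there are at most 22 fractional digits, both operands of the division are exact, so the result is rounded only once. Longer inputs lose that property.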
Going the other way is a little more tricky; scientific notation is simpler for the general case. Extract the binary exponent x and convert it from base 2 to base 10 using y=floor(x*log_10(2)). Use y to divide the original number by 10^y. You can then extract each digit by repeatedly converting to int, subtracting that int from the value (to leave the fraction) and multiplying by 10 (the "." goes after the first digit in this case). Finally append "e" and the value of y (a signed integer, so negative for small numbers) to end up with 1.2345e2. [If the number is of moderate size, i.e. its magnitude is between 2^-64 and 2^64, you can skip the exponent step and just do the convert to int/subtract/multiply by 10 loop to get 123.45, but for other values the output will be a bit long.]
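The steps above can be sketched in C roughly as follows. This is an illustration under assumptions, not a reference implementation: format_sci is a made-up name, the input is assumed to be an IEEE 754 binary64 double, digits are truncated rather than rounded, and the repeated multiply/divide by 10 adds rounding error, so the trailing digits are not guaranteed to be exact.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Sketch only: format_sci is a hypothetical name.
   out must have room for ndigits + 16 characters. */
static void format_sci(double x, int ndigits, char *out)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);            /* raw binary64 bit pattern   */
    int biased = (int)((bits >> 52) & 0x7FF);  /* 11-bit exponent field      */

    if (bits >> 63) {                          /* sign bit set               */
        *out++ = '-';
        x = -x;
    }
    if (biased == 0x7FF) {                     /* infinity or NaN            */
        strcpy(out, (bits << 12) ? "nan" : "inf");
        return;
    }
    if (x == 0.0) {
        strcpy(out, "0e0");
        return;
    }

    /* y = floor(e * log10(2)), e being the unbiased binary exponent */
    double p = (double)(biased - 1023) * 0.30102999566398120;
    int y = (int)p;
    if (p < (double)y)
        --y;                                   /* floor for negative p       */

    /* m = x / 10^y */
    double m = x;
    for (int i = 0; i < y; ++i) m /= 10.0;
    for (int i = 0; i > y; --i) m *= 10.0;

    /* the estimate can be off by a little, so force 1 <= m < 10 */
    while (m >= 10.0) { m /= 10.0; ++y; }
    while (m < 1.0)   { m *= 10.0; --y; }

    for (int i = 0; i < ndigits; ++i) {
        int d = (int)m;                        /* convert to int ...         */
        *out++ = (char)('0' + d);
        if (i == 0 && ndigits > 1)
            *out++ = '.';                      /* "." after the first digit  */
        m = (m - (double)d) * 10.0;            /* ... keep the fraction, x10 */
    }
    sprintf(out, "e%d", y);                    /* append the signed exponent */
}
```

Because the estimate of y only looks at the binary exponent, the scaled value can land slightly outside [1,10); the two while loops correct for that, including for subnormal inputs.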
tthsqe 11 Feb 2014, 01:24
If you want ABSOLUTELY ACCURATE conversion between strings and extended precision values, you will need to build a multiprecision multiplier with a precision of at least 96 bits. I use 192 bits, but I realize that this much precision is overkill.
Code:
; prints float in st0 to string at rdi
; note: OutputBits and OutputBase are referenced but not defined in this listing
PrintFloat:
virtual at rsp
  .x dq ?,?,?,?
  .d dq ?
  .ex dq ?
  .ii dq ?
  .message dq ?
  .rsi dq ?
  .rbp dq ?
  .rbx dq ?
  .r12 dq ?
  .r13 dq ?
  .s dq ?
end virtual
        sub rsp,8*(65)
        mov [.rsi],rsi
        mov [.rbp],rbp
        mov [.rbx],rbx
        mov [.r12],r12
        mov [.r13],r13
        fld st0
        fstp tword[.x]
        movzx edx,word[.x+8]
        mov ecx,edx
        and ecx,0x08000
        jz @f
        fchs
        mov al,'-'
        stosb
@@:
        and edx,0x07FFF
        jz .zero
        cmp edx,0x07FFF
        je .inf
        mov [.message],rdi
        fld1
        fld1
        fadd st0,st0
        fdivp st1,st0
        fldlg2
        fild qword[OutputBits]
        fmulp st1,st0
        faddp st1,st0
        fistp qword[.d]
        mov eax,1
        mov qword[.ex],rax
        fldlg2
        fld st1
        fyl2x
        fistp qword[.ii]
        lea rdi,[.x]
        call CVT_f80_F192
        lea rdi,[.s]
        lea rsi,[.x]
        mov rbx,[.ex]
        add rbx,[.d]
        sub rbx,[.ii]
        call POWMUL_F192
        sub rax,1
        sub rax,rbx
        add rdi,[.d]
@@:
        sub rdi,1
        cmp byte[rdi],'0'
        je @b
        add rdi,1
        mov rcx,rdi
        lea rsi,[.s]
        mov rdi,[.message]
        sub rcx,rsi
        cmp rax,16
        jg .print_w_exp
        cmp rax,-8
        jl .print_w_exp
        test rax,rax
        jz .print_no_exp
        js .print_neg_exp
.print_pos_exp:
        cmp eax,ecx
        lea eax,[rax+1]
        jae @f
        push rax
        push rcx
        mov ecx,eax
        rep movsb
        mov al,'.'
        stosb
        pop rcx
        pop rax
        sub ecx,eax
        rep movsb
        jmp .done
@@:
        sub eax,ecx
        rep movsb
        mov ecx,eax
        mov al,'0'
        rep stosb
        mov al,'.'
        stosb
        jmp .done
.print_neg_exp:
        push rcx
        push rax
        mov ax,'0.'
        stosw
        pop rcx
        not rcx
        mov al,'0'
        rep stosb
        pop rcx
        rep movsb
        jmp .done
.print_no_exp:
        movsb
        sub ecx,1
        mov al,'.'
        stosb
        rep movsb
        jmp .done
.print_w_exp:
        movsb
        sub ecx,1
        push rax
        mov al,'.'
        stosb
        rep movsb
        mov al,'e'
        stosb
        pop rax
        call PrintInteger
        ; jmp .done
.done:
        mov rsi,[.rsi]
        mov rbp,[.rbp]
        mov rbx,[.rbx]
        mov r12,[.r12]
        mov r13,[.r13]
        add rsp,8*65
        ret
.zero:
        mov ax,'0.'
        fstp st0
        stosw
        jmp .done
.inf:
        mov rax,[.x]
        fstp st0
        test rax,rax
        jns .Nan
        shl rax,1
        jnz .Nan
        mov eax,'inf'
        stosd
        sub rdi,1
        jmp .done
.Nan:
        mov eax,'NaN'
        stosd
        sub rdi,1
        jmp .done

CVT_f80_F192:
; pop fpu stack to rdi
        sub rsp,8*5
        fstp tword[rsp]
        xor eax,eax
        mov [rdi+8*0],rax
        mov [rdi+8*1],rax
        movsx ecx,word[rsp+8]
        and ecx,0x7FFFF
        sub rcx,16383
        mov rax,qword[rsp+0]
        mov [rdi+8*2],rax
        mov [rdi+8*3],rcx
        add rsp,8*5
        ret

POWMUL_F192:
; [rdi] = string(round([rsi] * 10^rbx))
virtual at rsp
  .base dq ?,?,?,?
  .unity dq ?,?,?,?
  .rdi dq ?
  .rsi dq ?
  .rbx dq ?
end virtual
        sub rsp,8*13
        mov [.rdi],rdi
        mov [.rbx],rbx
        mov [.rsi],rsi
        movaps xmm0,dqword[Bases+32*1+16*0]
        movaps xmm1,dqword[Bases+32*1+16*1]
        movups dqword[.unity+16*0],xmm0
        movups dqword[.unity+16*1],xmm1
        mov rax,qword[OutputBase]
        test rbx,rbx
        jz .zero
        jns @f
        neg rax
        neg rbx
@@:
        shl rax,5
        movaps xmm0,dqword[Bases+rax+16*0]
        movaps xmm1,dqword[Bases+rax+16*1]
        movups dqword[.base+16*0],xmm0
        movups dqword[.base+16*1],xmm1
        test rbx,rbx
        jz .w3
.w1:
        shr rbx,1
        jnc .w2
        lea rdi,[.unity]
        lea rsi,[.base]
        call MUL_F192
.w2:
        test rbx,rbx
        jz .w3
        lea rdi,[.base]
        lea rsi,[.base]
        call MUL_F192
        jmp .w1
.w3:
.zero:
        lea rdi,[.unity]
        mov rsi,[.rsi]
        call MUL_F192
        ; rax = sign bit
        ; ecx = exponent
        sub rcx,2*64-1
        jns .error
@@:
        shr r13,1
        rcr r12,1
        rcr r11,1
        add rcx,1
        jnz @b
        cmp r11,rax
        jb .store
        ja .roundup
        test r12,1
        jz .store
.roundup:
        add r12,1
        adc r13,0
.store:
        mov rdi,[.rdi]
        mov rbp,rsp
.ComputeDigits:
        xor edx,edx
        mov rax,r13
        div qword[OutputBase]
        mov r13,rax
        mov rcx,rax
        mov rax,r12
        div qword[OutputBase]
        mov r12,rax
        or rcx,rax
        push rdx
        jnz .ComputeDigits
.PrintDigits:
        pop rax
        add al,'0'
        stosb
        cmp rsp,rbp
        jb .PrintDigits
        mov rax,rdi
        sub rax,[.rdi]
        mov rdi,[.rdi]
        mov rsi,[.rsi]
        mov rbx,[.rbx]
        add rsp,8*13
        ret
.error:
        int3

MUL_F192:
        mov rcx,[rsi+8*3]
        add rcx,[rdi+8*3]
        mov rax,[rsi+8*0]
        mul qword[rdi+8*0]
        mov r8,rax
        mov r9,rdx
        xor r10,r10
        xor r11,r11
        xor r12,r12
        xor r13,r13
        mov rax,[rsi+8*1]
        mul qword[rdi+8*0]
        add r9,rax
        adc r10,rdx
        adc r11,0
        mov rax,[rsi+8*0]
        mul qword[rdi+8*1]
        add r9,rax
        adc r10,rdx
        adc r11,0
        mov rax,[rsi+8*2]
        mul qword[rdi+8*0]
        add r10,rax
        adc r11,rdx
        adc r12,0
        mov rax,[rsi+8*1]
        mul qword[rdi+8*1]
        add r10,rax
        adc r11,rdx
        adc r12,0
        ; limb 0 * limb 2 (the posted code repeated limb 2 * limb 0 here)
        mov rax,[rsi+8*0]
        mul qword[rdi+8*2]
        add r10,rax
        adc r11,rdx
        adc r12,0
        mov rax,[rsi+8*2]
        mul qword[rdi+8*1]
        add r11,rax
        adc r12,rdx
        adc r13,0
        mov rax,[rsi+8*1]
        mul qword[rdi+8*2]
        add r11,rax
        adc r12,rdx
        adc r13,0
        mov rax,[rsi+8*2]
        mul qword[rdi+8*2]
        add r12,rax
        adc r13,rdx
        js @f
        shl r8,1
        rcl r9,1
        rcl r10,1
        rcl r11,1
        rcl r12,1
        rcl r13,1
        jmp .round
@@:
        add rcx,1
.round:
        xor eax,eax
        bts rax,63
        cmp r10,rax
        jb .store
        ja .roundup
        test r11,1
        jz .store
.roundup:
        add r11,1
        adc r12,0
        adc r13,0
        jnc .store
        rcr r13,1
        rcr r12,1
        rcr r11,1
        add rcx,1
.store:
        mov [rdi+8*0],r11
        mov [rdi+8*1],r12
        mov [rdi+8*2],r13
        mov [rdi+8*3],rcx
        ret

align 32
        dq 0xcccccccccccccccc, 0xcccccccccccccccc, 0xcccccccccccccccc, -4
        dq 0x8e38e38e38e38e38, 0x38e38e38e38e38e3, 0xe38e38e38e38e38e, -4
        dq 0x0000000000000000, 0x0000000000000000, 0x8000000000000000, -3
        dq 0x4924924924924924, 0x2492492492492492, 0x9249249249249249, -3
        dq 0xaaaaaaaaaaaaaaaa, 0xaaaaaaaaaaaaaaaa, 0xaaaaaaaaaaaaaaaa, -3
        dq 0xcccccccccccccccc, 0xcccccccccccccccc, 0xcccccccccccccccc, -3
        dq 0x0000000000000000, 0x0000000000000000, 0x8000000000000000, -2
        dq 0xaaaaaaaaaaaaaaaa, 0xaaaaaaaaaaaaaaaa, 0xaaaaaaaaaaaaaaaa, -2
        dq 0x0000000000000000, 0x0000000000000000, 0x8000000000000000, -1
        dq 0x0000000000000000, 0x0000000000000000, 0x8000000000000000, 0
Bases:
        dq ?,?,?,?
        dq 0x0000000000000000, 0x0000000000000000, 0x8000000000000000, 0
        dq 0x0000000000000000, 0x0000000000000000, 0x8000000000000000, 1
        dq 0x0000000000000000, 0x0000000000000000, 0xc000000000000000, 1
        dq 0x0000000000000000, 0x0000000000000000, 0x8000000000000000, 2
        dq 0x0000000000000000, 0x0000000000000000, 0xa000000000000000, 2
        dq 0x0000000000000000, 0x0000000000000000, 0xc000000000000000, 2
        dq 0x0000000000000000, 0x0000000000000000, 0xe000000000000000, 2
        dq 0x0000000000000000, 0x0000000000000000, 0x8000000000000000, 3
        dq 0x0000000000000000, 0x0000000000000000, 0x9000000000000000, 3
        dq 0x0000000000000000, 0x0000000000000000, 0xa000000000000000, 3

PrintInteger:
; rax: number
        push rbp rcx rdx
        mov rbp,rsp
        test rax,rax
        jns .l1
        mov byte[rdi],'-'
        add rdi,1
        neg rax
.l1:
        xor edx,edx
        div qword[OutputBase]
        push rdx
        test rax,rax
        jnz .l1
.l2:
        pop rax
        add al,'0'
        stosb
        cmp rsp,rbp
        jb .l2
        pop rdx rcx rbp
        ret
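The .w1/.w2 loop in POWMUL_F192 appears to be ordinary exponentiation by squaring: shift the exponent right one bit at a time, multiply the running product when the shifted-out bit is set, and square the base in between. A small C sketch of that loop shape, with long double standing in for the 192-bit format and a made-up function name, might look like this:

```c
/* Sketch: pow_by_squaring is a hypothetical name. long double replaces the
   192-bit mantissa of the listing, so results are far less precise. */
static long double pow_by_squaring(unsigned base_int, long n)
{
    long double base = (long double)base_int;
    long double unity = 1.0L;          /* running product, starts at 1        */

    if (n < 0) {                       /* negative power: use 1/base instead, */
        base = 1.0L / base;            /* like picking the reciprocal row     */
        n = -n;
    }
    while (n != 0) {
        if (n & 1)
            unity *= base;             /* .w1: bit shifted out was set        */
        n >>= 1;
        if (n != 0)
            base *= base;              /* .w2: square for the next bit        */
    }
    return unity;
}
```

With this scheme 10^n needs about 2*log2(n) multiplications instead of n, which matters when every multiplication is a multi-limb routine such as MUL_F192.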
system error 12 Feb 2014, 06:02
cod3b453 wrote: I don't have code for this but basically you can treat the whole number as a big integer then divide by ten to the power of the number of fractional digits e.g. treat 123.45 as if it were 12345 and then divide by 10^2 because there are two digits after ".". If you want to handle sign (+/-)/scientific 1.5E34 type notation it's similar but you have to check for them and add handling for those.
Thanks for the explanation.
system error 12 Feb 2014, 06:08
tthsqe wrote: If you want ABSOLUTELY ACCURATE conversion between strings and extended precision values, you will need to build a multiprecision multiplier with a precision of at least 96 bits. I use 192 bits, but I realize that this much precision is overkill.
1) What's the lookup table for?
2) What editor do you use? Code layout seems weird but interesting.
tthsqe 13 Feb 2014, 15:49
It seems that I have a habit of posting code that other people can't run. What exactly stops you from running this on an Atom? No 64-bit OS?
I like to have the instructions right justified and the operands left justified because it makes the code much easier to read. I'll post a better version with a parser and some 32-bit versions as well. You will have to figure out what the table contains.
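For anyone who wants to check a guess about the table, here is a small C helper that decodes one row. It assumes each row is laid out as three 64-bit mantissa limbs (least significant first) followed by a binary exponent, with the binary point after the top bit of the last limb; that layout is inferred from how MUL_F192 and POWMUL_F192 use the rows, not stated anywhere in the thread. The name table_entry_value is made up, and only the top limb is used, so the result is an approximation.

```c
#include <stdint.h>

/* Hypothetical helper: approximate value of a row
       dq limb0, limb1, limb2, exponent
   using only limb2 (the most significant 64 of the 192 mantissa bits). */
static double table_entry_value(uint64_t top_limb, int exponent)
{
    /* treat the top limb as fixed point with the binary point after bit 63 */
    double v = (double)top_limb / 9223372036854775808.0;   /* divide by 2^63 */

    for (; exponent > 0; --exponent) v *= 2.0;   /* apply 2^exponent         */
    for (; exponent < 0; ++exponent) v *= 0.5;
    return v;
}
```

Read this way, the rows after Bases decode to 1, 2, ..., 10, and the rows before it decode to 1/10, 1/9, ..., 1/1, which fits POWMUL_F192 indexing the table with plus or minus OutputBase.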
r22 19 Feb 2014, 18:02
Here's some NASM parsing code that might be interesting. It uses SSE (instead of the FPU) to hold the mantissa (fractional part) and the characteristic (integer part) separately, then adds them together at the end.
Code:
global _start

section .data
ten dq 10.0
one dq 1.0
zero dq 0.0
negate dq 8000000000000000h
result dd 0.0
hex db '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f'
length dd 0
buffer db 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

section .text
_start:
;;
;; read the fp number from stdin
;;
        mov ecx, buffer
        mov edx, 32
        call read
        mov dword [length], eax
        movq xmm0, qword [zero]
        movq xmm1, qword [zero]
        movq xmm2, qword [ten]
        movq xmm4, qword [ten]
;;
;; loop through 1 character at a time
;;
        mov ebx, dword [length]
        test ebx, ebx
        jz quit                 ;; there's no input
        mov ecx, 0              ;; offset counter
        mov edx, 0              ;; 0 for before decimal, 1 for after decimal
        mov edi, 0              ;; 0 for positive, 1 for negative
        mov esi, buffer
        cmp byte [esi], '-'
        jne process
        mov edi, 1              ;; the number is negative
        inc ecx
process:
        movzx eax, byte [esi + ecx]
        cmp al, '.'             ;; does al contain a decimal point '.'
        jne next_check
        test edx, edx           ;; more than 1 decimal error
        jnz quit
        mov edx, 1
        jmp continue_process
next_check:
        sub eax, '0'            ;; ascii digit to binary
        js end_process          ;; not a digit since eax is negative
        cmp eax, 10
        jge end_process         ;; not a digit since eax is >= 10
        test edx, edx           ;; before or after decimal
        jnz mantissa_process
        mulsd xmm0, xmm2        ;; result characteristic * 10
        cvtsi2sd xmm3, eax
        addsd xmm0, xmm3        ;; result characteristic + next digit
        jmp continue_process
mantissa_process:
        cvtsi2sd xmm3, eax
        divsd xmm3, xmm2        ;; next digit / current mantissa power of 10
        addsd xmm1, xmm3        ;; result mantissa + next fraction
        mulsd xmm2, xmm4        ;; mantissa power * 10
continue_process:
        inc ecx
        cmp ecx, ebx
        jl process
end_process:
        addsd xmm0, xmm1        ;; characteristic + mantissa
        test edi, edi           ;; is the number supposed to be negative ?
        jz store_result
        movq xmm3, qword [negate]
        por xmm0, xmm3          ;; set the sign bit (value is non-negative here)
store_result:
        cvtsd2ss xmm0, xmm0     ;; double (64bit) to single (32) fp
        movd eax, xmm0
        mov dword[result], eax
;;
;; convert result to hex
;;
to_hex:
        mov edi, buffer
        mov esi, hex
        mov ebx, 0
        mov eax, dword [result]
        mov bl, al
        and bl, 0fh
        mov bl, byte [esi + ebx]
        mov byte [edi + 7], bl
        shr eax, 4
        mov bl, al
        and bl, 0fh
        mov bl, byte [esi + ebx]
        mov byte [edi + 6], bl
        shr eax, 4
        mov bl, al
        and bl, 0fh
        mov bl, byte [esi + ebx]
        mov byte [edi + 5], bl
        shr eax, 4
        mov bl, al
        and bl, 0fh
        mov bl, byte [esi + ebx]
        mov byte [edi + 4], bl
        shr eax, 4
        mov bl, al
        and bl, 0fh
        mov bl, byte [esi + ebx]
        mov byte [edi + 3], bl
        shr eax, 4
        mov bl, al
        and bl, 0fh
        mov bl, byte [esi + ebx]
        mov byte [edi + 2], bl
        shr eax, 4
        mov bl, al
        and bl, 0fh
        mov bl, byte [esi + ebx]
        mov byte [edi + 1], bl
        shr eax, 4
        mov bl, al
        and bl, 0fh
        mov bl, byte [esi + ebx]
        mov byte [edi + 0], bl
;;
;; print result
;;
print_dword:
        mov ecx, buffer
        mov edx, 8
        call write
;;
;; quit
;;
quit:
        call exit
exit:
        mov eax, 01h            ; exit()
        xor ebx, ebx            ; exit status 0
        int 80h
read:
        mov eax, 03h            ; read()
        mov ebx, 00h            ; stdin
        int 80h
        ret
write:
        mov eax, 04h            ; write()
        mov ebx, 01h            ; stdout
        int 80h
        ret

I wrote it as an answer to a SO question: http://stackoverflow.com/a/20379836
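For comparison, the to_hex part of the listing (eight unrolled nibble extractions filling the buffer from the last digit backwards) can be written as a short loop. This C sketch uses a made-up function name and assumes float is a 32-bit IEEE 754 single:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper mirroring to_hex: write the bit pattern of f
   as 8 lowercase hex digits plus a terminating NUL. */
static void float_to_hex(float f, char out[9])
{
    static const char hex[] = "0123456789abcdef";
    uint32_t bits;

    memcpy(&bits, &f, sizeof bits);    /* reinterpret the float's 32 bits    */
    for (int i = 7; i >= 0; --i) {     /* last digit first, as in the asm    */
        out[i] = hex[bits & 0xF];      /* low nibble -> table lookup         */
        bits >>= 4;
    }
    out[8] = '\0';
}
```

Printing the raw bits rather than a decimal string sidesteps the float-to-text problem entirely, which is handy when checking a parser.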
system error 05 Mar 2014, 18:22
r22 wrote: Here's some NASM parsing code that might be interesting. It uses SSE (instead of FPU) to store the mantissa (fractional part) and the characteristic. Then adds them together at the end.
Thanks. Will check it out later. Still on Windows.
tthsqe 06 Mar 2014, 08:06
Again, let me warn you about accuracy. The code that r22 posted will work fine if you don't need accurate parsing of floats, but if you need perfect accuracy then you will have to be more careful.
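One way to see where the inaccuracy comes from is to compare a digit-by-digit accumulator with a library parser. The C sketch below is an assumption-laden illustration: naive_parse is a made-up C rendering of the accumulate-as-you-go approach (separate integer and fraction sums, no sign handling), and it assumes the C library's strtod is correctly rounded, which is common in practice but not something the C standard promises.

```c
#include <stdlib.h>

/* Hypothetical rendering of the digit-by-digit approach: every multiply,
   divide and add below rounds, so the error can build up with each digit. */
static double naive_parse(const char *s)
{
    double whole = 0.0, frac = 0.0, power = 10.0;
    int after_point = 0;

    for (; *s != '\0'; ++s) {
        if (*s == '.') {
            if (after_point)
                break;                       /* second '.' ends the number  */
            after_point = 1;
            continue;
        }
        if (*s < '0' || *s > '9')
            break;                           /* not a digit                 */
        double digit = (double)(*s - '0');
        if (!after_point) {
            whole = whole * 10.0 + digit;    /* integer part * 10 + digit   */
        } else {
            frac += digit / power;           /* digit / 10^k, rounded       */
            power *= 10.0;
        }
    }
    return whole + frac;                     /* one more rounding here      */
}

/* Signed difference between the naive result and strtod's result. */
static double parse_error(const char *s)
{
    return naive_parse(s) - strtod(s, NULL);
}
```

For short inputs the difference is often zero; for long digit strings it can reach a few units in the last place, which is the gap the multiprecision approach is meant to close.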
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.