flat assembler
Message board for the users of flat assembler.
Why can't I add floating point numbers right

revolution 02 Feb 2012, 03:20
12.2 cannot be represented exactly as a binary floating point number. The FPU will use the closest available approximation, 12.199999... Generally one rounds the numbers to the required precision for display, so the small error in the internal representation is never seen.
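As a rough illustration of the point above, here is a minimal Python sketch (not FPU code, and not from the thread) that shows the value actually stored for 12.2 in a 64-bit double and how rounding for display hides the difference. The variable names are only for this example.

```python
from decimal import Decimal

x = 12.2  # parsed to the nearest 64-bit IEEE 754 double

# Decimal(float) converts the stored binary value exactly, with no rounding.
exact = Decimal(x)
print(exact)  # begins 12.1999999999999992..., slightly below 12.2

# Rounding to the precision needed for display hides the representation error.
shown = f"{x:.2f}"
print(shown)  # 12.20
```

The stored value is a little below 12.2, but any display format with fewer than about 16 significant digits prints it as 12.2.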


NanoBytes 02 Feb 2012, 03:39
Ahh, I think I figured it out.
Code:
MOV [Integer1],15
FILD [Integer1]      ; st0 = 15
MOV [Integer1],2
FILD [Integer1]      ; st0 = 2, st1 = 15
MOV [Integer1],10
FIDIV [Integer1]     ; st0 = 2/10 = 0.2
FSUBP st1,st0        ; st1 = 15 - 0.2, then pop: st0 = 14.8
FSTP [Integer1]      ; store as a 4-byte single-precision float, pop
BTS [Integer1],2     ; set bit 2 of the stored value
FLD [Integer1]       ; reload the adjusted value
Though, I am worried: I am converting from the register's 10 bytes to the integer's 4 bytes just to correctly add two numbers. Is there a more accurate way to do this?
_________________
He is no fool who gives what he cannot keep to gain what he cannot lose.

NanoBytes 02 Feb 2012, 04:01
Ok, revolution, how would I round the entire number off?


revolution 02 Feb 2012, 04:16
NanoBytes wrote: Ok, revolution, how would I round the entire number off?

NanoBytes 02 Feb 2012, 05:35
I know, I am working on an atof procedure.


revolution 02 Feb 2012, 05:52


JohnFound 02 Feb 2012, 06:46
Note that from a mathematical point of view 12.199(9) = 12.2, and this is an exact equality.


revolution 02 Feb 2012, 09:40
JohnFound wrote: Note that from a mathematical point of view 12.199(9) = 12.2, and this is an exact equality.
The closest is 12.19999999999999929
The next value is 12.20000000000000107
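The two values quoted above match the neighbouring 64-bit doubles around 12.2. A small Python sketch, assuming IEEE 754 double precision (the x87 registers themselves hold 80-bit values with a 64-bit significand, so their neighbours are closer together), reproduces them. The helper name `next_double_up` is made up for this example.

```python
import struct
from decimal import Decimal

def next_double_up(x: float) -> float:
    """Return the next representable double above a positive finite x."""
    # For positive finite doubles, adding 1 to the raw bit pattern
    # steps to the adjacent larger value.
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    return struct.unpack("<d", struct.pack("<Q", bits + 1))[0]

below = 12.2                    # nearest double, just under 12.2
above = next_double_up(below)   # adjacent double, just over 12.2

print(f"{Decimal(below):.17f}")  # 12.19999999999999929
print(f"{Decimal(above):.17f}")  # 12.20000000000000107
```

No double lies between these two, so the repeating decimal 12.199(9), which equals 12.2 exactly, is not what gets stored.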

smiddy 02 Feb 2012, 13:00
revolution wrote:

That's interesting! What is the limiting factor causing the precision error?

JohnFound 02 Feb 2012, 13:18
Quote: That's interesting! What is the limiting factor causing the precision error?
The register size, of course.

revolution 02 Feb 2012, 13:33
smiddy wrote: That's interesting! What is the limiting factor causing the precision error? 

Matrix 02 Feb 2012, 21:38
revolution wrote:
Or we just create a struct doubledouble, stack two double values in it, and get twice the precision bits.
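A minimal sketch of that double-double idea, written in Python rather than assembly and using a (hi, lo) tuple in place of a struct. The names `two_sum` and `dd_add` are illustrative, and the addition shown is the simple variant, not a fully rigorous one.

```python
from fractions import Fraction

def two_sum(a: float, b: float):
    """Error-free addition: returns (s, e) with s = fl(a + b) and s + e == a + b exactly."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e

def dd_add(x, y):
    """Add two double-double values, each given as a (hi, lo) pair."""
    s, e = two_sum(x[0], y[0])
    e += x[1] + y[1]
    return two_sum(s, e)  # renormalize so that hi carries the leading bits

# A plain double loses the small term; the (hi, lo) pair keeps it.
tiny = 2.0 ** -60
plain = 1.0 + tiny
pair = dd_add((1.0, 0.0), (tiny, 0.0))

# 12.2 as a double-double: hi is the nearest double, lo is the leftover error.
hi = 12.2
lo = float(Fraction(122, 10) - Fraction(hi))
err_double = abs(Fraction(hi) - Fraction(122, 10))
err_pair = abs(Fraction(hi) + Fraction(lo) - Fraction(122, 10))
```

With two 53-bit significands the pair carries roughly 106 bits, so `err_pair` comes out far smaller than `err_double`, although 12.2 is still not represented exactly.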

Copyright © 1999-2023, Tomasz Grysztar.