flat assembler
Message board for the users of flat assembler.

Maybe another FPU bug
MCD



Joined: 21 Aug 2004
Posts: 604
Location: Germany
I just noticed in OllyDbg 1.1 that the following rounding code produces an unexpected value:
Code:
 movss xmm0,[_0.5]
 cvtss2si eax,xmm0
    

The value returned in eax is 0, but isn't it supposed to be 1? Or did I miss something in the Intel docs?

Rounding the value 1.5, however, gives the correct result: 2.

I guess it's my fault somewhere, because the same behaviour shows up with classical FPU code:
Code:
 fld dword [_0.5]
 frndint;rounding mode was set to "round to nearest"
    


I have just tested it on Intel Pentium III/IV and AMD Athlon XP.

_________________
MCD - the inevitable return of the Mad Computer Doggy

-||__/
.|+-~
.|| ||
Post 02 Aug 2005, 12:33
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4633
Location: Argentina
The RC (Rounding Control) field, bits 11 and 10 of the FPU control word, selects one of four ways in which the FPU rounds results:

00 = Round to nearest, or to even if equidistant (this is the initialized state)
01 = Round down (toward -infinity)
10 = Round up (toward +infinity)
11 = Truncate (toward 0)

I tried 2.5 and my Pentium rounds it to 2.0. I also tested your values and got the same results. As I understand it, "even" refers to the integer the tie is rounded to: for a number x.5, the FPU picks the neighbouring integer n for which n mod 2 = 0, so 0.5 gives 0, 1.5 gives 2 and 2.5 gives 2.
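The four RC settings can also be tried outside the FPU. The following is a rough Python model, not FPU code: it assumes that Python's built-in round() (which rounds ties to the even integer) stands in for RC = 00, and that math.floor, math.ceil and math.trunc stand in for the other three modes. The name fpu_round is made up for this sketch.

```python
import math

# Hypothetical helper table: one Python function per RC setting.
# round() with a single argument rounds ties to the even integer,
# which matches "round to nearest, or to even if equidistant".
RC_MODES = {
    0b00: round,        # nearest, ties to even
    0b01: math.floor,   # toward -infinity
    0b10: math.ceil,    # toward +infinity
    0b11: math.trunc,   # toward zero
}

def fpu_round(x, rc=0b00):
    """Return x rounded to an integer the way the given RC mode would."""
    return RC_MODES[rc](x)

# One row per value, one column per RC setting 00, 01, 10, 11.
for value in (0.5, 1.5, 2.5, -0.5):
    print(value, [fpu_round(value, rc) for rc in sorted(RC_MODES)])
```

With rc left at 00 the sketch gives 0 for 0.5 and 2 for both 1.5 and 2.5, which is the behaviour reported above.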

Hope this helps

[edit]Oops, sorry, here is the link: http://www.website.masmforum.com/tutorials/fptute/fpuchap1.htm#cword [/edit]
Post 02 Aug 2005, 14:44
MCD



Quote:
round to nearest, or to even if equidistant
Sorry, that's exactly what I missed. I must have overlooked it. But this will make it a bit harder to compute the general round to nearest, as it is mostly used in practice.
Post 03 Aug 2005, 09:39
LocoDelAssembly
MCD wrote:
But this will make it a bit harder to compute the general round to nearest, as it is mostly used in practice.
Yes, you are right; I actually didn't know that the FPU rounds to even in some cases.
I think that if you add 0.1 to the number before rounding, you can force x.5 to round up. The problem is a negative number such as -x.5 or -x.6: adding 0.1 before rounding pulls the result toward zero, so in that case you need to add -0.1 instead.

If you find a solution, please post it :D
Post 03 Aug 2005, 14:40
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 17279
Location: In your JS exploiting you and your system
Quote:
But this will make it a bit harder to compute the general round to nearest
But x.5 does not have a single nearest integer; it is equally near to both. What does it matter whether it rounds up or down? Both are the same distance away.
Post 03 Aug 2005, 16:25
Madis731



Joined: 25 Sep 2003
Posts: 2141
Location: Estonia
In the math you learn in school, you round any number up to 0.4(9) to ZERO and anything >= 0.5 to ONE. It's just a convention. When you have -0.5, did any teacher tell you which way to round? :) Do you logically take... hmm, -1, or do you put a bunch of numbers on the x-axis and give every integer the same amount of... "attention"?
I think if you add 0.000001 (single) or 1E-12 (double) to any number and round it - you're OK!
Another thing you might consider is adding 0.5 and truncating, which should give the same results. Why does that one work? Take three floats: -0.5, ±0.0 and +0.5. You round them in your head and get 0, 0, 1 (equal share!). Now add 0.5 to each float, giving ±0.0, +0.5, +1.0. You can already see what comes out after truncating.
Post 03 Aug 2005, 21:31
revolution
Quote:
I think if you add 0.000001 (single) or 1E-12 (double) to any number and round it - you're OK
This will get you into trouble with x.49999999...
Quote:
adding 0.5 and truncating
This is correct to get the result you desire.
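Both suggestions can be checked with a small Python model of the arithmetic (not FPU code). It works on doubles, so the 1E-12 offset from the quote applies; the function names and the test value 0.5 - 1e-13 are invented for this sketch.

```python
import math

def round_with_epsilon(x, eps=1e-12):
    # Add a tiny offset, then round to nearest with ties to even,
    # as frndint does in its default mode.
    return round(x + eps)

def round_add_half_truncate(x):
    # Add 0.5, then truncate toward zero (RC = 11 or a cvtt instruction).
    return math.trunc(x + 0.5)

# The offset pushes exact halves upward as intended ...
print(round_with_epsilon(0.5), round_with_epsilon(2.5))

# ... but a value slightly below a half is pushed over the tie as well.
just_below_half = 0.5 - 1e-13
print(round_with_epsilon(just_below_half))       # epsilon method
print(round_add_half_truncate(just_below_half))  # add 0.5 and truncate

# The three sample values from the post above: -0.5, 0.0, +0.5
print([round_add_half_truncate(v) for v in (-0.5, 0.0, 0.5)])
```

In this model the epsilon method turns 0.4999999999999 into 1, while adding 0.5 and truncating keeps it at 0. Note that truncation only agrees with rounding for inputs of -0.5 and above: the sketch maps -0.7 to 0, which is the case the sign-dependent offset in the next post deals with.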


Last edited by revolution on 04 Aug 2005, 15:57; edited 1 time in total
Post 03 Aug 2005, 23:06
LocoDelAssembly
My electronic calculator rounds halves away from zero (-x.5 to -(x+1) and x.5 to x+1). If MCD needs this behavior, he will need to add -0.5 before the truncation.

[edit]Add -0.5 when x is negative, of course, and add 0.5 when x is not negative.[/edit]
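The sign-dependent offset can be written down in a few lines. This is only a Python sketch of the idea, with math.trunc standing in for the FPU's truncate mode; the function name and the sample values are chosen for illustration.

```python
import math

def round_half_away_from_zero(x):
    # Add +0.5 for non-negative x and -0.5 for negative x,
    # then truncate toward zero.
    offset = -0.5 if x < 0 else 0.5
    return math.trunc(x + offset)

samples = (-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5)
print([round_half_away_from_zero(v) for v in samples])
# prints [-2, -1, -1, 0, 1, 1, 2]
```

Every half now moves away from zero regardless of sign, which is the calculator behaviour described above.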
Post 03 Aug 2005, 23:12
LocoDelAssembly
MCD, check this with OllyDbg:
Code:
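format PE GUI 4.0

  finit
  fstcw [control]
  wait
  or    [control], 0000110000000000b ; RC = 11: truncate (toward 0)
  fldcw [control]
  mov   ecx, 6                       ; index of the last entry in nums

.loop:
  fld   [nums+ecx*4]
  ftst                               ; compare st0 with 0.0
  fstsw ax
  fwait
  sahf                               ; C0 -> CF, set when st0 < 0
  sbb   eax, eax                     ; eax = -1 if negative, else 0
  fadd  [_0.5+4+eax*4]               ; add -0.5 if negative, else +0.5
  frndint                            ; truncates, because RC = 11
  dec ecx
  jns .loop                          ; results stay on the FPU stack

int 3                                ; break into the debugger

_0.5 dd -0.5, 0.5

nums dd -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5

control dw ?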


BTW, why does FASM insert a WAIT after ftst? FASM also inserts a WAIT after FNINIT, and if I use FINIT it inserts a WAIT before FINIT too. I'm using FASM 1.62.
Post 04 Aug 2005, 05:51
MCD



You're correct, LocoDelAssembly. I have already used that rounding method, but it's just too slow :(

Anyway, with SSE you can do it more easily with the truncating cvtt instructions.

And actually, sometimes this little rounding issue DOES matter a lot, as the following general FPU exponentiation code shows (a^b = 2^( log2(a) * b ) with fscale):
Code:
        finit;init FPU to set rounding to round to nearest or even
        fld     dword [esi+4];this is exponent b
        fld     dword [esi];this is the base a
        fyl2x
        fld     st
        frndint
        fsubr   st,st(1)
        f2xm1
        fld1
        fadd
        fscale
        fstp    st(1)
        fstp    dword [edi];result a^b
    

Unfortunately we have to use frndint, fscale and so on because f2xm1 only accepts arguments between -1 and +1, so we need the rounded value of the exponent for fscale later. Everything seems to work well, but a problem arises if you calculate something like 100^1.5. Here log2(a) * b is about 9.97: frndint rounds it to 10, but fscale truncates the unrounded value still sitting in st(1) to 9. So a value that is 1 too small is used by fscale, and we get the false result 100^1.5 = 500 instead of 100^1.5 = 1000.

Fortunately there is a workaround, even without having to switch the FPU control word to truncate:
Code:
        finit
        fld     dword [esi+4]
        fld     dword [esi]
        fyl2x
        fld     st
        frndint
;        fsubr   st,st(1)
        fsub    st(1),st;fixed
        fxch    st(1)    ;fixed
        f2xm1
        fld1
        fadd
        fscale
        fstp    st(1)
        fstp    dword [edi]
    

This was the kind of buggy FPU calculation routine that had me climbing the walls for hours.
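The difference between the two listings can be modelled in Python. This is a sketch under stated assumptions, not a translation of the FPU code: it assumes that fscale truncates the value in st(1) toward zero (as the Intel manual describes), it uses doubles instead of the FPU's extended precision, and both function names are invented here.

```python
import math

def pow_fscale_on_unrounded(a, b):
    # Models the first listing: fscale sees the unrounded product y
    # in st(1) and truncates it toward zero.
    y = b * math.log2(a)          # fyl2x
    n = round(y)                  # frndint, round to nearest (ties to even)
    mantissa = 2.0 ** (y - n)     # f2xm1 / fld1 / fadd, |y - n| <= 0.5
    return math.ldexp(mantissa, math.trunc(y))   # fscale

def pow_fscale_on_rounded(a, b):
    # Models the fixed listing: fscale sees the same integer n that
    # was subtracted from y.
    y = b * math.log2(a)
    n = round(y)
    mantissa = 2.0 ** (y - n)
    return math.ldexp(mantissa, n)

print(pow_fscale_on_unrounded(100.0, 1.5))   # close to 500
print(pow_fscale_on_rounded(100.0, 1.5))     # close to 1000
```

For 100^1.5 the product is about 9.97, so the first model scales by 2^9 and the second by 2^10, which reproduces the 500 versus 1000 results from the post.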



Last edited by MCD on 05 Aug 2005, 09:22; edited 3 times in total
Post 04 Aug 2005, 07:36
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 7725
Location: Kraków, Poland
Use FNSTSW and FNINIT to get the opcodes that don't include an FWAIT; check the Intel manuals.
Post 04 Aug 2005, 10:11
LocoDelAssembly
Opcode     Instruction      Description
9B D9 /7   FSTCW m2byte     Store FPU control word to m2byte after checking for pending unmasked floating-point exceptions.
D9 /7      FNSTCW* m2byte   Store FPU control word to m2byte without checking for pending unmasked floating-point exceptions.

You are right (as always). I thought that inserting a WAIT was the coder's responsibility, and that the waiting versions of the instructions simply waited for completion.

MCD, sorry, I don't understand much maths. I'm very bad at it and I lost some years in my career because of that. :(
Post 04 Aug 2005, 14:22
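The relation between the two opcodes can be shown byte by byte. The sketch below is Python, not assembler output: it assumes 32-bit code and an absolute memory operand, and the address 0x00401000 as well as the function names are made up for illustration.

```python
FWAIT = bytes([0x9B])

def encode_fnstcw(address):
    # D9 /7 with a 32-bit absolute address:
    # ModRM = mod 00, reg 7 (the /7 opcode extension), r/m 101 (disp32).
    modrm = (0b00 << 6) | (7 << 3) | 0b101
    return bytes([0xD9, modrm]) + address.to_bytes(4, "little")

def encode_fstcw(address):
    # The waiting form is the same instruction with an FWAIT byte in front.
    return FWAIT + encode_fnstcw(address)

print(encode_fnstcw(0x00401000).hex(" "))   # d9 3d 00 10 40 00
print(encode_fstcw(0x00401000).hex(" "))    # 9b d9 3d 00 10 40 00
```

So the 9B byte seen in the debugger is part of the FSTCW encoding itself rather than a separate instruction added by the assembler.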

Copyright © 1999-2020, Tomasz Grysztar.