flat assembler
Message board for the users of flat assembler.

fpu Store instruction

Zoltanmatey31 16 Jan 2023, 20:01
"fst copies the value of st0 register to the destination operand, which can be 32–bit
or 64–bit memory location or another FPU register."

Can the destination (the only operand) not also be an 80-bit memory location?

Zoltanmatey31 16 Jan 2023, 20:02
Silly me: "fstp accepts the same operands as the fst instruction and can also store value in the 80–bit memory."

It is written right there.

AsmGuru62 16 Jan 2023, 23:15
I was surprised to learn that FST cannot store into 80-bit memory.
Only FSTP can do it.
So, imagine you have all 8 registers in the FPU busy with values and you need to just dump ST0 (all 80 bits) into memory.
And you cannot do it with FST.
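
One common workaround, sketched here as an assumption rather than something stated in the thread: store with FSTP and immediately reload with FLD. An 80-bit store and load involve no format conversion, so ST0 should come back unchanged, and the pop frees the register that the reload then takes, so this should work even with all 8 registers in use. The label `saved_st0` is hypothetical.

```asm
        fstp    tword [saved_st0]       ; write all 80 bits of ST0 and pop it
        fld     tword [saved_st0]       ; push the same value back; stack depth is unchanged

saved_st0 dt ?                          ; hypothetical 10-byte buffer; belongs in a data section
```

The cost is one extra memory read compared with a plain store.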

revolution 16 Jan 2023, 23:22
Use FSAVE if you want to examine FPU registers.
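
A minimal sketch of that approach, assuming the 108-byte image that FSAVE writes with a 32-bit operand size, where the eight registers start at offset 28 with ST0 first. FSAVE reinitialises the FPU after storing, so FRSTOR is needed to carry on with the same values. The label `fpu_state` is hypothetical.

```asm
        fsave   [fpu_state]             ; store the full FPU state, then reinitialise the FPU
        frstor  [fpu_state]             ; reload the state so the computation can continue
        ; the 80-bit image of ST0 is now at fpu_state+28, ST1 at fpu_state+38, and so on

fpu_state:                              ; hypothetical buffer; belongs in a data section
        rb      108
```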

Furs 17 Jan 2023, 14:08
AsmGuru62 wrote:
I was surprised to learn that FST cannot store into 80-bit memory.
Only FSTP can do it.
So, imagine you have all 8 registers in the FPU busy with values and you need to just dump ST0 (all 80 bits) into memory.
And you cannot do it with FST.

How would you end up in such a situation? Usually each operation pops something off the stack and replaces it with the result.

If you just use unary operators that replace the top of the stack and push back a new value, then why do you need to fill the other 7 registers? What are they filled for?

I guess that's what Intel thought when designing the instruction encodings. Can't blame them.

macomics 17 Jan 2023, 15:14
Furs wrote:
If you just use unary operators that replace the top of the stack and push back a new value, then why do you need to fill the other 7 registers? What are they filled for?

To reduce the number of loads and stores, you can keep values in more than just the top register.

An old example:
Code:
; Cx = Az * By - Ay * Bz
; Cy = Ax * Bz - Az * Bx
; Cz = Ay * Bx - Ax * By
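if used fpuVectorMultVA
  fpuVectorMultVA:
; Ax = st0, Ay = st1, Az = st2, Bx = st3, By = st4, Bz = st5
        fld     st1             ; push Ay (B is now at st4..st6)
        fmul    st0, st4        ; st0 = Ay * Bx
        fld     st1             ; push Ax (B is now at st5..st7, all 8 registers in use)
        fmul    st0, st6        ; st0 = Ax * By
        fsubp   st1, st0        ; st0 = Ay * Bx - Ax * By = Cz
        fxch    st3             ; st0 = Az, st3 = Cz
        fld     st0             ; push a copy of Az (all 8 registers in use again)
        fmul    st0, st6        ; st0 = Az * By
        fxch    st2             ; st0 = Ax, st2 = Az * By
        fmul    st0, st7        ; st0 = Ax * Bz
        fxch    st3             ; st0 = Ay, st3 = Ax * Bz
        fmul    st0, st7        ; st0 = Ay * Bz
        fsubp   st2, st0        ; st1 = Az * By - Ay * Bz = Cx after the pop
        fmul    st0, st4        ; st0 = Az * Bx
        fsubp   st2, st0        ; st1 = Ax * Bz - Az * Bx = Cy after the pop
; Cx = st0, Cy = st1, Cz = st2, Bx = st3, By = st4, Bz = st5
        retn
end if ; used fpuVectorMultVA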
Two vectors are loaded onto the coprocessor stack as six values. After the multiplication, the vector loaded last (A, in st0..st2) is replaced by the result.

Because loading is implemented separately, the same routine serves integers as well as single- and double-precision real numbers; you do not need to write several identical algorithms.

Also, the data may stay in the coprocessor for longer than a single operation. For example, immediately after the multiplication, a normalization function can be called on the three result values at the top of the stack.
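
A hypothetical caller for the routine above, assuming both vectors are stored as single-precision triples and the FPU stack is empty on entry (the routine needs two free registers on top of the six values). The labels `vecA`, `vecB` and `vecC` are made up for illustration. B is pushed first and A last so that Ax ends up in st0, matching the register layout in the routine's comments.

```asm
        fld     dword [vecB+8]          ; Bz
        fld     dword [vecB+4]          ; By
        fld     dword [vecB]            ; Bx
        fld     dword [vecA+8]          ; Az
        fld     dword [vecA+4]          ; Ay
        fld     dword [vecA]            ; Ax -> st0
        call    fpuVectorMultVA         ; st0..st2 = Cx, Cy, Cz
        fstp    dword [vecC]            ; store and pop Cx
        fstp    dword [vecC+4]          ; store and pop Cy
        fstp    dword [vecC+8]          ; store and pop Cz
        fstp    st0                     ; discard Bx
        fstp    st0                     ; discard By
        fstp    st0                     ; discard Bz

vecA    dd 1.0, 0.0, 0.0                ; sample data; belongs in a data section
vecB    dd 0.0, 1.0, 0.0
vecC    dd ?, ?, ?
```

With this sample data, the formulas in the routine's comments give C = (0, 0, -1). Replacing `fld dword` with `fld qword` (and `dd` with `dq`) or with `fild` would feed the same routine from doubles or integers.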

AsmGuru62 17 Jan 2023, 18:40
I was implementing some formulas for gravity corrections, and they are huge!
So I was calculating some intermediate values and keeping them in the FPU until needed.
I could have saved them to memory, but I needed all 80 bits of precision.
An example of a FORTRAN routine I had to port:
Code:
      subroutine innerzone(m,n,u,v,z,f)
      implicit double precision (a-h,o-z)
      real m,n
      h1=m+sqrt(1+m**2)
      g1=n+sqrt(1+n**2)
      p=1/(1+v**2)
      g2=n+u*v*p
      h2=m+u*v*p
      t=1/sqrt(1+v**2)
      g3=sqrt((n+u*v*p)**2+(1+u**2)*p-(u*v*p)**2)
      h3=sqrt((m+u*v*p)**2+(1+u**2)*p-(u*v*p)**2)
      f=z*(log(g1/h1)-t*log((g2+g3)/(h2+h3)))
      return
      end
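
As an illustration of keeping intermediates at full precision, here is a sketch of how the line h1 = m + sqrt(1 + m**2) might look on the FPU. This is not the ported code from the post. The labels `m` and `h1` are hypothetical, the result is parked as a 10-byte `tword` so that none of the 80 bits are lost, and the FPU control word is assumed to select extended precision (the state FINIT leaves it in).

```asm
        fld     dword [m]               ; st0 = m (Fortran "real", single precision)
        fld     st0                     ; st0 = m, st1 = m
        fmul    st0, st0                ; st0 = m*m
        fld1                            ; st0 = 1, st1 = m*m, st2 = m
        faddp   st1, st0                ; st0 = 1 + m*m, st1 = m
        fsqrt                           ; st0 = sqrt(1 + m*m)
        faddp   st1, st0                ; st0 = m + sqrt(1 + m*m) = h1
        fstp    tword [h1]              ; park all 80 bits and free the register
        ; ... later, when h1 is needed again:
        fld     tword [h1]              ; reload the intermediate

m       dd 0.5                          ; sample input; belongs in a data section
h1      dt ?                            ; 10-byte slot for the intermediate
```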
    
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.