flat assembler
Message board for the users of flat assembler.
AsmGuru62 02 May 2020, 15:51
Code:
fdiv st0, st2
fdiv st1, st3 <-- cannot assemble this one.
The Intel Manual says: DC F8+i ---> FDIV ST(i), ST(0)
AsmGuru62 02 May 2020, 17:36
Wow! I did not see this.
Thank you!
revolution 02 May 2020, 22:47
All of the legacy FP instructions can only encode a single register; the other operand is implicitly st0.
If you wanted to have Fop st(i),st(j) then you would need a way to encode two registers, i & j. The newer instructions, from MMX and up, can encode a second register.
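To make the single-register point concrete, here is a minimal C sketch of how the two register forms of FDIV map to opcode bytes. The `DC F8+i` form is the one quoted from the Intel manual at the top of the thread; `D8 F0+i` is the matching `FDIV ST(0), ST(i)` form. The function name `encode_fdiv` and its return convention are hypothetical, made up for this illustration, and are not part of fasm.

```c
#include <stdint.h>

/* Hypothetical helper (not part of fasm): encode the register-register
   forms of FDIV. Writes two opcode bytes and returns 0 on success,
   returns -1 when no encoding exists for the operand pair. */
int encode_fdiv(unsigned dst, unsigned src, uint8_t out[2])
{
    if (dst > 7 || src > 7)
        return -1;                      /* only st0..st7 exist */

    if (dst == 0) {                     /* fdiv st0, st(i): D8 F0+i */
        out[0] = 0xD8;
        out[1] = (uint8_t)(0xF0 + src);
        return 0;
    }
    if (src == 0) {                     /* fdiv st(i), st0: DC F8+i */
        out[0] = 0xDC;
        out[1] = (uint8_t)(0xF8 + dst);
        return 0;
    }
    /* Neither operand is st0: the low 3 bits of the second byte hold
       only one register number, so e.g. fdiv st1, st3 has no encoding. */
    return -1;
}
```

With this sketch, `encode_fdiv(0, 2, buf)` yields `D8 F2` for `fdiv st0, st2`, while `encode_fdiv(1, 3, buf)` fails, which mirrors the assembler rejecting `fdiv st1, st3`.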
AsmGuru62 02 May 2020, 23:47
You mean the SSE2 instruction set for packed doubles and floats. I wanted to use those, but they do not have ATAN2 or LOG operations. I suppose I can use a mixed set.
revolution 03 May 2020, 00:18
Yes. None of MMX, SSE, SSE2, ..., AVX512 are stack based, and all of them have at least two registers in the encoding, a source and a destination.
But there are no native trigonometric instructions outside of the legacy FP. If you are doing many trig ops then it might be more efficient overall to use an SSE+ based algorithm. It depends upon what you need them for.
Tomasz Grysztar 03 May 2020, 09:13
revolution wrote: But there are no native trigonometric instructions outside of the legacy FP. If you are doing many trig ops then it might be more efficient overall to use an SSE+ based algorithm. It depends upon what you need them for.
AsmGuru62 03 May 2020, 12:24
Interesting link, Tomasz. Thanks.
I have a client who uses a legacy C++ app to scan a ~20 GB file of floating-point values and compute some statistics, and it is slow. I'll compare the results from my new FPU code with those from their legacy app and see if they match. So, where can I find some resources on how to make an FPATAN equivalent using SSE+?
donn 04 May 2020, 03:58
Had a few old transcendental function implementations sitting around. Just posted them; they use double-precision radian values, and I think accuracy is up to 4 places. They use Gregory's series and Maclaurin's series on the sine/cosine:
https://github.com/hpdporg/datap/blob/bfbe7544ba6e8aadbfa2b1d8b8cc021677458860/src/main/asm/Numeric/Transcendental.inc#L228
And tests/usage:
https://github.com/hpdporg/datap/blob/master/src/test/cpp/Numeric/Transcendental.cpp
The range of input values may be limited to just small values, but it shows some of the SSE instructions, and the values I've tested in that range seem correct when checked against a calculator.
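For readers who want to see the idea behind a Gregory's-series arctangent without opening the links, here is a rough C sketch. It is not donn's code and not a drop-in FPATAN replacement: the names `atan_series`, `atan_approx`, and `atan2_approx`, the 18-term cutoff, and the range-reduction steps are all assumptions chosen for this illustration. It is plain scalar C rather than hand-written SSE, although a compiler targeting x86-64 will normally turn this double arithmetic into scalar SSE2 instructions. NaNs, infinities, and signed zeros are not treated the way FPATAN treats them.

```c
/* Sketch only: Gregory's series with range reduction, in scalar C. */

static const double PI_D      = 3.14159265358979323846;
static const double SQRT3     = 1.73205080756887729353;
static const double TAN_PI_12 = 0.26794919243112270647; /* 2 - sqrt(3) */

static double dabs(double v) { return v < 0.0 ? -v : v; }

/* Gregory's series: atan(x) = x - x^3/3 + x^5/5 - ...
   Called only with |x| <= tan(pi/12), where 18 terms are ample. */
static double atan_series(double x)
{
    double x2 = x * x, power = x, sum = 0.0;
    for (int k = 0; k < 18; ++k) {
        sum += power / (2 * k + 1);
        power = -power * x2;            /* next odd power, sign flipped */
    }
    return sum;
}

double atan_approx(double x)
{
    int negative = x < 0.0;             /* atan is odd: work on |x| */
    x = dabs(x);

    int inverted = x > 1.0;             /* atan(x) = pi/2 - atan(1/x) */
    if (inverted)
        x = 1.0 / x;

    int shifted = x > TAN_PI_12;        /* atan(x) = pi/6 + atan(t) */
    if (shifted)
        x = (x * SQRT3 - 1.0) / (SQRT3 + x);

    double r = atan_series(x);
    if (shifted)
        r += PI_D / 6.0;
    if (inverted)
        r = PI_D / 2.0 - r;
    return negative ? -r : r;
}

/* Quadrant-aware arctangent of y/x. In FPATAN terms, y plays the role
   of ST(1) and x the role of ST(0). */
double atan2_approx(double y, double x)
{
    if (x > 0.0)
        return atan_approx(y / x);
    if (x < 0.0)
        return atan_approx(y / x) + (y >= 0.0 ? PI_D : -PI_D);
    if (y > 0.0)
        return PI_D / 2.0;
    if (y < 0.0)
        return -PI_D / 2.0;
    return 0.0;                         /* atan2(0, 0): convention chosen here */
}
```

The range reduction is what keeps the series short: without it, Gregory's series converges very slowly near |x| = 1. A packed SSE version would likely replace the branches with masks and the loop with a fixed polynomial, but that is beyond this sketch.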
AsmGuru62 04 May 2020, 14:39
Thank you.
Copyright © 1999-2023, Tomasz Grysztar. Also on GitHub, YouTube, Twitter.
Website powered by rwasa.