flat assembler
Message board for the users of flat assembler.
Invalid C/C++ Translation. Anyone the right way?
Matrix 03 Dec 2005, 19:23
jdawg, have you checked the variable sizes?
(Precision: single, double, maybe tbyte?) Or maybe you need to set the FPU rounding control (in the FPU control word), or simply your PI is not accurate enough; the built-in one has more digits, as I remember anyway.
jdawg 03 Dec 2005, 20:42
The variable sizes I used are 32-bit IEEE format. I was using them on the assumption that a C float type was 32-bit IEEE as well. Using fldpi produced the same result. You may be right about the rounding control, probably set to 2 places or so to get a tighter result.
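A minimal C sketch of the precision question raised here, assuming the common case where `float` is 32-bit IEEE single and `double` is 64-bit IEEE double. The function names are hypothetical and are not taken from the C code under discussion. It computes the same conversion at both widths so the gap between them can be inspected.

```c
/* Hypothetical helper names, for illustration only. */

/* Degrees to radians with 32-bit single-precision operands. */
float deg2rad_single(float deg) {
    const float pi = 3.141592654f;   /* same literal as the thread's Pi */
    return deg * pi / 180.0f;
}

/* Degrees to radians with 64-bit double-precision operands. */
double deg2rad_double(double deg) {
    const double pi = 3.14159265358979323846;
    return deg * pi / 180.0;
}

/* Absolute difference between the two results for one angle. */
double deg2rad_gap(double deg) {
    double d = deg2rad_double(deg) - (double)deg2rad_single((float)deg);
    return d < 0 ? -d : d;
}
```

Single precision carries roughly seven significant decimal digits, so calling `deg2rad_gap` on a few angles gives a feel for how much of a discrepancy the operand width alone can explain.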
Eoin 03 Dec 2005, 23:07
Check on a calculator and see which is giving the more correct answer.
Madis731 04 Dec 2005, 09:48
Code:
Pi dd 3.141592654
OneEighty dd 180.0

macro Deg2Rad ang {
    fld [ang]
    fld [Pi]          ;3 or so ticks faster than fldpi
    fmul st0,st1
    fld [OneEighty]
    fdiv st1,st0
    fst dword[ang]
}
madmatt 04 Dec 2005, 11:23
Why not just pre-calculate PI/180 and then just multiply the angle by this number?
Code:
deg2rad dd 0.017453292519943295 ;PI/180

fld [ang]
fmul [deg2rad]
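For comparison, the same idea as a rough C sketch: one version multiplies by pi and divides by 180 on every call, the other multiplies by a precalculated factor. The names `PI_VALUE`, `DEG2RAD`, `deg2rad_muldiv` and `deg2rad_factor` are hypothetical, and double precision is an assumption made for the example.

```c
/* Hypothetical names, for illustration only. */
static const double PI_VALUE = 3.14159265358979323846;
static const double DEG2RAD  = 0.017453292519943295;  /* PI/180, as in the post above */

/* Two operations per call: multiply by pi, then divide by 180. */
double deg2rad_muldiv(double deg) {
    return deg * PI_VALUE / 180.0;
}

/* One operation per call: multiply by the precalculated factor. */
double deg2rad_factor(double deg) {
    return deg * DEG2RAD;
}
```

Both functions should agree to within rounding in the last bits of the result; the second simply avoids the division at run time.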
Madis731 05 Dec 2005, 14:14
Why not?
When I look at this:
Code:
fld [Pi] ;3 or so ticks faster than fldpi
then the latest version would be the best.
jdawg 23 Dec 2005, 19:42
I tried a bunch of things, including madmatt's post. I disassembled the C code and got something similar to the following.
Code:
fld dword ptr [MemoryAddressPointingToAngle]
fld qword ptr [MemoryAddressPointingToPi]
fmulp
fld qword ptr [MemoryAddressPointingToOneEighty]
fdivrp st1,st0 ;this may have been reversed, I can't remember, but this is only an example
fstp dword ptr [MemoryAddressPointingToAngle]
When I tried writing my macro with this code, it produced results even further from accurate. I believe it's due to the overhead for using the C/C++ macro syntax. In the ASM version, popping the FPU stack when the only thing on it is what you need to work on is ridiculous, but for some reason, it's what makes the C version of the code work.
madmatt 24 Dec 2005, 09:22
fasmw macros cannot use floating point for calculations, so you will have to make a proc function. Also, for better precision you could:
Code:
deg2rad dq 0.017453292519943295 ;PI/180 (Double Precision)

proc DegToRad, angle
    fld [angle]
    fmul [deg2rad]
    ;return value is left on FPU stack (ST0).
    ret
endp
You'll have to define the constant (deg2rad) in your data section somewhere.
jdawg 07 Jan 2006, 21:27
Here it is. Finally. Before I show it to you, I just wanna take the time to say that it's a sad state of affairs when it's easier to figure out SSE, SSE2, SSE3 AND 3DNow than to properly translate code from C/C++ to assembly language. I don't know, maybe it's just me. Regardless, here we go.
Code:
;~~~Non Working First attempt~~~
macro Deg2Rad ang {
    fld [ang]
    fld [Pi]
    fmul st0,st1
    fld [OneEighty]
    fdiv st1,st0
    fst [ang]
}
OneEighty dq 180.0
Pi dq 3.141592654
The macro was off by about .2 decimal places, not severe, but still not a correct translation, which was all I was going for, at first. After disassembling the code I found the problem with the C macro overhead. After a while of screwing around, I came up with this.
Code:
;~~~Working slow version ~~~
macro Deg2Rad ang {
    fld [ang]
    fmul [Pi]
    fdiv [OneEighty]
    fst [ang]
}
Pi and OneEighty are the same as above. The code is slow as hell (around 98 ticks), but at least now it works. I had always wanted to streamline the code by precalculating Pi/OneEighty, but I left all that alone until I could get the macro to work. Here is the final code that just flies.
Code:
;~~~fast Deg2Rad ~~~
macro Deg2Rad ang {
    fld [ang]
    fmul [PiDiv180]
    fst [ang]
}
PiDiv180 dq 0.01745329252
This cut the code down from 98 ticks to 25, which makes it faster than a procedure call to find the closest value in a prebuilt lookup table. Not to mention it saves the memory that the table would have occupied. Meaning that it can be used for realtime 3D calculations within procedures. Pretty freaking sweet if you ask me.
jdawg 07 Jan 2006, 21:33
Quote: Posted by madmatt...
That's wrong. At least as far as v1.43 and v1.62 are concerned. I haven't tried v1.65.5, but that may be where you are having difficulty. I knew it was possible because of the whole flat binary thing that FASM does; it seriously was due to C/C++ macro overhead.
Reverend 08 Jan 2006, 10:59
jdawg: No, you are wrong. Floating point calculations are not possible in FASM. Tomasz once showed macros that could somehow emulate such a possibility, but it is not a built-in feature of FASM.
jdawg 18 Feb 2006, 19:43
Reverend:
Is this because of another similarity between FASM and NASM? I read once, in the NASM documentation, that NASM didn't do preprocessing on the FPU code, because the assembler itself could be run on Big Endian machines. Either way, the code does exactly what I want it to do. I have to use memory operands, but apparently the opcodes have been written into the correct locations. |
Reverend 18 Feb 2006, 22:57
It's just one of the design decisions made by Tomasz during FASM creation. Maybe he thought it's too hard or too slow or whatever. I can't say.
Madis731 19 Feb 2006, 14:24
I think the problem here is one of understanding:
jdawg: you really can make macros and they work, but what madmatt wanted to tell you was that you can't do calculations at assembly time, which means you have to calculate Pi/180 first on a calculator and then insert it into the code:
Code:
dt 3.1415926535897932384626433832795/180.0 ; You'll get an "invalid name" error here.
dt 0.017453292519943295769236907684886 ; This is legal
Why I would choose dt is that the FPU internally holds floats as 10 bytes, so theoretically the CPU has less converting to do. I don't know whether it holds true for real applications, but this is what I think is the most optimal.
EDIT: sorry, I looked up my optimization manuals and discovered that the 32/64-bit forms are one micro-operation, while the 80-bit one takes 4 micro-operations. That is big news for me (btw, FLDPI takes 3 micro-operations).
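Instead of a calculator, the constant can also be produced by a small helper program and pasted into the source. The following is only a sketch in C: the function name `format_deg2rad_constant` is hypothetical, the `PiDiv180` label is borrowed from jdawg's macro, and `%.17g` is chosen on the assumption that a `dq` (double precision) definition is wanted.

```c
#include <stdio.h>

/* Hypothetical helper: writes a data definition line such as
   "PiDiv180 dq 0.0174..." into buf and returns the snprintf result. */
int format_deg2rad_constant(char *buf, size_t size) {
    const double pi = 3.14159265358979323846;
    /* 17 significant digits are enough to round-trip an IEEE double. */
    return snprintf(buf, size, "PiDiv180 dq %.17g", pi / 180.0);
}
```

Printing the buffer with `puts` gives a line that can be copied into the data section; an 80-bit `dt` constant would need more digits than a `double` can supply.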
jdawg 22 Feb 2006, 23:17
Madis731:
I originally just kept the value as a 32-bit float because I was planning on using SSE to do the rest of the calculations. This would save a little extra conversion later. As far as the assembly-time calculations go, that would definitely explain a lot. Where does one go to find more information about FASM's capabilities?
Madis731 23 Feb 2006, 07:17
Most of the stuff you will find in the manuals, but the "cutting-edge" is explained by Tomasz Grysztar on these forums. Some features/macros get into the final release; others are just a niche that one out of a hundred will use.
I think you meant capabilities for translating from C... this could be achieved by macros, but the true meaning of assembly is its raw access, and macros tend to hide it, so you are better off with plain assembly.
Tomasz Grysztar 23 Feb 2006, 09:28
Reverend wrote: It's just one of design decisions made by Tomasz during FASM creation. Maybe he thought it's too hard or too slow or whatever. I can't say.
I followed NASM in this decision, and it was mainly because fasm has to run on 386 processors without an FPU, so it needs to implement any FP-related operations itself.
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.