flat assembler
Message board for the users of flat assembler.
Mino

Joined: 14 Jan 2018
Posts: 97

# Exponential instruction ?

Hello,
I would like to know whether there is an instruction dedicated to calculating exponents. There are add, sub, div, ... but is there an instruction for exponentiation?

_________________
The best way to predict the future is to invent it.
30 Apr 2018, 00:21
alexfru

Joined: 23 Mar 2014
Posts: 60
f2xm1?
30 Apr 2018, 01:18
donn

Joined: 05 Mar 2010
Posts: 100
Yeah, I don't think there is a single exponential instruction on newer instruction sets?

x87 had the exponential and logarithmic instructions: f2xm1, fscale, fyl2x, fyl2xp1, as mentioned, but I think the x87 instructions are usually discouraged.

f2xm1:
 Quote: This instruction, when used in conjunction with the FYL2X instruction, can be applied to calculate z = x^y by taking advantage of the log property x^y = 2^(y*log2(x)).

When the exponent is an integer, mul can be used in an accumulating loop. For exponents of .5 and for negative exponents there are square root and reciprocal instructions; rsqrtss is an example. It returns a fast approximation (about 12 bits) that is usually refined with the Newton-Raphson method. I was just reading about Newton-Raphson in an elasticity/mechanics book by J.T. Oden, and want to try to compute it. Aside from x87 and exponents of .5, I'm not sure if there is a fractional exponent instruction?
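To make the two ideas above concrete, here is a minimal sketch in Python rather than assembly, so the arithmetic is easy to check. The function names int_pow and refine_rsqrt are hypothetical, and the starting estimate passed to refine_rsqrt stands in for whatever an approximate instruction such as rsqrtss would return.

```python
def int_pow(base, exponent):
    """Raise base to a non-negative integer exponent by repeated multiplication."""
    result = 1.0
    for _ in range(exponent):
        result *= base  # one multiply per iteration, like a mul in a loop
    return result


def refine_rsqrt(x, estimate, steps=1):
    """Improve an estimate of 1/sqrt(x) with Newton-Raphson iterations.

    Each step applies y = y * (1.5 - 0.5 * x * y * y), which roughly
    doubles the number of correct bits.
    """
    y = estimate
    for _ in range(steps):
        y = y * (1.5 - 0.5 * x * y * y)
    return y


print(int_pow(3.0, 4))
print(refine_rsqrt(4.0, 0.49, steps=3))
```

The loop in int_pow needs one multiply per unit of the exponent; exponentiation by squaring would cut that down, but the simple loop is closer to what was described.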

If you didn't mean exponential functions, but Euler's number, I guess you could just load it as a constant, or calculate it with factorials or other approaches.
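For the factorial approach, a rough sketch (Python, with a hypothetical function name euler_number) sums the series 1/0! + 1/1! + 1/2! + ... and keeps a running term so that no factorial is ever computed in full:

```python
def euler_number(terms=20):
    """Approximate e as the sum of 1/n! for n = 0 .. terms-1."""
    total = 0.0
    term = 1.0  # 1/0!
    for n in range(terms):
        total += term
        term /= (n + 1)  # turn 1/n! into 1/(n+1)!
    return total


print(euler_number())
```

Twenty terms are already more than a double can resolve, so in practice loading the constant is simpler.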
30 Apr 2018, 01:36
revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 15727
Location: (514107) 2015 BZ509
 donn wrote: ... I think the x87 instructions are usually discouraged.
I can't see them being dropped from the CPUs any time soon. And even if they are dropped from the hardware sometime in the future, all the major OSes will emulate them. Emulation is not hard, it was being done a long time ago when the x87 chips were an optional extra. So I don't think there is any issue with using the x87 instructions.
30 Apr 2018, 02:37
donn

Joined: 05 Mar 2010
Posts: 100
In terms of compatibility, the x87 instructions are probably safe to use way down the road. They also support double-extended precision on scalar types: an 80-bit format with a 64-bit significand.

These special x87 features are good reasons to use x87 instead of SSE:

From Intel's optimization manual:
 Quote: Assembly/Compiler Coding Rule 62. (M impact, M generality) Use Streaming SIMD Extensions 2 or Streaming SIMD Extensions unless you need an x87 feature. Most SSE2 arithmetic operations have shorter latency then their X87 counterpart and they eliminate the overhead associated with the management of the X87 register stack.

Aside from the compatibility consideration, there is the performance consideration also. Maybe this is why the transcendental (trigonometric and exponential) functions were not carried over to the SSE/AVX instruction sets in full. You can optimize for latency or throughput differently with different accuracy.

Intel says:

 Quote: Although x87 supports transcendental instructions, software library implementation of transcendental function can be faster in many cases.
They also recommend inlining instead of calling the function for performance. Someone provided a neat lookup-table based method of computing sine and cosine here, and I've implemented sine, cosine, tangent and arctangent with Taylor/Maclaurin series in only a couple of instructions, but with less accuracy.
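As an illustration of the series approach (not the actual routines mentioned above, which are not shown in this thread), a truncated Maclaurin series for sine could look like the following Python sketch. The name sin_maclaurin and the default of four terms are assumptions made for the example.

```python
def sin_maclaurin(x, terms=4):
    """Approximate sin(x) with the first `terms` terms of x - x^3/3! + x^5/5! - ..."""
    total = 0.0
    term = x  # first term of the series
    for n in range(terms):
        total += term
        # next term: multiply by -x^2 / ((2n+2)(2n+3))
        term *= -x * x / ((2 * n + 2) * (2 * n + 3))
    return total


print(sin_maclaurin(0.5))
print(sin_maclaurin(0.5, terms=10))
```

Fewer terms means fewer multiplies but a larger error, and the error grows quickly as x moves away from 0, which is why such routines are usually combined with range reduction.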
30 Apr 2018, 04:23
bitRAKE

Joined: 21 Jul 2003
Posts: 2651
Location: dank orb
Another example of what donn is saying:

https://stackoverflow.com/questions/47025373/fastest-implementation-of-exponential-function-using-sse
(see njuffa's post for different precision routines)
30 Apr 2018, 05:08
revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 15727
Location: (514107) 2015 BZ509
 bitRAKE wrote: Another example of what donn is saying: https://stackoverflow.com/questions/47025373/fastest-implementation-of-exponential-function-using-sse (see njuffa's post for different precision routines)
That is good. But it is complex and uses more of the Icache. If you are doing millions of them each second then it might be worthwhile to invest the time to code it. Else I'd just stick with the x87 single instruction.
30 Apr 2018, 05:21
Tomasz Grysztar
Assembly Artist

Joined: 16 Jun 2003
Posts: 6813
Location: Kraków, Poland
AVX-512 ER comes with VEXP2PS/VEXP2PD - described as "Approximation to the Exponential 2^x of Packed Double-Precision Floating-Point Values with Less Than 2^-23 Relative Error".
30 Apr 2018, 07:21
Mino

Joined: 14 Jan 2018
Posts: 97

So, if I understood correctly: FASM does not have arithmetic instructions to calculate exponents. However, you can use x87 instructions for this, for example this one:
f2xm1
However, does it only compute the power for a given exponent, with the base fixed at 2?

Would you recommend that I create a dedicated function ( https://pastebin.com/r9TUmUuT ), as shown in the links in your answers, or use this instruction (or others)?

_________________
The best way to predict the future is to invent it.
30 Apr 2018, 07:41
DimonSoft

Joined: 03 Mar 2010
Posts: 231
Location: Belarus
 Mino wrote: FASM does not have arithmetic instructions to calculate exponents.

Instructions are implemented in CPU, FASM doesn’t really have an xadd.

 Mino wrote: Would you recommend me to create a dedicated function ( https://pastebin.com/r9TUmUuT ), as shown in the links of your answers, or use this instruction (or other(s) )?

There will always be things that are not readily available.
30 Apr 2018, 11:13
revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 15727
Location: (514107) 2015 BZ509

DimonSoft wrote:
 Mino wrote: FASM does not have arithmetic instructions to calculate exponents.

Instructions are implemented in CPU, FASM doesn’t really have an xadd.

I expect Mino meant "FASM does not have arithmetic operators to calculate exponents".
30 Apr 2018, 11:16
donn

Joined: 05 Mar 2010
Posts: 100
Right, the quickest and simplest way to get an exponential function implemented (mul iterations could be used if you are just squaring, for instance) is:

 Quote: Else I'd just stick with the x87

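You can put your base (x) in st0, exponent (y) in st1, execute fyl2x, execute f2xm1, and then add 1, I believe: x^y = z. Note that f2xm1 only accepts arguments from -1.0 to +1.0, so when y*log2(x) falls outside that range the integer part has to be split off first (frndint) and applied afterwards with fscale.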

I'm a bit tired today, but I think those are the steps, which could be verified (or disproved) with a calculator or testing. If you want a base other than 2, you can use these two instructions in conjunction, as the AMD docs mention.
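One way to check these steps without an assembler is to mirror them in a high-level language. The Python sketch below is only an emulation of the arithmetic, under the assumption that x > 0; pow_x87_style is a hypothetical name, and the comments name the x87 instruction each line stands in for. It includes the frndint/fscale split, since f2xm1 is only defined for arguments between -1.0 and +1.0.

```python
import math


def pow_x87_style(x, y):
    """Compute x**y for x > 0 by following the x87 sequence step by step."""
    t = y * math.log2(x)     # fyl2x: st1 * log2(st0)
    n = round(t)             # frndint: nearest integer part of t
    f = t - n                # fsub: fractional part, within [-0.5, 0.5]
    m = 2.0 ** f - 1.0       # f2xm1: 2^f - 1, needs -1 <= f <= 1
    m += 1.0                 # fld1 + faddp: add the 1 back
    return math.ldexp(m, n)  # fscale: multiply by 2^n


print(pow_x87_style(2.0, 10.0))
print(pow_x87_style(3.0, 2.5))
```

Without the split, an input such as 2^10 would hand f2xm1 the value 10, which is outside its valid range.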

If you want to tune or improve performance, inlining could be the next step, or the AVX-512 ER VEXP2PS/VEXP2PD instructions if you have those.

Again, I'm a bit tired today so someone may correct me if what I'm saying is off, but I've received a lot of good advice from these guys here, so their recommendations are pretty safe bets for learning and in practice. Also, I've found the cvtss2si instructions helpful (there are a few combinations documented in the manuals) for debugging if you don't have a way to view floating point numbers yet.
30 Apr 2018, 15:36
Mino

Joined: 14 Jan 2018
Posts: 97
Thank you very much for those explanations and clarifications. This will probably be very useful for the rest of my project.
30 Apr 2018, 20:01
rugxulo

Joined: 09 Aug 2005
Posts: 2279
Location: Usono (aka, USA)
Check Free Pascal's rtl/i386/math.inc for function fpc_exp_real (etc etc).
02 May 2018, 02:07
Furs

Joined: 04 Mar 2016
Posts: 1125
FYI x87 is part of the ABI on both x86 and x64 (for the latter it's when you use the long double type in C). I don't think it will be dropped. It would be a real shame if they did drop it, since 80-bit precision is really useful for certain cases (especially when working with math on 64-bit integers, which double can't -- I mean "math" like sqrt on them and stuff like that).
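The point about 64-bit integers can be seen with a small experiment. A double has a 53-bit significand, so large 64-bit values get rounded before sqrt ever runs. The Python sketch below only demonstrates the double side of the problem (Python floats are doubles; it cannot show the 80-bit format, whose 64-bit significand holds every 64-bit integer exactly). The function names are hypothetical.

```python
import math


def sqrt_u64_via_double(n):
    """Integer square root computed by going through a double."""
    return int(math.sqrt(float(n)))  # float(n) may round n first


def sqrt_u64_exact(n):
    """Exact integer square root, for comparison."""
    return math.isqrt(n)


n = 2**64 - 1  # largest unsigned 64-bit integer
print(float(2**53 + 1) == float(2**53))  # the +1 is lost in a double
print(sqrt_u64_via_double(n), sqrt_u64_exact(n))
```

For the largest unsigned 64-bit value, the conversion rounds up to 2^64, so the double route returns 2^32 where the exact answer is 2^32 - 1.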

It's also unique with the register stack, and compact in encoding, which is pretty cool concept for me (stack-machines are interesting also). I know I'm against the "trend" or pissing against the wind since everyone seems to hate the register stack and prefer straight (and bloated encoding) of registers with 3 or more operands... because of "simplicity" (that's arguable, to me stack-based machine is MUCH simpler to implement, and x87 is far simpler than SSE, but whatever) or due to retarded compilers' generated code... meh
02 May 2018, 20:08
DimonSoft

Joined: 03 Mar 2010
Posts: 231
Location: Belarus
 Furs wrote: FYI x87 is part of the ABI on both x86 and x64 (for the latter it's when you use the long double type in C). I don't think it will be dropped. It would be a real shame if they did drop it, since 80-bit precision is really useful for certain cases (especially when working with math on 64-bit integers, which double can't -- I mean "math" like sqrt on them and stuff like that). It's also unique with the register stack, and compact in encoding, which is pretty cool concept for me (stack-machines are interesting also). I know I'm against the "trend" or pissing against the wind since everyone seems to hate the register stack and prefer straight (and bloated encoding) of registers with 3 or more operands... because of "simplicity" (that's arguable, to me stack-based machine is MUCH simpler to implement, and x87 is far simpler than SSE, but whatever) or due to retarded compilers' generated code... meh

It won’t be dropped, for compatibility reasons. They may try some day, but that is going to be a big failure.

And, speaking about SSE, let me say that instruction names are overbloated there. It’s always a pain to look for the right instruction, especially if you want to target, say, SSE2 and earlier, not the most recent version.
02 May 2018, 22:03
rugxulo

Joined: 09 Aug 2005
Posts: 2279
Location: Usono (aka, USA)
I don't think "standard" C supports "long double". IIRC, anything using MSVCRT.DLL (e.g. TinyC) only supports "long double" same as "double" (64-bit). OpenWatcom too, IIRC, but I could be remembering incorrectly. Not sure if all standard(s) are the same on this, though (C99 vs. C11 or C17). Things do change sometimes. Just because GCC supports it doesn't mean everyone else does. (Not sure about MinGW either, which probably still "mostly" relies on MSVCRT.DLL.)

Another problem with the FPU is that it's always misaligned, but I guess the OS can (sometimes) be smart enough to use (late P2-era) FXSAVE, which is reputedly faster than FNSAVE.

IIRC, many architectures (DEC Alpha?) didn't support beyond "double" (64-bit) anyways. IIRC, Oberon usually supports "REAL" and "LONGREAL" but nothing else. Turbo Pascal/Delphi/FPC all have Extended, but I don't know how (or if) it fully supports that on AMD64/SSE2 or whatever other platforms.

I always got the feeling that Intel wanted to deprecate/remove x87 entirely in favor of SSE2. Nowadays, with AVX-512, who knows?? GCC always assumed an FPU, but I swear many compilers have become "SSE2 only" anyways. So while I may not agree, I do think it's definitely deprecated and shunned and won't be supported forever.
05 May 2018, 00:17
Furs

Joined: 04 Mar 2016
Posts: 1125
long double has existed since C89 and was improved in C99, which even added standard library functions for it (the 'l' suffix versions of the math functions). As usual, it's msvcrt which is non-compliant, but that's nothing new.

long double is part of the Linux ABI and calling convention in both 32-bit and 64-bit. For example in x64 (which assumes SSE2, that's why I'm mentioning it), if you make a function returning long double, it's required to place the value in x87. Even returning a "complex" long double is defined in the ABI -- where it's mandated to return the real part in st0 and imaginary in st1.

But the type itself also usually forces code to use x87. Yeah, GCC is great here since you can fine-tune the fpmath to use with a setting, even without long double, but I think it works at least with Clang and ICC, no idea about Visual Studio's compiler.
05 May 2018, 23:37
rugxulo

Joined: 09 Aug 2005
Posts: 2279
Location: Usono (aka, USA)
06 May 2018, 01:11
Mino

Joined: 14 Jan 2018
Posts: 97
 Furs wrote: As usual, it's msvcrt which is non-compliant, but that's nothing new.

Is it then a "bad" thing to use it in our projects? Is there another library (compatible with Linux by the way) that is more "powerful" and compliant?

_________________
The best way to predict the future is to invent it.
06 May 2018, 09:42