flat assembler
Message board for the users of flat assembler.
Tomasz Grysztar 14 Apr 2012, 20:32
Since AMD implemented FMA4 while Intel decided to use the 3-operand FMA, we now have two sets of instructions that do essentially the same thing but have different syntaxes and encodings. They are still similar enough that one set can be emulated with the other by using some simple macroinstructions to do the conversion. I decided to give it a try.
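For reference, the operand roles of the two instruction sets can be sketched as plain fasm source with comments. This is only an illustrative listing: the register choices are arbitrary, and it assumes a fasm version that knows both the FMA and the FMA4 mnemonics. The three digits in an FMA mnemonic name which operands are multiplied and which one is added, while FMA4 always multiplies its second and third operands and adds the fourth.

```asm
use32

; Intel FMA: three operands, the destination is also one of the sources
vfmadd132ps  xmm0,xmm1,xmm2       ; xmm0 = xmm0*xmm2 + xmm1
vfmadd213ps  xmm0,xmm1,xmm2       ; xmm0 = xmm1*xmm0 + xmm2
vfmadd231ps  xmm0,xmm1,xmm2       ; xmm0 = xmm1*xmm2 + xmm0

; the same operand roles apply to the other operations, for example:
vfmsub231ps  xmm0,xmm1,xmm2       ; xmm0 = xmm1*xmm2 - xmm0
vfnmadd231ps xmm0,xmm1,xmm2       ; xmm0 = -(xmm1*xmm2) + xmm0

; AMD FMA4: four operands, the destination is separate from the sources
vfmaddps     xmm0,xmm1,xmm2,xmm3  ; xmm0 = xmm1*xmm2 + xmm3
```

Reading the listing this way shows why any FMA form maps onto FMA4 by repeating the destination as a source, and why the reverse mapping needs the FMA4 destination to coincide with one of its sources.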
FMA4 is more flexible than FMA, so every FMA instruction can be encoded as an FMA4 equivalent, but in the other direction it is possible only when the destination is the same register as one of the three sources. So emulation of FMA with FMA4 instructions is very simple and should work for all the correct FMA syntaxes:

Code:
```
irps m, m nm {
  irps a, add sub addsub subadd \{
    irps s, ps pd ss sd \\{

      macro vf#m\#a\\#132\\#s dest,src1,src2 \\\{
        vf#m\#a\\#s dest,dest,src2,src1
      \\\}

      macro vf#m\#a\\#213\\#s dest,src1,src2 \\\{
        vf#m\#a\\#s dest,dest,src1,src2
      \\\}

      macro vf#m\#a\\#231\\#s dest,src1,src2 \\\{
        vf#m\#a\\#s dest,src1,src2,dest
      \\\}

    \\}
  \}
}

; example conversions:

vfmsub231ps ymm1,ymm2,ymm3      ; vfmsubps ymm1,ymm2,ymm3,ymm1
vfnmadd132sd xmm0,xmm5,[ebx]    ; vfnmaddsd xmm0,xmm0,[ebx],xmm5
vfmadd213pd ymm0,ymm1,[esi]     ; vfmaddpd ymm0,ymm0,ymm1,[esi]
```

Emulation of FMA4 with FMA instructions is a bit harder: the FMA4 instruction needs to have its destination be the same register as one of its source operands, otherwise the simple conversion is not possible:

Code:
```
irps m, m nm {
  irps a, add sub addsub subadd \{
    irps s, ps pd ss sd \\{

      macro vf#m\#a\\#s dest,src1,src2,src3 \\\{
        if dest eq src3
          vf#m\#a\\#231\\#s dest,src1,src2
        else if dest eq src2
          vf#m\#a\\#213\\#s dest,src1,src3
        else if dest eq src1
          if src2 eqtype [si] | src2 eqtype byte[si]
            vf#m\#a\\#132\\#s dest,src3,src2
          else
            vf#m\#a\\#213\\#s dest,src2,src3
          end if
        else
          err ; not encodable
        end if
      \\\}

    \\}
  \}
}

; example conversions:

vfmaddpd ymm0,ymm0,[esi],ymm2   ; vfmadd132pd ymm0,ymm2,[esi]
vfmsubpd ymm0,ymm1,[esi],ymm0   ; vfmsub231pd ymm0,ymm1,[esi]
vfnmaddps ymm0,ymm1,ymm0,[esi]  ; vfnmadd213ps ymm0,ymm1,[esi]
vfmsubsd xmm0,xmm0,xmm1,[esi]   ; vfmsub213sd xmm0,xmm1,[esi]
```

The variant below also covers the case where the destination matches none of the sources, by first copying one of the sources into the destination with a move instruction:

Code:
```
irps m, m nm {
  irps a, add sub addsub subadd \{
    irps s, ps pd ss sd \\{

      macro vf#m\#a\\#s dest,src1,src2,src3 \\\{
        if dest eq src3
          vf#m\#a\\#231\\#s dest,src1,src2
        else if dest eq src2
          vf#m\#a\\#213\\#s dest,src1,src3
        else if dest eq src1
          if src2 eqtype [si] | src2 eqtype byte[si]
            vf#m\#a\\#132\\#s dest,src3,src2
          else
            vf#m\#a\\#213\\#s dest,src2,src3
          end if
        else
          if src3 eqtype [si] | src3 eqtype byte[si]
            if s eq ps | s eq pd
              vmova\\#s dest,src1
            else
              vmov\\#s dest,dest,src1
            end if
            vf#m\#a\\#213\\#s dest,src2,src3
          else if src1 eqtype [si] | src1 eqtype byte[si]
            if s eq ps | s eq pd
              vmova\\#s dest,src2
            else
              vmov\\#s dest,dest,src2
            end if
            vf#m\#a\\#132\\#s dest,src3,src1
          else
            if s eq ps | s eq pd
              vmova\\#s dest,src3
            else
              vmov\\#s dest,dest,src3
            end if
            vf#m\#a\\#231\\#s dest,src1,src2
          end if
        end if
      \\\}

    \\}
  \}
}

; example conversions:

vfmaddpd ymm0,ymm1,[esi],ymm2   ; vmovapd ymm0,ymm2
                                ; vfmadd231pd ymm0,ymm1,[esi]
vfmsubss xmm0,xmm1,xmm2,[ebx]   ; vmovss xmm0,xmm0,xmm1
                                ; vfmsub213ss xmm0,xmm2,dword [ebx]
```

Note that I'm just playing with macros here. I do not claim that this is the optimal way to do such emulation, nor that it is a good idea at all.