flat assembler
Message board for the users of flat assembler.

fma4 with fasmg?

Melissa 03 Oct 2020, 12:44
I have some tests:
Code:
addition ordinary
201 301 401 501
601 701 801 901
93.544222
addition fma
301 401 501 601
701 801 901 1001
112.271838
addition fma4
401 501 601 701
801 901 1001 1101
137.837093

With FMA4 I get a significant boost. fasm 1 supports FMA4, but fasmg does not.
Zen and Zen+ both support FMA4, even though that is undocumented, but we are assembler
hackers, aren't we? ;)
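
For a rough sense of scale, here is a small Python sketch of the relative gains. It assumes that the last number in each test block is a score where higher is better; the post does not state the unit, so treat the ratios as illustrative only.

```python
# Scores copied from the test output in this post; the unit is not stated there,
# so "higher is better" is an assumption.
scores = {"ordinary": 93.544222, "fma": 112.271838, "fma4": 137.837093}

def speedup(new, old):
    """Ratio of two scores from the table above."""
    return scores[new] / scores[old]

for name in ("fma", "fma4"):
    print(f"{name}: {speedup(name, 'ordinary'):.2f}x relative to ordinary")
# prints:
# fma: 1.20x relative to ordinary
# fma4: 1.47x relative to ordinary
```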
Code:
/.../examples/x86 >>> INCLUDE=./include fasmg gather.asm
flat assembler  version g.j27m
gather.asm [366]:
        vfmaddps ymm4,ymm0,ymm3,ymm2
Processed: vfmaddps ymm4,ymm0,ymm3,ymm2
Error: illegal instruction.

Perhaps I am missing some header?
Code:
include 'format/format.inc'
include 'ext/avx.inc'
include 'ext/avx2.inc'
include 'ext/fma.inc'
    
Tomasz Grysztar 03 Oct 2020, 15:58
The headers I published focus on the instruction sets that ended up officially documented in Intel SDMs, and FMA4 was not among them: it was only officially supported with AMD's XOP. I have not yet published packages for fasmg that would support AMD-specific extensions (like 3DNow! or XOP), mainly because there was no demand. With macro-based encoders it is quite easy to implement new instruction sets, though, and you can support FMA4 with a simple header like:
Code:
include 'cpu/ext/avx.inc'

; Shared encoder for the four-operand FMA4 forms.
; opcode - opcode byte in the 0F 3A map
; msize  - memory operand size in bytes for scalar forms (4 or 8), 0 for packed forms
macro FMA4_instruction opcode,msize,dest,src,src2,src3
        AVX.parse_operand @dest,dest
        AVX.parse_operand @src,src
        AVX.parse_operand @src2,src2
        AVX.parse_operand @aux,src3
        ; dest and src must be registers; src2 and src3 may each be a register or memory
        if @dest.type = 'mmreg' & @src.type = 'mmreg' & (@src2.type = 'mem' | @src2.type = 'mmreg') & (@aux.type = 'mem' | @aux.type = 'mmreg')
                if msize & (@dest.size < msize | (@dest.size > msize & @dest.size <> 16) | (@src2.type = 'mem' & @src2.size and not msize) | (@aux.type = 'mem' & @aux.size and not msize))
                        err 'invalid operand size'
                else if @src.size <> @dest.size | (@src2.size and not @dest.size & (@src2.type = 'mmreg' | msize = 0))
                        err 'operand sizes do not match'
                end if
                if @aux.type = 'mmreg'
                        ; src3 is a register: VEX.W=0, src2 goes to ModRM.rm, src3 to bits 7:4 of the trailing byte
                        AVX.store_instruction @dest.size,VEX_66_0F3A_W0,opcode,@src2,@dest.rm,@src.rm,1,@aux.rm shl 4
                else if @src2.type = 'mmreg'
                        ; src3 is memory: VEX.W=1, src3 goes to ModRM.rm, src2 to bits 7:4 of the trailing byte
                        AVX.store_instruction @dest.size,VEX_66_0F3A_W1,opcode,@aux,@dest.rm,@src.rm,1,@src2.rm shl 4
                else
                        ; both src2 and src3 in memory cannot be encoded
                        err 'invalid combination of operands'
                end if
        else
                err 'invalid combination of operands'
        end if
end macro

iterate <instr,opcode>, vfmaddsub,5Ch, vfmsubadd,5Eh, vfmadd,68h, vfmsub,6Ch, vfnmadd,78h, vfnmsub,7Ch

        macro instr#pd? dest*,src*,src2*,src3*
                FMA4_instruction opcode+1,0,dest,src,src2,src3
        end macro

        macro instr#ps? dest*,src*,src2*,src3*
                FMA4_instruction opcode,0,dest,src,src2,src3
        end macro

        if opcode > 60h

                macro instr#sd? dest*,src*,src2*,src3*
                        FMA4_instruction opcode+3,8,dest,src,src2,src3
                end macro

                macro instr#ss? dest*,src*,src2*,src3*
                        FMA4_instruction opcode+2,4,dest,src,src2,src3
                end macro

        end if

end iterate
I made it just now, so it is not thoroughly tested - but it should be easy to tweak it if necessary.
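
To make the encoding concrete, here is a rough Python sketch of the bytes one would expect for the register-only (VEX.W=0) form, such as the `vfmaddps ymm4,ymm0,ymm3,ymm2` line from the first post. It is an illustration only and not part of the header: the function name `encode_fma4_regs` is hypothetical, registers are passed as plain numbers, and memory operands are not handled.

```python
def encode_fma4_regs(opcode, dest, src1, src2, src3, vlen=256):
    """Encode a register-only FMA4 instruction (VEX.W=0 form) as bytes.

    Hypothetical helper for illustration. Registers are numbers 0..15,
    vlen is the vector length in bits (128 or 256).
    """
    for reg in (dest, src1, src2, src3):
        if not 0 <= reg <= 15:
            raise ValueError("register number out of range")
    if vlen not in (128, 256):
        raise ValueError("vector length must be 128 or 256")

    r_bit = (dest >> 3) & 1          # extends ModRM.reg (dest)
    b_bit = (src2 >> 3) & 1          # extends ModRM.rm (src2)
    # VEX byte 1: inverted R, X, B bits, then map select 00011b (0F 3A)
    vex1 = ((r_bit ^ 1) << 7) | (1 << 6) | ((b_bit ^ 1) << 5) | 0b00011
    # VEX byte 2: W=0, inverted vvvv (src1), L, pp=01b (66 prefix)
    l_bit = 1 if vlen == 256 else 0
    vex2 = ((~src1 & 0xF) << 3) | (l_bit << 2) | 0b01
    modrm = 0xC0 | ((dest & 7) << 3) | (src2 & 7)
    is4 = (src3 & 0xF) << 4          # src3 lives in bits 7:4 of the last byte
    return bytes([0xC4, vex1, vex2, opcode, modrm, is4])

# vfmaddps ymm4,ymm0,ymm3,ymm2 with opcode 68h from the iterate list
print(encode_fma4_regs(0x68, 4, 0, 3, 2).hex(" "))
# prints: c4 e3 7d 68 e3 20
```

The trailing byte corresponds to the `@aux.rm shl 4` argument in the macro, and the `pd`, `ss` and `sd` variants differ only in the opcode byte.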

PS. Perhaps you may also find these macros for fasm 1 interesting: Emulating FMA4 with FMA and vice versa.
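
As a hedged illustration of the general idea (not the code from the linked macros, which may work differently), the following Python sketch picks an FMA3 sequence that computes the same `dest = a*b + c` as the FMA4 `vfmaddps dest,a,b,c` when all operands are registers. The function name `emulate_fma4_vfmaddps` is hypothetical and memory operands are left out.

```python
def emulate_fma4_vfmaddps(dest, a, b, c):
    """Return FMA3 instruction text computing dest = a*b + c.

    Hypothetical helper; all operands are register names given as strings.
    FMA3 overwrites its first operand, so the form depends on which
    source (if any) shares a register with dest.
    """
    if dest == a:
        return [f"vfmadd213ps {dest},{b},{c}"]   # dest = b*dest + c
    if dest == b:
        return [f"vfmadd213ps {dest},{a},{c}"]   # dest = a*dest + c
    if dest == c:
        return [f"vfmadd231ps {dest},{a},{b}"]   # dest = a*b + dest
    # dest is distinct from every source: copy one factor first
    return [f"vmovaps {dest},{a}", f"vfmadd213ps {dest},{b},{c}"]

for line in emulate_fma4_vfmaddps("ymm4", "ymm0", "ymm3", "ymm2"):
    print(line)
# prints:
# vmovaps ymm4,ymm0
# vfmadd213ps ymm4,ymm3,ymm2
```

The extra `vmovaps` in the last case is the cost of FMA3's destructive destination, which is where the four-operand FMA4 form can save an instruction.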



Melissa 04 Oct 2020, 11:13
Thanks!


Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
