flat assembler
Message board for the users of flat assembler.

Assembling HeavyThing 1.24 with fasmg

Tomasz Grysztar
Joined: 16 Jun 2003
Posts: 8000
Location: Kraków, Poland
The HeavyThing library is one of the largest publicly available projects written in fasm, and I've been using it as a testing ground and a benchmark for fasmg for years. With the x86-64 instruction sets rewritten in CALM, the performance is finally good enough that I decided to publish the macros needed to assemble the latest HeavyThing with fasmg. In fact, the assembly time is now similar for fasm 1 and fasmg (mainly thanks to fasm 1 spending more time on preprocessing).

I added a layer of fun to this by preparing the macros to assemble HeavyThing 1.24 sources without having to change any of the original files - which made this a challenge somewhat similar to my emulation of some old DOS assemblers.

First the list of ingredients:

And to assemble, for example, the rwasa web server, go to HeavyThing-1.24/rwasa/ and use the command:
Code:
fasmg rwasa.asm -iinclude\ \'../fasmg/ht.inc\'    
or, if you are on Windows:
Code:
fasmg rwasa.asm -iinclude('..\fasmg\ht.inc')    


You may notice that the file produced by fasmg is a little smaller than the one originally made by fasm. This is because of the "globals" macro (defined in dataseg_macros.inc), which moves the data definitions into a common place (marked with "globalVars"). Because this is done by the preprocessor, it ignores the IF conditions. Even though the implementation of the macro has been extended to trace the conditions, it only recognizes ones of the form "if used ...", and not all of them have that form. This causes fasm to include some data that is not really necessary. But I found a simple remedy - we can use a "canary" variable to check the circumstances and decide whether a specific block should be assembled.

Let's then replace the macros in HeavyThing-1.24/dataseg_macros.inc with these:
Code:
macro globals {
        local   z
        .gvar_stack equ z
        canary_#z = 1
        macro   z
}

macro globalVars {
        irpv z,.gvar_stack \{
                if defined canary_\#z
                        z
                end if
        \}
}    
and now fasm 1 also produces the smaller file. Comparing it with the one produced by fasmg, you should see that the only differences are in some alignment fillers in the data section (and these are of no consequence).
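
As a rough illustration of why the canary helps, the following Python sketch models the two stages. It is only a model under simplifying assumptions, not fasm code: the block names and sizes are made up, the "if used" tracing of the original macro is ignored, and `condition` stands for the outcome of whatever IF surrounds a "globals" block at assembly time.

```python
from dataclasses import dataclass


@dataclass
class GlobalsBlock:
    name: str        # hypothetical label, for illustration only
    size: int        # bytes of data the block would define
    condition: bool  # outcome of the enclosing IF at assembly time


def emitted_size(blocks, use_canary):
    """Total bytes placed at "globalVars"."""
    total = 0
    for block in blocks:
        # The preprocessor has recorded every block, whatever IF surrounds it.
        # The "canary = 1" line stays inside the IF, so it is only defined
        # when the condition holds.
        canary_defined = block.condition
        if use_canary and not canary_defined:
            continue  # "if defined canary" fails, the block is skipped
        total += block.size
    return total


blocks = [
    GlobalsBlock("needed_table", 64, True),
    GlobalsBlock("unneeded_table", 128, False),
]
without_canary = emitted_size(blocks, use_canary=False)  # 192
with_canary = emitted_size(blocks, use_canary=True)      # 64
```

In this model the version without the canary emits both blocks, while the version with the canary emits only the block whose condition held, which corresponds to the smaller output file.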


Attachment: HeavyFlex.zip (2.6 KB)
Description: To be unpacked into HeavyThing-1.24/fasmg

Post 13 Feb 2022, 21:58
Tomasz Grysztar
All the files in HeavyFlex.zip are quite small, so I'm going to show their text in full here, and comment on what they do.

HeavyThing-1.24/fasmg/ht.inc:
Code:
include '@@.inc'                        ; emulate fasm's anonymous labels

include 'format/format.inc'             ; emulate fasm's FORMAT directive

macro? format? options*
        format options
        ; by default, emulated FORMAT only includes the basic x86 CPU instruction set,
        ; any required extensions need to be included manually:
        include 'fasmg/cpu/ext/sse4.1.inc'
        include 'fasmg/cpu/ext/aes.inc'
end macro

macro? include: file*
        local path
        path = file

        if path eq 'align_macros.inc'
                path = 'fasmg/align_macros.inc'

        else if path eq 'dataseg_macros.inc'
                path = 'fasmg/dataseg_macros.inc'

        else if path eq 'call.inc'
                path = 'fasmg/call.inc'

        else if path eq 'cleartext.inc' ; starting here, legacy macro emulation is needed
                include 'fasmg/macro.inc'
        end if

        include? path
end macro    
This is the main header, which provides basic fasm 1 compatibility (emulating things like FORMAT and anonymous labels) and also overrides the INCLUDE directive to provide replacement packages for some of the macro sets. The replacement "include" is a lowercase one, and is defined as recursive, so "include?" is used to call the original case-insensitive directive.
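
The redirection performed by the overridden "include" can be summarised with a small Python model. This is only a sketch of the decision logic in the macro above, not of how fasmg resolves paths; the function name `resolve_include` and the file name `epoll.inc` in the usage are made up for illustration.

```python
# Files whose fasm 1 versions are swapped for fasmg replacements.
REPLACEMENTS = {
    "align_macros.inc": "fasmg/align_macros.inc",
    "dataseg_macros.inc": "fasmg/dataseg_macros.inc",
    "call.inc": "fasmg/call.inc",
}


def resolve_include(path):
    """Return the files that end up being included, in order."""
    files = []
    if path in REPLACEMENTS:
        path = REPLACEMENTS[path]
    elif path == "cleartext.inc":
        # From this file on, the legacy macro emulation is loaded first,
        # and then the original file is included unchanged.
        files.append("fasmg/macro.inc")
    files.append(path)
    return files


replaced = resolve_include("call.inc")
with_emulation = resolve_include("cleartext.inc")
untouched = resolve_include("epoll.inc")
```

Any file that is not listed passes through unchanged, which is what allows the original HeavyThing sources to stay unmodified.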

HeavyThing-1.24/fasmg/@@.inc:
Code:
macro? @INIT name,prefix

        macro? name tail&
                match label, prefix#f?
                        label tail
                        prefix#b? equ? prefix#f?
                        prefix#r? equ? prefix#f?
                end match
                local anonymous
                prefix#f? equ? anonymous
        end macro

        define prefix#f?
        name

end macro

@INIT @@,@    
Emulation of fasm's anonymous labels, based on fasmg's advanced package (available at https://github.com/tgrysztar/fasmg/blob/master/packages/utility/@@.inc).
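
For readers unfamiliar with anonymous labels, here is a minimal Python model of the behaviour being emulated: "@b" (or "@r") refers to the nearest preceding "@@" label and "@f" to the nearest following one. It is a sketch only. It assumes each "@@:" sits on its own line and does not reflect how the macro above tracks the labels internally.

```python
def resolve_anonymous(lines):
    """Map the index of each referencing line to the index of its @@ label."""
    labels = [i for i, line in enumerate(lines) if line.strip() == "@@:"]
    resolved = {}
    for i, line in enumerate(lines):
        tokens = line.replace(",", " ").split()
        if "@f" in tokens:
            # nearest anonymous label after this line
            resolved[i] = min(p for p in labels if p > i)
        elif "@b" in tokens or "@r" in tokens:
            # nearest anonymous label before this line
            resolved[i] = max(p for p in labels if p < i)
    return resolved


source = [
    "@@:",
    "dec ecx",
    "jnz @b",
    "jmp @f",
    "nop",
    "@@:",
    "ret",
]
targets = resolve_anonymous(source)  # {2: 0, 3: 5}
```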

HeavyThing-1.24/fasmg/align_macros.inc:
Code:
struc bstr? bytes&
        virtual at 0
                db bytes
                load . : $ from 0
        end virtual
end struc

define aligncode aligncode
namespace aligncode
        ?15 bstr 0x66, 0xf, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00, \
                 0x66, 0xf, 0x1f, 0x44, 0x00, 0x00
        ?14 bstr 0x66, 0xf, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00, \
                 0xf, 0x1f, 0x44, 0x00, 0x00
        ?13 bstr 0x66, 0xf, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00, \
                 0xf, 0x1f, 0x40, 0x00
        ?12 bstr 0x66, 0xf, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00, \
                 0xf, 0x1f, 0x00
        ?11 bstr 0x66, 0xf, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00, \
                 0x66, 0x90
        ?10 bstr 0x66, 0xf, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00, \
                 0x90
        ?9 bstr 0x66, 0xf, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00
        ?8 bstr 0xf, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00
        ?7 bstr 0xf, 0x1f, 0x80, 0x00, 0x00, 0x00, 0x00
        ?6 bstr 0x66, 0xf, 0x1f, 0x44, 0x00, 0x00
        ?5 bstr 0xf, 0x1f, 0x44, 0x00, 0x00
        ?4 bstr 0xf, 0x1f, 0x40, 0x00
        ?3 bstr 0xf, 0x1f, 0x00
        ?2 bstr 0x66, 0x90
        ?1 bstr 0x90
        ?0 := ''
end namespace

define virtual          virtual
define end_virtual      end virtual

iterate <name,alignment>, falign,function_alignment, calign,inner_alignment, dalign,data_alignment

        define name     align alignment

        calminstruction name
                check   alignment
                jno     done
                local   a
                assemble virtual
                assemble name
                compute a, $ - $$
                assemble end_virtual
                check   a
                jno     done
                arrange a, =db =aligncode.a
                assemble a
            done:
        end calminstruction

end iterate    
A replacement for the code alignment macros that HeavyThing uses, implemented with CALM for better performance.
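
The effect of these macros can be modelled in a few lines of Python: measure how many bytes are missing to the next aligned address, then emit the filler of exactly that length from the table. This is a sketch rather than a translation of the CALM code. The byte values are copied from the "aligncode" namespace above, the function name is made up, and alignments above 16 are assumed not to occur.

```python
# Filler table copied from the aligncode namespace: entry n is n bytes long.
FILLERS = {
    0: b"",
    1: bytes.fromhex("90"),
    2: bytes.fromhex("6690"),
    3: bytes.fromhex("0f1f00"),
    4: bytes.fromhex("0f1f4000"),
    5: bytes.fromhex("0f1f440000"),
    6: bytes.fromhex("660f1f440000"),
    7: bytes.fromhex("0f1f8000000000"),
    8: bytes.fromhex("0f1f840000000000"),
    9: bytes.fromhex("660f1f840000000000"),
}
# Entries 10..15 are the 9-byte filler followed by a shorter one.
for extra in range(1, 7):
    FILLERS[9 + extra] = FILLERS[9] + FILLERS[extra]


def alignment_filler(offset, alignment):
    """Bytes to emit at `offset` so that the next address is aligned."""
    if alignment == 0:
        return b""  # mirrors the "check alignment / jno done" early exit
    pad = -offset % alignment
    return FILLERS[pad]


filler = alignment_filler(0x1003, 16)  # 13 bytes are needed to reach 0x1010
```

Using one table entry per padding length keeps the filler to at most two instructions, instead of a run of single-byte NOPs.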

HeavyThing-1.24/fasmg/call.inc:
Code:
macro? call? target*
        if align_callreturns
                local c, r, a
                virtual
                        align 16
                        a = $ - $$
                end virtual
                a = a - (r - c)
                ; a can be negative as well as positive
                repeat 1, pa: a and 0Fh
                        db aligncode.pa
                end repeat
        c:      call    target
        r:
        else if align_returns
                local   c
                push    c
                jmp     target
                calign
                c:
        else
                call    target
        end if
end macro    
Conversion of the automatically-aligning "call" macro, making use of the alignments table defined in the previous file.
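
The arithmetic in the "align_callreturns" branch may be easier to follow as a worked example. The padding is chosen so that the address right after the call (the return address) lands on a 16-byte boundary. The Python sketch below mirrors the computation `a = a - (r - c)` followed by `a and 0Fh`; the function name and the sample addresses are illustrative assumptions.

```python
def call_padding(address, call_length):
    """Number of filler bytes to place before a call of `call_length` bytes."""
    to_boundary = -address % 16      # what "align 16" would add at this address
    a = to_boundary - call_length    # may be negative as well as positive
    return a & 0x0F                  # wrap into the range 0..15


# A 5-byte call at 0x1003: 13 bytes to the boundary, minus 5 for the call.
pad = call_padding(0x1003, 5)            # 8
return_address = 0x1003 + pad + 5        # 0x1010

# A case where the intermediate value is negative: 2 - 5 = -3, wrapped to 13.
pad_negative = call_padding(0x100E, 5)   # 13
```

Masking with 0Fh is what makes the negative case work: when the call does not fit before the next boundary, the padding pushes the return address to the boundary after it.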

HeavyThing-1.24/fasmg/dataseg_macros.inc:
Code:
define globalVars

macro? globals arg&
        local status
        status = 0
        calminstruction ?! line&
                match   , line
                jyes    done
                check   status
                jyes    already_opened
                match   {line, line
                jyes    initial
                match   {, line
                jyes    open
                arrange line, =err 'syntax error'
                assemble line
                exit
            open:
                compute status, 1
                exit
            initial:
                compute status, 1
            already_opened:
                local   post
                match   line} post?, line
                jyes    final
                match   } post?, line
                jyes    close
                take    globalVars, line
                exit
            final:
                take    globalVars, line
            close:
                arrange line, =purge ?
                assemble line
                assemble post
            done:
        end calminstruction
        arg
end macro

calminstruction globalVars
        local   tmp
    reverse:
        take    tmp, globalVars
        jyes    reverse
    execute:
        take    globalVars, tmp
        jno     done
        assemble globalVars
        jump    execute
    done:
end calminstruction    
These are a bit complex, because we need to emulate fasm's brace-enclosed blocks. CALM helps to avoid spending too many of the assembler's resources on handling the foreign syntax.
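
The "globalVars" instruction relies on the stack-like behaviour of "take": the collected lines are first moved onto a temporary stack, which reverses them, and then moved back one by one while each is assembled, so they come out in their original order. A minimal Python model of just that part, with lists standing in for the symbol value stacks (the sample data lines are made up):

```python
def collect(stack, lines):
    """Model of "take globalVars, line" inside the globals macro."""
    for line in lines:
        stack.append(line)


def replay(stack):
    """Model of the globalVars instruction; returns the lines as assembled."""
    tmp = []
    while stack:                     # "reverse" loop: take tmp, globalVars
        tmp.append(stack.pop())
    assembled = []
    while tmp:                       # "execute" loop: take globalVars, tmp
        stack.append(tmp.pop())
        assembled.append(stack[-1])  # assemble globalVars
    return assembled


global_vars = []
collect(global_vars, ["counter dq 0", "buffer rb 64"])
collect(global_vars, ["flags dd 0"])
order = replay(global_vars)
```

After the replay the lines are back on the original stack, so in this model the collected definitions are not lost by executing them.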

HeavyThing-1.24/fasmg/macro.inc:
Code:
calminstruction macro declaration&
        local any, name
        match any?[name], declaration
        jno pass
        local dbg
        initsym dbg, display 'Warning: unsupported legacy syntax, attempting partial translation...',10,9,'macro ',declaration,10
        stringify declaration
        assemble dbg
        arrange declaration, =macro? any name&
        assemble declaration
        exit
    pass:
        arrange declaration, =macro? declaration
        assemble declaration
end calminstruction

macro common?
end macro

macro? macro declaration&
        local started, ns
        started = 0
        define ns ns
        calminstruction ?! line&
                local command, name
                check started
                jyes body
                match command? { line?, line
                jno pass
                assemble command
                compute started, 1
            body:
                match line? } command?, line
                jyes close
                match =local? line, line
                jno pass
                arrange command, =local
            local_list:
                match item=,line, line
                jyes local_item
                arrange item, line
                arrange line,
            local_item:
                arrange name, item
                match .item, item
                jno local_item_ready
                match .item, item
                arrange name, ns.name
                publish name, item
            local_item_ready:
                arrange command, command item
                match , line
                jyes local_list_ready
                arrange command, command=,
                jump local_list
            local_list_ready:
                assemble command
                exit
            pass:
                transform line, ns
                assemble line
                exit
            close:
                transform line, ns
                assemble line
                arrange line, =end =macro
                assemble line
                arrange line, =purge ?
                assemble line
                assemble command
                exit
        end calminstruction
        esc macro declaration
end macro

calminstruction (var) equ! value&       ; make EQU unconditional to simulate separation of preprocessing stage
        transform value
        publish var, value
end calminstruction    
And here is the emulation of some features of fasm's preprocessor, just enough to be able to assemble all other macros that HeavyThing defines. You may notice that "macro" is redefined twice - I separated different types of conversion to make it more manageable.

The first one just replaces declarations like:
Code:
macro cleartext name*, [val*]    
with
Code:
macro cleartext name*, val*&    
and chooses to ignore the keyword COMMON. This is not a general solution (for example, if there was any FORWARD block, it would need to be converted into ITERATE, etc.), but it's enough for the purposes of assembling HeavyThing.
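
The rewrite of the declaration can be modelled as a simple pattern substitution. The Python sketch below is an approximation and not a port of the CALM code: it assumes a single bracketed group at the end of the declaration, and the function name and the sample macro "store_pair" are made up.

```python
import re

# A declaration ending in a bracketed argument group, e.g. "name*, [val*]".
LEGACY = re.compile(r"^(?P<head>.*)\[(?P<group>[^\]]+)\]$")


def translate_declaration(declaration):
    """Turn a fasm 1 style macro header into the form described above."""
    found = LEGACY.match(declaration)
    if found is None:
        return "macro " + declaration  # nothing to translate
    # "[val*]" becomes "val*&": the rest of the line goes into one argument.
    return "macro " + found.group("head") + found.group("group") + "&"


cleartext = translate_declaration("cleartext name*, [val*]")
url_addone = translate_declaration("url_addone port*, [val*]")
plain = translate_declaration("store_pair first, second")
```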

The second one converts the macro body, handling the brace-enclosed syntax, and also handling LOCAL declarations that don't work with fasmg because of its namespaces design:
Code:
local ..str,..sz,..ch,..ci,..cj,..cc,..ign,..dat,..idx,..pad    
These are converted into names with stripped dots - with fasmg they are just symbols in a completely separate namespace.
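
As a rough model of that conversion, the Python sketch below strips up to two leading dots from each name in a LOCAL list, matching the two successive dot checks in the macro above. It only shows the renaming. The namespace bookkeeping that makes later references in the macro body resolve to the new names is left out, and the function names are made up.

```python
def strip_dots(name):
    """Remove at most two leading dots, as the two match steps do."""
    for _ in range(2):
        if name.startswith("."):
            name = name[1:]
    return name


def translate_local(declaration):
    """Rewrite "local ..a,..b" into "local a,b"."""
    keyword, _, rest = declaration.strip().partition(" ")
    if keyword.lower() != "local":
        raise ValueError("not a LOCAL declaration")
    names = [strip_dots(item.strip()) for item in rest.split(",")]
    return "local " + ",".join(names)


converted = translate_local("local ..str,..sz,..ch")
```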

Finally, I had to make EQU be processed unconditionally, like fasm 1 does it, because of this line in HeavyThing-1.24/vdso.inc:
Code:
gettimeofday equ qword [vdso_gettimeofday]    
which is under the "if used vdso_gettimeofday" guard. If this definition does not happen, then "call gettimeofday" is not recognized as using "vdso_gettimeofday", preventing the block from ever being assembled even though it's needed.
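
The circular dependency described here can be shown with a toy Python model of the multi-pass resolution. It is a deliberately simplified sketch: a single boolean stands for the "used vdso_gettimeofday" prediction, and the only reference to that symbol is assumed to come through the EQU alias.

```python
def block_assembled(equ_unconditional, max_passes=10):
    """Whether the "if used vdso_gettimeofday" block ends up assembled."""
    used = False  # prediction carried over from the previous pass
    for _ in range(max_passes):
        # The EQU line sits inside the guarded block, so normally it only
        # takes effect when the block itself is assembled.
        equ_defined = equ_unconditional or used
        # "call gettimeofday" counts as a use of vdso_gettimeofday only
        # when the alias has been defined.
        used_now = equ_defined
        if used_now == used:
            break  # the prediction is stable, resolution is finished
        used = used_now
    return used


like_fasm1 = block_assembled(equ_unconditional=True)
conditional = block_assembled(equ_unconditional=False)
```

With a conditional EQU the model settles on the state where the block is never assembled, which is the failure described above. Processing EQU unconditionally breaks the cycle.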
Post 13 Feb 2022, 22:27
Tomasz Grysztar
And if you're just interested in what the results look like, here's the outcome of the process on my laptop:
Code:
~/HeavyThing-1.24/rwasa$ ~/fasm/fasm.x64 rwasa.asm -m523288
flat assembler  version 1.73.29  (523288 kilobytes memory, x64)
10 passes, 8.0 seconds, 482392 bytes.
~/HeavyThing-1.24/rwasa$ ~/fasmg/fasmg.x64 rwasa.asm -iinclude\ \'../fasmg/ht.inc\'
flat assembler  version g.jje9
Warning: unsupported legacy syntax, attempting partial translation...
        macro cleartext name*, [val*]
Warning: unsupported legacy syntax, attempting partial translation...
        macro url_addone port*, [val*]

10 passes, 8.6 seconds, 482392 bytes.    
(This is with HeavyThing-1.24/dataseg_macros.inc replaced as advised in the first post above.)
Post 13 Feb 2022, 22:40
redsock
Joined: 09 Oct 2009
Posts: 378
Location: Australia
This is seriously an impressive bit of effort, and it definitely highlights just how flexible fasmg really is. While I don't entirely have my head around all of the fasmg enhancements you made to get this working, wow!

For reference only, here are the results for fasm 1 on my workstation (note: fasm 1 does not provide sub-second resolution on its timers):
Code:
# perf stat fasm.x64 -m 524288 rwasa.asm
flat assembler  version 1.73.29  (524288 kilobytes memory, x64)
10 passes, 5.0 seconds, 482456 bytes.

 Performance counter stats for 'fasm.x64 -m 524288 rwasa.asm':

           4583.81 msec task-clock                #    0.998 CPUs utilized          
              1160      context-switches          #    0.253 K/sec                  
                22      cpu-migrations            #    0.005 K/sec                  
               684      page-faults               #    0.149 K/sec                  
       19034092477      cycles                    #    4.152 GHz                      (83.36%)
        1290480445      stalled-cycles-frontend   #    6.78% frontend cycles idle     (83.35%)
        7150562046      stalled-cycles-backend    #   37.57% backend cycles idle      (83.36%)
       25364559186      instructions              #    1.33  insn per cycle         
                                                  #    0.28  stalled cycles per insn  (83.36%)
        8717119436      branches                  # 1901.719 M/sec                    (83.27%)
          27197734      branch-misses             #    0.31% of all branches          (83.30%)

       4.591537328 seconds time elapsed

       4.543371000 seconds user
       0.039994000 seconds sys
    

And the results for fasmg:
Code:
# perf stat fasmg.x64 rwasa.asm -iinclude\ \'../fasmg/ht.inc\'
flat assembler  version g.jje9
Warning: unsupported legacy syntax, attempting partial translation...
        macro cleartext name*, [val*]
Warning: unsupported legacy syntax, attempting partial translation...
        macro url_addone port*, [val*]

10 passes, 6.1 seconds, 482392 bytes.

 Performance counter stats for 'fasmg.x64 rwasa.asm -iinclude '../fasmg/ht.inc'':

           6145.83 msec task-clock                #    0.999 CPUs utilized          
               920      context-switches          #    0.150 K/sec                  
                54      cpu-migrations            #    0.009 K/sec                  
              7809      page-faults               #    0.001 M/sec                  
       25469419116      cycles                    #    4.144 GHz                      (83.29%)
        1036223997      stalled-cycles-frontend   #    4.07% frontend cycles idle     (83.35%)
        2844452418      stalled-cycles-backend    #   11.17% backend cycles idle      (83.35%)
       34884610370      instructions              #    1.37  insn per cycle         
                                                  #    0.08  stalled cycles per insn  (83.35%)
       11915497351      branches                  # 1938.793 M/sec                    (83.35%)
          62137755      branch-misses             #    0.52% of all branches          (83.30%)

       6.151490670 seconds time elapsed

       6.137656000 seconds user
       0.007996000 seconds sys
    


Very cool! Will your classic Pentium machine assemble it?

_________________
2 Ton Digital - https://2ton.com.au/
Post 13 Feb 2022, 23:34
Tomasz Grysztar
redsock wrote:
Will your classic Pentium machine assemble it?
Memory is the main issue there. That machine has 24 MB of RAM (and this is already extended from its original 8 MB). Even though assembling HeavyThing with fasmg now uses significantly less memory, it still requires ~42 MB. I could make it use a swap file (even under DOS, with the help of CWSDPMI), but then it is going to be sloooow.
Post 13 Feb 2022, 23:52
Copyright © 1999-2020, Tomasz Grysztar.