Tricky stuff in fasmg, part 2: Namespace separation

Tomasz Grysztar 05 Oct 2016, 22:19
This is the second part of a series on advanced trickery that can be done with fasmg. The other installments: part 1, part 3, part 4.

Edit: this part has been largely deprecated by later changes to how the macro recursion is handled by fasmg. With the new syntax a macro can only be recursive when marked as such and unwanted recursion in embedded namespaces is no longer a problem.
_______

The NAMESPACE directive allows entire sections of source to be processed in separate contexts, avoiding any name clashes. If the sub-modules are assembled in separate namespaces, then they can not only use the same names for various labels, but even their macroinstructions are confined to their local scope. For example, one module may include macros for the 8086 instruction set, while another could use instructions of a different processor, and they would not get in each other's way (the PE formatter macros that come with the examples in the fasmg package do something like this when they define 8086 instruction set macros within a local namespace only to assemble the MZ stub).

A separation of modules can look like this:
Code:
define First
namespace First
        include 'module1.asm'
end namespace

define Second
namespace Second
        include 'module2.asm'
end namespace

        call    First.main
        call    Second.main    
Giving a definition to the parent symbol of a namespace is a recommended practice if this namespace is going to be accessed by name in some other places (like in the CALL instructions in the above sample). But if all that is needed is to separate the namespaces of two sources - maybe because all they have to do is generate some data into the output - a minimal variant works just as well:
Code:
namespace First
        include 'module1.asm'
end namespace
namespace Second
        include 'module2.asm'
end namespace    


There is, however, a dangerous trap hidden there, and it is related to the forward-referencing of symbols.

Let's consider the following framework: we have a global COUNTER variable, initialized in the beginning of source:
Code:
COUNTER = 0    
Now every module may for some reason need to sometimes increase this counter, with an instruction like:
Code:
COUNTER = COUNTER + 1    
If we now try to put these modules into their own namespaces, they are suddenly going to start defining COUNTER inside their local contexts. If one such module contains only one command like the one above, it is not only going to define COUNTER as a symbol local to its namespace, but this symbol will also be allowed to be forward-referenced (because it has only one definition in the entire source), and the construction becomes a self-referencing definition. Such a definition is impossible to satisfy, as there is no value of COUNTER that solves this equation, so the assembly is going to fail.
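To make the trap concrete, the two snippets above can be combined into one minimal source. This is only a sketch: the namespace name Module is a placeholder, and the failure is the one described in the text, not something checked against a particular fasmg release.

```fasmg
COUNTER = 0                     ; global counter, defined at the top level

namespace Module
        ; Intended to bump the global counter, but this defines Module.COUNTER.
        ; It is the only definition of that local symbol, so it may be
        ; forward-referenced, and the right-hand side refers to itself.
        COUNTER = COUNTER + 1
end namespace
```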

The same problem can also apply to symbolic variables: if we had a global LIST variable, perhaps initialized like this:
Code:
LIST equ initial    
and then expanded in modules with commands like:
Code:
LIST equ LIST, element    

then putting the module inside its own namespace would cause the above definition to become circular - this would in theory create an ever-growing text, but fasmg catches such circular references early (also the ones of the form "a equ b"/"b equ a") and signals the problem.

A possible solution to these problems is very simple: the modules should re-define the global variable with constructions like:
Code:
COUNTER. = COUNTER + 1
LIST. equ LIST, element    
The dot after the name of the symbol tells the assembler to look for an already defined symbol with that name, including in parent namespaces, so this way we modify the global symbol instead of creating a local one.
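Put together with the global initializations from earlier, a module that updates both global variables might look like the sketch below. The namespace name Module and the trailing DB line are illustrative assumptions, not part of the original framework.

```fasmg
COUNTER = 0                     ; global numeric variable
LIST equ initial                ; global symbolic variable

namespace Module
        ; The trailing dot makes the assembler look up the already defined
        ; symbol (here the global one) instead of creating Module.COUNTER
        ; or Module.LIST.
        COUNTER. = COUNTER + 1
        LIST. equ LIST, element
end namespace

        db COUNTER              ; uses the global COUNTER as modified by the module
```

Because no local COUNTER or LIST is ever created inside Module, the references on the right-hand side resolve to the global symbols as well.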

The same principle would apply if we created a special globally-accessible namespace where we would keep these variables:
Code:
define Globals
namespace Globals
        COUNTER = 0
        LIST equ initial
end namespace

define Module
namespace Module
        Globals.COUNTER = Globals.COUNTER + 1
        Globals.LIST equ Globals.LIST, element 
end namespace    
The principle is the same because again it is the dot in the identifier that makes the assembler look for the defined symbol in parent namespaces, only this time the dot is followed by the name of a descendant symbol, so it is not the global symbol that gets re-defined, but the symbol inside its namespace.

Interestingly, the same problem can also occur in case of macroinstructions. Let's consider that we have some simple global macroinstruction:
Code:
macro INT value
        dd value
end macro    
and that sub-modules may seek to re-define this macroinstruction to meet their requirements:
Code:
macro INT values&
        iterate value, values
                INT value
        end iterate
end macro    
Normally, when such a re-defined macro calls its own name, it refers to the previous macro with that name. But if we put the second definition inside a local namespace, we get the same result as with numeric or symbolic variables: the local macro now has just one definition and can be forward-referenced, which results in it calling itself recursively. This is very similar to what happens with a circularly-defined symbolic value, but this time fasmg cannot easily detect the problem and only signals an error when it reaches the built-in recursion limit (this limit can be altered with the -r command line option; setting it to some small number like 100 allows such errors to be caught early).
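As a sketch of the failing arrangement, the two macro definitions can be placed in one source with the second one wrapped in a namespace (Module is a placeholder name). This reflects the behavior of the fasmg versions this article was written for; per the note at the top, later versions only allow recursion in macros explicitly marked as recursive.

```fasmg
macro INT value
        dd value
end macro

namespace Module

        ; The only definition of Module.INT, so the inner INT can be
        ; forward-referenced and resolves to this very macro.
        macro INT values&
                iterate value, values
                        INT value
                end iterate
        end macro

        INT 1,2,3       ; recursion until the limit is reached (old behavior)

end namespace
```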

This time adding a dot after the name of a macro is not a valid solution, because a dot causes the assembler to look for a symbol of the expression class, not the instruction class - so it would only find the globally defined INT if it was also defined there as a numeric or symbolic constant or variable. Using a special namespace would work, but this would require the macro to also be used in this way.
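The special-namespace variant could be sketched as follows. The names Globals and Module are carried over from the earlier variable example as assumptions; the point is only that the original macro is always referred to by its qualified name, so the local INT never refers to itself.

```fasmg
define Globals
namespace Globals
        macro INT value
                dd value
        end macro
end namespace

namespace Module

        macro INT values&
                iterate value, values
                        ; qualified name: the macro kept in Globals,
                        ; not the local Module.INT
                        Globals.INT value
                end iterate
        end macro

        INT 1,2,3

end namespace
```

The cost is the one mentioned above: every user of the original macro has to write Globals.INT, which is why this does not help when the macro being wrapped is one of the built-in instructions.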

However there is a different possible solution that may help in this case. If we somehow force the local symbol to be considered variable even when it has just one definition, the infinite recursion is going to disappear. When a variable symbol references in its first definition the same name, the assembler looks for the defined value for that name also outside the local namespace, so it is going to use the global value. And we can force local macroinstruction to become variable by creating a dummy definition and immediately removing it with PURGE:
Code:
macro int value
        dd value
end macro

namespace Module

        ; force variable macro:
        macro int
        end macro
        purge int

        macro int values&
                iterate value, values
                        int value
                end iterate
        end macro

        int 1,2,3

end namespace    
A similar trick can be applied to the symbols of the expression class, but this only makes sense when their modified values need only to be used locally:
Code:
COUNTER = 0

namespace First

        ; force variable:
        define COUNTER
        restore COUNTER

        ; increase counter:
        COUNTER = COUNTER + 1

        ; use the local counter value
        db COUNTER

end namespace

namespace Second

        ; force variable:
        define COUNTER
        restore COUNTER

        ; counting again from 0:
        COUNTER = COUNTER + 1
        COUNTER = COUNTER + 1

        db COUNTER

end namespace    


There is one case when this problem is going to show up frequently when putting some module into its own separate namespace: it is when the module tries to re-define some of the internal instructions of the assembler. All the instructions of fasmg are built-in global symbols, and when a module tries to re-define such an instruction in a way that calls the original one, but does it inside a local namespace, the infinite recursion is going to kick in.

We can see this effect immediately if we try to encapsulate in such way any complete program that uses the PE formatter, for instance the win32.asm example from the fasmg package:
Code:
namespace Win32_Sample
        include 'win32.asm' ; infinite recursion imminent
end namespace    
If we use the "-r100" command line switch to avoid the long wait and detect the recursion early, we are going to notice that it is caused by the re-defined DD instruction.

But we already know how to fix this. To make things simpler, let's use this handy macro:
Code:
macro var? names&
        iterate name, names

                define name
                restore name

                macro name
                end macro
                purge name

                struc name
                end struc
                restruc name

        end iterate
end macro    
For a given name, it forces such symbol to be variable in all the classes (expression, instruction and labeled instruction). Since DD is defined both as an instruction and as a labeled instruction, it is not much of an overkill here:
Code:
namespace Win32_Sample

        var dd?,dq?

        include 'win32.asm'

end namespace    
The PE formatter also re-defines the SECTION instruction, but it does it multiple times on its own, so this one is a variable anyway.

Now, this helps with the recursion, but the above sample would still not assemble - this time because of the POSTPONE used by the PE formatter, since the postponed code gets executed outside of the namespace where we tried to encapsulate this whole program. But in the previous part we already had prepared a macro that allows to execute postponed blocks locally:
Code:
namespace Win32_Sample

        macro postpone?!
            esc macro postponed
        end macro

        macro end?.postpone?!
                postponed
            esc end macro
        end macro

        macro postponed
        end macro

        var dd?,dq?

        include 'win32.asm'

        postponed
        purge postpone?,end?.postpone?

end namespace    
In fact, we could use the entire set of macros that were used to virtualize output and combine them with the namespace encapsulation, to assemble entire program sources in their own "sandboxes":
Code:
macro encapsulate? Namespace

        virtual at 0
        $$% = 0
        @% = 0
        define $% ($$% + $ - $$)
        define $%% ($$% + $@ - $$ - 1/($@-$$+1)*@%)

        macro org? address
                local addr
                addr = address
                $$%. = $%
                @%. = $% - $%%
                end virtual
                virtual at addr
        end macro

        macro section? address
                local addr
                addr = address
                $$%. = $%%
                @%. = 0
                end virtual
                virtual at addr
        end macro

        macro postpone?!
            esc macro Namespace.postponed
        end macro

        macro end?.postpone?!
                Namespace.postponed
            esc end macro
        end macro

        namespace Namespace

        macro postponed
        end macro

        macro end?.encapsulate?

                postponed
                end namespace

                purge org?,section?,postpone?,end?.postpone?
                restore $%,$%%

                repeat 1, Length:($$% + $ - $$)
                        display `Namespace,': ',`Length,' bytes.',13,10
                end repeat

                end virtual
        end macro
end macro

macro var? names&
        iterate name, names

                define name
                restore name

                macro name
                end macro
                purge name

                struc name
                end struc
                restruc name

        end iterate
end macro



encapsulate Win32_Sample

        var dd?,dq?

        include 'win32.asm'

end encapsulate


encapsulate Win64_Sample

        var dd?,dq?

        include 'win64.asm'

end encapsulate    
In the above example the combined set of macros allows both the win32.asm and win64.asm programs to be assembled within a single source. All the generated bytes are placed in virtual blocks and not written into the actual output, so the additional DISPLAY instruction is added there to prove that the programs really got assembled.

For general use, we could hide "var" inside the "encapsulate" macro and - just in case - declare every single one of fasmg's instructions as variable. But even then these encapsulation macros are still not perfect. For example, if an encapsulated module placed a POSTPONE block inside another nested namespace, our macro would define "postponed" in the wrong namespace and this block would then never get executed. There is a simple method to deal with this risk, but this is a topic for another time.


Last edited by Tomasz Grysztar on 02 Oct 2018, 10:34; edited 2 times in total

Grom PE 01 Nov 2016, 13:19
Trying to use the encapsulate macro with fasm g.hld82, it fails running out of memory.

Tomasz Grysztar 01 Nov 2016, 14:36
Grom PE wrote:
Trying to use the encapsulate macro with fasm g.hld82, it fails running out of memory.
The above sample assembles fine with hld82. Perhaps you have some additional recursion that you need to correct with "var"? Please try setting some small value for "-r" switch in command line (like -r100) to detect any infinite recursion before you run out of memory.

Grom PE 01 Nov 2016, 14:50
Just copied the last chunk of code from the first post in fasmg/examples/x86/encapsulate.asm and running "fasmg encapsulate.asm encapsulate.bin".

If I do "fasmg encapsulate.asm encapsulate.bin -r 20", it gives:
Code:
flat assembler  version g.hld82
Win32_Sample: 3072 bytes.
Win64_Sample: 2048 bytes.

encapsulate.asm [80] win32.asm [4] macro format [71] V:\fasmg\examples\x86\include/p6.inc [2] V:\fasmg\examples\x86\include/p5.inc [12] V:\fasmg\examples\x86\include/80486.inc [2] V:\fasmg\examples\x86\include/80386.inc [4]:
        element x86.reg
macro element [6] macro element [6] macro element [6] macro element [6] macro element [6] macro element [6] macro element [6] macro element [6] macro element [6] macro element [6] macro element [6] macro element [6] macro element [6]:
        element definition
Processed: element x86.reg
Error: exceeded the maximum allowed depth of stack.    

Tomasz Grysztar 01 Nov 2016, 15:34
As you see, the recursion is caused by the "element" re-definition, so you need to add "element" to the "var" line. The samples I used for testing did use 80386.inc instead of p5.inc, that's why I did not notice this.
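For the sample from the first post, that change would amount to something like the sketch below. It assumes the "encapsulate" and "var" macros defined there, and uses the case-insensitive form "element?" to match how dd? and dq? are listed.

```fasmg
encapsulate Win32_Sample

        var dd?,dq?,element?    ; element? added because p5.inc re-defines it

        include 'win32.asm'

end encapsulate
```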

The point of this entire article was to explain these potential problems and ways to handle them. If you skip most of the content and just try to copy the macros from the examples, you may easily become confused by the problems you encounter.

Grom PE 01 Nov 2016, 15:57
The example should work though, shouldn't it? Otherwise it's hard to learn just by the theory.

Added "element" to "var", still stumped:
Code:
flat assembler  version g.hld82
Win32_Sample: 3072 bytes.
Win64_Sample: 2048 bytes.

encapsulate.asm [84] win32.asm [7]:
        section '.text' code readable executable
macro section [3] macro section [10]:
        DATA_END = $-($%-$%%)
Processed: DATA_END = $-($%-$%%)
Error: variable term used where not expected.    

Tomasz Grysztar 01 Nov 2016, 16:05
Grom PE wrote:
The example should work though, shouldn't it? Otherwise it's hard to learn just by the theory.
The example did work when I published it, but then the examples in the fasmg package that it referenced got changed. I need to update this to work with the examples from current packages, please wait.

Grom PE 01 Nov 2016, 16:17
Unfortunately, at the time when it mattered the most, updating, I've managed to overwrite the fasmg package instead of renaming! Will be more careful next time.

Tomasz Grysztar 01 Nov 2016, 17:39
The other problem you uncovered is actually a bug in fasmg. I'm going to upload new, corrected version, and I'm modifying p5.inc so that it no longer redefines the case-insensitive "element", so the examples from this thread are again going to work without changes with the win32.asm from the fasmg package.

Tomasz Grysztar 01 Nov 2016, 19:41
I have uploaded the new version of fasmg with corrections. The examples in this text do not need changes.

Grom PE 02 Nov 2016, 02:54
Thanks, with fasmg.hll54 now it works! It looks like adding "element?" to "var" is still desirable as otherwise there's a need to specify lower recursion limit for a quick assembly.

Tomasz Grysztar 02 Nov 2016, 07:05
Grom PE wrote:
It looks like adding "element?" to "var" is still desirable as otherwise there's a need to specify lower recursion limit for a quick assembly.
If you update p5.inc to the one that comes with current package, it no longer redefines case-insensitive "element" and this is not an issue then.

Grom PE 02 Nov 2016, 08:08
Oops, my bad! Sorry.

Tomasz Grysztar 26 Nov 2017, 21:39
More things changed in the packaged examples in the meantime, so the above sample requires some further changes. On the other hand fasmg also has more features, like the ability to store external files. I have further refined the encapsulation example so that it not only assembles two separate programs from within one source, but also stores them in two output files:
Code:
macro encapsulate? Namespace,output

        local blocks

        virtual at 0
        $$% = 0 
        @% = 0 
        define $% ($$% + $ - $$) 
        define $%% ($$% + $@ - $$ - 1/($@-$$+1)*@%) 

        macro org? address
                local b
                b:: define blocks b:0
                local addr
                addr = address 
                $$%. = $% 
                @%. = $% - $%%
                end virtual
                virtual at addr 
        end macro 

        macro section? address
                local b
                b:: define blocks b:1
                local addr
                addr = address 
                $$%. = $%% 
                @%. = 0
                end virtual 
                virtual at addr 
        end macro 

        macro postpone?! arg
            esc macro Namespace.postponed 
        end macro 

        macro end?.postpone?! 
                Namespace.postponed 
            esc end macro 
        end macro 

        namespace Namespace 

        macro postponed 
        end macro

        local format_extension

        macro format? choice
                match =binary? =as? ext, choice
                        format_extension := ext
                end match
        end macro

        macro end?.encapsulate?

                postponed 
                end namespace 

                local b
                b:: define blocks b:0

                purge org?,section?,postpone?,end?.postpone? 
                restore $%,$%% 

                local extension

                match out, output
                        extension = out
                else
                        if defined format_extension
                                extension = format_extension
                        end if
                end match
                repeat 1, Length:($$% + $ - $$)
                        if defined extension
                                display `Namespace,': ',`Length,' bytes stored as ',extension,'.',13,10
                        else
                                display `Namespace,': ',`Length,' bytes.',13,10
                        end if
                end repeat 

                end virtual

                if defined extension
                        local data,reserve
                        virtual as extension
                        irpv block, blocks
                                match area:cutoff, block
                                        virtual area
                                                load data:$@-$$ from $$
                                                reserve = $-$@
                                        end virtual
                                        db data
                                        if ~ cutoff
                                                rb reserve
                                        end if
                                end match
                        end irpv
                        end virtual
                end if

        end macro 
end macro 

macro var? names& 
        iterate name, names 

                define name 
                restore name 

                macro name 
                end macro 
                purge name 

                struc name 
                end struc 
                restruc name 

        end iterate 
end macro 



encapsulate Win32_Sample,'exe32'

        var dd?,dq?,store?

        include 'win32.asm' 

end encapsulate 


encapsulate Win64_Sample,'exe64'

        var dd?,dq?,store?

        include 'win64.asm' 

end encapsulate    
The "postpone ?" variant, which is another feature of fasmg added after this tutorial was written, is not handled well by these macros, though for the purpose of this example it works OK. In general it may require better handling. In fact, the IRPV that copies areas into output files could itself be moved into a "postpone ?" block.