flat assembler
Message board for the users of flat assembler.

Tricky stuff in fasmg, part 1: ORG inside VIRTUAL

Tomasz Grysztar 03 Oct 2016, 18:44
This is the first part of a series where I plan to talk about some unusual and complex trickery that can be done with fasmg. I'm going to deal with problems that I consider too eccentric or difficult to mention in my introduction to fasmg, so these texts are going to require an "advanced" level of knowledge about fasmg - please keep the manual at hand.

The other installments: part 2, part 3, part 4.
_______

We're going to start with a problem that appears simple on the surface but is not. In fasmg the ORG and SECTION directives both begin a new section of the output file (they differ only in how they treat the uninitialized data that came before), and to accentuate their function they are not allowed inside VIRTUAL blocks. Of course, like almost everything in fasmg, this can be adjusted with macros, so let's try it.

If what is needed is simply to change the origin address for a portion of a virtual block, this classic behavior is easy to emulate with a set of macros that look like this:
Code:
macro virtual? definition
        virtual definition
        macro org? address
                local addr
                addr = address
                end virtual
                virtual at addr
        end macro
end macro

macro end?.virtual?
        purge org?
        end virtual
end macro    
This alters every virtual block so that inside it ORG is a macro that switches to a new virtual addressing space. The local "addr" variable is used to compute the value of the address before closing the virtual block, in case the specified address is an expression that uses values like $ or $$.
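As a quick check of these macros, a hypothetical fragment like the one below (the label names are mine, not part of the original example) could be placed after them in the same source. It is only a sketch and assumes nothing beyond the three macros above. The ORG argument uses $$ to show why the address is captured in a local variable before the block is closed:
Code:
virtual at 100h
        first:
                db 1, 2, 3
        org $$ + 1000h          ; evaluated while $$ is still 100h
        second:
                db 4
end virtual

assert first = 100h             ; label from the first addressing space
assert second = 1100h           ; label from the space opened by the ORG macro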

But there is also a different scenario, where the specific function of ORG is more apparent and this simple emulation does not help. Let's say that we need to take the entire source of a program, like the basic Win32 example from the fasmg package, and put it inside a VIRTUAL block, so that we can read the values of symbols defined there without generating actual output (perhaps because we want to output some other data instead).

Now this poses a complex problem, because the PE formatting macros not only use ORG and SECTION, but also the values of $% and $%% - the offsets within generated output file. Of course PE format macros could themselves be modified to use only relative offsets (like $-$$), but for the exercise let's say that we want to assemble a foreign source without changing it. We need to provide the correctly computed values of $% and $%% then. The following sample prepares what is necessary to assemble the PE macros inside a VIRTUAL block:
Code:
virtual at 0

$$% = 0
define $% ($$% + $ - $$)
define $%% ($$% + $@ - $$)

macro org? address
        local addr
        addr = address
        $$%. = $%
        end virtual
        virtual at addr
end macro

macro section? address
        local addr
        addr = address
        $$%. = $%%
        end virtual
        virtual at addr
end macro

postpone
        end virtual
end postpone

include 'win32.asm'    
A variable named $$% is used to hold the offset within the virtualized output file. Note that the definitions inside the ORG and SECTION macros use an identifier with an attached dot to change this value - this ensures that the global symbol is modified even if the macros are used inside a descendant namespace. The END VIRTUAL is postponed, because the PE formatting macros themselves use a postponed block to finish the executable. We can go one step further and intercept the postponed blocks in the foreign source, to launch them in a controlled fashion:
Code:
macro postpone?!
    esc macro postponed
end macro

macro end?.postpone?!
        postponed
    esc end macro
end macro

macro postponed
end macro

$$% = 0
define $% ($$% + $ - $$)
define $%% ($$% + $@ - $$)

macro org? address
        local addr
        addr = address
        $$%. = $%
        end virtual
        virtual at addr
end macro

macro section? address
        local addr
        addr = address
        $$%. = $%%
        end virtual
        virtual at addr
end macro

virtual at 0

        include 'win32.asm'

        postponed

end virtual    
However, even though those macros work correctly with the PE example, their method of computing $%% is not correct in general. With these definitions the difference between $% and $%% is always the same as between $ and $@, which is the size of uninitialized data at the end of the current addressing space. But if there are multiple consecutive areas containing only uninitialized data, the computation of $%% should take them all into consideration. The following modification deals with this problem, though it does so in a tricky way:
Code:
$$% = 0
@% = 0
define $% ($$% + $ - $$)
define $%% ($$% + $@ - $$ - 1/($@-$$+1)*@%)

macro org? address
        local addr
        addr = address
        $$%. = $%
        @%. = $% - $%%
        end virtual
        virtual at addr
end macro

macro section? address
        local addr
        addr = address
        $$%. = $%%
        @%. = 0
        end virtual
        virtual at addr
end macro

macro postpone?!
    esc macro postponed
end macro

macro end?.postpone?!
        postponed
    esc end macro
end macro

macro postponed
end macro

virtual at 0

        include 'win32.asm'

        postponed

end virtual

purge postpone?,end?.postpone?,org?,section?
restore $%,$%%    
The additional @% variable holds the size of uninitialized data in previous spaces (I used the cryptic names for these variables to keep the expressions short, but longer names would work just as well). The trick is that we need to subtract this value from $%% only when there is no initialized data in current space, and this is true when $@ is equal to $$. The expression 1/($@-$$+1) is a contraption that generates 1 only when $@-$$ is zero, and 0 otherwise.
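To see the correction at work, consider a hypothetical layout (my own, not taken from the PE example) assembled in place of the include line in the listing above. This is only a sketch: the expected values are the offsets that a real output file would give, and it assumes that $@ equals $$ in an area that contains only reserved data:
Code:
        assert 1/(0+1) = 1      ; the selector is 1 when $@-$$ is zero
        assert 1/(6+1) = 0      ; and 0 for any positive size

        db 1                    ; 1 initialized byte
        rb 10                   ; 10 reserved bytes
        org 2000h               ; the macro stores $$% = 11 and @% = 10
        rb 5                    ; this area contains reserved data only
        assert $% = 16          ; 1 + 10 + 5
        assert $%% = 1          ; 11 + 0 - 1*10
        db 2                    ; initialized data, $@-$$ is now 6
        assert $%% = 17         ; 11 + 6 - 0*10
With the previous definitions (without @%) the second assertion would see $%% equal to 11, because the ten reserved bytes from the first area would be counted as if they were already part of the initialized output.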

The last variant handles everything needed to hide from the foreign source text the fact that it is being assembled inside a virtual block, and it even cleans up after itself with PURGE and RESTORE, so any source text that follows can use ORG and the file offsets in their original meaning. The only problem that may remain is symbol name clashes, and this will be covered in the next part.


Last edited by Tomasz Grysztar on 11 May 2017, 13:25; edited 1 time in total
Grom PE 02 Nov 2016, 08:19
Tomasz Grysztar wrote:
Let's say that we need to take an entire source of program, like the basic Win32 example from fasmg package, and put it inside a VIRTUAL block, so that we can read the values of symbols defined there without generating actual output (perhaps because we want to output some other data instead)


The only way to read the whole output within the program is to do it in multiple chunks, one per each virtual block, correct?
Since the "load" instruction cannot read across an "org", "section" or "virtual" boundary yet the offsets can be changed only by those.
It seems faking $ and $$ won't be enough since the address of labels won't be adjusted this way.
Tomasz Grysztar 02 Nov 2016, 08:31
Grom PE wrote:
The only way to read the whole output within the program is to do it in multiple chunks, one per each virtual block, correct?
Since the "load" instruction cannot read across an "org", "section" or "virtual" boundary yet the offsets can be changed only by those.
Yes, this is the only way. For example, pe.inc has to deal with this problem when computing the checksum - it collects the labels of all the separate areas into the CheckSumBlocks variable and later iterates through all of them.
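A minimal sketch of that kind of chunked reading, which is not the actual pe.inc code: each area gets its own area label (defined with a double colon) and every byte is read with LOAD relative to that label. The names block1, block2, size, value and sum are hypothetical:
Code:
virtual at 0
        block1::
                db 1, 2, 3
        block1.size = $ - $$
end virtual

virtual at 0
        block2::
                db 4, 5
        block2.size = $ - $$
end virtual

sum = 0
iterate block, block1, block2
        repeat block.size, i:0
                load value:byte from block:i    ; read one byte of this area
                sum = sum + value
        end repeat
end iterate

assert sum = 15                 ; 1 + 2 + 3 + 4 + 5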
Grom PE wrote:
It seems faking $ and $$ won't be enough since the address of labels won't be adjusted this way.
Well, you could try to catch label definitions with "struc ?" and emulate them appropriately. I'm not sure if that would be practical, though.

Copyright © 1999-2023, Tomasz Grysztar.