flat assembler
Message board for the users of flat assembler.

[FASM1] Empty file with virtual as and org

DimonSoft



Joined: 03 Mar 2010
Posts: 1228
Location: Belarus
DimonSoft 17 Nov 2019, 17:21
Code:
virtual at 0 as 'tmp'
  org 100h
  db 42
end virtual    

This results in two empty files: xxxx.bin and xxxx.tmp. Is this behaviour intended? If so, what workarounds are available?
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8356
Location: Kraków, Poland
Tomasz Grysztar 17 Nov 2019, 17:59
By definition, ORG opens a new addressing space (and therefore closes the previous one). In this case it is equivalent to doing END VIRTUAL and then VIRTUAL AT 100h.

Because of such semantic problems I changed this in fasmg, where ORG is not allowed inside VIRTUAL. In fasm 1 this option remains only for backward compatibility.
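
As a possible workaround in fasm 1 (a sketch only, assuming the ORG was there just to set the base address of the block), the base can be given in the AT clause, so that all the data stays in the single addressing space that AS saves:
Code:
; the base address comes from AT instead of ORG,
; so no second addressing space is opened inside the block
virtual at 100h as 'tmp'
  db 42
end virtual
With this variant the byte should end up in the .tmp file. The main output is still empty, because nothing is generated outside the block.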
DimonSoft 17 Nov 2019, 18:38
So, my guess about the reason was right. Now let me argue that such behaviour is not quite consistent. This code
Code:
org 100h
end virtual    

doesn’t compile for very clear reasons: org is documented to start a new addressing space but is NOT equivalent to virtual. For me, logically, unlike virtual, which doesn’t let the data pass through to the main output file, org just changes the starting value of the internal address counter. So, I’d say not allowing org inside a virtual block is not really a solution to the semantic problem.

Are there any reasons why making org a bit more limited would hurt? My suggestion is as follows:
* virtual makes an addressing space (and allows nesting)
* org just sets current address counter to a particular value

Naming addressing spaces doesn’t seem to work with org, so it would not be affected. The only real problem I can see is the meaning of the $$ symbol. While I doubt there’s much code out there requiring $$ to point right after the nearest org directive, this seems to be the only concern from a semantic point of view, and it can safely be left as it is by just wording it differently: “$$ is always equal to the last explicitly set address, by either org directive or a new virtual block.” I don’t know about actual implementation problems though.
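
For reference, the present meaning of $$ in fasm 1 can be sketched like this (the symbol names are invented for the illustration):
Code:
org 100h
first_base = $$        ; 100h, base of the addressing space opened by org
db 1, 2, 3
first_size = $ - $$    ; 3, bytes generated so far in this space

virtual at 2000h
  inner_base = $$      ; 2000h, base of the virtual block
end virtual

after_base = $$        ; 100h again, the outer space is restored after end virtual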

Such definition would extend FASM capabilities to produce multiple output files. Say, compiling two or more files from the same source: debug, release, with different ISA requirements (FPU, SSE, etc.).
Code:
virtual at 0 as 'Debug.exe'
...

virtual at 0 as 'Release.exe'
...

virtual at 0 as 'NoSSE.exe'
...    

Not sure about EXEs though (might have problems with format directive). May still be useful for osdev purposes: compiling a separate blob directly into the current project instead of creating intermediate binaries just to use file directive.


Last edited by DimonSoft on 17 Nov 2019, 18:45; edited 1 time in total
Tomasz Grysztar 17 Nov 2019, 18:45
You cannot change the current address counter without starting a new addressing space; this is the very nature of what an addressing space is in fasm.

This attribute of addressing spaces is especially important for directives like LOAD and STORE - they need addresses to be unambiguous. This is why you can never LOAD/STORE past the addressing space boundaries, and every directive like ORG or VIRTUAL creates such an impassable boundary.
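
A minimal fasm 1 sketch of such a boundary (the labels x and y are only for illustration; the rejected line is left commented out so that the snippet assembles):
Code:
org 200h
x dd 12345678h
load a byte from x       ; fine: 200h lies inside the current addressing space

org 300h
y db 0
; load b byte from x     ; would be rejected: 200h is outside the space that starts at 300h
load b byte from y       ; fine again: y is inside the current space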

As for producing multiple output files, fasmg demonstrates how it can be designed in a clean way - with the RESTARTOUT directive.
DimonSoft 17 Nov 2019, 18:54
Tomasz Grysztar wrote:
You cannot change current address counter without starting a new addressing space, this is the very nature of what the addressing space is in fasm.

This attribute of addressing spaces is especially important for directives like LOAD and STORE - they need addresses to be unambiguous. This is why you can never LOAD/STORE past the addressing space boundaries, and every directive like ORG or VIRTUAL makes such impassable boundary.

So, is virtual just a “save current state, temporarily change the base, then restore”?

Judging from this piece of code
Code:
org 200h
x dd $12345678

org 201h
dw $ABCD

load a byte from 202h
db a    
producing
Code:
78 56 34 12 CD AB AB    
data definition directives just add data to the output, no overlapping is detected and virtual blocks just seem to work like output buffering. So the suggestion to allow org inside virtual doesn’t seem to break anything.

Note the addition I made to my original post before seeing your response: there might be valid use cases for allowing org inside virtual blocks.
Tomasz Grysztar 17 Nov 2019, 19:04
DimonSoft wrote:
So the suggestion to allow org inside virtual doesn’t seem to break anything.
Yes, it does not break anything - for this reason it was always allowed in fasm 1. But it starts a new addressing space. Because ORG always starts a new addressing space - this is how it is defined in the manual.

Note that in your sample you cannot change your LOAD to get a byte from address 200h. You need to label the addressing space to be able to do that.
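
A sketch of what labelling the addressing space could look like for that sample (the name first is hypothetical; the double colon defines an addressing space label):
Code:
org 200h
first::                        ; labels the addressing space opened by org 200h
x dd $12345678

org 201h
dw $ABCD

load a byte from first:200h    ; reads from the labelled space, not the current one
db a
The byte at 200h in the first space is 78h, the low byte of x.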
Tomasz Grysztar 17 Nov 2019, 19:18
BTW, if you're interested in how it is possible to assemble multiple executable files with fasmg, here's a quick demonstration:
Code:
struc encapsulate? file*, outext:'bin'
        namespace .

                catch_postpone

                include file

                execute_postponed

                load OUTPUT:$%% from :0
                virtual as outext
                        db OUTPUT
                end virtual
                restartout 0

        end namespace
end struc

macro catch_postpone?

        macro execute_postponed?
                purge postpone?,end?.postpone?
        end macro

        macro postpone?
                esc macro execute_postponed?
        end macro

        macro end?.postpone?!
                        execute_postponed
                esc end macro
        end macro

end macro

User encapsulate 'library_user.asm', 'EXE'
Library encapsulate 'library.asm', 'DLL'    
This generates two PE files with sources from my file formats tutorial.

To get it working with the standard PE formatting macros, POSTPONE emulation would need to be further extended to handle "POSTPONE ?" variant properly (it should be executed after all the regular postponed blocks).
DimonSoft 17 Nov 2019, 19:22
I see. So, we can define an addressing space as a range of addresses, right? I can formulate my suggestion another way.

Why not allow data from org addressing space to pass through to the outer virtual block (if any) or to the main file output (if no virtual blocks are involved)?

Wouldn’t separating these concerns (address calculation and data generation) make more sense than just losing the data? I’d say using virtual shows a clear intention to not let data pass through, while in the case of org that intention is not there. After all, when people write their bootloaders they might use org just to change the base address (at least I saw such examples), and the addressing space concept is just something that makes this possible.
Tomasz Grysztar 17 Nov 2019, 19:28
DimonSoft wrote:
Wouldn’t separating these concerns—address calculation and data generation—make more sense than just losing the data?
The data is not lost. It is just that you started a new addressing space and the AS feature only saves the contents of a single addressing space (this is actually a limitation of fasm 1 engine, even if I wanted to define it differently).

What I just realized, though, is that I never really documented the AS feature in fasm's manual. Turns out I forgot about it and I only have it properly documented for fasmg.
DimonSoft 17 Nov 2019, 19:35
Tomasz Grysztar wrote:
It is just that you started a new addressing space and the AS feature only saves the contents of a single addressing space (this is actually a limitation of fasm 1 engine, even if I wanted to define it differently).

Ah, so that’s the problem. Still would it break something if org addressing space started to pass its data to the outer block as if it was normally generated data (while still being an addressing space)? Or is it just too much work to do?

I guess, doing it manually with a copy loop and load/store directives would have more impact on compiler memory usage. Not to mention the difficulties loading from addressing spaces that cannot be named.
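
Such a manual copy might look roughly like this in fasm 1. This is only a sketch: the names payload and payload_size are made up, and a virtual block is used for the inner code because it can carry an addressing space label.
Code:
; data assembled as if it was located at 100h
virtual at 100h
  payload::
  db 42
  payload_size = $ - $$
end virtual

; copy it byte by byte into the block that is saved as xxxx.tmp
virtual at 0 as 'tmp'
  repeat payload_size
    load b byte from payload:100h+%-1
    db b
  end repeat
end virtual
Every byte passes through a LOAD here, which is where the extra assembly-time cost mentioned above would come from.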
Tomasz Grysztar 17 Nov 2019, 19:44
DimonSoft wrote:
Ah, so that’s the problem. Still would it break something if org addressing space started to pass its data to the outer block as if it was normally generated data (while still being an addressing space)? Or is it just too much work to do?
I'd see this as conceptually wrong, because the data in the outer block would be seen as being at a different address than the one it is supposed to be at.

With fasmg's language definitions this is all a bit cleaner. It defines ORG and SECTION as starting a new area of main output file and this is why you cannot embed them inside a VIRTUAL block (which is an area detached from the main output).

DimonSoft wrote:
I guess, doing it manually with a copy loop and load/store directives would have more impact on compiler memory usage. Not to mention the difficulties loading from addressing spaces that cannot be named.
If you overload ORG with macro, you can give label to every addressing space.
DimonSoft 17 Nov 2019, 20:25
Tomasz Grysztar wrote:
I'd see this as conceptually wrong, because the data in the outer block would be seen as being at a different address than the one it is supposed to be at.

Piece of code that is supposed to run at different addresses than the main program. It becomes a blob after addresses are calculated anyway.

You already have the behaviour of producing a single blob from multiple addressing spaces in plain FASM without virtual blocks; the suggestion is just to extend that behaviour to make it more useful.

Quote:
If you overload ORG with macro, you can give label to every addressing space.

It won’t make
Code:
load a byte from Name:Offset    
possible, just labels with offsets valid in current addressing space.
Tomasz Grysztar 17 Nov 2019, 20:49
DimonSoft wrote:
Tomasz Grysztar wrote:
I'd see this as conceptually wrong, because the data in the outer block would be seen as being at a different address than the one it is supposed to be at.

Piece of code that is supposed to run at different addresses than the main program. It becomes a blob after addresses are calculated anyway.
I mean the behavior with directives like LOAD/STORE. The strict rules concerning addressing spaces make it easier to avoid various tricky corners in the design of the language. In fasmg you can see the direction that I found the most clean.

DimonSoft wrote:
Quote:
If you overload ORG with macro, you can give label to every addressing space.

It won’t make
Code:
load a byte from Name:Offset    
possible, just labels with offsets valid in current addressing space.
Why not? Just do something like this:
Code:
macro org addr {
        close_org
        local area
        define __AREA area
        org addr
        area::
        area#.$$ = $$
}

macro close_org { match previous, __AREA \{ previous\#.$ = $ \} }

org 0

        db 'a'

org 100h

        db 'b'

close_org
irpv area, __AREA
{
        repeat area#.$-area#.$$
                load a byte from area : area#.$$+%-1
                display a
        end repeat
}    
DimonSoft 17 Nov 2019, 21:06
Oops, my test piece of code didn’t work with namespace labels. I guess, this was due to address values I used.

I think I came up with a good example task to discuss. Consider writing a set of macros for generating a FAT12 image that allows one to specify both loader code and disk contents, as well as custom values for the boot sector (BPB). The macro set would obviously contain org 7C00h somewhere.

Now imagine we want to have this single source generate multiple output files. Say, CoolOS.1.44MB.img, CoolOS.1.2MB.img, etc. If you find FAT12 outdated, let’s think about CoolOS.iso as well. The easy way to do that might have been to wrap the disk definition in virtual at 0 as '1.44MB.img', etc. Another possible task is supporting partitioned media image generation with macros: each volume might have its own boot sector, and there’s also an MBR with code and org directive inside.

In its current implementation it seems one has to play around with temporary files and/or multiple projects that basically solve the same task and differ in a few lines of code.
Tomasz Grysztar 17 Nov 2019, 21:10
DimonSoft wrote:
Now imagine we want to have this single source generate multiple output files.
I have already shown you how fasmg makes it possible (note that it also provides a namespace separation). This is the direction in which my language design has naturally evolved. Keep in mind that for several years I considered the design of fasm 2 impossible to make self-consistent. I only started working on fasmg after I finally found a set of design choices that seemed to work for me. And even then it was not free of problems - but once I got it running, I liked it so much that I had more motivation to work around them. ;)
DimonSoft 17 Nov 2019, 21:22
If only fasmg worked as fast as fasm1…
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20357
Location: In your JS exploiting you and your system
revolution 18 Nov 2019, 16:47
DimonSoft wrote:
If only fasmg worked as fast as fasm1…
+1
Tomasz Grysztar 18 Nov 2019, 17:51
Well, everything comes with a price.

Comparing the two would perhaps be less ridiculous if fasmg had instruction encoders implemented natively (but then it would deserve to be simply called fasm 2) instead of having them in the form of complex macros that unroll to hundreds of lines to be interpreted to assemble just a single instruction. But the thing is, even with all that craziness the assembly times still turned out (just barely) bearable enough for my purposes, and I found it very hard to give up this kind of flexibility once I started using it.
