flat assembler
Message board for the users of flat assembler.

Multiple section fragments

alexfru
Joined: 23 Mar 2014
Posts: 76
I'm exploring the option of supporting FASM (in addition to NASM) as the assembler for my Smaller C compiler.

Smaller C generates assembly code like this:

Code:
; ...
; glb Fopen : (
; prm     filename : * char
; prm     mode : * char
;     ) * struct __stream
section ".text" executable
        public  _Fopen
_Fopen:
; some code

section ".rodata"
L16:
        db      "Can't open/create file ",34,"%s",34,10
        times   1 db 0

section ".text" executable
; _Fopen's code continues here
; ...
    


The produced ELF object file ends up having multiple .text sections, multiple .data sections and multiple .bss sections (and their relocation sections).

The problem with multiple sections under the same name is that each of them has to be aligned. That means the code or the data in some of these sections will be padded with NOPs/INT3s (I choose INT3s in my linker) or db 0's, and when that happens in the middle of a subroutine or an object, that subroutine or object becomes broken.
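To make the padding problem concrete, here is a minimal C sketch of how a linker might place same-named fragments when each one has to start on an aligned boundary. The names `align_up` and `place_fragments` are hypothetical and are not taken from Smaller C's linker or from FASM; the sketch only illustrates the arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of align (align must be a power of two). */
size_t align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: place `count` fragments one after another, each
 * starting on an `align`-byte boundary. Stores every start offset in
 * starts[] and returns the number of padding bytes that had to be
 * inserted between fragments.
 */
size_t place_fragments(const size_t *sizes, size_t count,
                       size_t align, size_t *starts)
{
    size_t offset = 0;
    size_t padding = 0;

    for (size_t i = 0; i < count; i++) {
        size_t start = align_up(offset, align);
        padding += start - offset;   /* filler bytes before this fragment */
        starts[i] = start;
        offset = start + sizes[i];
    }
    return padding;
}
```

With two fragments of 5 and 3 bytes and 4-byte section alignment, the fragments start at offsets 0 and 8, so 3 filler bytes land between them. If those two fragments are halves of one subroutine, the filler ends up inside the subroutine.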

One specific case is even worse.
Code:
; glb OutName : * char
section ".data" writable
        align 4
        public  _OutName
_OutName:
; =

section ".rodata"
L1:
        db      "floppy.img"
        times   1 db 0

section ".data" writable
; RPN'ized expression: "L1 "
; Expanded expression: "L1 "
        dd      L1
    

Here I get two .data sections and the first one ends up having zero size and NOBITS attribute in the object file, while the second one is PROGBITS. The section attribute/flag inconsistency checks go off in my linker as a result.

In contrast, NASM, GNU as and other assemblers combine fragments of sections together into one section if the section name matches in all of them and there's no unexpected alignment/padding somewhere inside the combined section.
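The combining behaviour described above can be sketched in C as follows. This is only an illustration of the effect, not how NASM or GNU as is actually implemented; `struct fragment`, `combine_fragments` and `combined_size` are hypothetical names introduced for the example.

```c
#include <stddef.h>
#include <string.h>

/* One section fragment as it appears in the assembly source (hypothetical type). */
struct fragment {
    const char *name;   /* section name, e.g. ".text" */
    size_t size;        /* bytes emitted in this fragment */
    size_t offset;      /* output: offset inside the combined section */
};

/*
 * Give every fragment an offset inside the combined section of the same
 * name, so that fragments are concatenated in source order with no padding
 * between them. Quadratic in the fragment count, which is fine for a sketch.
 */
void combine_fragments(struct fragment *frags, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        size_t offset = 0;
        for (size_t j = 0; j < i; j++) {
            if (strcmp(frags[j].name, frags[i].name) == 0)
                offset += frags[j].size;
        }
        frags[i].offset = offset;
    }
}

/* Total size of the combined section with the given name. */
size_t combined_size(const struct fragment *frags, size_t count, const char *name)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++) {
        if (strcmp(frags[i].name, name) == 0)
            total += frags[i].size;
    }
    return total;
}
```

For a source that emits 10 bytes of .text, then some .rodata, then 6 more bytes of .text, the second .text fragment gets offset 10 and the combined .text section is 16 bytes long, with nothing inserted in between.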

Can this be fixed in FASM?
Post 21 Jun 2015, 21:49
l_inc
Joined: 23 Oct 2009
Posts: 881
alexfru
Quote:
Can this be fixed in FASM?

As long as the Smaller C compiler is fixed to not produce torn sections...

Fasm is not supposed to collect equally named sections together: it would just be contradictory to its very name. "Flat" means in particular that whatever code or data you define it will have exactly the same layout in the binary as it has in the source code.

Another option is to make Smaller C include some macros into the produced source. E.g. you can redefine the section directive with a macro that would collect whole sections into macro bodies. In either case you need to adjust the Smaller C compiler accordingly.

_________________
Faith is a superposition of knowledge and fallacy
Post 21 Jun 2015, 22:55
alexfru
l_inc wrote:
alexfru
Quote:
Can this be fixed in FASM?

As long as the Smaller C compiler is fixed to not produce torn sections...


Ain't gonna happen any time soon. You see, it's a single pass compiler with minimum buffering, which is what makes it very small and simple. And when things start overlapping, e.g. like here:
Code:
char* apc[] =
{
  "abc", "rst", "xyz"
};
    

there isn't much to be done other than emit them overlapping for the assembler to combine properly. The character data goes into .rodata and the pointers (apc[0], apc[1], apc[2]) go into .data. The assembler is usually capable of making several passes over the input in order to resolve forward references or use shortest branch instructions. I see FASM proudly reports 21 passes on one of my files. At first glance, it looks like it could do more passes to combine section fragments.

l_inc wrote:
Fasm is not supposed to collect equally named sections together: it would just be contradictory to its very name.

Well, then probably Smaller C and FASM are both too primitive to work together.

l_inc wrote:
"Flat" means in particular that whatever code or data you define it will have exactly the same layout in the binary as it has in the source code.

But we aren't talking about a flat binary. We're talking about an ELF intermediate object file, which FASM claims to support, and to a certain degree that's true. I'm not exactly sure whether inconsistent section flags/attributes or multiple sections of the same name are allowed within the same ELF object file.

l_inc wrote:
Another option is to make Smaller C include some macros into the produced source. E.g. you can redefine the section directive with a macro that would collect whole sections into macro bodies.

You mean like start with an empty macro and then change what is normally "section ".text" executable" into some kind of other macro that would append its argument (the rest of the code that's supposed to be in the .text section) to the first macro and then expand it at the end?

And then there's "inline assembly" support of the form asm("text"), which outputs the text as-is and that text too could contain section directives... It gets complicated.

l_inc wrote:
In either case you need to adjust the Smaller C compiler accordingly.

Yeah, that's what I'm playing with. I changed the syntax to suit FASM and then ran into this issue with sections. I might be able to make some adjustments in the linker to allow for the unconventional ELF files produced by FASM. E.g. I could just ignore the alignment reported in the ELF file and concatenate all the fragments. Most things will work, as the CPU isn't very picky about the alignment of data, but some, where the CPU is picky, won't.

I think it could be worked around / fixed if I can specify section alignment explicitly. E.g. when an aligned object starts, I emit a section with an alignment of 4, but when I need to continue that section, I set its alignment to 1. But I can't see such an option (to set section alignment) in the documentation, and by default all sections come out aligned to 4 bytes. Is there such an option?
Post 22 Jun 2015, 01:52
l_inc
alexfru
Quote:
At first glance, it looks like it could do more passes to combine section fragments.

This task can be done in a single pass. Multiple passes are meant to resolve other things.
Quote:
But we aren't talking about a flat binary.

It doesn't matter. The "flatness" property holds for all output formats and can only be visually overridden using the preprocessor capabilities.
Quote:
I'm not exactly sure if inconsistent section flags/attributes or multiple sections of the same name are allowed within the same ELF object file.

Equally named sections are explicitly allowed:
Executable and Linkable Format. Special Sections wrote:
An object file may have more than one section with the same name.

Quote:
You mean like start with an empty macro and [...] and then expand it at the end?

There are different ways to implement this. I'm gonna show you one. Put the following at the very beginning of your compiler output:
Code:
macro section args&
{
    include 'macrosection.inc'
    mSection args
}    


Put a putsections invocation at the very end of your compiler output (postpone is not able to handle this situation). And here is the content of "macrosection.inc":
Code:
; macrosection.inc:
section fix } mSection
realsection fix section
putsections fix } mPutSections
match,
{
    local descriptors
    macro mSection args&
    \{
        match name \rest, args +
        \\{
            \\local sbody
            irpv d,descriptors \\\{ common match name sect \\\rest, d +
            \\\\{
                define sect sbody
                rept 0 \\\\\{
            \\\\} \\\}
            match,
            \\\{
                \\\local wrap,sect,shead
                define sect shead
                define sect sbody
                define wrap sect
                define descriptors name wrap
                shead equ realsection args
            \\\}
            macro sbody \\\{
        \\}
    \}
    macro mPutSections \{ irpv d,descriptors \\{ match name sect, d
    \\\{
        irpv s,sect \\\\{ s \\\\}
    \\\} \\} \}
}    


This creates a list of section descriptors. Each descriptor consists of a tuple: section name and a list of its corresponding section bodies. putsections at the end is supposed to finalize the last macroblock and to invoke a section instantiation macro. Section properties are not checked for consistency, but it's not much of a problem to add the checks either. Note that this way you are not allowed to explicitly define macroblocks inside sections, which is normally not a good style anyway.

Quote:
And then there's "inline assembly" support of the form asm("text"), which outputs the text as-is and that text too could contain section directives...

This is processed by the Smaller C compiler anyway. So it doesn't matter what features you allow for your compiler as long as the intermediate output is acceptable for fasm.

Quote:
I think, it could be worked around / fixed if I can specify section alignment explicitly [...] But I can't see such an option (to set section alignment) in the documentation and by default all sections come out aligned to 4 bytes.

You are overcomplicating this, IMHO. But fasm surely does allow you to specify section alignment:
2.4.4. Executable and Linkable Format wrote:
optionally also align operator followed by the number specifying the alignment of section (it has to be the power of two), if no alignment is specified, the default value is used

Post 22 Jun 2015, 13:00
revolution
Joined: 24 Aug 2004
Posts: 17474
alexfru : Perhaps I can point you towards an already existing topic concerning combining multiple parts of things into a single section (or sections) by building lists of each section and using postpone or manually placing the built sections in the order of your choosing.
Post 22 Jun 2015, 15:18
alexfru
OK, I did miss the section alignment option in the ELF format. However, if I use it to align a section to 1 byte, then I can't use an alignment directive to align to 4 bytes inside this section. I get an error about FASM's inability to provide/guarantee such internal alignment.

As for the macros, did I get it right that these are extremely expensive memory-wise? E.g. if I have an asm file of 50K lines, will it cause FASM's macros to eat many megabytes of RAM?
Post 02 Jul 2015, 05:33
l_inc
alexfru
Quote:
However, if I use it to align a section to 1 byte, then I can't use an alignment directive to align to 4 bytes inside this section.

That is correct. And letting the linker combine sections inside of a single object file is not a proper solution for your problem anyway.
Quote:
As for the macros, did I get it right that these are extremely expensive memory-wise?

They obviously do consume memory, but not as much as you think. The consumption grows linearly with the number of lines, which is the same as if you don't use any macros at all.

I must admit that if you don't have to use the standard section declaration syntax, you'd probably do better to stick to the list-building macros suggested by revolution: the proper solutions are at the end of the linked topic. In this case the grouping key for combining section bodies would be the name of a unique symbolic constant, and the grouping would be implicitly resolved by fasm mechanisms. In the macros I provided, the grouping is made explicitly with a match directive using actual section names as keys, which might involve a bit of additional overhead.

Post 02 Jul 2015, 12:11