flat assembler
Message board for the users of flat assembler.

paragraph aligning

Hugh Aguilar
Joined: 15 Nov 2011
Posts: 62
Location: Arizona
The subject of paragraph-aligning functions came up on CLAX. I said this:

I think the solution here is that the function will occupy the minimum number of 16-byte paragraphs if:
(function_start mod 16)+(function_size mod 16) < 16
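
One way to sanity-check the condition above is to count paragraphs directly. The sketch below is Python rather than fasm, and the helper names (`paragraphs_touched`, `minimum_paragraphs`, `posted_condition`, `exact_condition`) are hypothetical, introduced only for illustration. Under those assumptions it suggests the condition agrees with a direct count except in two boundary cases: a size that is an exact multiple of 16 at an unaligned start, and a function that ends exactly on a paragraph boundary.

```python
PARAGRAPH = 16

def paragraphs_touched(start, size):
    """Number of 16-byte paragraphs overlapped by bytes [start, start + size)."""
    if size == 0:
        return 0
    first = start // PARAGRAPH
    last = (start + size - 1) // PARAGRAPH
    return last - first + 1

def minimum_paragraphs(size):
    """Fewest paragraphs that can hold `size` bytes (ceiling division)."""
    return -(-size // PARAGRAPH)

def posted_condition(start, size):
    """The condition exactly as written in the post."""
    return (start % PARAGRAPH) + (size % PARAGRAPH) < PARAGRAPH

def exact_condition(start, size):
    """True when the function really occupies the minimum number of paragraphs."""
    return paragraphs_touched(start, size) == minimum_paragraphs(size)

# Every (start, size) pair where the two tests disagree, for small inputs.
mismatches = [(s, n) for s in range(16) for n in range(1, 49)
              if posted_condition(s, n) != exact_condition(s, n)]

# (8, 16): the posted condition holds, yet the function spans 2 paragraphs, not 1.
# (8, 8):  the posted condition fails, yet the function fits in 1 paragraph.
print(len(mismatches), (8, 16) in mismatches, (8, 8) in mismatches)
```

Both cases only arise when function_start is not already aligned, so the condition still looks like a reasonable approximation of the intent.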

Unfortunately, there is no way to write a macro to wrap a function and accomplish this, because there is no way for the macro to know the function_size prior to assembling the function, at which time it is too late to paragraph align the function_start.

This could only be accomplished by the assembler itself. I will ask the FASM and HLA implementers (Tomasz and Randy, respectively) to include this feature in their assemblers. It would be a great benefit because ALWAYS aligning every function_start by 16 wastes memory half of the time --- and wasting memory is a great sin, as it can result in code-cache thrashing --- there is no point in having excess garbage in the cache!

As it stands right now, there is no way to accomplish this except by manually going through the listing to calculate the function_size for every function, and then paragraph aligning only those functions which need it --- a rather tedious exercise! It might be possible to write a program that examines the listing file and modifies the corresponding source file --- I could do it, but I will first try to get the assembler implementers to do it.


Tomasz?
Post 12 Nov 2013, 05:25

revolution
When all else fails, read the source
Joined: 24 Aug 2004
Posts: 17664
Location: In your JS exploiting you and your system
In the general case the problem is unsolvable because when you shift the position of code it can shrink or grow in size causing an optimal solution to not exist.

I had the same problem in the ARM assembler when choosing between 16-bit and 32-bit encodings in Thumb mode. One can end up in an infinite loop trying to find a solution. There would need to be some mechanism to detect such cases and essentially "give up" and move on.
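
The "give up" mechanism described above can be sketched as a bounded fixed-point loop. This is Python rather than any assembler's internals, and everything in it is an assumption made for illustration: the `resolve` helper, the 2-byte and 5-byte jump sizes with a 127-byte reach, and the two toy layouts. In the unstable layout the gap to the target shrinks when the jump grows, so neither encoding is self-consistent.

```python
SHORT, LONG = 2, 5   # assumed sizes of a short and a long jump encoding

def jump_size(distance):
    """Pick the smallest encoding whose reach covers the distance."""
    return SHORT if distance <= 127 else LONG

def stable_pass(jmp):
    # A fixed 200 bytes sit between the jump and its target.
    return jump_size(200)

def unstable_pass(jmp):
    # Hypothetical layout: the filler shrinks once the jump grows.
    filler = 130 if jmp == SHORT else 100
    return jump_size(filler)

def resolve(layout_pass, initial, max_passes=100):
    """Repeat passes until the size stops changing; give up after max_passes."""
    state = initial
    for pass_no in range(1, max_passes + 1):
        new_state = layout_pass(state)
        if new_state == state:
            return state, pass_no      # converged
        state = new_state
    return None, max_passes            # no stable solution found

print(resolve(stable_pass, SHORT))
print(resolve(unstable_pass, SHORT, max_passes=10))
```

The pass limit is the simplest detection mechanism; remembering earlier states and stopping when one repeats would detect the oscillation sooner.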

Also, one needs to realise that fasm does not natively know about procedure/function demarcation points. That is all handled by the macros, so if you do want to code a solution then the place to do it is in the macros.
Post 12 Nov 2013, 05:32

Hugh Aguilar
revolution wrote:
In the general case the problem is unsolvable because when you shift the position of code it can shrink or grow in size causing an optimal solution to not exist.

I had the same problem in the ARM assembler when choosing between 16-bit and 32-bit encodings in Thumb mode. One can end up in an infinite loop trying to find a solution. There would need to be some mechanism to detect such cases and essentially "give up" and move on.

The only case in which this could happen is if two functions did a CALL to each other, which would be mutual recursion. That is pretty rare. Also, CALL only has 16-bit and 32-bit offsets, but no 8-bit offset, so it is pretty rare that there would be two functions right on the border of a 64K distance between each other that would cause this infinite-recursion problem to come up.

If there is any question, then don't bother with the recursive search for optimality, but just go ahead and do the paragraph align --- the worst case is that doing so will waste memory, but this is rarer than rare, and right now we have the worst case every time. It is better to be optimal 99.9999% of the time than optimal 50% of the time --- I'm not so demanding as to require 100%. :)

revolution wrote:
Also, one needs to realise that fasm does not natively know about procedure/function demarcation points. That is all handled by the macros, so if you do want to code a solution then the place to do it is in the macros.

I was assuming that there would be two new words introduced into the assembler syntax, such as FUNC and FEND (function-start and function-end) that would wrap each function. Note that a "function" doesn't necessarily end in RET --- in my Forth system, functions end in NEXT which is a macro, or in a JMP to another function. Right now, it is common to have ALIGN 16 in front of each function --- so the programmer would just replace that with FUNC --- and put a FEND after the function.

I'm trying to get my Forth VM to fit entirely in the 32K code-cache --- so, reducing code size is pretty important to me --- I'm planning on just doing the tedious manual fix-up after the program is finished, which I'm not looking forward to. :(
Post 12 Nov 2013, 07:00

revolution
Hugh Aguilar wrote:
The only case in which this could happen, is if two functions did a CALL to each other, which would be mutual recursion.
No, there are other cases also. But regardless of how many ways there are to get into the situation, when it happens it can be very difficult and frustrating to find it and fix it if it is not handled by the alignment function automatically.

As for adding new directives/opcodes into the fasm core: I don't think that will happen. The use case is small, it can be done by macros, and it is a code management function (not a code generation function), so the design principles of fasm (if I understand them correctly) would preclude adding such directives/opcodes.
Post 12 Nov 2013, 07:13

Hugh Aguilar
Hugh Aguilar wrote:
I'm trying to get my Forth VM to fit entirely in the 32K code-cache --- so, reducing code size is pretty important to me --- I'm planning on just doing the tedious manual fix-up after the program is finished, which I'm not looking forward to. :(

Note that a Forth system is different from most assembly-language programs --- a Forth system has a lot of functions (400ish), most of which are quite short (1 to 3 paragraphs), whereas the typical assembly-language program has relatively few functions which are relatively large. I'm saying "typical" here to describe what I've seen over the years --- although I don't really pay much attention to what other people are doing, so this may not be representative.

Anyway, because I'm writing a Forth system, this feature would help me more than it would help the "typical" FASM user. It should be pretty easy to implement though, so I would appreciate it if it were done. :)
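
For a rough sense of scale, assume the padding needed in front of each function is spread evenly over 0 to 15 bytes (an assumption, not a measurement of any real program). With the figures quoted above, about 400 functions and a 32K code cache, the cost of an unconditional `align 16` can be estimated in a few lines of Python:

```python
FUNCTIONS = 400          # "400ish" functions, as stated in the post
PARAGRAPH = 16           # alignment unit in bytes
CACHE_BYTES = 32 * 1024  # 32K code cache

# Assumption: each padding length 0..15 is equally likely.
average_padding = sum(range(PARAGRAPH)) / PARAGRAPH
total_padding = FUNCTIONS * average_padding
share_of_cache = total_padding / CACHE_BYTES

print(average_padding, total_padding, round(share_of_cache * 100, 1))
```

That works out to about 3000 bytes, roughly 9% of the cache, under the stated assumption.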
Post 12 Nov 2013, 07:18

Hugh Aguilar
revolution wrote:
it can be done by macros

How is it possible to determine the size of a function at compile time, without assembling the code???
Post 12 Nov 2013, 07:20

revolution
fasm is a multi-pass assembler. Function size can be determined at each pass with something like this:
Code:
function_start:
  ;some code
function_size=$-function_start    
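
A minimal sketch of why this works in a multi-pass assembler: a symbol used before its definition takes the value measured in the previous pass, and passes repeat until nothing changes. The Python below is a toy model only. The `assemble` helper, the starting guess of 0, and the pass limit are assumptions for illustration, not a description of fasm's internals.

```python
def assemble(body_size, max_passes=10):
    """Toy model: resolve a forward reference to function_size over passes."""
    predicted = {"function_size": 0}   # assumption: unknown symbols start at 0
    for pass_no in range(1, max_passes + 1):
        defined = {}
        offset = 0
        # A use of function_size *before* its definition sees last pass's value.
        seen_before_definition = predicted["function_size"]
        function_start = offset              # function_start:
        offset += body_size                  #   ;some code
        defined["function_size"] = offset - function_start
        if defined == predicted:
            # Every prediction matched what was measured, so the pass is final.
            return seen_before_definition, pass_no
        predicted = defined
    raise RuntimeError("code cannot be generated")

print(assemble(37))
```

In this model the first pass works with the placeholder value, and the second pass confirms the measured size.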
Post 12 Nov 2013, 07:23

Hugh Aguilar
Hugh Aguilar wrote:
Hugh Aguilar wrote:
I'm trying to get my Forth VM to fit entirely in the 32K code-cache --- so, reducing code size is pretty important to me --- I'm planning on just doing the tedious manual fix-up after the program is finished, which I'm not looking forward to. :(

Note that a Forth system is different from most assembly-language programs --- a Forth system has a lot of functions (400ish), most of which are quite short (1 to 3 paragraphs), whereas the typical assembly-language program has relatively few functions which are relatively large. I'm saying "typical" here to describe what I've seen over the years --- although I don't really pay much attention to what other people are doing, so this may not be representative.

Anyway, because I'm writing a Forth system, this feature would help me more than it would help the "typical" FASM user. It should be pretty easy to implement though, so I would appreciate it if it were done. :)

Not just Forth --- any VM. There are a lot of people around who have invented some language that they are implementing as a VM --- this is pretty much the standard hobby that assembly-language programmers do in their free time. :D
Post 12 Nov 2013, 07:28

Hugh Aguilar
revolution wrote:
fasm is a multi-pass assembler. Function size can be determined at each pass with something like this:
Code:
function_start:
  ;some code
function_size=$-function_start    

If it can be done with a macro, then that is fine with me --- just so long as I don't have to do the manual fix-up myself.

I have no idea how to write such a macro, though. I haven't learned much about the macro language, so maybe it can be done.
Post 12 Nov 2013, 07:33

AsmGuru62
Joined: 28 Jan 2004
Posts: 1419
Location: Toronto, Canada
Cool topic!

Let me add some information from my experience.
I had coded a large project a few years ago and all my functions were
preceded with this line:
Code:
align METHOD_ENTRY
Method1:
        ;
        ; ... some code here ...
        ;
        ret

align METHOD_ENTRY
Method2:
        ;
        ; ... some code here ...
        ;
        ret

; ... and so on in every file ...
    

So, I was able to define METHOD_ENTRY to be some value and then, by
re-assembling the code, see how many bytes I could 'save' by decreasing this value.

I must say that the difference was not as impressive as I had hoped.
The PE file came out at around 60 KB ± ~2 KB, so approximately 58 to 62 KB, which was not
decisive for me: only a few percent gained. I went with 32 bytes as the
final value for METHOD_ENTRY, because that is how a C compiler would do it.
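
The experiment described above can be approximated offline. The sketch below is Python with made-up function sizes; `padded_total` is a hypothetical helper that lays the functions out back to back with `align METHOD_ENTRY` in front of each one and reports the total size.

```python
def padded_total(sizes, alignment):
    """Total bytes used when every function start is aligned to `alignment`."""
    offset = 0
    for size in sizes:
        offset += -offset % alignment   # padding inserted by the align directive
        offset += size                  # the function body itself
    return offset

# Made-up function sizes in bytes, purely for illustration.
sizes = [37, 12, 90, 5, 64, 23]

# Re-run the layout for several candidate values of METHOD_ENTRY.
totals = {alignment: padded_total(sizes, alignment) for alignment in (1, 4, 16, 32)}
for alignment, total in totals.items():
    print(alignment, total)
```

With many short functions the padding is a larger share of the total than it would be in a program made of a few large ones, so the result depends heavily on the size distribution fed in.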
Post 12 Nov 2013, 14:56

Tomasz Grysztar
Joined: 16 Jun 2003
Posts: 7796
Location: Kraków, Poland
Hugh Aguilar wrote:
I think the solution here is that the function will occupy the minimum number of 16-byte paragraphs if:
(function_start mod 16)+(function_size mod 16) < 16
This is quite easy to translate into a macro in fasm:
Code:
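macro FUNC {
        local function_start, function_size

        ; function_size is a forward reference; fasm resolves it over
        ; successive passes. Align only if the function would otherwise
        ; spill into an extra 16-byte paragraph.
        if (($-$$) mod 16)+(function_size mod 16) >= 16
          align 16
        end if

        function_start = $
        ; hand both local names over to the matching FEND
        start@FUNC equ function_start
        size@FUNC equ function_size
}

macro FEND {
        ; size of the function opened by the most recent FUNC
        size@FUNC = $ - start@FUNC
        restore start@FUNC
        restore size@FUNC
}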
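
To see how the macro pair behaves across passes, here is a toy Python model of it. The `layout` helper, the starting guess of 0 for every unknown size, and the sample sizes are assumptions for illustration. The sizes are also fixed, so unlike real code they cannot change when a function moves.

```python
PARAGRAPH = 16

def layout(sizes, max_passes=100):
    """Place functions back to back, aligning only when the rule asks for it."""
    predicted = [0] * len(sizes)          # assumption: unknown sizes start at 0
    for pass_no in range(1, max_passes + 1):
        offset = 0
        starts, measured = [], []
        for size, guess in zip(sizes, predicted):
            # FUNC: the test uses the size predicted from the previous pass.
            if offset % PARAGRAPH + guess % PARAGRAPH >= PARAGRAPH:
                offset += -offset % PARAGRAPH      # align 16
            starts.append(offset)                  # function_start = $
            offset += size                         # the function body
            measured.append(offset - starts[-1])   # FEND: size = $ - start
        if measured == predicted:
            return starts, pass_no                 # all predictions held
        predicted = measured
    raise RuntimeError("code cannot be generated")

starts, passes = layout([10, 5, 9, 20])
print(starts, passes)
```

With these sample sizes only the third function gets aligned, and the layout settles on the second pass.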
Post 12 Nov 2013, 15:14

Hugh Aguilar
Tomasz Grysztar wrote:
This is quite easy to translate into a macro in fasm:


Thanks! Your macro language is a lot more powerful than I had realized. I had expected that to require a modification of the assembler itself.

My background is in Forth, which does a single-pass --- I may not have been grasping what a multi-pass assembler can do.

BTW: I asked the same question on the HLA mailing list, and Randy also said that it was easy and provided a macro --- so apparently all of you guys are way ahead of me.
Post 13 Nov 2013, 04:50

Hugh Aguilar
Hugh Aguilar wrote:
BTW: I asked the same question on the HLA mailing list, and Randy also said that it was easy and provided a macro --- so apparently all of you guys are way ahead of me.

I take that back --- HLA isn't doing it.

Randy says in regard to your macros:
Randy Hyde wrote:
These macros will not work in all cases.
They are a perfect example of a solution that will work for a good number of cases, but are not guaranteed to be optimal. However, they might work just fine for what you need to do. Especially if you're not changing branch displacements or data displacements by moving functions around.

He apparently doesn't want to do it if it won't work 100% of the time --- and I can't blame him for that.
Post 17 Nov 2013, 04:44

Tomasz Grysztar
Indeed, fasm never guarantees to find the optimal solution (as in: the shortest, the smallest) - that is not possible in the general case. What it does guarantee is that the solution (if it finds one at all) will be correct - that means that all the relations in code that are defined (like: "if this function size is X then it must be preceded by Y") have to be fulfilled exactly.

You can find an interesting example of what fasm's code resolving does here: http://board.flatassembler.net/topic.php?t=4703
There is also a more detailed description of the whole process in my "Understanding fasm" article.

It is possible that fasm will fail to find a correct solution at all, and will simply exit with the error message "code cannot be generated" - this is what revolution mentioned.
Post 17 Nov 2013, 10:20

Hugh Aguilar
So far, FASM seems to be the only x86 assembler that can do this with macros. I've been told that NASM can't do it. I don't remember TASM or A86 having this ability, although it was many many years ago when I used those, and I've never used MASM so I can't say in regard to it. I've never seen this in any other assembler, such as for the IBM370 or the many micro-controller assemblers I've used. This is why I expected that it would require an upgrade to the assembler itself, rather than being done in macros.

AFAIK, the only assemblers that support 64-bit x86 are NASM and FASM. I just chose FASM on a whim (because the waitress at the restaurant where I eat is from Poland, and so are you), but FASM seems to be working out pretty well so far. :)

I had previously been of the opinion that a macro assembler was a bad idea. I thought that a better design would be to have a simple assembler without macros, and then have a separate preprocessor to provide the macro language. It is easier to write 2 small programs that do distinct jobs (assembling, and preprocessing) rather than 1 big program that does both. Also, the preprocessor can be made to work with various assemblers for various micro-controllers, which wouldn't be possible if it were integrated into the assembler. Now I have to revise my opinion --- your FUNC and FEND wouldn't be possible in a preprocessor.

BTW: Does anybody here have an opinion on M4? I just recently heard about it. Right now, the most complicated thing that I'm doing with macros is building a linked-list at compile-time. The FASM macros are fine for this (as were HLA's, and NASM looks adequate too although I didn't delve into NASM). If I upgrade to a self-balancing binary tree in the future I would really be stretching your macro language, and I think a hash table would be impossible --- when that time comes, I may want to introduce a preprocessor. By then, my own language should be up and running, so I figured I would write the preprocessor in it, so it could help in generating version 2 of itself. But M4 might be another possibility. This is all in the future though --- right now, I can just use FASM macros for version 1.
Post 21 Nov 2013, 07:17

dogman
Joined: 18 Jul 2013
Posts: 114
Hugh Aguilar wrote:
I had previously been of the opinion that a macro assembler was a bad idea.


A good macro assembler is one of the most expressive and extensible programming languages around. The macro processor has to be integral to the assembler to get the most power. When it has access to the assembler's symbol table and other structures it has full meta-knowledge of the code being assembled and can provide the most power and flexibility possible. No external preprocessor is worth anything compared to a good macro assembler. The two best and possibly earliest examples of this are IBM's MVS assembler and their PL/I both of which have integrated macro languages.

Hugh Aguilar wrote:
I thought that a better design would be to have a simple assembler without macros, and then have a separate preprocessor to provide the macro language. It is easier to write 2 small programs that do distinct jobs (assembling, and preprocessing) rather than 1 big program that does both.


After you write the code, it doesn't make a difference if it was easier. It only makes a difference what's better. Sooner or later people need to learn to stop taking the easy way out and think things through, even though the UNIX-mindset opposes that and so does the lazy-man's mindset.

Hugh Aguilar wrote:
Also, the preprocessor can be made to work with various assemblers for various micro-controllers, which wouldn't be possible if it were integrated into the assembler.


There are always tradeoffs. Stuff that is more specific to the problem to be solved is usually better than something more general. This is just engineering.


Hugh Aguilar wrote:
BTW: Does anybody here have an opinion on M4? I just recently heard about it. Right now, the most complicated thing that I'm doing with macros is building a linked-list at compile-time. The FASM macros are fine for this (as were HLA's, and NASM looks adequate too although I didn't delve into NASM). If I upgrade to a self-balancing binary tree in the future I would really be stretching your macro language, and I think a hash table would be impossible --- when that time comes, I may want to introduce a preprocessor. By then, my own language should be up and running, so I figured I would write the preprocessor in it, so it could help in generating version 2 of itself. But M4 might be another possibility. This is all in the future though --- right now, I can just use FASM macros for version 1.


m4 is an abomination. It is so complicated and ugly it's hard to understand the sick mind(s) that came up with it. There is no excuse for m4. There is always a better solution, even if it means writing your own from scratch. If it weren't for sendmail configuration files m4 could be wiped off the face of the earth and things would be that much the better for it.

As far as I know you would not want to write tree handling in a macro language anyway. You need dynamic storage management and macro languages don't have that. You could (should) certainly write some good helper macros for managing and generating the assembly code though.

_________________
Sources? Ahahaha! We don't need no stinkin' sources!
Post 21 Nov 2013, 10:52

Hugh Aguilar
dogman wrote:
The two best and possibly earliest examples of this are IBM's MVS assembler and their PL/I both of which have integrated macro languages.

I worked for almost 2 years as an IBM370 assembly-language programmer. I agree that it has a great macro language. I was the only person at my workplace who wrote reusable code, as everybody else just did cut-and-paste from old programs. I liked IBM370 assembly-language, but I found the mainframe culture to be rather sodden.

I don't know anything about PL/I, although I have heard other people say that it was a great language. Is it still used anywhere today?

My own language that I'm writing is Forth with the addition of quotations and the deletion of a ton of cruft --- so it will not only be significantly more powerful than ANS-Forth (especially for writing reusable libraries), but it will also be a lot simpler. This is what I'm focusing on right now --- although I'm always interested in learning new languages and new ideas, mostly with an eye toward incorporating those ideas into my language if I think they are worthwhile.

dogman wrote:
m4 is an abomination. It is so complicated and ugly it's hard to understand the sick mind(s) that came up with it. There is no excuse for m4. There is always a better solution, even if it means writing your own from scratch. If it weren't for sendmail configuration files m4 could be wiped off the face of the earth and things would be that much the better for it.

I've heard other people say that it is ugly, although your reaction is stronger than most.

My own predilection is to write programs from scratch myself, as that is often easier than learning how somebody else's program works. Sometimes the available program is so powerful as to be worth learning --- I doubt that is the case here though, as a preprocessor is not a complicated program to write (this is scripting language territory). I will experiment with M4 a little bit --- even if I don't end up using it, I may yet learn some good ideas that I can incorporate into my own preprocessor.
Post 23 Nov 2013, 00:51

dogman
Hugh Aguilar wrote:
dogman wrote:
The two best and possibly earliest examples of this are IBM's MVS assembler and their PL/I both of which have integrated macro languages.

I worked for almost 2 years as an IBM370 assembly-language programmer. I agree that it has a great macro language. I was the only person at my workplace who wrote reusable code, as everybody else just did cut-and-paste from old programs. I liked IBM370 assembly-language, but I found the mainframe culture to be rather sodden.


I don't have time right now to look up sodden but we're still using assembler and the macro facility. And I still like it and the systems it runs on better than any other architecture or system or language I've ever come across.

Hugh Aguilar wrote:
I don't know anything about PL/I, although I have heard other people say that it was a great language. Is it still used anywhere today?


Yes, it is. It was always more heavily used in Europe (especially Germany) than elsewhere as a general rule, but IBM is still updating their PL/I compiler and doc. I mean new features and other enhancements, not just bug fixes. I did not mean to suggest PL/I was a great language (it is), but I was trying to point out that PL/I's macro facility is another example of a great macro language that could not have been done in the style of C's isolated "macro" preprocessor.

Hugh Aguilar wrote:
dogman wrote:
m4 is an abomination. It is so complicated and ugly it's hard to understand the sick mind(s) that came up with it. There is no excuse for m4. There is always a better solution, even if it means writing your own from scratch. If it weren't for sendmail configuration files m4 could be wiped off the face of the earth and things would be that much the better for it.


I've heard other people say that it is ugly, although your reaction is stronger than most.


Given I consider UNIX ugly as sin and most of the people who have opinions on m4 come from a UNIX background (where ugly as sin comes standard and things like integrity and design are qualities that simply don't exist at all) I think you can now understand my comment in the proper context. ;)

Hugh Aguilar wrote:
Sometimes the available program is so powerful as to be worth learning --- I doubt that is the case here though, as a preprocessor is not a complicated program to write (this is scripting language territory).


I think you will find any preprocessor worth writing is as complicated as writing an entire new scripting language, because that's really not far from the truth.

Hugh Aguilar wrote:
I will experiment with M4 a little bit --- even if I don't end up using it, I may yet learn some good ideas that I can incorporate into my own preprocessor.


I don't believe there is anything to be gained from trying to become familiar with m4 except possibly as an example to never do that again.

Post 27 Nov 2013, 17:38

Copyright © 1999-2020, Tomasz Grysztar.