flat assembler
Message board for the users of flat assembler.
Compiler Internals > paragraph aligning
revolution 12 Nov 2013, 05:32
In the general case the problem is unsolvable because when you shift the position of code it can shrink or grow in size causing an optimal solution to not exist.
I had the same problem in the ARM assembler when choosing between 16-bit and 32-bit encodings in thumb mode. One can end up in an infinite loop trying to find a solution. There would need to be some mechanism to detect such cases and essentially "give up" and move on.
Also, one needs to realise that fasm does not natively know about procedure/function demarcation points. That is all handled by the macros, so if you do want to code a solution then the place to do it is in the macros.
Hugh Aguilar 12 Nov 2013, 07:00
revolution wrote: In the general case the problem is unsolvable because when you shift the position of code it can shrink or grow in size causing an optimal solution to not exist.
The only case in which this could happen is if two functions did a CALL to each other, which would be mutual recursion. That is pretty rare. Also, CALL only has 16-bit and 32-bit offsets, but no 8-bit offset, so it is pretty rare that there would be two functions right on the border of a 64K distance between each other that would cause this infinite-recursion problem to come up. If there is any question, then don't bother with the recursive search for optimality, but just go ahead and do the paragraph align --- the worst case is that doing so will waste memory, but this is rarer than rare, and right now we have the worst case every time. It is better to be optimal 99.9999% of the time than optimal 50% of the time --- I'm not so demanding as to require 100%.
revolution wrote: Also, one needs to realise the fasm does not natively know about procedure/function demarcation points. That is all handled by the macros so if you do want to code a solution then the place to do it is in the macros.
I was assuming that there would be two new words introduced into the assembler syntax, such as FUNC and FEND (function-start and function-end), that would wrap each function. Note that a "function" doesn't necessarily end in RET --- in my Forth system, functions end in NEXT, which is a macro, or in a JMP to another function. Right now, it is common to have ALIGN 16 in front of each function --- so the programmer would just replace that with FUNC --- and put a FEND after the function.
I'm trying to get my Forth VM to fit entirely in the 32K code-cache --- so, reducing code size is pretty important to me --- I'm planning on just doing the tedious manual fix-up after the program is finished, which I'm not looking forward to.
revolution 12 Nov 2013, 07:13
Hugh Aguilar wrote: The only case in which this could happen, is if two functions did a CALL to each other, which would be mutual recursion.
As for adding new directives/opcodes into the fasm core: I don't think that will happen. The use case is small, it can be done by macros, and it is a code management function (not a code generation function), so the design principles of fasm (if I understand them correctly) would preclude adding such directives/opcodes.
Hugh Aguilar 12 Nov 2013, 07:18
Hugh Aguilar wrote: I'm trying to get my Forth VM to fit entirely in the 32K code-cache --- so, reducing code size is pretty important to me --- I'm planning on just doing the tedious manual fix-up after the program is finished, which I'm not looking forward to.
Note that a Forth system is different from most assembly-language programs --- a Forth system has a lot of functions (400ish), most of which are quite short (1 to 3 paragraphs), whereas the typical assembly-language program has relatively few functions which are relatively large. I'm saying "typical" here to describe what I've seen over the years --- although I don't really pay much attention to what other people are doing, so this may not be representative.
Anyway, because I'm writing a Forth system, this feature would help me more than it would help the "typical" FASM user. It should be pretty easy to implement though, so I would appreciate it if it were done.
Hugh Aguilar 12 Nov 2013, 07:20
revolution wrote: it can be done by macros
How is it possible to determine the size of a function at compile time, without assembling the code?
revolution 12 Nov 2013, 07:23
fasm is a multi-pass assembler. Function size can be determined at each pass with something like this:
Code:
function_start:
        ;some code
function_size = $ - function_start
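A minimal sketch of how such a size can be used, assuming flat binary output at origin 0 and a two-instruction placeholder body (both are illustrative choices, not part of the post above). Because `function_size` is defined only once, fasm allows it to be referenced before its definition and settles the value over successive passes:

```asm
use32
org 0

        dd      function_size           ; forward reference, resolved on a later pass

function_start:
        xor     eax, eax                ; placeholder body
        ret
function_size = $ - function_start      ; 3 bytes for the two instructions above

; the size is an ordinary numeric constant, so it can drive conditional assembly
if function_size > 16
        display 'function_start is longer than 16 bytes', 13, 10
end if
```

No code is assembled twice by hand here; the assembler simply repeats its passes until every forward-referenced value stops changing.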
Hugh Aguilar 12 Nov 2013, 07:28
Hugh Aguilar wrote:
Not just Forth --- any VM. There are a lot of people around who have invented some language that they are implementing as a VM --- this is pretty much the standard hobby that assembly-language programmers do in their free time.
Hugh Aguilar 12 Nov 2013, 07:33
revolution wrote: fasm is a multi-pass assembler. Function size can be determined at each pass with something like this:
If it can be done with a macro, then that is fine with me --- just so long as I don't have to do the manual fix-up myself. I have no idea how to write such a macro --- I haven't learned much about the macro language though, so maybe it can be done.
AsmGuru62 12 Nov 2013, 14:56
Cool topic!
Let me add some information from my experience. I coded a large project a few years ago and all my functions were preceded with this line:
Code:
align METHOD_ENTRY
Method1:
        ;
        ; ... some code here ...
        ;
        ret

align METHOD_ENTRY
Method2:
        ;
        ; ... some code here ...
        ;
        ret

... and so on in every file ...
So, I was able to define METHOD_ENTRY to be some value and then, by re-assembling the code, I could see how many bytes I could 'save' by decreasing this value. I must say that the difference was not as impressive as I had hoped. The PE file came out at around 60 KB, give or take ~2 KB, so approx. 58 to 62 KB, which was not that decisive for me: a saving of only a few percent. I went with 32 bytes as the final value for METHOD_ENTRY, because that is how a C compiler would do it.
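For illustration, a sketch of how such a switch might be set up. The value 16, the flat binary layout and the method bodies are assumptions made for this example; only the name METHOD_ENTRY comes from the post. Note that fasm's `align` expects a power of two:

```asm
METHOD_ENTRY = 16               ; assumed value; try 4, 8, 16 or 32 and compare output sizes

use32
org 0

align METHOD_ENTRY
Method1:
        xor     eax, eax        ; placeholder body
        ret

align METHOD_ENTRY
Method2:
        mov     eax, 1          ; placeholder body
        ret
```

Changing the single constant and re-assembling is enough to compare the resulting file sizes, since every `align` picks up the new value.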
Tomasz Grysztar 12 Nov 2013, 15:14
Hugh Aguilar wrote: I think the solution here is that the function will occupy the minimum number of 16-byte paragraphs if:
This is quite easy to be translated into a macro in fasm:
Code:
macro FUNC {
    local function_start, function_size
    if (($-$$) mod 16)+(function_size mod 16) >= 16
        align 16
    end if
    function_start = $
    start@FUNC equ function_start
    size@FUNC equ function_size
}

macro FEND {
    size@FUNC = $ - start@FUNC
    restore start@FUNC
    restore size@FUNC
}
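A minimal usage sketch, assuming flat binary output starting at offset 0. The labels `first`, `second` and `third` are hypothetical, and the `times ... nop` filler is there only to make the sizes easy to count. The two macros are repeated from the post so that the file assembles on its own:

```asm
use32
org 0

macro FUNC {
    local function_start, function_size
    if (($-$$) mod 16)+(function_size mod 16) >= 16
        align 16
    end if
    function_start = $
    start@FUNC equ function_start
    size@FUNC equ function_size
}

macro FEND {
    size@FUNC = $ - start@FUNC
    restore start@FUNC
    restore size@FUNC
}

FUNC
first:                          ; offset 0, size 10: 0 + 10 < 16, no padding
        times 9 nop
        ret
FEND

FUNC
second:                         ; would start at offset 10: 10 + 10 >= 16, so align 16
        times 9 nop             ; is emitted and second starts at offset 16
        ret
FEND

FUNC
third:                          ; offset 26, size 4: (26 mod 16) + 4 = 14 < 16, no padding
        times 3 nop
        ret
FEND
```

On the first pass every `function_size` is still unknown, so no padding is emitted; the sizes measured by FEND feed into the next pass, and the layout settles once nothing changes any more. One case that may be worth checking by hand: when a function's size is an exact multiple of 16, `function_size mod 16` is 0, so the condition as written never triggers the alignment for it.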
Hugh Aguilar 13 Nov 2013, 04:50
Tomasz Grysztar wrote: This is quite easy to be translated into a macro in fasm:
Thanks! Your macro language is a lot more powerful than I had realized. I had expected that to require a modification of the assembler itself. My background is in Forth, which does a single-pass --- I may not have been grasping what a multi-pass assembler can do.
BTW: I asked the same question on the HLA mailing list, and Randy also said that it was easy and provided a macro --- so apparently all of you guys are way ahead of me.
Hugh Aguilar 17 Nov 2013, 04:44
Hugh Aguilar wrote: BTW: I asked the same question on the HLA mailing list, and Randy also said that it was easy and provided a macro --- so apparently all of you guys are way ahead of me.
I take that back --- HLA isn't doing it. Randy says in regard to your macros:
Randy Hyde wrote: These macros will not work in all cases.
He apparently doesn't want to do it if it won't work 100% of the time --- and I can't blame him for that.
Tomasz Grysztar 17 Nov 2013, 10:20
Indeed, fasm never guarantees to find the optimal solution (as in: the shortest, the smallest) - that is not possible in the general case. What it does guarantee is that the solution (if it finds one at all) will be correct - that means that all the relations in code that are defined (like: "if this function size is X then it must be preceded by Y") have to be fulfilled exactly.
You can find an interesting example of what fasm's code resolving does here: http://board.flatassembler.net/topic.php?t=4703
There is also a more detailed description of the whole process in my Understanding fasm article.
It is possible that fasm will fail to find the correct solution at all, and will simply exit with an error message "code cannot be generated" - this is what revolution mentioned.
Hugh Aguilar 21 Nov 2013, 07:17
So far, FASM seems to be the only x86 assembler that can do this with macros. I've been told that NASM can't do it. I don't remember TASM or A86 having this ability, although it was many many years ago when I used those, and I've never used MASM so I can't say in regard to it. I've never seen this in any other assembler, such as for the IBM370 or the many micro-controller assemblers I've used. This is why I expected that it would require an upgrade to the assembler itself, rather than being done in macros.
AFAIK, the only assemblers that support 64-bit x86 are NASM and FASM. I just chose FASM on a whim (because the waitress at the restaurant where I eat is from Poland, and so are you), but FASM seems to be working out pretty well so far.
I had previously been of the opinion that a macro assembler was a bad idea. I thought that a better design would be to have a simple assembler without macros, and then have a separate preprocessor to provide the macro language. It is easier to write 2 small programs that do distinct jobs (assembling, and preprocessing) rather than 1 big program that does both. Also, the preprocessor can be made to work with various assemblers for various micro-controllers, which wouldn't be possible if it were integrated into the assembler. Now I have to revise my opinion --- your FUNC and FEND wouldn't be possible in a preprocessor.
BTW: Does anybody here have an opinion on M4? I just recently heard about it. Right now, the most complicated thing that I'm doing with macros is building a linked-list at compile-time. The FASM macros are fine for this (as were HLA's, and NASM looks adequate too although I didn't delve into NASM). If I upgrade to a self-balancing binary tree in the future I would really be stretching your macro language, and I think a hash table would be impossible --- when that time comes, I may want to introduce a preprocessor. By then, my own language should be up and running, so I figured I would write the preprocessor in it, so it could help in generating version 2 of itself. But M4 might be another possibility. This is all in the future though --- right now, I can just use FASM macros for version 1.
dogman 21 Nov 2013, 10:52
Hugh Aguilar wrote: I had previously been of the opinion that a macro assembler was a bad idea.
A good macro assembler is one of the most expressive and extensible programming languages around. The macro processor has to be integral to the assembler to get the most power. When it has access to the assembler's symbol table and other structures it has full meta-knowledge of the code being assembled and can provide the most power and flexibility possible. No external preprocessor is worth anything compared to a good macro assembler. The two best and possibly earliest examples of this are IBM's MVS assembler and their PL/I, both of which have integrated macro languages.
Hugh Aguilar wrote: I thought that a better design would be to have a simple assembler without macros, and then have a separate preprocessor to provide the macro language. It is easier to write 2 small programs that do distinct jobs (assembling, and preprocessing) rather than 1 big program that does both.
After you write the code, it doesn't make a difference if it was easier. It only makes a difference what's better. Sooner or later people need to learn to stop taking the easy way out and think things through, even though the UNIX-mindset opposes that and so does the lazy-man's mindset.
Hugh Aguilar wrote: Also, the preprocessor can be made to work with various assemblers for various micro-controllers, which wouldn't be possible if it were integrated into the assembler.
There are always tradeoffs. Stuff that is more specific to the problem to be solved is usually better than something more general. This is just engineering.
Hugh Aguilar wrote: BTW: Does anybody here have an opinion on M4? I just recently heard about it. Right now, the most complicated thing that I'm doing with macros is building a linked-list at compile-time. The FASM macros are fine for this (as were HLA's, and NASM looks adequate too although I didn't delve into NASM). If I upgrade to a self-balancing binary tree in the future I would really be stretching your macro language, and I think a hash table would be impossible --- when that time comes, I may want to introduce a preprocessor. By then, my own language should be up and running, so I figured I would write the preprocessor in it, so it could help in generating version 2 of itself. But M4 might be another possibility. This is all in the future though --- right now, I can just use FASM macros for version 1.
m4 is an abomination. It is so complicated and ugly it's hard to understand the sick mind(s) that came up with it. There is no excuse for m4. There is always a better solution, even if it means writing your own from scratch. If it weren't for sendmail configuration files m4 could be wiped off the face of the earth and things would be that much the better for it.
As far as I know you would not want to write tree handling in a macro language anyway. You need dynamic storage management and macro languages don't have that. You could (should) certainly write some good helper macros for managing and generating the assembly code though.
_________________
Sources? Ahahaha! We don't need no stinkin' sources!
Hugh Aguilar 23 Nov 2013, 00:51
dogman wrote: The two best and possibly earliest examples of this are IBM's MVS assembler and their PL/I both of which have integrated macro languages.
I worked for almost 2 years as an IBM370 assembly-language programmer. I agree that it has a great macro language. I was the only person at my workplace who wrote reusable code, as everybody else just did cut-and-paste from old programs. I liked IBM370 assembly-language, but I found the mainframe culture to be rather sodden. I don't know anything about PL/I, although I have heard other people say that it was a great language. Is it still used anywhere today?
My own language that I'm writing is Forth with the addition of quotations and the deletion of a ton of cruft --- so it will not only be significantly more powerful than ANS-Forth (especially for writing reusable libraries), but it will also be a lot simpler. This is what I'm focusing on right now --- although I'm always interested in learning new languages and new ideas, mostly with an eye toward incorporating those ideas into my language if I think they are worthwhile.
dogman wrote: m4 is an abomination. It is so complicated and ugly it's hard to understand the sick mind(s) that came up with it. There is no excuse for m4. There is always a better solution, even if it means writing your own from scratch. If it weren't for sendmail configuration files m4 could be wiped off the face of the earth and things would be that much the better for it.
I've heard other people say that it is ugly, although your reaction is stronger than most. My own predilection is to write programs from scratch myself, as that is often easier than learning how somebody else's program works. Sometimes the available program is so powerful as to be worth learning --- I doubt that is the case here though, as a preprocessor is not a complicated program to write (this is scripting language territory).
I will experiment with M4 a little bit --- even if I don't end up using it, I may yet learn some good ideas that I can incorporate into my own preprocessor.
dogman 27 Nov 2013, 17:38
Hugh Aguilar wrote:
I don't have time right now to look up sodden but we're still using assembler and the macro facility. And I still like it and the systems it runs on better than any other architecture or system or language I've ever come across.
Hugh Aguilar wrote: I don't know anything about PL/I, although I have heard other people say that it was a great language. Is it still used anywhere today?
Yes, it is. It was always more heavily used in Europe (especially Germany) than elsewhere as a general rule but IBM is still updating their PL/I compiler and doc. I mean new features and other enhancements, not just bug fixes. I did not mean to suggest PL/I was a great language - it is, but I was trying to point out that PL/I's macro facility is another example of a great macro language that could not have been done in the style of C's isolated "macro" preprocessor.
Hugh Aguilar wrote:
Given I consider UNIX ugly as sin and most of the people who have opinions on m4 come from a UNIX background (where ugly as sin comes standard and things like integrity and design are qualities that simply don't exist at all) I think you can now understand my comment in the proper context.
Hugh Aguilar wrote: Sometimes the available program is so powerful as to be worth learning --- I doubt that is the case here though, as a preprocessor is not a complicated program to write (this is scripting language territory).
I think you will find any preprocessor worth writing is as complicated as writing an entire new scripting language, because that's really not far from the truth.
Hugh Aguilar wrote: I will experiment with M4 a little bit --- even if I don't end up using it, I may yet learn some good ideas that I can incorporate into my own preprocessor.
I don't believe there is anything to be gained from trying to become familiar with m4 except possibly as an example to never do that again.
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.