flat assembler
Message board for the users of flat assembler.
Compiler Internals > Why two layers of preprocessing in fasm1?
Tomasz Grysztar 28 Dec 2016, 13:22
The main reason for this separation was performance. After the initial preprocessing (originally a very simple operation, though fasm's preprocessor grew more sophisticated over the years) the source text is converted into an internal "byte code" which is much faster to process during the multiple assembly passes. This internal code no longer has anything to do with the original text, so any text processing (such as macro processing) is no longer possible after this conversion.
For example, the line "mov eax,offset+1" becomes a sequence of codes that looks like:

Code:
[code: instruction; pointer to MOV handler]
[code: register; size/type: 4, number: 0]
[code: comma]
[code: numeric expression; micro-operations: push value from a label
       structure at given pointer, push number 1, add two numbers]

There is no longer a need to recognize any names: symbols like instruction names and labels are replaced with pointers to the appropriate handlers or data structures. This allows fasm 1 to do multiple passes over such source quite quickly.

The idea of fasm 2 (now implemented in fasmg) was to get rid of these layers, but that required abandoning the internal code. Even when I first talked about fasm 2 at fasmcon 2009, I warned that for this reason fasm 2 would be much slower, because it would need to perform all the assembly passes on the source text. In fasmg the text is pre-tokenized (in a manner similar to how the preprocessor did it in fasm 1), but the textual content of the tokens needs to be preserved. Macros and other features like namespace switching may cause the same name to refer to many different entities depending on context, so every name needs to be re-interpreted each time it is encountered.

On the other hand, modern machines are fast enough that even assembly with an instruction set implemented in the form of macros is bearable on small projects, as fasmg's self-assembly demonstrates. In fact, I am so satisfied with these macros and how flexible they are that I no longer feel any pressure to create a fasm 2 with an internal x86 encoder; I just want to write more and more macros.
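The two-layer scheme described above can be sketched as a toy model in Python. Everything here (function names, token categories) is invented for illustration and is not fasm's actual internal representation:

```python
# Toy model of fasm 1's design: convert a text line ONCE into internal
# codes, then run every assembly pass over those codes instead of the text.
# The names and categories below are illustrative only.

def parse(line):
    """One-time conversion of a source line into internal codes."""
    mnemonic, _, operands = line.partition(" ")
    codes = [("instruction", mnemonic)]       # stands in for a handler pointer
    for op in operands.split(","):
        op = op.strip()
        if op in ("eax", "ebx", "ecx", "edx"):
            codes.append(("register", op))    # stands in for size/number codes
        else:
            codes.append(("expression", op))  # stands in for micro-operations
    return codes

def run_pass(codes):
    """A pass walks the pre-parsed codes; no text needs to be re-scanned."""
    return [kind for kind, _ in codes]

codes = parse("mov eax,offset+1")             # text is parsed exactly once
for _ in range(3):                            # passes reuse the same codes
    kinds = run_pass(codes)
print(kinds)                                  # ['instruction', 'register', 'expression']
```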
JohnFound 28 Dec 2016, 14:50
Tomasz Grysztar wrote:
    In fact, I am so satisfied with these macros and how flexible they are that I no longer feel any pressure to create fasm 2 with internal x86 encoder, I just want to write more and more macros.

But what is the asymptotic behavior of fasmg on big sources? This can be much more important than the speed of compilation on particular (newer or older) processors.
_________________
Tox ID: 48C0321ADDB2FE5F644BB5E3D58B0D58C35E5BCBC81D7CD333633FEDF1047914A534256478D9
Tomasz Grysztar 28 Dec 2016, 15:36
JohnFound wrote:
    But what is the asymptotic behavior of fasmg on big sources? This can be much more important than the speed of compilation on particular (newer or older) processors.

In cases where the number of passes stays constant, assembly with fasmg should be slower by a constant factor compared to fasm 1. You can very roughly approximate that factor by imagining how many lines from nested macros fasmg needs to process where fasm 1 processed just one line with an internally handled instruction.
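The constant-factor claim can be put into a small cost model. The per-line unit costs below are made up purely for illustration; only the structure of the formulas reflects the explanation above:

```python
# Back-of-the-envelope model of total assembly time, per line of source.
# All unit costs are invented; the point is the shape of the formulas.

PARSE = 9.0      # fasm 1: one-time preprocessing/parsing cost of a line
BYTECODE = 0.5   # fasm 1: cost of one pass over that line's internal code
TEXT = 3.0       # fasmg: cost of one pass over that line's tokenized text

def fasm1_time(lines, passes):
    return PARSE * lines + BYTECODE * lines * passes

def fasmg_time(lines, passes):
    return TEXT * lines * passes

# With a fixed number of passes the ratio is a constant, independent of
# the source size: the "slower by a constant factor" behaviour.
for n in (10_000, 100_000, 1_000_000):
    print(fasmg_time(n, 2) / fasm1_time(n, 2))   # same ratio for every n
```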
JohnFound 28 Dec 2016, 19:38
Tomasz Grysztar 28 Dec 2016, 20:47
OK, if you are interested in some pure engine comparisons, I did some tests with very simple but huge sources generated with a fasmg script that looks like:
Code:
N = 4000000

db 'test0 = 0',13,10
rept N, i:0,j:1
    db 'test',`j,' = (test',`i,'*33797+1) and 0FFFFFFFFh',13,10
    if %=%%
        db 'dd test',`j
    end if
end rept

Code:
N        fasm 1.71.58    fasm g.hll54
100000   0.3 s           0.2 s
200000   0.6 s           0.3 s
300000   1.1 s           0.6 s
400000   1.5 s           0.8 s
600000   2.3 s           1.3 s
800000   3.1 s           1.8 s
1000000  4.1 s           2.1 s
2000000  8.7 s           4.1 s
4000000  out of memory   8.1 s

In the past I also performed similar tests with semi-randomized label names and deeper cross-referencing between values, but the results were similar.
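As a sanity check on the table, both columns grow roughly linearly in N: seconds per million generated lines stay nearly flat. The data points are copied from the table above; the normalization is editorial arithmetic, not part of the original post:

```python
# Normalize a few of the measurements above to seconds per million lines.
# If assembly time is linear in N, these rates should stay roughly flat.

fasm1 = {100_000: 0.3, 1_000_000: 4.1, 2_000_000: 8.7}   # N -> seconds
fasmg = {100_000: 0.2, 1_000_000: 2.1, 4_000_000: 8.1}

for name, data in (("fasm 1", fasm1), ("fasmg", fasmg)):
    rates = [t / (n / 1_000_000) for n, t in sorted(data.items())]
    print(name, [round(r, 2) for r in rates])
```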
l_inc 30 Dec 2016, 01:13
It's important to note that the comparison is made for the initial pass only. To make it fair, subsequent passes need to be compared as well:
Code:
N = 4000000
P = 2

rept 1, v:P
    db 'P = ',`v,13,10
end rept
db 'if P > 2',13,10
db ' PASS = PASS + (1-PASS/(P-2))',13,10
db 'end if',13,10
rept N, i:0,j:1
    db 'test',`j,' = (test',`i,'*33797+1) and 0FFFFFFFFh',13,10
end rept
db 'test0 = 0',13,10
rept 1, v:N
    db 'dd test',`v,13,10
end rept

The following tests are made with fixed N = 4000000 and varying P ∊ {2..8}:

Code:
fasm 1.71.57              fasm g.hnq73
2 passes, 10.9 seconds    2 passes, 11.3 seconds
3 passes, 11.2 seconds    3 passes, 16.7 seconds
4 passes, 11.6 seconds    4 passes, 21.7 seconds
5 passes, 11.8 seconds    5 passes, 27.1 seconds
6 passes, 12.0 seconds    6 passes, 32.0 seconds
7 passes, 12.3 seconds    7 passes, 37.6 seconds
8 passes, 12.4 seconds    8 passes, 42.3 seconds

Results for the original fasmg script on my machine:

Code:
fasm 1.71.57            fasm g.hnq73
1 pass, 10.7 seconds    1 pass, 5.9 seconds

P.S. To summarize: the first pass is about 2 times more expensive for fasm 1 (1.8 on my machine), but each subsequent pass is about 20 times cheaper (21.4 on my machine).
_________________
Faith is a superposition of knowledge and fallacy
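The generated line `PASS = PASS + (1-PASS/(P-2))` appears to force extra passes because PASS refers to its own value from the previous pass, so the assembler must keep repeating until the value stops changing. A Python simulation of that fixed-point iteration (with fasm-style integer division, an assumed starting value of 0, and deliberately simplified pass counting) reproduces the pass counts in the table:

```python
# Simulate the self-referencing definition PASS = PASS + (1-PASS/(P-2)).
# Each assembly pass re-evaluates PASS from its previous-pass value; the
# assembler stops on the first pass where nothing changed. This is a
# simplification of fasm's actual resolving mechanism.

def count_passes(P):
    value, passes = 0, 1            # assume PASS evaluates to 0 at first
    while True:
        new = value + (1 - value // (P - 2))   # integer division, as in fasm
        passes += 1
        if new == value:            # value stable: this was the final pass
            return passes
        value = new

print([count_passes(P) for P in range(3, 9)])  # [3, 4, 5, 6, 7, 8]
```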
l4m2 30 Dec 2016, 02:42
What about the different kinds of loops (1rept, 1repeat, 1hardloop, grept, ghardloop)?
Tomasz Grysztar 30 Dec 2016, 09:51
l_inc wrote:
    It's important to note that the comparison is made for the initial pass only. To make the comparison fair it's important to compare subsequent passes as well (...)

In the case of fasm 1, the expensive preprocessing and parsing are done only once, and the passes are performed quickly over the internal bytecode (as explained above), so adding another pass has little effect on the overall time. If fasm displayed the durations of the preprocessing and parsing stages, you would notice that all the assembly passes also take approximately the same time each, but the overhead of parsing increases the total time.

l4m2 wrote:
    What about the kinds of loop (1rept, 1repeat, 1hardloop, grept, ghardloop)

Yes, there is another place where the same effect shows up: the assembly loops (like REPEAT or WHILE in fasm 1). Again, fasm 1 preprocesses and parses the text of such a loop only once (that's why you cannot have labels local to a single repetition in fasm 1) and then repeatedly processes the internal bytecode, while fasmg operates on the source text every time. So you should find REPEAT and WHILE much slower than in fasm 1, too. On the other hand, REPT should be comparable.

PS. l_inc, nice to see you show up again to give a sober comment. And I'm glad that you still find some time to play with fasm (1 or g).
l_inc 30 Dec 2016, 14:59
Tomasz Grysztar

Quote:
    As I mentioned repeatedly...

I'm not saying you did not warn about differences in source processing behaviour and the corresponding timing consequences. I know you did, many times since the very appearance of fasmg. I'm just saying the actual numbers are incomplete. I see no reason to limit oneself to a comparison of textual source code processing if fasm 1 uses intermediate byte-code interpretation as well.

Quote:
    But since there is no simple correspondence between the length of source text and the number of passes, I saw it as an independent problem from the asymptotic data that JohnFound asked for.

The asymptotic timing behaviour can be expressed by a multivariate function.

Quote:
    adding another pass has little effect on the overall time

Sure. It's just nice to have this "little" expressed in numbers: about 20 times less than fasmg's textual processing. Especially because, as you note, this "little" applies not only to subsequent passes but also to subsequent assembly-time loop iterations, both of which are common causes of long compilation times.

Quote:
    nice to see you show up again to give a sober comment. And I'm glad that you still find some time to play with fasm (1 or g)

Unfortunately I don't. I hope I will, maybe in a few months, as I still have community-supportive tasks on my todo-list. But thank you for the motivation.
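l_inc's summary suggests a crude break-even estimate: fit a linear model t(p) = t(2 passes) + (p-2) x (cost of one extra pass) to each assembler's measurements from this thread and find the smallest pass count at which fasm 1 is already ahead. The linear model and the choice of data points are an editorial simplification, not part of the original posts:

```python
# Crude break-even estimate from the measurements in this thread
# (N = 4000000). A linear fit over the 2-pass and 8-pass data points.

fasm1 = {2: 10.9, 8: 12.4}   # passes -> seconds
fasmg = {2: 11.3, 8: 42.3}

def per_pass(t):             # average cost of one additional pass
    return (t[8] - t[2]) / 6

def time_at(t, p):           # linear extrapolation from the 2-pass point
    return t[2] + (p - 2) * per_pass(t)

p = 2
while time_at(fasm1, p) > time_at(fasmg, p):   # find where fasm 1 wins
    p += 1
print(p)   # -> 2: under this model fasm 1 is ahead from the second pass on
```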
Tomasz Grysztar 30 Dec 2016, 16:09
l_inc wrote:
    The asymptotic timing behaviour can be expressed by a multivariate function.

l_inc wrote:
    Unfortunately I don't. I hope I will, maybe in a few months, as I still have community-supportive tasks on my todo-list. But thank you for the motivation.
ACP 06 Jan 2017, 14:53
Sorry for interrupting such an interesting thread with an off-topic post, but I am in the process of rewriting a very specific assembler, and that gives me a chance to (re)design it as well. Since it was a very simple multi-pass assembler (actually only 2 passes, as the syntax is so simple), I am thinking about extending the syntax and making it a true multi-pass assembler. This gives the opportunity either to work internally on a representation of the source code after preprocessing it, or to use the fasmg approach.

I'd like to hear Tomasz's opinion on which method he thinks is the future of assemblers. I don't have any speed/memory constraints to consider, as this is a cross assembler.
Tomasz Grysztar 06 Jan 2017, 15:24
I think that the fasmg approach is generally more in line with the expectations of users. There are many cases in fasm 1 where the separation of the text-processing phase from the actual assembly is a source of unintuitive limitations. So when performance and resource usage are not your concern, I think it is better to use text-based processing, which is more predictable for the user.
ACP 06 Jan 2017, 18:13
Tomasz Grysztar wrote:
    I think that fasmg approach is generally more in line with the expectations of users.

Thank you for your feedback, Tomasz. Again, I don't want to hijack this thread, so maybe we can move this discussion to another one, but I do wonder how you internally handle macros like "include", or the inclusion of binary data from external files. I'm asking because I think this is easier to implement in the old fasm approach.
Tomasz Grysztar 06 Jan 2017, 18:42
ACP wrote:
    Thank you for your feedback Tomasz. Again I don't want to hijack this thread so maybe we can move this discussion to another one, but I do wonder how you handle internally macros like "include" or inclusion of binary data from external files. I'm asking since I think it is easier to implement in old fasm approach.
ACP 06 Jan 2017, 20:14
Wow, that sounds pretty complex for a simple task, but I can understand where you are coming from. That would probably be a lot of overhead, introducing unneeded complexity in the case of my project. Overall I really like your elegant design of this functionality. Thanks for saving me from trying to extract all those implementation and design details from the source code.
Copyright © 1999-2024, Tomasz Grysztar.