flat assembler
Message board for the users of flat assembler.
Index
> Main > alpha testing - speed improvements Goto page Previous 1, 2, 3, 4 Next |
Author |
|
scientica 23 Jan 2004, 20:46
W00t?!?!?! Is there something wrong or is just fasm insanely fast all of a sudden??:
Code: [frekla@ns1 fasm_tmp]$ ./fasm501a3 dwordGEN.ASM dwordGEN.501a3OUT flat assembler version 1.51 alpha 3 2 passes, 19.2 seconds, 4744967 bytes. [frekla@ns1 fasm_tmp]$ ll dwordGEN.501a3OUT -rw-r----- 1 frekla frekla 4744967 23 jan 21.38 dwordGEN.501a3OUT [frekla@ns1 fasm_tmp]$ Can any one test this one? (second run gave me 20.5 sec !!) (I attach the modifyed source, note you must change line 428 from: Code: if (first < 0) __asm__ ("int $3 "); // YUCK GAS syntax!!! to this to get it to compile with non gcc compiler Code: if (first < 0) __asm int 3; ) as I'm in linux I can't provide a pre-compiled exe (only elf you some wants) [EDIT]Pre-generated asm file can be downloaded from here: http://w1.364.telia.com/~u36404088/tmp/dwordGEN.ASM.tgz [/EDIT]
_________________ ... a professor saying: "use this proprietary software to learn computer science" is the same as English professor handing you a copy of Shakespeare and saying: "use this book to learn Shakespeare without opening the book itself. - Bradley Kuhn |
|||||||||||
23 Jan 2004, 20:46 |
|
khanh 24 Jan 2004, 10:38
Great work!
I'm hearing forward from your success |
|||
24 Jan 2004, 10:38 |
|
Tommy 24 Jan 2004, 18:22
Great Privalov! Every little improvement is worth it! Keep it up!
Cheers, Tommy |
|||
24 Jan 2004, 18:22 |
|
Randall Hyde 30 Jan 2004, 23:37
Hi all,
I've modified my assembly benchmark generator program to produce some FASM compatible output. I'm sure the macros I wrote really stink, so if anyone has some suggestions for the code generation for FASM, I'd appreciate some input. You can read about the assembler benchmark generator here: http://www.masmforum.com/topic.php?t=1873&postdays=0&postorder=asc&start=0 And you can download the code with the FASM generator below. Cheers, Randy Hyde BTW, here's a typical code sequence generated by the benchmark generator: Code: format MS COFF _fk_73n equ 0560A7409h _14udcdksnb00 equ 1896619531 w_ equ 0AABE57BDh u_t393h equ 02C65D964h cxfra_p10bp03vs equ 3305213952 section '.data' data readable writeable align 16 sv__fk_73n dd _fk_73n-1443525641+cxfra_p10bp03vs-0C5019000h+u_t393h sv__14udcdksnb00 dd _14udcdksnb00-0710C1E0Bh+u_t393h-744872292+_fk_73n sv_w_ dd w_-2864601021+w_-2864601021+u_t393h sv_u_t393h dd u_t393h-744872292+_14udcdksnb00-0710C1E0Bh+u_t393h sv_cxfra_p10bp03vs dd cxfra_p10bp03vs-0C5019000h+_fk_73n-1443525641+_fk_73n macro empty { } macro parms a, b, c, d, e { db a, b, c, d, e } empty empty empty empty empty parms 0,1,2,3,4 parms 0,1,2,3,4 parms 0,1,2,3,4 parms 0,1,2,3,4 parms 0,1,2,3,4 section '.text' code readable executable align 16 macro _if r1, r2 { cmp r1, r2 jne @f } macro _elseif r1, r2 { @@: cmp r1, r2 jne @f } macro _else { @@: } macro _endif { @@: } macro _while r1, r2 { @@: cmp r1, r2 jnb @f } macro _endw { jmp @b @@: } _if eax, ebx _elseif ecx, 1 _else _endif _while eax, 10 inc eax _endw _if eax, ebx _elseif ecx, 1 _else _endif _while eax, 10 inc eax _endw _if eax, ebx _elseif ecx, 1 _else _endif _while eax, 10 inc eax _endw _if eax, ebx _elseif ecx, 1 _else _endif _while eax, 10 inc eax _endw _if eax, ebx _elseif ecx, 1 _else _endif _while eax, 10 inc eax _endw
|
|||||||||||
30 Jan 2004, 23:37 |
|
Tomasz Grysztar 30 Jan 2004, 23:42
You shouldn't use "equ" for equates, as in fasm "equ" is used for defining symbolic constants - a kind of macros. Numerical constants are defined using = symbol in fasm.
|
|||
30 Jan 2004, 23:42 |
|
Tomasz Grysztar 31 Jan 2004, 00:48
I'm attaching the corrected version of benchmark, which defines the equates fasm's way.
And here are some result of the benchmark with parameter n only (time in seconds, tested on Duron 600). As you can see, the improvement of fasm 1.51 over 1.50 is really large for bigger files. I have put also results for MASM 6.14 for comparision, though it refused to assemble files with n larger than 3000, due to problems with such a number of .if-like structures. Code: n fasm 1.50 fasm 1.51 MASM 6.14 1000 0.25 0.15 0.35 2000 0.45 0.35 0.65 3000 0.75 0.45 1.1 4000 1.1 0.65 ? 5000 1.5 0.75 ? 10000 7.9 1.5 ? 20000 71.2 3.0 ? So I have made another test, attaching the h1000 parameter after n setting to keep the amount of .if-structures small. The results are: Code: n fasm 1.50 fasm 1.51 MASM 6.14 1000 0.25 0.15 0.25 2000 0.35 0.25 0.35 5000 1.2 0.45 0.65 10000 7.3 0.75 1.3 20000 69.4 1.4 3.4 50000 487.9 3.6 15.6 100000 ? 7.9 59.5 I wasn't just patient enough to wait for fasm 1.50 to assemble the "n100000 h1000" benchmark. Also, for higher n values I tried the benchmark generated some duplicated labels, so it became unusable.
|
|||||||||||
31 Jan 2004, 00:48 |
|
comrade 31 Jan 2004, 01:23
Unbelievable!
|
|||
31 Jan 2004, 01:23 |
|
Randall Hyde 31 Jan 2004, 05:52
Privalov wrote: You shouldn't use "equ" for equates, as in fasm "equ" is used for defining symbolic constants - a kind of macros. Numerical constants are defined using = symbol in fasm. Okay, that's an easy fix. And a good thing to know (e.g., for HLA code generation which uses EQUs). Cheers, Randy Hyde |
|||
31 Jan 2004, 05:52 |
|
Randall Hyde 31 Jan 2004, 05:55
Privalov wrote: Also, for higher n values I tried the benchmark generated some duplicated labels, so it became unusable. Hmmm... That's interesting, I'll have to figure out why the duplicate symbol problem is occurring. Amazing speedups on the large files, though. Cheers, Randy Hyde |
|||
31 Jan 2004, 05:55 |
|
JohnFound 31 Jan 2004, 08:15
Hi all.
Here are the results of above tests in graphical view. You can see, that FASM 1.51 graphic is very close to straight line, that IMHO shows that it uses almost ideal algorithms for building label lists. Even MASM's graphic becomes warped at big values. Unfortunately that means that future improvements in the speed (of parser) are almost imposible. Regards.
Last edited by JohnFound on 03 Feb 2004, 22:20; edited 1 time in total |
||||||||||
31 Jan 2004, 08:15 |
|
Randall Hyde 02 Feb 2004, 16:20
JohnFound wrote: Hi all. I'm assuming the Y-axis was supposed to be some unit involving time . As for parser improvement, that depends mainly on whether the symbol table routines are still consuming the most time or whether the bottleneck has shifted elsewhere. OTOH, assembly times have probably gotten to the point where, even for large project, we don't need to worry about them anymore. What's the current assembly time for FRESH? Cheers, Randy Hyde |
|||
02 Feb 2004, 16:20 |
|
JohnFound 02 Feb 2004, 16:36
Randall Hyde wrote: I'm assuming the Y-axis was supposed to be some unit involving time . Well, the units are: X - n parameter of your benchmark, and Y - time in seconds. You know, some people still uses "," for decimal separator and "nothing" for thousends. Quote: As for parser improvement, that depends mainly on whether the symbol table routines are still consuming the most time or whether the bottleneck has shifted elsewhere. Your test in it's present stage can significaly load only the parser of FASM, the results indicate only the performance of the parser and maybe a little of the preprocessor. Regards. |
|||
02 Feb 2004, 16:36 |
|
Randall Hyde 03 Feb 2004, 21:19
JohnFound wrote:
I was wondering more about the numbers themselves. 10,000 seconds at the low end seems kind of high Cheers, Randy Hyde |
|||
03 Feb 2004, 21:19 |
|
pelaillo 03 Feb 2004, 21:32
2hr 42min
|
|||
03 Feb 2004, 21:32 |
|
JohnFound 03 Feb 2004, 22:22
It is fixed. Thank you.
|
|||
03 Feb 2004, 22:22 |
|
HarryTuttle 04 Feb 2004, 11:02
it seems that Fasm1.47 is faster compared with all tested here.
What do you think about it? _________________ Microsoft: brings power of yesterday to computers of today. |
|||
04 Feb 2004, 11:02 |
|
JohnFound 04 Feb 2004, 11:34
HarryTuttle wrote: it seems that Fasm1.47 is faster compared with all tested here. Hm, what actually you mean??? |
|||
04 Feb 2004, 11:34 |
|
Octavio 21 Mar 2004, 03:32
[quote="Privalov"]Recently I've decided it is the high time to suspend adding new features into fasm and concentrate on speed optimizations of what we've already got -
I have been testing a new algorithm to calculate branch displacements with a correction value for backwards jumps ,whith this algoritm the numbers of pases seems to be independent of the program size and in most cases only 3-6 passes are required. the correction value is the diference betwen the actual value of a label and the value stored in a previous pass . |
|||
21 Mar 2004, 03:32 |
|
CodeForce 22 Jun 2004, 15:09
Hi,
I came accross this thread when searching for hash algorihms and think I can add some info/suggestions: One of the better sites I seen about hashing is this one. It compares different hasing algorithms and I am sure some of them will fit fasm's workload better then others, its worth taking a look. ANother link is this one. As for the statiement that hasing problably won't improve the application beyond its current state, I like to add some food for thought. Hasing performance consists of various parameters: 1. Hash table size 2. Key distribution 3. Key collision handling 4. Open/close hash tables And some more. Try experimeting beyond just replacing one hash function for some other and change collision handling for example. For closed tables considder using the next slot in the table when having a collision, if that does not work, use the one after that. Why? Because it improves locality of reference, somehting which current processors are very sensitive to. For open tables, try to use sorted lists after each table entry. Even better then lists are arrays fo corse (for the same reason, but you need a very good distribution for this). Also considder mixing both methods, eg, each hash table entry its own fixed size array, but when missed, considder the next array in the hash table as an alternative. This can keep the hash table relatively small and cost you only more comparisons (sometimes). Also try to measure how much time is currenlty spend in the label storage/lookup code. If this is next to nothing, then don't bother optimising for speed, just optimise for memory usage while preserving speed! Beyond all this rumbling, I am plesantly surprised there are still folks out there that program things in assembler. It used to be a hobby of mine, shuffling data with the limiter number of registers available, exctracting every ounce of performance out of an arcane x86 architecure (did 68K and 6502 as well). All in all some good memories, even abused the parity bit once to do some cheap multiple xor operation in a linear shift feedback random number gererator (or something like that). Good luck with your project and thats all I have to say now . CodeForce |
|||
22 Jun 2004, 15:09 |
|
Goto page Previous 1, 2, 3, 4 Next < Last Thread | Next Thread > |
Forum Rules:
|
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.