flat assembler
Message board for the users of flat assembler.

flat assembler > Main > alpha testing - speed improvements

Goto page 1, 2, 3, 4  Next
Author
Thread Post new topic Reply to topic
Tomasz Grysztar
Assembly Artist


Joined: 16 Jun 2003
Posts: 7310
Location: Kraków, Poland
Recently I've decided it is the high time to suspend adding new features into fasm and concentrate on speed optimizations of what we've already got - because, as the quite big fasm-based projects like Fresh are growing, the speed gained by simple fact that fasm is written in assembler starts to be not enough (at least for those with slower processors). And so I have begun working on the 1.51 release - no new features are in plans this time, only speed optimizations.

I'm attaching the first alpha version, which has the first small step completed on my way - I have extended the parser's hashing mechanism with the binary hash tree, and it should make processing large amounts of labels and numerical constants (like Win32 equates) faster.

The next step should include some optimizations for preprocessor - wish me good luck. Smile

The attached version is the Win32 console version, but you can use the source from .zip archive to recompile any other. I've chosed Win32 version instead of DOS one (which I was traditionally attaching compiled in prereleases), because Windows by default gives DPMI programs only about 16 MB of extended memory (though you can change it in program properties), and for the larger tests fasm would easily run out of memory with such settings.

PS. I forgot to add that the improvements are aimed to make compilation of large source faster - at the same time assembling of the very small programs may get a bit slower, but compilation of small sources is fast anyway.

(attachment removed - please look below for the latest pre-release)


Last edited by Tomasz Grysztar on 21 Jan 2004, 16:48; edited 1 time in total
Post 13 Jan 2004, 16:42
View user's profile Send private message Visit poster's website Reply with quote
decard



Joined: 11 Sep 2003
Posts: 1095
Location: Poland
Hi Privalov,
Nice to hear that you still want to optimize FASM for speed, although it seems to be fastest assembler avaible...
I made a test (compiling FRESH on my P266). And:
fasm.exe wrote:

39,1 sec with v1.50
29,7 sec with v1.51 alpha 1

Shocked
wow, that's real optimization... I can't belive it Smile. You are great, Privalov! Keep up your great work!!!

PS: and what about code aligment? It should give even more speed...
Post 13 Jan 2004, 17:14
View user's profile Send private message Visit poster's website Reply with quote
BiDark



Joined: 22 Jun 2003
Posts: 110
Location: .th
I can see only a little faster on 1.2 Ghz Celeron (1.3 down to 1.2 sec) with Fasmw's source compilation.

May be I'm the only one who think Masm is still the only multi-pass assembler that can handle the large equate files best. Rolling Eyes
Post 14 Jan 2004, 04:53
View user's profile Send private message Reply with quote
fasm9



Joined: 19 Jun 2003
Posts: 439
Kudos to all,

Good luck!

--
i would like to see Data structures in FASM,
linked list, stacks queues, heaps, recursion, binary tree, B-tree family,
tries, graphs, sorting, hashing, data compression,
memory management.
Post 14 Jan 2004, 06:23
View user's profile Send private message Reply with quote
JohnFound



Joined: 16 Jun 2003
Posts: 3500
Location: Bulgaria
BiDark wrote:
I can see only a little faster on 1.2 Ghz Celeron (1.3 down to 1.2 sec) with Fasmw's source compilation.


Well, FASMW is not so big source. Better try to compile Fresh sources.
Of course FASM 1.51 alpha 1 have improved only parser. (but now it is 4 to 5 times faster on my tests).
If you want to test what actually is the time of every stage of compilator, try Fresh, it have advanced timing compilator mesages. The last work version of Fresh (1.0.1C.05) is with FASM 1.51 alpha 1 inside.

Quote:
May be I'm the only one who think Masm is still the only multi-pass assembler that can handle the large equate files best. Rolling Eyes


Yea, I think that only a few people think so lately. Very Happy

Regards


Last edited by JohnFound on 14 Jan 2004, 10:16; edited 1 time in total
Post 14 Jan 2004, 07:36
View user's profile Send private message Visit poster's website ICQ Number Reply with quote
Tomasz Grysztar
Assembly Artist


Joined: 16 Jun 2003
Posts: 7310
Location: Kraków, Poland
Please don't refer to this release as to "FASM 1.51" - it's only the "1.51 alpha 1", I still have some more work to do with it.

If you want to compare the speed of improved parser with the old version, you might try the equates.inc file from this package (I have used it for tests myself): http://board.win32asmcommunity.net/showthread.php?s=&threadid=10110

BiDark: probably I could still make parser run slightly faster on your processor by making some loop alignments (there again are none in "alpha 1"), would you help me testing it?
Post 14 Jan 2004, 10:04
View user's profile Send private message Visit poster's website Reply with quote
Tommy



Joined: 17 Jun 2003
Posts: 492
Location: Norway
Hi Privalov,

My results (compiling FASM sources):
FASM 1.50: 0.7 secs
FASM 1.51 (alpha 1): 0.4 secs

Wink Keep it up!
Post 14 Jan 2004, 10:55
View user's profile Send private message Visit poster's website Reply with quote
gorshing



Joined: 27 Jul 2003
Posts: 72
Location: Okla, US
fasm9 wrote:
i would like to see Data structures in FASM,
linked list, stacks queues, heaps, recursion, binary tree, B-tree family,
tries, graphs, sorting, hashing, data compression,
memory management.


I could see a couple of these things, but data compression?

As for the 1.51 alpha ... another reason why I like fasm Smile

Continuing to make it better ... Good work privalov!

_________________
gorshing
Post 14 Jan 2004, 13:39
View user's profile Send private message Visit poster's website Reply with quote
scientica
Retired moderator


Joined: 16 Jun 2003
Posts: 689
Location: Linköping, Sweden
Wan't there a *huge* source file posted on one if these forums, iirc the file was 'bout 1 MB - I think the pass counter was changed from byte size to word size due to the file, maybe that's a good test source Smile (the version where fasm has to optimize all jumps and so, not the version where dword size was 'forced')

Doe's any one know that source I'm talking about? (I'll try to find it when I've read the other posts)

_________________
... a professor saying: "use this proprietary software to learn computer science" is the same as English professor handing you a copy of Shakespeare and saying: "use this book to learn Shakespeare without opening the book itself.
- Bradley Kuhn
Post 14 Jan 2004, 15:14
View user's profile Send private message Visit poster's website MSN Messenger ICQ Number Reply with quote
scientica
Retired moderator


Joined: 16 Jun 2003
Posts: 689
Location: Linköping, Sweden
This is the source I'm talking about, it's 22 MB source Smile
http://board.win32asmcommunity.net/showthread.php?s=&threadid=13854&highlight=pass

let's see how much time and how many passes it requires for each alpha Smile

_________________
... a professor saying: "use this proprietary software to learn computer science" is the same as English professor handing you a copy of Shakespeare and saying: "use this book to learn Shakespeare without opening the book itself.
- Bradley Kuhn
Post 14 Jan 2004, 18:18
View user's profile Send private message Visit poster's website MSN Messenger ICQ Number Reply with quote
Tomasz Grysztar
Assembly Artist


Joined: 16 Jun 2003
Posts: 7310
Location: Kraków, Poland
The current hash algorithm is very unefficient with such kind of synthetical labels like those; on the other hand it works very well with the Win32 equates (which have very variated names). Should I consider trying some other hash generators?
Post 14 Jan 2004, 18:48
View user's profile Send private message Visit poster's website Reply with quote
scientica
Retired moderator


Joined: 16 Jun 2003
Posts: 689
Location: Linköping, Sweden
(I'm in linux now)
I keep getting out of memory errors when trying to compile the 22MB source, it's when there is free memory, only that the RAM is filled with chache and such so it looks like it's full even if it isn't. I can compile the file when there is lite used of the RAM for chache.
(I'm trying to compile the file with the alpha 1 now)

_________________
... a professor saying: "use this proprietary software to learn computer science" is the same as English professor handing you a copy of Shakespeare and saying: "use this book to learn Shakespeare without opening the book itself.
- Bradley Kuhn
Post 14 Jan 2004, 19:47
View user's profile Send private message Visit poster's website MSN Messenger ICQ Number Reply with quote
Randall Hyde



Joined: 03 Dec 2003
Posts: 57
BiDark wrote:

May be I'm the only one who think Masm is still the only multi-pass assembler that can handle the large equate files best. Rolling Eyes


TASM also does a pretty good job, if you manually resize the hash table from the command line.

Other than a synthetic file I've created, I've never noticed FASM running particularly slow. And the output of HLA, fed into FASM, tends to contain some ugly looking labels (long, and often containing lots of the same characters).
Cheers,
Randy Hyde
Post 14 Jan 2004, 20:14
View user's profile Send private message Visit poster's website Reply with quote
Randall Hyde



Joined: 03 Dec 2003
Posts: 57
Privalov wrote:
The current hash algorithm is very unefficient with such kind of synthetical labels like those; on the other hand it works very well with the Win32 equates (which have very variated names). Should I consider trying some other hash generators?


Do keep in mind that macros and other compilers tend to generate synthetic labels (not necessarily like the ones from my "benchmark"). So I would consider that fact in the design of your hash function.

A while back I did generate a little function that created *lots* of random (non-duplicated) labels. Random lengths and random characters (alphanumeric). That might be something you want to consider as a test. Win32 equates are definitely *not* random and might skew any tests you run against other large equate files (e.g., Linux equates).

One last possiblity, if you're simply trying to improve the performance of processing the Win32 equates file, is to consider adding the option of a "precompiled header file" like certain C compilers do. That is, store the symbol table entries for a given include file in some canonical form that's easy to process and just load that into your symbol table with a special directive. There's still the issue of looking up those symbols as they appear in the source file, but at least you'll save a lot of time building the symbol table entries in the first place.

Cheers,
Randy Hyde
Post 14 Jan 2004, 20:20
View user's profile Send private message Visit poster's website Reply with quote
fasm9



Joined: 19 Jun 2003
Posts: 439
First: i am just googling it myself, please somebody clarify it! Wink

1.How about 'dancing trees' algorithm? it is used in reiserfs4.

2.New hash algorithm - FNV

3. Donald E. Knuth's dancing links

--
Interview With the People Behind JFS, ReiserFS & XFS
Post 14 Jan 2004, 21:37
View user's profile Send private message Reply with quote
BiDark



Joined: 22 Jun 2003
Posts: 110
Location: .th
Privalov wrote:

BiDark: probably I could still make parser run slightly faster on your processor by making some loop alignments (there again are none in "alpha 1"), would you help me testing it?


Ok!
Post 15 Jan 2004, 02:27
View user's profile Send private message Reply with quote
scientica
Retired moderator


Joined: 16 Jun 2003
Posts: 689
Location: Linköping, Sweden
Here is the resutls of the compiles of the 22 Mb source:
Code:
#Original 1.50 fasm:
[frekla@ns1 fasm_tmp]$ fasm GEN.ASM GEN.500OUT
flat assembler  version 1.50
1849 passes, 2477.0 seconds, 4281699 bytes.

# the 1.51 alpha1:
[frekla@ns1 fasm_tmp]$ sync && ./fasm501a1 GEN.ASM  GEN.501a1OUT && sync
flat assembler  version 1.50
1849 passes, 2398.7 seconds, 4281699 bytes.
[frekla@ns1 fasm_tmp]$     

Still the same pass count but a whole 78.3 secs faster Smile

[edit]the syncs before and after doesn't affect the compile time - it just fixes parts of an memmory issue, fasm don't seem to be able to allocate memory when the entire RAM is full by 'stuff', buffeer and cache - while other apps don't have this problem.[/edit]

_________________
... a professor saying: "use this proprietary software to learn computer science" is the same as English professor handing you a copy of Shakespeare and saying: "use this book to learn Shakespeare without opening the book itself.
- Bradley Kuhn
Post 15 Jan 2004, 21:27
View user's profile Send private message Visit poster's website MSN Messenger ICQ Number Reply with quote
madmatt



Joined: 07 Oct 2003
Posts: 1046
Location: Michigan, USA
Tried to compile the assembly file generated by the genasm2.exe and neither fasm windowed version nor the fasm dos version would compile
the 20mb behemoth. Each fasm compiler gave me an out of memory error. loading it in the fasm windows version slowed the whole system to a crawl for about a minute. Is there a way to increase the memory available to fasm so this and other programs will compile? BTW, I'm using an AMD Athlon XP 2300+ with windows xp and 384MB of memory.
Post 16 Jan 2004, 00:38
View user's profile Send private message Reply with quote
JohnFound



Joined: 16 Jun 2003
Posts: 3500
Location: Bulgaria
Hm, actually IMHO, this huge file is not very interesting for testing 1.51 alpha 1. The big count of passes shows that it mainly is a problem for the assembling stage, not for the parser (that is improved in this version). For testing the parser, it is good to have a file only with equates - like windows include file mentioned by Privalov a few posts above. In this case, other stages are not heavy loaded and the time is proportional only to parser time. (BTW: you can use Fresh to see exactly how long continue every compiling stage.) You can see in this case that the parser in 1.51 alpha 1 is about 4..5 times faster than the one in 1.50.

Regards.
Post 16 Jan 2004, 00:51
View user's profile Send private message Visit poster's website ICQ Number Reply with quote
Tomasz Grysztar
Assembly Artist


Joined: 16 Jun 2003
Posts: 7310
Location: Kraków, Poland
madmatt: they way to solve the "out of memory" problems depends on which release are you using. All problems generally emerge from the fact, that fasm allocates all the memory at the beginning, and so it tries to allocate as much memory as it's possible, but it's not always so easy. DOS version, for instance, has by default some fixed limited amount of extended memory available, when you execute it from Windows, but you can adjust it in the program properties. The Windows console version currently tries to allocate the half of your physical memory, I did it this way, because with other settings some people were complaining about the unexpectedly growing swap file. If you want it to allocate more, you can change it by editing the init_memory routine in the Win32/system.inc file. And the case of Win32 GUI version (fasmw) is the simplest one, because there you can just adjust amount of memory used for compiler in the configuration dialog.

scientica: and about Linux version - it just simply allocates all the available memory that is indicated to be free by syscall. It never caused any problems for me, but I was not testing it much in the Linux environment. Currently I haven't got any better idea how to allocate the maximum possible amount of memory in Linux, do you have any suggestions? (I am willing to change it in the official release, if you convince that some other solution causes less problems; as I did it with Win32 console version to solve the problems with growing swap).
Post 16 Jan 2004, 19:58
View user's profile Send private message Visit poster's website Reply with quote
Display posts from previous:
Post new topic Reply to topic

Jump to:  
Goto page 1, 2, 3, 4  Next

< Last Thread | Next Thread >
Forum Rules:
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
You cannot attach files in this forum
You can download files in this forum


Copyright © 1999-2019, Tomasz Grysztar.

Powered by rwasa.