flat assembler
Message board for the users of flat assembler.

alpha testing - speed improvements

madmatt



Joined: 07 Oct 2003
Posts: 1045
Location: Michigan, USA
madmatt 16 Jan 2004, 20:38
Hi Privalov, I am using FASM 1.50, the DOS version. I've looked over the system.inc file and made a couple of modifications, but I still get out of memory errors. I noticed that you may be using the older XMS BIOS interrupts in the DOS version, which, if I am correct, only support a maximum return value of 64MB of XMS memory. I'll have to check this against "Ralf Brown's Interrupt List". I know there are newer functions that go beyond the 64MB barrier. I'll look into this and see if it works, and I could post the BIOS functions in this topic if you like.
Thanks :),
Madmatt

P.S. I'll try your suggestions with FASMW/C versions too!
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8351
Location: Kraków, Poland
Tomasz Grysztar 21 Jan 2004, 08:18
The new alpha is here. Only one little update - I've changed the hash algorithm, but with large numbers of labels it proves to be a significant change.
I have used the following fasm code to generate the test source with a huge number of synthetic labels:
Code:
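; Emit the 8-digit uppercase hexadecimal form of the outer repeat counter.
macro num
{
  n = %-1                          ; % here is the counter of the enclosing repeat
  repeat 8
   d = (n shr ((8-%)*4)) and 0Fh   ; % here counts digits, most significant first
   if d > 9
    d = 'A'+d-10
   else
    d = '0'+d
   end if
   db d
  end repeat
}

; Output one line of the form "lblXXXXXXXX:" per iteration, ending with CR LF.
repeat 50000
db 'lbl'
num
db ':',13,10
end repeat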

In this version it generates 50000 labels, but even with 100000, parsing with "alpha 2" was quite fast on my machine. Please test it!
JohnFound



Joined: 16 Jun 2003
Posts: 3499
Location: Bulgaria
JohnFound 21 Jan 2004, 09:02
Wow, great work. Alpha 2 is really fast on my work machine (PII 450), especially for your example with serially numbered labels:

For 50000 labels:

alpha 1 - 25 seconds
alpha 2 - 1 second

Is this because of a faster hash algorithm, or because of better distributed hash values? And do you think this will give some performance gain in standard applications?

Regards.
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8351
Location: Kraków, Poland
Tomasz Grysztar 21 Jan 2004, 09:10
This is mainly because of much better distributed hash (I have used the FNV-1a algorithm), and so it'll affect only really large sources.
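For readers who want to see what FNV-1a does, here is a minimal Python sketch. It is not fasm's code: the 32-bit width, the 4096-bucket table, and the helper names (`fnv1a_32`, `byte_sum`, `largest_bucket`) are assumptions made for illustration only. The constants are the standard 32-bit FNV offset basis and prime. The deliberately weak `byte_sum` hash is there just to show how serially numbered labels can pile up in a few buckets when the hash distributes poorly.

```python
FNV_OFFSET_BASIS = 0x811C9DC5  # standard 32-bit FNV offset basis
FNV_PRIME = 0x01000193         # standard 32-bit FNV prime


def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a: XOR each byte in, then multiply by the FNV prime."""
    h = FNV_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & 0xFFFFFFFF  # keep the result within 32 bits
    return h


def byte_sum(data: bytes) -> int:
    """Deliberately weak hash (sum of bytes), used only for comparison."""
    return sum(data)


def largest_bucket(hash_fn, names, buckets: int) -> int:
    """Return the size of the fullest bucket when names are hashed into a table."""
    counts = [0] * buckets
    for name in names:
        counts[hash_fn(name) % buckets] += 1
    return max(counts)


# The same kind of names as in the test source: lbl00000000, lbl00000001, ...
names = [b"lbl%08X" % n for n in range(50000)]

# Hypothetical table size, chosen only for this comparison.
print("byte sum:", largest_bucket(byte_sum, names, 4096))
print("FNV-1a:  ", largest_bucket(fnv1a_32, names, 4096))
```

The fuller the worst bucket, the longer the chain that has to be searched for each label, so a well distributed hash keeps lookups short even with tens of thousands of names.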
JohnFound



Joined: 16 Jun 2003
Posts: 3499
Location: Bulgaria
JohnFound 21 Jan 2004, 09:57
Privalov wrote:
...and so it'll affect only really large sources.


Well, that is good, because I hope projects written in assembly will become bigger and bigger. :D
BTW: IMHO, the bottleneck now is the preprocessor; it is still slow when compiling multi-file projects (I mean Fresh ;) but it is valid for big modular projects in general). What do you think about it?

Regards.
decard



Joined: 11 Sep 2003
Posts: 1092
Location: Poland
decard 21 Jan 2004, 10:57
mike.dld wrote:
bad test example or what [tested 5 times]?

This source is only there to generate the actual test file. It will generate a source file that contains 50000 labels. Now just assemble this output file.
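To make the two-step procedure concrete, here is a rough Python equivalent of what the generator source emits: one line per label, named "lbl" plus an 8-digit uppercase hexadecimal counter starting at zero, followed by a colon and CR LF. The function names and the output file name in the usage note are hypothetical; only the label format comes from the fasm code posted above.

```python
def label_name(n: int) -> str:
    """Name of the n-th label: 'lbl' plus n as 8 uppercase hex digits."""
    return "lbl%08X" % n


def generate_source(count: int) -> bytes:
    """Build the test source: one 'lblXXXXXXXX:' line per label, CR LF terminated."""
    lines = [label_name(n) + ":\r\n" for n in range(count)]
    return "".join(lines).encode("ascii")


def write_source(path: str, count: int) -> None:
    """Write the generated test source to a file (binary mode keeps CR LF intact)."""
    with open(path, "wb") as f:
        f.write(generate_source(count))
```

Calling `write_source("labels.asm", 50000)` (the file name is only an example) would produce the file that is then assembled in the second step to measure the timings.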
Tommy



Joined: 17 Jun 2003
Posts: 489
Location: Norway
Tommy 21 Jan 2004, 15:21
Very good work, Privalov!!! 8) Mine improved from 15.4 secs to 0.2 secs! ;) Very impressive!!! Keep it up!

Cheers,
Tommy
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8351
Location: Kraków, Poland
Tomasz Grysztar 21 Jan 2004, 16:02
JohnFound: It's very hard to do any optimizations in the preprocessor due to its complexity. But here's my first try - I have added hashing for the symbolic constants and aligned a few loops. Please test whether it does the preprocessing any faster for you.
I could also add hashing for macro names, but in my tests it did not affect compilation speed at all.
JohnFound



Joined: 16 Jun 2003
Posts: 3499
Location: Bulgaria
JohnFound 21 Jan 2004, 16:42
Hi Privalov.
Good work. On my machine the performance gain for the preprocessor is about 25%.

Here are times for Fresh sources:

alpha 2 - preprocessing - 4.973s
alpha 3 - preprocessing - 3.759s

Regards
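As a quick check of the "about 25%" figure, the two preprocessing times quoted above can be turned into a percentage reduction and a speedup factor. This is only arithmetic on the numbers from this post; the function names are made up for the example.

```python
def percent_reduction(before: float, after: float) -> float:
    """Share of the original time that was saved, in percent."""
    return (before - after) / before * 100.0


def speedup(before: float, after: float) -> float:
    """How many times faster the new run is."""
    return before / after


alpha2 = 4.973  # seconds, preprocessing of Fresh sources with alpha 2
alpha3 = 3.759  # seconds, preprocessing of Fresh sources with alpha 3

print("time saved: %.1f%%" % percent_reduction(alpha2, alpha3))
print("speedup:    %.2fx" % speedup(alpha2, alpha3))
```

The time saved comes out at roughly 24.4%, which matches the rounded figure given in the post.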
mike.dld



Joined: 03 Oct 2003
Posts: 235
Location: Belarus, Minsk
mike.dld 21 Jan 2004, 17:36
I feel like a fool... :oops: Could someone please delete my previous post?

Trying again:

P4-2800, 256, Win2k sp4

50000 labels:

Code:
flat assembler  version 1.50
1 passes, 1.7 seconds, 0 bytes.
flat assembler  version 1.51 alpha 1
1 passes, 1.2 seconds, 0 bytes.
flat assembler  version 1.51 alpha 2
1 passes, 0 bytes.    


238308 labels:

Code:
flat assembler  version 1.50
1 passes, 117.9 seconds, 0 bytes.
flat assembler  version 1.51 alpha 1
1 passes, 93.9 seconds, 0 bytes.
flat assembler  version 1.51 alpha 2
1 passes, 1.0 seconds, 0 bytes.    


238309 labels:

Code:
flat assembler  version 1.50
error: out of memory.
flat assembler  version 1.51 alpha 1
1 passes, 94.0 seconds, 0 bytes.
flat assembler  version 1.51 alpha 2
1 passes, 1.0 seconds, 0 bytes.    
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8351
Location: Kraków, Poland
Tomasz Grysztar 21 Jan 2004, 18:34
scientica: you can try to reduce the pass count on that huge source by using the same trick I recently proposed for Fresh. I've tested it with alpha 3, and it finally does it in "acceptable" time.
vid
Verbosity in development


Joined: 05 Sep 2003
Posts: 7105
Location: Slovakia
vid 21 Jan 2004, 19:39
Privalov: are you going to make any low-level optimizations?
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8351
Location: Kraków, Poland
Tomasz Grysztar 21 Jan 2004, 19:47
Not really, they would be too processor-dependent.
vid
Verbosity in development


Joined: 05 Sep 2003
Posts: 7105
Location: Slovakia
vid 21 Jan 2004, 19:54
I meant general low-level optimizations.
For example, "lods word [esi]" in get_size_operator (which is used by half of the instructions) is slower on almost all processors than doing the same thing with mov and add/inc. I meant things like that.
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8351
Location: Kraków, Poland
Tomasz Grysztar 21 Jan 2004, 20:31
On most, but not on all. And I was writing it on an 80386. ;)
scientica
Retired moderator


Joined: 16 Jun 2003
Posts: 689
Location: Linköping, Sweden
scientica 21 Jan 2004, 21:34
Timings from alpha 2:
Code:
[frekla@ns1 fasm_tmp]$ ./fasm501a2 GEN.ASM GEN.501a2OUT
flat assembler  version 1.51 alpha 2
1849 passes, 2154.9 seconds, 4281699 bytes.    

243.8 secs faster than alpha 1 (322.1 secs (~5.5 min) faster than 1.50)

Privalov, I don't know how to solve the memory issue (but it's present for me; it might be due to my uptime of 8d 7.5h, with lots of stuff in RAM). You can't just use malloc(), can you? (I think that would work.)

_________________
... a professor saying: "use this proprietary software to learn computer science" is the same as English professor handing you a copy of Shakespeare and saying: "use this book to learn Shakespeare without opening the book itself."
- Bradley Kuhn
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8351
Location: Kraków, Poland
Tomasz Grysztar 22 Jan 2004, 07:04
I have tried modifying memory allocation under Linux to be a bit more aggressive - please test how it works now.
scientica
Retired moderator


Joined: 16 Jun 2003
Posts: 689
Location: Linköping, Sweden
scientica 23 Jan 2004, 18:23
It seems to work fine now :)
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8351
Location: Kraków, Poland
Tomasz Grysztar 23 Jan 2004, 19:26
This time I'm trying to speed up the assembler module a bit. Please let me know whether there is any visible difference for you.
decard



Joined: 11 Sep 2003
Posts: 1092
Location: Poland
decard 23 Jan 2004, 20:12
Well, this time it wasn't really faster - Fresh compiled about 0.5 sec faster on my machine.
Copyright © 1999-2024, Tomasz Grysztar.