flat assembler
Message board for the users of flat assembler.

Index > Main > another speedups

Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8356
Location: Kraków, Poland
Tomasz Grysztar 17 Feb 2005, 02:07
Check out the latest development release (1.59.2) for a few more speedups. I have finally implemented a hash tree algorithm for the preprocessor structures, so that the preprocessor copes with huge numbers of definitions as well as the parser already did. While designing it (it had to differ a bit from the simple algorithm used by the parser, because of preprocessor features like redefinition and restoring), I realized that the one used by the parser could also be improved by changing only a few lines of code, which makes it much less slowed down by hash conflicts. On some synthetic test files it gave me really good results.
Please test this new version. The new algorithms seem to be working smoothly, but some good testing is still needed.
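To illustrate why redefinition and restoring need more than a plain name-to-value table, here is a minimal Python sketch. It is not fasm's implementation (fasm is written in assembly and the post describes a hash tree); the `SymbolTable` class and its method names are hypothetical, and Python's built-in `dict` stands in for the hashing. The point is only that each name has to keep a stack of earlier values so that a restore can bring the previous one back.

```python
class SymbolTable:
    """Hypothetical model of symbols that can be redefined and restored."""

    def __init__(self):
        # name -> list of values, newest last (dict does the hashing here)
        self._stacks = {}

    def define(self, name, value):
        # A redefinition does not overwrite: it pushes on top of the old value.
        self._stacks.setdefault(name, []).append(value)

    def restore(self, name):
        # Drop the newest value; silently ignore names that are not defined.
        stack = self._stacks.get(name)
        if stack:
            stack.pop()
            if not stack:
                del self._stacks[name]

    def lookup(self, name):
        # Return the current (newest) value, or None if the name is undefined.
        stack = self._stacks.get(name)
        return stack[-1] if stack else None


table = SymbolTable()
table.define("x", "1")
table.define("x", "2")   # redefinition
table.restore("x")       # back to the first definition
print(table.lookup("x"))
```

The design choice to show is the per-name stack: lookups only ever touch the top element, so redefinition does not make lookups slower.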
JohnFound



Joined: 16 Jun 2003
Posts: 3499
Location: Bulgaria
JohnFound 17 Feb 2005, 07:09
Really great work. The preprocessor was the slowest part of FASM.
Now the preprocessor is twice as fast as the old one. As a test I used the Fresh sources:

Old preprocessor: 0.669 s
New preprocessor: 0.338 s

Unfortunately, the computer in my office is too fast and the absolute times are too small to take more precise measurements.
I will repeat the tests when I get back home, on slower machines.

Regards.
madmatt



Joined: 07 Oct 2003
Posts: 1045
Location: Michigan, USA
madmatt 17 Feb 2005, 10:55
Very good work! I compiled the sources and could only get the command-line version. Is the win editor version available yet?
MCD



Joined: 21 Aug 2004
Posts: 602
Location: Germany
MCD 17 Feb 2005, 11:33
Where can I get this alpha of the new 64-bit fasm?
JohnFound



Joined: 16 Jun 2003
Posts: 3499
Location: Bulgaria
JohnFound 17 Feb 2005, 11:55
MCD wrote:
Where can I get this alpha of the new 64-bit fasm?


http://flatassembler.net/download.php
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8356
Location: Kraków, Poland
Tomasz Grysztar 17 Feb 2005, 12:22
madmatt wrote:
Very good work! I compiled the sources and could only get the command-line version. Is the win editor version available yet?

Copy the SOURCE directory over the one from the fasmw distribution and you can recompile fasmw.exe with the new core.
pelaillo
Missing in inaction


Joined: 19 Jun 2003
Posts: 878
Location: Colombia
pelaillo 17 Feb 2005, 12:39
Quote:

Unfortunately, the computer in my office is too fast and the absolute times are too small to take more precise measurements.
I will repeat the tests when I get back home, on slower machines.

Yesterday at home I ran a test, and fasm compiles itself in 10.7 seconds
on my *new* 80386DX with 8 MB RAM :) Tomorrow I will publish the results for the improved fasm.
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8356
Location: Kraków, Poland
Tomasz Grysztar 17 Feb 2005, 15:35
However, more optimizations are "on the way". I've just finished another set of nifty improvements to the new preprocessor symbol management, and I still have the macro processor left: it also needs cleaning up and a better look-up algorithm for macro arguments. Only then will I go back to ELF64 support.

If possible, please test thoroughly all the development releases - I would prefer any bug reports before I release the stable (even number) version.
decard



Joined: 11 Sep 2003
Posts: 1092
Location: Poland
decard 17 Feb 2005, 15:42
I got an old P166 (with 64 MB RAM) working, so I could test on that machine. I compiled FASMW and Fresh.

Code:
      | 1.58  | 1.59.2
-----------------------
fasmw | 3.8s  | 2.6s
fresh | 23.2s | 13.8s
-----------------------    


On my PIII 600 I have only tried compiling Fresh, and it took 5 seconds with 1.58 and 3 seconds with 1.59. Actually, right now there are no programs written in FASM big enough to need a faster compiler :D


Last edited by decard on 17 Feb 2005, 16:01; edited 1 time in total
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8356
Location: Kraków, Poland
Tomasz Grysztar 17 Feb 2005, 15:48
Please write the full version number - development versions have two digits. 1.59.3 is going to be faster still. ;)
JohnFound



Joined: 16 Jun 2003
Posts: 3499
Location: Bulgaria
JohnFound 17 Feb 2005, 17:43
OK, here is a more detailed report about FASM v1.59.2:
Code:
Version tested: FASM 1.59.2
Reference version: FASM 1.58

Both compilers are integrated into Fresh v1.1.D, which gives 
more precise time measurement for every compilation stage.

Testing computer:
  CPU: AMD K6-2 450 MHz, overclocked to 500 MHz (x5 100 MHz FSB)
  Memory: 128MBytes
  OS: Windows 98 (without IE)

Test sources:
1. Fresh v1.1.D with FASM 1.59.2 (Fresh)
   Total lines preprocessed: 211950
2. FASMW v1.58 with implemented compiler v1.59.2 (FASMW)
   Total lines preprocessed: 55163
3. Thingamy (current work version) (Thingamy)
   Total lines preprocessed: 73336
+--------+-------------------------+
|        |  Preprocessing time [s] |
+--------+-------------+-----------+
| Source | FASM 1.59.2 | FASM 1.58 |
+--------+-------------+-----------+
| Fresh  |    1.743    |   4.736   |
| FASMW  |    0.301    |   0.768   |
|Thingamy|    0.999    |   1.925   |
+--------+-------------+-----------+    


BTW: Privalov, could you post some brief info about the internal changes since 1.58? When I integrated FASM 1.59.2 into Fresh, some of its extended functions for collecting label information stopped working properly.

Regards
scientica
Retired moderator


Joined: 16 Jun 2003
Posts: 689
Location: Linköping, Sweden
scientica 17 Feb 2005, 18:30
Here are some times from me and my 64-bitter :)
Code:
frekla@zeus elf64 $ time ./fasm64 test_program0.asm 
flat assembler  version 1.59.1
3 passes, 741 bytes.

real    0m0.002s
user    0m0.000s
sys     0m0.001s
-----------------
frekla@zeus elf64 $ time ./fasm64.2 test_program0.asm 
flat assembler  version 1.59.2
3 passes, 741 bytes.

real    0m0.002s
user    0m0.000s
sys     0m0.000s
    

test_program is too small to give any reliable results (multiple runs vary too much, by ±0.002s, which is basically ±100%)
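When a single run is this noisy, one common workaround is to repeat the command several times and compare the minimum or the median instead of one sample. The following is a minimal Python sketch of that idea, not something used in this thread; `time_command` is a hypothetical helper, and the command it times here is only a placeholder (the Python interpreter itself) so that the sketch runs anywhere. Like `real` from `time`, the measured wall-clock time includes process start-up.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, runs=5):
    """Run argv `runs` times; return the wall-clock duration of each run in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return samples


# Placeholder command; substitute the assembler invocation you want to measure.
samples = time_command([sys.executable, "-c", "pass"], runs=3)
print(f"min {min(samples):.4f}s  median {statistics.median(samples):.4f}s")
```

To measure the assembler, the placeholder list would be replaced by something like `["./fasm64", "test_program0.asm"]`. For runs of a couple of milliseconds, even the median mostly reflects process start-up cost, so a larger input is still the better fix.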

So here are timings from a slightly bigger project (no irony, it's still very, very small, tiny really). Please note that not only fasm is invoked but also gcc (among others), but in this phase of development most of the time is spent in fasm - or well, before this release at least :P
Code:
frekla@zeus /mnt/dev/frekla/development/######## $ time make -B    
done! see /mnt/dev/frekla/development/########/compile.log for details.

real    0m1.089s
user    0m0.132s
sys     0m0.148s
(second run gave: "real\t0m0.299s\nuser\t0m0.114s\nsys\t0m0.167s")
---------------------------
frekla@zeus /mnt/dev/frekla/development/######## $ time make -B AS="/mnt/dev/frekla/development/fasm/elf64/fasm64"
done! see /mnt/dev/frekla/development/########/compile.log for details.

real    0m0.291s
user    0m0.119s
sys     0m0.152s    

(first version is 1.57 the second 1.59.2)

I'm on Gentoo, using Linux 2.6.10-ck5. The CPU is an AMD Athlon(tm) 64 Processor 3200+ (2202.918 MHz), running in long mode, but the kernel has IA32 support enabled (so fasm runs as 32-bit), with 1 GB RAM.

btw, privalov, could you send me that big 400k labels-test file of yours? (I should get more reliable timings from that; the above files are small and therefore don't give very accurate timings)
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8356
Location: Kraków, Poland
Tomasz Grysztar 17 Feb 2005, 21:00
JohnFound wrote:
BTW: Privalov, could you post some brief info about the internal changes since 1.58? When I integrated FASM 1.59.2 into Fresh, some of its extended functions for collecting label information stopped working properly.

The changes in the preprocessor are quite wide, and large parts of my guide to internals are becoming obsolete. Please tell me which features you are using and I will check why they no longer work (it might also be because some of the features are not complete in 1.59.2; I have fixed them later).
There is also a small change in the parser. The first double word of the parser's label structure used to contain the hash value, but now it contains the pointer to the next label entry with the same hash; if there are no more such entries, it contains zero.
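The layout described above, where the first field of each entry points to the next entry with the same hash, is the usual way to chain colliding entries in a hash table. A minimal Python sketch of that general technique follows. It is not fasm's code: `LabelEntry`, `LabelTable`, the bucket count and the FNV-1a hash are all assumptions chosen for illustration, and `None` plays the role of the zero pointer.

```python
class LabelEntry:
    def __init__(self, name, next_entry):
        # First field: next entry with the same hash, or None (the "zero").
        self.next = next_entry
        self.name = name


class LabelTable:
    def __init__(self, bucket_count=1024):
        # One chain head per hash bucket; None means the chain is empty.
        self.buckets = [None] * bucket_count

    def _index(self, name):
        # 32-bit FNV-1a, an assumed hash; fasm's own function may differ.
        h = 2166136261
        for byte in name.encode("ascii"):
            h = ((h ^ byte) * 16777619) & 0xFFFFFFFF
        return h % len(self.buckets)

    def find_or_add(self, name):
        i = self._index(name)
        entry = self.buckets[i]
        # Walk only the chain of entries that share this hash bucket.
        while entry is not None:
            if entry.name == name:
                return entry
            entry = entry.next
        # Not found: link a new entry in front of the existing chain.
        entry = LabelEntry(name, self.buckets[i])
        self.buckets[i] = entry
        return entry


table = LabelTable()
first = table.find_or_add("label000001")
again = table.find_or_add("label000001")
print(first is again)
```

With this layout a lookup compares names only against entries in the same bucket, so a hash conflict costs one extra chain step instead of a wider scan.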
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8356
Location: Kraków, Poland
Tomasz Grysztar 17 Feb 2005, 21:06
scientica wrote:
btw, privalov, could you send me that big 400k labels-test file of yours? (I should get more reliable timings from that; the above files are small and therefore don't give very accurate timings)

I was just making it this way:
Code:
repeat 400000
 db 'label'
 db '0'+(%/100000) mod 10
 db '0'+(%/10000) mod 10
 db '0'+(%/1000) mod 10
 db '0'+(%/100) mod 10
 db '0'+(%/10) mod 10
 db '0'+(%/1) mod 10
 db ' = 0',13,10
 db 'db label'
 db '0'+(%/100000) mod 10
 db '0'+(%/10000) mod 10
 db '0'+(%/1000) mod 10
 db '0'+(%/100) mod 10
 db '0'+(%/10) mod 10
 db '0'+(%/1) mod 10
 db 13,10
end repeat     

You can also replace the = with EQU to test the new preprocessor engine, but beware when using earlier fasm versions - they were in no way adapted to handle such huge amounts of preprocessor symbols in a reasonable time.
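For readers without fasm at hand, the same test source can be produced with a short script. This is a Python sketch of an equivalent generator, assuming the `repeat` counter `%` runs from 1 to 400000; the function name `generate` and its parameters are hypothetical. Each iteration emits 33 bytes with `=`, which for 400000 iterations gives 13200000 bytes.

```python
def generate(count=400_000, op="="):
    """Build the test source: a definition and a use for each numbered label."""
    lines = []
    for i in range(1, count + 1):
        # Six zero-padded decimal digits, like the six "db '0'+..." lines.
        name = f"label{i:06d}"
        lines.append(f"{name} {op} 0\r\n")   # e.g. "label000001 = 0"
        lines.append(f"db {name}\r\n")       # e.g. "db label000001"
    return "".join(lines).encode("ascii")


sample = generate(count=2)
print(sample.decode("ascii"), end="")
```

Passing `op="equ"` gives the preprocessor variant (35 bytes per iteration). To get a file for the assembler, the returned bytes can be written out in binary mode so that the CR LF line ends are kept as generated.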
scientica
Retired moderator


Joined: 16 Jun 2003
Posts: 689
Location: Linköping, Sweden
scientica 17 Feb 2005, 22:22
ok, here are the timings:
Code:
[10:37:08]frekla@zeus elf64 $ time ./fasm64 -m 64000 400kLabelGen.asm 
flat assembler  version 1.59.2
1 passes, 1.2 seconds, 13200000 bytes.

real    0m1.233s
user    0m1.172s
sys     0m0.044s
frekla@zeus elf64 $ time ./fasm64 -m 128000 400kLabelGen.bin 400kLabelGen.bin.out 
flat assembler  version 1.59.2
1 passes, 1.9 seconds, 400000 bytes.

real    0m1.967s
user    0m1.850s
sys     0m0.096s
frekla@zeus elf64 $ time ./fasm64 -m 64000 400kLabelGenEqu.asm 
flat assembler  version 1.59.2
1 passes, 1.2 seconds, 14000000 bytes.

real    0m1.234s
user    0m1.186s
sys     0m0.041s
frekla@zeus elf64 $ time ./fasm64 -m 128000 400kLabelGenEqu.bin 400kLabelGenEqu.bin.out 
flat assembler  version 1.59.2
1 passes, 1.2 seconds, 400000 bytes.

real    0m1.278s
user    0m1.137s
sys     0m0.090s
frekla@zeus elf64 $ diff -s 400kLabelGenEqu.bin.out 400kLabelGen.bin.out 
Files 400kLabelGenEqu.bin.out and 400kLabelGen.bin.out are identical    

(400kLabelGenEqu.asm is using 'equ' instead of '=')

and just for the fun of it, I made one version with fix instead of equ:
Code:
frekla@zeus elf64 $time ./fasm64 -m 64000 400kLabelGenFix.asm 
flat assembler  version 1.59.2
1 passes, 1.2 seconds, 14000000 bytes.

real    0m1.230s
user    0m1.177s
sys     0m0.043s
frekla@zeus elf64 $time ./fasm64 -m 128000 400kLabelGenFix.bin 400kLabelGenFix.bin.out 
flat assembler  version 1.59.2
1 passes, 0.9 seconds, 400000 bytes.

real    0m1.001s
user    0m0.881s
sys     0m0.099s
frekla@zeus elf64 $ diff -s 400kLabelGenFix.bin.out  400kLabelGen.bin.out 
Files 400kLabelGenFix.bin.out and 400kLabelGen.bin.out are identical
frekla@zeus elf64 $ ll *.out
-rw-r--r--  1 frekla users 400000 Feb 17 22:38 400kLabelGen.bin.out
-rw-r--r--  1 frekla users 400000 Feb 17 22:40 400kLabelGenEqu.bin.out
-rw-r--r--  1 frekla users 400000 Feb 17 22:44 400kLabelGenFix.bin.out    


The timings are quite reasonable to me, but yeah 1.2 seconds is an eternity :P ;)

Oh, almost forgot, comparison values (I didn't run the fix one with 1.57; it should give about the same time as equ, right?):
Code:
frekla@zeus elf64 $ time fasm -m 128000 400kLabelGen.bin 400kLabelGen.bin.out-1.57
flat assembler  version 1.57
1 passes, 4.1 seconds, 400000 bytes.

real    0m4.298s
user    0m4.020s
sys     0m0.098s
frekla@zeus elf64 $ time fasm -m 128000 400kLabelGenEqu.bin 400kLabelGenEgu.bin.out-1.57
flat assembler  version 1.57
1 passes, 1015.5 seconds, 400000 bytes.

real    16m55.522s
user    16m37.155s
sys     0m0.685s    

Unless I made an error in the math, it's roughly a 1:795 ratio (1.59.2 to 1.57)
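The ratio can be checked from the `real` times in the listings above. A small Python sketch of the arithmetic follows; the variable names are only for illustration. The equ case comes out at roughly 795 to 1, as stated, while the plain `=` case is only about 2.2 to 1.

```python
def seconds(minutes, secs):
    """Convert the "XmY.YYYs" format printed by `time` into seconds."""
    return minutes * 60 + secs


# `real` times of the second assembly step (the 400000-byte output).
equ_157 = seconds(16, 55.522)    # fasm 1.57, equ version
equ_1592 = seconds(0, 1.278)     # fasm 1.59.2, equ version
plain_157 = seconds(0, 4.298)    # fasm 1.57, '=' version
plain_1592 = seconds(0, 1.967)   # fasm 1.59.2, '=' version

equ_ratio = equ_157 / equ_1592
plain_ratio = plain_157 / plain_1592
print(f"equ: {equ_ratio:.0f}x faster, '=': {plain_ratio:.1f}x faster")
```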

_________________
... a professor saying: "use this proprietary software to learn computer science" is the same as English professor handing you a copy of Shakespeare and saying: "use this book to learn Shakespeare without opening the book itself."
- Bradley Kuhn
JohnFound



Joined: 16 Jun 2003
Posts: 3499
Location: Bulgaria
JohnFound 17 Feb 2005, 23:15
Privalov wrote:
There is also a small change in the parser. The first double word of the parser's label structure used to contain the hash value, but now it contains the pointer to the next label entry with the same hash; if there are no more such entries, it contains zero.


This was the problem, not the preprocessor. Thank you. The problem is within LabelsList.asm, the module that creates the list of labels for the debugger and other IDE purposes after the parsing stage.
Until now, I used the hash value as a flag that the label is not an anonymous (@@) label. Now in these records the second dword of the label structure holds not a pointer to the name but some strange number. What is this number, and how can an anonymous label be identified without using the hash value in the first dword?

Regards
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8356
Location: Kraków, Poland
Tomasz Grysztar 17 Feb 2005, 23:39
It is just garbage; this field was not initialized at all for anonymous labels. I can make it always zero if that suits you.

BTW, can you also check how the changes I made when adding x86-64 support affected the speed of assembler module?
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8356
Location: Kraków, Poland
Tomasz Grysztar 18 Feb 2005, 00:02
I've just uploaded the 1.59.3 - ready for testing.
scientica
Retired moderator


Joined: 16 Jun 2003
Posts: 689
Location: Linköping, Sweden
scientica 18 Feb 2005, 05:19
new timings:
Code:
[06:17:51]frekla@zeus elf64 $ time ./fasm64 -m 64000 400kLabelGen.asm
flat assembler  version 1.59.3
1 passes, 1.2 seconds, 13200000 bytes.

real    0m1.292s
user    0m1.178s
sys     0m0.043s
[06:17:52]frekla@zeus elf64 $ time ./fasm64 -m 128000 400kLabelGen.bin 400kLabelGen.bin.out
flat assembler  version 1.59.3
1 passes, 1.9 seconds, 400000 bytes.

real    0m1.980s
user    0m1.673s
sys     0m0.140s
[06:17:54]frekla@zeus elf64 $ time ./fasm64 -m 64000 400kLabelGenEqu.asm 
flat assembler  version 1.59.3
1 passes, 1.2 seconds, 14000000 bytes.

real    0m1.256s
user    0m1.175s
sys     0m0.050s
[06:17:55]frekla@zeus elf64 $ time ./fasm64 -m 128000 400kLabelGenEqu.bin 400kLabelGenEqu.bin.out 
flat assembler  version 1.59.3
1 passes, 0.8 seconds, 400000 bytes.

real    0m0.893s
user    0m0.803s
sys     0m0.077s
[06:17:56]frekla@zeus elf64 $ time ./fasm64 -m 64000 400kLabelGenFix.asm
flat assembler  version 1.59.3
1 passes, 1.2 seconds, 14000000 bytes.

real    0m1.259s
user    0m1.181s
sys     0m0.040s
[06:17:58]frekla@zeus elf64 $ time ./fasm64 -m 128000 400kLabelGenFix.bin 400kLabelGenFix.bin.out 
flat assembler  version 1.59.3
1 passes, 0.8 seconds, 400000 bytes.

real    0m0.819s
user    0m0.730s
sys     0m0.074s    
JohnFound



Joined: 16 Jun 2003
Posts: 3499
Location: Bulgaria
JohnFound 18 Feb 2005, 05:23
Privalov wrote:
It is just garbage; this field was not initialized at all for anonymous labels. I can make it always zero if that suits you.


I just integrated FASM 1.59.3 into Fresh and everything works OK (including "variable.inc"). Thanks.

Quote:
BTW, can you also check how the changes I made when adding x86-64 support affected the speed of assembler module?


Actually, my impression is that the speed of the assembler module is the same or even insignificantly faster. Anyway, the difference is within the measurement error. I will try to test it once again tonight, on my home computer.

Regards.
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
