flat assembler
Message board for the users of flat assembler.
Why LOCK can make it faster?
revolution 22 Sep 2023, 02:00
They both perform appallingly badly. Using lock makes it slightly less appalling, but still far short of anything near optimal.
You are writing to your active code cache line. The CPU has to assume you are doing self-modifying code (SMC) and forces a cache reload and/or pipeline flush on every cycle. Try this code instead.
Code:
; for P in cs ds es lock ; do fasm -d PREFIX=$P l4m2.asm && time ./l4m2 ; done
format ELF64 executable 3
op equ add
_start:
    mov eax, 10000000
a:
    PREFIX op dword [msg], 1
    PREFIX op dword [msg], 2
    PREFIX op dword [msg], 3
    PREFIX op dword [msg], 4
    PREFIX op dword [msg], 5
    PREFIX op dword [msg], 1
    PREFIX op dword [msg], 2
    PREFIX op dword [msg], 3
    PREFIX op dword [msg], 4
    PREFIX op dword [msg], 5
    dec eax
    jnz a
    mov eax, 1
    mov ebx, 0
    int 0x80
segment writeable    ; msg now lives in its own segment, away from the code
align 4
msg dd 0
As for the specific question posed in the title: I don't know. CPU internals are weird and somewhat unpredictable. As a general rule, keep your cache happy and things usually go much smoother. Don't mix code with data. Specific results will always vary.
revolution 22 Sep 2023, 06:34
It is also possible to see lock perform worse than cs if the alignment and position of msg are changed.
Code:
; for P in cs lock ; do fasm -d PREFIX=$P l4m2.asm && time ./l4m2 ; done
format ELF64 executable 3
op equ add
_start:
    mov eax, 10000000
    align 64
msg dd 0x90909090
a:
    PREFIX op dword [msg], 1
    PREFIX op dword [msg], 2
    PREFIX op dword [msg], 3
    PREFIX op dword [msg], 4
    PREFIX op dword [msg], 5
    PREFIX op dword [msg], 1
    PREFIX op dword [msg], 2
    PREFIX op dword [msg], 3
    PREFIX op dword [msg], 4
    PREFIX op dword [msg], 5
    dec eax
    jnz a
    mov eax, 1
    mov ebx, 0
    int 0x80
Code:
~ for P in cs lock ; do fasm -d PREFIX=$P l4m2.asm && time ./l4m2 ; done
flat assembler version 1.73.31 (16384 kilobytes memory)
1 passes, 228 bytes.

real 0m3.399s
user 0m3.344s
sys 0m0.000s
flat assembler version 1.73.31 (16384 kilobytes memory)
1 passes, 228 bytes.

real 0m4.269s
user 0m4.176s
sys 0m0.008s
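Since the result depends on where msg sits relative to the loop, it may help to have the assembler report that at build time. The following is only a sketch, not part of the original posts: it is a cut-down version of the listing above with lock hard-coded, and the names CACHE_LINE and loop_end are introduced here purely for illustration. The 64-byte line size is an assumption about the CPU.
Code:
```asm
; Sketch only: reduced listing plus an assembly-time cache-line check.
; CACHE_LINE and loop_end are hypothetical names added for this example.
format ELF64 executable 3

CACHE_LINE = 64                     ; assumed cache line size in bytes

_start:
        mov     eax, 10000000
        align   CACHE_LINE
msg     dd      0x90909090          ; four NOP bytes, so execution falls through into the loop
a:
        lock add dword [msg], 1
        lock add dword [msg], 2
        dec     eax
        jnz     a
loop_end:                           ; marks the first byte after the loop
        mov     eax, 1              ; exit through int 0x80, as in the listing above
        mov     ebx, 0
        int     0x80

; Assembly-time check: does msg fall inside any cache line the loop occupies?
; This emits no bytes; it only prints a message while assembling.
if msg / CACHE_LINE >= a / CACHE_LINE & msg / CACHE_LINE <= (loop_end - 1) / CACHE_LINE
        display 'msg shares a cache line with the loop', 13, 10
else
        display 'msg is outside the cache lines of the loop', 13, 10
end if
```
With this layout msg sits 4 bytes before a in the same 64-byte block, so the first message is the one to expect. Moving msg into its own writeable segment should switch it to the second. The check says nothing about timing; it only confirms the layout you are actually measuring.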
l4m2 22 Sep 2023, 06:37
revolution wrote:
    You are writing to your active code cache line.
Code:
db ? dup 1024
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.