flat assembler
Message board for the users of flat assembler.
LocoDelAssembly
In case not everyone has seen the warnings I added, I'm posting this: I made a very stupid mistake and printed the results in reverse order... So, at least on my PC, CMOV draws or wins, but never loses.
baldr
LocoDelAssembly,
That's quite a predictable result. How about an equivalent of cmp eax, 1 / sbb eax, eax / and eax, VALUE with cmov? I'm still trying to contrive something compact for if (eax==0) eax = VALUE; else eax = 0;
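For readers who want to check the cmp/sbb/and idiom mentioned in this post without an assembler at hand, here is a minimal C model. The function name `select_if_zero` and the explicit `carry` variable are illustrative assumptions, not something from the thread; the sketch only mirrors the flag arithmetic and says nothing about speed.

```c
#include <stdint.h>

/* Hypothetical C model of: cmp eax, 1 / sbb eax, eax / and eax, VALUE.
   Returns VALUE when eax == 0, otherwise 0. */
uint32_t select_if_zero(uint32_t eax, uint32_t value)
{
    uint32_t carry = (eax < 1u);   /* cmp eax, 1: CF is set only when eax == 0 */
    eax = 0u - carry;              /* sbb eax, eax: eax - eax - CF, so all ones or 0 */
    eax &= value;                  /* and eax, VALUE */
    return eax;
}
```

Calling `select_if_zero(0, VALUE)` yields `VALUE`, and any non-zero input yields 0, which matches the `if (eax==0) eax = VALUE; else eax = 0;` behavior being discussed.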
LocoDelAssembly
Too many mistakes today.
Yep, we never tested the same functionality: the code that was supposed to be equivalent to the CMOV version clears the register where CMOV leaves it intact. So, let's agree on the functionality: are we pursuing the behavior of the CMOV code we currently have, or the other?
baldr
LocoDelAssembly,
Previously tested snippets were equivalent: they both implement if (eax!=0) eax = VALUE; I still have not found a simple and/or elegant cmov solution for if (eax==0) eax = VALUE; else eax = 0; Only something like this:
Code:
    mov ecx, VALUE
    test eax, eax
    cmovnz eax, ecx
    xor eax, ecx
Interestingly, nobody questions usefulness of these transformations yet.
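The cmovnz/xor snippet in this post can be traced by hand with a small C model. This is only a sketch: the function name `cmov_xor_select` and the `zf` variable are made up for illustration, and the `if` merely stands in for the conditional move.

```c
#include <stdint.h>

/* Hypothetical C model of:
   mov ecx, VALUE / test eax, eax / cmovnz eax, ecx / xor eax, ecx */
uint32_t cmov_xor_select(uint32_t eax, uint32_t value)
{
    uint32_t ecx = value;   /* mov ecx, VALUE */
    int zf = (eax == 0);    /* test eax, eax: ZF is set when eax is zero */
    if (!zf)
        eax = ecx;          /* cmovnz eax, ecx: copy only when ZF is clear */
    eax ^= ecx;             /* xor eax, ecx: VALUE^VALUE = 0, 0^VALUE = VALUE */
    return eax;
}
```

When the move is skipped, eax is necessarily 0, so the final xor produces `VALUE`; when the move happens, the xor cancels it to 0.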
LocoDelAssembly
baldr, although both do that when EAX is non-zero, they differ when EAX is zero: one leaves the previous value and the other clears EAX. If the first behavior is expected then CMOV will be very hard to beat; otherwise, perhaps the other variant will always win unless something very clever around CMOV appears.
I think your code won't work properly when cmovnz doesn't move, but when it does, XOR will do its magic. To put both on an equal footing, perhaps this should be simulated instead:
Code:
    IF REG = 0 THEN
        REG = VALUE1
    ELSE
        REG = VALUE2
    END IF
revolution
baldr wrote: Interestingly, nobody questions usefulness of these transformations yet.
Personally I find it "interesting" that you are using isolated synthetic benchmarks on one system to judge the relative value of the codes.
baldr
LocoDelAssembly,
When cmovnz does not move, eax is 0. So the magic is still there. For your proposal here is the solution:
Code:
    neg reg
    sbb reg, reg
    and reg, VALUE1 xor VALUE2
    xor reg, VALUE1
revolution,
You're right, it's "just for kicks".
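The neg/sbb/and/xor sequence in this post is compact enough that its two cases are easy to mix up, so here is a rough C model of it. The function name `select_by_zero` and the `carry` variable are assumptions made for the sketch; `VALUE1 xor VALUE2` is a constant folded by the assembler, which the model computes at run time instead.

```c
#include <stdint.h>

/* Hypothetical C model of:
   neg reg / sbb reg, reg / and reg, VALUE1 xor VALUE2 / xor reg, VALUE1
   Returns VALUE1 when reg == 0, otherwise VALUE2. */
uint32_t select_by_zero(uint32_t reg, uint32_t value1, uint32_t value2)
{
    uint32_t carry = (reg != 0);  /* neg reg: CF is set unless reg was 0 */
    reg = 0u - carry;             /* sbb reg, reg: all ones if CF was set, else 0 */
    reg &= value1 ^ value2;       /* and reg, VALUE1 xor VALUE2 */
    reg ^= value1;                /* xor reg, VALUE1 */
    return reg;
}
```

For reg == 0 the mask is 0, so the last xor leaves `VALUE1`; for any other input the mask keeps `VALUE1 ^ VALUE2`, and xoring with `VALUE1` leaves `VALUE2`.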
LocoDelAssembly
I'm so affected by a recent exam in my faculty
Borsuc
baldr wrote: Double cmov will surely beat it for r/m.
Code:
    test eax, eax
    mov eax, [VALUE2]
    cmovz eax, [VALUE1]
_________________
Previously known as The_Grey_Beast
baldr
Borsuc,
Not sure. Probably cmov won't read r/m if the condition is false? I tend to minimize unnecessary memory access.
revolution
Unfortunately cmovcc does read all arguments before processing the instruction.
This means you can't do this to avoid reading a null pointer:
Code:
    cmp ebx,0
    cmovnz eax,[ebx]
baldr
revolution,
It has to be something to do with speculative execution. Intel's pseudocode for cmov clearly shows that src is read in any case, and the AMD64 APM states:
AMD64 Architecture Programmer's Manual (rev. 3.14, Sep 2007), Volume 3 wrote: If the condition is not satisfied, the instruction has no effect.
revolution
This is easy to test:
Code:
    include 'win32ax.inc'
    begin:
        mov ebx,0
        cmp ebx,0
        cmovnz eax,[ebx]
        invoke ExitProcess,0
    .end begin
LocoDelAssembly
I got an exception; however, I think I've read somewhere in the manual that the memory cycle is always initiated but is stopped just before making the read/write.
MHajduk
LocoDelAssembly wrote: To put both in equal positions perhaps this should be simulated instead:
Code:
    test eax, eax
    setnz al
    movzx eax, al
    mov eax, [Values + 4*eax]
Code:
    Values dd VALUE1, VALUE2
Code:
    include 'win32ax.inc'
    VALUE1 = 0xABC
    VALUE2 = 0xDEF
    start:
        test eax, eax
        setnz al
        movzx eax, al
        mov eax, [Values + 4*eax]
        invoke ExitProcess, 0
    Values dd VALUE1, VALUE2
    .end start
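The setnz plus table lookup in this post amounts to indexing a two-entry array with a 0/1 flag. A minimal C sketch of the same idea follows; the names `VALUES` and `select_from_table` are illustrative assumptions, and the constants are simply the 0xABC and 0xDEF used in the post.

```c
#include <stdint.h>

/* Same layout as "Values dd VALUE1, VALUE2" in the post. */
const uint32_t VALUES[2] = { 0xABCu, 0xDEFu };

/* Hypothetical C model of:
   test eax, eax / setnz al / movzx eax, al / mov eax, [Values + 4*eax] */
uint32_t select_from_table(uint32_t eax)
{
    uint32_t index = (eax != 0);  /* setnz al + movzx eax, al: 0 or 1 */
    return VALUES[index];         /* load the dword at Values + 4*index */
}
```

Unlike the mask-based variants, this form always performs a memory read, which is the trade-off the later posts in the thread go on to measure.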
revolution
LocoDelAssembly wrote: ... I think I've read somewhere in the manual that the memory cycle initiates always but it is stopped just before making the read/write.
IMO this is a major flaw with cmov. And another flaw is no available encoding for immediate values. It could have been so much better. Oh well.
LocoDelAssembly
Quote:
Yes, SETcc is annoying too: how is it possible that they allow an 8-bit destination only? Almost always you will need to follow it with a movzx or clear the destination first.
LocoDelAssembly
Well, I've tried the MUX memory approach, because with constants it seems to make little sense to use CMOV.
Results (and cheating with the non-CMOV version):
Code:
    With CMOV: 6391 ms
    Without CMOV: 10687 ms
Code:
    TEST_COUNT = 01FFFFFFFh
    UNROLL = 7
    format PE console
    include 'win32a.inc'
    entry start ; setPriority
    section ".data" data readable writeable
      szFormatCMOV db 'With CMOV: %d ms', 10, 0
      szFormatNoCMOV db 'Without CMOV: %d ms', 10, 0
      _repeat db 'Repeat Test?',0
      _pause db 'pause',0
      align 16
      Values dd 1, 0 ; To force alternation
    section ".code" code readable executable
    start:
    ;;;;;; TEST MHajduk
      mov ebp, TEST_COUNT
      call [GetTickCount]
      push eax
      xor edx, edx ; The cheat, but actually made no difference to the "mov eax, 0" approach
      align 64
    .TST1:
      repeat UNROLL
        test eax, eax
        ; mov eax, 0 ; was faster doing this and removing movzx
        setnz dl
        ; movzx eax, al
        mov eax, [Values + 4*edx]
      end repeat
      sub ebp, 1
      jnz .TST1
      call [GetTickCount]
      pop ebx
      sub eax, ebx
      push eax
      push szFormatNoCMOV
    ;;;;;; TEST baldr (well, double cmov, the code was not actually posted)
      mov ebp, TEST_COUNT
      call [GetTickCount]
      push eax
      align 64
    .TST2:
      repeat UNROLL
        test eax, eax
        cmovnz eax, [Values + 4]
        cmovz eax, [Values]
      end repeat
      sub ebp, 1
      jnz .TST2
      call [GetTickCount]
      pop ebx
      sub eax, ebx
      push eax
      push szFormatCMOV
    ;;;;;;
      call [printf]
      add esp, 8
      call [printf]
      add esp, 8
      invoke MessageBox, 0, _repeat, _repeat, MB_YESNO
      cmp eax, IDYES
      je start
      cinvoke system, _pause
      invoke ExitProcess, 0
    setPriority:
      invoke GetCurrentProcess
      mov ebx, eax
      invoke SetPriorityClass, eax, REALTIME_PRIORITY_CLASS
      invoke GetCurrentThread
      invoke SetThreadPriority, eax, THREAD_PRIORITY_TIME_CRITICAL
      jmp start
    section '.idata' import data readable writeable
      library kernel,'KERNEL32.DLL',\
              msvcrt,'msvcrt.dll',\
              user,'USER32.DLL'
      import kernel,\
             GetCurrentProcess, 'GetCurrentProcess',\
             SetPriorityClass, 'SetPriorityClass',\
             GetCurrentThread, 'GetCurrentThread',\
             SetThreadPriority, 'SetThreadPriority',\
             GetTickCount,'GetTickCount',\
             ExitProcess,'ExitProcess'
      import msvcrt,\
             printf,'printf',\
             system,'system'
      import user,\
             wsprintf,'wsprintfA',\
             MessageBox,'MessageBoxA'
hopcode
Using cmov I have found this, where the 2 branches are very similar.
Anyway, 7 bytes for the TRUE branch, 9 bytes for the FALSE one. No way, at the moment, to use only 2 registers (except for the TRUE branch). Here: EAX, ECX, and EDX to contain a trace of the old quantity in EAX. But, of course, EDX may contain the EAX quantity ...
testmacro
Code:
    macro @setD result,reg,const,fbool {
        mov eax,result
        mov ecx,const
        neg reg
        sbb edx,edx
        cmovnz reg,ecx
        match =FALSE,fbool \{ xor reg,ecx \}
    }
ready-2-use-macro
Code:
    macro @set reg,const,fbool {
        mov ecx,const
        neg reg
        cmovnz reg,ecx
        sbb edx,edx
        match =FALSE,fbool \{ xor reg,ecx \}
    }
Copyright © 1999-2020, Tomasz Grysztar. Also on GitHub, YouTube, Twitter.
Website powered by rwasa.