flat assembler
Message board for the users of flat assembler.

Set value without conditional jmps

LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4633
Location: Argentina
LocoDelAssembly
I changed your code to use the same priority I used in mine and added one more F to TEST_COUNT. This is what I get:
Code:
8469
12688
8484
12672
8485
12672    


Oh, I've also moved the printf calls the same way I did in my version, because writing to the console pollutes the timings a bit (the computer even unfreezes for a very short time).

PS: Your code untouched gives this:
Code:
797
547
797
531
797
531    


Edit: WARNING: my code prints the results in reverse order, so cmov is faster.
Post 23 Dec 2009, 22:28
LocoDelAssembly
Just in case the warnings I added are not seen by all I'm posting this. I made a very stupid mistake and printed the results in reverse order... So, at least on my PC, CMOV draws or wins, but never loses.
Post 23 Dec 2009, 22:37
baldr



Joined: 19 Mar 2008
Posts: 1651
baldr
LocoDelAssembly,

That's quite a predictable result. How about a cmov equivalent of cmp eax, 1 / sbb eax, eax / and eax, VALUE? I'm still trying to contrive something compact for
if (eax==0) eax = VALUE; else eax = 0; ;)
Post 23 Dec 2009, 22:46
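For readers following along, the cmp/sbb/and sequence quoted above can be modelled in C roughly as follows. This is only a sketch of the flag arithmetic, not generated code; the function name `value_if_zero` is made up for illustration, and a compiler is free to emit something else for it.

```c
#include <assert.h>   /* assert(), for quick checks */
#include <stdint.h>

/* C model of: cmp eax, 1 / sbb eax, eax / and eax, VALUE
 * (value_if_zero is a hypothetical name, not from the thread). */
static inline uint32_t value_if_zero(uint32_t eax, uint32_t value)
{
    uint32_t cf = (eax < 1u);   /* cmp eax, 1: carry is set only when eax == 0 */
    uint32_t mask = 0u - cf;    /* sbb eax, eax: 0xFFFFFFFF if carry, else 0  */
    return mask & value;        /* and eax, VALUE                             */
}
```

A quick check such as `assert(value_if_zero(0, 0xABC) == 0xABC)` confirms that this is the `if (eax==0) eax = VALUE; else eax = 0;` behaviour.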
LocoDelAssembly
Too many mistakes today :)

Yep, we never tested the same functionality: the code that was supposed to be equivalent to the CMOV version clears the register where CMOV leaves it intact.

So, let's agree on the functionality: are we pursuing the behavior of the CMOV code we currently have, or the other one?
Post 23 Dec 2009, 23:23
baldr
LocoDelAssembly,

Previously tested snippets were equivalent: they both implement
if (eax!=0) eax = VALUE;
I still haven't found a simple and/or elegant cmov solution for
if (eax==0) eax = VALUE; else eax = 0;
Only something like this:
Code:
        mov     ecx, VALUE
        test    eax, eax
        cmovnz  eax, ecx
        xor     eax, ecx    
but I feel like I'm cheating. ;)

Interestingly, nobody questions usefulness of these transformations yet. ;)
Post 23 Dec 2009, 23:39
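The cmovnz/xor snippet above can be traced with a small C model. This is a sketch under the assumption that an `if` stands in for cmovnz (a compiler may or may not emit cmov for it); `value_if_zero_cmov` is a hypothetical name.

```c
#include <assert.h>   /* assert(), for quick checks */
#include <stdint.h>

/* C model of: mov ecx, VALUE / test eax, eax / cmovnz eax, ecx / xor eax, ecx
 * (value_if_zero_cmov is a hypothetical name, not from the thread). */
static inline uint32_t value_if_zero_cmov(uint32_t eax, uint32_t value)
{
    uint32_t ecx = value;       /* mov ecx, VALUE                           */
    if (eax != 0u)              /* test eax, eax                            */
        eax = ecx;              /* cmovnz eax, ecx: taken only if eax != 0  */
    return eax ^ ecx;           /* xor eax, ecx: VALUE^VALUE = 0, 0^VALUE = VALUE */
}
```

When the move is skipped, eax is already 0, so the final xor yields VALUE; when the move is taken, the xor cancels to 0.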
LocoDelAssembly
baldr, although both do that when EAX is non-zero, they differ when EAX is zero: one leaves the previous value and the other clears EAX. If the first behavior is expected then CMOV will be very hard to beat; otherwise, perhaps the other variant will always win unless something very clever around CMOV appears.

I think your code won't work properly when cmovnz doesn't move, but when it does, XOR will do its magic ;)

To put both in equal positions perhaps this should be simulated instead:
Code:
IF REG = 0 THEN
  REG = VALUE1
ELSE
  REG = VALUE2
END IF    
Post 24 Dec 2009, 00:51
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 17280
Location: In your JS exploiting you and your system
revolution
baldr wrote:
Interestingly, nobody questions usefulness of these transformations yet. ;)
We all just assume that since you are prepared to put in so much time and effort to get them the way you want then they must be important and useful to you.

Personally I find it "interesting" that you are using isolated synthetic benchmarks on one system to judge the relative value of the codes.
Post 24 Dec 2009, 01:07
baldr
LocoDelAssembly,

When cmovnz does not move, eax is 0. So the magic is still there. ;)

For your proposal here is the solution:
Code:
        neg     reg
        sbb     reg, reg
        and     reg, VALUE1 xor VALUE2
        xor     reg, VALUE1    
Works only for immediates, though. Double cmov will surely beat it for r/m.


revolution,

You're right, it's "just for kicks". ;)
Post 24 Dec 2009, 01:16
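The neg/sbb/and/xor sequence for the two-value case can be checked against the IF/ELSE pseudocode with a C model like the one below. It is a sketch only: `select_by_zero` is a made-up name, and the carry flag is represented by an ordinary variable.

```c
#include <assert.h>   /* assert(), for quick checks */
#include <stdint.h>

/* C model of: neg reg / sbb reg, reg / and reg, VALUE1 xor VALUE2 / xor reg, VALUE1
 * (select_by_zero is a hypothetical name, not from the thread). */
static inline uint32_t select_by_zero(uint32_t reg, uint32_t value1, uint32_t value2)
{
    uint32_t cf = (reg != 0u);      /* neg reg: carry is set unless reg == 0   */
    uint32_t mask = 0u - cf;        /* sbb reg, reg: 0xFFFFFFFF or 0           */
    mask &= value1 ^ value2;        /* and reg, VALUE1 xor VALUE2              */
    return mask ^ value1;           /* xor reg, VALUE1                         */
}
```

For reg == 0 the mask is 0 and the result is VALUE1; otherwise the result is (VALUE1 xor VALUE2) xor VALUE1, which is VALUE2. In the assembly version the xor of the two values is folded by the assembler, which is why it only works for immediates.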
LocoDelAssembly
I'm so affected by a recent exam at my faculty :) Yes, right, EAX will already be zero before executing cmovnz.
Post 24 Dec 2009, 01:45
Borsuc



Joined: 29 Dec 2005
Posts: 2466
Location: Bucharest, Romania
Borsuc
baldr wrote:
Double cmov will surely beat it for r/m.
Why "double" cmov?

Code:
test eax, eax
mov eax, [VALUE2]
cmovz eax, [VALUE1]    

_________________
Previously known as The_Grey_Beast
Post 24 Dec 2009, 02:57
baldr
Borsuc,

Not sure. Perhaps cmov won't read r/m if the condition is false? I tend to minimize unnecessary memory accesses.
Post 24 Dec 2009, 07:27
revolution
Unfortunately cmovcc does read all arguments before processing the instruction.

This means you can't do this to avoid reading a null pointer:
Code:
cmp ebx,0
cmovnz eax,[ebx]    
The value at [ebx] is always going to be read regardless of the value of the conditional.
Post 24 Dec 2009, 07:56
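Since the memory operand of cmovcc is read unconditionally, a common way around the null-pointer problem is to make the conditional choice on the address and only then load from it. The C sketch below shows that idea; `load_or_default` is a hypothetical helper, and whether a given compiler turns the pointer selection into cmov is an assumption, not a guarantee.

```c
#include <assert.h>   /* assert(), for quick checks */
#include <stddef.h>   /* NULL */
#include <stdint.h>

/* Select the address first, then load unconditionally from an address
 * that is always valid (load_or_default is a hypothetical name). */
static inline int32_t load_or_default(const int32_t *ptr, int32_t fallback)
{
    /* Conditional choice between two pointers: no memory is read here,
     * so a register-to-register cmov would be safe for this step. */
    const int32_t *src = (ptr != NULL) ? ptr : &fallback;

    /* The single load never touches a null pointer. */
    return *src;
}
```

For example, `load_or_default(NULL, 7)` gives 7 without ever dereferencing the null pointer.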
baldr
revolution,

It has to be something to do with speculative execution. Intel's pseudocode for cmov clearly shows that src is read in any case, and the AMD64 APM states that
AMD64 Architecture Programmer's Manual (rev. 3.14, Sep 2007), Volume 3 wrote:
If the condition is not satisfied, the instruction has no effect.
Can somebody test that claim? LocoDelAssembly?
Post 24 Dec 2009, 08:41
revolution
This is easy to test:
Code:
include 'win32ax.inc'

begin:
        mov     ebx,0
        cmp     ebx,0           ; ZF=1, so the cmovnz condition is false
        cmovnz  eax,[ebx]       ; the test: does the read from [0] still happen?
        invoke  ExitProcess,0

.end begin
Post 24 Dec 2009, 08:49
LocoDelAssembly
I got an exception, however I think I've read somewhere in the manual that the memory cycle initiates always but it is stopped just before making the read/write.
Post 24 Dec 2009, 15:26
MHajduk



Joined: 30 Mar 2006
Posts: 6034
Location: Poland
MHajduk
LocoDelAssembly wrote:
To put both in equal positions perhaps this should be simulated instead:
Code:
IF REG = 0 THEN
  REG = VALUE1
ELSE
  REG = VALUE2
END IF    
Here you have my solution:
Code:
        test    eax, eax
        setnz   al
        movzx   eax, al
        mov     eax, [Values + 4*eax]
Where 'Values' is an array of two double words defined somewhere in the code:
Code:
     Values  dd VALUE1, VALUE2    
And simple program for easy tests in OllyDbg:
Code:
include 'win32ax.inc'

VALUE1 = 0xABC
VALUE2 = 0xDEF

start:
        test    eax, eax
        setnz   al
        movzx   eax, al
        mov     eax, [Values + 4*eax]

        invoke  ExitProcess, 0

        Values  dd VALUE1, VALUE2

.end start
Post 24 Dec 2009, 15:31
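The table-lookup approach above maps directly onto C array indexing. The sketch below reuses the VALUE1/VALUE2 constants from the test program; the function name `select_from_table` is made up for illustration.

```c
#include <assert.h>   /* assert(), for quick checks */
#include <stdint.h>

/* Values dd VALUE1, VALUE2 (constants taken from the test program above) */
static const uint32_t values[2] = { 0xABCu, 0xDEFu };

/* C model of: test eax, eax / setnz al / movzx eax, al / mov eax, [Values + 4*eax]
 * (select_from_table is a hypothetical name, not from the thread). */
static inline uint32_t select_from_table(uint32_t eax)
{
    uint32_t index = (eax != 0u);   /* setnz + movzx: 0 if eax == 0, else 1 */
    return values[index];           /* indexed load from the two-entry table */
}
```

The comparison result is always 0 or 1, so the index cannot run past the two-entry table.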
revolution
LocoDelAssembly wrote:
... I think I've read somewhere in the manual that the memory cycle initiates always but it is stopped just before making the read/write.
I doubt that claim. I think the source of that claim must be confused with something else. The final stage of cmov is to selectively retire the instruction or not depending upon the condition (to improve performance) and that can only be done when all of the source operands have been read and available for use. This is why in x86 you can still get memory access exceptions for instructions that fail the condition.

IMO this is a major flaw with cmov. And another flaw is no available encoding for immediate values. It could have been so much better. Oh well.
Post 24 Dec 2009, 15:35
LocoDelAssembly
Quote:

IMO this is a major flaw with cmov. And another flaw is no available encoding for immediate values. It could have been so much better. Oh well.

Yes, SETcc is annoying too: how is it possible that they allow an 8-bit destination only? Almost always you will need to follow it with a movzx or clear the destination first :evil:
Post 24 Dec 2009, 16:02
LocoDelAssembly
Well, I've tried the MUX memory approach, because with constants it seems to make little sense to use CMOV.

Results (and cheating with the non-CMOV version):
Code:
With CMOV: 6391 ms
Without CMOV: 10687 ms    


Code:
TEST_COUNT = 01FFFFFFFh
UNROLL = 7

format PE console
include 'win32a.inc'
entry start ; setPriority
section ".data" data readable writeable
szFormatCMOV   db 'With CMOV: %d ms', 10, 0
szFormatNoCMOV db 'Without CMOV: %d ms', 10, 0
_repeat    db 'Repeat Test?',0
_pause     db 'pause',0
align 16
Values dd 1, 0 ; To force alternation

section ".code" code readable executable
start:
;;;;;; TEST MHajduk
      mov       ebp, TEST_COUNT
      call      [GetTickCount]
      push      eax
      xor       edx, edx ; The cheat, but actually made no difference to the "mov eax, 0" approach
align 64
.TST1:
     repeat UNROLL
        test    eax, eax
      ;  mov     eax, 0 ; was faster doing this and removing movzx
        setnz   dl
;        movzx   eax, al
        mov     eax, [Values + 4*edx]
     end repeat

      sub       ebp, 1
      jnz       .TST1
      call      [GetTickCount]
      pop       ebx
      sub       eax, ebx
      push      eax
      push      szFormatNoCMOV

;;;;;; TEST baldr (well, double cmov, the code was not actually posted)
      mov       ebp, TEST_COUNT
      call      [GetTickCount]
      push      eax
align 64
.TST2:
     repeat UNROLL
        test    eax, eax
        cmovnz  eax, [Values + 4]
        cmovz   eax, [Values]
     end repeat

      sub       ebp, 1
      jnz       .TST2
      call      [GetTickCount]
      pop       ebx
      sub       eax, ebx
      push      eax
      push      szFormatCMOV

;;;;;;
      call      [printf]
      add       esp, 8
      call      [printf]
      add       esp, 8

      invoke    MessageBox, 0, _repeat, _repeat, MB_YESNO
      cmp       eax, IDYES
      je        start
      cinvoke   system, _pause
      invoke    ExitProcess, 0

setPriority:
      invoke   GetCurrentProcess
      mov      ebx, eax
      invoke   SetPriorityClass, eax, REALTIME_PRIORITY_CLASS
      invoke   GetCurrentThread
      invoke   SetThreadPriority, eax, THREAD_PRIORITY_TIME_CRITICAL
      jmp      start

 section '.idata' import data readable writeable
                library kernel,'KERNEL32.DLL',\
                        msvcrt,'msvcrt.dll',\
                        user,'USER32.DLL'
                 import kernel,\
                       GetCurrentProcess, 'GetCurrentProcess',\
                       SetPriorityClass, 'SetPriorityClass',\
                       GetCurrentThread, 'GetCurrentThread',\
                       SetThreadPriority, 'SetThreadPriority',\
                       GetTickCount,'GetTickCount',\
                       ExitProcess,'ExitProcess'
                 import msvcrt,\
                       printf,'printf',\
                       system,'system'
                 import user,\
                       wsprintf,'wsprintfA',\
                       MessageBox,'MessageBoxA'    
Post 24 Dec 2009, 17:26
hopcode



Joined: 04 Mar 2008
Posts: 563
Location: Germany
hopcode
Using cmov I have found this, where the 2 branches are very similar.
Anyway,
7 bytes for the TRUE
9 bytes for the FALSE

No way, at the moment, to use only 2 registers (except for the TRUE branch);
here: EAX, ECX, and EDX to keep a trace of the old quantity in EAX.
But, of course, EDX may contain the EAX quantity ... :lol:

testmacro
Code:
macro @setD result,reg,const,fbool {
 mov eax,result
 mov ecx,const
 neg reg
 sbb edx,edx
 cmovnz reg,ecx
 match =FALSE,fbool \{
  xor reg,ecx
 \}
}
    


ready-2-use-macro
Code:
macro @set reg,const,fbool {
 mov ecx,const
 neg reg
 cmovnz reg,ecx
 sbb edx,edx
 match =FALSE,fbool \{
  xor reg,ecx
 \}
}
    
Post 25 Dec 2009, 04:27
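As a reading aid, the @set macro can be modelled in C as below. This is a sketch under a few assumptions: the struct and function names are invented, the `want_true` flag stands in for the macro's fbool argument (any value other than FALSE), and the flags from neg are represented by plain variables.

```c
#include <assert.h>   /* assert(), for quick checks */
#include <stdint.h>

/* Hypothetical result type: the macro leaves values in both reg and edx. */
typedef struct {
    uint32_t reg;   /* the selected value                         */
    uint32_t edx;   /* 0xFFFFFFFF if the input was non-zero, else 0 */
} set_result;

/* C model of the @set macro (set_model is a hypothetical name). */
static inline set_result set_model(uint32_t reg, uint32_t constant, int want_true)
{
    set_result r;
    uint32_t ecx = constant;        /* mov ecx, const                          */
    uint32_t cf = (reg != 0u);      /* neg reg: CF is set unless reg == 0      */
    reg = 0u - reg;                 /* neg reg: result is zero only for reg == 0 */
    if (reg != 0u)
        reg = ecx;                  /* cmovnz reg, ecx (flags are left untouched) */
    r.edx = 0u - cf;                /* sbb edx, edx: uses CF from neg          */
    if (!want_true)
        reg ^= ecx;                 /* xor reg, ecx: FALSE variant only        */
    r.reg = reg;
    return r;
}
```

In this model the TRUE variant gives const for a non-zero input and 0 otherwise, and the FALSE variant gives the opposite, e.g. `assert(set_model(0, 0xABC, 0).reg == 0xABC)`.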
Copyright © 1999-2020, Tomasz Grysztar.