flat assembler
Message board for the users of flat assembler.

 Index > Main > ...because there is no algos section - here you are MAX(A,B)
Author

Joined: 25 Sep 2003
Posts: 2139
Location: Estonia
There was a discussion on Orkut.com community named "ASM".
Quote:
Don't Jump!!! 8/17/2005 1:14 AM
heres a programming in ASM problem.

Move the Greater of A and B to a Location C
WITHOUT using any jump instructions.

Code:
```;Provided EAX and EBX have some values 0...0FFFFFFFFh
;signed or unsigned
mov  edx,eax ; save A
sub  eax,ebx ; A-B
shr  eax,31  ; Carry (a<b?)
neg  eax     ; eax= 0 | 0FFFFFFFFh
and  ebx,eax ; eax=0?ebx=0:ebx=ebx
not  eax
and  edx,eax ; eax=0?edx=0:edx=edx
add  ebx,edx ; either one is zero so ebx is A+0 or 0+B
```

_________________
My updated idol http://www.agner.org/optimize/
03 Oct 2005, 20:37
revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 18612
revolution
I've always used this sequence for MAX:
Code:
```sub     eax,ebx ;a=a-b
setge   dl
movzx   edx,dl
neg     edx     ;d=0 if a<b, d=-1 if a>=b
and     eax,edx ;a=0 if a<b, a=a-b if a>=b
add     eax,ebx ;a=b if a<b, a=a if a>=b    ```
You can adjust the "setge" to whatever condition you need for MAX, MIN, signed, unsigned etc. If you make it "setns" then you get that same as your code above.
04 Oct 2005, 01:45
revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 18612
revolution
Hehe, just for comparison in ARM it looks like this:
Code:
```cmp r0,r1
movlt r0,r1    ```
And using a P2 or later this:
Code:
```cmp eax,ebx
cmovl eax,ebx    ```
04 Oct 2005, 02:02

Joined: 25 Sep 2003
Posts: 2139
Location: Estonia
Hmm, that looks great and with CMOVxx the answer is trivial, but is SETxx supported instruction in 8086 for example?
04 Oct 2005, 08:26
revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 18612
revolution
Quote:
but is SETxx supported instruction in 8086 for example?
SETcc is an 80386 (and up) instruction.
04 Oct 2005, 09:48
revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 18612
revolution
For signed and unsigned like above you can shorten the code to this which doesn't use SETcc:
Code:
```sub eax,ebx ;a=a-b
mov edx,eax
sar edx,31
not edx ;d=0 if a<b, d=-1 if a>=b
and eax,edx ;a=0 if a<b, a=a-b if a>=b
add eax,ebx ;a=b if a<b, a=a if a>=b    ```
04 Oct 2005, 09:54

Joined: 25 Sep 2003
Posts: 2139
Location: Estonia
why I couldn't have thought about that :S SAR fills edx with signbits
04 Oct 2005, 10:57
r22

Joined: 27 Dec 2004
Posts: 805
r22
The following can be easily adapted to 32bit
(hint find and replace all R's with E's )

Benchmarking with randomly produced numbers will show that it's slightly faster than using Conditional Mov instructions.

Code:
```;;;;64bit Fastcall
Max:
;;RCX = num1, RDX = num2
MOV RAX,RCX
SUB RCX,RDX
SBB RDX,RDX
AND RDX,RCX
SUB RAX,RDX
RET

;;For inline usage you can make the MIN function smaller.
;;Removing the MOV RAX,RDX and replacing RAX with RDX in last line.
Min:
;;RCX = num1, RDX = num2
MOV RAX,RDX   ;;;drop this line of code for inline
SUB RCX,RDX
SBB R8,R8       ;;;use SBB RAX,RAX for inline
AND R8,RCX     ;;;;replace with AND RAX,RCX for inline
RET
```
04 Oct 2005, 20:35
revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 18612
revolution
r22: Your code works for unsigned numbers, but Madis731 was using signed numbers. SBB x,x propagates the carry whereas SAR x,31 propagates the sign.

I am rather suprised that your code is faster than CMOVcc. What processor are you testing on?
05 Oct 2005, 00:48
r22

Joined: 27 Dec 2004
Posts: 805
r22
Yea signed numbers ruin everything.
Just have to replace SBB with CDQ or for 64bit CQO
Code:
```SUB EAX,ECX
CDQ
AND EAX,EDX
```

P4 3.2ghz
Ok the bencharmking with CMOVL I stand corrected
Trying both with random 32bit values running them 07FFFFFFh each
-----------------------------------
My re-code above 3734ms 3rd so close to beating cmov dang!
Cmp & Cmovl 3625ms 2nd
The code below 3406ms 1st
-----------------------------------

Code:
```;;;INLINE SSE MIN
CVTSI2SD XMM0, EAX
CVTSI2SD XMM1, ECX
MINSD XMM0,XMM1
CVTSD2SI EAX,XMM0
```

Using Scalar SINGLE (SS) instead of SCALAR DOUBLE (SD) shaves off another 80ms on the benchmark (3375ms).

SSE works great for 32bit signed integer, but in 64bit land and unsigned 32bit integers the conditional mov will likely come out on top.
05 Oct 2005, 04:22
hopcode

Joined: 04 Mar 2008
Posts: 563
Location: Germany
hopcode
revolution wrote:
I've always used this sequence for MAX:
Code:
```sub     eax,ebx ;a=a-b
setge   dl
movzx   edx,dl
neg     edx     ;d=0 if a<b, d=-1 if a>=b
and     eax,edx ;a=0 if a<b, a=a-b if a>=b
add     eax,ebx ;a=b if a<b, a=a if a>=b    ```
....

Very cool !!!
A possible way with xmm/sse instruction on Agner Fog Chap 13.1
Edit:2

hopcode[mrk]

Last edited by hopcode on 09 Nov 2008, 16:36; edited 4 times in total
02 Nov 2008, 02:54
baldr

Joined: 19 Mar 2008
Posts: 1651
baldr
Signed subtraction can easily overflow, so code have to check OF as well... Thus cdq/sar rxx, 31 will give incorrect result (e.g. try 2³¹-1 and -1), while CMOVcc/SETcc examples will do fine.
02 Nov 2008, 22:04
hopcode

Joined: 04 Mar 2008
Posts: 563
Location: Germany
hopcode
Edit
/to bladr/ Thanks a lot pointing it out.
I like working code so i would avoid the real mess
Perhaps, it is more useful choosing the range of the numbers,positive
or negative, signed unsigned, consequently apply the right code...

hopcode[mrk]

Last edited by hopcode on 09 Nov 2008, 19:43; edited 1 time in total
09 Nov 2008, 16:51
baldr

Joined: 19 Mar 2008
Posts: 1651
baldr
hopcode,

I'd just pointed to difference between signed and unsigned subtraction result flags (signed eax < ebx iff SF<>OF after sub/cmp eax, ebx, while for unsigned CF is sufficient). Your code appears to go great lengths to correct only one case of the OF being set, 2³¹-1 - (-1). How about 2³¹-2 - (-2)? By the way, signed 0x80000000 is -2³¹, much less than -1.
09 Nov 2008, 17:30
 Display posts from previous: All Posts1 Day7 Days2 Weeks1 Month3 Months6 Months1 Year Oldest FirstNewest First

 Jump to: Select a forum Official----------------AssemblyPeripheria General----------------MainTutorials and ExamplesDOSWindowsLinuxUnixMenuetOS Specific----------------MacroinstructionsOS ConstructionIDE DevelopmentProjects and IdeasNon-x86 architecturesHigh Level LanguagesProgramming Language DesignCompiler Internals Other----------------FeedbackHeapTest Area

Forum Rules:
 You cannot post new topics in this forumYou cannot reply to topics in this forumYou cannot edit your posts in this forumYou cannot delete your posts in this forumYou cannot vote in polls in this forumYou cannot attach files in this forumYou can download files in this forum