flat assembler
Message board for the users of flat assembler.
MazeGen 24 May 2005, 07:01
As I see it:
Code:
    mov edx,eax
    xor edx,ebx ;wait until mov is finished
    shr edx,1   ;wait until xor is finished
    and eax,ebx
    add eax,edx ;wait until and is finished
What about:
Code:
    mov edx,eax
    and eax,ebx
    xor edx,ebx
    shr edx,1   ;wait until xor is finished
    add eax,edx
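For readers following along, here is a minimal C sketch of what the and/xor/shr/add sequence computes, assuming unsigned 32-bit operands. The function name `avg_and_xor` is made up for illustration. It relies on the identity a + b = 2*(a & b) + (a ^ b), and it also shows that the `and` and the `xor` do not depend on each other, which is why they can be placed next to each other.

```c
#include <stdint.h>

/* Overflow-free average of two unsigned 32-bit values, rounded down.
   Mirrors the and/xor/shr/add sequence: a + b == 2*(a & b) + (a ^ b). */
uint32_t avg_and_xor(uint32_t a, uint32_t b)
{
    uint32_t common = a & b;        /* bits set in both operands (and eax,ebx)       */
    uint32_t differ = a ^ b;        /* bits set in exactly one operand (xor edx,ebx) */
    return common + (differ >> 1);  /* shr edx,1 followed by add eax,edx             */
}
```

The two intermediate values are computed from the inputs only, so neither has to wait for the other; only the final add needs both.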
Octavio 24 May 2005, 09:54
Madis731 wrote: Would anyone care to explain to me why this first one is recommended on 'gems' sites and not the latter one. Compatibility? Why would my CPU waste
Perhaps because the second one only works on unsigned numbers. What about using MMX?
MCD 24 May 2005, 10:53
Let me point out a small imprecision in your code suggestion:
Code:
    mov edx,eax
    and eax,ebx
    xor edx,ebx
    sar edx,1   ;wait until xor is finished
    add eax,edx ;wait until sar is finished; <- forgot this
This one is only slightly better:
Code:
    mov edx,eax
    xor eax,ebx
    and edx,ebx
    sar eax,1
    add eax,edx ;wait until sar is finished
Quote: What about using mmx?
Sure, this goes too:
Code:
    ;a: mm0 ;<- this should not mean "ammo"
Actually, you can only use word and dword data sizes, because of the restrictions of the paddX and psraX instructions.
* note: paddq was introduced with SSE2.
Actually, if you allow the MMX-II instructions introduced with SSE, you can calculate the average in one instruction:
Code:
    pavgb/w mm0,mm1 ;only available for bytes/words
    ;There is also something very similar for the xmm registers
_________________
MCD - the inevitable return of the Mad Computer Doggy
Last edited by MCD on 25 May 2005, 11:07; edited 4 times in total
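A rough C model of the two variants mentioned in this post, as a sketch rather than a definitive implementation. The names `sar1`, `avg_signed` and `pavgb_lane` are hypothetical. The first part models the and/xor/sar/add sequence on signed 32-bit values. The second models a single unsigned byte lane of pavgb, which per the Intel instruction reference rounds up, i.e. computes (a + b + 1) >> 1, so it is not bit-identical to the rounded-down integer version.

```c
#include <stdint.h>

/* Arithmetic shift right by one (what sar does), written so that it does
   not rely on the implementation-defined >> of negative values in C. */
static int32_t sar1(int32_t x)
{
    return x < 0 ? ~(~x >> 1) : x >> 1;
}

/* Signed average rounded toward minus infinity: and/xor/sar/add. */
int32_t avg_signed(int32_t a, int32_t b)
{
    return (a & b) + sar1(a ^ b);
}

/* One unsigned byte lane of pavgb: the sum is rounded up. */
uint8_t pavgb_lane(uint8_t a, uint8_t b)
{
    return (uint8_t)(((unsigned)a + (unsigned)b + 1u) >> 1);
}
```

For example, `avg_signed(-4, 2)` gives -1, while `pavgb_lane(0, 1)` gives 1 where the rounded-down version would give 0.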
MazeGen 24 May 2005, 11:01
Eh, you're right, MCD.
I was pointing in the first place to the fact that Madis left the dependencies between the instructions out of consideration.
Madis731 24 May 2005, 15:31
You are right - there are dependencies, but I would've noticed them if I had used a Pentium II or earlier. The code you suggested acts the same way, because the "mov edx,eax" can't start before the last "add eax,edx" is finished, so one pass is faster, but when running it 10 times in a row (NO jmp - only 10x the code) the dependency problem occurs in a different place.
I want to argue about signed numbers: it works with POS+POS, POS+NEG, NEG+POS and NEG+NEG:
0FFFFFFFFh+0FFFFFFFDh=Carry+0FFFFFFFCh => -1+(-3)=-4
rcr 0FFFFFFFCh,1 = 0FFFFFFFEh <= -2
What do you mean it works ONLY on unsigned numbers?
BTW the SSE solution is elegant.
You didn't answer my question but were arguing about the optimization of the first algorithm. What I need to know is WHY my version is BAD. I don't want to know how optimized the first version is - thanks!
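The worked example can be checked with a small C model of `add` followed by `rcr ...,1`. This is only a sketch: the function name `avg_add_rcr` is invented here, and the carry flag is emulated by testing whether the 32-bit sum wrapped around.

```c
#include <stdint.h>

/* Model of "add eax,edx" followed by "rcr eax,1" on 32-bit registers. */
uint32_t avg_add_rcr(uint32_t a, uint32_t b)
{
    uint32_t sum = a + b;               /* add: wraps modulo 2^32       */
    uint32_t carry = sum < a;           /* CF: set when the add wrapped */
    return (carry << 31) | (sum >> 1);  /* rcr: CF rotates into bit 31  */
}
```

With 0FFFFFFFFh and 0FFFFFFFDh the sum wraps to 0FFFFFFFCh with the carry set, and the result is 0FFFFFFFEh, matching the numbers in the post. The carry acts as bit 32 of the unsigned sum, so the result is always the correct unsigned average.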
r22 24 May 2005, 22:08
Code:
    mov eax,-4
    mov edx,2
    add eax,edx
    rcr eax,1
    push eax
    push fmt ; '%li',0
    push buffer
    call [wsprintf]
    add esp,0ch
    push 0
    push buffer
    push buffer
    push 0
    call [MessageBox]
When finding the average of a negative and a positive number the ADD RCR algorithm fails.
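To see why this case fails: -4 is 0FFFFFFFCh, and adding 2 gives 0FFFFFFFEh without a carry, so rcr rotates a 0 into the sign bit and eax ends up as 7FFFFFFFh (2147483647) instead of -1. The C sketch below models that and compares it with a reference signed average computed through a 64-bit intermediate. Both function names are made up for illustration.

```c
#include <stdint.h>

/* add/rcr as in the post: the carry flag becomes the new top bit. */
uint32_t avg_add_rcr(uint32_t a, uint32_t b)
{
    uint32_t sum = a + b;
    uint32_t carry = sum < a;
    return (carry << 31) | (sum >> 1);
}

/* Reference signed average (rounded down), using a 64-bit intermediate. */
int32_t avg_signed_wide(int32_t a, int32_t b)
{
    int64_t s = (int64_t)a + (int64_t)b;
    return (int32_t)(s >= 0 ? s / 2 : -((-s + 1) / 2));
}
```

The carry flag is the 33rd bit of the unsigned sum, not of the signed one. When both operands are negative the two happen to agree, which is why the NEG+NEG example works, but with mixed signs and a negative sum they differ.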
Madis731 25 May 2005, 10:18
Indeed. Thanks for pointing that out.
MCD 25 May 2005, 11:03
Oops, I missed this issue too. Just corrected my stuff as well.
Madis731 26 May 2005, 16:45
OK, now that it's clear, I'll stick to my version because I've never had to use signed values. This will do.
Thanks everybody for your input! Too bad there isn't a rotate arithmetic right.