flat assembler
Message board for the users of flat assembler.
mov eax,ebx or xchg eax,ebx
LocoDelAssembly 10 Mar 2006, 22:33
"xchg" is one byte and "mov" is two bytes.
mov is faster. Did you notice that xchg will copy ebx to eax but will also overwrite ebx with the old value of eax?
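For anyone following along, a minimal sketch of the difference being described. The starting values 1 and 2 are arbitrary, and the byte values in the comments are the usual encodings (an assembler may emit the other, equivalent two-byte form for mov):

```asm
mov eax,1
mov ebx,2
mov eax,ebx     ; 2 bytes (e.g. 89 D8): eax = 2, ebx = 2
mov eax,1       ; reset eax for the second case
xchg eax,ebx    ; 1 byte (93): eax = 2, ebx = 1
```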
asmrus 10 Mar 2006, 22:45
Yes, I knew that... xchg exchanges the bytes.
But how many times faster than xchg can mov be?
... it's all assembly
_________________
"...they track us, our interests and our hosts, we track them, their interests and their hosts, it's an interesting match and we'll always win, coz we do not do it for money... work well, +ORC"
penang 11 Mar 2006, 07:03
Test it under the following conditions:
A. Exchange something a hundred million times. Time the result.
B. Move something a hundred million times. Time the result.
C. Compare the results of A and B.
D. Voilà!
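The steps above could be sketched roughly like this, using rdtsc instead of a wall clock. The label name and the choice of edi/ebp to hold the start count are only for illustration; run it once with xchg and once with mov in the loop body and compare the two results:

```asm
        rdtsc                   ; edx:eax = time-stamp counter
        mov edi,eax             ; save start count (low dword)
        mov ebp,edx             ; save start count (high dword)
        mov esi,100000000       ; a hundred million iterations
test_loop:
        xchg eax,ebx            ; replace with "mov eax,ebx" for the second run
        dec esi
        jnz test_loop
        rdtsc
        sub eax,edi
        sbb edx,ebp             ; edx:eax = elapsed counter ticks
```

The dec/jnz overhead is included in both runs, so only the difference between the two results says anything about xchg versus mov.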
r22 11 Mar 2006, 08:34
MOV is about 4X faster than XCHG
On an AMD X2 3800+ (2.0 GHz) with 1 GB RAM, the code snippet below gives:
XCHG 10076 milliseconds
MOV 2357 milliseconds
Code:
    push 0
    push 0
    push 0
    push 0
    call [MessageBox]
    ;---------------------------------------
    call [GetTickCount]
    mov edi,eax             ; start time of the XCHG test
    mov esi,07FFFFFFh       ; outer loop count
tst1:
    repeat 100              ; 100 copies of the instruction per iteration
    xchg eax,ebx
    end repeat
    dec esi
    jnz tst1
    call [GetTickCount]
    sub eax,edi             ; elapsed milliseconds
    push eax
    push result1
    call [printf]
    call [GetTickCount]
    mov edi,eax             ; start time of the MOV test
    mov esi,07FFFFFFh
tst2:
    repeat 100
    mov eax,ebx
    end repeat
    dec esi
    jnz tst2
    call [GetTickCount]
    sub eax,edi             ; elapsed milliseconds
    push eax
    push result2
    call [printf]
    ;+++++++++++++++++++++++++++++++++++++++++++++
    push 0
    push buffer
    push buffer
    push 0
    call [MessageBox]
    push 0
    call [ExitProcess]
Tomasz Grysztar 11 Mar 2006, 10:47
This is perhaps because XCHG always locks the bus, as if you had used the LOCK prefix with it.
Borsuc 11 Mar 2006, 13:38
I thought XCHG locks the bus only when you do memory operations with it? Register-to-register is not a memory operation.. or am I wrong?
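To illustrate the distinction: with a memory operand, xchg is implicitly locked, which is what lets a simple spinlock like the following work without a LOCK prefix. lock_var and the label are made-up names for this sketch:

```asm
; assumes "lock_var dd 0" is defined in a data section
acquire:
        mov eax,1
        xchg eax,[lock_var]     ; memory operand: atomic, implicitly locked
        test eax,eax            ; old value 0 means the lock was free
        jnz acquire             ; otherwise somebody holds it, try again
        ; ... critical section ...
        mov dword [lock_var],0  ; release the lock

        xchg eax,ebx            ; registers only: no memory access, nothing to lock
```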
LocoDelAssembly 11 Mar 2006, 13:52
Tomasz, I'm not sure you are right; I think the problem on the Athlon 64 is that XCHG is a VectorPath instruction even when both operands are registers.
Software Optimization Guide for the AMD64 Processors wrote:
Tomasz Grysztar 11 Mar 2006, 13:53
Right, I missed the point that it's only about exchanging registers here.
LocoDelAssembly 11 Mar 2006, 14:24
I tested with code similar to r22's, using 16-byte-aligned loops and realtime priority, but I got the same times.
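A sketch of what such a setup might look like, assuming the Win32 API as in r22's code (GetCurrentProcess and SetPriorityClass would have to be added to the import table; 100h is REALTIME_PRIORITY_CLASS). This is a guess at the approach, not the code that was actually run:

```asm
        call [GetCurrentProcess]
        push 100h               ; REALTIME_PRIORITY_CLASS
        push eax                ; handle of the current process
        call [SetPriorityClass]

        mov esi,07FFFFFFh
        align 16                ; start the loop on a 16-byte boundary
tst1:
        repeat 100
        xchg eax,ebx
        end repeat
        dec esi
        jnz tst1
```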
Borsuc 11 Mar 2006, 14:39
Mobile AMD Sempron 2800+ (1.6 GHz), tested with 1000 repeats:
xchg eax, ebx: 1500 cycles
mov eax, ebx: 329 cycles
But that isn't fair, as xchg exchanges the values.
Code:
    mov ecx, eax
    mov eax, ebx
    mov ebx, ecx
xchg is the best way (at least on this processor) to exchange the registers (even the trick with 3 xors takes about 3000 cycles, not to mention it's a lot larger in size).
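For reference, the three-xor trick mentioned above looks like this (a and b stand for the original contents of eax and ebx). Each instruction needs the result of the one before it, so the three cannot execute in parallel, which is one plausible reason it fares poorly here:

```asm
xor eax,ebx     ; eax = a xor b
xor ebx,eax     ; ebx = b xor (a xor b) = a
xor eax,ebx     ; eax = (a xor b) xor a = b
```

That is three two-byte instructions, so 6 bytes against 1 byte for xchg eax,ebx.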
bogdanontanu 11 Mar 2006, 16:34
As Grey_Beast says, XCHG really DOES something else.
So the comparison is not fair. XCHG is also useful because it uses an internal CPU temporary register, and by doing so it avoids dirtying one of the standard registers. This property can be very useful at times! I would first care about simple, easy-to-understand code and only later about speed. However, I agree that where speed matters it is better to avoid XCHG inside "inner loops" unless you desperately need its extra temporary register.
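A small sketch of the point about the spare register. These are two alternatives, not meant to run back to back: if ecx holds something live, the mov-based swap has to save and restore it (here via the stack, which is just one possible way), while xchg needs nothing extra:

```asm
; alternative 1: swap with mov while ecx is in use
push ecx        ; preserve the live value
mov ecx,eax
mov eax,ebx
mov ebx,ecx
pop ecx         ; restore it

; alternative 2: no scratch register needed
xchg eax,ebx
```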
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.