flat assembler
Message board for the users of flat assembler.

Mov DWORD to DWORD

Kinex 19 Jan 2006, 14:45
Does anyone know the fastest way to do this?

Code:
use32
dw1 dd ?
dw2 dd ?

mov [dw2],[dw1]    ; not encodable: x86 MOV cannot take two memory operands


This doesn't assemble but I need the fastest way.

Or do I have to write:

Code:
mov eax,[dw1]
mov [dw2],eax    


Is there no workaround?
Vasilev Vjacheslav 19 Jan 2006, 15:05
Code:
push [dw1]
pop [dw2]
Kinex 19 Jan 2006, 15:39
Thanks! ;)
Really great solution! I hadn't thought of that.
But I have to say that it is much slower, at least on my CPU.
Is there another way?
MazeGen 19 Jan 2006, 16:25
Moving through a temporary register is the fastest way.

The only other (slow) way:

Code:
mov esi, dw1
mov edi, dw2
movsd
    
Kinex 19 Jan 2006, 17:32
Thanks! I'll just go that way.
But anyway, I've got another problem here.
Does anyone know how to speed this up?

Code:
use32
dword1 dd 0
dword2 dd 1000000000

for:
                mov     eax,[dword1]
                mov     edx,[dword2]
                cmp     eax,edx
                jc      next
                jge     outta
next:
                add     [dword1],00000001h
                jmp     for
outta:    


I know I could just use cmp eax,1000000000
but I need to have those as variables.

I thought of using the TEST instruction or something, but it didn't work.

Thanks for any help ;)

Kinex
Tomasz Grysztar 19 Jan 2006, 17:53
Kinex wrote:
Thanks! ;)
Really great solution! I hadn't thought of that.
But I have to say that it is much slower, at least on my CPU.
Is there another way?

There's a way to do it that leaves every register with its original value (only the flags change) and without touching the stack:
Code:
        mov     [dw2],eax
        xor     eax,[dw1]
        xor     [dw2],eax
        xor     eax,[dw1]    

On my machine it's sometimes faster than PUSH/POP and sometimes slower, depending on the context.
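
To see why those four instructions work, it can help to trace the values. This is only a sketch: A stands for whatever EAX held on entry and D for the contents of dw1 (A and D are names made up for this note, not labels in the code).

```asm
use32
dw1 dd ?
dw2 dd ?

        mov     [dw2],eax       ; dw2 = A            eax = A
        xor     eax,[dw1]       ; dw2 = A            eax = A xor D
        xor     [dw2],eax       ; dw2 = A xor (A xor D) = D
        xor     eax,[dw1]       ; eax = (A xor D) xor D = A
```

At the end dw2 holds a copy of dw1 and EAX is back to A. The flags are overwritten by the XORs, and dw2 is written twice.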
RedGhost 20 Jan 2006, 02:09
Quote:
Code:
mov     eax,[dword1]
mov     edx,[dword2]
cmp     eax,edx 
    


you can do
Code:
mov eax, [dword1]
cmp eax, [dword2]
    


they don't both have to be in registers

_________________
redghost.ca
Madis731 20 Jan 2006, 08:41
Hi,
I don't mean to offend you by any means, but here are some tips that you might find useful. What you are doing is not wrong, but there are better ways of solving this:
Code:
use32
dword1 dd 0
dword2 dd 1000000000

for:
                mov     eax,[dword1]
                ;mov     edx,[dword2] 
                ;cmp     eax,edx
                cmp     eax,[dword2]  ; as suggested in an earlier post: not much quicker, but shorter
                ;jc      next
                ;jge     outta
; You don't want two conditional jumps here. Once the carry test fails, you
; know dword1 is not below dword2 (unsigned), so there is nothing left to
; check. Your JGE is a signed test: after JC falls through, it fails only when
; dword1 is negative and dword2 is not (as signed values), and then the code
; drops into next anyway. I'm not sure you wanted that. So there are three
; paths through those two jumps.
                jae     outta ;same as jnc and you don't need anything else
;next:
                add     [dword1],00000001h
                jmp     for
outta:
    


Of course this is not a real-life situation, but I would do the adding in a register and then, once I'm out of the loop, write it back to memory. I know that CPUs have caches, which let you operate on a cached memory location in about 3 clocks or less, but the fastest way is to make full use of the registers.

Why did they put 16 x 64-bit registers in AMD64/EM64T? ;) Think about it.
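
A minimal sketch of that register-based loop, assuming dword1 and dword2 are only read once before the loop and dword1 is written once after it. The labels again and finished are just illustrative names, not from the code above.

```asm
use32
dword1 dd 0
dword2 dd 1000000000

                mov     eax,[dword1]    ; load the counter once
                mov     edx,[dword2]    ; load the limit once
                cmp     eax,edx
                jae     finished        ; already at or above the limit (unsigned)
again:
                add     eax,1           ; count in the register, no memory access
                cmp     eax,edx
                jb      again           ; keep going while counter < limit (unsigned)
finished:
                mov     [dword1],eax    ; write the result back once
```

The loop body touches no memory at all; whether that is measurably faster depends on the CPU, so it is worth timing on your own machine.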

Oh, and why is the push/pop method slower? I noticed heavy use of it in some quicksort routines that weren't so quick anymore when compared to my optimized versions:
Code:
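push [dword1]
pop [dword2]
; Internally the CPU splits each of these into smaller steps, roughly:
mov tempreg,[dword1] ; tempreg is an internal CPU register (pseudocode)
sub esp,4            ; PUSH decrements ESP first...
mov [esp],tempreg    ; ...then stores at the new top of stack
; That pushed dword1 onto the stack
mov tempreg,[esp]    ; POP loads from the top of stack...
add esp,4            ; ...then increments ESP
mov [dword2],tempreg
; That popped the value from the stack into dword2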
    


Some CPUs are quicker at this because they have special circuitry for it. Some have optimized multiplication, others cache efficiency and longer-term optimizations.

This example shows why using your own register is better: you can easily drop the stack-pointer updates. The breakdown may seem pointless, since you can't break instructions up into smaller ones at the architectural level. Still, the Pentium III and up can execute 3 micro-operations per clock, and maybe (I don't have a source for this) miracles happen, like optimization at the microcode level.

_________________
My updated idol :D http://www.agner.org/optimize/

Copyright © 1999-2024, Tomasz Grysztar.