flat assembler
Message board for the users of flat assembler.

SSE2 Question

Kuemmel



Joined: 30 Jan 2006
Posts: 200
Location: Stuttgart, Germany
Hi,

I'm trying to get into SSE2 and have run into a problem. Whereas
Code:
...
  plot_double       rq 2           ;reserve 2 quad words = 128 bit sse register
...
        mov ecx,plot_double    ; get address of plot_double
        movapd xmm0,[ecx]    ; get whole xmm0 register
        mulpd  xmm0,xmm0    ; xmm0*xmm0
        movapd [ecx],xmm0    ; store result at same address
...
    

crashes, the following works correctly:
Code:
...
  plot_double       rq 2           ;reserve 2 quad words = 128 bit sse register
...
         mov ecx,plot_double
         movhpd xmm0,[ecx]
         movlpd xmm0,[ecx+8]
         mulpd  xmm0,xmm0
         movhpd [ecx],xmm0
         movlpd [ecx+8],xmm0
...
    

Why doesn't the first version work?
Post 28 Feb 2006, 17:55
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4633
Location: Argentina
This one crashes:
Code:
format PE GUI 4.0
entry start

start:
        mov ecx,plot_double    ; get address of plot_double
        movapd xmm0,[ecx]    ; get whole xmm0 register 
        mulpd  xmm0,xmm0    ; xmm0*xmm0 
        movapd [ecx],xmm0    ; store result at same address
        ret

  plot_double       rq 2           ;reserve 2 quad words = 128 bit sse register    


This one does not:
Code:
format PE GUI 4.0
entry start

start:
        mov ecx,plot_double    ; get address of plot_double
        movapd xmm0,[ecx]    ; get whole xmm0 register 
        mulpd  xmm0,xmm0    ; xmm0*xmm0 
        movapd [ecx],xmm0    ; store result at same address
        ret

  align 16
  plot_double       rq 2           ;reserve 2 quad words = 128 bit sse register
    


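I've never programmed with SSE, but I think there are instructions to load unaligned data. The 'align 16' matters because 'movapd' requires its memory operand to sit on a 16-byte boundary and faults otherwise.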
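For illustration, here is an untested sketch of the crashing program with the aligned moves swapped for 'movupd', the SSE2 instruction that moves packed doubles without requiring alignment. It keeps the same layout as the first listing, so 'plot_double' is still not on a 16-byte boundary; the only assumption is that nothing else in the program depends on the data being aligned.

```asm
format PE GUI 4.0
entry start

start:
        mov     ecx,plot_double     ; address of the two packed doubles
        movupd  xmm0,[ecx]          ; unaligned load, no 16-byte boundary required
        mulpd   xmm0,xmm0           ; square both doubles
        movupd  [ecx],xmm0          ; unaligned store back to the same address
        ret

  plot_double       rq 2            ; 2 quad words = 16 bytes, left unaligned on purpose
```

The unaligned form can be slower than 'movapd' on some processors, so aligning the data is usually the better fix when you control where it is placed.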
Post 28 Feb 2006, 18:28
Reverend



Joined: 24 Aug 2004
Posts: 408
Location: Poland
Yes, either align the SSE data to 16 bytes and use 'movapd', or use 'movupd' if the data is not aligned.
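When the address is only known at run time (for example, a pointer handed in by a caller), the choice between the two instructions can be made by testing the low four bits of the address. The following is a minimal, untested sketch of that idea; the local label '.unaligned' is a made-up name for illustration, and the data is deliberately left without 'align 16' so the fallback path is the one taken.

```asm
format PE GUI 4.0
entry start

start:
        mov     ecx,plot_double
        test    ecx,15              ; low four bits are zero only on a 16-byte boundary
        jnz     .unaligned
        movapd  xmm0,[ecx]          ; aligned path
        mulpd   xmm0,xmm0
        movapd  [ecx],xmm0
        ret
.unaligned:
        movupd  xmm0,[ecx]          ; fallback path for unaligned data
        mulpd   xmm0,xmm0
        movupd  [ecx],xmm0
        ret

  plot_double       rq 2            ; no 'align 16' here, so the address may be unaligned
```

For data you declare yourself, putting 'align 16' in front of it and using 'movapd' throughout is simpler than branching.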
Post 28 Feb 2006, 21:08
Copyright © 1999-2020, Tomasz Grysztar. Also on GitHub, YouTube, Twitter.

Website powered by rwasa.