flat assembler
Message board for the users of flat assembler.

movupd versus movups whats the point?

lazer1 23 Aug 2008, 18:20
Using AMD notation, see Volume 4 from the AMD website
document downloads,

movupd xmm1, xmm2/mem128
movupd xmm1/mem128, xmm2

versus

movups xmm1, xmm2/mem128
movups xmm1/mem128, xmm2


movupd moves an unaligned vector of 2 x 64-bit floating-point values,

movups moves an unaligned vector of 4 x 32-bit floating-point values.

HOWEVER neither instruction checks the actual bits, i.e. AFAICT they are IDENTICAL in action and could just as well be moving, say, 2 qwords.

What's the point of wasting opcodes on instructions with identical effect?

why not just have:

mov_arbitrary_128_bit_pattern xmm1, xmm2/mem128
mov_arbitrary_128_bit_pattern xmm1/mem128, xmm2

?

and free up opcodes?

SSE does this all the time having different instructions which
only differ at the interpretative level, IDENTICAL at the h/w level.

checking the bit pattern in hardware would slow things down
and be pointless if you have type checking in software.

both have identical exceptions and there is no wrong bit format
exception.
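
A minimal C sketch of the claim above, assuming an x86 target with SSE2 and a compiler that ships `<emmintrin.h>`. The helper names (`copy16_ps`, `copy16_pd`, `moves_agree`) are hypothetical, and a compiler is free to emit a different move instruction than the intrinsic name suggests, so this only illustrates that the moved data is bit-identical whichever "type" the move is labelled with.

```c
#include <emmintrin.h>  /* SSE/SSE2 intrinsics */
#include <stdint.h>
#include <string.h>

/* Copy 16 bytes with the single-precision unaligned load/store
   (these intrinsics correspond to movups). */
static void copy16_ps(void *dst, const void *src)
{
    __m128 v = _mm_loadu_ps((const float *)src);
    _mm_storeu_ps((float *)dst, v);
}

/* Copy 16 bytes with the double-precision unaligned load/store
   (these intrinsics correspond to movupd). */
static void copy16_pd(void *dst, const void *src)
{
    __m128d v = _mm_loadu_pd((const double *)src);
    _mm_storeu_pd((double *)dst, v);
}

/* Returns 1 if both moves reproduce the two source qwords bit for bit. */
int moves_agree(const uint64_t src[2])
{
    uint64_t a[2], b[2];
    copy16_ps(a, src);
    copy16_pd(b, src);
    return memcmp(a, src, 16) == 0 && memcmp(b, src, 16) == 0;
}
```

Calling `moves_agree` on any pair of qwords, including patterns that would be NaNs if read as floating point, is expected to return 1, since neither move inspects the bits.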

If you want to copy memory fast you COULD use movapd, and the memory being copied could be bytes. That would copy 16 bytes at a time.

movapd has identical effect to movaps, these being the aligned versions of movupd and movups.
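
The byte-copy idea can be sketched in C as follows, assuming an x86 target with SSE2. The function name `copy_aligned_pd` is hypothetical, and the sketch assumes both pointers are 16-byte aligned and the length is a multiple of 16; with a misaligned pointer the aligned move would fault.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>

/* Copy n bytes, 16 at a time, using the aligned double-precision
   load/store (these intrinsics correspond to movapd).
   Assumption: dst and src are 16-byte aligned and n % 16 == 0. */
void copy_aligned_pd(unsigned char *dst, const unsigned char *src, size_t n)
{
    for (size_t i = 0; i < n; i += 16) {
        __m128d v = _mm_load_pd((const double *)(src + i));
        _mm_store_pd((double *)(dst + i), v);
    }
}
```

The data never has to be valid doubles: the loop just shuttles 128-bit patterns, which is why plain bytes come through unchanged.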
mattst88 23 Aug 2008, 18:24
I have wondered this too.

lazer1 wrote:
SSE does this all the time having different instructions which
only differ at the interpretative level, IDENTICAL at the h/w level.


How do you know they are the same at the hardware level?

revolution 23 Aug 2008, 18:32
See here for a discussion about this.
asmfan 23 Aug 2008, 18:36
There are many more:
movaps (SSE) vs. movapd (SSE2) vs. movdqa (SSE2).
the same for unaligned.
Uncached movntps, movntpd, movntdq.
xorps (SSE) vs. xorpd (SSE2) vs. pxor (since MMX)
same for other bit instructions.
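
As a rough illustration of the XOR trio, here is a C sketch assuming an x86 target with SSE2. The function name `xor_variants_agree` is hypothetical; the three intrinsics used correspond to xorps, xorpd and pxor, although a compiler may pick a different encoding than the intrinsic name suggests.

```c
#include <emmintrin.h>  /* SSE/SSE2 intrinsics */
#include <stdint.h>
#include <string.h>

/* XOR two 16-byte blocks with the single-precision, double-precision and
   integer forms, then check that all three match a plain byte-wise XOR.
   Returns 1 when every result is identical. */
int xor_variants_agree(const uint8_t a[16], const uint8_t b[16])
{
    uint8_t r_ps[16], r_pd[16], r_int[16];

    _mm_storeu_ps((float *)r_ps,
        _mm_xor_ps(_mm_loadu_ps((const float *)a),
                   _mm_loadu_ps((const float *)b)));
    _mm_storeu_pd((double *)r_pd,
        _mm_xor_pd(_mm_loadu_pd((const double *)a),
                   _mm_loadu_pd((const double *)b)));
    _mm_storeu_si128((__m128i *)r_int,
        _mm_xor_si128(_mm_loadu_si128((const __m128i *)a),
                      _mm_loadu_si128((const __m128i *)b)));

    /* Reference result computed one byte at a time. */
    for (int i = 0; i < 16; i++) {
        if (r_int[i] != (uint8_t)(a[i] ^ b[i]))
            return 0;
    }
    return memcmp(r_ps, r_int, 16) == 0 && memcmp(r_pd, r_int, 16) == 0;
}
```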
LocoDelAssembly 28 Aug 2008, 17:14
Madis731 16 Sep 2008, 06:51
Intel Larrabee SIGGRAPH paper wrote:

Additional load and store instructions
support a wider variety of conversions between floating point
values and the less common or more complex data formats found
on most GPUs. Using separate instructions for these formats saves
significant area and power at a small performance cost.

If this document is any clue to Intel's design model, it seems that having both movupd and movups is simpler to implement, saving die area and power while losing only a bit of performance. Having a conversion later would add one operation, and Intel thinks that is bad.

EDIT: the link: http://isdlibrary.intel-dispatch.com/isd/1999/Siggraph_Larrabee_paper.pdf
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
