flat assembler
Message board for the users of flat assembler.

SSE4 white paper is now available from Intel

revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20516
Location: In your JS exploiting you and your system
revolution 22 Nov 2006, 12:40
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8367
Location: Kraków, Poland
Tomasz Grysztar 22 Nov 2006, 12:48
Of course, the opcodes are still confidential?
vid
Verbosity in development


Joined: 05 Sep 2003
Posts: 7105
Location: Slovakia
vid 22 Nov 2006, 12:50
well... both FASM and YASM already have SSSE3 / SSE4 support
Garthower



Joined: 21 Apr 2006
Posts: 158
Location: Ukraine
Garthower 22 Nov 2006, 12:51
Interesting extensions. The string-handling instructions could be very useful. I think support for these instructions should be added to FASM.
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8367
Location: Kraków, Poland
Tomasz Grysztar 22 Nov 2006, 12:55
vid: SSSE3 is not SSE4.
vid
Verbosity in development


Joined: 05 Sep 2003
Posts: 7105
Location: Slovakia
vid 22 Nov 2006, 14:26
so this SSE4 is *not* SSSE3? some real SSE4 this time? wow
smiddy



Joined: 31 Oct 2004
Posts: 557
smiddy 22 Nov 2006, 17:06
Hi Tomasz, here ya go: http://www.intel.com/design/processor/manuals/253667.pdf
&
http://developer.intel.com/design/processor/manuals/253666.pdf

Whoops, a quick look and those instructions from the white paper are not in here, bummer!
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20516
Location: In your JS exploiting you and your system
revolution 22 Nov 2006, 23:06
Sorry if the title confused people. I didn't mean to suggest that the opcodes had been published. Maybe a moderator can change the title to "SSE4 white paper is now available from Intel".

Anyhow, the new instructions are not particularly exciting unless you happen to be programming in a specific field that can make use of them. The dot product and rounding instructions look to be the most useful. The string instructions, which the paper mentions as benefiting ZLIB, could also be worthwhile.
r22



Joined: 27 Dec 2004
Posts: 805
r22 24 Nov 2006, 06:27
Packed multiply with 32-bit and 64-bit operands is great, something I've personally wanted.

CRC32 opcode sounds like it would be very useful. I'll hold my final judgements until the performance and latency data is available.

I can see why dot product is being added, might as well already have it in the architecture when we start seeing CPU/GPU combo processors in a few years.

WHAT I STILL WANT:
PLB/PLW/PLD/PLDQ packed load byte/word/dword/qword - replacing the addresses stored in the XMM register with the data at those addresses.
Parallel random access to memory data would rock. Imagine being able to do multiple lookups on multiple LUTs at the same time; that would be incredibly useful.
This combined with packed multiply and packed add for the address calculation would be so cool.
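
A minimal sketch of what such a packed load would do, modeled in Python rather than as a real instruction. PLD is the wished-for mnemonic from the post above, not an existing opcode, and the names `packed_load_dword` and `lut_lookup` are hypothetical. The address calculation stands in for the packed multiply plus packed add step.

```python
import struct

def packed_load_dword(memory, addresses):
    """Model of the hypothetical PLD: each lane holds a byte address and
    is replaced by the little-endian dword stored at that address."""
    return [struct.unpack_from("<I", memory, addr)[0] for addr in addresses]

def lut_lookup(memory, base, indices, stride=4):
    """Turn per-lane indices into addresses (the packed multiply and
    packed add step), then gather all lanes in one packed load."""
    addresses = [base + index * stride for index in indices]
    return packed_load_dword(memory, addresses)

# Toy memory image: 16 bytes of padding, then a 16-entry LUT of squares.
memory = bytes(16) + struct.pack("<16I", *[n * n for n in range(16)])

# Four lanes, four independent lookups into the same table.
print(lut_lookup(memory, 16, [3, 0, 15, 7]))  # [9, 0, 225, 49]
```

The model runs the lanes one after another; the appeal of a hardware version would be doing all of the loads with a single instruction.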

Shame I don't design microprocessors :/
Maverick



Joined: 07 Aug 2006
Posts: 251
Location: Citizen of the Universe
Maverick 24 Nov 2006, 07:52

r22 wrote:
Shame I don't design microprocessors :/

You can. Get an FPGA board and learn Verilog.

_________________
Greets,
Fabio
MCD



Joined: 21 Aug 2004
Posts: 602
Location: Germany
MCD 29 Dec 2006, 07:21
Maverick wrote:

r22 wrote:
Shame I don't design microprocessors :/

You can. Get an FPGA board and learn Verilog.

sure, but not everyone has time/money to do so

_________________
MCD - the inevitable return of the Mad Computer Doggy

-||__/
.|+-~
.|| ||
f0dder



Joined: 19 Feb 2004
Posts: 3175
Location: Denmark
f0dder 29 Dec 2006, 12:24
Hmm, a CRC32 opcode seems pretty silly to me - CRC32 is already blazing fast with a LUT implementation. Where would this be useful? Outside of networking, that is, where it really should be moved onto the NIC. OK, Intel mentions iSCSI and RDMA, but ho hum.
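
For reference, a minimal sketch of the table-driven (LUT) approach mentioned above, in Python. It assumes the CRC-32C (Castagnoli) polynomial, which is the one iSCSI uses; the function and constant names are illustrative only.

```python
# Assumption: CRC-32C (Castagnoli) polynomial, in bit-reflected form.
CRC32C_POLY_REFLECTED = 0x82F63B78

def make_table(poly):
    """Precompute the CRC of every possible byte value (256 entries)."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            # Shift out one bit; fold in the polynomial if that bit was set.
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
        table.append(crc)
    return table

TABLE = make_table(CRC32C_POLY_REFLECTED)

def crc32c(data, crc=0):
    """One table lookup per input byte; pass a previous result as `crc`
    to continue a running checksum."""
    crc ^= 0xFFFFFFFF
    for b in data:
        crc = TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF

print(hex(crc32c(b"123456789")))  # 0xe3069283, the standard CRC-32C check value
```

The inner loop is one XOR, one table load, and one shift per byte, which is why the LUT version is already quick; a dedicated opcode would replace that loop body with a single instruction per chunk of input.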
comrade



Joined: 16 Jun 2003
Posts: 1150
Location: Russian Federation
comrade 31 Dec 2006, 17:34
CISC all the way!
Copyright © 1999-2025, Tomasz Grysztar.