flat assembler
Message board for the users of flat assembler.

Intel releases Haswell's new AVX2 instructions

revolution 13 Jun 2011, 08:16
See the PDF: http://software.intel.com/file/36945

PDEP / PEXT look interesting. It looks like a lot of silicon will be needed to support just these two. I wonder where they are useful?
r22 13 Jun 2011, 16:46
Very interesting instructions, I'd love to read the rationale for their implementation.

PEXT is a move-bit-mask operation with a bit selector; I can kind of see how this would be convenient.
But is PDEP just the inverse of PEXT for the sake of symmetry?

Maybe custom error correction/parity bit encoding?
Or perhaps the intended usage is more mathematically motivated, like checking whether an integer N is a multiple of some polynomial by using PEXT to extract the necessary bit patterns and comparing them?
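
For readers following the PEXT/PDEP discussion above, here is a minimal sketch in portable C of what the two instructions compute, based on their documented BMI2 semantics. The names pext_soft and pdep_soft are hypothetical (they are not intrinsics or library functions), and the bit-by-bit loops only model the result; they say nothing about how the hardware implements it.

```c
#include <stdint.h>

/* Hypothetical software model of PEXT: gather the bits of src that are
   selected by mask and pack them, in order, into the low bits of the result. */
uint64_t pext_soft(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;
    uint64_t out_bit = 1;                      /* next free result position */
    while (mask != 0) {
        uint64_t lowest = mask & (~mask + 1);  /* isolate lowest set mask bit */
        if (src & lowest)
            result |= out_bit;
        out_bit <<= 1;
        mask &= mask - 1;                      /* clear that mask bit */
    }
    return result;
}

/* Hypothetical software model of PDEP: take the low bits of src, in order,
   and scatter them to the positions where mask has a set bit. */
uint64_t pdep_soft(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;
    uint64_t in_bit = 1;                       /* next source bit to consume */
    while (mask != 0) {
        uint64_t lowest = mask & (~mask + 1);  /* isolate lowest set mask bit */
        if (src & in_bit)
            result |= lowest;
        in_bit <<= 1;
        mask &= mask - 1;                      /* clear that mask bit */
    }
    return result;
}
```

With this model, pext_soft(0x12345678, 0xFF00) gives 0x56 and pdep_soft(0x56, 0xFF00) gives 0x5600, so for a fixed mask PDEP undoes PEXT on the selected bits.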
Tomasz Grysztar 13 Jun 2011, 18:17
Remember the undocumented IBTS/XBTS instructions from the first-generation 80386? It seems that they finally landed on this idea again.
idle 13 Jun 2011, 18:52

Tomasz,

The 'tables.inc' file uses the DW directive.
Will it cope with growth in the number of elements?
Are you planning to use DD instead?
Will reserved fields also appear, say, for CPU version (.386) control?
Why don't you use a macro for that purpose?

how about:
CPU equ etc1
CPU equ etc2
restore CPU
?

Do you have any unrealized, desired, undesired, or still-debated plans for fasm?
Madis731 09 Jul 2013, 21:41
I recently discovered this document via a blog's comment section (I couldn't find it again) where they talked about BMI taking a lot of silicon.
http://palms.princeton.edu/system/files/IEEE_TC09_NewBasisForShifters.pdf
In there you can find that butterfly circuitry does not require more transistors (they even claim it takes fewer) to implement shifts, rotates, and the PDEP/PEXT instruction pair, which allows almost all permutations.

I've found that the throughput is actually 1 clock (with 3 clock latency):
Code:
 instruction            u  p0  p1  p5  p6  p23  p4  p7  tp  l
pext/dep r,r,r          1      1                        1   3
pext/dep r,r,m          2      1           1            1   3
bextr r,r,r             2  x   x   x   x                1   2
bextr r,m,r             3  x   x   x   x   1            1   2
blsr/i/msk r,r          1      x   x                    0.5 1
blsr/i/msk r,m          2      x   x       1            0.5 1
l/tzcnt r,r             1      1                        1   3
l/tzcnt r,m             2      1           1            1   3
bzhi r,r,r              1      x   x                    0.5 1
bzhi r,m,r              2      x   x       1            0.5 1
    

My guess is that if they allowed pext/pdep to go to any port (0, 1, 5, 6), which would mean 4 times the transistors, it could have the throughput that a NOP or ADD r,r has on Haswell right now.