flat assembler
Message board for the users of flat assembler.

SSE4 (topic is not only about implementing): POPCNT answered

Madis731



Joined: 25 Sep 2003
Posts: 2140
Location: Estonia
Hi,
The main reason for this topic is to get some understanding of POPCNT (SSE4.2, optional). The SSE4 instructions are split into two parts (SSE4.1 and SSE4.2). The second iteration has only 7 instructions, but is it actually 7?
Wikipedia wrote:

Population count (count number of bits set to 1); shares the same opcode for JMPE, the instruction used in Itanium CPUs to escape from IA-32 mode. POPCNT instruction may also be implemented in some processors that do not support SSE4 instruction set extensions (such as AMD K10) and a separate bit can be tested to confirm POPCNT presence.

If it's optional and replaces JMPE, then what happens if it's not present? Should all future code that is meant to run on any processor include runtime detection of POPCNT? Then you would need to replace every occurrence of JMPE accordingly :| This is NOT possible!

And FASM should incorporate that :)

Thoughts, answers... questions?
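
On the runtime detection question: the "separate bit" mentioned in the quote is bit 23 of ECX as returned by CPUID leaf 1. Below is a minimal C sketch of that test, not a definitive implementation. The helper names (`ecx_reports_popcnt`, `cpu_has_popcnt`) are made up for illustration, and the second function assumes GCC or Clang on x86, where `<cpuid.h>` provides `__get_cpuid`.

```c
#include <stdint.h>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>
#define HAVE_CPUID_H 1
#endif

/* CPUID leaf 1 reports POPCNT support in bit 23 of ECX. */
#define CPUID1_ECX_POPCNT (UINT32_C(1) << 23)

/* Pure helper: takes the ECX value that CPUID leaf 1 returned. */
int ecx_reports_popcnt(uint32_t ecx)
{
    return (ecx & CPUID1_ECX_POPCNT) != 0;
}

#ifdef HAVE_CPUID_H
/* Queries the running CPU; returns 0 if leaf 1 is not available.
   Assumption: GCC/Clang on x86 (see the guard above). */
int cpu_has_popcnt(void)
{
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return ecx_reports_popcnt(ecx);
}
#endif
```

A program would call `cpu_has_popcnt()` once at start-up and remember the answer, rather than executing CPUID before every use.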

_________________
My updated idol :D http://www.agner.org/optimize/


Last edited by Madis731 on 09 Jan 2008, 11:44; edited 1 time in total
Post 09 Jan 2008, 10:00
MazeGen



Joined: 06 Oct 2003
Posts: 977
Location: Czechoslovakia
There's a difference: although JMPE employs 0FB8, POPCNT is F30FB8. Wikipedia seems to be wrong here.
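
To make the difference concrete, here is a small C sketch that tells the two byte sequences apart. It is deliberately simplified: only the F3 prefix is recognised, other prefixes (66h, segment overrides, REX) are ignored, and the names `insn_kind` and `classify_0fb8` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum insn_kind { INSN_OTHER, INSN_JMPE, INSN_POPCNT };

/* Classifies the instruction that starts at code[0].
   Simplification: no prefix other than F3 is handled. */
enum insn_kind classify_0fb8(const uint8_t *code, size_t len)
{
    if (len >= 3 && code[0] == 0xF3 && code[1] == 0x0F && code[2] == 0xB8)
        return INSN_POPCNT;   /* F3 0F B8 /r */
    if (len >= 2 && code[0] == 0x0F && code[1] == 0xB8)
        return INSN_JMPE;     /* 0F B8 with no F3 in front */
    return INSN_OTHER;
}
```

Since the check for F3 comes first, a POPCNT encoding is never reported as JMPE, which is the whole point of the mandatory prefix.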
Post 09 Jan 2008, 11:40
Madis731



Joined: 25 Sep 2003
Posts: 2140
Location: Estonia
Hmm :) and we can all rest assured! Thanks for clearing this up...
Post 09 Jan 2008, 11:43
f0dder



Joined: 19 Feb 2004
Posts: 3170
Location: Denmark
Personally I wouldn't worry about the JMPE/POPCNT overlay unless you know your app is going to run on Itanium...

But since you ought to check for very late-and-breaking-news instructions such as POPCNT and choose your code paths accordingly, I don't think it'd ever end up being a problem - Itanium CPUs obviously wouldn't support POPCNT :)
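
Choosing the code path could look roughly like the C sketch below: a portable bit-counting fallback plus a selector that is run once at start-up. All names are illustrative, and `hw_impl` stands for a hypothetical POPCNT-based routine supplied by the caller; nothing here is specific to FASM.

```c
#include <stdint.h>

typedef unsigned (*popcount_fn)(uint32_t);

/* Portable fallback: classic parallel bit summing, no POPCNT needed. */
unsigned popcount_sw(uint32_t x)
{
    x = x - ((x >> 1) & 0x55555555u);                 /* 2-bit sums */
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u); /* 4-bit sums */
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;                 /* 8-bit sums */
    return (unsigned)((x * 0x01010101u) >> 24);       /* add the bytes */
}

/* Picks the implementation once.
   have_popcnt: result of a CPUID check done elsewhere.
   hw_impl:     hypothetical POPCNT-based routine, may be null. */
popcount_fn select_popcount(int have_popcnt, popcount_fn hw_impl)
{
    return (have_popcnt && hw_impl) ? hw_impl : popcount_sw;
}
```

The returned pointer is then used for every later call, so the feature check costs nothing in the hot path.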
Post 09 Jan 2008, 12:20
MazeGen



Joined: 06 Oct 2003
Posts: 977
Location: Czechoslovakia
f0dder wrote:
Itanium CPUs obviously wouldn't support POPCNT :)

Why not?
Post 09 Jan 2008, 12:40
f0dder



Joined: 19 Feb 2004
Posts: 3170
Location: Denmark
MazeGen wrote:
f0dder wrote:
Itanium CPUs obviously wouldn't support POPCNT :)

Why not?


Because of the opcode overlaying issue... now, they might support it in native mode, but probably not for emulation. (And x86 hardware emulation on Itanium is slower than software emulation, oh well.)

Is Intel still working on Itanium, btw, or have they stopped after Itanium 2?

_________________
carpe noctem
Post 09 Jan 2008, 12:42
MazeGen



Joined: 06 Oct 2003
Posts: 977
Location: Czechoslovakia
Since POPCNT uses the F3 "prefix", I don't think it really overlays JMPE, as I wrote above. A collision with some older code that would encode JMPE as F30FB8 is very unlikely.
Post 09 Jan 2008, 12:48
Madis731



Joined: 25 Sep 2003
Posts: 2140
Location: Estonia
I think it's pretty clear now - it's one of those errors where people say that "assume makes an ass out of U and me" :P

Assuming really has made me that, but the U in this context is Wikipedia, which I trusted so blindly... oh well :S

I think that if JMPE equ 0FB8 and POPCNT equ F30FB8 counts as a compatibility problem, then it's about as serious as, say,
MOV EAX,90909090h clashing with NOPs ^o)
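
The comparison can be checked with a toy length decoder. This C sketch assumes 32-bit code containing only NOP (90) and MOV EAX, imm32 (B8 followed by four immediate bytes); the function name `count_insns` is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Counts instructions in a buffer that holds only NOP (90) and
   MOV EAX, imm32 (B8 id). Returns -1 on any other or truncated input. */
int count_insns(const uint8_t *code, size_t len)
{
    int count = 0;
    size_t i = 0;
    while (i < len) {
        if (code[i] == 0x90) {
            i += 1;                      /* NOP: one byte */
        } else if (code[i] == 0xB8 && len - i >= 5) {
            i += 5;                      /* opcode plus 4 immediate bytes */
        } else {
            return -1;
        }
        count++;
    }
    return count;
}
```

Fed the bytes B8 90 90 90 90, the decoder sees a single MOV, because the 90 bytes are consumed as the immediate. In the same way the F3 in front of 0F B8 is consumed as part of POPCNT, so no JMPE is ever decoded there.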

OR: Maybe the answer hides in the fact that you use JMPE to escape from IA-32, and then you use POPCNT, when actually it's the very same instruction!!! You don't need JMPE (escape from IA-32) in IA-64 mode, do you? You need POPCNT. Maybe it's pure ingeniousness (if that's a word).


Last edited by Madis731 on 09 Jan 2008, 13:08; edited 1 time in total
Post 09 Jan 2008, 13:05
f0dder



Joined: 19 Feb 2004
Posts: 3170
Location: Denmark
Madis731 wrote:

Assuming really has made me that, but the U in this context is Wikipedia, which I trusted so blindly... oh well :S

I've made that mistake a number of times in the past as well, and will probably make it a few more times in the future too. We should chant to ourselves: "Wikipedia should only be used as a starting point giving hints to where our real research must be done" :P

_________________
carpe noctem
Post 09 Jan 2008, 13:08
Raedwulf



Joined: 13 Jul 2005
Posts: 375
Location: United Kingdom
Madis731 wrote:
Maybe it's pure ingeniousness (if that's a word).

No, but I think you meant "ingenuity". ;)

_________________
Raedwulf
Post 02 May 2008, 11:25
Copyright © 1999-2020, Tomasz Grysztar.