flat assembler
Message board for the users of flat assembler.

Finding max clique size and C++ is better than asm.

Tyler



Joined: 19 Nov 2009
Posts: 1216
Location: NC, USA
Tyler 20 May 2016, 22:37
I suppose it was redsock, but I'm not going to take the time to work it into a form I can test. I'll forget all the annoying rules and award it to him.

redsock, post or PM me a BTC address.
Tyler



Joined: 19 Nov 2009
Posts: 1216
Location: NC, USA
Tyler 23 Feb 2017, 03:40
revolution wrote:
So who won the awesome life changing prize of $60? I'm so jealous.
$165 now, btw. :P
redsock



Joined: 09 Oct 2009
Posts: 435
Location: Australia
redsock 23 Feb 2017, 20:32
Nice! I trust you have treated yourself to some fine food and wine with your unclaimed prize :)

_________________
2 Ton Digital - https://2ton.com.au/
bitRAKE



Joined: 21 Jul 2003
Posts: 4073
Location: vpcmpistri
bitRAKE 23 Feb 2018, 20:24
The search would also scale almost linearly with multi-threading. The only data that isn't read-only is the shared best clique size, which only needs updating when a thread finds an improvement.

I should also reference the AVX2+ improvements:
Faster Population Counts Using AVX2 Instructions
https://github.com/WojciechMula/sse-popcount

The importance of this algorithm in Chemistry and Bioinformatics is interesting.

p.s. Those who enjoyed this thread will love this:
http://danluu.com/assembly-intrinsics/

[Thread has been dormant for exactly a year. :D]
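
To illustrate the multi-threading point above, here is a minimal C++ sketch of a shared best clique size that is only written on improvement. It is not taken from any of the contest entries: `best_size`, `update_best`, `worker` and `run_demo` are hypothetical names, and the worker just replays a list of "found" sizes instead of running a real search.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Shared best clique size: the only state the workers write to.
std::atomic<int> best_size{0};

// Raise best_size to `candidate` only if it is an improvement.
// Returns true when this call actually raised the shared value.
bool update_best(int candidate) {
    assert(candidate >= 0);  // clique sizes are never negative
    int current = best_size.load(std::memory_order_relaxed);
    while (candidate > current) {
        // On failure, compare_exchange_weak reloads `current` for the retry.
        if (best_size.compare_exchange_weak(current, candidate,
                                            std::memory_order_relaxed)) {
            return true;
        }
    }
    return false;  // someone else already holds an equal or better size
}

// Hypothetical stand-in for one thread's search: it only reports the
// clique sizes it "found" (assumption, no real graph is searched here).
void worker(const std::vector<int>& found_sizes) {
    for (int s : found_sizes) update_best(s);
}

// Runs three workers over made-up results and returns the final best size.
int run_demo() {
    best_size.store(0);
    const std::vector<std::vector<int>> per_thread = {
        {3, 5, 4}, {2, 7, 6}, {1, 6, 5}};
    std::vector<std::thread> threads;
    for (const auto& sizes : per_thread)
        threads.emplace_back(worker, std::cref(sizes));
    for (auto& t : threads) t.join();
    return best_size.load();
}
```

Because most candidates fail the `candidate > current` test, the common path is a single relaxed load, so the shared counter should rarely become a point of contention.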
Melissa



Joined: 12 Apr 2012
Posts: 125
Melissa 27 Feb 2018, 06:00
bitRAKE wrote:
The search would also scale almost linearly with multi-threading. The only data that isn't read-only is the shared best clique size, which only needs updating when a thread finds an improvement.

I should also reference the AVX2+ improvements:
Faster Population Counts Using AVX2 Instructions
https://github.com/WojciechMula/sse-popcount

The importance of this algorithm in Chemistry and Bioinformatics is interesting.

p.s. Those who enjoyed this thread will love this:
http://danluu.com/assembly-intrinsics/

[Thread has been dormant for exactly a year. :D]


The AVX2 popcount is the same speed as the native popcnt instruction when the loop is unrolled to 4 instructions, and the SSE3 version is slower...
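
For reference, a loop over the native instruction unrolled four ways might look roughly like the sketch below. This is an assumption about the shape of such a loop, not the code that was timed: `popcount_unrolled4` is a hypothetical name, and `__builtin_popcountll` is the GCC/Clang builtin, which only compiles to a single `popcnt` when the target allows it (for example with `-mpopcnt` or `-march=haswell`).

```cpp
#include <cstddef>
#include <cstdint>

// Count the set bits in `n` 64-bit words.
// Four independent accumulators keep the four popcounts in each iteration
// free of dependencies on one another.
std::uint64_t popcount_unrolled4(const std::uint64_t* data, std::size_t n) {
    std::uint64_t c0 = 0, c1 = 0, c2 = 0, c3 = 0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        c0 += __builtin_popcountll(data[i + 0]);
        c1 += __builtin_popcountll(data[i + 1]);
        c2 += __builtin_popcountll(data[i + 2]);
        c3 += __builtin_popcountll(data[i + 3]);
    }
    // Tail: fewer than four words left.
    for (; i < n; ++i) c0 += __builtin_popcountll(data[i]);
    return c0 + c1 + c2 + c3;
}
```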
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20451
Location: In your JS exploiting you and your system
revolution 27 Feb 2018, 06:56
Melissa wrote:
The AVX2 popcount is the same speed as the native popcnt instruction when the loop is unrolled to 4 instructions, and the SSE3 version is slower...
For which CPU is that timing?
Melissa



Joined: 12 Apr 2012
Posts: 125
Melissa 27 Feb 2018, 23:40
Haswell.
bitRAKE



Joined: 21 Jul 2003
Posts: 4073
Location: vpcmpistri
bitRAKE 23 Apr 2018, 06:48
https://github.com/WojciechMula/sse-popcount/blob/master/results/haswell/haswell-i7-4770-gcc5.3.0-avx2.rst

The results for Haswell show the Harley-Seal popcount algorithm to be the fastest on 4K input. The paper also states this result. I've not read the other paper referenced at the end - regarding GPU results.

The github code seems to be up to date on other processors as well.
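
For anyone who hasn't seen Harley-Seal, the idea can be sketched in scalar C++ as below. This is a simplified illustration, not the AVX2 code from the repository: it works on plain 64-bit words, uses blocks of 8 words rather than the larger blocks a tuned version would use, and the names `csa` and `popcount_harley_seal` are my own. `__builtin_popcountll` is the GCC/Clang builtin.

```cpp
#include <cstddef>
#include <cstdint>

// Carry-save adder: adds three bit-vectors column by column.
// `low` receives the sum bits, `high` the carry bits (worth twice as much).
static inline void csa(std::uint64_t& high, std::uint64_t& low,
                       std::uint64_t a, std::uint64_t b, std::uint64_t c) {
    const std::uint64_t u = a ^ b;
    high = (a & b) | (u & c);
    low = u ^ c;
}

// Harley-Seal style popcount over `n` 64-bit words, 8 words per block.
// Only one real popcount is issued per block; the rest is bitwise logic.
std::uint64_t popcount_harley_seal(const std::uint64_t* d, std::size_t n) {
    std::uint64_t total = 0, ones = 0, twos = 0, fours = 0;
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        std::uint64_t twosA, twosB, foursA, foursB, eights;
        csa(twosA, ones, ones, d[i + 0], d[i + 1]);
        csa(twosB, ones, ones, d[i + 2], d[i + 3]);
        csa(foursA, twos, twos, twosA, twosB);
        csa(twosA, ones, ones, d[i + 4], d[i + 5]);
        csa(twosB, ones, ones, d[i + 6], d[i + 7]);
        csa(foursB, twos, twos, twosA, twosB);
        csa(eights, fours, fours, foursA, foursB);
        total += __builtin_popcountll(eights);  // each bit here counts as 8
    }
    // Fold in the partial counters left over from the blocks.
    total = 8 * total
          + 4 * __builtin_popcountll(fours)
          + 2 * __builtin_popcountll(twos)
          + __builtin_popcountll(ones);
    // Tail: fewer than 8 words left.
    for (; i < n; ++i) total += __builtin_popcountll(d[i]);
    return total;
}
```

The AVX2 variant applies the same carry-save steps to 256-bit registers, so each `csa` handles four words at once and the per-block popcount is done with a vector routine instead of the scalar instruction.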

Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.

Website powered by rwasa.