flat assembler
Message board for the users of flat assembler.
Finding max clique size and C++ is better than asm.
Tyler 23 Feb 2017, 03:40
revolution wrote: So who won the awesome life changing prize of $60? I'm so jealous.
redsock 23 Feb 2017, 20:32
Nice! I trust you have treated yourself to some fine food and wine with your unclaimed prize
bitRAKE 23 Feb 2018, 20:24
The goal would also benefit almost linearly from multi-threading. The only data which isn't read-only is the common best clique size - which only needs updates on improvement.
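A minimal sketch of the shared "best clique size" update described above, assuming C++11 atomics. The names `update_best` and `run_workers` are hypothetical and not taken from any code in this thread. Workers only write when they actually improve on the current best, so the shared value is rarely contended:

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Raise `best` to `candidate` only if `candidate` is larger.
// Returns true when this call improved the shared value.
bool update_best(std::atomic<int>& best, int candidate) {
    int seen = best.load(std::memory_order_relaxed);
    while (candidate > seen) {
        // On failure, compare_exchange_weak reloads `seen` with the current value,
        // so the loop re-checks whether another thread already did better.
        if (best.compare_exchange_weak(seen, candidate, std::memory_order_relaxed))
            return true;
    }
    return false;
}

// Hypothetical driver: each worker reports the clique sizes it finds.
// In a real search each worker would explore its own part of the search tree
// and could also read `best` to prune branches that cannot beat it.
int run_workers(const std::vector<std::vector<int>>& found_per_worker) {
    std::atomic<int> best{0};
    std::vector<std::thread> pool;
    for (const auto& found : found_per_worker) {
        pool.emplace_back([&best, &found] {
            for (int size : found) update_best(best, size);
        });
    }
    for (auto& t : pool) t.join();
    return best.load();
}
```

Relaxed ordering is enough here only under the assumption that the size is the sole shared value; if the clique itself is also published, a mutex or stronger ordering would be needed.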
I should also reference the AVX2+ improvements: Faster Population Counts Using AVX2 Instructions
https://github.com/WojciechMula/sse-popcount
The importance of this algorithm in Chemistry and Bioinformatics is interesting.
p.s. Those who enjoyed this thread will love this: http://danluu.com/assembly-intrinsics/
[Thread has been dormant for exactly a year.]
Melissa 27 Feb 2018, 06:00
bitRAKE wrote: The goal would also benefit almost linearly from multi-threading. The only data which isn't read-only is the common best clique size - which only needs updates on improvement.
The AVX2 popcount is the same speed as the native instruction when the loop is unrolled to 4 instructions, and SSE3 is slower...
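For reference, a rough sketch of what "native instruction, unrolled to 4" could look like in C++. This is an assumption about the shape of the loop, not Melissa's actual code. `__builtin_popcountll` is a GCC/Clang builtin and only compiles to the hardware POPCNT instruction when the target allows it (e.g. `-mpopcnt` or `-march=haswell`); the function name `popcount_unrolled4` is hypothetical:

```cpp
#include <cstddef>
#include <cstdint>

// Count set bits in `n` 64-bit words, four words per loop iteration.
// Four independent accumulators keep the popcounts from forming one
// long dependency chain.
std::uint64_t popcount_unrolled4(const std::uint64_t* words, std::size_t n) {
    std::uint64_t c0 = 0, c1 = 0, c2 = 0, c3 = 0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        c0 += __builtin_popcountll(words[i + 0]);
        c1 += __builtin_popcountll(words[i + 1]);
        c2 += __builtin_popcountll(words[i + 2]);
        c3 += __builtin_popcountll(words[i + 3]);
    }
    // Tail: fewer than four words left.
    for (; i < n; ++i)
        c0 += __builtin_popcountll(words[i]);
    return c0 + c1 + c2 + c3;
}
```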
revolution 27 Feb 2018, 06:56
Melissa wrote: avx2 popcount is same speed as native instruction when unrolling loop at 4 instructions, and sse3 is slower...
Melissa 27 Feb 2018, 23:40
Haswell.
bitRAKE 23 Apr 2018, 06:48
https://github.com/WojciechMula/sse-popcount/blob/master/results/haswell/haswell-i7-4770-gcc5.3.0-avx2.rst
The results for Haswell show the Harley-Seal popcount algorithm to be the fastest on 4K input. The paper also states this result. I've not read the other paper referenced at the end - regarding GPU results. The GitHub code seems to be up to date on other processors as well.
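To illustrate the idea behind Harley-Seal, here is a simplified scalar sketch, not the AVX2 code benchmarked in the repository. It uses carry-save adders (CSA) to fold four 64-bit words into "ones", "twos" and "fours" bit planes, so only one real popcount is needed per four words. The 4-word block size, the function names, and the use of the GCC/Clang builtin `__builtin_popcountll` are assumptions made for brevity:

```cpp
#include <cstddef>
#include <cstdint>

// Carry-save adder: per bit position, a + b + c == 2*h + l.
// a, b, c are passed by value, so `l` may safely alias the caller's `a`.
static inline void csa(std::uint64_t& h, std::uint64_t& l,
                       std::uint64_t a, std::uint64_t b, std::uint64_t c) {
    const std::uint64_t u = a ^ b;
    h = (a & b) | (u & c);
    l = u ^ c;
}

std::uint64_t popcount_harley_seal(const std::uint64_t* d, std::size_t n) {
    std::uint64_t total = 0;            // counts bits of weight 4 until scaled below
    std::uint64_t ones = 0, twos = 0;   // running bit planes of weight 1 and 2
    std::uint64_t twosA, twosB, fours;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        csa(twosA, ones, ones, d[i + 0], d[i + 1]);
        csa(twosB, ones, ones, d[i + 2], d[i + 3]);
        csa(fours, twos, twos, twosA, twosB);
        total += __builtin_popcountll(fours);   // one popcount per 4 words
    }
    // Weight the planes: each bit in `fours` stood for 4 input bits, etc.
    total = 4 * total
          + 2 * __builtin_popcountll(twos)
          + __builtin_popcountll(ones);
    // Tail: remaining words counted directly.
    for (; i < n; ++i)
        total += __builtin_popcountll(d[i]);
    return total;
}
```

The vectorised versions apply the same CSA network across 256-bit registers and extend it to wider blocks, which is where the reported speedup on larger inputs comes from.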
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.