flat assembler
Message board for the users of flat assembler.
  
Finding max clique size and C++ is better than asm.
Tyler, 23 Feb 2017, 03:40
revolution wrote: So who won the awesome life changing prize of $60? I'm so jealous.
redsock, 23 Feb 2017, 20:32
Nice! I trust you have treated yourself to some fine food and wine with your unclaimed prize.
bitRAKE, 23 Feb 2018, 20:24
The goal would also benefit almost linearly from multi-threading. The only data which isn't read-only is the common best clique size - which only needs updates on improvement.

I should also reference the AVX2+ improvements:
Faster Population Counts Using AVX2 Instructions
https://github.com/WojciechMula/sse-popcount

The importance of this algorithm in Chemistry and Bioinformatics is interesting.

p.s. Those who enjoyed this thread will love this: http://danluu.com/assembly-intrinsics/

[Thread has been dormant for exactly a year.]
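The shared best-size idea in the post above can be sketched roughly as follows. This is a minimal illustration and not code from the thread: `update_best` and `run_workers` are hypothetical names, and the workers here only report precomputed sizes where a real search would branch over a read-only adjacency structure. The one shared write is a compare-and-swap that succeeds only when a thread has found a larger clique.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Raise the shared best to `candidate` if it is an improvement.
// Returns true only if this call performed the update.
bool update_best(std::atomic<int>& best, int candidate) {
    assert(candidate >= 0);  // clique sizes are non-negative
    int current = best.load(std::memory_order_relaxed);
    while (candidate > current) {
        // On failure, compare_exchange_weak reloads `current`, so the loop
        // re-checks against whatever another thread just stored.
        if (best.compare_exchange_weak(current, candidate,
                                       std::memory_order_relaxed)) {
            return true;
        }
    }
    return false;  // someone else already holds an equal or better size
}

// Hypothetical driver: one thread per worker, each reporting the sizes it found.
// Everything except `best` is read-only while the threads run.
int run_workers(const std::vector<std::vector<int>>& found_per_worker) {
    std::atomic<int> best{0};
    std::vector<std::thread> pool;
    for (const auto& found : found_per_worker) {
        pool.emplace_back([&found, &best] {
            for (int size : found) update_best(best, size);
        });
    }
    for (auto& t : pool) t.join();  // join orders the final read after all writes
    return best.load();
}
```

Because the value only ever grows, a relaxed ordering is enough when it is used purely as a pruning bound; a stale read costs some extra search work at most, not correctness.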
Melissa, 27 Feb 2018, 06:00
bitRAKE wrote: The goal would also benefit almost linearly from multi-threading. The only data which isn't read-only is the common best clique size - which only needs updates on improvement.
avx2 popcount is same speed as native instruction when unrolling loop at 4 instructions, and sse3 is slower...
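For readers wondering what a four-way unrolled loop around the native instruction might look like, here is a rough sketch. It is not the benchmark code behind the post above and makes no claim about timing. It assumes GCC or Clang (for `__builtin_popcountll`), and `popcount_unrolled4` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Count the set bits in words[0..n) with the loop unrolled four times.
// __builtin_popcountll is a GCC/Clang builtin; it becomes a single POPCNT
// instruction only when the target enables it (e.g. -mpopcnt or -march=haswell).
std::uint64_t popcount_unrolled4(const std::uint64_t* words, std::size_t n) {
    assert(words != nullptr || n == 0);
    // Four independent accumulators, so the additions do not form one
    // long dependency chain.
    std::uint64_t c0 = 0, c1 = 0, c2 = 0, c3 = 0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        c0 += __builtin_popcountll(words[i + 0]);
        c1 += __builtin_popcountll(words[i + 1]);
        c2 += __builtin_popcountll(words[i + 2]);
        c3 += __builtin_popcountll(words[i + 3]);
    }
    // Tail: fewer than four words left.
    for (; i < n; ++i) c0 += __builtin_popcountll(words[i]);
    return c0 + c1 + c2 + c3;
}
```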
revolution, 27 Feb 2018, 06:56
Melissa wrote: avx2 popcount is same speed as native instruction when unrolling loop at 4 instructions, and sse3 is slower...
Melissa, 27 Feb 2018, 23:40
Haswell.
bitRAKE, 23 Apr 2018, 06:48
https://github.com/WojciechMula/sse-popcount/blob/master/results/haswell/haswell-i7-4770-gcc5.3.0-avx2.rst

The results for Haswell show the Harley-Seal popcount algorithm to be the fastest on 4K input. The paper also states this result. I've not read the other paper referenced at the end, regarding GPU results. The GitHub code seems to be up to date on other processors as well.
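As a rough idea of how Harley-Seal works: it feeds the input words through carry-save adders (CSA) so that several words are compressed into a few running bit-planes, and far fewer popcounts are needed. The sketch below is a scalar version that handles 8 words per iteration. The linked repository uses AVX2 vector code, so this is not that implementation; the names `csa` and `popcount_harley_seal` are hypothetical, and `__builtin_popcountll` assumes GCC or Clang.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Carry-save adder: per bit position, a + b + c == 2*high + low.
static void csa(std::uint64_t& high, std::uint64_t& low,
                std::uint64_t a, std::uint64_t b, std::uint64_t c) {
    const std::uint64_t u = a ^ b;
    high = (a & b) | (u & c);
    low = u ^ c;
}

// Scalar Harley-Seal popcount over d[0..n), 8 words per iteration.
std::uint64_t popcount_harley_seal(const std::uint64_t* d, std::size_t n) {
    assert(d != nullptr || n == 0);
    std::uint64_t total = 0;  // number of "eights" bits seen so far
    std::uint64_t ones = 0, twos = 0, fours = 0, eights = 0;
    std::uint64_t twosA, twosB, foursA, foursB;
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        csa(twosA, ones, ones, d[i + 0], d[i + 1]);
        csa(twosB, ones, ones, d[i + 2], d[i + 3]);
        csa(foursA, twos, twos, twosA, twosB);
        csa(twosA, ones, ones, d[i + 4], d[i + 5]);
        csa(twosB, ones, ones, d[i + 6], d[i + 7]);
        csa(foursB, twos, twos, twosA, twosB);
        csa(eights, fours, fours, foursA, foursB);
        total += __builtin_popcountll(eights);  // one popcount per 8 words
    }
    // Weight each bit-plane by the value its bits stand for.
    total = 8 * total
          + 4 * __builtin_popcountll(fours)
          + 2 * __builtin_popcountll(twos)
          + __builtin_popcountll(ones);
    // Tail: fewer than eight words left, counted directly.
    for (; i < n; ++i) total += __builtin_popcountll(d[i]);
    return total;
}
```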
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.