Mandelbrot Benchmark FPU/SSE2 released


Alphonso

Joined: 16 Jan 2007
Posts: 295

Alphonso 24 Jan 2011, 12:14

AFAIK the Linpack benchmark with AVX can show better than 80% improvement over SSE. It would be interesting to see what it could do for Mandelbrot.


tthsqe

Joined: 20 May 2009
Posts: 767

tthsqe 27 Mar 2011, 08:17

Has anyone updated this CPU burn to AVX support yet?


Madis731

Joined: 25 Sep 2003
Posts: 2138
Location: Estonia

Madis731 28 Mar 2011, 09:19

yeah, that would be sweet!


tthsqe

Joined: 20 May 2009
Posts: 767

tthsqe 01 Apr 2011, 13:09

Ok! I'll see what I can do to modify the previously posted code.
I'll start a new thread in the projects section within the next week.


kalambong

Joined: 08 Nov 2008
Posts: 165

kalambong 14 Mar 2012, 08:57

tthsqe wrote:

Ok! I'll see what I can do to modify the previously posted code.
I'll start a new thread in the projects section within the next week.

Can someone point me to the new thread that tthsqe mentioned above, please?

Thank you !

Since the last discussion, Win7 has been out of beta for quite some time, and Sandy Bridge is about to be replaced by Ivy Bridge in a few months.

Wonder what has happened to the Mandelbrot benchmark? Has it been updated with AVX?


revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 20689
Location: In your JS exploiting you and your system

revolution 14 Mar 2012, 09:05

kalambong wrote:

Can someone point me to the new thread that tthsqe mentioned above, please?

It appears to be this:

http://board.flatassembler.net/topic.php?p=127809#127809

It is the very next post by tthsqe after the one above.


kalambong

Joined: 08 Nov 2008
Posts: 165

kalambong 01 Jun 2012, 06:33

Thanks !!


rugxulo

Joined: 09 Aug 2005
Posts: 2341
Location: Usono (aka, USA)

rugxulo 01 Jun 2012, 16:20

You need at least Win7 SP1 to use AVX, right? <sarcasm> Anyways, AVX is obsolete, AVX2 is teh real dealz!!! </sarcasm>


Vitor_boss

Joined: 04 Jun 2012
Posts: 3
Location: Brazil

Vitor_boss 04 Jun 2012, 01:16

Intel Core i3 M330 2130 MHz
Cores: 2
Threads: 4
FPU: 528.867
SSE2: 1493.606
SSE4.1: 1511.689

EDIT: Using KMB V 0.53I-32b-MT


Bernhard Schornak

Joined: 19 Dec 2009
Posts: 5
Location: Augsburg, Germany

Bernhard Schornak 21 Aug 2012, 09:07

AMD FX-8150 (3600 MHz)

FPU : 1318.538
SSE2 : 4641.840
SSE4.1: 4711.906

Processors with one execution pipe per core greatly benefit from your code, because it performs multiple operations on one and the same register in a row. Hence, the results do not show how 'performant' a CPU/FPU combination really is.

Your current code simply puts additional execution pipes to sleep. In the case of the FX-8xxx, three out of four pipes are not fed with appropriate food most of the time, rendering them quite useless. Reordering instructions properly can gain much better results for processors with multiple execution pipes.
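The reordering argument can be illustrated with a minimal C sketch. This is not the benchmark's actual code (which is assembly); the function names and the escape test |z|^2 <= 4 are assumptions made for illustration. `mandel_single` iterates one point, so every operation waits on the previous one. `mandel_pair` interleaves two independent points, which gives a core with several execution pipes independent work to issue side by side.

```c
/* Sketch only: hypothetical helper names, not taken from the benchmark. */

/* One point: each step needs the previous zr/zi, forming one dependency chain. */
static int mandel_single(double cr, double ci, int max_iter)
{
    double zr = 0.0, zi = 0.0;
    int n = 0;
    while (n < max_iter && zr * zr + zi * zi <= 4.0) {
        double t = zr * zr - zi * zi + cr;  /* new real part */
        zi = 2.0 * zr * zi + ci;            /* new imaginary part (uses old zr) */
        zr = t;
        n++;
    }
    return n;
}

/* Two points interleaved: the "a" and "b" chains never read each other's
   values, so their operations can overlap in separate pipes. */
static void mandel_pair(double cra, double cia, double crb, double cib,
                        int max_iter, int *na, int *nb)
{
    double zra = 0.0, zia = 0.0, zrb = 0.0, zib = 0.0;
    int count_a = 0, count_b = 0;
    int live_a = 1, live_b = 1;

    for (int i = 0; i < max_iter && (live_a || live_b); i++) {
        /* Squares for both points are computed back to back. */
        double ra2 = zra * zra, ia2 = zia * zia;
        double rb2 = zrb * zrb, ib2 = zib * zib;

        /* A point stops counting once it has escaped. */
        live_a = live_a && (ra2 + ia2 <= 4.0);
        live_b = live_b && (rb2 + ib2 <= 4.0);

        if (live_a) {
            zia = 2.0 * zra * zia + cia;
            zra = ra2 - ia2 + cra;
            count_a++;
        }
        if (live_b) {
            zib = 2.0 * zrb * zib + cib;
            zrb = rb2 - ib2 + crb;
            count_b++;
        }
    }
    *na = count_a;
    *nb = count_b;
}
```

Whether the interleaved form is actually faster depends on the CPU. An out-of-order core may already overlap successive points from the single-chain version on its own.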


Xorpd!

Joined: 21 Dec 2006
Posts: 161

Xorpd! 21 Aug 2012, 21:28

Performing serial operations on a single register is the way to use a big chunk of the physical register file, given limited architected registers and out-of-order execution. In fact, the numbers attained in this thread are not all that far off the maximum floating-point throughput for pre-AVX processors.

The problem with Bulldozer is that an 8-core CPU only has 4 FPUs and you need to do as much work as possible using FMACs to attain maximum throughput. I am not sure that the code in the companion thread http://board.flatassembler.net/topic.php?p=127809#127809 does this; I think it's all written for Intel processors which would mean that it does not.

You just can't schedule 3 or 4 instruction streams, especially in 32-bit mode, without counting on the out-of-order properties of the processor to interleave instruction streams for you. Maybe with 128 FP registers, but that is an expensive processor and doesn't provide SIMD as far as I know.
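As a rough illustration of the "maximum floating point throughput" mentioned above, the theoretical peak can be estimated as cores x clock x SIMD lanes x operations per lane per cycle. The C sketch below is a back-of-the-envelope helper. The function name and every parameter value are assumptions, and it says nothing about the units of the benchmark scores posted in this thread.

```c
/* Sketch only: hypothetical helper; all inputs are assumptions supplied
   by the caller, not measured values. */
static double peak_gflops(int cores, double clock_ghz,
                          int simd_lanes, int ops_per_lane_per_cycle)
{
    /* GFLOPS = cores * GHz * lanes per instruction * ops per lane per cycle */
    return cores * clock_ghz * simd_lanes * ops_per_lane_per_cycle;
}
```

For example, assuming a hypothetical 2-core CPU at 2.0 GHz that can issue one 2-wide double-precision SSE2 add and one multiply per cycle, `peak_gflops(2, 2.0, 2, 2)` gives 16 GFLOPS. Under the same assumptions, 4-wide AVX would double that.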


tthsqe

Joined: 20 May 2009
Posts: 767

tthsqe 25 Aug 2012, 02:30

Hey, it looks like we have someone with a Bulldozer chip!
Could you tell me how Bulldozer performs on the program posted here: http://board.flatassembler.net/topic.php?p=127809#127809
You have to hit "R" to change the calculation path used.
Just explore a little bit and tell me the max GFLOPS (upper left) you encountered on each calculation path.


Kuemmel

Joined: 30 Jan 2006
Posts: 200
Location: Stuttgart, Germany

Kuemmel 27 Aug 2012, 16:57

I would also be curious whether tthsqe's benchmark shows an improvement regarding the "Bulldozer" issue. My benchmark is, let's say, kind of obsolete in times of AVX, but of course not everyone has an AVX chip yet, so it's still relevant for those...

I can only state that my benchmark was never developed to favour either Intel or AMD. I tested reordering of instructions with old Semprons and Phenoms with almost no difference.

I remember that when I got help from Xorpd with the multiple instruction streams, the Intel chips with the Core and Core 2 architectures were just so much faster out of the box, while the AMD chips didn't do much. Phenom only picked up speed because (I think) of the doubling of the path at that time.

Either the out-of-order design or the overall number of instruction units of the AMD chips is just not as good as Intel's.

As Xorpd stated, with AVX fused multiply-add instructions it might be a different story. My benchmark perhaps reflects more the past and current non-AVX software, where I would say Intel's FPU/SSE is just superior.

