flat assembler
Message board for the users of flat assembler.
Windows > Mandelbrot Benchmark FPU/SSE2 released
Madis731 31 Mar 2008, 06:22
Damn - I meant to try that at home to see whether it crashes on my quad, but I forgot. I'll do it today!
asmfan 31 Mar 2008, 07:18
SSE2 crashes at 4025b0h too.
Code:
.finished_end_of_line:
        ; Crash in there:
        mov     ebp, [ebx+edx*4]        ; get colour word
Did you test it?
Kuemmel 31 Mar 2008, 17:05
@asmfan: Thanks for the hint... I discovered that in a very few cases the iteration count comes out negative.
So for the moment I just catch the error, and I uploaded the modified version to my homepage again... does anybody still get crashes? I'll try to find the flaw in my logic that causes this, though I don't expect the fix to hurt the basic performance much. To find out when this happens: is there a nice routine in DirectDraw that saves a screen dump to the hard drive?
@edit: I found the problem... in about 9 cases out of the millions of iterations done, one pixel of the pixel pair held in one xmm register reaches maximum iterations at the same moment the other one diverges. I fixed that by testing for maximum iterations first. I guess it's working now... though I still have to check the vice versa case. At least no more crashes should happen; the whole logic just needs some corrections.
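A minimal sketch in C of the ordering described in the edit above, not the actual KMB code: each lane of the pixel pair is tested for maximum iterations first and for divergence second, and the count is clamped before it is used as a table index. MAX_ITER, finished_lanes and colour_of are hypothetical names and values chosen for illustration.

```c
#include <stdint.h>

enum { MAX_ITER = 256 };   /* assumed limit, not the value used by KMB */

/* Decide which of the two lanes are finished after an iteration step.
   Maximum iterations is tested first, so a lane that hits the limit in
   the same step in which a lane diverges still gets a defined count.
   Returns a bit mask of finished lanes; idx[] is filled for those lanes. */
static unsigned finished_lanes(const int count[2], const int diverged[2],
                               int idx[2])
{
    unsigned mask = 0;
    for (int i = 0; i < 2; i++) {
        if (count[i] >= MAX_ITER) {        /* limit reached: tested first */
            idx[i] = MAX_ITER;
            mask |= 1u << i;
        } else if (diverged[i]) {          /* escaped before the limit */
            idx[i] = count[i];
            mask |= 1u << i;
        }
    }
    return mask;
}

/* Clamp the count before the lookup, so a negative or oversized value
   cannot read outside the colour table. */
static uint32_t colour_of(const uint32_t *palette, int entries, int count)
{
    if (count < 0 || count >= entries)
        count = entries - 1;
    return palette[count];
}
```

The clamp in colour_of corresponds to "just catching the error"; the ordering in finished_lanes corresponds to the fix of detecting maximum iterations first.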
Madis731 31 Mar 2008, 22:13
SSE2 crashes, even with affinity set to any ONE CPU, or to any two or three.
FPU runs fine with a result of nearly 600. Q6600, W2K3s, x64, SP2
f0dder 31 Mar 2008, 23:04
The FPU version also works here, but there are some rendering artifacts - perhaps there are a few bugs in the graphics code? It gives a result of 756 here; I'm running my Q6600@3.0GHz.
Kuemmel 01 Apr 2008, 20:37
@Madis731 and f0dder,
I'm still searching for the bug fix; I think I didn't catch all overflows of my split 16-bit counters in one 32-bit register... for the moment I try to catch all the overflows. I attached this version directly here for testing... please test again if you have time. Thanks for the help! I still can't reproduce any crashes on a Core 2 Duo 6300.
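To illustrate the kind of overflow meant here, a small sketch in C, assuming a layout with one counter in bits 0..15 and the other in bits 16..31 (the actual layout and code in KMB may differ, and the function names are made up for this example). A plain 32-bit add on the low counter carries into the high one once the low half is full; the safe variants update one half at a time and saturate.

```c
#include <stdint.h>

/* Two 16-bit iteration counters packed into one 32-bit word.
   Assumed layout: bits 0..15 = pixel A, bits 16..31 = pixel B. */

/* Naive increment of pixel A's counter: a plain 32-bit add. If the low
   half is 0xFFFF, the carry spills into pixel B's counter. */
static uint32_t inc_low_naive(uint32_t packed)
{
    return packed + 1u;
}

/* Safe increment of pixel A: work on the low half alone, saturate at
   0xFFFF and leave the high half untouched. */
static uint32_t inc_low_safe(uint32_t packed)
{
    uint32_t lo = packed & 0xFFFFu;
    if (lo != 0xFFFFu)
        lo++;
    return (packed & 0xFFFF0000u) | lo;
}

/* Safe increment of pixel B: same idea for the high half. */
static uint32_t inc_high_safe(uint32_t packed)
{
    uint32_t hi = packed >> 16;
    if (hi != 0xFFFFu)
        hi++;
    return (hi << 16) | (packed & 0xFFFFu);
}
```

For example, inc_low_naive(0x0005FFFF) gives 0x00060000, so pixel B's count changes although only pixel A was stepped, while inc_low_safe leaves the word at 0x0005FFFF.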
edfed 01 Apr 2008, 20:47
I've tested it. It crashes my machine.
I had to shut down by brute force: no keyboard, only the mouse pointer. My machine has no SSE2, but the program should at least tell me that it requires SSE2. Please try to correct this problem.
Kuemmel 02 Apr 2008, 19:12
@edfed: Yep, that's on my list... CPU/SSE2 detection and a warning message...
In the meantime I got positive results from a Core 2 Duo 6550: no crashes any more (it used to crash about every second run). The latest version is now on my webpage for download. I hope it now also works on the quad cores. http://www.mikusite.de/x86/KMB_V0.53F-32b-MT.zip
f0dder 02 Apr 2008, 22:00
Works - Q6600@3GHz scored 3976 with SSE2 version
bitRAKE 02 Apr 2008, 22:58
1.6GHz Dothan / 179.06 SSE2 / 106.72 FPU
About +10% for SSE2.
asmfan 03 Apr 2008, 06:49
Kuemmel, you are using CloseWindow at the end of the program; is that on purpose? I believe you need to use DestroyWindow instead, because CloseWindow just minimizes the window. I ran into a situation where the window wasn't minimized and stayed as a top-level black box above the final message box with the results. I had to Alt+Tab to the message box blindly.
madmatt 03 Apr 2008, 10:13
No crashes here. My results for 'KMB_V0.53F-32b-MT':
CPU: Intel Celeron D 352 3.2GHz (1.5GB RAM)
SSE2: 472.975 <---> FPU: 101.665
Kuemmel 05 Apr 2008, 14:18
@asmfan: I tried 'DestroyWindow', but then my result message window wasn't shown... hm... I don't have much of a clue about OS coding; that part was done by several people on the forum, and I'm still learning about it... so if somebody else has a solution...?
@edfed: I have implemented SSE2 detection now and uploaded it (still available at the same location on my webpage; I didn't bump the version number for that)... hopefully it works and doesn't harm anything. Tested on my old Athlon Thunderbird and on a Sempron, where it works well. I don't check for an FPU, as every CPU since the days of the 486SX should have one... If anybody with a P4 could test again with and without Hyperthreading, that would be nice... it might give a clue about how the upcoming Nehalem, the Core 2 successor, will behave.
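For reference, a short sketch in C of how such a check is commonly done, not necessarily how KMB does it: CPUID leaf 1 reports SSE2 in bit 26 of EDX and the on-chip FPU in bit 0 of EDX. The helper names are made up for this example; the cpu_has_sse2 wrapper assumes GCC or Clang on x86, where <cpuid.h> provides __get_cpuid.

```c
#include <stdint.h>

#define CPUID1_EDX_FPU   (1u << 0)    /* x87 FPU on chip */
#define CPUID1_EDX_SSE2  (1u << 26)   /* SSE2 supported */

/* Pure bit tests on the EDX value returned by CPUID leaf 1. */
static int edx_has_sse2(uint32_t edx)
{
    return (edx & CPUID1_EDX_SSE2) != 0;
}

static int edx_has_fpu(uint32_t edx)
{
    return (edx & CPUID1_EDX_FPU) != 0;
}

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>

/* Query the running CPU. __get_cpuid returns 0 if leaf 1 is not
   available, in which case SSE2 is reported as missing. */
static int cpu_has_sse2(void)
{
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return edx_has_sse2(edx);
}
#endif
```

A program would call cpu_has_sse2() once at startup and show a message box instead of entering the SSE2 loop when it returns 0.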
edfed 10 Apr 2008, 20:20
Look at the three videos at this link; you'll see some very interesting Mandelbrot zooms.
http://perso.numericable.fr/haasjn/haasjn/
Kuemmel 12 Apr 2008, 15:38
...nice movies @edfed...!
Finally I gave the FPU version a big push too, encouraged by what Xorpd! said about applying the same techniques to that shitty stack design... So I tried the following:
1) 2 points iterated in the inner loop, separate exits from the loop
2) 2 points iterated in the inner loop, loop unrolled 1 time, separate exits from the loop
3) 3 points iterated in the inner loop, separate exits from the loop
I tried all of them on an AMD Sempron, a Core 2 Duo and a P4 (no HT), with interesting overall results. Gains are given in percent compared to the old 1 point iteration loop:
CPU                     1)       2)       3)
AMD Sempron, 1800 MHz   + 39 %   + 45 %   + 32 %
Core 2 Duo, 1867 MHz    + 48 %   + 67 %   + 25 %
P4, no HT, 2667 MHz     + 13 %   + 90 %   + 69 %
So the clear winner is version 2. The 3 point loop beat the plain 2 point loop only on the P4, due to its shitty long pipeline design, which gives another hint why they implemented HT at that time. Still looking for P4 HT results. Anyone?
Another problem with the 3 point loop is that I couldn't use the fast FCOMIP instruction directly any more (lack of registers), as it can't take a memory operand, so I had to do an FLD first...
Anyway, I'm quite happy that some lessons learned from SSE2 finally gave the FPU version a boost. Find version 2) attached here. I will also update my website with it soon. Any comments welcome, as always.
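The idea behind variant 1) can be sketched in C roughly like this. It is only an illustration of the loop structure (two independent points advanced in one loop, each with its own exit), not the FPU code from the attachment; MAX_ITER, finish_one and iterate_two are hypothetical names, and a compiler will not necessarily generate x87 code for it.

```c
enum { MAX_ITER = 1000 };   /* assumed limit, not the value used by KMB */

/* Continue a single point from state (zr, zi) at count n until it
   escapes or reaches MAX_ITER. */
static int finish_one(double zr, double zi, double cr, double ci, int n)
{
    while (n < MAX_ITER) {
        double r2 = zr * zr, i2 = zi * zi;
        if (r2 + i2 > 4.0)
            break;
        zi = 2.0 * zr * zi + ci;
        zr = r2 - i2 + cr;
        n++;
    }
    return n;
}

/* Iterate points a and b in the same loop. The two dependency chains
   are independent, so their multiplies and adds can overlap. Each point
   has a separate exit; the other point is then finished on its own. */
static void iterate_two(double ar, double ai, double br, double bi,
                        int *na, int *nb)
{
    double zar = 0.0, zai = 0.0, zbr = 0.0, zbi = 0.0;
    int n = 0;
    while (n < MAX_ITER) {
        double ar2 = zar * zar, ai2 = zai * zai;
        double br2 = zbr * zbr, bi2 = zbi * zbi;
        if (ar2 + ai2 > 4.0) {                  /* exit for point a */
            *na = n;
            *nb = finish_one(zbr, zbi, br, bi, n);
            return;
        }
        if (br2 + bi2 > 4.0) {                  /* exit for point b */
            *nb = n;
            *na = finish_one(zar, zai, ar, ai, n);
            return;
        }
        zai = 2.0 * zar * zai + ai;
        zar = ar2 - ai2 + ar;
        zbi = 2.0 * zbr * zbi + bi;
        zbr = br2 - bi2 + br;
        n++;
    }
    *na = *nb = MAX_ITER;
}
```

Both points end up with the same counts as if each had been run through finish_one alone from zero; only the scheduling of the work changes.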
AlexP 12 Apr 2008, 16:31
That FPU program is pretty cool; unfortunately, under Vista 32-bit it screwed up my computer. The taskbar is totally missing and windows are off the screen. I'll reboot and be right back...
[edit]: I got 969.453 for SSE2 and 304.922 for the FPU.
f0dder 12 Apr 2008, 16:44
Q6600@3.0GHz: FPU: 1261, SSE2: 3976.
Kuemmel 12 Apr 2008, 18:57
AlexP wrote: That FPU program is pretty cool; unfortunately, under Vista 32-bit it screwed up my computer. The taskbar is totally missing and windows are off the screen. I'll reboot and be right back...
Oops... but now it's working!? Hopefully no other harm done... what is your CPU and its MHz? Anyone here with a Pentium-M or other CPUs?
edfed 12 Apr 2008, 19:16
Pentium III-M. Is that any good?
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.