flat assembler
Message board for the users of flat assembler.

More fun with AVX (3D mandelbox)
tthsqe



Joined: 20 May 2009
Posts: 767
tthsqe 22 May 2013, 23:02
Randall's quaternion julia set renderer at http://board.flatassembler.net/topic.php?t=13676 inspired me to try writing a 3D mandelbox (https://sites.google.com/site/mandelbox/what-is-a-mandelbox) renderer on Windows.

Here is this program in action:
http://www.youtube.com/watch?v=-Jo1TqpfpRw (latest version)
http://www.youtube.com/watch?v=AoJcOxTGr4c (earlier version)

Here are some results of the Mandelbox2 benchmark @ 1280x720 (press 'B'):
Code:
processor               SSE     AVX      FMA3    FMA4
amd PhII N660(2C2T)    10250ms  -        -       -
intel i5-3320(2C2T)     7200ms  4750ms   -       -
intel i5-3320(2C4T)     5100ms  3350ms   -       -
intel i7-2600(4C8T)     2650ms  1850ms   -       -
intel i7-4770(4C8T)     2250ms  1500ms   ?       -    

AVX does not show a 2:1 improvement over SSE on Sandy Bridge/Ivy Bridge because the 256-bit vdivpd is split internally, but the improvement is larger on Ivy Bridge than on Sandy Bridge.
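
As a quick check of that claim, the SSE-to-AVX speedups can be computed directly from the Mandelbox2 table above. This is only arithmetic on the posted timings; reading the i7-2600 as Sandy Bridge and the i5-3320 as Ivy Bridge is an inference from the model numbers, not something the table states.

```python
# Timings in ms copied from the Mandelbox2 table: (SSE, AVX).
timings_ms = {
    "intel i5-3320(2C2T)": (7200, 4750),
    "intel i5-3320(2C4T)": (5100, 3350),
    "intel i7-2600(4C8T)": (2650, 1850),
    "intel i7-4770(4C8T)": (2250, 1500),
}

def avx_speedup(sse_ms, avx_ms):
    """Return how many times faster the AVX run is than the SSE run."""
    return sse_ms / avx_ms

speedups = {cpu: avx_speedup(sse, avx) for cpu, (sse, avx) in timings_ms.items()}

for cpu, ratio in speedups.items():
    print(f"{cpu}: {ratio:.2f}x")
```

Every ratio lands between roughly 1.4 and 1.5, well short of 2, and the i7-2600 row is the lowest.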

Mandelbox3 benchmark:
Code:
processor               SSE     AVX      FMA3    FMA4
intel i5-3320(2C4T)     4800ms  3150ms   -       -
intel i7-4770(4C8T)     2120ms  1420ms  1250ms  -    


Please read the beginning of the source before using.


Description: SSE+AVX+FMA3. shader+occlusion+blur.
Download
Filename: Mandelbox3.zip
Filesize: 125.04 KB
Downloaded: 412 Time(s)



Last edited by tthsqe on 09 Nov 2013, 08:32; edited 53 times in total
HaHaAnonymous



Joined: 02 Dec 2012
Posts: 1178
Location: Unknown
HaHaAnonymous 23 May 2013, 01:26
[ Post removed by author. ]


Last edited by HaHaAnonymous on 28 Feb 2015, 20:20; edited 1 time in total
randall



Joined: 03 Dec 2011
Posts: 155
Location: Poland
randall 25 May 2013, 13:44
Great work. Waiting for the next versions with improved shading :)
tthsqe



Joined: 20 May 2009
Posts: 767
tthsqe 26 May 2013, 21:53
Randall,
Suppose we are situated at view point V and we have a light source at point L.
For each point P on the fractal surface I have the following:

0) the number of ray steps taken to reach P
1) RGB values (each between 0 and 1) for color function at P
2) normal to surface (calculated as gradient of distance function at P)
4) a shadow coefficient (between 0 and 1; 1 means P can see L, 0 means P cannot see L)
5) screen space ambient occlusion coefficient (between 0 and 1) at P

Are these enough to make a photo realistic scene? How would one do it?

The shading in the posted picture is based off of 1) alone.
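
Item 2) can be sketched with central differences. This is an illustration only, not the renderer's code: `sphere_distance` is a stand-in distance function (the real one would be the mandelbox distance estimator), and `estimate_normal` and the step size `eps` are names and values chosen for this example.

```python
import math

def sphere_distance(p, radius=1.0):
    """Stand-in distance function: signed distance from p to a sphere at the origin."""
    return math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2) - radius

def estimate_normal(distance, p, eps=1e-4):
    """Approximate the unit normal at p as the normalized gradient of `distance`,
    using central differences along each axis."""
    grad = []
    for axis in range(3):
        hi = list(p)
        lo = list(p)
        hi[axis] += eps
        lo[axis] -= eps
        grad.append((distance(hi) - distance(lo)) / (2.0 * eps))
    length = math.sqrt(sum(g * g for g in grad))
    return [g / length for g in grad]

# On the unit sphere at (0, 0, 1) the normal should point along +z.
normal = estimate_normal(sphere_distance, [0.0, 0.0, 1.0])
print(normal)
```

This form costs six distance evaluations per surface point, which matters when the distance function is as expensive as a fractal iteration.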


HaHa,
Once I get the AVX version optimized and beautiful, I'll scale it down to SSE.
You don't have Sandy Bridge/Ivy Bridge/Bulldozer?
bitRAKE



Joined: 21 Jul 2003
Posts: 4024
Location: vpcmpistri
bitRAKE 27 May 2013, 02:24
tthsqe wrote:
Are these enough to make a photo realistic scene? How would one do it?
Yes.
Code:
// Mandelbox shader by Rrrola
// http://rrrola.wz.cz/

// Blinn-Phong shading model with rim lighting (diffuse light bleeding to the other side).
// `normal`, `view` and `light` should be normalized.
vec3 blinn_phong(vec3 normal, vec3 view, vec3 light, vec3 diffuseColor) {
  vec3 halfLV = normalize(light + view);
  float spe = pow(max( dot(normal, halfLV), 0.0 ), 32.0);
  float dif = dot(normal, light) * 0.5 + 0.75;
  return dif*diffuseColor + spe*specularColor;
}


    col = color(p);
    col = blinn_phong(n, -dp, normalize(eye+vec3(0,1,0)+dp), col);
    col = mix(aoColor, col, ambient_occlusion(p, n));    
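
To connect this shader to the quantities tthsqe listed, here is a rough Python port of `blinn_phong` followed by the ambient occlusion mix. It is a sketch, not Rrrola's or tthsqe's code: `SPECULAR_COLOR` and `AO_COLOR` are placeholder values (the excerpt does not show how `specularColor` and `aoColor` are defined), `shade` is a made-up helper, and scaling the lit color by the shadow coefficient is just one simple choice picked for this example.

```python
import math

SPECULAR_COLOR = (1.0, 1.0, 1.0)  # assumed value; not defined in the excerpt
AO_COLOR = (0.0, 0.0, 0.0)        # assumed value; not defined in the excerpt

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)

def mix(a, b, t):
    """GLSL-style mix: a when t == 0, b when t == 1."""
    return tuple(x * (1.0 - t) + y * t for x, y in zip(a, b))

def blinn_phong(normal, view, light, diffuse_color):
    """Same arithmetic as the GLSL above; all three vectors must be unit length."""
    half_lv = normalize(tuple(l + v for l, v in zip(light, view)))
    spe = max(dot(normal, half_lv), 0.0) ** 32.0
    dif = dot(normal, light) * 0.5 + 0.75
    return tuple(dif * d + spe * s for d, s in zip(diffuse_color, SPECULAR_COLOR))

def shade(color, normal, view, light, shadow, occlusion):
    """Combine surface color, normal, shadow and occlusion coefficients."""
    lit = blinn_phong(normal, view, light, color)
    lit = tuple(c * shadow for c in lit)  # assumption: shadow simply scales the light
    return mix(AO_COLOR, lit, occlusion)

n = (0.0, 0.0, 1.0)
print(shade((0.5, 0.5, 0.5), n, n, n, shadow=1.0, occlusion=1.0))
```

The result can exceed 1.0 (the diffuse term alone ranges from 0.25 to 1.25), so a real renderer would clamp or tone-map before writing pixels. `normalize` also divides by zero when `light` is exactly opposite `view`.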

_________________
¯\(°_o)/¯ “languages are not safe - uses can be” Bjarne Stroustrup
HaHaAnonymous



Joined: 02 Dec 2012
Posts: 1178
Location: Unknown
HaHaAnonymous 27 May 2013, 14:49
[ Post removed by author. ]


Last edited by HaHaAnonymous on 28 Feb 2015, 20:18; edited 1 time in total
randall



Joined: 03 Dec 2011
Posts: 155
Location: Poland
randall 30 May 2013, 11:23
Very impressive! Looks great.
Unfortunately I don't have an AVX CPU yet, but I am buying a Haswell (AVX2) next month, so I will try it then.
Melissa



Joined: 12 Apr 2012
Posts: 125
Melissa 30 May 2013, 21:07
I have tried it on Linux with Wine and unfortunately it
crashes on any input (resize, move, keypress).
The Mandel example that comes with fasmw works.
tthsqe



Joined: 20 May 2009
Posts: 767
tthsqe 30 May 2013, 23:19
Here is my checklist:
- If Wine is an emulator, then it needs to implement the AVX instruction set rather than raise #UD. I also imagine that emulating AVX instructions on a non-AVX CPU is horribly slow...?
- Windows requires an update so that the upper halves of the ymm registers get saved to the thread context.
- Does it draw the initial image correctly and crash after you give it some input? If that is the case, then more investigation is required.
- Can you run it through a debugger and tell me the address on which it crashes?
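
On Linux, the first item can be checked without a debugger, because the kernel lists CPU feature flags in `/proc/cpuinfo`. A minimal sketch, assuming a Linux x86 system; `cpu_flags` is a helper name invented here, and the sample text is made up for the demonstration rather than taken from anyone's machine.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line of
    /proc/cpuinfo-style text (empty set if there is none)."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()

# Made-up sample in the /proc/cpuinfo format, for demonstration only.
sample = "processor\t: 0\nflags\t\t: fpu sse sse2 ssse3 sse4_1 sse4_2 avx\n"
print("avx" in cpu_flags(sample))

# On a real Linux machine:
# with open("/proc/cpuinfo") as f:
#     print("avx" in cpu_flags(f.read()))
```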

Has anybody else got this thing to work on their system? It works perfectly for me...
Melissa



Joined: 12 Apr 2012
Posts: 125
Melissa 31 May 2013, 00:59
OK,
1. Wine is not an emulator; rather, it translates Windows calls to Linux
functions. It is a compatibility layer.
The CPU is AVX capable (i5 3570K).
2. The ymm registers are taken care of by Linux (as the program is executed natively).
3. Yes, it draws the initial image correctly. I run it in a 1024x768 window.
The output is like the picture you posted.
It crashes when pressing a key or resizing.
4. I have attached the Wine debug output after the crash.


Description: Wine(Linux) debug output of MandelboxExplorer.EXE
Download
Filename: backtrace.txt
Filesize: 5.74 KB
Downloaded: 331 Time(s)

tthsqe



Joined: 20 May 2009
Posts: 767
tthsqe 31 May 2013, 01:58
So does Wine load completely different libraries into the program? For example, does Wine load a completely rewritten user32 instead of the one that my real Windows OS loads? In that case, there is no guarantee that Wine will be as forgiving as Windows is of the fact that my window callback function does not preserve all of the registers it should (http://msdn.microsoft.com/en-us/library/9z1stfyw.aspx).

You can see that "Paint:" in "MandelbrotExplorer.asm" calls "PlotMain:", and PlotMain destroys lots of registers.

While I investigate this further, you could in the meantime see whether pushing all the nonvolatile registers at the beginning of my function Paint and then popping them in reverse order at the end of Paint fixes the problem.

Let me know whether it does or doesn't.
HaHaAnonymous



Joined: 02 Dec 2012
Posts: 1178
Location: Unknown
HaHaAnonymous 31 May 2013, 02:00
[ Post removed by author. ]


Last edited by HaHaAnonymous on 28 Feb 2015, 20:17; edited 1 time in total
tthsqe



Joined: 20 May 2009
Posts: 767
tthsqe 31 May 2013, 02:03
HaHa, if it draws the initial image correctly, THEN IT IS WORKING! :)
I have no idea what Wine is, but I suspect the problem is my fault. :(

Quote:
In other words: Avoid Wine when possible.


Actually, this is not a bad idea here - this code is absolutely trivial to port to different OSes. The only OS-specific code is the memory allocation for the pixel buffer and the bitmap display. Oh, and also thread management, but I suspect Linux has a similar API for this.
HaHaAnonymous



Joined: 02 Dec 2012
Posts: 1178
Location: Unknown
HaHaAnonymous 31 May 2013, 02:07
[ Post removed by author. ]


Last edited by HaHaAnonymous on 28 Feb 2015, 20:17; edited 1 time in total
tthsqe



Joined: 20 May 2009
Posts: 767
tthsqe 31 May 2013, 02:19
Well, I think I located the source of the problem - I did violate the register usage convention. We will have to wait to hear back from Melissa to see if this is the problem. Also, Melissa, make sure that you push an odd number of registers to keep the stack pointer dqword aligned. (At least some Windows functions are sensitive to this.)
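
The odd-number rule follows from simple arithmetic. In the Windows x64 calling convention the stack pointer is 16-byte aligned just before a `call`, and the `call` pushes an 8-byte return address, so on entry to a function rsp is 8 bytes off alignment. A small sketch of that arithmetic (the helper name and the sample address are made up for illustration):

```python
def rsp_after_pushes(rsp_at_entry, pushes):
    """Stack pointer after `pushes` 8-byte push instructions (the stack grows down)."""
    return rsp_at_entry - 8 * pushes

# Hypothetical rsp on entry to a callback: 16-byte aligned before the call,
# then lowered by 8 when the call pushed the return address.
rsp_at_entry = 0x7FFE0000 - 8

for pushes in range(5):
    rsp = rsp_after_pushes(rsp_at_entry, pushes)
    state = "aligned" if rsp % 16 == 0 else "misaligned"
    print(pushes, hex(rsp), state)
```

Only the odd push counts leave rsp on a 16-byte boundary. Any further `sub rsp, N` for locals or shadow space then has to keep N a multiple of 16.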
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20303
Location: In your JS exploiting you and your system
revolution 31 May 2013, 02:22
tthsqe wrote:
... the fact that my window callback function does not preserve all of the registers it should ...
Why not? How is it a hardship to add a push and pop into the function?

Relying upon non-documented behaviour in the specific version of your OS is only going to give you problems.
tthsqe



Joined: 20 May 2009
Posts: 767
tthsqe 31 May 2013, 02:37
Like I said, this is my fault. It's not reliance, it's unawareness. It is amazing that it even works at all - my Paint function destroys EVERY register except rbp, r12, r13, and r14. OOPS :oops:

I have had programs crash when the callback didn't preserve rbp, so I think that this is the only one that needs to be preserved in my copy of Windows.
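
For reference, the Windows x64 calling convention lists rbx, rbp, rdi, rsi, rsp, r12 to r15 and xmm6 to xmm15 as nonvolatile. A small sketch comparing that list with the registers Paint is said to preserve; it assumes rsp is restored by the function's normal return path and leaves the xmm registers aside, since the thread does not say which of them Paint touches.

```python
# Nonvolatile general-purpose registers in the Windows x64 calling convention.
NONVOLATILE_GPR = {"rbx", "rbp", "rdi", "rsi", "rsp", "r12", "r13", "r14", "r15"}

# Registers Paint leaves intact according to the post; rsp is an assumption.
preserved_by_paint = {"rbp", "r12", "r13", "r14", "rsp"}

must_be_saved = sorted(NONVOLATILE_GPR - preserved_by_paint)
print(must_be_saved)
```

That leaves four general-purpose registers to push. Four is an even count, so by the alignment rule from the earlier post one more 8-byte adjustment is needed to keep rsp 16-byte aligned.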
Melissa



Joined: 12 Apr 2012
Posts: 125
Melissa 31 May 2013, 02:57
tthsqe wrote:
So does Wine load completely different libraries into the program? For example, does Wine load a completely rewritten user32 instead of the one that my real Windows OS loads? In that case, there is no guarantee that Wine will be as forgiving as Windows is of the fact that my window callback function does not preserve all of the registers it should (http://msdn.microsoft.com/en-us/library/9z1stfyw.aspx).

You can see that "Paint:" in "MandelbrotExplorer.asm" calls "PlotMain:", and PlotMain destroys lots of registers.

While I investigate this further, you could in the meantime see whether pushing all the nonvolatile registers at the beginning of my function Paint and then popping them in reverse order at the end of Paint fixes the problem.

Let me know whether it does or doesn't.


It seems that Wine has its own versions of the libraries.
It works now! Yeeeeeey!
I have pushed all the registers and aligned the stack on 16 bytes (just in case) ;)
The program runs perfectly!
HaHaAnonymous



Joined: 02 Dec 2012
Posts: 1178
Location: Unknown
HaHaAnonymous 31 May 2013, 03:30
[ Post removed by author. ]


Last edited by HaHaAnonymous on 28 Feb 2015, 20:15; edited 1 time in total
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20303
Location: In your JS exploiting you and your system
revolution 31 May 2013, 03:56
tthsqe wrote:
Like I said, this is my fault. It's not reliance, it's unawareness. It is amazing that it even works at all - my Paint function destroys EVERY register except rbp, r12, r13, and r14. OOPS :oops:
Rereading my post, I see it looks a bit mean. Sorry, I didn't intend it to come across like it did.

Anyhow, good to see bugs getting fixed and things improving.