flat assembler
Message board for the users of flat assembler.
revolution 24 Mar 2013, 09:54
comrade wrote: 1) Dynamic code rewriting. Some of these extended instructions are so long that you can easily patch them over with a call to a subroutine. |
|||
![]() |
|
randall 24 Mar 2013, 11:46
comrade wrote:
Intel has such a tool: http://software.intel.com/en-us/articles/intel-software-development-emulator |
|||
![]() |
|
randall 03 Apr 2013, 11:01
I have updated my program. Now it is fully multithreaded. See starting post for more details.
|
|||
![]() |
|
HaHaAnonymous 03 Apr 2013, 16:01
[ Post removed by author. ]
Last edited by HaHaAnonymous on 28 Feb 2015, 21:07; edited 1 time in total |
|||
![]() |
|
keantoken 13 Apr 2013, 14:52
Well then how does one protect their computer? You have to do it somehow.
I will certainly study the multithreading code! Thanks a lot Randall! Here are the results with my FX8350 at 4GHz with 1880MHz RAM:
EDIT: Actually here are the results:
$ time ./qjulia -v
real 0m1.200s
user 0m8.864s
sys 0m0.016s
And here are the results with a 2560x1440 resolution like the original version:
$ time ./qjulia -v
real 0m4.671s
user 0m35.442s
sys 0m0.043s
|||
![]() |
|
Turbo Lover 14 Apr 2013, 17:06
this is awesome! how long did it take you to develop this??? you must be very experienced in asm!
|
|||
![]() |
|
randall 15 Apr 2013, 16:44
Thanks! It took me three or four days for the single-threaded version and another three days for the multithreading code. keantoken motivated me to implement the multithreaded version (thanks!).
The code is rather simple. You just have to know the technique. This is 'standard' distance field raymarching: http://www.iquilezles.org/www/articles/raymarchingdf/raymarchingdf.htm
I knew the algorithm because some time ago I wrote a similar program using C++ and OpenGL. Using GPUs it can run in real time.
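For readers unfamiliar with the technique, here is a minimal Python sketch of distance field raymarching (sphere tracing). It is not taken from qjulia: the unit sphere scene and the names `sphere_distance`, `raymarch`, `max_steps`, `hit_eps` and `max_dist` are assumptions made for illustration. The idea is that a distance estimate tells you how far you can safely step along the ray without passing through the surface.

```python
import math

def sphere_distance(p, radius=1.0):
    # Stand-in scene (assumption): distance from p to a sphere at the origin.
    return math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2) - radius

def raymarch(origin, direction, distance_fn,
             max_steps=128, hit_eps=1e-4, max_dist=100.0):
    # 'direction' is expected to be unit length.
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        d = distance_fn(p)
        if d < hit_eps:
            return t          # close enough to the surface: report a hit
        t += d                # safe step: nothing is closer than d
        if t > max_dist:
            break
    return None               # ray left the scene without hitting anything

hit = raymarch((0.0, 0.0, -3.0), (0.0, 0.0, 1.0), sphere_distance)
miss = raymarch((0.0, 0.0, -3.0), (0.0, 1.0, 0.0), sphere_distance)
```

Swapping `sphere_distance` for a fractal distance estimator gives the kind of renderer discussed in this thread.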
|||
![]() |
|
tthsqe 07 Jun 2013, 00:32
Ok, I just want to get some verification on how this works:
So, at point (X,Y,Z) in three space, you form the quaternions
Code:
z'=1+0i+0j+0k
z=0+Xi+Yj+Zk
Then you iterate
Code:
z'=2*z'*z
z=z*z+c
After a few iterations, you estimate the distance from the point (X,Y,Z) to the surface by
Code:
DE = 0.5* log(|z|)*|z|/|z'|
I see that c is your g_Quat parameter, but is the choice
Code:
z=0+Xi+Yj+Zk
arbitrary?
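The steps in the post above can be sketched in Python as follows. This is only an illustration of the formulas as written in the post, not randall's assembly code: the iteration count, the bailout radius, the early-out for degenerate values and the sample value of `c` are all assumptions.

```python
import math

def qmul(a, b):
    # Hamilton product of quaternions stored as (w, x, y, z).
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def qnorm(q):
    return math.sqrt(sum(v * v for v in q))

def julia_distance(point, c, iterations=10, bailout=16.0):
    X, Y, Z = point
    z = (0.0, X, Y, Z)            # z  = 0 + Xi + Yj + Zk
    dz = (1.0, 0.0, 0.0, 0.0)     # z' = 1 + 0i + 0j + 0k
    for _ in range(iterations):
        # z' = 2*z'*z must use the old z, so it is updated first.
        dz = tuple(2.0 * v for v in qmul(dz, z))
        z = tuple(a + b for a, b in zip(qmul(z, z), c))   # z = z*z + c
        if qnorm(z) > bailout:    # assumed escape test
            break
    r, dr = qnorm(z), qnorm(dz)
    if r == 0.0 or dr == 0.0:     # guard against log(0) and division by zero
        return 0.0
    return 0.5 * math.log(r) * r / dr   # DE = 0.5*log(|z|)*|z|/|z'|

# With c = 0 the set is the unit sphere; the estimate at distance 2 from the
# origin works out to ln(2), which is below the true distance of 1.
d_unit = julia_distance((2.0, 0.0, 0.0), (0.0, 0.0, 0.0, 0.0))
# Hypothetical c value, chosen only to exercise the code.
d_sample = julia_distance((3.0, 0.0, 0.0), (-0.2, 0.6, 0.2, 0.2))
```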
|||
![]() |
|
tthsqe 07 Jun 2013, 02:51
Also, how do you know that the iteration for z' is not
Code:
z'=z'*z+z*z'
|||
![]() |
|
randall 07 Jun 2013, 09:53
The 'c' parameter is arbitrary; different values generate different shapes, just like in standard 2D Julia sets.
It can be shown that the derivative of Zn+1 = Zn*Zn + c is simply Z'n+1 = 2*Z'n*Zn.
Theory:
http://www.iquilezles.org/www/articles/juliasets3d/juliasets3d.htm
http://paulbourke.net/fractals/quatjulia/
http://devmaster.net/forums/topic/3432-ray-tracing-quaternion-julia-sets-on-the-gpu/
|||
![]() |
|
tthsqe 07 Jun 2013, 18:51
Ah, sorry for my long post. Even though (z*z)' = 2*z*z' is not correct for quaternions, it seems that it has been shown that the iteration z' = 2*z*z' does give a correct lower bound on the distance to the set. My bad.
In that case, you can probably save a lot of time by not keeping track of z', but only its modulus |z'|, or even better the square of its modulus.
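A sketch of that suggestion in Python, under the same assumed iteration count and bailout as any simple escape-time loop (neither comes from qjulia). It relies on the fact that the quaternion norm is multiplicative, so |2*z'*z|^2 = 4*|z'|^2*|z|^2 and the squared modulus of z' can be carried as a single scalar. The names `qsquare` and `julia_distance_scalar` are hypothetical.

```python
import math

def qsquare(q):
    # Square of a quaternion (w, x, y, z): cheaper than a general product.
    w, x, y, z = q
    return (w * w - x * x - y * y - z * z, 2.0 * w * x, 2.0 * w * y, 2.0 * w * z)

def julia_distance_scalar(point, c, iterations=10, bailout=16.0):
    z = (0.0, point[0], point[1], point[2])
    r2 = sum(v * v for v in z)    # |z|^2
    dr2 = 1.0                     # |z'|^2, starting from z' = 1
    for _ in range(iterations):
        dr2 *= 4.0 * r2           # |2*z'*z|^2, using |z|^2 from before the update
        z = tuple(a + b for a, b in zip(qsquare(z), c))
        r2 = sum(v * v for v in z)
        if r2 > bailout * bailout:
            break
    if r2 == 0.0 or dr2 == 0.0:   # guard against log(0) and division by zero
        return 0.0
    r = math.sqrt(r2)
    return 0.5 * math.log(r) * r / math.sqrt(dr2)

# c = 0 and a point at distance 2 from the origin: the estimate is ln(2).
d_scalar = julia_distance_scalar((2.0, 0.0, 0.0), (0.0, 0.0, 0.0, 0.0))
```

This removes one full quaternion multiplication per iteration and leaves a single square root for |z'| at the end.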
|||
![]() |
|
randall 09 Jun 2013, 12:05
I have tested my program on a Haswell CPU (Core i7-4770K CPU @ 3.50GHz).
It takes 870 ms to generate a 1280x720 image. It is almost 10x faster than on my previous CPU (Core2 Duo @ 1.86GHz).
|||
![]() |
|
Melissa 09 Jun 2013, 14:29
bmaxa@maxa:~/fasm/examples/qjulia$ time ./qjulia
real 0m1.190s
user 0m4.628s
sys 0m0.000s
Wow, 50% faster than i5 Ivy Bridge
|||
![]() |
|
keantoken 02 Aug 2017, 17:07
AMD Ryzen 5 1500X 3.5GHz:
$ time ./qjulia
real 0m0.768s
user 0m5.970s
sys 0m0.004s
|||
![]() |
|
randall 02 Aug 2017, 18:11
keantoken wrote:
AMD Ryzen 5 1500X 3.5GHz:
Nice, Ryzen seems quite fast.
|||
![]() |
|
Furs 02 Aug 2017, 21:37
This is pretty amazing stuff. I've always been fascinated by quaternions since I can't really understand them (I'm not much of a math guy).
|||
![]() |
|
keantoken 03 Aug 2017, 05:29
Using a tile size of 146 (default 80) gives about 30% performance gain for my CPU:
$ time ./qjulia
real 0m0.668s
user 0m4.683s
sys 0m0.004s
|||
![]() |
|
randall 03 Aug 2017, 09:29
keantoken wrote:
Using a tile size of 146 (default 80) gives about 30% performance gain for my CPU:
Does Ryzen 5 have 4 cores or more?
|||
![]() |
|
keantoken 04 Aug 2017, 18:03
It has 4 cores, 2 threads per core. I checked the core counter in qjulia and it reported 8. Oddly enough, however, the CPU only reports 25% usage while qjulia is running.
EDIT: Never mind, it uses the whole CPU; my CPU usage monitor displayed it incorrectly.
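One way to cross-check CPU usage without a monitor is to divide the CPU time (user + sys) by the wall-clock time (real) reported by `time`. The Python sketch below does that for the tile-size-146 run posted earlier in the thread; the helper names `parse_time` and `busy_threads` are hypothetical, and the parser assumes the `XmY.YYYs` format shown in the posts.

```python
def parse_time(field):
    # Convert a bash 'time' field such as '0m4.683s' to seconds.
    minutes, seconds = field.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)

def busy_threads(real, user, sys):
    # Average number of hardware threads kept busy during the run.
    return (parse_time(user) + parse_time(sys)) / parse_time(real)

# Figures from the tile-size-146 run on the Ryzen 5 1500X.
ratio = busy_threads("0m0.668s", "0m4.683s", "0m0.004s")
print(f"about {ratio:.1f} threads busy on average")
```

A ratio of roughly 7 on an 8-thread CPU is consistent with the program using the whole CPU rather than 25% of it.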
|||
![]() |
|
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.