flat assembler
Message board for the users of flat assembler.

How to access 3D attribute data in a nice way?

Kuemmel



Joined: 30 Jan 2006
Posts: 198
Location: Stuttgart, Germany
Kuemmel
I'm thinking of writing some software 3D stuff with FASM, so I would need to access, for example, some coordinate attribute data in a form like p = DWORD[x,y,z] or p = BYTE[x,y,z]. In other words, something like:

MOV eax, DWORD / BYTE p[x,y,z]
and
MOV DWORD / BYTE p[x,y,z], eax

For some space like x: 0...100, y: 0...100, z: 0...100 or similar.

How can such a thing be set up properly, and how should the data be laid out most efficiently (alignment, etc.)? Are there some macros for that?
Post 06 Jun 2006, 16:22
Quantum



Joined: 24 Jun 2005
Posts: 122
Quantum
Quote:

p = DWORD[x,y,z]


That would be:
p = [x + y*100 + z*100*100]

Computing those scales (x100 and x10000) would require a multiply instruction, which is comparatively slow. I suggest padding each dimension of the array to a power of 2 (e.g. 128).

x ∈ [0,127], y ∈ [0,127], z ∈ [0,127]

So the above expression becomes:
p = [x + (y<<7) + (z<<14)]

Say x is eax, y is edx and z is ecx:
shl ecx,14
shl edx,7
add eax,edx
add eax,ecx
mov eax,[eax] ; here we get that p DWORD.

And you can make a macro from this if you like.
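
A minimal sketch of how that could look as a FASM macro. The names grid_index, GRID_SHIFT and grid are made up for illustration, and the snippet assumes DWORD cells in a 128*128*128 array, so the element index is scaled by 4 and added to the array's base address in the final mov:

```asm
GRID_SHIFT = 7                          ; assumed: 128 = 1 shl 7 cells per axis

grid    rd 128*128*128                  ; assumed label; belongs in a data section

; Hypothetical macro: leaves x + (y shl 7) + (z shl 14) in the x register.
; The y and z registers are clobbered.
macro grid_index x, y, z
{
        shl     z, 2*GRID_SHIFT
        shl     y, GRID_SHIFT
        add     x, y
        add     x, z
}

        ; usage, with x in eax, y in edx, z in ecx:
        grid_index eax, edx, ecx
        mov     eax, [grid + eax*4]     ; DWORD cells: scale the index by 4, add the base
```

With byte cells the array would be reserved with rb instead of rd, and the last line would drop the *4 scale.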
Post 06 Jun 2006, 18:30
Tomasz Grysztar
Assembly Artist


Joined: 16 Jun 2003
Posts: 7715
Location: Kraków, Poland
Tomasz Grysztar
That's very efficient for speed but quite inefficient in memory use if we are only going to access a few actually important points in such a 3D space. So to come up with a good solution we first need to know:
1) How many "attributed" points we are going to use, and with what ranges of coordinates;
2) What is more important for us: speed or memory usage (usually you can gain efficiency in one of those areas at the cost of the other).
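
To illustrate the memory-saving end of that trade-off, here is a rough sketch (not something proposed in this thread) that stores only the attributed points as (key, value) pairs and finds one by linear search. The labels points, point_count and find_point, the sample values, and the key packing x + (y shl 7) + (z shl 14) are all assumptions for the example:

```asm
; Data (belongs in a data section). Each record: dd key, value
; with key = x + (y shl 7) + (z shl 14)
points      dd 5 + (9 shl 7) + (2 shl 14), 1234     ; point (5,9,2)
            dd 0 + (0 shl 7) + (7 shl 14), 42       ; point (0,0,7)
point_count dd 2

; in:  eax = packed key
; out: CF clear and eax = value if found, CF set if the point has no attribute
find_point:
        mov     edx, points
        mov     ecx, [point_count]
.next:
        test    ecx, ecx
        jz      .missing
        cmp     [edx], eax
        je      .found
        add     edx, 8                  ; next record (2 dwords)
        dec     ecx
        jmp     .next
.found:
        mov     eax, [edx+4]
        clc
        ret
.missing:
        stc
        ret
```

This costs 8 bytes per stored point instead of 4 bytes for every cell of a full 128*128*128 DWORD array, but every lookup is a scan, so it only pays off when few points are attributed.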
Post 06 Jun 2006, 18:53
Kuemmel



Joined: 30 Jan 2006
Posts: 198
Location: Stuttgart, Germany
Kuemmel
Quantum wrote:

Say x is eax, y is edx and z is ecx:
shl ecx,14
shl edx,7
add eax,edx
add eax,ecx
mov eax,[eax] ; here we get that p DWORD.


Thanks Quantum, the hint about power-of-2 sizes is nice. In your example, shouldn't I also do a shl eax,2 before the mov, as p would be a DWORD and be aligned to a DWORD address?

If I want to save some memory and my p(x,y,z) values are just one byte each, then in your example I would need
shl ecx,14
shl edx,7
add eax,edx
add eax,ecx
movsx eax,byte[eax]

Is that correct? I'm still relatively new to x86 asm, so if I learned it right I can't access bytes with 'mov', I have to use 'movsx'!?
Post 07 Jun 2006, 17:54
Quantum



Joined: 24 Jun 2005
Posts: 122
Quantum
Quote:

In your example, shouldn't I also do a shl eax,2 before the mov, as p would be a DWORD and be aligned to a DWORD address?

Yes, I missed that detail.

Quote:

Is that correct ?

Yes. The zx/sx suffixes perform zero (unsigned) / sign (signed) extension.

Quote:

so if I learned it right I can't access bytes with 'mov'

Reading a byte at the address given in eax and storing it in al:
mov al,[eax]
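
Putting the two answers together, a small sketch of byte access on the same 128*128*128 layout. grid8 is an assumed label for a byte array, and eax is assumed to already hold the linear index x + (y shl 7) + (z shl 14). No shl eax,2 is needed here because each cell is one byte:

```asm
grid8   rb 128*128*128                  ; assumed label; one byte per cell (data section)

        movzx   edx, byte [grid8 + eax] ; load as unsigned 0..255, upper bits zeroed
        movsx   ecx, byte [grid8 + eax] ; load as signed -128..127, sign copied upward
        mov     bl, [grid8 + eax]       ; plain mov works too; the rest of ebx is unchanged
        mov     [grid8 + eax], dl       ; store the low byte of edx back
```

For pixel intensities that are going to be summed, movzx is probably the one to use, since movsx would turn values above 127 into negative numbers.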
Post 07 Jun 2006, 19:39
Madis731



Joined: 25 Sep 2003
Posts: 2141
Location: Estonia
Madis731
But why don't you interleave X,Y,Z,X,Y,... since you are going to need the matching XYZ together anyway!?

Maybe you could describe the intended application more precisely? I know that the 3DS file format, for example, stores them interleaved. If you are using DWORDs you can add a dummy element there, and this would also be efficient when using BYTEs. Example:
Code:
align 4
db x1,y1,z1,dummy1
db x2,y2,z2,...
; Here you have the X-coordinate always aligned
    

...another example:
Code:
align 16
dd x1,y1,z1,dummy1
dd x2,y2,z2,...
; This is aligned by 16 so you can access it with SSE (128-bit)
    


My theory is that keeping separate x, y and z buffers does not perform well, because however you access them, the three values belonging to one point are ALWAYS far apart in memory and the CPU cache cannot be used effectively.
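
As a sketch of how the second layout might be used: with 16-byte records, point number i starts at offset i*16, and one aligned SSE load fetches x, y, z and the dummy together. The label points and the sample values are placeholders, and an SSE-capable CPU is assumed:

```asm
align 16
points  dd 1.0, 2.0, 3.0, 0.0           ; x1, y1, z1, dummy
        dd 4.0, 5.0, 6.0, 0.0           ; x2, y2, z2, dummy

        ; eax = point index
        shl     eax, 4                  ; 16 bytes per record
        movaps  xmm0, [points + eax]    ; xmm0 = x, y, z, dummy in one load
        mov     edx, [points + eax + 4] ; or pick out just y with an ordinary mov
```

movaps requires a 16-byte aligned address, which is why both the align 16 and the dummy element matter here.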
Post 08 Jun 2006, 09:16
Kuemmel



Joined: 30 Jan 2006
Posts: 198
Location: Stuttgart, Germany
Kuemmel
Madis731 wrote:
But why don't you interleave X,Y,Z,X,Y,... since you are going to need the matching XYZ together anyway!?

Maybe you could describe the intended application more precisely? I know that the 3DS file format, for example, stores them interleaved. If you are using DWORDs you can add a dummy element there, and this would also be efficient when using BYTEs.

Yeah, of course your solution is the way to do it if I had to access 3D coordinates. I was thinking of another application though.

I remembered an algorithm I once implemented to create a '2D fire'. In the end it's a matrix of x*y bytes: for each pixel you sum the 8 surrounding pixels plus the pixel itself, calculate the average, and place the result one pixel higher. That is done for the whole matrix once per frame.

So I want to port this to 3D, as a real 'volumetric' fire, using an x * y * z matrix with each coordinate represented by one byte. Then I have to sum 3*9 pixels, calculate the average and place the result.

As the pixels sit at fixed distances from each other, like a perfect grid, I don't need to store their coordinates; they are effectively constants.
I'll post some pseudo-algorithm when I've finished it.
Post 08 Jun 2006, 16:36
Madis731



Joined: 25 Sep 2003
Posts: 2141
Location: Estonia
Madis731
OK, I understand better now. You should really use the x,x,x...,y,y,y...,z,z,z... layout and make three passes. First go through the array linearly and sum every 3 consecutive pixels. Then go through the array once again, but now along the Y coordinate: the neighbours are X elements apart, so you step by X each time, again summing 3 consecutive pixels. Finally, you do the same along Z by stepping X*Y elements.
Example:
Code:
;array
+--+--+--+
|9 |4 |3 |
+--+--+--+
|2 |5 |6 |
+--+--+--+
|7 |1 |8 |
+--+--+--+

First pass:
+--+--+--+
|- |16|- |
+--+--+--+
|- |13|- |
+--+--+--+
|- |16|- |
+--+--+--+

Second pass:
+--+--+--+
|- |- |- |
+--+--+--+
|- |45|- |
+--+--+--+
|- |- |- |
+--+--+--+
    


The third pass would take the third dimension into account. I hope it won't be hard, because it's very easy in my head right now.
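
A rough sketch of one such pass, written so the same routine can serve all three directions by changing only the stride. It assumes 16-bit cells, so that a sum of 27 bytes (at most 27*255 = 6885) fits, separate source and destination buffers, and a caller that skips the border cells so every neighbour exists. The name sum3_pass and the register assignment are made up for the example:

```asm
; in:  esi -> first source cell to process (16-bit cells)
;      edi -> matching destination cell
;      ebx =  distance in bytes to the neighbour along the current axis
;      ecx =  number of consecutive cells to process (must be > 0)
; For each cell: dst = src[previous] + src[current] + src[next]
sum3_pass:
        mov     edx, ebx
        neg     edx                     ; edx = offset of the previous neighbour
.cell:
        mov     ax, [esi + edx]
        add     ax, [esi]
        add     ax, [esi + ebx]
        mov     [edi], ax
        add     esi, 2
        add     edi, 2
        dec     ecx
        jnz     .cell
        ret
```

For a 128*128*128 grid of words the byte strides would be 2 for X, 2*128 for Y and 2*128*128 for Z, with source and destination swapped between passes. Dividing the final sum by 27 gives the average.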
Post 09 Jun 2006, 08:13
Kuemmel



Joined: 30 Jan 2006
Posts: 198
Location: Stuttgart, Germany
Kuemmel
Finally I came up with something that works... it's by no means finished, I'm just playing around with it, and the code needs a lot of optimizing and cleaning...

EDIT: The file is attached...

Anyway, I followed your hints, Madis, regarding the basic algorithm.

So basically it's now a 128*128*128 3D space that is 'blurred' to achieve a nice graphical effect and mapped to the screen in a simple way... I just found out that all the memory access costs quite a lot of speed, and a 256*256*256 space would be too much.

Using it as a fire algorithm didn't look nice... it's hard to map a 3D fire, I think...

Overall I want to turn it into a kind of nice-looking memory benchmark, and I can already see a clear effect on my 2 systems:

- Sempron 1.8 GHz: 3.2 frames/sec
- Old Athlon 1.0 GHz: 0.8 frames/sec

So it no longer scales with pure CPU power; it has something to do with memory speed, I guess.

My next steps would be to

1) Enhance the visual appearance, e.g. add some moving spiral dots, change colours, etc...

2) Make use of multi-core CPUs, e.g. by splitting the grid into 8 sections (8 * 64*64*64) for the calculations

3) Optimize, correct and clean all code...

I'm open to any comments and results on other systems from you guys out there...

Oh, and a question... how can I get rid of the hourglass in full screen!?

Also, how can I find out the theoretical maximum memory bandwidth of a system? Is there a decent hardware info tool for this?


Attachment: KFB_test.zip (11.14 KB)

Last edited by Kuemmel on 30 Oct 2006, 06:56; edited 2 times in total
Post 29 Oct 2006, 23:44
vid
Verbosity in development


Joined: 05 Sep 2003
Posts: 7105
Location: Slovakia
vid
your link doesn't work for me
Post 30 Oct 2006, 01:02
Kuemmel



Joined: 30 Jan 2006
Posts: 198
Location: Stuttgart, Germany
Kuemmel
I've attached it to my post now... strange, the link was working from here...
Post 30 Oct 2006, 06:57
MichaelH



Joined: 03 May 2005
Posts: 402
MichaelH
Quote:

Oh, and a question...how can I get rid of the hourglass during full screen !?


cominvk DDraw, SetCooperativeLevel, [mainhwnd], DDSCL_FULLSCREEN or DDSCL_NORMAL

Platform SDK says DDSCL_NORMAL - Application will function as a regular Windows application.
Post 30 Oct 2006, 07:51
madmatt



Joined: 07 Oct 2003
Posts: 1045
Location: Michigan, USA
madmatt
Kuemmel:
You can use "invoke ShowCursor, FALSE" to hide the mouse cursor, and "invoke ShowCursor, TRUE" to get it back.
Post 30 Oct 2006, 09:45
Madis731



Joined: 25 Sep 2003
Posts: 2141
Location: Estonia
Madis731
Crashes on Server 2003 Enterprise x86-64. It's a laptop with 1400x1050 resolution, and the program doesn't go fullscreen but hides in the corner...
Runs successfully on 2000 SP4, though.

On second thought... might the problem be that it finishes too fast!?
Post 31 Oct 2006, 11:06
Kuemmel



Joined: 30 Jan 2006
Posts: 198
Location: Stuttgart, Germany
Kuemmel
Madis731 wrote:
Crashes on Server 2003 Enterprise x86-64. It's a laptop with 1400x1050 resolution, and the program doesn't go fullscreen but hides in the corner...
Runs successfully on 2000 SP4, though.

On second thought... might the problem be that it finishes too fast!?


Hm, basically I use the same screen code as in my Fractal Benchmark, only the resolution is now 640x480 instead of 800x600... could that be a problem?

Hm, finishing too fast, could be. What's your frame rate result?

I limit it because I only have slow machines at hand... of course I'll change that later.
Post 31 Oct 2006, 17:43
Madis731



Joined: 25 Sep 2003
Posts: 2141
Location: Estonia
Madis731
1) Sorry, not Enterprise, but Standard. I don't know where I got the idea that it was Enterprise :S

2) Sorry, I had weird drivers (32?); now I've installed the correct drivers and restarted.

Now the only thing is that I can't get it to compile. Btw, the app runs at 9.9 FPS or close to that. I included ddraw.inc and so on, but it doesn't like the LockSurface thingy :( Any help?
Post 08 Nov 2006, 10:24
Kuemmel



Joined: 30 Jan 2006
Posts: 198
Location: Stuttgart, Germany
Kuemmel
Hi Madis731,

The compile problem could be the same one some people had with my Mandelbrot benchmark, since I still use the same FASM setup. I provide a link on my page to the include files I used:

http://www.mikusite.de/x86/KMB_INCLUDE.zip

...maybe that helps...
Post 08 Nov 2006, 17:39
Madis731



Joined: 25 Sep 2003
Posts: 2141
Location: Estonia
Madis731
Yippee :D I got it compiling, but there's one calculation bug I couldn't find. The box is drawn correctly only at 1024x768; the other resolutions show ghost boxes at the same height ±1 pixel.

This runs at 8.333 FPS at 1024x768 and 9.276 FPS at 640x480. Strangely, it uses only the "left core" of my T7200 :S
Post 09 Nov 2006, 13:52
Kuemmel



Joined: 30 Jan 2006
Posts: 198
Location: Stuttgart, Germany
Kuemmel
Madis731 wrote:
Yippee :D I got it compiling, but there's one calculation bug I couldn't find. The box is drawn correctly only at 1024x768; the other resolutions show ghost boxes at the same height ±1 pixel.

This runs at 8.333 FPS at 1024x768 and 9.276 FPS at 640x480. Strangely, it uses only the "left core" of my T7200 :S

That's my fault. At the moment the whole code is locked to one core. Anyway, the way the code is currently set up, a second CPU wouldn't help much. So don't worry about this, it's forced at the moment.

Later on I want to split the calculation of the 128*128*128 blur field into, let's say, at least 8 cubes of 64*64*64 and lock each of them to a core... the other stuff isn't that time critical, I think, so that should help multi-core machines.
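
For the planned split, the starting cell of each 64*64*64 cube can be derived from the cube number with the same kind of shifts used for indexing. A small sketch, assuming the x + (y shl 7) + (z shl 14) layout and cubes numbered 0..7, with bit 0 selecting the x half, bit 1 the y half and bit 2 the z half (cube_origin is a made-up name):

```asm
; in:  eax = cube number 0..7
; out: eax = linear index of the cube's first cell; ecx and edx are clobbered
cube_origin:
        mov     edx, eax
        mov     ecx, eax
        and     eax, 1                  ; bit 0 -> x half
        shl     eax, 6                  ; x offset: 0 or 64
        and     edx, 2                  ; bit 1 -> y half (value 0 or 2)
        shl     edx, 12                 ; y offset: 0 or 64*128   = 1 shl 13
        and     ecx, 4                  ; bit 2 -> z half (value 0 or 4)
        shl     ecx, 18                 ; z offset: 0 or 64*16384 = 1 shl 20
        add     eax, edx
        add     eax, ecx
        ret
```

One thing to keep in mind: a blur also reads the neighbouring cells just across each cube boundary, so the cubes are not fully independent. Reading from one buffer and writing to another would avoid one thread seeing another thread's half-updated cells.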

Long way to go still... not so happy with the visual look... we'll see... Christmas holidays are coming soon ;)
Post 09 Nov 2006, 22:09
Madis731



Joined: 25 Sep 2003
Posts: 2141
Location: Estonia
Madis731
Hehee - make the fireworks more colourful and fly some Santas and reindeer around :P
Post 10 Nov 2006, 07:56
Copyright © 1999-2020, Tomasz Grysztar.
