flat assembler
Message board for the users of flat assembler.

working CUDA example
tthsqe



Joined: 20 May 2009
Posts: 767
tthsqe 19 Aug 2011, 23:34
I finally tried CUDA in fasm and it does work.
This simple test program shows how accurate the approximate log2 function on the GPU is compared with the CPU.

Note: The program posted 8 posts down seems to work better.


Attachment: CUDA.zip (7.95 KB, downloaded 1383 times)



Last edited by tthsqe on 02 Dec 2013, 08:07; edited 2 times in total
tthsqe



Joined: 20 May 2009
Posts: 767
tthsqe 20 Aug 2011, 12:53
If you have a CUDA-enabled GPU and this doesn't display a table of values, please let me know what error you are experiencing.
sinsi



Joined: 10 Aug 2007
Posts: 794
Location: Adelaide
sinsi 21 Aug 2011, 10:32
error code dec:999

Does a GTX580 support CUDA? I assume so.

edit: fails at cleanup 'cuMemFree'
ctl3d32



Joined: 30 Dec 2009
Posts: 206
Location: Brazil
ctl3d32 21 Aug 2011, 13:35
Works fine for me: GT540M
tthsqe



Joined: 20 May 2009
Posts: 767
tthsqe 21 Aug 2011, 16:22
sinsi,
It seems strange that everything would work except the cleanup.
Error code 999 is CUDA_UNKNOWN_ERROR as in cuda.inc.
Try taking out jnz Error and see if the table was computed correctly.
Also, I hate to ask, but which driver version?
The GTX580 does support CUDA - this is what I tested it on.
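
A note on the number: in NVIDIA's own driver API header (cuda.h) the value 999 is spelled CUDA_ERROR_UNKNOWN, and it is a catch-all rather than a specific diagnosis. Below is a minimal C sketch of decoding a few result codes. The helper names (cu_result_name, is_unknown_error, check_table) are hypothetical, the values are the ones from the CUresult enum, and the table is deliberately incomplete.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: maps a few CUDA driver API result codes to their
   names as spelled in NVIDIA's cuda.h. Returns NULL for any code that is
   not in this short table. */
static const char *cu_result_name(int code)
{
    switch (code) {
    case 0:   return "CUDA_SUCCESS";
    case 1:   return "CUDA_ERROR_INVALID_VALUE";
    case 2:   return "CUDA_ERROR_OUT_OF_MEMORY";
    case 3:   return "CUDA_ERROR_NOT_INITIALIZED";
    case 100: return "CUDA_ERROR_NO_DEVICE";
    case 101: return "CUDA_ERROR_INVALID_DEVICE";
    case 999: return "CUDA_ERROR_UNKNOWN";
    default:  return NULL;
    }
}

/* Returns 1 when `code` is the catch-all unknown error, 0 otherwise. */
static int is_unknown_error(int code)
{
    const char *name = cu_result_name(code);
    return name != NULL && strcmp(name, "CUDA_ERROR_UNKNOWN") == 0;
}

/* Sanity check against the code reported in this thread. */
static void check_table(void)
{
    assert(is_unknown_error(999));
    assert(!is_unknown_error(0));
}
```

Because 999 says nothing about the cause, the useful information is which call returned it, which is why checking after each driver call (rather than only at cleanup) narrows things down.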
sinsi



Joined: 10 Aug 2007
Posts: 794
Location: Adelaide
sinsi 21 Aug 2011, 23:36
Driver is 280.26, first column is 1 to 20, second is all 0, 3rd is 0 to 4.321
OS is win7pro x64, CPU is AMD Phenom II X6 1100T.
Kuemmel



Joined: 30 Jan 2006
Posts: 200
Location: Stuttgart, Germany
Kuemmel 22 Aug 2011, 16:14
Works fine here on my GTX260. Great stuff! As far as I understand, this is NVIDIA's CUDA syntax, so I guess AMD uses a different one.

Wasn't there something like a common shader language, like GLSL? Couldn't that be used, for the same effort, to avoid the need to write code for both companies?
f0dder



Joined: 19 Feb 2004
Posts: 3175
Location: Denmark
f0dder 22 Aug 2011, 16:38
Kuemmel wrote:
Wasn't there something like a common shader language, like GLSL? Couldn't that be used, for the same effort, to avoid the need to write code for both companies?
DirectCompute :)

tthsqe



Joined: 20 May 2009
Posts: 767
tthsqe 23 Aug 2011, 00:52
@sinsi
The only relevant difference between our systems is the driver version. I have 275.33, and have had problems with the newer ones in the past. You can see exactly how it is failing with this:


Attachment: CUDA.zip (9.91 KB, downloaded 1129 times)

sinsi



Joined: 10 Aug 2007
Posts: 794
Location: Adelaide
sinsi 23 Aug 2011, 01:10
OK, the last one works fine.
tthsqe



Joined: 20 May 2009
Posts: 767
tthsqe 23 Aug 2011, 01:42
What?
OK. This is further proof that cuParamSeti sucks and cuParamSetv is the way to go - that is the only thing I changed.
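
The thread does not say why the switch helped, so what follows is only a guess. In the driver API, cuParamSeti stores a single 32-bit integer, while a device pointer in a 64-bit process is 8 bytes wide, so a pointer passed through it could lose its upper half. cuParamSetv copies raw bytes to an offset you choose. Here is a minimal C sketch of laying out such a parameter block, with no CUDA calls at all. ParamBlock, param_push, align_up and the 256-byte cap are hypothetical names and values chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PARAM_BLOCK_MAX 256   /* arbitrary cap for this sketch */

/* Staging buffer for kernel arguments. Zero-initialize it before use so
   that padding bytes between arguments are well defined. */
typedef struct {
    unsigned char bytes[PARAM_BLOCK_MAX];
    size_t size;              /* number of bytes used so far */
} ParamBlock;

/* Round `offset` up to the next multiple of `align` (a power of two). */
static size_t align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Append one argument at its aligned offset and return that offset,
   i.e. the offset one would pass along with the bytes to cuParamSetv. */
static size_t param_push(ParamBlock *pb, const void *value,
                         size_t nbytes, size_t align)
{
    size_t offset = align_up(pb->size, align);
    assert(offset + nbytes <= PARAM_BLOCK_MAX);
    memcpy(pb->bytes + offset, value, nbytes);
    pb->size = offset + nbytes;
    return offset;
}
```

For a kernel taking a 4-byte count followed by a 64-bit device pointer, pushing the count and then the pointer (8 bytes, alignment 8) places the pointer at offset 8 and leaves a total size of 16, which is the value that would go to cuParamSetSize.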
gunblade



Joined: 19 Feb 2004
Posts: 209
gunblade 11 Jan 2012, 03:52
Sorry for reviving a (slightly) old thread, but I got an Nvidia GTS 450 not long ago and was keen to test out CUDA. However, I use Linux as my primary (and only) system, so I decided to port your code to 64-bit Linux to see if it would work (I hope you don't mind; if you do, let me know and I'll take it down :))

I've attached the code. cuda.inc hasn't changed, api_cuda.inc has been changed to use ELF64 "extrn" syntax rather than the import mechanism used on Windows, and the main cudatest.asm has been completely reworked to run on 64-bit Linux.
api_cuda.inc actually changed more than that: I dumped a list of the function names from the CUDA library and generated the include by extrn'ing all the available functions. There are fewer functions in the Linux CUDA library than in the Windows one (i.e. no Direct3D stuff, obviously).

I uploaded the archive as a .tar.bz2; it should extract fine with tar -xvf cuda.tar.bz2, or with a GUI archive program (if you insist on using the GUI :P). Inside are the code and a makefile. Typing make should build it. The second stage may vary depending on the system: I'm pretty sure the location and name of the dynamic linker is quite standard, but if yours differs from mine (/lib/ld-linux-x86-64.so.2), you can change it in the makefile.

Thanks a lot tthsqe for the original code, you've saved me from using C/C++ and Nvidia's big SDK/compiler. Now I can have fun writing some CUDA code in assembly (under Linux, of course) :D Speaking of which, where did you get the syntax for the assembler used in your PTX function, the one that is actually assembled/run on the GPU? I looked up some of the CUDA documentation on the Nvidia site, but it was mainly reference material on the functions in the CUDA library, rather than the CUDA language itself. :P


Attachment: cuda.tar.bz2 (7.22 KB, downloaded 967 times)
Description: 64-bit Linux version of tthsqe's cudatest application.

Tyler



Joined: 19 Nov 2009
Posts: 1216
Location: NC, USA
Tyler 11 Jan 2012, 05:49
Kuemmel wrote:
Wasn't there something like a common shader language, like GLSL? Couldn't that be used, for the same effort, to avoid the need to write code for both companies?
OpenCL does that.
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4624
Location: Argentina
LocoDelAssembly 11 Jan 2012, 15:27
Quote:

Speaking of which, where did you get the syntax for the assembler used in your PTX function, the one that is actually assembled/run on the GPU?
Maybe from "PTX: Parallel Thread Execution ISA"? But if I remember right, this is not what actually runs on the GPU; it is some sort of "Java assembly", i.e. an intermediate code (but still, this may be faster than using HLL code; at least in the DirectX world, drivers receive assembled code, not HLL code for them to compile).
gunblade



Joined: 19 Feb 2004
Posts: 209
gunblade 11 Jan 2012, 15:40
Ah, thanks for the link, Loco. I know the "assembly" code is probably not the code that's actually executed on the card, but well, it's probably as close as we'll get :D
ohara



Joined: 13 Oct 2006
Posts: 20
ohara 11 Apr 2012, 18:56
Fantastic!
I have this working on a GeForce 9400 GT.
One thing I wondered: as I increase the memory arrays past 10 MB, the extra size shows up in the .exe file. Putting the data last does not seem to work as it does in 32-bit fasm. Does anyone know how I can make the .exe file much smaller?
ohara



Joined: 13 Oct 2006
Posts: 20
ohara 12 Apr 2012, 11:03
Ah, found the answer: you must put assigned (initialized) data before unassigned (reserved) data, so that the unassigned data comes last. 5K exe file now.
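
For comparison, the same effect shows up in C, where the toolchain does the ordering for you. This is an analogy rather than the fasm layout itself. The sketch assumes a typical ELF or PE toolchain that puts statics without an initializer into a section that takes no space in the file (such as .bss). The names and the 10 MB size are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

#define BIG_SIZE (10u * 1024u * 1024u)   /* 10 MB, as in the post above */

/* Initialized data: these bytes have to be stored in the executable. */
static const float seed_values[4] = { 1.0f, 2.0f, 3.0f, 4.0f };

/* No initializer: on typical toolchains only the size is recorded and the
   loader supplies zeroed memory at run time, so the file stays small. */
static float results[BIG_SIZE / sizeof(float)];

/* Fill the start of the big buffer from the small initialized table and
   return how many elements were written. */
static size_t fill_results(void)
{
    size_t count = sizeof seed_values / sizeof seed_values[0];
    assert(count <= sizeof results / sizeof results[0]);
    for (size_t i = 0; i < count; i++)
        results[i] = seed_values[i] * 2.0f;
    return count;
}
```

Giving the large array any nonzero initializer would usually move it into the initialized data and put its full size back into the file, which matches the behaviour described above.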

Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
