flat assembler
Message board for the users of flat assembler.
working CUDA example
tthsqe 19 Aug 2011, 23:34
I finally tried CUDA in fasm, and it does work.
This simple test program shows how accurate the approximate log2 function is on the GPU vs. the CPU. Note: The program posted 8 posts down seems to work better.
Last edited by tthsqe on 02 Dec 2013, 08:07; edited 2 times in total
tthsqe 20 Aug 2011, 12:53
If you have a cuda-enabled gpu and this doesn't display a table of values, please let me know what error you are experiencing.
sinsi 21 Aug 2011, 10:32
error code dec:999
Does a GTX580 support CUDA? I assume so.
Edit: it fails at cleanup, in 'cuMemFree'.
ctl3d32 21 Aug 2011, 13:35
Works fine for me: GT540M
sinsi 21 Aug 2011, 23:36
Driver is 280.26. The first column is 1 to 20, the second is all 0, and the third is 0 to 4.321.
OS is Win7 Pro x64, CPU is AMD Phenom II X6 1100T.
Kuemmel 22 Aug 2011, 16:14
Works fine here on my GTX260. Great stuff! As far as I understand, this is NVIDIA's CUDA syntax, so I guess AMD uses a different syntax.
Wasn't there something like a common shader language, such as GLSL? Couldn't that be used for the same purpose, to avoid the need to write code for both companies?
f0dder 22 Aug 2011, 16:38
Kuemmel wrote: Wasn't there something like a common shader language, like GLSL. Couldn't that be used for the same effort to avoid the need of writing code for both companies ?
tthsqe 23 Aug 2011, 00:52
@sinsi
The only relevant difference between our systems is the driver version. I have 275.33, and I have had problems with the newer ones in the past. You can see exactly how it is failing with this
sinsi 23 Aug 2011, 01:10
OK, the last one works fine.
tthsqe 23 Aug 2011, 01:42
What?
OK. This is further proof that cuParamSeti sucks and cuParamSetv is the way to go. That is the only thing I changed.
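One possible reading of this result, offered as an assumption rather than something confirmed in the thread: in a 64-bit process a CUDA device pointer is 64 bits wide, while `cuParamSeti` takes a 32-bit `unsigned int`, so a pointer with non-zero upper bits would be cut off. `cuParamSetv` instead copies a caller-specified number of bytes. The sketch below models only that difference in plain Python and makes no CUDA calls; `set_param_i`, `set_param_v`, `read_u64` and the address are hypothetical names and values chosen for illustration.

```python
import struct

def set_param_i(buf: bytearray, offset: int, value: int) -> None:
    """Hypothetical model of a 32-bit integer setter: only the low 32 bits are stored."""
    struct.pack_into("<I", buf, offset, value & 0xFFFFFFFF)

def set_param_v(buf: bytearray, offset: int, data: bytes) -> None:
    """Hypothetical model of a raw-bytes setter: copies exactly len(data) bytes."""
    buf[offset:offset + len(data)] = data

def read_u64(buf: bytearray, offset: int) -> int:
    """Read back the 8 bytes a kernel expecting a 64-bit pointer would see."""
    return struct.unpack_from("<Q", buf, offset)[0]

# Made-up 64-bit device address whose upper 32 bits are non-zero.
DEVICE_PTR = 0x0000000200A00000

via_i = bytearray(8)                      # 8-byte parameter slot, zero-filled
set_param_i(via_i, 0, DEVICE_PTR)         # upper half is lost

via_v = bytearray(8)
set_param_v(via_v, 0, struct.pack("<Q", DEVICE_PTR))  # all 8 bytes copied

print(hex(read_u64(via_i, 0)))  # 0xa00000
print(hex(read_u64(via_v, 0)))  # 0x200a00000
```

If the address happens to fit in 32 bits, both setters store the same value, which would be consistent with the same program working on some machines and failing on others.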
gunblade 11 Jan 2012, 03:52
Sorry for reviving a (slightly) old thread, but I got an Nvidia GTS 450 not long ago and was keen to test out CUDA. However, I use Linux as my primary (and only) system, so I decided to port your code to 64-bit Linux to see if it would work. (I hope you don't mind; if you do, let me know and I'll take it down.)
I've attached the code. The cuda.inc hasn't changed; the api_cuda.inc has been changed to match ELF64 "extrn" syntax rather than the import mechanism that the Windows API uses; and the main cudatest.asm has been totally updated to work on 64-bit Linux. The api_cuda.inc actually changed more than that: I dumped a list of the function names from the CUDA library and generated the inc by extrn'ing all the available functions. There are fewer functions in the Linux CUDA library than in the Windows one (i.e. no Direct3D stuff, obviously).
I uploaded the archive as a .tar.bz2. It should extract fine with tar -xvf cuda.tar.bz2, or with a GUI archive program (if you insist on using the GUI). Inside are the code, etc., and a makefile. Typing make should build it fine. The second stage may vary depending on the system, although I'm pretty sure the location and name of the dynamic linker are quite standard; if yours is different from mine (/lib/ld-linux-x86-64.so.2), you can change it in the makefile.
Thanks a lot, tthsqe, for the original code. You've saved me from using C/C++ and Nvidia's big SDK/compiler, so now I can have fun writing some CUDA code in assembly (under Linux, of course). Speaking of which, where did you get the syntax for the assembler used in your PTX function, the one that is actually assembled and run on the GPU? I looked up some of the CUDA documentation on the Nvidia site, but it was mainly reference material on the functions in the CUDA library rather than on the CUDA language itself.
Tyler 11 Jan 2012, 05:49
Kuemmel wrote: Wasn't there something like a common shader language, like GLSL. Couldn't that be used for the same effort to avoid the need of writing code for both companies ?
LocoDelAssembly 11 Jan 2012, 15:27
Quote:
gunblade 11 Jan 2012, 15:40
Ah, thanks for the link, Loco. I know the "assembly" code is probably not the code that's actually executed on the card, but well, it's probably as close as we'll get.
ohara 11 Apr 2012, 18:56
Fantastic!
I have this working on a GeForce 9400 GT.
One thing I wondered: as I increase the memory arrays past 10 MB, this shows up in the .exe file size. Putting the data last does not seem to work as it does in 32-bit fasm. Does anyone know how I can make the .exe file much smaller?
ohara 12 Apr 2012, 11:03
Ah, found the answer: you must put the assigned (initialized) data before the unassigned (uninitialized) data, with the unassigned data at the very end. 5K exe file now.
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.