flat assembler
Message board for the users of flat assembler.
FASM Neural Network
ohara 13 Oct 2006, 23:00
Here's a neural network that should work on 64-bit processors.
All the best
dead_body 14 Oct 2006, 06:26
Good work.
vid 24 Oct 2006, 01:00
Wonderful.
Could you please post more info on what exactly it learns? I didn't get that from the docs.
sylwek32 24 Oct 2006, 11:01
Hey ohara, do you have a 32-bit version for me?
sylwek32 24 Oct 2006, 11:07
I have read something about it, and I read that it isn't possible with 32-bit processors, right?
Rayslava 24 Oct 2006, 16:25
sylwek32, I've seen some 32-bit neural networks and even 16-bit ones, though they were written in C++ and Pascal.
vid 24 Oct 2006, 17:12
Why are 32-bit and 16-bit so important for a NN?
ohara 24 Oct 2006, 19:47
Well, this network has 1008 neurons in the input layer, 256 in the hidden layer and 1008 in the output layer. The input layer is 16x21x3: it takes in a 16x21 bitmap picture with 3 bytes per pixel (RGB), so the input (and output) consists of bitmap RGB values. Look at the bitmap file 320x21.bmp and you will see LED pictures of 0-9 on the left, which are the network inputs, and RGB-drawn numbers on the right, which are what the network is trained to output. The program cycles through the LED pictures, placing each one in turn at the input, and gradually trains the connections to produce the correct drawn number at the output. Once trained, the network will produce exactly the correct drawn number for any of the LED inputs sent to it.
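To make the layer sizes concrete, here is a minimal Python sketch (not taken from the posted FASM source) of how a 16x21 RGB bitmap maps onto 1008 input values. The function name `bitmap_to_inputs` and the scaling of each byte to the range 0..1 are assumptions made for illustration; the post does not say how the program scales its pixel values.

```python
# Hypothetical sketch: the names and the 0..1 scaling are assumptions,
# not something confirmed by the posted source.
WIDTH, HEIGHT, CHANNELS = 16, 21, 3
N_INPUT = WIDTH * HEIGHT * CHANNELS  # 16 * 21 * 3 = 1008 input neurons


def bitmap_to_inputs(pixels):
    """Flatten HEIGHT rows of WIDTH (r, g, b) byte triples into floats."""
    if len(pixels) != HEIGHT or any(len(row) != WIDTH for row in pixels):
        raise ValueError("expected a 16x21 bitmap")
    # One input value per colour byte, in row-major order.
    return [channel / 255.0 for row in pixels for rgb in row for channel in rgb]


# A solid red test picture stands in for one of the LED digits.
red = [[(255, 0, 0)] * WIDTH for _ in range(HEIGHT)]
inputs = bitmap_to_inputs(red)
print(len(inputs))  # 1008
```

The output layer has the same shape, so the inverse mapping would turn 1008 output values back into a 16x21 RGB picture.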
This network was built for speed. As you can see, there are 2x(1008x256)=516,096 (roughly 500k) connection structures, and each needs a lot of processing. The idea was to use the Single Instruction Multiple Data (SIMD) instructions (called SSE, SSE2, SSE3) in a 64-bit processor, together with the 128-bit memory data path provided by the socket 939 (and later) motherboard and processor. The 64-bit processor provides 128-bit registers, which can be organised as 4x32-bit floating point units operating in parallel in SSE, so if the connections and neurons are grouped in sets of 4, each group of 4 can be processed from memory to the FPUs and out to memory in single instructions.
All the core processes in the network (forward and backprop) are implemented in this way, and it is much faster than a 32-bit program, which must send all maths through the single floating point unit separately. The coding for these processes is very short and straightforward, and the simple and large memory setup of FASM, together with FASM's access to DirectDraw routines, makes this a high-performance network.
A 32-bit implementation will work, but it would be a very different (and slower) program than this one, and I'm afraid I don't have one! I think the P4 does execute SIMD instructions.
All the best, Austin O'Hara
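The grouping-by-4 idea can be sketched in Python as follows. This is only an illustration of the arithmetic, not the FASM code from the attachment: the function names are hypothetical, and the inner loop over four lanes stands in for what a single packed SSE multiply and add would do on one 128-bit register.

```python
N_INPUT, N_HIDDEN, N_OUTPUT = 1008, 256, 1008
LANES = 4  # four 32-bit floats fit in one 128-bit XMM register

# Total connections across both weight layers, and the number of 4-wide groups.
connections = N_INPUT * N_HIDDEN + N_HIDDEN * N_OUTPUT
groups = connections // LANES
print(connections, groups)  # 516096 129024


def dot_scalar(weights, inputs):
    """One multiply-add at a time, as a scalar FPU would do it."""
    total = 0.0
    for w, x in zip(weights, inputs):
        total += w * x
    return total


def dot_grouped(weights, inputs):
    """Same sum, taken four elements at a time with four partial sums."""
    if len(weights) != len(inputs) or len(weights) % LANES != 0:
        raise ValueError("lengths must match and be a multiple of 4")
    acc = [0.0] * LANES  # models one packed accumulator register
    for i in range(0, len(weights), LANES):
        for lane in range(LANES):  # one packed multiply-add in real SIMD code
            acc[lane] += weights[i + lane] * inputs[i + lane]
    return sum(acc)  # final horizontal add of the four lanes


# Small integer-valued data keeps both sums exact, so they compare equal.
weights = [float(i % 7) for i in range(N_INPUT)]
inputs = [float(i % 5) for i in range(N_INPUT)]
print(dot_scalar(weights, inputs) == dot_grouped(weights, inputs))  # True
```

Because 1008 and 256 are both multiples of 4, every row of weights splits into whole groups with no leftover elements, which is what lets each step be a single packed operation.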
Madis731 25 Oct 2006, 10:23
ohara: Your program only uses the very first version of SSE, so it works on anything from the Pentium III onwards. The standard instructions you use are all 32-bit, and the PE format it creates is also 32-bit.
To make it truly 64-bit (and so incompatible with Win32 and 32-bit CPUs), the PE format needs to be changed to 64-bit, possibly with the use64 directive, and of course the code should use 64-bit GPRs like rax, r9, etc. I didn't see any SSE2 or SSE3 (or the very new SSSE3) instructions, but I haven't tested it. Until I have tested it on a PIII I can't be sure, though it seems that there is nothing to stop it from running on that PC. I hope that in the future you will convert it to TRUE 64-bit so we can see really great performance. A 64-bit CPU has twice as many GPRs, each twice as wide, and there are 8 more of the 128-bit XMM registers.
vid 25 Oct 2006, 11:51
So you have each of those 16x21x3 input neurons linked to each of the 256 hidden neurons, and each of the 256 hidden neurons linked to each of the 16x21x3 output neurons. But I still don't understand how it learns...
Frank 25 Oct 2006, 14:59
vid, the answer is in the documentation ("Explanation.doc") and in the source code comments: The network uses a variant of the backpropagation algorithm.
Has anyone noticed that this was ohara's first post on the board? Welcome, Austin, and thanks for sharing your great SIMD implementation of the algorithm in FASM!
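For readers asking how the learning works, here is a minimal Python sketch of plain textbook backpropagation on a tiny one-hidden-layer network. It is not ohara's variant (those details are in Explanation.doc and the source comments), and the sigmoid activation, squared error, learning rate, layer sizes and all names here are assumptions chosen for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(w_hidden, w_out, x):
    """Propagate input x through the hidden layer to the output layer."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_out]
    return hidden, output


def train_step(w_hidden, w_out, x, target, rate=0.5):
    """One forward pass plus one weight update; returns the squared error."""
    hidden, output = forward(w_hidden, w_out, x)
    # Output error term: (target - output) times the sigmoid derivative.
    d_out = [(t - o) * o * (1.0 - o) for t, o in zip(target, output)]
    # Hidden error term: output errors sent back through the output weights.
    d_hid = [
        h * (1.0 - h) * sum(d_out[k] * w_out[k][j] for k in range(len(w_out)))
        for j, h in enumerate(hidden)
    ]
    # Adjust both weight layers in the direction that reduces the error.
    for k, row in enumerate(w_out):
        for j in range(len(row)):
            row[j] += rate * d_out[k] * hidden[j]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] += rate * d_hid[j] * x[i]
    return sum((t - o) ** 2 for t, o in zip(target, output))


rng = random.Random(0)  # fixed seed so the run is repeatable
n_in, n_hid, n_out = 4, 3, 4  # tiny stand-in for the 1008-256-1008 network
w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hid)]
w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hid)] for _ in range(n_out)]
x, target = [1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]
errors = [train_step(w_hidden, w_out, x, target) for _ in range(200)]
print(errors[0], errors[-1])  # the error should shrink as training proceeds
```

In the posted program the same three stages (forward pass, error calculation for the output and middle layers, weight adjustment) appear to correspond to the include files named later in the thread, but that mapping is an inference from the file names only.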
Madis731 27 Oct 2006, 14:29
... I think this connection style is simply called a MESH (a meshed network, etc.)
dead_body 19 Feb 2007, 01:02
Does anyone have some examples in f/m/t/n/etc. asm?
LocoDelAssembly 19 Feb 2007, 01:24
Quote:
Yes, using an Athlon 64 you can't see anything, just one of the coloured numbers blinking while all the rest stays frozen. I had to run a background program doing "jmp $" to be able to see what's really happening. Before using this background program I thought that the program had entered an unexpected infinite loop.
hamoz 19 Feb 2007, 01:26
Hello,
Could I run this 64-bit neural network on a 32-bit processor with an emulator or something like that? Thanks a lot.
LocoDelAssembly 19 Feb 2007, 01:37
hamoz, an SSE-capable processor (Pentium III, Athlon XP) is fine; there is no need for a 64-bit processor.
BTW, is it possible that this program is bugged? Without that "jmp $" background program there is never any progress. Without the background program I do get progress if I keep PrintScreen pressed, but letting the program run without any interruption means the digits above never come to match the digits below.
hamoz 19 Feb 2007, 02:00
LocoDelAssembly, I understand it now, thanks to you.
Thanks a lot.
Madis731 19 Feb 2007, 08:30
btw, the included code files should be in the CODE section. You included them outside the code section; I would do it like this:
Code:
section '.code' code readable executable
include 'win1.inc'
include 'win2.inc'
There could be some more things, but this made it work; otherwise the program gets an access violation!
EDIT1: Actually, the most logical code section would look like this:
Code:
section '.code' code readable executable
include 'setcnxnadd64.inc'
include 'cnxnforward64.inc'
include 'calcouterror64.inc'
include 'calcmiderror64.inc'
include 'adjustweights64.inc'
include 'setforiter64.inc'
include 'setrandom64.inc'
include 'storebmp.inc'
include 'showfloats.inc'
include 'bmp.inc'
include 'win1.inc'
include 'win2.inc'
start:
leaving only ddraw.inc after the win32 include file.
EDIT2: I discovered that 12MB is too large for a small application like this and moved the data sections around a bit. I won't bother to write any tutorials, so I'll just post the final result and you can see for yourself. The 7.5KB executable could be even smaller...
dead_body 19 Feb 2007, 10:13
On my computer (P4 915 & 32-bit XP), the previous version works fine.
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.