flat assembler
Message board for the users of flat assembler.
![]() |
Author |
|
Borsuc 24 Aug 2009, 16:05
You can't make an "optimized" matrix multiplication, without "special cases" (for special matrix types, like the examples), for any general-purpose matrix size.
For example, if you want to make optimized SSE code for 4x4, you have to make it specifically for that, you can't use it for 8x8 or other arbitrary sizes. Arbitrary things are always less optimized. You can, though, make a list of jumps to different parts of code depending on the size (use the matrix size as the 'index' to this array of jumps), but you will have to code for every case you want the program to accept. |
|||
![]() |
|
pal 24 Aug 2009, 17:23
I just meant an example by the matrix multiplication (by the way). Hmm fair enough, I'd have thought there would be a way to do it. I think there is as you can do it with a single piece of code in other languages I reckon. I dunno I'll think about it and try it later in C.
Just like if I let a user input a number e.g. 89748741884. This is bigger than 32-bits. How do I, without doing loads of comparisons, find out how many bytes of storage data I need? Aligned to 4 bytes I will need 8 bytes in this case. Oh. P.S. The matrix multiplication will work as if the matrix is bigger than 2x2 you pad them out so they are a power of two square matrix. You then break the matrix down into submatricies and do it from there. The below link may explain it (I haven't read it properly yet): Code: http://www-math.mit.edu/~djk/18.310/18.310F04/Matrix_%20Multiplication.html But it is possible I know that. |
|||
![]() |
|
r22 24 Aug 2009, 17:26
M x N matrix in memory is just an M x N element array.
If you want to create a structure to hold arbitrary sizes then just save the number of columns and the number of rows first. Even before the multiply of two matrices you know how many rows and columns the resulting matrix will have. Code: MATRIX_ELM_SIZE = 8; dq double fp for each data element STRUCT Matrix cols rd 1 rows rd 1 data rd 1 ENDS ... AllocMatrixMem: push esi ;;esp+8 == addr Matrix struc xor edx,edx mov esi,[esp+4] mov ecx,[esi.cols] mov eax,[esi.rows] mul ecx mov ecx,MATRIX_ELM_SIZE xor edx,edx mul ecx push eax call malloc ;;allocate the memory for the data mov [esi.data],eax pop esi ret 4 Same with big numbers. You know the lengths of the two inputs so it's trivial to get an estimate of the size of the product of both of them. 99 X 999 = 98901 99 = 2 digits 999 = 3 digits 2 + 3 = 5, so the result can't be larger than 5 digits. So you allocate 5 digits worth of space. [Replace "DIGITS" with "BYTES" in the above and it holds true] |
|||
![]() |
|
pal 24 Aug 2009, 19:55
Hmm yeah; I guess allocating more memory will be good any how. Cheers r22.
I'll have a go at implementing Strassen's algorithm for 32-bit numbers later. I know for numbers < 32-bits you can easily take logarithms to find the number of digits (e.g. when you are working with powers 1324^3563 you just take logarithms). Thats cool though, I'll try it out in a little bit. |
|||
![]() |
|
r22 24 Aug 2009, 20:32
fyi
http://software.intel.com/en-us/forums/strassens-algorithm/topic/67860/ The first post has an attachment with c++ implementation of the algorithm. |
|||
![]() |
|
< Last Thread | Next Thread > |
Forum Rules:
|
Copyright © 1999-2023, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.