flat assembler
Message board for the users of flat assembler. Index > Main > Variable Memory Size
Author
 Thread  pal

Joined: 26 Aug 2008
Posts: 227
pal
This is kind of a question related to big numbers, but also for other applications. I'll give an example.

Say I want to make a program which does matrix multiplication. Now I want the user to be able to input a matrix of specified size. E.g. 4x4. (Something on 2^n x 2^n so I can use Strassen's Algorithm). If I want to use SSE, or even just with GPRs, to multiply these matricies, how do I go about allocating enough memory?

With a 4 x 4 matrix I can use each byte in an XMM register to store one value, and another XMM register to store another value with a third XMM register to store the final matrix. If it is a 2x2 matrix I can do it all in one XMM register. But if the value is greater than 4x4 e.g. 5x5 (and I pad with zeros to get 8x8), that is obviously 64 bytes and so I would need 4 XMM registers just for one matrix. If you get what I am saying? So how would I know how many registers I need and then be able to do it all with my code?

Another example is allowing user input for a number * a big number. E.g. they input:

18916531566518561 * 891565

I know in this case I have to have a memory array, and the best way of finding the size would be taking logarithms. This would mean having to make the big number into smaller numbers (e.g. 88 becomes 2^3 * 11), but this becomes problematic when the big number is a prime number.

If this makes sense? If not then I'll try to explain better. 24 Aug 2009, 15:26  Borsuc

Joined: 29 Dec 2005
Posts: 2465
Location: Bucharest, Romania
Borsuc
You can't make an "optimized" matrix multiplication, without "special cases" (for special matrix types, like the examples), for any general-purpose matrix size.

For example, if you want to make optimized SSE code for 4x4, you have to make it specifically for that, you can't use it for 8x8 or other arbitrary sizes.

Arbitrary things are always less optimized. You can, though, make a list of jumps to different parts of code depending on the size (use the matrix size as the 'index' to this array of jumps), but you will have to code for every case you want the program to accept. 24 Aug 2009, 16:05  pal

Joined: 26 Aug 2008
Posts: 227
pal
I just meant an example by the matrix multiplication (by the way). Hmm fair enough, I'd have thought there would be a way to do it. I think there is as you can do it with a single piece of code in other languages I reckon. I dunno I'll think about it and try it later in C.

Just like if I let a user input a number e.g. 89748741884. This is bigger than 32-bits. How do I, without doing loads of comparisons, find out how many bytes of storage data I need? Aligned to 4 bytes I will need 8 bytes in this case.

Oh. P.S. The matrix multiplication will work as if the matrix is bigger than 2x2 you pad them out so they are a power of two square matrix. You then break the matrix down into submatricies and do it from there. The below link may explain it (I haven't read it properly yet):

Code:
`http://www-math.mit.edu/~djk/18.310/18.310F04/Matrix_%20Multiplication.html    `

But it is possible I know that. 24 Aug 2009, 17:23  r22 Joined: 27 Dec 2004 Posts: 805 r22 M x N matrix in memory is just an M x N element array. If you want to create a structure to hold arbitrary sizes then just save the number of columns and the number of rows first. Even before the multiply of two matrices you know how many rows and columns the resulting matrix will have. Code:```MATRIX_ELM_SIZE = 8; dq double fp for each data element STRUCT Matrix cols rd 1 rows rd 1 data rd 1 ENDS ... AllocMatrixMem: push esi ;;esp+8 == addr Matrix struc xor edx,edx mov esi,[esp+4] mov ecx,[esi.cols] mov eax,[esi.rows] mul ecx mov ecx,MATRIX_ELM_SIZE xor edx,edx mul ecx push eax call malloc ;;allocate the memory for the data mov [esi.data],eax pop esi ret 4 ``` Same with big numbers. You know the lengths of the two inputs so it's trivial to get an estimate of the size of the product of both of them. 99 X 999 = 98901 99 = 2 digits 999 = 3 digits 2 + 3 = 5, so the result can't be larger than 5 digits. So you allocate 5 digits worth of space. [Replace "DIGITS" with "BYTES" in the above and it holds true] 24 Aug 2009, 17:26
pal

Joined: 26 Aug 2008
Posts: 227
pal
Hmm yeah; I guess allocating more memory will be good any how. Cheers r22.

I'll have a go at implementing Strassen's algorithm for 32-bit numbers later. I know for numbers < 32-bits you can easily take logarithms to find the number of digits (e.g. when you are working with powers 1324^3563 you just take logarithms). Thats cool though, I'll try it out in a little bit. 24 Aug 2009, 19:55  r22 Joined: 27 Dec 2004 Posts: 805 r22 fyi http://software.intel.com/en-us/forums/strassens-algorithm/topic/67860/ The first post has an attachment with c++ implementation of the algorithm. 24 Aug 2009, 20:32
 Display posts from previous: All Posts1 Day7 Days2 Weeks1 Month3 Months6 Months1 Year Oldest FirstNewest First  Jump to: Select a forum Official----------------AssemblyPeripheria General----------------MainTutorials and ExamplesDOSWindowsLinuxUnixMenuetOS Specific----------------MacroinstructionsOS ConstructionIDE DevelopmentProjects and IdeasNon-x86 architecturesHigh Level LanguagesProgramming Language DesignCompiler Internals Other----------------FeedbackHeapTest Area

Forum Rules:
 You cannot post new topics in this forumYou cannot reply to topics in this forumYou cannot edit your posts in this forumYou cannot delete your posts in this forumYou cannot vote in polls in this forumYou cannot attach files in this forumYou can download files in this forum