flat assembler
Message board for the users of flat assembler.
Index
> Windows > Somewhat useful function 
Author 

Tomasz Grysztar
The simplest algorithm is to split power by bits, in the same way as you do multiplication by adding values of one number shifted by the counts of bits determined by the position of ones in the second number.
For example, to multiply 5=101b by 7=111b, you do such addition: 5*1=101b (bit 0 of 7 is 1) +5*2=1010b (bit 1 of 7 is 1) +5*4=10100b (bit 2 of 7 is 1) (at each stage you shift the value to add one bit left) And analogously, to calculate the power 5^7, you do the multiplications: 5^1 (bit 0 of 7) *5^2 (bit 1 of 7) *5^4 (bit 2 of 7) This way you need only to calculate the powers that are themselves powers of two, so each one is calculated just as a x*x, where x is previous power (for instance 5^4=5^2*5^2). So the algorithm is: start with x equal to the base of power and result equal to 1, and in the loop shift the value of power by one bit left, if the bit you shifted out was 1, multiply result by your current x value, otherwise keep the result as it is, then do x=x*x and repeat loop until the shifted power value is 0. This algorithm is very simple to implement and doesn't need any large stack space or anything similar. 

26 Apr 2004, 18:20 

Madis731
I'm having some problems:
OK, the input should be in DEC and this shouldn't be a problem to convert, but the output should also be in DEC. Converting long strings HEX=>DEC will be rough The other thing is that I tried making a scheme for overflow handling, but lightly said, I failed I got stuck there where I found, that MUL doesn't always flow over with just one bit. Sometimes its even 32 bits. Maybe I could preserve EDX in some temp register ...troubles, troubles 

26 Apr 2004, 23:22 

Tomasz Grysztar
What do you mean by "converting long strings HEX=>DEC"? Are we not talking about elementary binary calculations here?


27 Apr 2004, 05:38 

Madis731
I was thinking of something like:
2^100=1267650600228229401496703205376 I made my program read user input and start coding on calculations as we speak:) My primary goal would be integers only, but in the future, maybe floats too... Oh, and for simplicity just console output for now. 

27 Apr 2004, 08:20 

Tomasz Grysztar
Well, to get 2^100 and such you will need to do the calculations on very large (128 bit or more) numbers, that means a bit more work to do multiplications.


27 Apr 2004, 08:23 

Madis731
I mean, there are no fixed lengths because 100^100 is even bigger and I want to output them all, like a 10000 character buffer.
That is why I said LONG strings, because it is hard to convert 100byte hex numbers to strings for output 

27 Apr 2004, 08:29 

Madis731
Some progress stalls
edx=0 and eax=*buffer and ebx=*endOfBuf Code: NextNo:shl edx,1 ;/* mov ecx,edx ; Multiply by 10 shl edx,2 ; x=2Â³x+2x add edx,ecx ;*/ add dl,[eax] ;edx has "0""9" sub dl,"0" ;edx has 09 add eax,1 ;take the next number sub ebx,1 ;The end is getting near:) jnz NextNo Update: Code: lea edx,[5*edx] ;that is lea edx,[edx+4*edx] shl edx,1 ;that is add edx,edx ;... here edx is 10times bigger than before ;and no need to waste ecx (speed remains the same on 3Âµop CPUs) The program only gave junk in console mode(*.com), but from the point where I moved to exe, it did ok. Last edited by Madis731 on 10 Aug 2004, 15:40; edited 1 time in total 

27 Apr 2004, 08:32 

Tomasz Grysztar
This will work only for numbers that can fit in 32 bits. For larger numbers you need to store all dwords in memory and perform shifts and additions on the whole dword sequences (as you need calculations on numbers with no fixed length, you will need also to store the count of dword for each number somewhere). This is quite easy, it's the multiplication of such numbers that may be your problem.
For floats it would be much easier, but you would loose the precision. 

27 Apr 2004, 09:26 

Madis731
Heh I just realized, that I can't calculate it at all, because it would take alot (ALOT) of time to convert them to decimal. The best way to go would be to make a string array and do calculations on them. To conserve space, I would have to pack 8 decimals in 32 bits. The only problem is that I can't abuse arithmetical advantages of digital logic :'(
Isn't there a way to get over this? If there is an easy algorithm to get that FFFFFFFFFFFFFFFF is 18446744073709551615... Hehe, I just came to an idea of lookuptables, but I can't possibly make one for 1000+ byte numbers, can I? P.S. discovered a nice trick that might help, to get the length of the output, I would do this: A^B=C length(C)=int(log(A)*B+1) Now I know how much memory should I set aside for calculations. P.P.S, another problem arises, because I don't know how to take a log() from a number. FPU has got 4 constants, but no log() operation. I'm trying to combine the constants to make my own operation haven't had much success Math tricks: To all of you wondering hot to get log(that's base 10 logarithm) I have a solution for you: Code: fninit ;If you haven't had calc in FP before fldlg2 ;ST0=log10(2) fild dword [dat] ;ST0=X,ST1=log10(2) fyl2x ;ST0=log10(2)*log2(X) fist dword [dat+4] ;Y=ST0 Last edited by Madis731 on 06 Jul 2004, 11:21; edited 1 time in total 

28 Apr 2004, 22:38 

Madis731
It's done:
Available @ http://www.itcollege.ee/~mkalme/ => click "Progemine" and then "Power" to download. NB! in 7zip format ...and if I'm lucky, should soon be in the examples section also. P.S. Privalov, you were right about the limit: the code only handled 1byte long integers meaning 255^255 was the biggest input, but I got over that and now it can handle upto 4294967295^4294967295 *theoretically* but another limit is 9char buffer so 99^999999 is about the largest number you can input. Caution!!! 9542425bytes long (9.4MB) can take hours to calculate even on high end computers 

10 Aug 2004, 12:10 

Imagist
Privalov, I'm not sure the algo you described is any different from the brute force algo. 5^7 can be calculated as 5*5*5*5*5*5*5 just as quickly. Your algo simply does 5*(5*5)*(5*5*5*5). Is there any difference that I'm not seeing?


10 Aug 2004, 19:26 

Madis731
Yes, a big difference:
when doing 5*5*5*5*5*5*5 you have 6 multiplies, but when using some neat tricks you can have as less as 4 multiplies! Here we go: 5 is our input let's square it: 5*5=25 <= the first now let's do the first part: 5*(5*5)=25 <=the second now we don't have to go through this again, do we? 5*5*5*5=25*25=625 <=the third and finally, all together 625*25=...=78125 voila! When 6/4=1.5X improvement isn't enough, try 2^32 2*2*...*2  that is 31 multiplies comparing to 5 multiplies with that ingenious trick :$ 31/5=6.2X improvement WOW!!! 

10 Aug 2004, 19:43 

Imagist
Here's the steps as I see it:
First algo: 1. 5*5=25 2. 25*5=125 3. 125*5=625 4. 625*5=3125 5. 3125*5=15625 6. 15625*5=71825 Second algo: 1. 5=5 2. 5*5=25 3. 5*5=25 (you already did this, but you're doing it again because your computer doesn't know you did this, and if you tell it that it will have to do two things: be told that it knows it (through a condition test) and go find it. One thing is better than two, no?) 4. 5*5=25 (same thing) 5. 25*25=625 6. 5*25=125 7. 125*625=71825 So while the human mind, while thinking of the thing it just did, doesn't need to recalculate 5*5, the computer does, or must go off and get it, which takes longer. I see what you are trying to do, using the same ideas as binary searching and sorting, but it doesn't work here because of the nature of the operations being performed. You may argue that this is only the case when using small numbers like 5 and 7, but this is not the case. That algorithm only ever matches the speed of the brute force algo when the exponent is a power of two: the farther away from being a power of two the exponent is, the worse the performance. 7 is only one away from 8 (a power of two), but numbers like 96, which are far away from the nearest powers of two would be horrible to calculate in cases like 5^96. 

10 Aug 2004, 20:13 

Madis731
Code: ;We'd like to calculate 5^96 ;OK, which one is faster? ;1) data dd 5 mov eax,1 ;Multiplying w/ 0 would be pointless mov ecx,95 Recalc: mul [data] dec ecx jnc ;The result in eax ;OR ;2) data dd 5 mov eax,[data] mul eax ;25 mul eax ;625 mul eax ;390625 mul eax ;152587890625 mul eax ;23283064365386962890625 mov ebx,eax ;" mul eax ;54210108624275221700372640043497\ ;0855712890625 mul ebx ;12621774483536188886587657044524\ ;57967477130296174436807632446290\ ;625 ;The result in eax ;This is a VERY bad example, because 5^96 is 7dwords ;in length and you get only the lowest dword, but ;it is just to show you that computer does NOT have ;to forget everything, computers DO have memory! 

10 Aug 2004, 20:47 

Imagist
The second, of course. But that is only because the parts which make the macro adaptable to any input (not just 5 and 96) have been taken out.


11 Aug 2004, 00:26 

Madis731
Well, if you still don't believe me, try out my code. Change some lines to
do the multiplying the hard way. I've implemented the same technique WIHTOUT any macros and it's working fine. 167^765 is calculated in only a second. The interesting part is that the binary value is calculated within under 1% of programs time, the conversion HEX=>BCD=>DEC takes longer. 

11 Aug 2004, 06:10 

Imagist
That's exactly my point: creating a macro that adapts to any input makes the second method slower. If you make the macro nonadaptive, or take it out of a macro entirely, then you are right, the second macro is faster. But that code would also defeat the purpose of the macro, because it can only be used for one specific situation.


11 Aug 2004, 16:46 

< Last Thread  Next Thread > 
Forum Rules:

Copyright © 19992020, Tomasz Grysztar. Also on GitHub, YouTube, Twitter.
Website powered by rwasa.