flat assembler
Message board for the users of flat assembler.
Index
> Main > the old square root problem, again 
Author 

revolution 14 Aug 2015, 22:47


14 Aug 2015, 22:47 

magicSqr 14 Aug 2015, 23:18
revolution wrote: There is a whole topic about this. Started by you. Hi revolution This is not the same question, hence the new thread. In the other thread I was asking the best way to *calculate* if a number was a perfect square. In this case I am not trying to find the square root, just a 'reasonable' a < square root and a 'reasonable' b > square root. Then, as I said in my OP, if I have a list of squares from a to b, a simple scan would find if the number is in the list. without having to scan from 1² to x/2. Bearing in mind that I need to do this for huge numbers (10^30+) a huge number of times, then as an example, I can say that for integer x then 1 <= sqrt x <= x/2 which is fine for small values of x, i.e. 1 <= sqrt 1,000,000 <= 500,000 leaves a relatively quick rep, scas of a list of squares between these values. When dealing with x = 490,000,000,000,000,000,000,000,000,000 then rep,scas from 1 to 245,000,000,000,000,000,000,000,000,000 on a list of squares is a long way to go to get the correct value of 700,000,000,000,000. What I am asking is, what are better ways of getting estimates for a and b rather than the trivial a=1 <= sqrt x <= b=x/2 magic² 

14 Aug 2015, 23:18 

redsock 15 Aug 2015, 06:15
does the normal "fastest integer square root" method still do the trick (regardless of whether translated for bigints or not)?
http://www.codecodex.com/wiki/Calculate_an_integer_square_root ? 

15 Aug 2015, 06:15 

redsock 15 Aug 2015, 07:30
Sitting here on my day off (Saturday night), and thought I'd have a quick play with this to see what you're really up against
I wrote this code against my own HeavyThing library's bigint implementation (https://2ton.com.au/HeavyThing/), and the code looks like: Code: include '../../ht_defaults.inc' include '../../ht.inc' ; two arguments: rdi == destination bigint, rsi == source bigint (not destroyed) ; performs an integer square root (truncating the decimals) falign isquare: prolog isquare push r12 r13 r14 r15 rdi mov rdi, rsi call bigint$new_copy mov r12, rax ; x = source call bigint$new mov r13, rax ; y = 0 call bigint$new mov r14, rax call bigint$new mov r15, rax ; temp = 0 mov rdi, r12 call bigint$bitcount add eax, 1 and eax, not 1 jz .alldone mov rdi, r14 mov esi, eax call bigint$bitset ; z = highest power of four <= source calign .outer: mov rdi, r15 mov rsi, r13 call bigint$assign mov rdi, r15 mov rsi, r14 call bigint$add_unsigned ; temp = y + z mov rdi, r12 mov rsi, r15 call bigint$compare_unsigned cmp eax, 0 jl .nomod mov rdi, r12 mov rsi, r15 call bigint$subtract ; x = temp mov rdi, r15 mov rsi, r14 call bigint$assign mov rdi, r15 mov esi, 1 call bigint$shl mov rdi, r13 mov rsi, r15 call bigint$add_unsigned ; y += z * 2 .nomod: mov rdi, r13 mov esi, 1 call bigint$shr ; y /= 2 mov rdi, r14 mov esi, 2 call bigint$shr ; z /= 4 mov rdi, r14 call bigint$is_zero test eax, eax jz .outer .alldone: pop rdi mov rsi, r13 call bigint$assign ; cleanup mov rdi, r15 pop r15 call bigint$destroy mov rdi, r14 pop r14 call bigint$destroy mov rdi, r13 pop r13 call bigint$destroy mov rdi, r12 pop r12 call bigint$destroy epilog public _start falign _start: call ht$init ; timing test for 1,000,000 of these operations: mov ebx, 1000000 mov rdi, .test mov esi, .testlen call bigint$new_encoded mov r12, rax ; our source call bigint$new mov r13, rax ; our destination calign .timing_loop: mov rdi, r13 mov rsi, r12 call isquare sub ebx, 1 jnz .timing_loop ; debug our result (compare with wolframalpha to base 16 if you are interested) mov rdi, r13 call bigint$debug mov eax, syscall_exit xor edi, edi syscall dalign .test: ; we want: 490,000,000,000,000,000,000,000,000,000, which is (in big endian form): db 0x06, 0x2f, 0x46, 0x80, 0x09, 0xe5, 0x15, 0x74, 0x94, 0x10, 0x00, 0x00, 0x00 ; or little endian: ; db 0x00, 0x00, 0x00, 0x10, 0x94, 0x74, 0x15, 0xe5, 0x09, 0x80, 0x46, 0x2f, 0x06 .testlen = $  .test include '../../ht_data.inc' Anyway, 1M tests on my local dev environment doing the big integer square root only (older 3930k virtual machine) runs the entire test in 4.14 seconds, or 241,545 per second for a single thread. (and that is with your example number of 490,000,000,000,000,000,000,000,000,000) And then if I modify the code again to do the square (only the timing loop itself with the relevant modification bit is included here): Code: calign .timing_loop: mov rdi, r13 mov rsi, r12 call isquare ; modified bit to square the result so we can compare it to our source mov rdi, r13 call bigint$square ; and final comparison against the source to see if it was a perfect square or not mov rdi, r12 mov rsi, r13 call bigint$compare_unsigned ; if eax == 0 here, perfect square it is ; end modified bit sub ebx, 1 jnz .timing_loop ; debug our result (compare with wolframalpha to base 16 if you are interested) ; (which in our modified form will be equal to our input) mov rdi, r13 call bigint$debug mov eax, syscall_exit xor edi, edi syscall And this runs the entire 1M test in 4.215 seconds, or 237,247 per second to do the "is it a perfect square root" per your commentary... Obviously the square operation is not expensive for such small numbers as your example... That algorithm (the one I just hacked together for my bigint isquare) is by no means heavily optimised, and doesn't even follow the current accepted fastest method. Did I misinterpret your Q, or am I on the right track here. 

15 Aug 2015, 07:30 

magicSqr 15 Aug 2015, 09:22
Hi Redsock,
I haven't had a chance to look through your code yet but thanks for the time you put in From your text it seems that you think I am trying to 'calculate' the square root. Here is a *small* example... Assume I have x dd 489 sqrs dd 1, 4, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, 256, 289, 324, 361, 400, 441, 484, 529, 576, 625, 676, 729, 784, 841, 900, 961 I am trying to figure if there is a way to get a good approximation for the square root of x, for example sqrt(289) < sqrt(x) < sqrt(625) I could then just do a repne scasd through 324, 361, 400, 441, 484, 529, 576 to see if x is in the list, i.e. a perfect square. There would be no need to scan through the whole array, or to take any square roots. Thanks, magic² 

15 Aug 2015, 09:22 

redsock 15 Aug 2015, 09:27
Isn't "integer square roots" equal to approximated results to begin with?
I kinda see where you are going, so perhaps the Q is better left to hardcore math heads rather than programmer heads like me You are looking for an equation to reduce the complexity of approximating square roots (which again IMO integer square roots already are) sorry I wasn't of more service. Please update here when you find the answer as now I am invested in the outcome! Cheers 

15 Aug 2015, 09:27 

magicSqr 15 Aug 2015, 16:16
Hi Redsock,
Here's the actual problem I'm trying to overcome. I require numbers that are the sum of 2 squares in *at least* four ways. Although these can be found with prime factors of the form 4k + 3, when these factors have an even exponent, and can also have multiple factors of 2, I am interested in a very particular type. Any number that only has distinct prime factors of the form 4k + 1. If a number has 3 of these distinct factors it will be the sum of 2 squares in exactly four ways. The lowest of these is 1105 = 33^2 + 4^2 = 32^2 + 9^2 = 31^2 + 12^2 = 24^2 + 23^2 with more that 3 distinct prime factors 4k+1 the result with be the sum of 2 squares in an increasing amount of ways, which I also want. Now, the problem... (1) I can produce a list of various m^2 + n^2 = x, then check if x has factors of the of the required format, then find other m^2 + n^2 that equal the same x. This is very inefficient. (2) I can multiply prime numbers of the required format to give me an x. i.e. 5 * 13 * 17 = 1105. I know these numbers have the required property but do not know what the four ways that they are the sum of 2 squares is without checking. These can be found in 2 ways (starting with a list of square numbers from 1 up)... (2a) I can subtract increasing m^2 (m starting at 1) from x up to a limit of sqrt(x/2) and check if x  m^2 is a perfect square, or (2b) I think this will be my preferred option unless someone proposes a better solution... as (2a) subtract increasing m^2 (m starting at 1) from x but this time up to a limit where x < m^2 and store each result of x  m^2, call these numbers n^2. The if I check each n^2 against the limit reached for m^2 in decreasing values, every value of n^2 that occurs in m^2 will be known to be a perfect square. In English it sounds a lot more complex than it actually is. Here is a run for x = 5*13*17 = 1105 1105  1^2 = 1104 1105  2^2 = 1101 1105  3^2 = 1096 1105  4^2 = 1089 1105  5^2 = 1080 1105  6^2 = 1069 1105  7^2 = 1056 1105  8^2 = 1041 1105  9^2 = 1024 1105  10^2 = 1005 1105  11^2 = 984 1105  12^2 = 961 1105  13^2 = 936 1105  14^2 = 909 1105  15^2 = 880 1105  16^2 = 849 1105  17^2 = 816 1105  18^2 = 781 1105  19^2 = 744 1105  20^2 = 705 1105  21^2 = 664 1105  22^2 = 621 1105  23^2 = 576 1105  24^2 = 529 1105  25^2 = 480 1105  26^2 = 429 1105  27^2 = 376 1105  28^27 = 321 1105  29^2 = 264 1105  30^3 = 205 1105  31^2 = 144 1105  32^2 = 81 1105  33 ^2 = 16 checking values on the rhs against values in the middle the only numbers that appear in both are 16, 81, 144, 529, 576, 961, 1024, 1089. These give for four required sets. I think this will be the most efficient as it only uses subtraction and checking. However, when I get to *large* numbers (at *least* 10^25), which I will, this is still a very longwinded approach. Unfortunately, I do not believe it is possible to *calculate* what the numbers are in the different ways a particular number is the sum of 2 squares. i.e. I can say with certainty that 982,451,653 * 982,451,629 * 982,451,609 = 948,273,322,896,298,772,346,363,833 is the sum of 2 squares in exactly four ways, finding these four ways is not so easy, lol. Thanks again and sorry for the lllooooonnnngggggggggg post, magic² 

15 Aug 2015, 16:16 

Xorpd! 15 Aug 2015, 20:19
While I was riding the bus, I saw an address that was 6161. Obvious prime factorization was 61*101. We have 61 = 5²+6² = 5+6i², 101 = 1²+10² = 1+10i^2. Then (5+6i)(1+10i) = 55+56i, so 6161 = 55+56i² = 55²+56². Again, (5+6i)(110i) = 6544i, so 6161 = 65+44i² = 65²+44². Easy enough to do this in your head provided you are using public transportation and not driving.
So if you can factor the number you need only solve three relatively easy problems instead of one really hard problem. But perhaps I don't understand the purpose of your research? 

15 Aug 2015, 20:19 

magicSqr 16 Aug 2015, 19:35
You're a genius Xorpd!
I had tried something similar but kept tripping myself up. Your solution also gives me the following for a number with 3 factors... Let x = (a^2+b^2)(c^2+d^2)(e^2+f^2) then x = (aceadfbcfbde)^2+(acf+ade+bcebdf)^2 x = (ace+adfbcf+bde)^2+(acfade+bce+bdf)^2 x = (ace+adf+bcfbde)^2(acfadebcebdf)^2 x = (aceadf+bcf+bde)^2(acf+adebce+bdf)^2 For the big example I gave x = 982,451,653 * 982,451,629 * 982,451,609 = 948,273,322,896,298,772,346,363,833 we get x = 27817821433933^2 + 13207654355188^2 x = 30453641615477^2 + 4566074380952^2 x = 29922054378227^2 + 7276261724452^2 x = 30745933821163^2 + 1720719721792^2 Many thanks, magic² 

16 Aug 2015, 19:35 

Xorpd! 16 Aug 2015, 21:40
I offer a brief bibliography that may increase your enjoyment of this sort of problem.
Recreations in the Theory of Numbers: the Queen of Mathematics Entertains, A. H. Beiler Introduction to Number Theory, James E. Shockley An Introduction to the Theory of Numbers, G. H. Hardy and Edward M. Wright Advanced Number Theory, Harvey Cohn 

16 Aug 2015, 21:40 

magicSqr 16 Aug 2015, 22:11
Thanks again


16 Aug 2015, 22:11 

< Last Thread  Next Thread > 
Forum Rules:

Copyright © 19992023, Tomasz Grysztar. Also on GitHub, YouTube, Twitter.
Website powered by rwasa.