Message board for the users of flat assembler.
> Main > the old square root problem, again
Instead of trying to re-invent the wheel...
I have an integer, x, for argument sake let x = 123456789 and I want to know if it is a perfect square (i.e. it's square root is an integer).
Assume I have list of square integers that includes all squares from 1 to x^4.
What is the best way of finding a reasonable approximation for a < sqrt(x) and a b > sqrt(x) where a and b are integers?
If these are reasonable approximations then a simple and quick rep, scas will find whether x is a perfect square or not. There would be no need to check sqrt whether by fp or gmp.
Any ideas would be greatly appreciated,
|14 Aug 2015, 22:09||
There is a whole topic about this. Started by you.
|14 Aug 2015, 22:47||
There is a whole topic about this. Started by you.
This is not the same question, hence the new thread. In the other thread I was asking the best way to *calculate* if a number was a perfect square.
In this case I am not trying to find the square root, just a 'reasonable' a < square root and a 'reasonable' b > square root. Then, as I said in my OP, if I have a list of squares from a to b, a simple scan would find if the number is in the list. without having to scan from 1² to x/2.
Bearing in mind that I need to do this for huge numbers (10^30+) a huge number of times, then as an example, I can say that for integer x then 1 <= sqrt x <= x/2 which is fine for small values of x, i.e. 1 <= sqrt 1,000,000 <= 500,000 leaves a relatively quick rep, scas of a list of squares between these values. When dealing with x = 490,000,000,000,000,000,000,000,000,000 then rep,scas from 1 to 245,000,000,000,000,000,000,000,000,000 on a list of squares is a long way to go to get the correct value of 700,000,000,000,000.
What I am asking is, what are better ways of getting estimates for a and b rather than the trivial a=1 <= sqrt x <= b=x/2
|14 Aug 2015, 23:18||
does the normal "fastest integer square root" method still do the trick (regardless of whether translated for bigints or not)?
|15 Aug 2015, 06:15||
Sitting here on my day off (Saturday night), and thought I'd have a quick play with this to see what you're really up against
I wrote this code against my own HeavyThing library's bigint implementation (https://2ton.com.au/HeavyThing/), and the code looks like:
Note that you'll likely have to adjust the include '../../ht goods if you move it somewhere else relative to my library and want to play around with it.
include '../../ht_defaults.inc' include '../../ht.inc' ; two arguments: rdi == destination bigint, rsi == source bigint (not destroyed) ; performs an integer square root (truncating the decimals) falign isquare: prolog isquare push r12 r13 r14 r15 rdi mov rdi, rsi call bigint$new_copy mov r12, rax ; x = source call bigint$new mov r13, rax ; y = 0 call bigint$new mov r14, rax call bigint$new mov r15, rax ; temp = 0 mov rdi, r12 call bigint$bitcount add eax, 1 and eax, not 1 jz .alldone mov rdi, r14 mov esi, eax call bigint$bitset ; z = highest power of four <= source calign .outer: mov rdi, r15 mov rsi, r13 call bigint$assign mov rdi, r15 mov rsi, r14 call bigint$add_unsigned ; temp = y + z mov rdi, r12 mov rsi, r15 call bigint$compare_unsigned cmp eax, 0 jl .nomod mov rdi, r12 mov rsi, r15 call bigint$subtract ; x -= temp mov rdi, r15 mov rsi, r14 call bigint$assign mov rdi, r15 mov esi, 1 call bigint$shl mov rdi, r13 mov rsi, r15 call bigint$add_unsigned ; y += z * 2 .nomod: mov rdi, r13 mov esi, 1 call bigint$shr ; y /= 2 mov rdi, r14 mov esi, 2 call bigint$shr ; z /= 4 mov rdi, r14 call bigint$is_zero test eax, eax jz .outer .alldone: pop rdi mov rsi, r13 call bigint$assign ; cleanup mov rdi, r15 pop r15 call bigint$destroy mov rdi, r14 pop r14 call bigint$destroy mov rdi, r13 pop r13 call bigint$destroy mov rdi, r12 pop r12 call bigint$destroy epilog public _start falign _start: call ht$init ; timing test for 1,000,000 of these operations: mov ebx, 1000000 mov rdi, .test mov esi, .testlen call bigint$new_encoded mov r12, rax ; our source call bigint$new mov r13, rax ; our destination calign .timing_loop: mov rdi, r13 mov rsi, r12 call isquare sub ebx, 1 jnz .timing_loop ; debug our result (compare with wolframalpha to base 16 if you are interested) mov rdi, r13 call bigint$debug mov eax, syscall_exit xor edi, edi syscall dalign .test: ; we want: 490,000,000,000,000,000,000,000,000,000, which is (in big endian form): db 0x06, 0x2f, 0x46, 0x80, 0x09, 0xe5, 0x15, 0x74, 0x94, 0x10, 0x00, 0x00, 0x00 ; or little endian: ; db 0x00, 0x00, 0x00, 0x10, 0x94, 0x74, 0x15, 0xe5, 0x09, 0x80, 0x46, 0x2f, 0x06 .testlen = $ - .test include '../../ht_data.inc'
Anyway, 1M tests on my local dev environment doing the big integer square root only (older 3930k virtual machine) runs the entire test in 4.14 seconds, or 241,545 per second for a single thread.
(and that is with your example number of 490,000,000,000,000,000,000,000,000,000)
And then if I modify the code again to do the square (only the timing loop itself with the relevant modification bit is included here):
calign .timing_loop: mov rdi, r13 mov rsi, r12 call isquare ; modified bit to square the result so we can compare it to our source mov rdi, r13 call bigint$square ; and final comparison against the source to see if it was a perfect square or not mov rdi, r12 mov rsi, r13 call bigint$compare_unsigned ; if eax == 0 here, perfect square it is ; end modified bit sub ebx, 1 jnz .timing_loop ; debug our result (compare with wolframalpha to base 16 if you are interested) ; (which in our modified form will be equal to our input) mov rdi, r13 call bigint$debug mov eax, syscall_exit xor edi, edi syscall
And this runs the entire 1M test in 4.215 seconds, or 237,247 per second to do the "is it a perfect square root" per your commentary...
Obviously the square operation is not expensive for such small numbers as your example...
That algorithm (the one I just hacked together for my bigint isquare) is by no means heavily optimised, and doesn't even follow the current accepted fastest method.
Did I misinterpret your Q, or am I on the right track here.
|15 Aug 2015, 07:30||
I haven't had a chance to look through your code yet but thanks for the time you put in
From your text it seems that you think I am trying to 'calculate' the square root. Here is a *small* example...
Assume I have
x dd 489
sqrs dd 1, 4, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, 256, 289, 324, 361, 400, 441, 484, 529, 576, 625, 676, 729, 784, 841, 900, 961
I am trying to figure if there is a way to get a good approximation for the square root of x, for example sqrt(289) < sqrt(x) < sqrt(625)
I could then just do a repne scasd through 324, 361, 400, 441, 484, 529, 576 to see if x is in the list, i.e. a perfect square. There would be no need to scan through the whole array, or to take any square roots.
|15 Aug 2015, 09:22||
Isn't "integer square roots" equal to approximated results to begin with?
I kinda see where you are going, so perhaps the Q is better left to hardcore math heads rather than programmer heads like me
You are looking for an equation to reduce the complexity of approximating square roots (which again IMO integer square roots already are)
sorry I wasn't of more service. Please update here when you find the answer as now I am invested in the outcome!
|15 Aug 2015, 09:27||
Here's the actual problem I'm trying to overcome. I require numbers that are the sum of 2 squares in *at least* four ways. Although these can be found with prime factors of the form 4k + 3, when these factors have an even exponent, and can also have multiple factors of 2, I am interested in a very particular type. Any number that only has distinct prime factors of the form 4k + 1. If a number has 3 of these distinct factors it will be the sum of 2 squares in exactly four ways. The lowest of these is
1105 = 33^2 + 4^2 = 32^2 + 9^2 = 31^2 + 12^2 = 24^2 + 23^2
with more that 3 distinct prime factors 4k+1 the result with be the sum of 2 squares in an increasing amount of ways, which I also want.
Now, the problem...
(1) I can produce a list of various m^2 + n^2 = x, then check if x has factors of the of the required format, then find other m^2 + n^2 that equal the same x. This is very inefficient.
(2) I can multiply prime numbers of the required format to give me an x.
i.e. 5 * 13 * 17 = 1105. I know these numbers have the required property but do not know what the four ways that they are the sum of 2 squares is without checking.
These can be found in 2 ways (starting with a list of square numbers from 1 up)...
(2a) I can subtract increasing m^2 (m starting at 1) from x up to a limit of sqrt(x/2) and check if x - m^2 is a perfect square, or
(2b) I think this will be my preferred option unless someone proposes a better solution... as (2a) subtract increasing m^2 (m starting at 1) from x but this time up to a limit where x < m^2 and store each result of x - m^2, call these numbers n^2. The if I check each n^2 against the limit reached for m^2 in decreasing values, every value of n^2 that occurs in m^2 will be known to be a perfect square.
In English it sounds a lot more complex than it actually is. Here is a run for x = 5*13*17 = 1105
1105 - 1^2 = 1104
1105 - 2^2 = 1101
1105 - 3^2 = 1096
1105 - 4^2 = 1089
1105 - 5^2 = 1080
1105 - 6^2 = 1069
1105 - 7^2 = 1056
1105 - 8^2 = 1041
1105 - 9^2 = 1024
1105 - 10^2 = 1005
1105 - 11^2 = 984
1105 - 12^2 = 961
1105 - 13^2 = 936
1105 - 14^2 = 909
1105 - 15^2 = 880
1105 - 16^2 = 849
1105 - 17^2 = 816
1105 - 18^2 = 781
1105 - 19^2 = 744
1105 - 20^2 = 705
1105 - 21^2 = 664
1105 - 22^2 = 621
1105 - 23^2 = 576
1105 - 24^2 = 529
1105 - 25^2 = 480
1105 - 26^2 = 429
1105 - 27^2 = 376
1105 - 28^27 = 321
1105 - 29^2 = 264
1105 - 30^3 = 205
1105 - 31^2 = 144
1105 - 32^2 = 81
1105 - 33 ^2 = 16
checking values on the rhs against values in the middle the only numbers that appear in both are 16, 81, 144, 529, 576, 961, 1024, 1089. These give for four required sets.
I think this will be the most efficient as it only uses subtraction and checking.
However, when I get to *large* numbers (at *least* 10^25), which I will, this is still a very long-winded approach.
Unfortunately, I do not believe it is possible to *calculate* what the numbers are in the different ways a particular number is the sum of 2 squares.
i.e. I can say with certainty that
982,451,653 * 982,451,629 * 982,451,609 = 948,273,322,896,298,772,346,363,833
is the sum of 2 squares in exactly four ways, finding these four ways is not so easy, lol.
Thanks again and sorry for the lllooooonnnngggggggggg post,
|15 Aug 2015, 16:16||
While I was riding the bus, I saw an address that was 6161. Obvious prime factorization was 61*101. We have 61 = 5²+6² = |5+6i|², 101 = 1²+10² = |1+10i|^2. Then (5+6i)(1+10i) = -55+56i, so 6161 = |55+56i|² = 55²+56². Again, (5+6i)(1-10i) = 65-44i, so 6161 = |65+44i|² = 65²+44². Easy enough to do this in your head provided you are using public transportation and not driving.
So if you can factor the number you need only solve three relatively easy problems instead of one really hard problem. But perhaps I don't understand the purpose of your research?
|15 Aug 2015, 20:19||
You're a genius Xorpd!
I had tried something similar but kept tripping myself up.
Your solution also gives me the following for a number with 3 factors...
Let x = (a^2+b^2)(c^2+d^2)(e^2+f^2) then
x = (ace-adf-bcf-bde)^2+(acf+ade+bce-bdf)^2
x = (ace+adf-bcf+bde)^2+(acf-ade+bce+bdf)^2
x = (ace+adf+bcf-bde)^2-(acf-ade-bce-bdf)^2
x = (ace-adf+bcf+bde)^2-(acf+ade-bce+bdf)^2
For the big example I gave
x = 982,451,653 * 982,451,629 * 982,451,609 = 948,273,322,896,298,772,346,363,833
x = 27817821433933^2 + 13207654355188^2
x = 30453641615477^2 + 4566074380952^2
x = 29922054378227^2 + 7276261724452^2
x = 30745933821163^2 + 1720719721792^2
|16 Aug 2015, 19:35||
I offer a brief bibliography that may increase your enjoyment of this sort of problem.
Recreations in the Theory of Numbers: the Queen of Mathematics Entertains, A. H. Beiler
Introduction to Number Theory, James E. Shockley
An Introduction to the Theory of Numbers, G. H. Hardy and Edward M. Wright
Advanced Number Theory, Harvey Cohn
|16 Aug 2015, 21:40||
|16 Aug 2015, 22:11||
< Last Thread | Next Thread >
Copyright © 1999-2020, Tomasz Grysztar. Also on GitHub, YouTube, Twitter.
Website powered by rwasa.