flat assembler
Message board for the users of flat assembler.

Index > Main > the old square root problem, again

Author
Thread Post new topic Reply to topic
magicSqr



Joined: 27 Aug 2011
Posts: 105
magicSqr
Hi all,

Instead of trying to re-invent the wheel...
I have an integer, x, for argument sake let x = 123456789 and I want to know if it is a perfect square (i.e. it's square root is an integer).
Assume I have list of square integers that includes all squares from 1 to x^4.
What is the best way of finding a reasonable approximation for a < sqrt(x) and a b > sqrt(x) where a and b are integers?

If these are reasonable approximations then a simple and quick rep, scas will find whether x is a perfect square or not. There would be no need to check sqrt whether by fp or gmp.

Any ideas would be greatly appreciated,

magic²
Post 14 Aug 2015, 22:09
View user's profile Send private message Reply with quote
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 17284
Location: In your JS exploiting you and your system
revolution
There is a whole topic about this. Started by you.

http://board.flatassembler.net/topic.php?t=13324
Post 14 Aug 2015, 22:47
View user's profile Send private message Visit poster's website Reply with quote
magicSqr



Joined: 27 Aug 2011
Posts: 105
magicSqr
revolution wrote:
There is a whole topic about this. Started by you.

http://board.flatassembler.net/topic.php?t=13324


Hi revolution Wink

This is not the same question, hence the new thread. In the other thread I was asking the best way to *calculate* if a number was a perfect square.
In this case I am not trying to find the square root, just a 'reasonable' a < square root and a 'reasonable' b > square root. Then, as I said in my OP, if I have a list of squares from a to b, a simple scan would find if the number is in the list. without having to scan from 1² to x/2.

Bearing in mind that I need to do this for huge numbers (10^30+) a huge number of times, then as an example, I can say that for integer x then 1 <= sqrt x <= x/2 which is fine for small values of x, i.e. 1 <= sqrt 1,000,000 <= 500,000 leaves a relatively quick rep, scas of a list of squares between these values. When dealing with x = 490,000,000,000,000,000,000,000,000,000 then rep,scas from 1 to 245,000,000,000,000,000,000,000,000,000 on a list of squares is a long way to go to get the correct value of 700,000,000,000,000.
What I am asking is, what are better ways of getting estimates for a and b rather than the trivial a=1 <= sqrt x <= b=x/2

magic²
Post 14 Aug 2015, 23:18
View user's profile Send private message Reply with quote
redsock



Joined: 09 Oct 2009
Posts: 357
Location: Australia
redsock
does the normal "fastest integer square root" method still do the trick (regardless of whether translated for bigints or not)?

http://www.codecodex.com/wiki/Calculate_an_integer_square_root ?
Post 15 Aug 2015, 06:15
View user's profile Send private message Reply with quote
redsock



Joined: 09 Oct 2009
Posts: 357
Location: Australia
redsock
Sitting here on my day off (Saturday night), and thought I'd have a quick play with this to see what you're really up against Smile

I wrote this code against my own HeavyThing library's bigint implementation (https://2ton.com.au/HeavyThing/), and the code looks like:
Code:
include '../../ht_defaults.inc'
include '../../ht.inc'

        ; two arguments: rdi == destination bigint, rsi == source bigint (not destroyed)
        ; performs an integer square root (truncating the decimals)
falign
isquare:
        prolog  isquare
        push    r12 r13 r14 r15 rdi
        mov     rdi, rsi
        call    bigint$new_copy
        mov     r12, rax                        ; x = source
        call    bigint$new
        mov     r13, rax                        ; y = 0
        call    bigint$new
        mov     r14, rax
        call    bigint$new
        mov     r15, rax                        ; temp = 0
        mov     rdi, r12
        call    bigint$bitcount
        add     eax, 1
        and     eax, not 1
        jz      .alldone
        mov     rdi, r14
        mov     esi, eax
        call    bigint$bitset                   ; z = highest power of four <= source
calign
.outer:
        mov     rdi, r15
        mov     rsi, r13
        call    bigint$assign
        mov     rdi, r15
        mov     rsi, r14
        call    bigint$add_unsigned             ; temp = y + z
        mov     rdi, r12
        mov     rsi, r15
        call    bigint$compare_unsigned
        cmp     eax, 0
        jl      .nomod
        mov     rdi, r12
        mov     rsi, r15
        call    bigint$subtract                 ; x -= temp
        mov     rdi, r15
        mov     rsi, r14
        call    bigint$assign
        mov     rdi, r15
        mov     esi, 1
        call    bigint$shl
        mov     rdi, r13
        mov     rsi, r15
        call    bigint$add_unsigned             ; y += z * 2
.nomod:
        mov     rdi, r13
        mov     esi, 1
        call    bigint$shr                      ; y /= 2
        mov     rdi, r14
        mov     esi, 2
        call    bigint$shr                      ; z /= 4
        mov     rdi, r14
        call    bigint$is_zero
        test    eax, eax
        jz      .outer
.alldone:
        pop     rdi
        mov     rsi, r13
        call    bigint$assign
        ; cleanup
        mov     rdi, r15
        pop     r15
        call    bigint$destroy
        mov     rdi, r14
        pop     r14
        call    bigint$destroy
        mov     rdi, r13
        pop     r13
        call    bigint$destroy
        mov     rdi, r12
        pop     r12
        call    bigint$destroy
        epilog


public _start
falign
_start:
        call    ht$init

        ; timing test for 1,000,000 of these operations:
        mov     ebx, 1000000

        mov     rdi, .test
        mov     esi, .testlen
        call    bigint$new_encoded
        mov     r12, rax                ; our source
        call    bigint$new
        mov     r13, rax                ; our destination
calign
.timing_loop:
        mov     rdi, r13
        mov     rsi, r12
        call    isquare
        sub     ebx, 1
        jnz     .timing_loop

        ; debug our result (compare with wolframalpha to base 16 if you are interested)
        mov     rdi, r13
        call    bigint$debug

        mov     eax, syscall_exit
        xor     edi, edi
        syscall
dalign
.test:
        ; we want: 490,000,000,000,000,000,000,000,000,000, which is (in big endian form):
        db      0x06, 0x2f, 0x46, 0x80, 0x09, 0xe5, 0x15, 0x74, 0x94, 0x10, 0x00, 0x00, 0x00
        ; or little endian:
        ; db    0x00, 0x00, 0x00, 0x10, 0x94, 0x74, 0x15, 0xe5, 0x09, 0x80, 0x46, 0x2f, 0x06
.testlen = $ - .test

include '../../ht_data.inc'
    
Note that you'll likely have to adjust the include '../../ht goods if you move it somewhere else relative to my library and want to play around with it.

Anyway, 1M tests on my local dev environment doing the big integer square root only (older 3930k virtual machine) runs the entire test in 4.14 seconds, or 241,545 per second for a single thread.

(and that is with your example number of 490,000,000,000,000,000,000,000,000,000)

And then if I modify the code again to do the square (only the timing loop itself with the relevant modification bit is included here):
Code:
calign
.timing_loop:
        mov     rdi, r13
        mov     rsi, r12
        call    isquare

        ; modified bit to square the result so we can compare it to our source
        mov     rdi, r13
        call    bigint$square
        ; and final comparison against the source to see if it was a perfect square or not
        mov     rdi, r12
        mov     rsi, r13
        call    bigint$compare_unsigned
        ; if eax == 0 here, perfect square it is
        ; end modified bit

        sub     ebx, 1
        jnz     .timing_loop

        ; debug our result (compare with wolframalpha to base 16 if you are interested)
        ; (which in our modified form will be equal to our input)
        mov     rdi, r13
        call    bigint$debug

        mov     eax, syscall_exit
        xor     edi, edi
        syscall
    


And this runs the entire 1M test in 4.215 seconds, or 237,247 per second to do the "is it a perfect square root" per your commentary...

Obviously the square operation is not expensive for such small numbers as your example...

That algorithm (the one I just hacked together for my bigint isquare) is by no means heavily optimised, and doesn't even follow the current accepted fastest method.

Did I misinterpret your Q, or am I on the right track here.
Post 15 Aug 2015, 07:30
View user's profile Send private message Reply with quote
magicSqr



Joined: 27 Aug 2011
Posts: 105
magicSqr
Hi Redsock,

I haven't had a chance to look through your code yet but thanks for the time you put in Wink
From your text it seems that you think I am trying to 'calculate' the square root. Here is a *small* example...
Assume I have

x dd 489
sqrs dd 1, 4, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, 256, 289, 324, 361, 400, 441, 484, 529, 576, 625, 676, 729, 784, 841, 900, 961

I am trying to figure if there is a way to get a good approximation for the square root of x, for example sqrt(289) < sqrt(x) < sqrt(625)
I could then just do a repne scasd through 324, 361, 400, 441, 484, 529, 576 to see if x is in the list, i.e. a perfect square. There would be no need to scan through the whole array, or to take any square roots.

Thanks,

magic²
Post 15 Aug 2015, 09:22
View user's profile Send private message Reply with quote
redsock



Joined: 09 Oct 2009
Posts: 357
Location: Australia
redsock
Isn't "integer square roots" equal to approximated results to begin with?

I kinda see where you are going, so perhaps the Q is better left to hardcore math heads rather than programmer heads like me Smile

You are looking for an equation to reduce the complexity of approximating square roots (which again IMO integer square roots already are)

Smile sorry I wasn't of more service. Please update here when you find the answer as now I am invested in the outcome!

Cheers
Post 15 Aug 2015, 09:27
View user's profile Send private message Reply with quote
magicSqr



Joined: 27 Aug 2011
Posts: 105
magicSqr
Hi Redsock,

Here's the actual problem I'm trying to overcome. I require numbers that are the sum of 2 squares in *at least* four ways. Although these can be found with prime factors of the form 4k + 3, when these factors have an even exponent, and can also have multiple factors of 2, I am interested in a very particular type. Any number that only has distinct prime factors of the form 4k + 1. If a number has 3 of these distinct factors it will be the sum of 2 squares in exactly four ways. The lowest of these is

1105 = 33^2 + 4^2 = 32^2 + 9^2 = 31^2 + 12^2 = 24^2 + 23^2

with more that 3 distinct prime factors 4k+1 the result with be the sum of 2 squares in an increasing amount of ways, which I also want.

Now, the problem...
(1) I can produce a list of various m^2 + n^2 = x, then check if x has factors of the of the required format, then find other m^2 + n^2 that equal the same x. This is very inefficient.
(2) I can multiply prime numbers of the required format to give me an x.
i.e. 5 * 13 * 17 = 1105. I know these numbers have the required property but do not know what the four ways that they are the sum of 2 squares is without checking.

These can be found in 2 ways (starting with a list of square numbers from 1 up)...
(2a) I can subtract increasing m^2 (m starting at 1) from x up to a limit of sqrt(x/2) and check if x - m^2 is a perfect square, or
(2b) I think this will be my preferred option unless someone proposes a better solution... as (2a) subtract increasing m^2 (m starting at 1) from x but this time up to a limit where x < m^2 and store each result of x - m^2, call these numbers n^2. The if I check each n^2 against the limit reached for m^2 in decreasing values, every value of n^2 that occurs in m^2 will be known to be a perfect square.

In English it sounds a lot more complex than it actually is. Here is a run for x = 5*13*17 = 1105

1105 - 1^2 = 1104
1105 - 2^2 = 1101
1105 - 3^2 = 1096
1105 - 4^2 = 1089
1105 - 5^2 = 1080
1105 - 6^2 = 1069
1105 - 7^2 = 1056
1105 - 8^2 = 1041
1105 - 9^2 = 1024
1105 - 10^2 = 1005
1105 - 11^2 = 984
1105 - 12^2 = 961
1105 - 13^2 = 936
1105 - 14^2 = 909
1105 - 15^2 = 880
1105 - 16^2 = 849
1105 - 17^2 = 816
1105 - 18^2 = 781
1105 - 19^2 = 744
1105 - 20^2 = 705
1105 - 21^2 = 664
1105 - 22^2 = 621
1105 - 23^2 = 576
1105 - 24^2 = 529
1105 - 25^2 = 480
1105 - 26^2 = 429
1105 - 27^2 = 376
1105 - 28^27 = 321
1105 - 29^2 = 264
1105 - 30^3 = 205
1105 - 31^2 = 144
1105 - 32^2 = 81
1105 - 33 ^2 = 16

checking values on the rhs against values in the middle the only numbers that appear in both are 16, 81, 144, 529, 576, 961, 1024, 1089. These give for four required sets.
I think this will be the most efficient as it only uses subtraction and checking.
However, when I get to *large* numbers (at *least* 10^25), which I will, this is still a very long-winded approach.

Unfortunately, I do not believe it is possible to *calculate* what the numbers are in the different ways a particular number is the sum of 2 squares.

i.e. I can say with certainty that

982,451,653 * 982,451,629 * 982,451,609 = 948,273,322,896,298,772,346,363,833

is the sum of 2 squares in exactly four ways, finding these four ways is not so easy, lol.

Thanks again and sorry for the lllooooonnnngggggggggg post,

magic²
Post 15 Aug 2015, 16:16
View user's profile Send private message Reply with quote
Xorpd!



Joined: 21 Dec 2006
Posts: 161
Xorpd!
While I was riding the bus, I saw an address that was 6161. Obvious prime factorization was 61*101. We have 61 = 5²+6² = |5+6i|², 101 = 1²+10² = |1+10i|^2. Then (5+6i)(1+10i) = -55+56i, so 6161 = |55+56i|² = 55²+56². Again, (5+6i)(1-10i) = 65-44i, so 6161 = |65+44i|² = 65²+44². Easy enough to do this in your head provided you are using public transportation and not driving.

So if you can factor the number you need only solve three relatively easy problems instead of one really hard problem. But perhaps I don't understand the purpose of your research?
Post 15 Aug 2015, 20:19
View user's profile Send private message Visit poster's website Reply with quote
magicSqr



Joined: 27 Aug 2011
Posts: 105
magicSqr
You're a genius Xorpd! Very Happy

I had tried something similar but kept tripping myself up.
Your solution also gives me the following for a number with 3 factors...
Let x = (a^2+b^2)(c^2+d^2)(e^2+f^2) then

x = (ace-adf-bcf-bde)^2+(acf+ade+bce-bdf)^2
x = (ace+adf-bcf+bde)^2+(acf-ade+bce+bdf)^2
x = (ace+adf+bcf-bde)^2-(acf-ade-bce-bdf)^2
x = (ace-adf+bcf+bde)^2-(acf+ade-bce+bdf)^2

For the big example I gave
x = 982,451,653 * 982,451,629 * 982,451,609 = 948,273,322,896,298,772,346,363,833

we get

x = 27817821433933^2 + 13207654355188^2
x = 30453641615477^2 + 4566074380952^2
x = 29922054378227^2 + 7276261724452^2
x = 30745933821163^2 + 1720719721792^2 Very Happy

Many thanks,

magic²
Post 16 Aug 2015, 19:35
View user's profile Send private message Reply with quote
Xorpd!



Joined: 21 Dec 2006
Posts: 161
Xorpd!
I offer a brief bibliography that may increase your enjoyment of this sort of problem.

Recreations in the Theory of Numbers: the Queen of Mathematics Entertains, A. H. Beiler
Introduction to Number Theory, James E. Shockley
An Introduction to the Theory of Numbers, G. H. Hardy and Edward M. Wright
Advanced Number Theory, Harvey Cohn
Post 16 Aug 2015, 21:40
View user's profile Send private message Visit poster's website Reply with quote
magicSqr



Joined: 27 Aug 2011
Posts: 105
magicSqr
Thanks again Wink
Post 16 Aug 2015, 22:11
View user's profile Send private message Reply with quote
Display posts from previous:
Post new topic Reply to topic

Jump to:  


< Last Thread | Next Thread >
Forum Rules:
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
You cannot attach files in this forum
You can download files in this forum


Copyright © 1999-2020, Tomasz Grysztar. Also on YouTube, Twitter.

Website powered by rwasa.