flat assembler
Message board for the users of flat assembler.

Index > Main > quickest way to check if a number is a square

Goto page Previous  1, 2, 3, 4
Author
Thread Post new topic Reply to topic
Madis731



Joined: 25 Sep 2003
Posts: 2141
Location: Estonia
Madis731
@Enko:
Because its hard to fit 2GB in RAM and even if you do, it won't be efficient.
If you read it from the disk, then CPU instructions FSQRT or SQRTSD are faster.

Look-up tables are used if the fit in say 256 bytes or 64KB.
@Edfed: Your findings are interesting, but because the pattern happens only because you've thrown out non-squares, the algorithm for calculating it, is going to be very long, and probably slower.

Code:
936ms SSE v1
687ms SSE v2 doing nonFPU IMUL
780ms FPU v1
811ms FPU v2
530ms FPU v3 doing nonFPU IMUL
468ms Wikipedia + FPU v3 if needed
2855ms Bit compression
4025ms Bit compression (alternative)
421ms isqrt (Revolution implementation)
    
Post 05 Sep 2011, 17:18
View user's profile Send private message Visit poster's website Yahoo Messenger MSN Messenger Reply with quote
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 17279
Location: In your JS exploiting you and your system
revolution
Madis731 wrote:
Code:
421ms isqrt (Revolution implementation)    
Which one is that? There were four alternatives.
Post 05 Sep 2011, 21:02
View user's profile Send private message Visit poster's website Reply with quote
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 17279
Location: In your JS exploiting you and your system
revolution
Here is an old topic about isqrt stuff

http://board.flatassembler.net/topic.php?p=28971#28971

Perhaps there is some faster code in there?


Last edited by revolution on 06 Sep 2011, 03:41; edited 3 times in total
Post 06 Sep 2011, 03:39
View user's profile Send private message Visit poster's website Reply with quote
bitRAKE



Joined: 21 Jul 2003
Posts: 2915
Location: [RSP+8*5]
bitRAKE
Code:
iSqrt: ; u32
    bsr edx, [esp+4]
    xor ecx, ecx
    xor eax, eax
    and edx, -2        ; make even
    bts ecx, edx

.0: add eax, ecx
    sub [esp+4], eax
    jnc .1
    add [esp+4], eax
    sub eax, ecx
    jmp .2

.1: add eax, ecx
.2: shr eax, 1
    shr ecx, 2
    jne .0

; EAX is square root of u32 passed on stack
; if [esp+4] is zero then u32 is a perfect square

    retn 4    
Runtime is O(log2(u32)/2); conversion to amd64 is trivial.
(It's really slow compared to others posted.)

(Happy Birthday to me. Smile)
Post 06 Sep 2011, 03:40
View user's profile Send private message Visit poster's website Reply with quote
Madis731



Joined: 25 Sep 2003
Posts: 2141
Location: Estonia
Madis731
Sorry revolution, I didn't test all. Here's the full set:
Code:
;Core 2 Mobile (T9300), COUNTER = 7FFFFFFh
3635ms SSE v1
2730ms SSE v2 doing nonFPU IMUL
3167ms FPU v1
3151ms FPU v2
2153ms FPU v3 doing nonFPU IMUL
2152ms Wikipedia + FPU v3 if needed
1701ms isqrt Revolution
1778ms isqrt Revolution PII
1966ms isqrt Rev_SSE
2699ms isqrt Rev_SSE2
    

2nd gen Core:
Code:
;Core i5 650 (Desktop), COUNTER = 7FFFFFFh
1810ms SSE v1
1466ms SSE v2 doing nonFPU IMUL
1342ms FPU v1
1341ms FPU v2
1295ms FPU v3 doing nonFPU IMUL
515ms Wikipedia + FPU v3 if needed
655ms isqrt Revolution
780ms isqrt Revolution PII
889ms isqrt Rev_SSE
1436ms isqrt Rev_SSE2
    
Post 06 Sep 2011, 05:25
View user's profile Send private message Visit poster's website Yahoo Messenger MSN Messenger Reply with quote
r22



Joined: 27 Dec 2004
Posts: 805
r22
Seems like the "Wikipedia + isqrt Rev" would produce the fastest overall results.
This would seem to coincide with the StackOverflow answer (the link was posted earlier in the thread).
Post 06 Sep 2011, 12:33
View user's profile Send private message AIM Address Yahoo Messenger Reply with quote
MHajduk



Joined: 30 Mar 2006
Posts: 6034
Location: Poland
MHajduk
bitRAKE wrote:
(Happy Birthday to me. Smile)
Happy Birthday bitRAKE! I wish you happiness, good health and myriads of the new ideas that will result in creation of another awesome code snippets. Smile

BTW, seems that you share the same birthday date as shoorick, interesting, isn't it? Wink
Post 06 Sep 2011, 13:20
View user's profile Send private message Visit poster's website Reply with quote
bitRAKE



Joined: 21 Jul 2003
Posts: 2915
Location: [RSP+8*5]
bitRAKE
MHajduk wrote:
BTW, seems that you share the same birthday date as shoorick, interesting, isn't it? Wink
Thank you. Hm...something to think about. Me should meet and compare notes. Surprised

_________________
¯\(°_o)/¯ unlicense.org
Post 06 Sep 2011, 19:53
View user's profile Send private message Visit poster's website Reply with quote
Display posts from previous:
Post new topic Reply to topic

Jump to:  
Goto page Previous  1, 2, 3, 4

< Last Thread | Next Thread >
Forum Rules:
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
You cannot attach files in this forum
You can download files in this forum


Copyright © 1999-2020, Tomasz Grysztar. Also on YouTube, Twitter.

Website powered by rwasa.