flat assembler
Message board for the users of flat assembler.

Index > Windows > Sieve of Eratosthenes crash

Author
Thread Post new topic Reply to topic
Kris_A



Joined: 24 Dec 2004
Posts: 2
Kris_A 24 Dec 2004, 14:14
Hi, (First post, woo)

I've just started dabbling with Win32 api and I've been trying to make a Prime Number counter.
It crashes, which wasn't exactly unexpected since I'm not exactly sure what I'm doing Very Happy Weird thing is, it only crashes when I use large enough numbers in the 'size' constant (1000 is fine, 1000000 is not). Does anyone know why this is? Thanks.

Also, if you have any tips on improving it, please tell me!

I've got this code so far:

Code:
include '%fasminc%/win32ax.inc'

size equ 1000000

.data
     string      rd   256
     sieve       rd   size

.code
        start:

        ; Note: EAX reserved for MUL\loops\etc
        MOV ebx, 0  ; Prime counter
        MOV ecx, 2  ; Outer loop ctr
        MOV edx, 0  ; Inner loop ctr

        ;------------------------------
        ;    Reset sieve array
        ;------------------------------

        reset_loop:
                MOV [sieve+eax*4], 0    ; sieve[eax] = 0;
                INC eax                 ; eax++;
                CMP eax, size           ; if (eax < size)
                JL reset_loop           ;    goto reset_loop;

        ;------------------------------
        ;    Find primes
        ;------------------------------

        outer_loop:

                CMP ecx, size           ; if (ecx >= size)
                JGE exit_outer_loop     ;    break;

                CMP [sieve+ecx*4], 1    ; if (sieve[ecx] = 1)
                JE exit_inner_loop      ;    continue;

                MOV eax, ecx            ; eax = ecx;
                MUL ecx                 ; eax = ecx*ecx;
                MOV edx, eax            ; edx = eax;

                INC ebx                 ; ebx++; (Prime counter)

                inner_loop:

                        CMP edx, size           ; if (edx >= size)
                        JGE exit_inner_loop     ;    break;

                        ;; ********************
                        ;;  This is the line that crashes it V
                        ;; ********************

                        MOV [sieve+edx*4],1     ; sieve[edx] = 1;

                        ADD edx, ecx            ; edx += ecx;
                        JMP inner_loop

                exit_inner_loop:

                INC ecx                 ; ecx++;

                JMP outer_loop

        exit_outer_loop:

        ;------------------------------
        ;    Show results
        ;------------------------------

        invoke    wsprintf, string, '%i Primes', ebx
        invoke    MessageBox,NULL,string,'',MB_OK
        invoke    ExitProcess,0

.end start 
    
Post 24 Dec 2004, 14:14
View user's profile Send private message Reply with quote
iklin



Joined: 20 Mar 2004
Posts: 120
Location: Russia, Siberia
iklin 24 Dec 2004, 18:46
You use JL and JGE instead of JB and JAE.
Try to look into fasm's doc. Smile
Post 24 Dec 2004, 18:46
View user's profile Send private message ICQ Number Reply with quote
Kris_A



Joined: 24 Dec 2004
Posts: 2
Kris_A 24 Dec 2004, 19:11
Worked great, thanks Very Happy I'll have to remember that in the future...
Post 24 Dec 2004, 19:11
View user's profile Send private message Reply with quote
Madis731



Joined: 25 Sep 2003
Posts: 2139
Location: Estonia
Madis731 26 Dec 2004, 19:40
Hi, if you're interested I have a 16% speed improvement to your code without needing to change vary much.
It starts like this:
Code:
.code
        start:
        ; Note: EAX reserved for MUL\loops\etc
        MOV ebx, 2  ; Prime counter set to min=2
        MOV edi, 5  ; Outer loop ctr 2-4 are checked
        MOV edx, 0  ; Inner loop ctr

        ;------------------------------
        ;    Reset sieve array
        ;------------------------------
        xor eax,eax  ;It's good to zero out eax too
        mov [sieve+eax*4+00],1
        mov [sieve+eax*4+04],1
        mov [sieve+eax*4+08],0
        mov [sieve+eax*4+12],0
        mov [sieve+eax*4+16],1
        mov [sieve+eax*4+20],0
        mov eax,6
    myloop:
        mov [sieve+eax*4+00],1
        mov [sieve+eax*4+04],0
        mov [sieve+eax*4+08],1
        mov [sieve+eax*4+12],1
        mov [sieve+eax*4+16],1
        mov [sieve+eax*4+20],0
        add eax,6
        cmp eax,size - (size mod 6)
        jc  myloop
    

...but if you need further improvement you should make your initialized table 2 times smaller so you check ONLY odd numbers and this way you can make your code 50% faster.
You might want to try filtering out %3-s at the start but it only gives you about 15% more performance (because 6, 12 etc are already divizable by two).

P.S. BTW, nice code you got there. How did you come to that algorithm - something on the net or your own ideas?

_________________
My updated idol Very Happy http://www.agner.org/optimize/
Post 26 Dec 2004, 19:40
View user's profile Send private message Visit poster's website Yahoo Messenger MSN Messenger Reply with quote
r22



Joined: 27 Dec 2004
Posts: 805
r22 27 Jan 2005, 02:09
;something like this?
XOR eax,eax
PXOR mm0,mm0
MOVQ qword[sieve+eax*8],mm0
MOVQ qword[sieve+eax*8+8],mm0
MOVQ qword[sieve+eax*8+16],mm0
MOVQ qword[sieve+eax*8+24],mm0
ADD eax,4
reset:
MOVQ qword[sieve+eax*8],mm0
MOVQ qword[sieve+eax*8+8],mm0
MOVQ qword[sieve+eax*8+16],mm0
MOVQ qword[sieve+eax*8+24],mm0
ADD eax,4
CMP eax, (size / 8) ;better used on large SIZEs
JB reset
Post 27 Jan 2005, 02:09
View user's profile Send private message AIM Address Yahoo Messenger Reply with quote
Madis731



Joined: 25 Sep 2003
Posts: 2139
Location: Estonia
Madis731 27 Jan 2005, 10:41
Yes MMX gains speed even more, but my idea was to show the techique
behind my algorith. That was to precalculate some known iterations like
even numbers (2,4,6,...) and mod(3)=0 numbers like 3,6,9,...

Your code combined with mine makes the code smaller and it has less moves
Very Happy look@this:
Code:
.code
        start:
        ; Note: EAX reserved for MUL\loops\etc
        MOV ebx, 2  ; Prime counter set to min=2
        MOV edi, 5  ; Outer loop ctr 2-4 are checked
        MOV edx, 0  ; Inner loop ctr

        ;------------------------------
        ;    Reset sieve array
        ;------------------------------
        pattern1 dq (1 shl 32 + 1) ;2^32+1
        pattern2 dq (0 shl 32 + 0) ;Zero
        pattern3 dq (0 shl 32 + 1) ;One
        movq mm0,[pattern1]
        movq mm1,[pattern2]
        movq mm2,[pattern3]
        xor eax,eax  ;It's good to zero out eax too
        mov [sieve+eax*8+00],mm0
        mov [sieve+eax*8+08],mm1
        mov [sieve+eax*8+16],mm2
        mov eax,3
    myloop:
        mov [sieve+eax*8+00],mm2
        mov [sieve+eax*8+08],mm0
        mov [sieve+eax*8+16],mm2
        add eax,3
        cmp eax,size - (size mod 6)
        jc  myloop 
    

_________________
My updated idol Very Happy http://www.agner.org/optimize/
Post 27 Jan 2005, 10:41
View user's profile Send private message Visit poster's website Yahoo Messenger MSN Messenger Reply with quote
r22



Joined: 27 Dec 2004
Posts: 805
r22 28 Jan 2005, 03:30
the benchmark on that must be something compared to the original
Post 28 Jan 2005, 03:30
View user's profile Send private message AIM Address Yahoo Messenger Reply with quote
Madis731



Joined: 25 Sep 2003
Posts: 2139
Location: Estonia
Madis731 28 Jan 2005, 13:32
Sorry, I didn't follow - was it a positive or negative comment^o)

Kris_A's original or some other original on google?
Post 28 Jan 2005, 13:32
View user's profile Send private message Visit poster's website Yahoo Messenger MSN Messenger Reply with quote
codelab



Joined: 03 Apr 2006
Posts: 16
Location: Denmark
codelab 02 Oct 2006, 20:47
Hi, all, Another optimization of primesearching:

Code:
;   simple prime search by masking out divisible numbers with increasing step thus leaving primes
;   1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
;2:       X   X   X    X     X     X     X     X     X
;3:           X     X        X        X        X        X
;5:                    X              X              X

include 'win32ax.inc'

maxp=1000000
sqrmaxp=1000

.code

sieve:
  mov esi, pbuf
  mov eax, 2
stepfill:
  mov ecx, eax
l1:
  add ecx, eax
  cmp ecx, maxp
  jge l2
  mov byte [esi+ecx],1
  jmp l1
l2:
  inc eax
  cmp byte [esi+eax],0
  jne l2
  cmp eax, sqrmaxp
  jle stepfill


  invoke  ExitProcess,0

section '.data' data readable writeable


pbuf   db 1,1,maxp-2 dup(0)

.end sieve


    


had fewer iterations caused by the startstep is less or equal to sqrt(prime range)
Smile
Post 02 Oct 2006, 20:47
View user's profile Send private message Reply with quote
codelab



Joined: 03 Apr 2006
Posts: 16
Location: Denmark
codelab 03 Oct 2006, 09:38
Im terrible sorry, had a second dwell about it, starting values for stepfilling shall be a prime squared as should JGE l2 be JG l2:

Code:
; simple prime search by masking out divisible numbers
; with increasing step thus leaving primes
;   1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
;2:       X   X   X    X     X     X     X     X     X
;3:           X     X        X        X        X        X
;5:                    X              X              X


include 'win32ax.inc'

maxp=1000000
sqrmaxp=1000

.code


sieve:
  mov esi, pbuf
  mov edi, 2
stepfill:
  mov eax, edi
  mul eax
  mov ecx, eax
l1:
  cmp ecx, maxp
  jg l2
  mov byte [esi+ecx],1
  add ecx, edi
  jmp l1

l2:
  inc edi
  cmp byte [esi+edi],0
  jne l2
  cmp edi, sqrmaxp
  jle stepfill



  invoke  ExitProcess,0

section '.data' data readable writeable


pbuf   db 1,1,maxp-2 dup(0)

.end sieve
    

Smile
Post 03 Oct 2006, 09:38
View user's profile Send private message Reply with quote
Display posts from previous:
Post new topic Reply to topic

Jump to:  


< Last Thread | Next Thread >
Forum Rules:
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
You cannot attach files in this forum
You can download files in this forum


Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.

Website powered by rwasa.