flat assembler
Message board for the users of flat assembler.
Index
> Windows > sieve of eratosthenes |
Author |
|
LocoDelAssembly 19 Sep 2010, 02:19
It is too long for me to correct it now completely so I'll show you a partial fix that it will be useful for you to patch the rest of the code:
Code: push [upto] call [malloc] ; pop [List] ; <<< Incorrect, this is not storing the pointer mov [List], eax ; This time the pointer returned by malloc is stored in "List" add esp, 4 ; Adjust stack since malloc returns with "RET" instead of "RET 4" (Previously this was implicitly done with "pop [List]" mov [i],0 initFor: mov eax,[i] cmp eax,[upto] jnl endFor mov eax,[List] ;mov byte[eax+i], 1 ; Incorrect, this is actually translated as "mov byte[eax+ebp-c], 1" (where "c" is the offset relative to EBP) mov edx, [i] mov byte[eax+edx-1], 1 ; Now the array is correctly addressed I've re-written the code, this one seems to work: Code: format PE console entry main include 'MACRO\IMPORT32.INC' include 'MACRO\proc32.inc' section '.data' data readable writeable pausemsg db "pause>nul",0 fmt db "%i", 10 section '.code' code readable executable proc SeviePrimesUpTo uses ebx edi, upTo:DWORD ; This proc adheres to stdcall calling convention ; The code is not prepared to work with values (unsigned) less than two and is not required anyway since the print procedure never prints 1 nor 0 cmp [upTo], 2 jb .exit ; Init boolean vector push [upTo] call [malloc] add esp, 4 lea ebx, [eax - 1] ; Re-based array so List[1] is automatically converted to List[0] mov ecx, [upTo] mov edi, eax mov al, 1 rep stos byte[edi] mov edx, 2 jmp .theWhileCheck .theWhile: cmp byte [ebx+edx], 1 jne .next .theFirstFor: mov byte [ebx+eax], 0 add eax, edx cmp eax, [upTo] ; k not needed, since the comparison can be changed to "p*p <= upTo" instead of "p*p < upTo+1" jbe .theFirstFor .next: inc edx .theWhileCheck: mov eax, edx imul eax, eax cmp eax, [upTo] jb .theWhile mov edi, 2 .theSecondFor: cmp byte [ebx+edi], 1 jne .advance push edi push fmt call [printf] add esp, 8 .advance: inc edi cmp edi, [upTo] ; Not used "k" for the same reason as above jbe .theSecondFor add ebx, 1 push ebx call [free] add esp, 4 .exit: ret endp proc main stdcall SeviePrimesUpTo,102 push pausemsg call [system] push 0 call [exit] endp section '.idata' import data readable library msvcrt,'msvcrt.dll' import msvcrt,\ printf,'printf',\ scanf,'scanf',\ system,'system',\ getchar,'getchar',\ realloc,'realloc',\ malloc,'malloc',\ free,'free',\ calloc,'calloc',\ exit,'exit' |
|||
19 Sep 2010, 02:19 |
|
ishkabible 19 Sep 2010, 03:01
all right, i will have to chew on this one a bit, it's difficult to look at code in a language you have no reference for, i was able to get a good grasp on Java and c# becuase there so much like c++, assembly is a much lower level beast. so i will have to as i said "chew" on this code for while before i can respond with an intelligible question/response. thanks, i wish there was a thank you button so you can tell how many times someone has been thanked
|
|||
19 Sep 2010, 03:01 |
|
Tyler 19 Sep 2010, 03:12
http://flatassembler.net/docs.php?article=manual is a good reference if you need to look something up.
|
|||
19 Sep 2010, 03:12 |
|
ishkabible 19 Sep 2010, 03:28
i know i have it bookmarked already
|
|||
19 Sep 2010, 03:28 |
|
ishkabible 19 Sep 2010, 18:48
can you explain the reason for this line after calling malloc?
Code: add esp, 4 Last edited by ishkabible on 19 Sep 2010, 19:15; edited 2 times in total |
|||
19 Sep 2010, 18:48 |
|
ishkabible 19 Sep 2010, 18:52
dose it move the stack up 4 bytes becuase that 4 bytes is not being returned?
|
|||
19 Sep 2010, 18:52 |
|
LocoDelAssembly 19 Sep 2010, 19:04
Yes, and you have to do this for all cdelc functions. the ones that are stdcall is not required since it is the responsibility of the function to restore the stack.
To avoid having take care of this you could use cinvoke macro. Small note: the "ret" inside the SeviePrimesUpTo proc is assembled as "pop edi / pop ebx / leave / retn 4" (unless you didn't write "ret" all lowercase), so this one is stdcall and doesn't need stack adjustment. |
|||
19 Sep 2010, 19:04 |
|
ishkabible 19 Sep 2010, 19:16
also what is the -1 for in mov byte[eax+edx-1],1, the +edx makes seince now that you say it becuase other wise i'm adding a pointer to i to eax witch is not what i want but what is the -1 for?
|
|||
19 Sep 2010, 19:16 |
|
LocoDelAssembly 19 Sep 2010, 19:29
Because the first array index is zero, not one. Also, since upTo bytes were allocated, it means that there is no room for either zero or upTo number. Since I'm interested in having the upTo number, then I use -1 to pretend that List[0] is List[1].
Another way of seeing it is that the array indexes are [1..upTo], but since addresses must be calculated as if the lower bound was zero, I have to substract the lower bound in the address calculation (Pascal compilers have to do this for instance, since programmers are allowed to specify a lower bound other than zero and even negative). Alternatively, you could change the pointer to the array in such a way that the address calculation can be done straight like I did in my own code (but at the time of freeing the memory you must undo the adjustment of the pointer before passing it to free). |
|||
19 Sep 2010, 19:29 |
|
ishkabible 19 Sep 2010, 19:45
ok i see what your saying, this is how you takled the problom i solved int c++ by allocating upto+1. thanks im almost done "chewing" on you code, i'm re working my code right now to see if the stack adjustment's, correct array access and better conditionals will make this work. thanks your extremely helpful
edit: thanks again i got it working, i used my style of coding that i did previously, is the way you broke each control structure's up into labeled parts the convention? because i'm just going off of what made sense to me when i code it, i would rather follow convention as it typically ends up helping me. here is my working code, just had to "patch" it as you put it with the stack adjustments and correct array assess. also i had some screwy jumps in there like when i compared List[i] to 1 and used the conditional jump jz, that made no sense. Code: format PE console entry main include 'INCLUDE\MACRO\import32.inc' include 'INCLUDE\MACRO\proc32.inc' section '.data' data readable writeable pausemsg db "pause>nul",0 isprime db "%i is prime",10,0 WSTZ db "List[%i] was set to zero",10,0 testmsg db "List[%i] = %i",10,0 testmsg2 db "was set List[%i] = %i",10,0 gothere db "it got here",10,0 dbugmsg db "%i<%i",10,0 section '.code' code readable executable proc SeviePrimesUpTo, upto:DWORD locals p dd 2 i dd 0 k dd 0 x dd 0 List dd 0 ;pointer to dynamicly allocated memroy endl push [upto] call [malloc] mov [List],eax add esp, 4 ;<-- here what dose this do? why is it here? initFor: mov eax,[i] cmp eax,[upto] jnl endFor mov eax, [List] mov edx, [i] mov byte[eax+edx-1],1 inc [i] jmp initFor endFor: mov [i],2 initFor2: mov eax,[i] cmp eax,[upto] jnl endFor2 mov eax, [List] mov edx, [i] movsx edi,byte[eax+edx-1] push edi push edx push testmsg call [printf] add esp,12 inc [i] jmp initFor2 endFor2: mov eax,[upto] mov [k], eax initMain: mov eax,[p] imul eax,eax cmp eax,[upto] jnle endMain mov eax,[List] mov edx,[p] cmp byte[eax+edx-1], 0 je endIfMain mov eax,[p] imul eax,eax mov [i],eax initForMain: mov eax,[i] cmp eax,[k] jnle endForMain mov eax,[List] mov edx,[i] mov byte[eax+edx-1],0 push [i] push WSTZ call [printf] add esp,8 mov eax,[i] add eax,[p] mov [i],eax jmp initForMain endForMain: endIfMain: inc [p] jmp initMain endMain: mov [i],2 initPrint: mov eax,[i] cmp eax,[k] jnl endPrint mov eax,[List] mov edx,[i] cmp byte[eax+edx-1],0 je endPrintIf push [i] push isprime call[printf] add esp,8 endPrintIf: inc [i] jmp initPrint endPrint: push [List] call [free] add esp,4 ret endp proc main stdcall SeviePrimesUpTo,49 push pausemsg call [system] push 0 call [exit] endp section '.idata' import data readable library msvcrt,'msvcrt.dll' import msvcrt,\ printf,'printf',\ scanf,'scanf',\ system,'system',\ getchar,'getchar',\ realloc,'realloc',\ malloc,'malloc',\ free,'free',\ calloc,'calloc',\ exit,'exit' Last edited by ishkabible on 19 Sep 2010, 19:58; edited 2 times in total |
|||
19 Sep 2010, 19:45 |
|
LocoDelAssembly 19 Sep 2010, 19:53
Quote:
Yet another small note: Remember that the index must be multiplied by the size of the element to be used in address calculation. Here that is omitted because multiplying by one isn't useful at all, but if it was an int array, you should use [ebx+edx*4]. (The factor you can use in address calculation is not arbitrary, so sometimes you will need to scale the index with IMUL or whatever and then use it) |
|||
19 Sep 2010, 19:53 |
|
ishkabible 19 Sep 2010, 20:00
how complex can the addresses be? i think i read somewhere that [index+offset*size+byteoffset] was as complex as it could get but im not sure if that's correct or not.
|
|||
19 Sep 2010, 20:00 |
|
LocoDelAssembly 19 Sep 2010, 20:33
[reg+reg*{1,2,4,8}+{byte,dword} offset]. That is what the CPU supports, however, fasm tolerates some other scale factors at times, for instance in [eax*3], it is automatically transformed to [eax+eax*2]. Check the Intel's manuals for a full lists of addressing modes (if not included already in fasm manual, I don't remember).
|
|||
19 Sep 2010, 20:33 |
|
score_under 19 Sep 2010, 21:14
Just for fun, I decided to experiment with the same code.
This is what I came up with... not suitable for human consumption: Code: format PE Console 4.0 include 'win32a.inc' entry main section '.text' code data readable writeable executable sieve: ;stdcall(char*buf,dword length) cmp dword[esp+8],1 ;dude wat jle .ret push esi push edi mov esi,[esp+0xC] ;ESI = buffer mov edi,[esp+0x10] ;size push ebx ;send some registers my way lea ecx,[edi-2] ;ECX = size - 2 mov ebx,2 ;EBX = p push edi push edi fnstcw [esp] fnstcw [esp+2] wait or byte[esp+1],0xC fldcw [esp] fild dword[esp+4] fsqrt fistp dword[esp+4] fldcw [esp+2] pop edi pop edi ;EDI = upto push edi push ecx mov word[esi],0 mov al,1 lea edi,[esi+2] rep stosb pop ecx pop edi add esi,ebx db 0x66,0x90 ;Align to 0x10 manually with a "short" nop .loop: cmp byte[esi],0 jnz .Prime .cond: inc ebx inc esi dec ecx cmp ebx,edi jl .loop jmp @f .Prime: ;Less likely branch for conditional jump, thus below conditional jump in code. mov eax,ebx xor edx,edx mul eax ;i = p * p sub eax,ebx ;because esi is offsetted (esi = &buf[ebx]) .SetToNonPrime: mov byte[esi+eax],0 add eax,ebx cmp eax,ecx jl .SetToNonPrime jmp .cond @@: pop ebx pop edi pop esi .ret: retn 8 main: TABLESIZE = 2000 push edi push TABLESIZE call [malloc] mov edi,eax push eax call sieve push ebx push esi mov ebx,TABLESIZE-2 mov esi,2 push esi push numstr @@: cmp byte[esi+edi],0 je .NotPrime call [printf] ;We need a conditional call instruction. .NotPrime: inc esi mov [esp+4],esi dec ebx jnz @b pop eax pop eax pop esi pop ebx push edi call [free] pop edi call [getch] pop edi ret align 4 data import library crt,'MSVCRT.DLL' import crt,getch,'_getch',free,'free',malloc,'malloc',printf,'printf' end data numstr db '%u',9,0 |
|||
19 Sep 2010, 21:14 |
|
LocoDelAssembly 19 Sep 2010, 22:05
score_under, the "xor edx, edx" before "mul eax" is not needed as in (I)DIV is. Also, for speed, you might consider using "imul eax, eax" instead, since it produces a faster 32-bit result instead of a full 64-bit result in EDX:EAX of EAX*EAX.
Quote:
|
|||
19 Sep 2010, 22:05 |
|
score_under 19 Sep 2010, 22:14
LocoDelAssembly wrote:
No, it's a question to the sanity of whoever would call the function like that. (But yeah, was the code readable enough? I hope it wasn't too asinine in appearance...) |
|||
19 Sep 2010, 22:14 |
|
DJ Mauretto 20 Sep 2010, 09:17
Hello
More simply implementation taken from some C source code Code: format PE CONSOLE 4.0 entry start Include 'win32a.inc' MAX = 1'000 ;============================================================================= ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; CODE ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;============================================================================= section ".code" code readable writeable executable start: CALL @Sieve_of_Eratosthenes PUSH 0 CALL [ExitProcess] ;============================================================================= ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; PROC ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;============================================================================= @Sieve_of_Eratosthenes: MOV ECX,2 MOV [I],ECX XOR ESI,ESI @Loop: CMP [Vector + ECX*4],1 JNZ .1 PUSH ECX PUSH Prime CALL [printf] MOV ECX,[I] ADD ESP,8 .1: MOV EAX,ECX IMUL EAX,ECX CMP EAX,MAX JNC .3 .2: MOV [Vector + EAX*4],ESI ADD EAX,ECX CMP EAX,MAX JC .2 .3: INC ECX CMP ECX,MAX MOV [I],ECX JC @Loop RET ;============================================================================= ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; DATA ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;============================================================================= section '.data' data readable writeable Vector DD MAX dup (1) I DD ? Prime DB 13,10,"%d ",0 ;============================================================================= ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; IDATA ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;============================================================================= section '.idata' import data readable writeable library kernel32,'kernel32.dll',\ msvcrt,'msvcrt.dll' import kernel32,\ ExitProcess,'ExitProcess' import msvcrt,\ printf,'printf' You can redirect output to a file in this way: Prime.exe > Prime.txt EDIT: Removed only J _________________ Nil Volentibus Arduum Last edited by DJ Mauretto on 20 Sep 2010, 17:55; edited 2 times in total |
|||
20 Sep 2010, 09:17 |
|
score_under 20 Sep 2010, 17:22
Storing "J" in write-only memory?
If you want me to suggest improvements: 1. Remove write to J, you never read it later...? 2. Don't store I in a global variable, put it in a register or pop it off the arguments for printf. 3. As a rule of thumb, (if your module is to be linked with compiler-generated code, or if it is otherwise necessary to comply to cdecl/fastcall/stdcall conventions) EBX/EBP/ESI/EDI should be the same at the start and end of your call, so push them at the start and pop at the end. 4. Use bytes for boolean values, it's smaller and saves a multiplication by 4 (Vector+ECX*4). |
|||
20 Sep 2010, 17:22 |
|
DJ Mauretto 20 Sep 2010, 17:49
Hello, I leave you the task of this optimized algorithm, I have translated only from a source in C language, usually when translate a high-level languages code do not think much of what I do, my mind only works when I write code created from me
_________________ Nil Volentibus Arduum |
|||
20 Sep 2010, 17:49 |
|
< Last Thread | Next Thread > |
Forum Rules:
|
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.