flat assembler
Message board for the users of flat assembler.
Windows > I have the assembler, now what?
Trinitek 27 Mar 2017, 22:04
C0deHer3tic wrote: Why can't I print out them both?
https://en.wikipedia.org/wiki/X86_calling_conventions#stdcall
EAX, ECX, and EDX are scratch registers: a called function is free to overwrite them, so their values aren't guaranteed to be preserved across a call.
system error 27 Mar 2017, 23:40
@Heretic
Seems to me that you got yourself into the wrong starting point for learning assembly. But I can see that you are legitimately trying. That's nice. We/I could attend to your code's problems, because there are lots of errors, but instead I'll give you a simple, clean example of how to achieve a similar objective, in the hope that you can slowly digest it and make your next move after that. This code shows a lot of the things you should have when doing assembly and dealing with functions:
Code:
    section '.data' data readable writeable
    greet db 'Hello. My first program',0ah,0
    fmt   db 'Result: %d + %d = %d',0ah,0
    inp1  dd 6
    inp2  dd 4
    ans   dd ?

    section '.code' code readable executable
    main:
        push greet          ;arg1
        call [printf]
        add esp,4           ;cleanup for arg1 push. cdecl convention
        push [inp2]         ;arg2
        push [inp1]         ;arg1
        call addTwo
        mov [ans],eax       ;copy the return value in EAX to ans
        push [ans]          ;arg4
        push [inp2]         ;arg3
        push [inp1]         ;arg2
        push fmt            ;arg1
        call [printf]
        add esp,4*4         ;cdecl stack cleanup (calling convention)
        push 0
        call [ExitProcess]

    ;add two integers
    ;Requires 2 arguments
    ;Returns to EAX
    ;Calling convention: stdcall (callee cleanup stack)
    addTwo:
        push ebp            ;function prologue
        mov ebp,esp
        push ebx            ;save EBX
        mov ebx,[ebp+12]    ;arg2 from stack
        mov eax,[ebp+8]     ;arg1 from stack
        add eax,ebx         ;EAX stores the answer
        pop ebx             ;restore EBX
        mov esp,ebp         ;function epilogue
        pop ebp
        ret 4*2             ;stdcall stack cleanup (calling convention)
You can slowly modify and expand it afterwards at your own pace. Others may help you if you have working code like this as your starting point. It resembles this C code, which should make it easier to understand from your C background (it differs only slightly, in addTwo's calling convention):
Code:
    #include <stdio.h>

    int addTwo(int x, int y);   /* declared before use in main */

    int main()
    {
        int inp1=6, inp2=4;
        int ans;
        printf("Hello. My first program\n");
        ans = addTwo(inp1,inp2);
        printf("Result: %d + %d = %d\n",inp1,inp2,ans);
        return 0;
    }

    int addTwo(int x,int y)
    {
        return x+y;
    }
Furs 27 Mar 2017, 23:46
Your functions never return; that's not good practice even if it works in this case. Use the 'ret' instruction to return from a function (optionally, if you have parameters on the stack for *your* function, use 'ret X' where X is the number of bytes to pop).
Also you forgot newlines in your strings (they end up on the same line). Here's a small attempt with commented changes, but CAUTION: I didn't compile it, because it's missing 'dota.inc' so I couldn't test it. If there's a typo etc. please ignore it:
Code:
    format PE console
    entry start
    include 'win32a.inc'
    include 'dota.inc'

    section '.data' data readable
    Hello   db 'Hello World! The number is now %d.',13,10,0 ; add newline to the strings (windows uses CR LF)
    String1 db "The number is %d",13,10,0

    section '.main' code readable executable ; code belongs in an executable section
    subtract:
        dec ebx         ; subtract 1 with 'dec' instruction, smaller encoding
        mov eax,ebx     ; could have used here 'lea eax, [ebx-1]' and get rid of 'dec'
                        ; and the mov (the lea does the same thing), but let's keep it basic
        ret             ; return from subtract!
    multiply:
        shl eax, 1      ; multiply by 2 with left shift by 1 bit, same thing but faster
        ret             ; return from multiply with result in eax!
    start:
        mov eax, 4
        add eax, 6
        mov ebx, eax
        push eax        ; push eax to the stack
        push String1
        call [printf]
        add esp, 8      ; cdecl cleanup: two pushes of 4 bytes each
        call subtract   ; takes input in ebx, returns result in eax
        call multiply   ; multiply takes value in eax and returns in eax
        push eax
        push Hello
        call [printf]
        add esp, 8      ; cdecl cleanup: two pushes of 4 bytes each
        push 0
        call [ExitProcess]
What happens is that your subtract function takes 'ebx' as INPUT, decrements it and places the result in 'eax'. You have to understand the "flow" of this: just like in C, you have functions that take parameters and return a value. Of course in asm you can return multiple values (registers), but always document what you use (simple comments before your function) so you have a clear grasp of what takes what and returns what.
A call is the exact same thing as a jmp (jump/goto) instruction except that it pushes the return address on the stack. ret takes the return address found at 'esp' and jumps back. So if you don't have a ret instruction and don't use hacks to get the return address, then 'call' makes no sense at all; you could just use 'jmp'.
call is used for functions that are supposed to return, just like in C.

system error wrote: Alignment is a processor thingy. Not OS. 64-bit calling conventions is tailored to suit such CPU requirement.
It's there only to make sure the tiny minority of functions that use vectorized SSE aligned loads/stores don't have to realign the stack, which is quite stupid.
First of all, that's a tiny minority of functions at the expense of bloating 99% of the functions' stack (resulting in less code cache/stack cache too).
Secondly, it does NOT even work for anything better than 128-bit SSE. If you use 256-bit vectors, then it's nothing but a pure waste of space. You have to realign the stack anyway to 256 bits. Thus the 128-bit alignment is senseless in anything that uses AVX.
And now we're stuck with catering to vectorized SSE even if we don't use it whatsoever (and use AVX instead). So fucking dumb. Keep in mind we're stuck with this forever, as long as x86 64-bit exists with this dumb calling convention. Terrible decision.
Last edited by Furs on 27 Mar 2017, 23:52; edited 1 time in total
C0deHer3tic 27 Mar 2017, 23:51
I thank you all for being patient with me. I will study both of these answers. I appreciate the efforts from you all.
- Sincerely, C0deHer3tic.
_________________
- Just because something is taught one way, does not mean there is not a different way, possibly more efficient. -
Trinitek 27 Mar 2017, 23:51
C0deHer3tic wrote: PS. I would like to make two lines of text.
As an additional suggestion, you should consider using a debugger and viewing the disassembly of a C program you'd like to convert. That should give you some implementation hints.
Last edited by Trinitek on 28 Mar 2017, 00:20; edited 1 time in total
Furs 27 Mar 2017, 23:54
C0deHer3tic wrote: I thank you all for being patient with me. I will study both of these answers. I appreciate the efforts from you all.
It's a great way to see the flow of control IMO.
system error 27 Mar 2017, 23:58
Furs wrote: Huh? I was talking about requiring alignment on a function call -- the processor does not need that at all.
What do you mean the processor doesn't need aligned memory? The stack is just an abstract view of the same memory in the same address space. IT IS MEMORY. So suggesting that SSE instructions require aligned memory but not an aligned stack shows that your fundamental understanding of how the 64-bit CPU works on bare metal is quite low. SSE/AVX instructions exist in, and are required inside, many API functions. So where do you think they get aligned memory from if not from the aligned stack?
revolution 28 Mar 2017, 00:38
64-bit Windows uses the SSE/AVX instructions to move data from/to the stack when doing various internal things within the APIs. It is not just for arithmetic operations, so every API has the potential to cause a stack fault if you don't align the stack correctly.
C0deHer3tic 28 Mar 2017, 02:55
Thank you everyone. So much studying!
C0deHer3tic 28 Mar 2017, 04:13
@system error
Your code is very confusing to me. I would love to know more about what everything does, and why. Your comments helped some, however I am not understanding things like:
Code:
    add esp,4*4
I am not sure I understand why 4 times 4 is added to esp. Is this because 3 args + printf were pushed to the stack, thus making 4? Also, why does this ret 4*2? I am not quite understanding.
Code:
    ret 4*2 ;stdcall stack cleanup (calling convention)
@Furs
Code:
    shl eax, 1 ; multiply by 2 with left shift by 1 bit, same thing but faster
What? I am not clear on this command. How does it multiply by 2? Here is what I found on this command, and forgive my ignorance, but I still don't understand.
Quote:
Also, why do you use 13,10,0 after each string?
Code:
    Hello   db 'Hello World! The number is now %d.',13,10,0 ; add newline to the strings (windows uses CR LF)
    String1 db "The number is %d",13,10,0
Now is this because 13 is the carriage return, and would do the same if I did 0dh? If so, why then is the newline used? Dec 10 (0ah) hex, or 1010b. Of course 0 is null. Just like "Hello World!\0" Right?
- Sincerely and curious, CodeHer3tic
Last edited by C0deHer3tic on 28 Mar 2017, 04:23; edited 1 time in total
Trinitek 28 Mar 2017, 04:23
C0deHer3tic wrote: @Furs
C0deHer3tic 28 Mar 2017, 04:28
@Trinitek
Thank you for the explanation. It was silly of me not to see that on my own.
revolution 28 Mar 2017, 04:44
C0deHer3tic wrote: Also, why do you use 13,10,0 after each string?
In Windows:
CR: cursor goes to the beginning of the current line
LF: cursor goes down to the next line
C0deHer3tic 28 Mar 2017, 05:04
Thank you, revolution. That was helpful. I understand now.
Furs 28 Mar 2017, 12:04
I tested and it seems just '10' (i.e. \n, newline) is enough in a Windows console app, so you should just use that then I guess, if you want.
But yeah, 13,10 is equivalent to "\r\n" in a C string, since that's their ASCII/ANSI encoding, nothing special.

system error wrote: What do you mean the processor doesn't need aligned memory? stack is just an abstract view of the same memory in the same address space. IT IS MEMORY. So suggesting that SSE instructions require aligned memory but not aligned stack shows your fundamental understanding of how the 64-CPU works in bare metal is quite low.
Dude, a function can realign the stack with one instruction (and a frame pointer). Even if the stack is completely messed up and aligned to 1 byte only. This is REQUIRED for any functions using AVX regardless if you want performance.
ALL that the stupid ABI does is guarantee functions that want to use 128-bit SSE (but not anything higher!!) that the stack is aligned, so it will save 1-3 instructions in the function prolog at MOST and bloat EVERYTHING ELSE.
So, we waste the stack for *every single function* (because the ABI applies to every single function, as long as it follows it) for that tiny minority of functions which use SSE (and not AVX)? Just to save a few stupid instructions in the prolog?
Let's say on average 50% of functions need to waste 8 bytes of stack space to align the stack. Keep in mind, this applies ONLY to 128-bit SSE. Any new code using AVX will have to realign the stack anyway; it doesn't matter if it's 16-byte, 8-byte or 1-byte aligned, you get the same extra prolog. The ONLY code that benefits from this is strictly SSE, and that's it.
Ok, want an assembly example? Here's our AVX function prolog (this is required for performance regardless of alignment of the stack to 16 bytes):
Code:
    push rbp
    mov rbp, rsp
    and rsp, -32 ; realigns the stupid stack, WOW magic!!!
    [...]
So to save that stupid instruction (which has so much overhead obviously and functions using SSE are 99% of them right?) we get this idiotic alignment requirement, WTF?
This alignment requirement doesn't even work for AVX, it only produces bloat in the stack (and the stack is considered "hot" to store stuff to be in the cache). So technically all code using AVX+ will not benefit from it in any way, in fact it makes it worse. Short-sighted and pathetic design, period.

system error wrote: SSE/AVX instructions do exist / required inside many API functions. So where do you think they get aligned memory from if not from the aligned stack?
Don't mix up AVX with SSE. AVX already has to realign the stack, so they already do it. This ABI shit is ONLY for SSE.
If this is such a wonderful idea, why not align the stack to 1024 bits just in case future AVX extensions will use 1024-bit vectors? Or let's align it to 4k bytes (a page) and be done with it, that way we can be sure it will work with any future vector instructions, right? After all, saving that "and rsp, -4096" for that ONE function using this massive vector is extremely important yea? Let's pollute every other function in existence with this requirement for that one function using 4096-byte vectors!
revolution 28 Mar 2017, 12:23
Furs wrote: ... for that tiny minority of functions which use SSE ...
Anyhow, we have it now. It is what it is. If you want to write code that interfaces with the API then you have to comply or have your code crashing.
system error 28 Mar 2017, 12:30
Furs wrote: Dude, a function can realign the stack with one instruction (and a frame pointer). Even if the stack is completely messed up and aligned to 1 byte only. This is REQUIRED for any functions using AVX regardless if you want performance.
This is where you get the idea wrong. In a 64-bit ABI of any kind, the work of aligning the stack is not done by the function; it is the responsibility of the user code / callers, so that a function is free to do its job without any hassle. So the function does not bloat its code with a function prologue and epilogue like your pathetic code is suggesting. It is no different from the 32-bit calling conventions, where the users / callers need to re-align the stack, especially in CDECL. Same old, same old.
I understand your INCOMPETENCE when dealing with the 64-bit thingy. You don't have to bark at Microsoft or Linus Torvalds.
You DO know that PUSH is a high-level / complex instruction, right? You DO know that PUSH RCX consumes more microcode than plain
Code:
    sub rsp,8
    mov [rsp],rcx
right? right? Let's see how good your brain is vs your big mouth.
system error 28 Mar 2017, 12:36
revolution wrote:
She doesn't get the idea that 64-bit programming is not for everybody. 64-bit programming is not for the faint of heart. If she tries to understand 64-bit calling conventions with 32-bit INVOKERS mindset, then she/he is going to be hysterical (like she's now).
system error 28 Mar 2017, 12:44
C0deHer3tic wrote: Thank you everyone. So much studying!
The add esp,4*x is there to restore the stack pointer (ESP, the top of the stack) to the position it had before the arguments were pushed. That means if you PUSHED 3 items onto the stack as function arguments, then after the function returns, you're responsible for restoring ESP to its original value, because ESP is going to be used by everybody else. In 32-bit code, a push is 4 bytes, so 3 pushes need 4*3 to restore it. In 64-bit code, a push is 8 bytes, so 5 pushes need 8*5 to restore the top of the stack.
Code:
    ;Example
    push a
    push b
    push c
    call D
    add esp,4*3 ;or simply add esp, 12
I told you you're picking the wrong entry point to learn assembly. Jumping right to calling conventions or stack programming is not a wise move. You need to go back down a little bit to the basics.
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.