flat assembler
Message board for the users of flat assembler.
CreateFileW(); call in Win x64
Chewy509 11 May 2007, 00:30
Hi Guys,
I'm having a problem with opening files (either r or rw). The code to open a file is as follows (the filename is a UTF-16 encoded string located at B0_DynStr0):

Code:
_B0__fopen:
    mov r2, B0_DynStr0
    add r2, 4 ;; skip string length field. (pascal type string).
    mov r3, 0
    mov r3d, 80000000h;
    mov r8, 1
    mov r9, 0
    mov r0, 3
    push r0
    mov r0, 0
    push r0
    mov r0, 0
    push r0
    sub r7, 20h
    call [CreateFileW]
    add r7, 20h
    mov r1, -1
    cmp r0, r1
    jne .B0_END_BLOCK_000028
    mov r0, 0
.B0_END_BLOCK_000028:
    ret

When the above code is run, the app just quits completely, with no error message. I've run the code through MS's WinDbg (that comes with VC++ Express Edn), and it's getting an access violation from *within* the API call (rIP is outside my application at the time of the exception). I'm assuming that I've stuffed up the call in some way, but after rereading the API documentation (including the x64 calling convention), I can't find anything wrong. Does anyone have an idea of what I'm doing wrong?

PS. Running on WinXP x64 SP2.
PPS. I've attached the complete source code for inspection...
Xorpd! 11 May 2007, 01:49
Code:
Name  Number
rax   r0
rcx   r1
rdx   r2
rbx   r3
rsp   r4
rbp   r5
rsi   r6
rdi   r7

Fix this mapping up and see if you are making further progress.
Chewy509 11 May 2007, 02:51
LocoDelAssembly wrote: With the debugger check if RSP is 16-bytes aligned when RIP is at the first CreateFileW's instruction. If RSP doesn't have such alignment you have to fix that.

Thought of that, but rSP (aka r7) is aligned correctly...
Chewy509 11 May 2007, 02:54
Xorpd! wrote: Fix this mapping up and see if you are making further progress.

That's an unusual mapping??? Where does it come from? My understanding is that parameter passing is:
1. args in ecx (r2), edx (r3), r8, r9, then pushed onto the stack (from left to right).
2. stack aligned to 16, before the call.

Oh well, I'll have another play with the stack alignment later this weekend, and see what I come up with...
LocoDelAssembly 11 May 2007, 03:07
Just to clarify, it should be [(RSP mod 16) = 8] after executing the call instruction. Here is a link: http://blogs.msdn.com/oldnewthing/archive/2004/01/14/58579.aspx

[edit] I was mistaken, the 16-byte alignment is needed just before executing the call instruction, not after. I corrected that.[/edit]

Last edited by LocoDelAssembly on 11 May 2007, 12:44; edited 3 times in total
Chewy509 11 May 2007, 03:10
I've just reread through the invoke macro in proc64.inc, and I'm unsure of the order in which items should be pushed onto the stack.

In my code the stack looks like:
[rsp] = slack
[rsp+8] = slack
[rsp+16] = slack
[rsp+24] = slack
[rsp+32] = arg7
[rsp+40] = arg6
[rsp+48] = arg5
(args 1 - 4 are in registers)

but according to the macro, if I'm reading it correctly, the stack should be:
[rsp] = slack
[rsp+8] = slack
[rsp+16] = slack
[rsp+24] = slack
[rsp+32] = arg5
[rsp+40] = arg6
[rsp+48] = arg7

Can someone please confirm which is the correct stack arrangement? Are the arguments that are to be pushed onto the stack pushed from left to right (arg5 gets pushed first, followed by arg6, etc.), or from right to left (the last arg gets pushed first, working backward from there)?
Chewy509 11 May 2007, 03:16
LocoDelAssembly wrote:
Thanks, that blog clarified my post just below yours...
Chewy509 11 May 2007, 03:26
LocoDelAssembly wrote:
Do you know of an easy solution in code that will allow the stack alignment to always be correct for calling the API? E.g. something that I can insert before calling a function? It just seems like a PITA to keep stack alignment in mind, especially if you're doing a lot of API calls that are wrapped into other functions... E.g. would the following work?

Code:
    push rsp
    mov rax, -16
    and rsp, rax ;; rsp mod 16 = 0
    sub rsp, 8 ;; rsp mod 16 = 8
    ;; now setup call stack for arguments
    sub rsp, (numargs * 8)
    ;; insert args
    call [WriteFile]
    add rsp, (numargs * 8) + 8 ;; clean up stack
    pop rsp ;; restore original stack pointer
Chewy509 11 May 2007, 04:04
Hmm... this might be better?
Code:
    mov rbp, rsp
    mov rax, -16
    and rsp, rax ;; rsp mod 16 = 0
    push rbp ;; rsp mod 16 = 8, and save original rsp onto stack
    ;; now setup call stack for arguments
    sub rsp, (numargs * 8) ;; if odd number of args OR
    sub rsp, (numargs * 8) + 8 ;; if even number of args
    ;; insert args
    call [WriteFile]
    add rsp, (numargs * 8) ;; clean up stack if odd # of args OR
    add rsp, (numargs * 8) + 8 ;; if even # of args
    pop rsp ;; restore original stack pointer

Or am I just confusing myself with all of this?
Xorpd! 11 May 2007, 04:28
Chewy509 wrote: That's an unusual mapping??? Where does it come from?

Everybody except you considers r2 to be rdx and r3 to be rbx. These are the register numbers encoded into the instructions.

You do have a problem with stack alignment in that you push an odd number of registers in the prolog, which you omitted in your original code. For some reason sometimes I can't see an attached file, as was the case here. However, after logging in to respond to this message I could see it, and hence also all those pushes in the prolog of the procedure. In fact you push all registers except for r0 (rax), r4 (rsp), and r5 (rbp). (See how difficult it is when you don't follow standard conventions? I recall that Agner Fog numbered the Core microarchitecture port that accepts branch instructions as port 2, but when Intel belatedly came out with their documentation for the Core microarchitecture, where they called it port 5, Agner Fog quickly came out with a new version of his documents where he conformed to Intel's numbering scheme. This made it much easier for me to talk about optimization issues with friends because I didn't have to always clarify which numbering scheme I was following. So please take a hint from Agner Fog and don't make up an arbitrary numbering scheme for Intel's integer registers that conflicts with the scheme that has been around for over a quarter century.)

You must subtract some number that is 8 mod 16 from r4 (rsp) before pushing the first argument on the stack. This will make it aligned 0 mod 16 just before the call, assuming it was 8 mod 16, as it should have been, on entry to your procedure.

Also, it seems that you might not be aware that in the x64 calling convention, the callee does not clean up the stack in any way on return. This means that you will have to add 18h to r4 (rsp) in addition to the 20h you added to it already, plus the 8 mod 16 number you subtracted from it before you started to push arguments on the stack.

Good luck.
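As a rough illustration of the advice above, here is how the push-based call might look once the padding, the push order, and the cleanup are all accounted for. It is only a fragment under stated assumptions, not the original poster's corrected code: standard register names are used, rsp mod 16 is assumed to be 0 at this point (for example, after the odd number of prolog pushes mentioned above), and B0_DynStr0 and the argument values are taken from the original post.

```asm
; Fragment, not a complete program.
; Assumption: rsp mod 16 = 0 here (e.g. after an odd number of prolog pushes).
        sub     rsp, 8                  ; padding that is 8 mod 16: rsp mod 16 = 8
        lea     rcx, [B0_DynStr0+4]     ; arg 1: lpFileName, skipping the length field
        mov     edx, 80000000h          ; arg 2: dwDesiredAccess = GENERIC_READ
        mov     r8d, 1                  ; arg 3: dwShareMode = FILE_SHARE_READ
        xor     r9d, r9d                ; arg 4: lpSecurityAttributes = NULL
        push    0                       ; arg 7 is pushed first: hTemplateFile
        push    0                       ; arg 6: dwFlagsAndAttributes
        push    3                       ; arg 5 is pushed last: OPEN_EXISTING
        sub     rsp, 20h                ; shadow space; rsp mod 16 = 0 at the call
        call    [CreateFileW]           ; [rsp+20h] = arg 5, [rsp+28h] = arg 6, [rsp+30h] = arg 7
        add     rsp, 20h + 18h + 8      ; caller cleans up: shadow space + 3 args + padding
```

Pushing the last argument first is what leaves arg 5 at the lowest address, directly above the 20h bytes of shadow space.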
LocoDelAssembly 11 May 2007, 04:34
Unbelievable, I corrected my post but stated the opposite of the correction I actually made. I'd better go to sleep now...

Anyway, (RSP mod 16) SHOULD BE zero BEFORE calling; in other words, (RSP mod 16) should be eight when executing the very first instruction of the callee (CreateFileW in this case).

I also want to add that you have to reserve stack space as if all the parameters go on the stack (the spill area). That's why the fastcall macro does not bother discounting the parameters that will not go on the stack:

Code:
if argscount and 1
  stackspace = (argscount+1)*8
else
  stackspace = argscount*8
end if

Supposing that you can never be sure about the alignment, this could be a way to call CreateFileW:

Code:
    push rbp
    mov rbp, rsp
    sub rsp, 8*4 + 3*8 ; Spill area + stack parameters
    and rsp, -16 ; Alignment
    mov qword [rsp+8*4+16], 0 ; I don't know if HANDLE is 64-bit or 32-bit wide so I used qword to be safe
    mov dword [rsp+8*4+8], FILE_ATTRIBUTE_NORMAL
    mov dword [rsp+8*4], OPEN_EXISTING
    xor r9, r9 ; lpSecurityAttributes
    xor r8, r8 ; dwShareMode
    mov edx, GENERIC_READ ; dwDesiredAccess
    lea rcx, [B0_DynStr0+4] ; lpFileName
    ; RSP+48 = hTemplateFile
    ; RSP+40 = dwFlagsAndAttributes
    ; RSP+32 = dwCreationDistribution
    ; RSP+24 = r9's spill
    ; RSP+16 = r8's spill
    ; RSP+8 = rdx's spill
    ; RSP = rcx's spill
    call [CreateFileW]
    mov rsp, rbp
    pop rbp

I don't have a Win64 system to test on, but I hope this time it will work.

[edit]Corrected alignment, I was clearing the upper 28 bits of RSP instead of the first 4. Thanks Xorpd![/edit]

Last edited by LocoDelAssembly on 11 May 2007, 12:48; edited 1 time in total
Chewy509 11 May 2007, 05:15
Xorpd! wrote:
I do realise what the *internal* encodings are, but have never heard someone refer to rcx as r1 in general conversation about register usage... Apologies if my comment appeared naive. (FYI: most of the code I post is not written in asm, but is rather the asm output from a compiler. The HLL uses that naming convention, and I rely on fasm to equ r0 to rax, r1 to rbx, etc.)

Xorpd! wrote: You do have a problem with stack alignment in that you push an odd number of registers in the prolog, which you omitted in your original code. For some reason sometimes I can't see an attached file, as was the case here. However, after logging in to respond to this message I could see it, and hence also all those pushes in the prolog of the procedure.

Most of the code I write is cross-platform, and hence uses *a lot* of wrapping code for calling the respective OS. Because of this I preserve all registers across a call that in turn calls an OS/API function. Since stack alignment appears to be critical to calling the Windows API, I asked for a generic way to call the API without worrying about stack alignment (which LocoDelAssembly kindly offered).

Xorpd! wrote: I recall that Agner Fog numbered the Core microarchitecture port that accepts branch instructions as port 2, but when Intel belatedly came out with their documentation for the Core microarchitecture, where they called it port 5, Agner Fog quickly came out with a new version of his documents where he conformed to Intel's numbering scheme. This made it much easier for me to talk about optimization issues with friends because I didn't have to always clarify which numbering scheme I was following. So please take a hint from Agner Fog and don't make up an arbitrary numbering scheme for Intel's integer registers that conflicts with the scheme that has been around for over a quarter century.)

Apologies if you got confused. In future I'll ensure that *all* code I post follows Intel's/AMD's documentation.

Xorpd! wrote:
I am aware of the calling convention, and forgetting to clean up the stack correctly was an oversight on my part, i.e. one of the bugs in the code...

PS. Just so you are aware, I'm coming back to Windows coding after approx 3yrs away from the Windows platform (mainly Linux/FreeBSD/hobby OS), and trying to increase my knowledge of a new/different calling convention. While I've got over 15yrs asm experience, I still consider myself a newbie in Windows x64 programming.
Chewy509 11 May 2007, 05:17
LocoDelAssembly, thanks for your help and clarification.
Xorpd! 11 May 2007, 08:34
Given the time that has passed since x64 systems first became available, we are all newbies at x64.

What makes most sense to me is to program in x64 as though one were programming a RISC machine: use rsp-offset moves rather than pushes and pops to insert arguments and save and restore registers, rather like Loco's snippet, but also keep in mind which registers are callee-save in the calling convention so as to avoid having to save them all (in a typical RISC architecture this would be a lot of registers, and would be quite a few in x64 if you entertained thoughts of saving xmm0:xmm15). Also, the calling convention ensures that you know the stack alignment on procedure entry, so it's not necessary to do gymnastics like Loco's and rsp, not 0fh, which you would have to do in *32-land. Normally I would just subtract enough from rsp on procedure entry to leave room for saved registers, local variables, the most stack-passed parameters of any procedure that will be invoked, the 32 bytes required for any call, and the extra 8 bytes for alignment if necessary.

If you keep the RISC philosophy in mind, most of the Windows x64 stuff, like the calling convention and the psycho structure alignment requirements, makes a lot of sense and is intuitive. This might not be an option for you, however, because you are trying to be multiplatform on a set of platforms that doesn't include an actual RISC processor and are working from an HLL. The HLL stuff definitely shows in your example, and I realize that you are also struggling with just getting something basic to work.

How many times have we sat at our computers for hours at a time and not had anything to show for it in terms of code? Our culture seems to consider that such periods of not-obviously-productive activity are a bad thing, but I think it's normal and even a prerequisite for attaining computer skills, especially assembly language skills. So don't feel bad about it because you are probably going to get going pretty quickly from here on out.
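The "reserve everything once, then use rsp-offset moves" style described above can be sketched as a small fasm program. This is an illustration under assumptions rather than anyone's tested code from the thread: it assumes fasm's standard Win64 headers (win64a.inc) are on the include path and define the constant names used in LocoDelAssembly's snippet, and the file name and the hFile variable are made-up placeholders.

```asm
; Sketch only. Assumes fasm with win64a.inc available;
; 'test.txt' and hFile are hypothetical placeholders.
format PE64 console
entry start

include 'win64a.inc'

section '.data' data readable writeable
  filename du 'test.txt',0              ; UTF-16 file name
  hFile    dq ?

section '.text' code readable executable
start:
        ; On entry rsp mod 16 = 8. Reserve the stack once:
        ; 32 bytes of shadow space + 3 stack arguments = 56 bytes,
        ; which is 8 mod 16, so rsp mod 16 = 0 at every call below.
        sub     rsp, 8*4 + 8*3

        lea     rcx, [filename]                         ; arg 1: lpFileName
        mov     edx, GENERIC_READ                       ; arg 2: dwDesiredAccess
        mov     r8d, FILE_SHARE_READ                    ; arg 3: dwShareMode
        xor     r9d, r9d                                ; arg 4: lpSecurityAttributes = NULL
        mov     qword [rsp+8*4], OPEN_EXISTING          ; arg 5: dwCreationDisposition
        mov     qword [rsp+8*5], FILE_ATTRIBUTE_NORMAL  ; arg 6: dwFlagsAndAttributes
        mov     qword [rsp+8*6], 0                      ; arg 7: hTemplateFile = NULL
        call    [CreateFileW]
        mov     [hFile], rax            ; rax = -1 (INVALID_HANDLE_VALUE) on failure

        xor     ecx, ecx                ; exit code 0
        call    [ExitProcess]           ; reuses the same shadow space

section '.idata' import data readable writeable
  library kernel32,'KERNEL32.DLL'
  import  kernel32,\
          CreateFileW,'CreateFileW',\
          ExitProcess,'ExitProcess'
```

Because nothing is pushed or popped between calls, the alignment only has to be worked out once, at the top of the procedure.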
Chewy509 13 May 2007, 22:50
Xorpd! wrote: So don't feel bad about it because you are probably going to get going pretty quickly from here on out.

Thanks. Well, once I got the calling convention sorted, I've been pretty much on my way. Now I've just got a couple of queries regarding the API, rather than just calling the damn thing.

Quick question, I have to assume that ReadFile() and WriteFile() automatically increment the file pointer as needed? I've been playing with both of these, and ReadFile() doesn't appear to be incrementing the file pointer... Oh well, I'm away from my dev PC for a week, so I guess it'll have to wait.
vid 14 May 2007, 10:20
Quote: Quick question, I have to assume that ReadFile() and WriteFile() automatically increment the file pointer as needed?

Yes, they do.
Chewy509 14 May 2007, 23:36
vid wrote:
Sure, I figured it out last night. I was misinterpreting the return fields of ReadFile(). The latest Win2K3 R2 SDK documentation is a little short on info, but going back to the old NT4 SDK documentation cleared it up.
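For readers who hit the same confusion, the following fragment sketches how the two results of a synchronous ReadFile call are usually separated: the return value in rax is only a success flag, while the number of bytes actually read is written to the variable passed as lpNumberOfBytesRead. The labels hFile, buffer, bytesRead and .read_failed are hypothetical, and the fragment assumes rsp mod 16 = 0 with at least 28h bytes already reserved at [rsp].

```asm
; Fragment, not a complete program. Hypothetical data labels:
;   hFile dq ?   buffer rb 512   bytesRead dd ?
; Assumption: rsp mod 16 = 0 and at least 28h bytes are reserved at [rsp].
        mov     rcx, [hFile]            ; arg 1: hFile (from CreateFileW)
        lea     rdx, [buffer]           ; arg 2: lpBuffer
        mov     r8d, 512                ; arg 3: nNumberOfBytesToRead
        lea     r9, [bytesRead]         ; arg 4: lpNumberOfBytesRead
        mov     qword [rsp+20h], 0      ; arg 5: lpOverlapped = NULL
        call    [ReadFile]
        test    eax, eax                ; return value is a BOOL, not a byte count
        jz      .read_failed            ; hypothetical error label elsewhere in the procedure
        mov     eax, [bytesRead]        ; bytes actually read; 0 here means end of file
```

A second call with the same arguments continues from where the first one stopped, since the file pointer has advanced by the number of bytes read.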
vid 14 May 2007, 23:56
Quote: The latest Win2K3 R2 SDK Documentation is a little short on info, but going back to the old NT4 SDK documentation cleared it up.

Use the online MSDN whenever possible; that should have the latest info.
Chewy509 18 May 2007, 00:04
Here's one to add to the x64 calling convention...
I spent about 2hrs last night trying to figure out why repeated calls to certain API functions would cause an exception (on about the 3rd or 4th call to the function). Triple checked the calling convention, and couldn't find anything wrong. Why would an API call fail on the fourth or fifth time, if you were just passing the same parameters to the API call?

So just out of frustration, I ensured that the slack space ([rsp] -> [rsp+18h]) was zero, and the calls started working perfectly irrespective of the number of times I called the API function...

Moral of the story: if a call is failing and you are 100% certain that you are calling the function correctly (or have called it multiple times in the past), try setting the slack space to zero, e.g.

Code:
    sub rsp, 20h ; slack space
    xor rax, rax ;; I use rax, since the call trashes it
    mov [rsp], rax
    mov [rsp+08h], rax
    mov [rsp+10h], rax
    mov [rsp+18h], rax
    call [API_Function]
    add rsp, 20h
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.