flat assembler
Message board for the users of flat assembler.
Windows: x64 calling convention oddity
r22 18 May 2007, 01:41
The x64 fastcall convention (for Windows; I haven't got into the Linux version to see if there's much of a difference) is a pain to use and optimize.
The requirements for stack alignment, the empty stack space, and the extra slots for more than 4 args just make it uncomfortable to use. There's no problem using the invoke macro for win64, but if your goal is to create optimal code (the invoke for win64 fastcall isn't), it's a real hassle.
MOV rax,[RANT]
RET 0
Re: Chewy, if it doesn't happen on the SECOND call, why would the stack get corrupted on the THIRD or FOURTH call? Which APIs did you get this abnormal behavior with?
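The stack rules complained about above can be sketched as plain arithmetic. This is a minimal model in C, assuming the usual Windows x64 rules: 32 bytes of shadow space for every call, one 8-byte slot per argument beyond the fourth, and RSP a multiple of 16 at the CALL instruction. The helper names `call_area_bytes` and `arg_slot_offset` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of the caller-side stack arithmetic for the Windows x64
   convention. Both helpers are hypothetical, not part of any toolchain. */

enum { SHADOW_BYTES = 32, SLOT_BYTES = 8, STACK_ALIGN = 16 };

/* Bytes the caller reserves below an already aligned RSP for one call:
   32 bytes of shadow space, 8 bytes per argument beyond the fourth,
   rounded up to a multiple of 16 so RSP stays aligned at the CALL. */
size_t call_area_bytes(size_t nargs)
{
    size_t stack_args = nargs > 4 ? nargs - 4 : 0;
    size_t raw = SHADOW_BYTES + stack_args * SLOT_BYTES;
    return (raw + STACK_ALIGN - 1) & ~(size_t)(STACK_ALIGN - 1);
}

/* Offset from RSP (at the CALL instruction) of the slot belonging to the
   1-based argument `index`. Slots 1..4 are shadow space for the register
   arguments; only arguments 5 and up are really passed in memory. */
size_t arg_slot_offset(size_t index)
{
    assert(index >= 1);
    return (index - 1) * SLOT_BYTES;
}
```

Under these assumptions a five-argument call such as ReadFile needs `call_area_bytes(5)`, which is 30h bytes, and its fifth argument sits at `arg_slot_offset(5)`, which is 20h.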
LocoDelAssembly 18 May 2007, 02:17
Moreover, are you sure that by zeroing [ESP+$18] you fix the problem? Perhaps RAX = 0 does the magic. I'm pretty sure that some part of RAX is used in the Linux ABI to indicate the number of SSE registers used or something like that (sorry, I don't remember it clearly), and perhaps the RAX random values are causing this mess.
[edit] System V Application Binary Interface, AMD64 Architecture Processor Supplement, Draft Version 0.98 wrote: For calls that may call functions that use varargs or stdargs (prototype-less
Xorpd! 18 May 2007, 02:52
The moral of many other stories has been: "If you have a problem and don't understand what is going wrong, it does no good to report your perceptions of what the problem is, because if your perceptions were correct, you would understand the problem and would be able to solve it yourself."
If you want help with the problem, provide a minimal but complete source that allows others to reproduce it, so that we can try to determine what your error actually was. A description of the problem and a workaround are clearly insufficient here.
Chewy509 18 May 2007, 05:48
LocoDelAssembly wrote: Moreover, are you sure that by zeroing [ESP+$18] you fix the problem?
When that's the only difference between a working example and a non-working example... what else could it be?
PS. Just assemble both and execute. The example displays argc/argv, reads a file called 'README', and displays its output.
PPS. The abnormal behavior was noticed with the ReadFile() call.
Xorpd! 18 May 2007, 06:23
The attachments are totally garbled at my end. Am I alone in experiencing difficulty in reading files attached to this forum? The attachments are invisible until I log on to the forum, and when I click on the download link, I get a garbled file. If I right-click and select "save file as", for some reason the file doesn't get saved.
If you want me to look at it, I don't know what your alternatives are. You could attempt to upload it again, or you could PM me, but that would be my first attempt to read a PM! You could derive my email address from first principles; I can't recall if it's in my profile...
LocoDelAssembly 18 May 2007, 16:52
Code:
;Register renaming
r0 equ rax
r0d equ eax
r0w equ ax
r0b equ al
r1 equ rbx
r1d equ ebx
r1w equ bx
r1b equ bl
r2 equ rcx
r2d equ ecx
r2w equ cx
r2b equ cl
r3 equ rdx
r3d equ edx
r3w equ dx
r3b equ dl
r4 equ rdi
r4d equ edi
r4w equ di
r4b equ dil
r5 equ rsi
r5d equ esi
r5w equ si
r5b equ sil
r6 equ rbp
r6d equ ebp
r6w equ bp
r6b equ bpl
r7 equ rsp
r7d equ esp
r7w equ sp
r7b equ spl

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;; NON-WORKING CODE ;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
_B0__fgetc:
    push r1
    push r2
    push r3
    push r4
    push r5
    push r8
    push r9
    push r10
    push r11
    push r12
    push r13
    push r14
    push r15
    mov r2, qword [r6+_B0__fgetc_handle]   ; rcx = hFile
    lea r3, [r6+_B0__fgetc_buffer]         ; rdx = lpBuffer
    mov r8, 1                              ; r8  = nNumberOfBytesToRead
    lea r9, [r6+_B0__fgetc_buffer2]        ; r9  = lpNumberOfBytesRead
    mov r0, 0
    push r6
    mov r6, r7
    and r7, -16
    sub r7, 30h                            ; [rsp+20h] (5th argument) is never written
    call [ReadFile]

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;; WORKING CODE ;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
_B0__fgetc:
    push r1
    push r2
    push r3
    push r4
    push r5
    push r8
    push r9
    push r10
    push r11
    push r12
    push r13
    push r14
    push r15
    mov r2, qword [r6+_B0__fgetc_handle]   ; rcx = hFile
    lea r3, [r6+_B0__fgetc_buffer]         ; rdx = lpBuffer
    mov r8, 1                              ; r8  = nNumberOfBytesToRead
    lea r9, [r6+_B0__fgetc_buffer2]        ; r9  = lpNumberOfBytesRead
    mov r0, 0
    push r6
    mov r6, r7
    and r7, -16
    sub r7, 30h
    mov [r7], r0
    mov [r7+08h], r0
    mov [r7+10h], r0
    mov [r7+18h], r0
    mov [r7+20h], r0                       ; 5th argument: lpOverlapped = NULL
    mov [r7+28h], r0
    call [ReadFile]

In the non-working code you are calling ReadFile as
Code:
invoke ReadFile, [r6+_B0__fgetc_handle], addr r6+_B0__fgetc_buffer, 1, addr r6+_B0__fgetc_buffer2, GARBAGE!!!
whereas the working code calls it as
Code:
invoke ReadFile, [r6+_B0__fgetc_handle], addr r6+_B0__fgetc_buffer, 1, addr r6+_B0__fgetc_buffer2, NULL
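The difference between the two listings can be pictured with a toy model of the 30h bytes reserved by "sub r7, 30h". This is a hedged sketch in C, not the poster's code: the `call_area` type and its helpers are hypothetical, and the 0xCC fill merely stands in for whatever earlier pushes and calls left on the stack.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model of the area reserved by "sub rsp, 30h".
   Index 0 is [rsp], index 4 is [rsp+20h]. All names are hypothetical. */
enum { CALL_AREA_SLOTS = 6, FIFTH_ARG_SLOT = 0x20 / 8 };

typedef struct {
    uint64_t slot[CALL_AREA_SLOTS];
} call_area;

/* Fill the area with a recognisable nonzero pattern standing in for
   leftover stack contents. */
void fill_with_garbage(call_area *a)
{
    memset(a->slot, 0xCC, sizeof a->slot);
}

/* Store a qword at a byte offset from the modelled rsp,
   like "mov [rsp+offset], value". */
void store_qword(call_area *a, unsigned offset, uint64_t value)
{
    assert(offset % 8 == 0 && offset / 8 < CALL_AREA_SLOTS);
    a->slot[offset / 8] = value;
}

/* What a callee taking five arguments reads as its fifth one. */
uint64_t fifth_argument(const call_area *a)
{
    return a->slot[FIFTH_ARG_SLOT];
}
```

In this model, `store_qword(&a, 0x18, 0)` leaves `fifth_argument(&a)` untouched, while `store_qword(&a, 0x20, 0)` makes it zero. That matches the reading that the store to [r7+20h], rather than the one to [r7+18h], is what separates the two listings.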
Chewy509 19 May 2007, 05:38
LocoDelAssembly, try making the last parameter null in the non-working code, e.g. add " mov [rsp+20h], r0" just before the call, and see what happens...
(I must have snipped too much between the two uploads, as I had to recreate the non-working one from the working one.)
Chewy509 19 May 2007, 23:41
Chewy509 wrote: LocoDelAssembly, try making the last parameter null in the non-working code, e.g. add " mov [rsp+20h], r0" just before the call, and see what happens...
Thinking about this last night: that last parameter is a pointer to the structure for overlapped I/O. If the file was opened for synchronous I/O only, why would the contents of that parameter matter anyway? And if they did, why wouldn't the call fail the first time, rather than only on the 3rd or 4th call? But since the parameter was set correctly during testing (and accidentally snipped before upload), I guess that point is null and void. Still, it's something to think about; a lot doesn't make sense.
PS. Has anyone been able to confirm this issue on their own PC?
LocoDelAssembly 20 May 2007, 00:05
Quote: If hFile is not opened with FILE_FLAG_OVERLAPPED and lpOverlapped is not NULL, the read operation starts at the offset specified in the OVERLAPPED structure.
If you wonder why it doesn't crash the first N times, check the fifth-argument slot at [RSP+20h] before each call to see where the garbage points.
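To picture the quoted rule, here is a small C sketch of how a start offset is read out of an OVERLAPPED structure. `overlapped_model` is a hypothetical portable stand-in, so the example builds without windows.h. Its field order follows the documented 64-bit layout (Internal, InternalHigh, Offset, OffsetHigh, hEvent). It illustrates the mechanism only and makes no claim about what the stray pointer held on Chewy's machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable stand-in for the 64-bit Win32 OVERLAPPED layout.
   The type name is hypothetical; the field order follows the
   documented structure. */
typedef struct {
    uint64_t internal;       /* ULONG_PTR Internal     */
    uint64_t internal_high;  /* ULONG_PTR InternalHigh */
    uint32_t offset;         /* DWORD Offset           */
    uint32_t offset_high;    /* DWORD OffsetHigh       */
    uint64_t event_handle;   /* HANDLE hEvent          */
} overlapped_model;

/* File position a read would start at when lpOverlapped is non-NULL:
   OffsetHigh supplies the upper 32 bits, Offset the lower 32 bits. */
uint64_t start_offset(const overlapped_model *ov)
{
    assert(ov != NULL);
    return ((uint64_t)ov->offset_high << 32) | ov->offset;
}
```

If the garbage in the fifth-argument slot happens to point at readable memory, bytes 16 to 23 of whatever lives there are taken as the file offset. Under that assumption a call can misbehave quietly instead of faulting.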
Xorpd! 21 May 2007, 23:57
So I finally got around to running your code. Don't know why I should have done so because nobody has ever run my codes -- and they don't even crash!
I made one change to both versions because it seemed to make sense to me:
Code:
; db ((label2-label)/2)-3
; db ((label2-label)/2)-3
dw label2-label-3
Results:
Code:
test argc, argv application
Argc = 1
Argv = 4198935
Argv[0] = win64_broken
Open File test
File Handle = 0
Reading File Contents:
Code:
test argc, argv application
Argc = 1
Argv = 4198935
Argv[0] = win64_working
Open File test
File Handle = 0
Reading File Contents:
Chewy509 22 May 2007, 03:22
Xorpd! wrote:
If label2 = 16 and label = 2, then:
((label2-label)/2)-3 = 4
label2-label-3 = 11
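The two expressions are easy to check side by side. This is a small C sketch with hypothetical function names; `label` and `label2` are treated as plain byte offsets, and the constant 3 is taken as given from the posts.

```c
#include <assert.h>

/* The two length expressions from the thread, with label and label2
   given as byte offsets. Function names are hypothetical. */

/* Original form: ((label2-label)/2)-3 */
int length_halved(int label, int label2)
{
    assert(label2 >= label);
    return ((label2 - label) / 2) - 3;
}

/* Xorpd!'s replacement: label2-label-3 */
int length_bytes(int label, int label2)
{
    assert(label2 >= label);
    return label2 - label - 3;
}
```

With label = 2 and label2 = 16 these give 4 and 11, matching the figures above. For a span shorter than 6 bytes the halved form goes negative: `length_halved(2, 6)` is -1.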
Chewy509 22 May 2007, 03:28
After digging through my CVS repository (we all use those, don't we?), I think I found that the original code was incorrect and wouldn't work irrespective of whether the slack space was zeroed or not. Either that, or the first call to ReadFile() would trash the stack (due to poor parameter passing on my part), but not enough to cause an exception, while the 3rd or 4th call would trash it a little too much, so that an exception would be raised.
Anyway, I've got some working code (and my compiler now runs on a reported 6 OSs: 3 confirmed by me, and 3 as reported by others)! So I guess you can ignore this thread...
Xorpd! 22 May 2007, 03:43
Your length may be correct for UTF16_STRING, but the correction I made was for UTF8_STRING. If you look at the data structures in your executable, the lengths of the UTF8_STRINGs come out negative sometimes. Perhaps that is your intent, but it seems weird to a reader of your uncommented code.
I expected your original code to have caused the fault. BTW, I tried Loco's correction and your code then worked.
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.