flat assembler
Message board for the users of flat assembler.
Windows > Losing lower 127:0 bits on ymm0-5?
bitshifter 23 May 2011, 16:17
Sorry, I don't have a modern Windoze machine to try this on...
Anyway, IIRC wsprintf is cdecl. invoke is for stdcall, whereas cinvoke is for cdecl; invoke will not clean up the stack like cinvoke does...
Code:
    invoke wsprintf
Not very important in this case, but I just thought I would mention it...
Alphonso 23 May 2011, 17:13
I wondered about that. Do you mean cinvoke should still be used for "correctness"?
Disassembled code section of invoke wsprintf...
Code:
    sub rsp,64
    mov rcx,rdi
    mov rdx,wsformat
    mov r8,rbx
    mov r9,[rsi+TestYmm0+rsi]
    mov rax,[rsi+TestYmm0+8h]
    mov [rsp+20H],rax
    mov rax,[rsi+TestYmm0+10h]
    mov [rsp+28H],rax
    mov rax,[rsi+TestYmm0+18h]
    mov [rsp+30H],rax
    call near [rel imp_wsprintfA]
    add rsp, 64
In this case the code produced is the same whether cinvoke or invoke is used.
LocoDelAssembly 23 May 2011, 18:02
Alphonso, perhaps wsprintf itself is destroying your registers? I think the calling convention allows some SSE registers to be destroyed, and if I also remember right, writing to an XMM register clears the upper 128 bits of its YMM counterpart (so even if the calling convention disallows destroying SSE registers, using them and restoring them before returning would still clear the YMM registers).
To test this better, perhaps you should try LOOP $ just above @@:, using a high enough RCX to give you enough time to launch something else that also uses the YMM registers (a second instance of this very same program, for instance, but you'll need to initialize ymmreg with random values and then add code to check for differences).
[edit]
http://www.agner.org/optimize/calling_conventions.pdf wrote: A preliminary ABI published by Intel (see literature p. 53) is supported by operating systems
Also, I've read a little about AVX: using an SSE instruction (with SSE registers) won't modify the upper 128 bits of the YMM register, but when using the VEX.128 encoding, preserving the upper half is not possible.
[/edit]
Alphonso 23 May 2011, 18:31
Maybe, I'll look further into it. Thanks for the intuitive feedback. It would be nice if it were documented which registers are volatile in that case. It also makes me wonder about 32-bit, where only the first 8 are available.
EDIT: ^^ good find. I wonder how expensive xsave/xrstor is going to be.
Double Edit: Okay, wsprintf seems fine, so it must have been the msgbox. Even adding a Sleep between two blocks that read ymm corrupts them, but running a long loop with reg/dec and all threads busy seems okay, so the context seems okay. Thanks Loco.
Enko 23 May 2011, 19:20
wsprintf is stdcall; it's located in the WinAPI, not the C standard library. cprintf is cinvoke, from msvcrt.dll.
LocoDelAssembly 23 May 2011, 20:11
Quote:
http://msdn.microsoft.com/en-us/library/ms647550%28VS.85%29.aspx wrote: Note: It is important to note that wsprintf uses the C calling convention (_cdecl), rather than the standard call (_stdcall) calling convention. As a result, it is the responsibility of the calling process to pop arguments off the stack, and arguments are pushed on the stack from right to left. In C-language modules, the C compiler performs this task. [edit] LocoDelAssembly wrote:
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.