flat assembler
Message board for the users of flat assembler.
RSI and RDI in x64
bitRAKE 12 Nov 2023, 00:09
The primary difference is the registers cleared. The 64-bit version clears RAX/RBX, but the 32-bit version clears EDX. Does the 64-bit version initialize EDX elsewhere?
Code:
        xor     eax, eax                ; AL = scratch, AH = running maximum
next:
        mov     al, [r8 + rcx - 1]
        sub     al, [r9 + rcx - 1]
        ja      skipswap                ; difference is already positive
        neg     al                      ; otherwise take the absolute value
skipswap:
        cmp     al, ah
        jb      skipdiff
        mov     ah, al                  ; new maximum
skipdiff:
        loop    next
        movzx   eax, ah                 ; return the maximum in EAX
        ret
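For readers comparing the assembly versions in this thread, the loop appears to compute the largest absolute difference between corresponding bytes of two buffers. A minimal C sketch of that reading follows; the name max_abs_diff and its signature are assumptions made for illustration and do not come from the thread. Unlike the `loop`-based assembly, it simply returns 0 when the count is 0.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference model: largest |a[i] - b[i]| over n bytes. */
unsigned max_abs_diff(const uint8_t *a, const uint8_t *b, size_t n)
{
    unsigned best = 0;                       /* running maximum (AH / DL in the asm) */
    for (size_t i = 0; i < n; i++) {
        /* absolute difference of two unsigned bytes, computed without wraparound */
        unsigned d = a[i] > b[i] ? (unsigned)(a[i] - b[i])
                                 : (unsigned)(b[i] - a[i]);
        if (d > best)
            best = d;
    }
    return best;
}
```

A model like this can be handy for checking an assembly routine against the same inputs from the high-level side.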
Andy 12 Nov 2023, 00:26
Not just EDX: all parameters are passed in from a high-level language. If I understand it correctly, the x64 calling convention says that the first four integer parameters are passed in RCX, RDX, R8, and R9. So RCX holds the number of bytes, RDX is passed as 0 at the start, and R8 and R9 are pointers to the data. Since you pointed out that the 64-bit version clears RAX/RBX, I now push them onto the stack before using them and restore them before returning. I did the same with RDI and RSI, and it seems to work fine.
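Code:
use64
        push    rsi
        push    rdi
        push    rax
        push    rbx
        mov     rsi, r8                 ; rsi -> first buffer
        mov     rdi, r9                 ; rdi -> second buffer
        xor     rax, rax
        xor     rbx, rbx
next:
        mov     al, [rsi]
        mov     bl, [rdi]
        cmp     al, bl
        ja      skipswap                ; al > bl: no swap needed
        mov     ah, al                  ; swap al and bl through ah
        mov     al, bl
        mov     bl, ah
skipswap:
        sub     al, bl                  ; al = absolute difference
        cmp     al, dl
        jb      skipdiff
        mov     dl, al                  ; dl = running maximum
skipdiff:
        inc     rsi
        inc     rdi
        loop    next                    ; rcx = number of bytes
        pop     rbx
        pop     rax
        pop     rdi
        pop     rsi
        mov     rax, rdx                ; return the maximum
        ret

Thank you very much.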
Andy 12 Nov 2023, 00:49
bitRAKE wrote: The primary difference is the registers cleared. The 64-bit version clears RAX/RBX, but the 32-bit version clears EDX. Does the 64-bit version initialize EDX elsewhere?

I tried this and it also works great. Thank you.
revolution 12 Nov 2023, 04:36
Andy wrote: ... RAX/RBX I pushed them into stack before working with these registers and restore them before return. I did it with RDI and RSI also and seems to work fine.

Note that if you want to call another HLL function from within your code, the stack pointer, rsp, must be aligned to 0 mod 16 before the call; otherwise the code can crash.
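The alignment rule can be worked through with a little arithmetic. The sketch below uses a hypothetical helper, pad_before_call, which is not from the thread. It assumes the Microsoft x64 convention, where rsp is a multiple of 16 just before a call and therefore equals 8 mod 16 on function entry, because the call pushed an 8-byte return address. The 32-byte shadow space that convention also expects is left out, since it is a multiple of 16 and does not change the padding.

```c
/* Hypothetical helper: extra bytes to subtract from rsp, after `pushes`
 * 8-byte pushes, so that rsp is 0 mod 16 at the next call instruction.
 * Assumes rsp is 8 mod 16 on entry (return address already pushed). */
unsigned pad_before_call(unsigned pushes)
{
    unsigned offset = (8u + 8u * pushes) % 16u;  /* bytes past the last 16-byte boundary */
    return (16u - offset) % 16u;                 /* 0 when already aligned */
}
```

Under these assumptions, a routine that pushes four registers, as in the code earlier in the thread, would need another 8 bytes of adjustment before calling out.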
Andy 12 Nov 2023, 05:39
Thank you for the clarification. I figured out later, after reading more about the x64 calling convention, that RAX does not need to be saved.
Quote: The x64 ABI considers registers RBX, RBP, RDI, RSI, RSP, R12, R13, R14, R15, and XMM6-XMM15 nonvolatile. They must be saved and restored by a function that uses them.
Andy 12 Nov 2023, 13:16
I was thinking about a SIMD version of the code above, but I have one more question. I am not sure whether I understand correctly how MPSADBW works, or whether it helps me. If I have this sample data in XMM1 and XMM2, will the result after executing MPSADBW be what I anticipate below?
XMM1: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
XMM2: 00 00 00 00 00 00 00 00 00 00 06 05 04 03 02 01
MPSADBW
XMM1: 00 00 00 00 00 00 00 00 00 00 00 00 00 0B 00 0A

From what I have read, MPSADBW computes the absolute differences of quadruplets of 8-bit unsigned integers in the first register against those in the second register, and stores the 16-bit results in the first register. I am also not sure I fully understand the last operand; it seems to be some kind of offset that determines where the quadruplets are formed.
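One way to explore this question is a scalar model of the instruction. The sketch below follows the documented behavior of MPSADBW as I understand it: imm8 bits 1:0 select one 4-byte block of the second operand, imm8 bit 2 selects a starting offset of 0 or 4 bytes in the first operand, and each of the eight 16-bit results is the sum of four absolute byte differences over a window that slides by one byte. The name mpsadbw_model and the separate output array are assumptions made for illustration; byte index 0 is the least significant byte, which is the rightmost one in the dumps above.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scalar model of MPSADBW xmm1, xmm2, imm8.
 * dst = first operand, src = second operand, out = the eight result words. */
void mpsadbw_model(const uint8_t dst[16], const uint8_t src[16],
                   unsigned imm8, uint16_t out[8])
{
    unsigned src_off = (imm8 & 3u) * 4u;         /* imm8[1:0]: 4-byte block of src */
    unsigned dst_off = ((imm8 >> 2) & 1u) * 4u;  /* imm8[2]: start offset in dst */

    for (unsigned i = 0; i < 8; i++) {           /* window slides one byte per word */
        unsigned sum = 0;
        for (unsigned k = 0; k < 4; k++) {
            int d = (int)dst[dst_off + i + k] - (int)src[src_off + k];
            sum += (unsigned)abs(d);
        }
        out[i] = (uint16_t)sum;                  /* at most 4 * 255, fits in 16 bits */
    }
}
```

Under this model, an all-zero first operand and the XMM2 sample above with imm8 = 0 give 0x000A in every one of the eight words, since each window is compared against the same block 01 02 03 04. Note also that the results are sums of absolute differences, not maxima, so the instruction does not compute the same quantity as the byte loop earlier in the thread.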
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.