flat assembler
Message board for the users of flat assembler.
Intel plans doubling 16 general purpose registers to 32
Feryno 28 Jul 2023, 10:54
revolution 28 Jul 2023, 11:13
This is a win with no loss.
The extra REX2 prefix uses 0xd5 (the legacy AAD instruction), so it does not impact existing x64 instructions. AAD has always been invalid in 64-bit code, so it is now repurposed to serve as REX2. If you can find a benefit to using 32 registers in your code, use REX2; if you can find no benefit, you can ignore REX2 and continue as normal.
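To make the prefix idea concrete, here is a minimal Python sketch that builds the two REX2 bytes for a given set of register numbers. The payload bit layout (M0, R4, X4, B4, W, R3, X3, B3 from bit 7 down to bit 0) is an assumption based on my reading of Intel's APX specification, and `rex2_prefix` and its parameter names are hypothetical, so check the layout against the official document before relying on it.

```python
REX2_ESCAPE = 0xD5  # opcode byte of legacy AAD, which is invalid in 64-bit mode


def rex2_prefix(reg=0, index=0, base=0, w=False, map1=False):
    """Return the two REX2 prefix bytes for register numbers 0..31.

    Assumed payload layout, bit 7 down to bit 0: M0 R4 X4 B4 W R3 X3 B3.
    Bits 0..2 of each register number go into ModRM/SIB, not into the prefix.
    """
    for number in (reg, index, base):
        if not 0 <= number <= 31:
            raise ValueError("register number must be in 0..31")
    payload = (
        (int(bool(map1)) << 7)          # M0: opcode map select (assumption)
        | (((reg >> 4) & 1) << 6)       # R4: bit 4 of the ModRM.reg register
        | (((index >> 4) & 1) << 5)     # X4: bit 4 of the SIB.index register
        | (((base >> 4) & 1) << 4)      # B4: bit 4 of the ModRM.rm / SIB.base register
        | (int(bool(w)) << 3)           # W: 64-bit operand size, as in REX.W
        | (((reg >> 3) & 1) << 2)       # R3: same role as REX.R
        | (((index >> 3) & 1) << 1)     # X3: same role as REX.X
        | ((base >> 3) & 1)             # B3: same role as REX.B
    )
    return bytes([REX2_ESCAPE, payload])
```

Under that assumed layout, `rex2_prefix(reg=16, base=17, w=True)` yields the bytes D5 58. Code that never touches R16-R31 emits no 0xD5 byte at all, which is the "no loss" part of the argument.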
bitRAKE 28 Jul 2023, 14:36
Absolutely, the internal registers already exceed 16. This is just a way to remove loads and control renaming. Total win. (Compilers can rejoice, lol.)
tthsqe 29 Jul 2023, 03:43
Any guesses as to whether R16-R31 will be volatile or non-volatile in the major ABIs?
bitRAKE 29 Jul 2023, 06:50
With the SIMD registers YMM6+ and ZMM, Microsoft has made them volatile (non-preserved), but for general purpose registers R12+ they are non-volatile. I'm going to wager R16-R31 will be non-volatile as well. As wonky as the Windows x64 ABI is already - it's a very small wager. Seeing some sort of split would not surprise me one bit.
Isn't there some sort of systems theory that favors balanced resources on both sides of the caller/callee boundary, assuming infinite layers in the system? It feels like something like that should exist. There aren't infinite layers though; not every call is a deep dive. Interesting to think about.
_________________
¯\(°_o)/¯ “languages are not safe - uses can be” Bjarne Stroustrup
AsmGuru62 29 Jul 2023, 12:12
Question: these additional registers will be available only for x64 coding?
revolution 29 Jul 2023, 12:48
AsmGuru62 wrote: Question: these additional registers will be available only for x64 coding?
Same as with REX: in 32-bit code you get INC and DEC instead. Don't expect any new instructions or registers for 32-bit code. The future is 64-bit only, apparently.
AsmGuru62 29 Jul 2023, 13:12
Thanks.
I wonder about the compatibility. There are a lot of x64 CPUs which do not have these registers, so let's say I code an app for something. I want it to run on old CPUs (no registers R16-R31) and on new ones, so I need to detect the CPU and then code some (complex) functions in two 'incarnations'. I guess this is the way to go. It should not be very hard: make the older CPU version first, where some variables are in local memory (stack). Then just copy/paste and replace the locals with R16-R31. Seems reasonable.
revolution 29 Jul 2023, 13:17
The only time you need to detect anything is if you want to use R16-R31. If you don't use them then there is nothing for you to do, it will run as normal.
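The detect-once, then pick one of two 'incarnations' pattern discussed here can be sketched as below. This is only a model in Python, not real detection code: the raw CPUID and XCR0 values are passed in as plain integers, the bit positions (CPUID leaf 7, subleaf 1, EDX bit 21 for APX_F and XCR0 bit 19 for the OS saving the extended GPR state) are my understanding of Intel's APX documentation, and `apx_usable`, `select_impl`, `sum_legacy` and `sum_apx` are hypothetical names.

```python
APX_F_BIT = 1 << 21      # assumption: CPUID.(EAX=7,ECX=1):EDX[21] reports APX_F
XCR0_APX_BIT = 1 << 19   # assumption: XCR0[19] means the OS saves R16-R31


def apx_usable(cpuid_7_1_edx, xcr0):
    """True only if the CPU reports APX and the OS has enabled its state."""
    return bool(cpuid_7_1_edx & APX_F_BIT) and bool(xcr0 & XCR0_APX_BIT)


def sum_legacy(values):
    """Stand-in for the version that keeps extra variables on the stack."""
    return sum(values)


def sum_apx(values):
    """Stand-in for the version that would keep them in R16-R31."""
    return sum(values)


def select_impl(cpuid_7_1_edx, xcr0):
    """Choose the implementation once, at startup, and reuse the result."""
    return sum_apx if apx_usable(cpuid_7_1_edx, xcr0) else sum_legacy
```

A program would call `select_impl` once and store the result, so the hot path pays no repeated detection cost. Both stand-ins return the same value on purpose: the two versions must differ only in speed, never in result.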
bitRAKE 29 Jul 2023, 14:35
bitRAKE wrote: With the SIMD registers YMM6+ and ZMM, Microsoft has made them volatile (non-preserved), but for general purpose registers R12+ they are non-volatile. I'm going to wager R16-R31 will be non-volatile as well. As wonky as the Windows x64 ABI is already - it's a very small wager. Seeing some sort of split would not surprise me one bit.
revolution 29 Jul 2023, 14:42
bitRAKE wrote: ... why preserve additional state for context switches?
Ali.Z 29 Jul 2023, 15:11
bitRAKE wrote: I'm going to wager R16-R31 will be non-volatile as well. As wonky as the Windows x64 ABI is already - it's a very small wager. Seeing some sort of split would not surprise me one bit.
Based on the current x64 ABIs, it is likely they are going to eat most of the registers and keep only a few for the application itself; this covers Unix, Unix-like systems and Windows. They tend to be very aggressive on their side. But it is still a win for both sides, except that care should be taken with future software that needs to run on non-REX2 CPUs. For me nothing will change, I still use 32-bit.
_________________
Asm For Wise Humans
Furs 29 Jul 2023, 17:02
revolution wrote: This is a win with no loss.
Sure, loads take an offset to encode as well (from rbp or rsp), but they said 10% fewer loads versus doubling the regs. So it smells like bloat to me.
Furs 29 Jul 2023, 17:04
bitRAKE wrote: Absolutely, the internal registers already exceed 16. This is just a way to remove loads and control renaming. Total win. (Compilers can rejoice, lol.)
Think carefully about the answer to that question. "Total win" means literally no downsides. Longer encodings are a downside.
revolution 29 Jul 2023, 17:58
Furs wrote: Sure, loads take an offset to encode as well (from rbp or rsp), but they said 10% fewer loads versus doubling the regs. So it smells like bloat to me.
Going from 1 reg to 2 regs gives a great boost. Going from 2 to 4, a good boost. From 4 to 8, a moderate boost. And so on: going from 1G regs to 2G regs you get effectively zero benefit (and probably a big loss from all the overheads).
bitRAKE 29 Jul 2023, 19:20
revolution wrote:
Intel wrote: We propose to define the new GPRs as caller-saved (volatile) state in application binary interfaces (ABIs), facilitating interoperability with legacy binaries.
bitRAKE 29 Jul 2023, 19:56
PUSH2/POP2 require the stack to be 0 mod 16. Longer encoding.
SETcc.zu is usually what I'm after anyhow. All the different Boolean types are silly: MS BOOL is 32-bit; HRESULT is 32-bit with S_OK = 0 and S_FALSE = 1; C/C++ bool is 8-bit. Using the flags directly is still a better option.
Fixing CMOVcc memory access is nice, revolution has mentioned this before a couple times.
Is there really a need for JMPABS?
The three-operand forms are quite a large benefit. (Intel calls it "new data destination (NDD)".)
* This might accelerate my move to FASM2, since I need more control over instruction encoding.
Last edited by bitRAKE on 29 Jul 2023, 20:25; edited 3 times in total
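For readers unfamiliar with why the zero-upper form matters, here is a small Python model of the difference as I understand it: legacy SETcc writes only the low byte of its destination and leaves bits 8..63 as they were, while the .zu form leaves the whole register holding 0 or 1. The function names `setcc` and `setcc_zu` are hypothetical and this models register contents only, not the encoding.

```python
MASK64 = (1 << 64) - 1  # a 64-bit general purpose register


def setcc(old_value, condition):
    """Legacy SETcc: replace the low byte with 0 or 1, keep bits 8..63."""
    return (old_value & ~0xFF & MASK64) | int(bool(condition))


def setcc_zu(old_value, condition):
    """SETcc.zu (assumed semantics): the full register becomes 0 or 1."""
    return int(bool(condition))
```

With stale data in the register, `setcc(0xDEADBEEF00, True)` gives 0xDEADBEEF01, while `setcc_zu(0xDEADBEEF00, True)` gives 1. That is why the legacy form usually needs an XOR beforehand or a MOVZX afterwards when the result is used as a 32-bit BOOL.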
revolution 29 Jul 2023, 20:20
bitRAKE wrote: Fixing CMOVcc memory access is nice, revolution has mentioned this before a couple times.
sylware 29 Jul 2023, 21:28
CMOVcc memory access fixed? You mean it won't segfault when pointing at invalid memory even though the load is not supposed to be executed (speculation fix)?
And seriously, I am thinking about the assembly code I wrote lately, and doing that with 32 regs... yeah... it does turn me on. But my eyes are now looking at 64-bit RISC-V...
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.