flat assembler
Message board for the users of flat assembler.
Index
> High Level Languages > Compiler optimizations |
Author |
|
revolution 21 Aug 2022, 07:06
Hint: The magic number 4294981120 is 0x100003600
|
|||
21 Aug 2022, 07:06 |
|
FlierMate1 21 Aug 2022, 07:23
revolution wrote: Hint: The magic number 4294981120 is 0x100003600 I try to analyze it given the magic number. BT wrote: Selects the bit in a bit string (specified with the first operand, called the bit base) at the bit-position designated by the bit offset (specified by the second operand) and stores the value of the bit in the CF flag. Code: 0001 0000 0000 0000 0000 0011 0110 0000 0000 0x9=9th bit position 0xA=10th bit position 0xC=12th bit position 0xD=13th bit position 0x20=32th bit position So if the selected bit is one, it will set the carry flag and stores in EAX through SETC. Clever, isn't it? Is this designed by GCC developer, or somehow discovered by "machine learning" itself? Last edited by FlierMate1 on 21 Aug 2022, 07:29; edited 1 time in total |
|||
21 Aug 2022, 07:23 |
|
FlierMate1 21 Aug 2022, 07:26
Note the old Assembly code by GCC with optimization is the following, a slightly different version:
Code: isHtmlWhitespace(int): cmp edi, 32 ja .L42 movabs rax, 4294981120 mov ecx, edi shr rax, cl and eax, 1 ret .L42: xor eax, eax ret |
|||
21 Aug 2022, 07:26 |
|
revolution 21 Aug 2022, 07:37
FlierMate1 wrote: Clever, isn't it? Is this designed by GCC developer, or somehow discovered by "machine learning" itself? It is a very old "trick". GCC didn't invent it. It goes way back into the dim dark history of computers. |
|||
21 Aug 2022, 07:37 |
|
bitRAKE 21 Aug 2022, 15:59
Here is a simple way that [*some] compiler(s) fail:
Code: bool os_file_exists(wchar_t *name) { auto attrib = GetFileAttributesW(name); if (attrib == INVALID_FILE_ATTRIBUTES) return false; if (attrib & FILE_ATTRIBUTE_DIRECTORY) return false; return true; } *gcc or msvc: on any architecture I've looked at. clang fixed it from 3.7 to present (even -O1, it's simple code elimination). FYI: I have a whitespace test with my bit vector helper macro (following post). The macro is more general than the compiler is capable of. Code: align 32 ValidFileNameChars BITVECTOR 256,"!#$%&'()-@^_`{}~+,.;=[]0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" bt [ValidFileNameChars],eax jnc .fail _________________ ¯\(°_o)/¯ “languages are not safe - uses can be” Bjarne Stroustrup |
|||
21 Aug 2022, 15:59 |
|
sylware 23 Aug 2022, 11:30
uhoh! This is nice. I shall remember that and use it.
Ofc the least amount of branches is naively better for the speculation logic. There is a lot of boolean tricks like that, a properly indexed and readable library of those would be actually more than appropriate. |
|||
23 Aug 2022, 11:30 |
|
bitRAKE 27 Aug 2022, 05:28
Okay, the next place compilers fail: This one is a size optimization for amd64 code, that would cost zero in terms of performance.
Often we see the preservation of registers in the shadow space, and on the stack, imagine if you will (or grab a debugger and trace through any compiled code): Code: mov [rsp+8h],rbx mov [rsp+10h],rsi mov [rsp+18h],rdi mov [rsp+20h],r9 push rbp r12 r13 r14 r15 Well, why isn't the compiler pushing legacy registers as a priority and store R8-R15 in the shadow space? Legacy registers don't cost a prefix byte when being pushed, but all 64-bit memory moves need a prefix byte. No cost beyond changing the model for the machine. Yet, petabytes saved across the planet ... I don't know, maybe not - baby steps though. _________________ ¯\(°_o)/¯ “languages are not safe - uses can be” Bjarne Stroustrup |
|||
27 Aug 2022, 05:28 |
|
revolution 27 Aug 2022, 05:56
Why not this?
Code: pop rax add rsp, 8 * 4 push r9 rdi rsi rbx rax rbp r12 r13 r14 r15 |
|||
27 Aug 2022, 05:56 |
|
revolution 27 Aug 2022, 07:49
If your calling protocol uses rax then an alternative is this.
Code: add rsp, 8 * 5 push r9 rdi rsi rbx sub rsp, 8 push rbp r12 r13 r14 r15 |
|||
27 Aug 2022, 07:49 |
|
bitRAKE 27 Aug 2022, 11:36
For speed, we wouldn't want to modify RSP and then PUSH - this is why MOV used in the shadow space MS ABI. Yet, certainly for size those are superior options. My point was that there is no performance cost for this size reduction.
_________________ ¯\(°_o)/¯ “languages are not safe - uses can be” Bjarne Stroustrup |
|||
27 Aug 2022, 11:36 |
|
revolution 27 Aug 2022, 12:26
bitRAKE wrote: For speed, we wouldn't want to modify RSP and then PUSH ... |
|||
27 Aug 2022, 12:26 |
|
< Last Thread | Next Thread > |
Forum Rules:
|
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.